5 Big Data Products You Need to Toss Out with Your “Happy 2017” Party Napkins
Over the past few years, big data has evolved from an incomprehensible buzzword to a little-understood but powerful plaything to an essential part of business strategy. Whether your bag is marketing, BI, or product development, big data has something in it for you.
The growth and popularity of big data sparked a wave of development projects, from Hadoop to Spark to Kafka to big-data-oriented cloud storage solutions. Like most development flurries, some projects have solidified their place in the business world, and others have either gotten off track or simply been replaced with more potent alternatives.
The new year is a good time to clean house. As you toss out the Christmas tinsel, used wrapping paper, champagne bottles and “Happy 2017” napkins, you might also want to wrap these outdated big data tools in a garbage bag and make room for better alternatives for the new year.
1. MapReduce
MapReduce was never well-loved. It’s hard to work with, it’s slow, and it’s almost never the best way to get something done. In the age of Spark, there really is no reason to stick with MapReduce any longer. While there is a bit of cost and effort involved, the switch from MapReduce to Spark is more than worth it.
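To give a feel for why the MapReduce model is considered laborious, here is a minimal sketch of its three phases (map, shuffle, reduce) applied to a word count. It is written in plain Python rather than against the Hadoop API, and the function names and sample input are made up for illustration.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Chain the three phases, as a MapReduce job would."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


# Hypothetical sample input, for illustration only.
sample_lines = ["big data big plans", "Data tools"]
counts = word_count(sample_lines)
print(counts)
```

In Spark, the same job is usually written as one short chain of RDD operations such as `flatMap`, `map`, and `reduceByKey`, with the shuffle handled by the framework. That conciseness is a large part of Spark's appeal.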
2. Java (Language)
Java’s syntax isn’t really suitable for big data workloads. More modern languages, such as Scala and Python, are much better suited for these jobs. Python also has the advantage of a large pool of qualified developers (there are probably some already on your staff).
3. Flume
The last Flume release was in May of 2015, so the writing has been on the wall for Flume for some time. It just didn’t get a decent burial. Kafka is probably your best alternative: it is backed by a strong and highly committed development community and has been adopted by some pretty heavy hitters in the big data universe (including LinkedIn, Twitter, Pinterest, and others).
4. Storm
For a while, there was some relevant discussion over Spark versus Storm, but not anymore. Storm has been surpassed by Spark, and is taking a hit from Flink now, too. Storm’s latency was no big deal until real-time processing became THE driving force in big data. Now, latency is a deal-breaker, and interest within the development community is visibly waning. Apart from Hortonworks, no one is really putting much effort into Storm anymore, and as better alternatives become available, attention is dropping off even further.
5. Pig
It’s kind of odd, really, that Pig is still around in the age of Spark and other big data innovations, but perhaps the unusual name kept it alive longer than necessary. Not to worry: if you want an oddly named big data tool to play with, you can take your pick of more relevant options, like Chukwa, Flume, and Oozie (although an argument could be made that Oozie is a has-been tool, too).
What big data tools should you be investing in at the dawn of 2017? In terms of infrastructure, the Full Metal Cloud has proven incredibly scalable, flexible, powerful, and secure. It’s also integrated with all the top big data tools, including Hadoop, NoSQL databases, Splunk, and more. See how other organizations are navigating the next age of the big data revolution when you read our customer stories.