Technically Speaking

The Official Bigstep Blog

 

5 Big Data Products You Need to Toss Out with Your “Happy 2017” Party Napkins

Over the past few years, big data has evolved from an incomprehensible buzzword to a little-understood but powerful plaything to an essential part of business strategy. Whether your bag is marketing, BI, or product development, big data has something in it for you.

The growth and popularity of big data sparked a wave of development projects, from Hadoop to Spark to Kafka to big-data-oriented cloud storage solutions. Like most development flurries, some projects have solidified their place in the business world, and others have either gotten off track or simply been replaced with more potent alternatives.

The new year is a good time to clean house. As you toss out the Christmas tinsel, used wrapping paper, champagne bottles and “Happy 2017” napkins, you might also want to wrap these outdated big data tools in a garbage bag and make room for better alternatives for the new year.

1. MapReduce

Nobody ever really liked MapReduce, but for some time it was the best option available. It’s hard to work with, slow, and almost never the best way to get something done. In the age of Spark, there really is no reason to stick with it any longer. The switch from MapReduce to Spark takes a bit of cost and effort, but it is more than worth it.
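To make the "hard to work with" complaint concrete, here is a minimal sketch of the MapReduce programming model in plain Python. It is only an illustration under stated assumptions: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical, nothing here is Hadoop's or Spark's API, and everything runs in a single process rather than on a cluster.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Run the three phases in order (hypothetical helper for this sketch)."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["spark replaces mapreduce", "mapreduce is slow"]
    print(word_count(sample))
```

Even a toy word count has to be forced into separate map, shuffle, and reduce steps. In Spark, the same job is typically written as one short chain of transformations (for example `flatMap`, `map`, and `reduceByKey` on an RDD), which is a large part of why the switch tends to pay off.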

2. Java (Language)

Java’s syntax isn’t really suited to big data workloads; more modern languages, such as Scala and Python, handle these jobs far better. Python also has the advantage of a huge pool of qualified developers (there are probably some already on your staff).

3. Flume

The last Flume release was in May 2015, so the writing has been on the wall for some time. The project just never got a decent burial. Kafka is probably your best alternative: it is backed by a strong and highly committed development community and has been adopted by some pretty heavy hitters in the big data universe (including LinkedIn, Twitter, Pinterest, and others).

4. Storm

For a while, there was a real debate over Spark versus Storm, but not anymore. Storm has been surpassed by Spark, and is taking a hit from Flink now, too. Storm’s latency was no big deal until real-time processing became THE driving force in big data. Now, latency is a deal-breaker, and interest within the development community is visibly waning. Apart from Hortonworks, no one is really putting much effort into Storm anymore, and as better alternatives become available, attention is dropping off as well.

5. Pig

That’ll do, Pig.

It’s kind of odd, really, that Pig is still around in the age of Spark and other big data innovations, but perhaps the unusual name kept it alive longer than necessary. Not to worry: if you want an oddly named big data tool to play with, you can take your pick of more relevant options, like Chukwa and Oozie (although an argument could be made that Oozie is a has-been tool, too).

What big data tools should you be investing in at the dawn of 2017? In terms of infrastructure, the Full Metal Cloud has proven incredibly scalable, flexible, powerful, and secure. It’s also integrated with all the top big data tools, including Hadoop, NoSQL databases, Splunk, and more. See how other organizations are navigating the next age of the big data revolution when you read our customer stories.
