February 5, 2015

Best Books to Put on Your Big Data Reading List for 2015

Last year, the rage in big data literature included Big Data: A Revolution That Will Transform How We Live, Work, and Think by Kenneth Cukier and Viktor Mayer-Schönberger and Hadoop: The Definitive Guide by Tom White. If you haven’t already, add these must-haves to your reading list. But assuming you’ve kept up to date on your big data reading, here is the next round of fodder for the avid learners among us.

As big data analytical technologies become more mainstream, the literature addressing the industry is becoming more specialized and technical. These books will help you delve more deeply into the solutions you’ve chosen for tapping the potential big data holds.

Practical Hadoop Security

Big data offers some of the most lucrative and rewarding job opportunities available in technology today. Staying well read gives you an edge in competing for those high-paying jobs.

Practical Hadoop Security, by Bhushan Lakhe, Senior Vice President at Ipsos, was released in September 2014. The book is an excellent resource for administrators who are planning a production Hadoop deployment and want to secure their clusters. It is a detailed guide to the security options and configuration within Hadoop itself, and Lakhe takes a hands-on approach to showing how to implement a defined security scheme within a Hadoop cluster.

Hadoop Application Architectures

If you need practical, real-world insight into the architectural considerations of using Hadoop, then Hadoop Application Architectures is your guide. It presents numerous examples drawn from actual business use and digs into the how-tos of designing and implementing Hadoop applications. It also covers incorporating Hadoop into your existing infrastructure, along with some best practices for HBase and HDFS. It was released in July 2014 and written by Mark Grover, Ted Malaska, Jonathan Seidman, and Gwen Shapira.

Learning Spark

The difference between succeeding and failing with your big data endeavors is staying on top of the latest technologies to assist your efforts.

If you’re charged with managing streams of data coming in from websites, Spark is an invaluable tool. Learning Spark: Lightning-Fast Big Data Analytics shows how to make programs run faster and how to put cluster computing to use. Written by Holden Karau with Andy Konwinski, Patrick Wendell, and Matei Zaharia, it is a hearty introduction to Spark as an alternative to MapReduce for loading data and querying it in memory. It was released in February 2015 and is one of the most up-to-date books available on big data.

Using Flume

Getting real-time information off front-end servers and into Hadoop is challenging. Using Flume: Flexible, Scalable, and Reliable Data Streaming is your resource for mastering the features of Flume. It covers data collection and aggregation, streaming large data sets to the Hadoop Distributed File System, HBase, and Elasticsearch, and other related topics. It is tailored for engineers and experienced coders, and is therefore most useful to advanced technical readers. It was written by Hari Shreedharan and released in October 2014.

Apache Hadoop YARN

This little book has a big name. Apache Hadoop YARN: Moving beyond MapReduce and Batch Processing with Apache Hadoop 2 is for the business that has moved past simple MapReduce and on to YARN. It has been hailed as “The Insider’s Guide to Building Distributed, Big Data Applications with Apache Hadoop YARN,” and it gives readers better, quicker ways to write code that makes the best use of the most recent developments in Hadoop. The book delves into issues like scalability, cluster utilization, and new service models. It also covers alternatives to batch processing and Java programming, and it includes examples and a step-by-step guide through the entire lifecycle of YARN. It was written by Arun C. Murthy, Vinod Kumar Vavilapalli, Doug Eadline, Joseph Niemiec, and Jeff Markham and released in March 2014.

Hadoop, and big data in general, works best on a Full Metal Cloud, which does not use a hypervisor and offers markedly faster processing than a traditional cloud environment. The Full Metal Cloud available at Bigstep offers lightning-fast connectivity, and you can test its speed and reliability for yourself today at the Bigstep website.
