Supercharge Hadoop With a High Performance Cloud System
Hadoop is a computing platform that makes big data easier to handle. The platform's two most important characteristics are how it stores files and how it processes data. Hadoop lets you store files bigger than any single server could hold, and it lets you store very large numbers of them. How Hadoop processes data also differs from the traditional approach: rather than moving data over a network to be processed, Hadoop's processing framework, MapReduce, moves the processing to the data.
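To make the MapReduce idea concrete, here is a minimal word-count sketch in Python. It only simulates the model: the "blocks" are hypothetical in-memory lists standing in for data blocks stored on different nodes, and on a real cluster each map task would run on the node holding its block.

```python
from collections import defaultdict

def map_phase(block):
    # Emit (word, 1) pairs for every word in this block of text.
    # On a real cluster this runs where the block's data lives.
    for line in block:
        for word in line.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    # Sum the counts for each word across all map outputs.
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

# Two "blocks" standing in for data stored on two separate nodes.
blocks = [
    ["big data makes big clusters"],
    ["hadoop processes big data"],
]

# Map tasks run independently (in parallel on a real cluster),
# then the reduce step combines their outputs.
pairs = [pair for block in blocks for pair in map_phase(block)]
counts = reduce_phase(pairs)
print(counts["big"])  # 3
```

The point of the pattern is that only the small intermediate (word, count) pairs cross the network, not the raw data blocks themselves.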
It’s easy to see why Hadoop suits big data processing: moving data over a network can simply be too slow, particularly with huge data sets. Hadoop itself is complex to operate, but a number of companies have built tools that make it easier to use, putting its power into the hands of more organizations.
Hadoop on Bare Metal or in the Cloud?
When Hadoop started out, “bare metal,” or an on-premises cluster with physical servers, was how Hadoop was done. This avoided the I/O overhead inherent in virtualized (cloud) environments. But just as so many other processes have moved to the cloud, so has Hadoop. With the cloud, cost savings (especially in terms of up-front capitalization) and ease of operation can be brought to bear on big data processing using Hadoop.
In terms of total cost of ownership (TCO), bare metal and cloud Hadoop offer similar price/performance ratios. Business consulting firm Accenture used a TCO model to compare the two and found no cost benefit to bare metal Hadoop for a representative cluster of 24 server nodes and 50 TB of Hadoop Distributed File System (HDFS) capacity. They also found that the performance tuning options available for Hadoop in the cloud are becoming more powerful and easier to use.
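A back-of-envelope calculation shows why the two approaches can come out close. The figures below are purely illustrative assumptions, not numbers from the Accenture study: a hypothetical per-server price, a lump yearly operating cost for the bare metal cluster, and a hypothetical hourly cloud rate.

```python
def bare_metal_tco(nodes, server_cost, yearly_opex, years):
    # Up-front hardware spend plus ongoing power, space, and staff costs.
    return nodes * server_cost + yearly_opex * years

def cloud_tco(nodes, hourly_rate, hours_per_year, years):
    # Pure pay-per-use: no up-front capital expense.
    return nodes * hourly_rate * hours_per_year * years

# Illustrative inputs for a 24-node cluster over three years.
bm = bare_metal_tco(nodes=24, server_cost=8_000, yearly_opex=60_000, years=3)
cl = cloud_tco(nodes=24, hourly_rate=0.60, hours_per_year=8_760, years=3)
print(bm, cl)  # 372000 378432.0
```

With these assumed prices the three-year totals land within a couple of percent of each other; the real trade-off is where the money goes (capital up front versus pay-as-you-go).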
Why Not Both?
In an ideal world, you’d get bare metal performance combined with the cloud’s ease of use and scalability, and in October 2013, Bigstep unveiled a solution that aims for exactly that: Full Metal Cloud. Rather than provisioning virtual machines, the service gives customers bare metal servers on which they can run whatever software stacks they want to deploy, including big data processing with Hadoop.
The Full Metal Cloud solution delivers the power of bare metal computing without the performance cost of running a hypervisor. This is no small performance gain, either, because bare metal offers at least twice the performance of a cloud solution when it comes to Hadoop. The “cloud” part of Full Metal Cloud is that it offers a self-service portal, is scalable, and is billed on a pay-per-use basis.
Bare Metal Performance Was Made for Hadoop
Since Hadoop was designed with bare metal performance in mind, Full Metal Cloud gives end-users that power without the drop in performance that usually comes with cloud solutions. It runs the latest version of Cloudera’s Distribution including Apache Hadoop (CDH), preconfigured to make integration with other applications quick.
Full Metal Cloud is able to provide such great performance because it offers a dedicated Layer 2 domain for each client, with 4 to 40 Gbps connectivity at the machine level. It also offers all-SSD pseudo-distributed storage, which speeds things up even more: SSD storage running at line rate lets end-users reach speeds of up to 40 Gbps between Full Metal Cloud compute instances and their attached storage blocks, eliminating I/O bottlenecks.
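Those link speeds matter at Hadoop scale. The rough arithmetic below (an idealized calculation assuming full line rate and no protocol overhead) shows how long it would take to move the 50 TB data set from the earlier TCO example over a single link at different speeds.

```python
def transfer_hours(terabytes, gbps):
    # Convert terabytes to bits, divide by link speed in bits/second,
    # then convert seconds to hours. Ignores all protocol overhead.
    bits = terabytes * 1e12 * 8
    return bits / (gbps * 1e9) / 3600

# Moving 50 TB over one link:
print(round(transfer_hours(50, 1), 1))   # 111.1 hours at 1 Gbps
print(round(transfer_hours(50, 40), 1))  # 2.8 hours at 40 Gbps
```

The gap between roughly four and a half days and under three hours is why both fast interconnects and Hadoop's move-the-code-to-the-data model are aimed at the same bottleneck.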
Easier, Faster, More Efficient
Bare metal computing for Hadoop is not practical or cost-effective for many organizations, while typical cloud solutions can be slow because of hypervisor overhead and slower networks and storage. Full Metal Cloud brings the speed and power of bare metal computing to cloud users, so end-users do not have to compromise on either when they use Hadoop to process massive amounts of data.