Technically Speaking

The Official Bigstep Blog

 

Feed Your Hadoop with Bare Metal

Once limited to internet giants like Google, Hadoop is moving into the business mainstream, allowing businesses to ingest and analyze massive quantities of structured and unstructured data. To realize its full data-crunching capacity, Hadoop needs powerful infrastructure, and most companies don’t have the hardware necessary to set up Hadoop clusters on their premises.

But now providers offer tools that let businesses use Hadoop in the cloud. This is terrific for businesses and use cases where the ingestion and processing of data are unpredictable or intermittent. However, Hadoop places heavy demands on input/output (I/O), and virtualization can slow I/O down considerably. Hadoop is a great enabler, but for maximum performance, it needs to run on bare metal rather than in a virtual environment.

Hadoop in the Cloud

Hadoop is open source, runs on commodity infrastructure, and is very flexible in its approach to data analytics. But in a cloud environment, end-users lack the control over resources necessary to derive maximum performance from Hadoop. Although Hadoop was built around the concept of keeping compute and storage together, cloud deployments typically separate the two. For example, on AWS, the storage layer, S3, is separate from the compute layer, EC2.

With bare metal computing, compute and storage are together. Moreover, there’s no hypervisor, so there are no extra layers between application and hardware. That makes processing considerably faster. Bare metal can also offer significantly more connectivity at the machine level than you get with even the most advanced cloud architecture. It offers the lowest network latency possible, and it lets you pair high network capacity with storage that is spread out among multiple SSD storage devices, preventing the bottlenecks that can result from using a single central storage device.

Effects of Virtualization on Performance

Software developer Peter Senna Tschudin benchmarked a number of virtualization solutions, including VMware, Xen, and Hyper-V, and found that the performance overhead of virtualization can double disk latency and slow network I/O by one-quarter. For a one-off data processing scenario, this may not be a problem, but in a big data environment, the slowdown could lead to significant wasted resources and higher operational costs. Furthermore, virtualization overhead varies significantly with resource utilization. The same query could take 10 milliseconds or 20 milliseconds, for example, depending on how heavily the underlying resources are being used.
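If you want a rough feel for disk latency on your own machines, a small script can time synced writes. The following is a minimal sketch, not the benchmark used in the study above: the function name, sample count, and block size are assumptions chosen for illustration, and a serious comparison would use a dedicated benchmarking tool. Point it at a directory on the disk you actually want to test, since the default temporary directory may be backed by memory on some systems.

```python
import os
import statistics
import tempfile
import time


def measure_sync_write_latency(directory, samples=50, block_size=4096):
    """Return per-write latencies in milliseconds for small synced writes."""
    payload = os.urandom(block_size)
    latencies_ms = []
    # Create the scratch file inside the directory under test so the writes
    # land on the disk (or virtual disk) that backs that directory.
    with tempfile.NamedTemporaryFile(dir=directory) as scratch:
        for _ in range(samples):
            start = time.perf_counter()
            scratch.write(payload)
            scratch.flush()             # push Python's buffer to the OS
            os.fsync(scratch.fileno())  # ask the OS to push it to the device
            latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


if __name__ == "__main__":
    results = measure_sync_write_latency(tempfile.gettempdir())
    print(f"median: {statistics.median(results):.3f} ms")
    print(f"max:    {max(results):.3f} ms")
```

Running the same script on a virtual machine and on a bare metal server with comparable disks gives a first, informal comparison. Looking at the maximum as well as the median helps show how much the latency varies from write to write.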

HubSpot CIO Jim O’Neill demonstrated a four-fold difference in efficiency between running virtualized instances on OpenStack and running OpenStack on a private bare metal architecture. With big data analyses involving sequences of queries, cloud overhead from disk latency and network I/O can really add up. But with bare metal computing, you don’t have these problems.
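To see how a small per-query difference adds up, here is a back-of-the-envelope calculation. It uses the 10 millisecond and 20 millisecond figures mentioned earlier; the number of queries is a hypothetical workload size, and the sketch assumes the queries run strictly one after another.

```python
def total_runtime_seconds(query_count, latency_ms):
    """Total wall time for queries that run one after another."""
    return query_count * latency_ms / 1000.0


# 10 ms and 20 ms are the per-query figures quoted earlier; the query count
# is a hypothetical workload size chosen only for illustration.
QUERY_COUNT = 1_000_000

fast = total_runtime_seconds(QUERY_COUNT, 10)
slow = total_runtime_seconds(QUERY_COUNT, 20)

print(f"at 10 ms per query: {fast / 3600:.2f} hours")
print(f"at 20 ms per query: {slow / 3600:.2f} hours")
print(f"extra time:         {(slow - fast) / 3600:.2f} hours")
```

Under these assumptions, a million sequential queries take about 2.8 hours at 10 milliseconds each and about 5.6 hours at 20 milliseconds each. Real workloads run queries in parallel, so treat this as an illustration of the scaling rather than a prediction.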

Bare Metal, Performance, and Security

In addition to the performance trade-offs that come with running Hadoop on virtualized infrastructure, end-users must also consider security. While data encryption is the cornerstone of security, bare metal computing also offers physical isolation. With no hypervisor and no other users on the same servers or in the same management platform, compute instances are physically isolated. Because the machines do not share resources or applications, there is no danger of interference from other tenants.

Bare Metal Infrastructure as a Service Incorporates Cloud Flexibility

When bare metal infrastructure is provided as a service, end-users get the flexibility and scalability of the cloud, along with the performance of bare metal computing. End-users also gain more control than in a typical cloud environment. For example, you may want to ensure all your machines are in the same rack, connected to the same switch, and you can do this with bare metal infrastructure as a service. Plus, with leading bare metal providers, you can know the specs of the actual hardware running your processes rather than trusting processes to “mystery metal” that could be outdated. Imagine 40 GbE network transfer speed with the convenience of the cloud!
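Once you know where your machines sit, you can pass that knowledge on to Hadoop. Hadoop supports rack awareness through a topology script, referenced by the `net.topology.script.file.name` property, that receives node addresses as arguments and prints one rack path per address. The following is a minimal sketch of such a script: the IP addresses and rack names are hypothetical placeholders, and you should check the rack awareness documentation for your Hadoop version before relying on it.

```python
#!/usr/bin/env python3
import sys

# Hypothetical mapping from node IP address to rack path. Replace it with
# the addresses and rack names of your own machines.
RACK_BY_IP = {
    "10.0.0.11": "/rack-a",
    "10.0.0.12": "/rack-a",
    "10.0.0.13": "/rack-a",
}

# Rack reported for any address that is not in the mapping.
DEFAULT_RACK = "/default-rack"


def resolve_racks(addresses):
    """Return one rack path per address, in the same order."""
    return [RACK_BY_IP.get(address, DEFAULT_RACK) for address in addresses]


if __name__ == "__main__":
    # Hadoop passes one or more addresses as command-line arguments and
    # reads one rack path per argument from standard output.
    print("\n".join(resolve_racks(sys.argv[1:])))
```

Keeping the mapping in a plain dictionary makes the sketch easy to read. In practice you might load it from a file that your provisioning process keeps up to date.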

Conclusion

Hadoop is the great enabler for capturing, storing, transferring, and analyzing big data, and today you can provision Hadoop in the cloud. This is terrific for businesses without the resources for dedicated, on-premises hardware. But the very things that make the cloud convenient can drag Hadoop performance down. The solution to this problem is bare metal computing, which lets Hadoop blaze through processes unencumbered by hypervisors and less-than-optimal connectivity. There’s simply no better solution for resource-intensive use cases than bare metal infrastructure provided as a service. You get the speed of bare metal computing and the convenience of the cloud, sacrificing neither performance nor convenience.
