Getting the most out of Impala
We have teamed up with Cloudera to analyse ways of working with Impala in order to optimise for both performance and budget. We will be sharing our experiences with you at the next London Enterprise Technology Meetup on Monday, May 12th, starting at 7pm, in the Wolfson Theatre of the New Academic Building, at 54 Lincoln’s Inn Fields.
Getting more performance from any application is easy when the budget can be increased. But what happens when we face the opposite challenge? How can we do more with less? One usual suspect, probably the most infamous source of performance bottlenecks, is I/O. So that’s where we started.
The setup
We ran Cloudera Impala on 10 instances, each with 20 physical CPU cores, 192 GB of RAM, and 4 x 10 Gbps ports (our FMCI 20.192), and we tested using the TPC-DS benchmark.
To track I/O bottlenecks, we looked at two alternative ways of deploying Impala:
A. Using local storage – specifically 8 x 1 TB drives per instance, for a total of 80 enterprise drives at 7.2K RPM
B. Using Bigstep’s Full Metal Solid Storage – an all-SSD distributed storage system. The instance cluster was connected to the storage array with one 10 Gbps link per machine.
In both scenarios, the instances in the cluster were interconnected in a single LAN, each with one 10 Gbps link.
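To put the setup in perspective, the figures above can be added up into cluster-wide totals. The following is a minimal sketch in Python, not part of the original test tooling: the `ClusterSpec` class and its method names are hypothetical, and the only inputs are the numbers quoted in this section. The network figure is a theoretical ceiling derived from link speed alone (decimal units, no protocol overhead), not a measured result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical container for the figures quoted in the setup description."""
    instances: int = 10
    cores_per_instance: int = 20
    ram_gb_per_instance: int = 192
    drives_per_instance: int = 8      # scenario A: local 7.2K RPM drives
    drive_capacity_tb: int = 1
    storage_link_gbps: int = 10       # scenario B: one link per machine to the storage array

    def total_cores(self) -> int:
        return self.instances * self.cores_per_instance

    def total_ram_gb(self) -> int:
        return self.instances * self.ram_gb_per_instance

    def total_drives(self) -> int:
        return self.instances * self.drives_per_instance

    def raw_local_capacity_tb(self) -> int:
        return self.total_drives() * self.drive_capacity_tb

    def storage_network_ceiling_gbytes_per_s(self) -> float:
        # Gigabits per second divided by 8 gives gigabytes per second.
        # Assumption: line rate only, ignoring protocol overhead.
        return self.instances * self.storage_link_gbps / 8


if __name__ == "__main__":
    spec = ClusterSpec()
    print("Physical cores:", spec.total_cores())
    print("RAM (GB):", spec.total_ram_gb())
    print("Local drives:", spec.total_drives())
    print("Raw local capacity (TB):", spec.raw_local_capacity_tb())
    print("Storage network ceiling (GB/s):", spec.storage_network_ceiling_gbytes_per_s())
```

With the defaults above this gives 200 physical cores, 1,920 GB of RAM, 80 local drives with 80 TB of raw capacity, and an aggregate ceiling of 12.5 GB/s between the cluster and the storage array in scenario B.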
For those new to Impala, it is a massively parallel processing (MPP) SQL query engine from Cloudera that runs natively in Apache Hadoop and enables users to directly query data stored in HDFS and Apache HBase, without requiring data movement or transformation.
The speakers
Join us on May 12th at the London Enterprise Technology Meetup to see the results and to learn how you can optimise your big data infrastructure to provide more insight in less time.
Who will be speaking:
• Cloudera: Graham Gear – EMEA Director of Systems Engineering
• Bigstep: Alex Bordei – Product Manager
We look forward to seeing you there.