Our technical team pushes applications to their limits. Our mission is simple: get the highest possible performance out of each setup we test, then use what we learn to continually improve our services.
Getting The Most Out Of Impala - Best Practices For Infrastructure Optimization

Paper Abstract

We tested Cloudera Impala to understand which hardware setup gives it the best performance for the price. Our aim is to provide a quick, practical guide to choosing the infrastructure to run Impala on.

What's Inside

  1. Query Execution Times
    Results for a set of 20 queries run on the same data set, ten times on each hardware configuration.
  2. Single vs. Dual-CPU Instances
    We weren't expecting dramatic score changes between single-CPU and dual-CPU instances, but what we found was surprising.
  3. Performance/Price Scores
    Using our standard cost structure, we added the price per hour for each instance and paired it with the Impala performance score.
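
To make the idea of a performance/price score concrete, here is a minimal sketch. It assumes the score is simply the performance score divided by the hourly price, which is one common way to combine the two; the paper itself may use a different formula. The function names, instance names, and numbers are all hypothetical placeholders, not results from our tests.

```python
def performance_price_score(performance_score: float, price_per_hour: float) -> float:
    """Performance delivered per unit of hourly cost (higher is better).

    Assumption: a plain ratio. The paper's actual formula may differ.
    """
    if price_per_hour <= 0:
        raise ValueError("price_per_hour must be positive")
    return performance_score / price_per_hour


def rank_instances(instances: dict) -> list:
    """Return instance names ordered from best to worst performance/price.

    `instances` maps a name to a (performance_score, price_per_hour) pair.
    """
    return sorted(
        instances,
        key=lambda name: performance_price_score(*instances[name]),
        reverse=True,
    )


# Hypothetical placeholder values, for illustration only.
example_instances = {
    "single-cpu-example": (100.0, 0.50),
    "dual-cpu-example": (150.0, 1.00),
}

ranking = rank_instances(example_instances)
```

With these placeholder values, the cheaper instance ranks first even though its raw performance score is lower, which is the kind of trade-off a performance/price score is meant to surface.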

