Technically Speaking

The Official Bigstep Blog

AlignAlytics benchmarks Elasticsearch, sees 200% performance improvement

Running queries in no time with Elasticsearch on Full Metal

Every so often we come across a use case that makes every hour of work put into our Full Metal Cloud worth it many times over. Today we’re in the happy position of sharing one of those use cases with you. AlignAlytics ran Elasticsearch queries on 10 million documents (approx. 4 GB of compressed data) and consistently saw a 100-200% performance improvement with Bigstep over their existing dedicated servers. The specs of their machines were quite similar to those of our Full Metal Compute Instances, which makes this one of the closest “apples to apples” comparisons we’ve done.

In fact, here’s what the AlignAlytics team had to say about it:

“We were expecting better performance in the bare metal infrastructure compared to traditional cloud based dedicated servers, but it was incredible to see that performance was twice as good throughout and in some cases even better when dealing with highly complex queries like geo distance calculations.”

Amit Talhan - Senior Developer at AlignAlytics

Test results

Data size: 10 million documents, approx. 4 GB with Elasticsearch compression.

Existing AlignAlytics cluster

| | Node 1 | Node 2 & 3 | Node 4 |
|---|---|---|---|
| ES Allocated RAM | 6 GB | 6 GB | 6 GB |
| Total RAM | 8 GB | 8 GB | 8 GB |
| Disk | 750 GB (SATA) | 250 GB (SSD) | 250 GB (SSD) |
| CPU | Intel Xeon E3-1230 3.3 GHz (4 cores, 8 vCores) | Intel Xeon X3440 @ 2.53 GHz (4 cores, 8 vCores) | Intel Xeon E3-1230 3.3 GHz (4 cores, 8 vCores) |
| Data Node | No | Yes | Yes |
| Search Node | Yes | No | No |
| Network Speed | 100 Mbps | 100 Mbps | 100 Mbps |

Full Metal Cloud Cluster

| | Node 1 – 4 |
|---|---|
| ES Allocated RAM | 6 GB |
| Total RAM | 16 GB |
| Disk | 200 GB (SSD) |
| CPU | 3.3 GHz, 4 Cores |
| Data Node | No |
| Search Node | Yes |
| Network Speed | 4 GbE ports |


[Chart: Multiple terms search]

[Chart: Multiple Terms Aggregations]

[Chart: Multiple Terms Aggregations and a Numeric Histogram]

[Chart: Multiple Terms Aggregations and a Geo Hash Aggregation (precision 5)]

[Chart: Four Tier aggregation with Date, Term, Term and Numeric]
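For readers less familiar with the query types named in the charts, here is a rough sketch of what request bodies of that shape could look like in the Elasticsearch query DSL. The actual AlignAlytics queries and mappings are not published, so every field name (`category`, `brand`, `price`, `location`, `sold_at`) and every interval below is hypothetical. The "Numeric" tier is assumed to be a numeric histogram, and the `interval` key on `date_histogram` follows the syntax of Elasticsearch releases from the time of the benchmark (newer releases use `calendar_interval` instead).

```python
import json


def terms_agg(field, size=10, sub_aggs=None):
    """Build a terms aggregation on `field`, optionally with nested sub-aggregations."""
    agg = {"terms": {"field": field, "size": size}}
    if sub_aggs:
        agg["aggs"] = sub_aggs
    return agg


# Multiple terms aggregations plus a numeric histogram (hypothetical fields).
terms_and_histogram = {
    "size": 0,  # return aggregation buckets only, no document hits
    "aggs": {
        "by_category": terms_agg("category"),
        "by_brand": terms_agg("brand"),
        "price_ranges": {"histogram": {"field": "price", "interval": 50}},
    },
}

# Multiple terms aggregations plus a geohash grid at precision 5.
terms_and_geohash = {
    "size": 0,
    "aggs": {
        "by_category": terms_agg("category"),
        "by_brand": terms_agg("brand"),
        "cells": {"geohash_grid": {"field": "location", "precision": 5}},
    },
}

# Four tiers nested inside each other: date -> term -> term -> numeric.
four_tier = {
    "size": 0,
    "aggs": {
        "per_month": {
            "date_histogram": {"field": "sold_at", "interval": "month"},
            "aggs": {
                "by_category": terms_agg("category", sub_aggs={
                    "by_brand": terms_agg("brand", sub_aggs={
                        "price_ranges": {
                            "histogram": {"field": "price", "interval": 50}
                        },
                    }),
                }),
            },
        },
    },
}

if __name__ == "__main__":
    print(json.dumps(four_tier, indent=2))
```

Each extra tier multiplies the number of buckets the cluster has to compute, which is one plausible reason the more deeply nested queries are the interesting ones to benchmark.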


The Story behind the results

“As an analytics solutions provider our team of data scientists performs various types of analysis on a variety of large amounts of data. This deep and wide-ranging analysis is what facilitates our discovery of actionable insights for our clients in order to solve their most critical business challenges and enable confident decision making. To be able to fulfil these analysis requirements and deliver the best results, we had to move away from traditional SQL to unstructured data, where Elasticsearch was best suited. As the data size and complexity of the queries increased, it was clear to us that infrastructure mattered and we needed to ensure the best performing setup for running our Elasticsearch cluster. This led to the performance benchmarking exercise which confirmed that Bigstep’s Full Metal Cloud can provide more than twice the performance of regular dedicated servers and therefore empower us to better execute our analysis and more rapidly deliver valuable insights to our clients.”

Amit Talhan - Senior Developer at AlignAlytics

Because the results were consistently 100-200% better than on their existing infrastructure, AlignAlytics’s technical team came back asking for an explanation. They might have expected such a gap in a virtualized environment, where hardware is oversold and there are noisy neighbors. But they were working with dedicated servers specifically to avoid those problems, and they were using local SSD storage to avoid any I/O bottlenecks. So how, they asked, could a bare metal cloud outperform dedicated servers with local SSD drives?

Here are the usual suspects we consider responsible for the difference in performance:

  • Wire-speed network

    Our wire-speed bare metal network ensures that clients have the smallest physically possible network latency – as all switching happens at the hardware level. This means that connectivity between machines and to the storage is excellent, so much so that even working with local disks might not compensate for the difference.

  • Hand-picked components

    Even with hardware, components are not created equal. Memory frequency can vary greatly and, although usually underestimated, takes quite a toll on performance. Up to 20% more performance can be achieved from the same setup, simply by increasing memory frequency as shown in one of our previous performance benchmarks.

  • All-SSD storage based on enterprise drives

    As in the case of memory, not all SSD drives perform equally. For instance, lower end drives provide good performance for reading but not for writing. In fact, it is well documented that writing to SSDs can be quite slow. That’s why even some SSD based systems can achieve sub-optimal performance overall.
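To put the network point in numbers, here is a back-of-the-envelope sketch of how long it takes just to push a response between nodes at the two link speeds listed in the test results. The 1 MB payload is a hypothetical figure, and "4 GbE ports" is read conservatively as a single 1 Gbps link being used. The calculation ignores switching latency and protocol overhead, so it is a lower bound and not a measurement.

```python
def transfer_time_ms(payload_bytes, link_mbps):
    """Ideal time in milliseconds to serialize a payload onto a link of the given speed."""
    bits = payload_bytes * 8
    return bits / (link_mbps * 1_000_000) * 1000


# Hypothetical 1 MB partial result sent from a data node to the search node.
PAYLOAD_BYTES = 1_000_000

existing_ms = transfer_time_ms(PAYLOAD_BYTES, 100)     # 100 Mbps link
full_metal_ms = transfer_time_ms(PAYLOAD_BYTES, 1000)  # one 1 Gbps port (assumed)

if __name__ == "__main__":
    print(f"100 Mbps: {existing_ms:.1f} ms")
    print(f"1 Gbps:   {full_metal_ms:.1f} ms")
```

Under these assumptions the same payload needs about 80 ms on the 100 Mbps link and about 8 ms on a single gigabit port, and a distributed query typically pays that cost once per node that contributes to the result.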


The conclusion

The main takeaway from AlignAlytics’s findings is to never take anything for granted. Thanks in particular to the cloud’s pay-per-hour billing model, it has become affordable to test several providers and setups before deciding where you want to invest your infrastructure budget. Of course these tests take time and the comparisons aren’t always like for like. But, if nothing else, you’ll have a much better understanding of the strong and weak points of the system you’re building. That’s valuable knowledge when you find yourself having to scale or having to predict infrastructure costs realistically.

As we found in our testing with AlignAlytics, not everything labeled SSD really improves performance, local drives aren’t always better and what’s apparently the same 8 GB of RAM can perform very differently across providers. Nothing compares to getting your hands on a setup and testing it with your applications.
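If you want to try this kind of comparison yourself, a minimal timing harness could look like the sketch below. It is not the tooling used in the AlignAlytics benchmark: the function names are ours, the workloads are stand-ins, and `improvement_pct` assumes that "100-200% better" is meant in throughput terms, so that finishing in half the time counts as +100% and in a third of the time as +200%.

```python
import statistics
import time


def time_runs(fn, runs=20, warmup=3):
    """Call fn() `warmup` times untimed, then return `runs` wall-clock durations in seconds."""
    for _ in range(warmup):
        fn()
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def improvement_pct(baseline_s, candidate_s):
    """Throughput improvement of the candidate over the baseline, in percent."""
    return (baseline_s / candidate_s - 1) * 100


def compare(baseline_fn, candidate_fn, runs=20):
    """Compare median run times of two callables, e.g. the same query on two clusters."""
    base = statistics.median(time_runs(baseline_fn, runs))
    cand = statistics.median(time_runs(candidate_fn, runs))
    return {
        "baseline_s": base,
        "candidate_s": cand,
        "improvement_pct": improvement_pct(base, cand),
    }


if __name__ == "__main__":
    # Stand-in workloads; replace each with a call that runs your real query
    # against one of the setups you are evaluating.
    def existing_setup():
        return sum(range(300_000))

    def new_setup():
        return sum(range(100_000))

    print(compare(existing_setup, new_setup))
```

Warm-up runs and medians are used here so that cold caches and the occasional slow outlier do not dominate the comparison.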

Fabien Wernli
29.10.2014 14:41

Did you also benchmark indexing?
