Big Data - Use it!
Why do you need data?
It’s popular, it’s of the moment. It’s already in the dictionary. But only one thing makes the difference: using Big Data in a smart way.
Now, data means all sorts of things, from raw demographics to order frequency, product preferences and many others. Much of the data we can come up with takes a long time to extract and may not be actionable. You may not be able to do anything with it, except maybe understand your customers a little better, and that might not be worth the effort.
Dashboards for executives, on the other hand, have proven to be a valuable tool. They help you keep a closer eye on the business. And no, accounting reports won’t do unless you get them every day and they come with graphs so that you can see trends.
But if you have that, why wouldn’t you automate it?
What an executive dashboard can look like
The sample below gives you an idea of what an executive dashboard can look like.
The idea is that you can keep an eye on the evolution of a specific product option over a period of time. You can understand that although it used to be a good product, it is now slowly going out of fashion. And this is just basic info that you can get.
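To make the idea concrete, here is a minimal sketch of the calculation behind such a chart: units sold per month for one product option, plus a crude check for a steady decline. The order records, the option names and the helper functions are all made up for illustration; a real dashboard would read from your own order data.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order records: (order date, product option, units sold).
orders = [
    (date(2023, 1, 14), "red", 40),
    (date(2023, 1, 20), "blue", 10),
    (date(2023, 2, 3), "red", 31),
    (date(2023, 2, 17), "blue", 12),
    (date(2023, 3, 9), "red", 22),
    (date(2023, 3, 28), "blue", 15),
]


def monthly_units(records, option):
    """Sum units sold per (year, month) for one product option."""
    totals = defaultdict(int)
    for day, opt, units in records:
        if opt == option:
            totals[(day.year, day.month)] += units
    # Sort by (year, month) so the series reads chronologically.
    return dict(sorted(totals.items()))


def is_declining(series):
    """True if every month sold fewer units than the month before."""
    values = list(series.values())
    return len(values) > 1 and all(b < a for a, b in zip(values, values[1:]))


red = monthly_units(orders, "red")
print(red, is_declining(red))
```

A strict month-over-month drop is a deliberately simple rule. In practice you would probably smooth the series or compare longer periods before declaring a product out of fashion.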
Getting the data
Getting this data is a lot more complicated than one might think. Even if you’re an engineer who can run ad-hoc SQL queries, dump databases, or create reports and web interfaces to expose the data, you will still usually face a few challenges:
1. You never have data sitting around. It’s locked up so tightly that you need clearance just to find out where it actually is.
2. Data is spread around the organization in many systems. For the past 20 years, architectures have been distributed, and software developers have built systems out of many independent modules. This also means you have data in many different databases with different formats.
3. Data is huge. But huge doesn’t mean hundreds of terabytes of data. Even a gigabyte of MySQL data with 50 tables puts a lot of stress on the machine on which a single query runs. The reasons for this situation are the nature of the specific queries and the lack of optimization.
4. Building the queries is hard. You have to understand the database schema and see how X relates to Y. For instance, you might have a table Customer, a table Product and a table Product Type. You have to check how they relate, identify the foreign keys and create joins accordingly. If you have many franchises and have sold this product in many countries, then you have to perform even more complex filtering.
5. Building dashboards is time-consuming. You have to constantly adjust the reports as you add products and create one-time events. It is a constant work in progress.
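The join problem from point 4 can be sketched with Python’s built-in sqlite3 module. Everything about the schema here is an assumption made for illustration: the column names, the sample rows and the extra Purchase table that links customers to products are not taken from any real system.

```python
import sqlite3

# In-memory database with a hypothetical schema and a few sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProductType (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Product (id INTEGER PRIMARY KEY, name TEXT,
                      type_id INTEGER REFERENCES ProductType(id));
CREATE TABLE Customer (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE Purchase (customer_id INTEGER REFERENCES Customer(id),
                       product_id INTEGER REFERENCES Product(id),
                       quantity INTEGER);
INSERT INTO ProductType VALUES (1, 'Shoes'), (2, 'Hats');
INSERT INTO Product VALUES (1, 'Runner', 1), (2, 'Boot', 1), (3, 'Cap', 2);
INSERT INTO Customer VALUES (1, 'Ana', 'RO'), (2, 'Ben', 'UK');
INSERT INTO Purchase VALUES (1, 1, 2), (1, 3, 1), (2, 2, 5), (1, 2, 1);
""")


def units_by_type(connection, country):
    """Units sold per product type to customers in one country."""
    query = """
        SELECT pt.name, SUM(pu.quantity)
        FROM Purchase pu
        JOIN Customer c     ON c.id  = pu.customer_id  -- who bought
        JOIN Product p      ON p.id  = pu.product_id   -- what was bought
        JOIN ProductType pt ON pt.id = p.type_id       -- which type it is
        WHERE c.country = ?
        GROUP BY pt.name
        ORDER BY pt.name
    """
    return connection.execute(query, (country,)).fetchall()


print(units_by_type(conn, "RO"))
```

Even this toy report needs three joins and a filter. Each additional dimension, such as franchise or time period, adds another table or condition to get right.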
The solution isn’t necessarily Big Data as you might expect. It’s a combination of things:
1. Building a secure data warehouse, separate from production databases, to consolidate data from many places. This is a fairly common solution. When you do this consolidation, though, you are pulling together many formats and many databases accumulated over many years, so you end up with huge archives of data, mostly junk and old versions of databases. So yes, it’s big data, meaning big piles of obsolete data.
2. Hiring developers with some knowledge of statistics, or a developer and a statistician to build reports whenever they are needed.
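As a rough sketch of what the consolidation in point 1 involves, the example below normalizes sales rows from two source systems into one common format before they are loaded into the warehouse. Both source formats, all field names and the helper functions are hypothetical; real systems will differ in their own ways.

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical exports from two systems that record the same kind of sale
# in different formats.
shop_rows = [{"sold_on": "03/02/2023", "sku": "A1", "amount_cents": 1999}]
erp_rows = [{"date": "2023-02-04", "product_code": "a1", "total": "5.50"}]


def from_shop(row):
    """Shop system: day/month/year dates, amounts already in cents."""
    return {
        "date": datetime.strptime(row["sold_on"], "%d/%m/%Y").date().isoformat(),
        "sku": row["sku"].upper(),
        "amount_cents": row["amount_cents"],
    }


def from_erp(row):
    """ERP system: ISO dates, lowercase codes, amounts as decimal strings."""
    return {
        "date": row["date"],
        "sku": row["product_code"].upper(),
        # Decimal avoids floating-point rounding when converting to cents.
        "amount_cents": int(Decimal(row["total"]) * 100),
    }


# One consolidated, chronologically ordered list in a single format.
warehouse = sorted(
    [from_shop(r) for r in shop_rows] + [from_erp(r) for r in erp_rows],
    key=lambda r: r["date"],
)
print(warehouse)
```

Every source system needs its own small adapter like these, which is part of why the consolidated archive grows large and why old formats tend to linger in it.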
Performance is a big concern. You can’t buy an Exadata just because you need to run a few queries now and then. But the amount of data tends to scale with the size of the business. So you might need to invest more in computing infrastructure.
When your queries and data forays suddenly require their own constantly evolving infrastructure, that’s when you’re probably close to what has been labeled Big Data.
Big Data as a service is our next topic. So stay tuned.