5 Rules for Avoiding a Big Crash & Burn with Big Data
In the wide world of racing, the goal is to field a car capable of zooming around the track at optimal speeds, avoiding a fiery crash into the wall or another car for 400-500 miles, and crossing the finish line. Sure, every driver wants to win the race, but keeping the car in one piece and on the track is actually more important, because race teams aren’t just going for the race trophy; they’re going for the championship. Staying in the race (even when you don’t win) earns the points necessary to put the car and driver in the best possible position to take it all.
Big data is much the same. Of course, the team wants each analysis to yield valuable insight the company can use to make more money, eliminate waste, or find a new opportunity. But the overall goal is to keep the program on the track and out of the wall. A program that stays in the race will eventually yield consistently meaningful results: the championship of big data analytics.
Also like racing, the best car on the field and the best wheelman alive aren’t enough. You can land that brilliant data scientist and hit the mother lode of data, but without the rest of the team and infrastructure in place, there won’t be any trophies to brag about. Here’s how to keep your big data project out of a fiery crash that could end your season in the least glamorous way.
1. Keep the Engine and Fuel Lines Clean
Big data teams often put all their focus on offloading the data into Hadoop (or your big data infrastructure of choice) and digging into the analytics. Unfortunately, without good data cleansing and data governance in place, you’ll be working with dirty data. Like a car running on dirty race fuel, you won’t get far. Take the time, effort, and expense to cleanse the data and establish good data governance policies and procedures so that you can depend on the analytical results you achieve.
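To make "cleansing" a little more concrete, here is a minimal Python sketch of the kind of checks a team might run before loading records into an analytics store. The field names (customer_id, email) and the cleanse function are hypothetical, chosen only for illustration; a real pipeline would follow whatever rules your own governance policy defines.

```python
# Hypothetical required fields; a real list would come from your governance policy.
REQUIRED_FIELDS = ("customer_id", "email")


def cleanse(records):
    """Return records with whitespace trimmed, incomplete rows dropped,
    emails lowercased, and duplicates removed (first occurrence wins)."""
    seen = set()
    clean = []
    for record in records:
        # Trim stray whitespace from every string value.
        normalized = {
            key: value.strip() if isinstance(value, str) else value
            for key, value in record.items()
        }
        # Drop rows where a required field is missing or empty.
        if any(not normalized.get(field) for field in REQUIRED_FIELDS):
            continue
        # Normalize case so "A@Example.com" and "a@example.com" match.
        normalized["email"] = normalized["email"].lower()
        # Deduplicate on the normalized key.
        dedup_key = (normalized["customer_id"], normalized["email"])
        if dedup_key in seen:
            continue
        seen.add(dedup_key)
        clean.append(normalized)
    return clean


if __name__ == "__main__":
    raw = [
        {"customer_id": " 42 ", "email": "A@Example.com "},
        {"customer_id": "42", "email": "a@example.com"},
        {"customer_id": "", "email": "x@example.com"},
    ]
    print(cleanse(raw))
```

The point of the sketch is the order of operations: normalize first, then validate, then deduplicate, so that near-identical records are caught before they skew the analysis.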
2. Don’t Broadcast Important Info Over an Open Channel
If race teams tip off other teams about their secrets, it can cost them their competitive advantage. Teams often use code talk to express what they want to keep hidden. For example, “Is there debris on the front stretch?” might mean, “How am I doing on fuel?” Always remember that your data isn’t just valuable to you; it would also be a goldmine for competitors, identity thieves, and perhaps even foreign governments. Data security policies, backed by strong encryption for data at rest and in transit, are essential to keep your project out of the news for all the wrong reasons.
3. Don’t Pull Out of the Garage Before the Crew is Finished
During practice sessions in preparation for a big race, some cars hit the track first thing. Others hang around the garage and give their teams time to get everything perfect before pulling out. Big data is the same way. Eventually, you’ll want the data to be readily available for everyone, but it’s important to give your data team time to fine-tune everything before rolling out. A project that comes out a month later but is perfectly honed is far better than an early project that doesn’t do what it’s supposed to do.
4. Remember, Everyone Shares Ownership of the Trophy
After a race, the winning driver doesn’t hit victory lane by himself and get the trophy and all the champagne and every one of the kisses from the pretty girls. The whole team is there. You need to approach your big data the same way. Once your big data team has worked the kinks out and prepared the data for consumption by your applications and users, it’s everyone’s reward. Data that’s held too tightly won’t ever deliver the ROI it would if it were made readily accessible across the organization.
5. Be Careful How You Approach the Driver With Your Analytical Insights
By the time a driver makes it to the big leagues of Formula 1 or NASCAR, they’ve put decades of their life into learning how to race. They know what line they prefer (high or low on the track), how they like the driver’s seat set up, and how they want their mirrors positioned. They tend to resist a crew chief telling them that they could gain 1/10th of a second by running a lower line on the track.
Similarly, when you bring big data into your business, you’re dealing with business people who have spent many years mastering their fields. Telling them that the data indicates they should be doing something differently can imply that they’ve been doing it wrong their whole careers. They have knowledge, skills, talent, and most likely some ego at stake. Keep this in mind when you deliver your analytical results; a little sensitivity can prevent your hard work from being rejected outright.
Ready to get started? You can now get a powerful, secure, high-performance data lake as a service. It’s the ideal way to get your big data initiative set up right so that your project doesn’t crash and burn.