Technically Speaking

The Official Bigstep Blog

 

Hocus Pocus: Is it Data Science, or Is It Magic?

Back in the mid-1990s, IBM built a machine called Deep Blue. In 1996, Deep Blue beat the reigning world chess champion, Garry Kasparov, in a game played under standard tournament time controls. The world was, and continues to be, awed by this feat. To those outside the world of computers and AI, it seems like, well, magic.

But what if Deep Blue were to play backgammon? Or checkers? Or perhaps poker? In reality, Deep Blue likely wouldn’t do so well. That’s because the computer was built specifically to do well at the game of chess. Unlike people, who can excel at a number of different things, computers like this aren’t that flexible. They are designed to do one thing, and they do that thing incredibly well. They just can’t apply that skill elsewhere, the way a person could easily take their strategy skills from chess and apply them to Risk or Yahtzee or Scrabble.

In fact, many of the things big data, data science, and AI or machine learning are capable of seem like a black art at the least and utter witchcraft at the most. Big data is helping to cure cancer, connect you with like-minded people to hang out with, stock stores with the most popular items before a big storm, and even fit women with better bras. Yeah, some of that definitely sounds like magic.

Though data scientists are highly sought after and hard to find, what they do isn’t magic at all. In fact, the daily life of the average data scientist is rather mundane. Here’s what it takes to make all those mountains of data conjure up magically useful insight.

Data Science is a Lot of Meetings & Research

As much as half of a data scientist’s time isn’t spent working with the data or stirring a cauldron to make data say something useful. It’s spent in meetings, trying to determine what needs to be found and what data holds the potential answers.

According to data scientists in the trenches, the process usually begins with a meeting to determine exactly what questions or problems the data needs to answer. This isn’t always so cut and dried. After meeting with the customer, whether internal to the company or external, the data scientist usually has to delve into more research to fully understand the issues.

But learning about the issues is just the first step. The data scientist then has to figure out what data holds the answer and, in many cases, how exactly to get that data. Sometimes, the data needed to answer the question isn’t just sitting around in the Hadoop infrastructure. A source for the data has to be found, as well as a means of getting it. All that is done before the data scientist can even begin to query the data.

Data Science is a Lot of Number Crunching

After the problem is well understood and the data is in place, the data scientist can begin tossing algorithms at it until the answers begin to appear.

The data scientist has to develop the right algorithm(s) for getting the answers needed. Speaking to InfoWorld magazine, one data scientist estimated that about half of her time is spent meeting with people to learn what they need out of data analytics. Another 20 percent is spent on the actual computations necessary to glean the findings.

Data Science Involves a Lot of Time Interpreting the ‘Findings’

As much as 30 percent of the data scientist’s time can be spent just interpreting the findings. Data doesn’t come out in usable form. It takes hours of analyzing the results and putting them into a useful format so that people can understand what the analytics have discovered. Data scientists call this data visualization.
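To make "putting results into a useful format" a little more concrete, here is a minimal sketch in Python that turns a handful of raw numbers into a readable text bar chart. The figures are the rough time split quoted in this article (50, 20, and 30 percent). The function name `text_bar_chart` and the chart layout are illustrative assumptions, not a tool any particular data scientist uses.

```python
# Rough time split quoted in this article (percent of a data scientist's time).
time_split = {
    "Meetings and research": 50,
    "Computation": 20,
    "Interpreting findings": 30,
}


def text_bar_chart(shares, width=40):
    """Return one text line per category, with a bar scaled to its share of the total.

    Hypothetical helper for illustration; values are assumed to be percentages.
    """
    total = sum(shares.values())
    label_width = max(len(label) for label in shares)
    lines = []
    # Largest share first, so the most important row is read first.
    for label, value in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
        bar = "#" * round(width * value / total)
        lines.append(f"{label:<{label_width}}  {bar} {value}%")
    return lines


for line in text_bar_chart(time_split):
    print(line)
```

Real projects would more likely reach for a plotting library, but the idea is the same: the numbers do not change, only how easy they are to read.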

Before data scientists can even begin assembling data or running analytics, they have to build a flexible, scalable, practical, and secure data infrastructure. That’s where Bigstep can help. For a limited time, you can try the first Bare-Metal Data Lake as a Service in the world, with 1TB free for life (limited to 100 applicants).
