Expert Interview with Mark Csernica on the Future of Big Data for Bigstep
The future of Big Data is even Bigger Data, according to Mark Csernica, technology analyst for Mind Commerce, a research and consulting firm that supports technology and telecommunications companies.
To clarify, Mark offered us a glimpse at the expected scope of data growth: the world’s technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s, and as of 2012, 2.5 quintillion (2.5×10¹⁸) bytes of data were being created every day. It is estimated that by 2020, there will be 20 zettabytes (20×10²¹ bytes) of data available.
To accommodate the information explosion, Mark says that Big Data technology and techniques will have to keep evolving.
“The Big Data technologies that are working today might have difficulties in processing the data files/sets of the future, which will continue to grow,” he says. “I expect new data processing paradigms to be developed in order to process larger and larger data sets in reasonable time frames.”
Here, Mark offers his thoughts on why businesses should care about Big Data and what they can do about it.
Tell us about Mind Commerce ... what services do you offer? Who should be using them?
Mind Commerce provides research and consulting services covering digital technologies and the telecommunications industry. Our reports provide analysis of technologies and emerging business opportunities. For clients looking for more specific detail, Mind Commerce provides follow-up consultation and advisory services on an as-needed basis. The Mind Commerce customer base includes technology companies, service providers, enterprises, government agencies and NGOs.
What seem to be the most common problems your clients hope you’ll help them solve?
Our clients are looking for trends, projections, and analysis on technologies, infrastructure, devices, applications, and services. They want to understand the business models involved and the emerging business opportunities they can take advantage of.
Why is Big Data so important to the growth and survival of businesses today?
Big Data technologies matter because businesses need to identify and take advantage of business intelligence: gems of business information scattered across the vast expanse of a business’s own data repositories and the Internet, in databases, blogs, articles, analyses, surveys, and so on. These gems are currently hidden, but once found they become a source of market intelligence and customer preferences, allowing a business to spot market trends, assess the strengths and weaknesses of its product line, and identify tactics to gain competitive advantage.
But there are technical challenges with Big Data. The data files and sets to be collected are so numerous, so large and so complex that it becomes difficult to process them in reasonable time frames using existing database management tools and traditional data processing applications and techniques.
How can businesses better harness all the data available to them today?
For businesses to better harness the huge volumes of data available to them, they must develop the ability to discover unique insights in that data. That is, a business needs to be able to ask new questions, formulate new hypotheses, and explore and discover how it plans to use the data available to it.
Ultimately, a big part of a business’s Big Data effort is applying new analytic techniques to either new data or data that has been combined in new ways, and deciding how to analyze that data. Big Data technologies offer various approaches to capturing, storing, searching and analyzing tremendous volumes of data in reasonable time frames. Identifying and implementing the technology that best fits a business’s needs is a key component of its Big Data strategy.
Organizations that address these areas will have the ability to take advantage of business opportunities, minimize risks and control costs.
What are some of your favorite tools, resources or techniques for managing and analyzing available data?
A technique I find interesting is using Cloud technologies and paradigms to provide the multi-processing environment needed to perform analytics on streaming Big Data. This is achieved by setting up IaaS environments and running the analytic programs on virtual machines. The objective of this approach is to define a system that offers modular growth and expansion. These techniques center on taking the data that was just generated, performing analytics on the data you need, discarding the remainder, and saving only the analytic results.
This technique appeals to me because the data is not saved first and analyzed afterward, as in traditional analytic processing. Performing real-time analytics on the source data saves disk storage and processing time. But there is a downside: the analytic programs must run 24/7, which requires system redundancy to address outages and downtime.
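The stream-and-discard pattern Mark describes can be sketched in a few lines. This is a minimal illustration, not anything from Mind Commerce or a specific product: record fields (“key”, “value”) and the aggregates kept are hypothetical choices; the point is that each record is folded into running statistics and then dropped, so only the analytic results are ever stored.

```python
from collections import defaultdict


class StreamAggregator:
    """Folds each incoming record into running per-key aggregates.

    Raw records are never retained -- only counts and sums survive,
    mirroring the 'analyze, discard, save results' approach.
    """

    def __init__(self):
        self.counts = defaultdict(int)    # records seen per key
        self.totals = defaultdict(float)  # running sum per key

    def ingest(self, record):
        key = record["key"]
        self.counts[key] += 1
        self.totals[key] += record["value"]
        # record goes out of scope here; nothing but aggregates is kept

    def mean(self, key):
        return self.totals[key] / self.counts[key]


# Simulated stream of freshly generated records (hypothetical data).
stream = (
    {"key": "net", "value": 10.0},
    {"key": "net", "value": 20.0},
    {"key": "app", "value": 5.0},
)

agg = StreamAggregator()
for rec in stream:
    agg.ingest(rec)

print(agg.mean("net"))  # 15.0
```

In a real deployment the `for` loop would be a long-running consumer on a message queue across several virtual machines, which is where the 24/7 and redundancy requirements Mark mentions come in.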
What do you think are the most common mistakes businesses make in managing the data they collect?
The most common mistake businesses make is not defining and maintaining a clear and precise Big Data strategy. They skip this step and just try to implement solutions. Businesses need to define how they plan to use the data available to them to run their business, including what data they need and how they want to analyze and use it.
What should businesses be doing to adapt to the age of Big Data?
Minimally, businesses need to define and maintain a clear and precise Big Data strategy that spells out how they plan to use the data available to them to run their business, including what data they need and how they want to analyze and use it. Because Big Data is still an emerging technology, that strategy also needs to include a plan for migrating to Big Data technologies.
What are some of the most interesting ways you’ve seen businesses and organizations leveraging Big Data to their benefit?
AT&T is implementing the Cloud technologies technique I mentioned above to make the data generated in its networks (eight to nine terabytes daily) available internally to its operations and to clients. Making this data available to clients allows AT&T to monetize that information.