The third EMA / 9sight Big Data Survey was conducted in late 2014, with the results published in April 2015. Beyond Big Data itself, the survey also addressed data lakes, data-driven cultures, and the Internet of Things, giving a comprehensive view, drawn from 351 respondents, of current thinking on Big Data in its broadest definition.
Since its inception, the concept of “big data” has been widely associated with a single data management platform: Hadoop. This connection may stem from Hadoop’s popularity for storing and processing large amounts of multi-structured data. However, Hadoop is not the only answer to the question of “what is big data?”
As was explored in the inaugural EMA/9sight survey in 2012, big data is both a way to look at new sources of data and how organizations place that information under management. Big data has attracted a wide range of application innovators as well as many protesters against the dominance of relational databases and data warehouses. The EMA/9sight surveys use a deliberately broad definition of big data to inspire end users to think beyond the box of Hadoop. The survey explores the wide range of ways in which non-traditional data, often in combination with more traditional types, has enabled new or improved business processes. As was established in previous studies in 2012 and 2013 and again in 2014, big data offers a wide range of possibilities, but the name “big data” itself keeps media and industry eyes focused on size as the defining feature.
With the 2015 report, these previous observations continue to hold true. However, there is an evolving market in which size is not everything, and speed, in all its aspects, has grown in importance for respondents. Furthermore, respondents continue to include a wide range of data structures, from highly irregular to strongly modeled, within the scope of their projects. This refocusing of implementers’ attention on speed and structure reduces overall growth in big data by some measures and requires EMA/9sight to explore how considerations of speed and structure are changing market dynamics. The 2014 survey also included investigations into the highly visible topics of data-driven cultures, the Internet of Things, and data lake architectures.
Relating to data-driven cultures, respondents with larger percentages of big data in their organizations were also more likely to have adopted data-driven strategies and vice versa. In addition, there will be significant growth in the area of workforce access to data-driven initiatives over the next year.
If any term eclipsed big data in attention and hype in 2014, it must have been Internet of Things (IoT) or one of its synonyms, such as Internet of Everything or Industrial Internet of Things. Suddenly, it seemed that IoT was the acronym on everybody’s lips, as predictions of enormous growth and high economic impact became commonplace. However, the devices themselves are only one side of the equation. On the other side are the data delivered to the enterprise and the analytics required to understand their significance, both topics of key importance to survey respondents.
The concept of data lake architectures is also gaining traction in big data initiatives. Externally sourced big data, including data from the Internet of Things, clearly demands a data management store, as information arrives at high speed and in significant volumes. This store must be large, flexible, and cost-effective. However, expanding the scope of the data lake to include business-critical data and legally binding core business information could pose issues for organizations unless they master their data management practices first.
In 2012, EMA defined a next-generation data management architecture—the Hybrid Data Ecosystem. The HDE has been refined over the past three years. Each of the platforms within the Hybrid Data Ecosystem supports a particular combination of business requirements along with operational or analytical processing challenges. The HDE represents a unique approach compared with traditional best practices. Rather than advocating a single data store that supports all business and technical requirements at the center of its architecture, the Hybrid Data Ecosystem seeks to determine the best platforms for supporting a particular set of requirements and links those platforms together. This year the HDE expands to include the influence and impact of the cloud on big data environments and data consumers.
The 2014 EMA/9sight Big Data research surveyed 351 business and technology stakeholders around the world. The survey instrument was designed to identify key trends surrounding the adoption, expectations, and challenges associated with strategies, technologies, and implementations of big data initiatives. The research identified the following highlights, with comparisons to past EMA/9sight studies in 2012 and 2013.
• Growing Number of Projects – In 2014, almost two-thirds of organizations had three or more big data projects, and more than 20% reported five or more projects in progress.
• IoT Importance – Nearly 50% of respondents indicated that the Internet of Things was currently adopted and an important or essential part of business.
• Data Lakes: An Acquired Taste – Over 22% of respondents said a data lake is currently adopted as part of the strategy to replace existing operational and informational platforms.
• Big Data Is a Maturing Strategy – Over 55% said that a big data strategy was Adopted and Essential or Adopted and Important.
• Data-Driven Is Driving Organizations – Almost 63% of respondents included data-driven strategies in their organization at a significant level.
• Speed Is Driving Competition – Speed of processing response was the most frequently indicated use case by respondents at nearly 20%.
• What’s Stopping Big Data? – Poor data management practices, like lack of data governance and not supporting SLAs, are the top obstacles for organizations implementing a big data initiative.
• Time to Value with Applications – Over 20% of respondents chose customizable applications from an external provider as their big data implementation strategy.
• Revenue-Generating Big Data Projects – The highest percentage of projects tackle market basket analysis and cross-sell/up-sell activities, targeting improvements to top-line revenue.
• Continued Importance of Analytics in Business Processes – Operational analytics leads all other categories, with 42% of respondents stating they were executing projects around this workload to minimize cost.
• Business Stakeholders Lead Consumption Again – Line of business executives are the largest group of data consumers. Marketing and financial analysts are the second largest group of users.
• Looking Outside the Skills Box – Over 20% of respondents utilize technical consulting services from external organizations, providing a significant opportunity for consulting groups.
• Partly Cloudy Implementations – Nearly 60% of big data projects have a primary implementation element in the cloud: private, public, hybrid, or delivered as a managed service.
• Low Latency is High Profile – Big data projects lean strongly toward low latency, with over 32% described as real time/near-real time processing of data.
• Two-time Use Case Champion – For the second year in a row, the top use case for big data initiatives is speed of processing response at nearly 20% of mentions.
• Financial Driver: Managing Maintenance – The primary financial driver was that organizations wanted to reduce maintenance costs associated with their existing data management implementations.
• What’s in Big Data? – Machine-generated data dominates the data types used in big data initiatives, with over 40% of respondent mentions.
• How Is Big Data Moved? – The top three integration strategies were real-time streaming data integration, data replication (e.g., standardized duplication of enterprise data sources), and change data capture.