September 2013

Sponsored by Teradata Corporation

Collecting and using machine-generated data as name-value pairs is often assumed to require Hadoop or other NoSQL tools. This paper shows how it can be done effectively in a relational database, based on work done at eBay.

Abstract

Machine-generated data is the coming wave of big data. It will dwarf current experience in both volume and velocity. Furthermore, in content, structure, and business use, it is very different from the social media big data most prevalent today. This paper describes how to unlock the value of machine-generated data, inspired by the approach taken by one of its larger users, eBay.

The two defining characteristics of machine-generated data are its variability during both definition and production, and its semi-structured nature. These characteristics lead directly to name-value pairs (NVPs) as the most appropriate and useful format for such data. The first half of this paper explores the characteristics of name-value pair data by way of examples from three industries and positions such data within the broader scope of all information, including big data, used by business today.

We then explore the approach taken by eBay to storing and processing name-value pair data. We explain the justification for choosing relational database technology as the foundation and include sample SQL snippets that demonstrate how sensor data is easily transformed into analytics-friendly relational tables using Teradata v14.0 functionality.

Finally, we position machine-generated and other data types within Teradata’s Unified Data Architecture, an overarching vision of where information and data will increasingly be situated in the future.