May 2011

Sponsored by WhereScape USA, Inc.


This paper first explores the issues that plague data warehouse development projects and the most common trade-offs made by vendors and developers. It then describes one compelling solution, the concept of data-driven design, and provides a number of recommendations on how data warehouse design and population activities can best be structured for maximum accuracy and reliability in estimating project scope and schedule.

Abstract

The data warehouse has now been with us for a quarter of a century. Its architecture and infrastructure have remained largely stable over that period, and a range of methodologies for designing and building data warehouses and data marts has evolved over the years. And yet one question is asked time after time: “Why is it so difficult to accurately and reliably estimate the size and duration of data warehouse development projects?”

This paper first explores the issues that plague data warehouse development projects and the most common trade-off made by vendors and developers: choosing between speed of delivery and consistency of the information delivered. The conclusion is simple: this trade-off is increasingly unproductive. Advances in business needs and technological functionality demand that data warehouses and marts be delivered with both speed and consistency, and with reliable estimates of project size and duration.

One compelling solution to these issues emerges from taking a new look at the process of designing and building data warehouses and marts from a very specific viewpoint: data and the specific skills needed to understand it. This viewpoint leads to the concept of data-driven design and to a number of key recommendations on how data warehouse design and population activities can best be structured for maximum accuracy and reliability in estimating project scope and schedule.