Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data / Edition 1

by Ralph Kimball, Joe Caserta
     
 

Overview

The single most authoritative guide on the most difficult phase of building a data warehouse

The extract, transform, and load (ETL) phase of the data warehouse development life cycle is far and away the most difficult, time-consuming, and labor-intensive phase of building a data warehouse. When it is done right, companies can maximize their use of data storage; when it is not, they can end up wasting millions of dollars storing obsolete and rarely used data. Bestselling author Ralph Kimball, along with Joe Caserta, shows you how a properly designed ETL system extracts the data from the source systems, enforces data quality and consistency standards, conforms the data so that separate sources can be used together, and finally delivers the data in a presentation-ready format.

Serving as a road map for planning, designing, building, and running the back room of a data warehouse, this book provides complete coverage of proven, timesaving ETL techniques. Beginning with a quick overview of ETL fundamentals, it then looks at ETL data structures, both relational and dimensional. The authors show how to build useful dimensional structures and provide practical examples of the techniques.

Along the way you’ll learn how to:

  • Plan and design your ETL system
  • Choose the appropriate architecture from the many possible options
  • Build the development/test/production suite of ETL processes
  • Build a comprehensive data cleaning subsystem
  • Tune the overall ETL process for optimum performance

Product Details

ISBN-13: 9780764567575
Publisher: Wiley
Publication date: 10/28/2004
Edition description: New Edition
Pages: 528
Sales rank: 263,897
Product dimensions: 7.44(w) x 9.22(h) x 1.08(d)

Table of Contents

Acknowledgments.

About the Authors.

Introduction.

Part I: Requirements, Realities, and Architecture.

Chapter 1: Surrounding the Requirements.

Chapter 2: ETL Data Structures.

Part II: Data Flow.

Chapter 3: Extracting.

Chapter 4: Cleaning and Conforming.

Chapter 5: Delivering Dimension Tables.

Chapter 6: Delivering Fact Tables.

Part III: Implementation and Operations.

Chapter 7: Development.

Chapter 8: Operations.

Chapter 9: Metadata.

Chapter 10: Responsibilities.

Part IV: Real Time Streaming ETL Systems.

Chapter 11: Real-Time ETL Systems.

Chapter 12: Conclusions.

Index.
