Oracle DBA Guide to Data Warehousing and Star Schemas

( 1 )

Overview

  • The definitive, real-world guide to Oracle data warehousing
  • Maximizing performance, flexibility, and manageability in production environments
  • Hardware/software architectures, star schema design, partitioning, and more
  • Industrial strength data loading and query optimization techniques
  • By the world-renowned architect of 7-Eleven's ...
See more details below
Other sellers (Paperback)
  • All (14) from $2.65   
  • New (7) from $39.14   
  • Used (7) from $2.65   
Sending request ...

Overview

  • The definitive, real-world guide to Oracle data warehousing
  • Maximizing performance, flexibility, and manageability in production environments
  • Hardware/software architectures, star schema design, partitioning, and more
  • Industrial strength data loading and query optimization techniques
  • By the world-renowned architect of 7-Eleven's multi-terabyte datawarehouse
Maximize Oracle data warehouse performance, flexibility, and manageability

Oracle DBAs finally have a definitive guide to every aspect of designing, constructing, tuning, and maintaining star schema data warehouses with Oracle 8i and 9i. Bert Scalzo, one of the world's leading Oracle data warehousing experts, offers practical, hard-won lessons and breakthrough techniques for maximizing performance, flexibility, and manageability in any production environment. Coverage includes:

  • Data warehousing fundamentals for DBAs--including what a data warehouse isn't
  • Planning software architecture: business intelligence, user interfaces, Oracle versions, OS platforms, and more
  • Planning hardware architecture: CPUs, memory, disk space, and configuration
  • Radically different star schema design for radically improved performance
  • Tuning ad-hoc queries for lightning speed Industrial-strength data loading techniques
  • Aggregate tables: maximizing performance benefits, minimizing complexity tradeoffs
  • Improving manageability: The right ways to partition
  • Data warehouse administration: Backup/recovery, space and extent management, updates, patches, and more
Read More Show Less

Product Details

Meet the Author

BERT SCALZO is a product architect for Quest Software. As a member ofthe TOAD development team, he designed many of the features in the TOADDBA module. Scalzo has presented numerous papers on data warehousing andled or served on data warehouse special interest groups at Oracle eventsthroughout the past five years. He has worked for both Oracle Educationand Oracle Consulting, holds several Oracle Masters, a Ph.D. in ComputerScience, an MBA, and several insurance industry designations. Scalzodesigned 7-Eleven's multi-terabyte, star-schema data warehouse. He isauthor of The TOAD Handbook.

Read More Show Less

Read an Excerpt

I've written this book with the hope that it will serve as mylifetime technical contribution to my database administrator (DBA)brethren. It contains the sum knowledge and wisdom I've gathered thispast decade, both working on and speaking about data warehousing. Itdoes so purely from the DBA's perspective, solely for the DBA's needsand benefit.

While I've worked on many data warehousing projects, my three yearsat Electronic Data Systems (EDS) as the lead DBA for 7-ElevenCorporation's enterprise data warehouse provided my greatest learningexperience. 7-Eleven is a world leader in convenience retailing, withover 21,000 stores worldwide. The 7-Eleven enterprise data warehouse:

  • Is multi-terabyte in size, with tables having hundreds of millionsor billions of rows.
  • Is a true star schema design based on accurate business criteria andrequirements.
  • Has average and maximum report runtimes of seven minutes and fourhours, respectively.
  • Is operational 16X6 (i.e. the database is available 16 hours perday, 6 days per week).
  • Has base data and aggregations that are no more than 24 hours old(i.e., updated daily).

While the 7-Eleven enterprise data warehouse may sound impressive,it was not that way from Day One. We started with Oracle 7.2 and a smallHewlett–Packard (HP) K-class server. We felt like genuine explorers aswe charted new territory for both EDS and 7-Eleven. There were fewreference books or white papers at that time with any detailed datawarehousing techniques. Plus, there were few DBAs who had alreadysuccessfully built multi-terabyte data warehouses with whom to network.Fortunately, EDS and 7-Eleven recognized this fact and embraced thetruly iterative nature of data warehousing development.

Since you are reading this book, it's safe to assume we can agreethat data warehousing is radically different than traditional onlinetransaction processing (OLTP) applications. Whereas OLTP database andapplication development is generally well-defined and thus easy tocontrol via policies and procedures, data warehousing is more iterativeand experimental. You need the freedom, support, and longevity tointelligently experiment ad-infinitum. With few universal golden rulesto apply, often the method of finding what works best for a given datawarehouse is to:

  • Brainstorm for design or tuning ideas.
  • Add those ideas to a persistent list of ideas.
  • Try whichever ideas currently look promising.
  • Record a history of ideas attempted and their results.
  • Keep one good idea out of 10-20 tried per iteration.
  • Repeat the cycle with an ever growing list of new ideas

As Thomas Peters states, "Life is pretty simple: You do some stuff.Most fails. Some works. You do more of what works." That's some of thebest advice I can recommend for successfully building a data warehouseas well.

Purpose

There are numerous data warehousing books out there, so why is thisone different? Simply put: its DBA focus on implementation details. Infact, the mission statement for this book is:

To serve as the DBA's definitive and detailed reference regardingthe successful design, construction, tuning, and maintenance of starschema data warehouses in Oracle 8i and 9i.

So how is this different from what's already out there? In general,I've found that most data warehousing books fall into one of threecategories:

  • Conceptual—Primarily educational about theories andpractices, with very high-level information
  • Overview—Catalogs of hardware, software, and databaseoptions, with few specific recommendations
  • Cookbook—Detailed, DBA-oriented advice for all the datawarehouse development lifecycle stages

Respectively, "best-of-breed" examples for these three categoriesare:

  • Data Warehouse Tool Kit: Practical Techniques for BuildingDimensional Data Warehouses by Ralph Kimball
  • Oracle8 Data Warehousing by Gary Dodge and Tim Gorman

This book, primarily since no other book exists with this kind ofdetailed DBA advice

I mean no disrespect to these other categories or their books. Ihighly recommend Kimball's book to anyone new to data warehousing. Anduntil such time as this books debuts, I also highly recommend Dodge'sbook for DBAs.

Audience

This book is intended for physical DBAs—period, end of story. Thisbook assumes an extensive and detailed working knowledge of Oracletechnologies. Moreover, it presumes a keen awareness of hardware andsoftware options—often a skill possessed only by DBAs who also serve asat least the backup operating system (OS) administrator as well. Thatsaid, there are chapters that will be both applicable and beneficial toother members of the data warehousing team.

The sections on data modeling define how a DBA should interpret andextrapolate an entity relationship diagram (ERD) into a physicaldatabase design. So, this chapter would assist data modelers andapplication architects to understand how a DBA uses their input tocreate the underlying database structure.

Likewise, the sections on staging, promoting, and aggregating datadefine how a DBA should manage objects and processes to mostexpeditiously load massive amounts of data. So, this chapter would beboth educational and inspirational to extract, transform, and load (ETL)programmers tasked with loading a data warehouse.

And finally, the chapter on querying the data defines the indices,statistics, and plans necessary to deliver the best possible ad-hocquery runtimes. So, this chapter would assist business intelligencefront-end designers, who can appreciate how the database handles theircomplex, ad-hoc queries.

Read More Show Less

Table of Contents

Acknowledgments.

Introduction.

1.What Is a Data Warehouse?

The Nature of the Beast. Data Warehouse vs. Big Database. Operational Data Stores Don't Count. Executive Information Systems Don't Count. Warehouses Evolve without Phases. The Warehouse Roller Coaster.

2. Software Architecture.

Business Intelligence Options. Oracle Version Options. Oracle Instance Options-Querying. Oracle Instance Options-Loading. Recommended Oracle Architecture. Great Operating System Debate. The Great Programming Language Debate. The Serial vs. Parallel Programming Debate.

3. Hardware Architecture.

Four Basic Questions. How Many CPUs? How Much Memory? How Many of What Disks? Recommended Hardware Architecture. The Great Vendor Debate. The 32- vs. 64-Bit Oracle Debate. The Raw vs. Cooked Files Debate. The Need for Logical Volume Managers.

4. Star Schema Universe.

The Rationale for Stars. Star Schema Challenges. Modeling Star Schemas. Avoid Snowflakes. Dimensional Hierarchies. Querying Star Schemas. Fact Table Options. When Stars Implode.

5.Tuning Ad-Hoc Queries.

Key Tuning Requirements. Star Optimization Evolution. Star Transformation Questions. Initialization Parameters. Star Schema Index Design. Cost-Based Optimizer. Some Parting Thoughts.

6. Loading the Warehouse.

What About ETL Tools? Loading Architecture. Upstream Source Data. Transformation Requirements. Method 1: Transform, Then Load. Method 2: Load, Then Transform. Deploying the Loading Architecture.

7. Implementing Aggregates.

What Aggregates to Build? Loading Architecture. Aggregation by Itself. Use Materialized Views.

8. Partitioning for Manageability.

A Plethora of Design Options. Logical Partitioning Design. Simple Partitioning in 8i. Simple Partitioning in 9i. Complex Partitioning in 8i. Complex Partitioning in 9i. Partition Option Benchmarks.

9. Operational Issues and More.

Backup and Recovery. Space Management. Extent Management. Updates and Patches.

Index.

Read More Show Less

Preface

I've written this book with the hope that it will serve as mylifetime technical contribution to my database administrator (DBA)brethren. It contains the sum knowledge and wisdom I've gathered thispast decade, both working on and speaking about data warehousing. Itdoes so purely from the DBA's perspective, solely for the DBA's needsand benefit.

While I've worked on many data warehousing projects, my three yearsat Electronic Data Systems (EDS) as the lead DBA for 7-ElevenCorporation's enterprise data warehouse provided my greatest learningexperience. 7-Eleven is a world leader in convenience retailing, withover 21,000 stores worldwide. The 7-Eleven enterprise data warehouse:

  • Is multi-terabyte in size, with tables having hundreds of millionsor billions of rows.
  • Is a true star schema design based on accurate business criteria andrequirements.
  • Has average and maximum report runtimes of seven minutes and fourhours, respectively.
  • Is operational 16X6 (i.e. the database is available 16 hours perday, 6 days per week).
  • Has base data and aggregations that are no more than 24 hours old(i.e., updated daily).

While the 7-Eleven enterprise data warehouse may sound impressive,it was not that way from Day One. We started with Oracle 7.2 and a smallHewlett–Packard (HP) K-class server. We felt like genuine explorers aswe charted new territory for both EDS and 7-Eleven. There were fewreference books or white papers at that time with any detailed datawarehousing techniques. Plus, there were few DBAs who had alreadysuccessfully built multi-terabyte data warehouses with whom to network.Fortunately, EDS and 7-Eleven recognized this fact and embraced thetruly iterative nature of data warehousing development.

Since you are reading this book, it's safe to assume we can agreethat data warehousing is radically different than traditional onlinetransaction processing (OLTP) applications. Whereas OLTP database andapplication development is generally well-defined and thus easy tocontrol via policies and procedures, data warehousing is more iterativeand experimental. You need the freedom, support, and longevity tointelligently experiment ad-infinitum. With few universal golden rulesto apply, often the method of finding what works best for a given datawarehouse is to:

  • Brainstorm for design or tuning ideas.
  • Add those ideas to a persistent list of ideas.
  • Try whichever ideas currently look promising.
  • Record a history of ideas attempted and their results.
  • Keep one good idea out of 10-20 tried per iteration.
  • Repeat the cycle with an ever growing list of new ideas

As Thomas Peters states, "Life is pretty simple: You do some stuff.Most fails. Some works. You do more of what works." That's some of thebest advice I can recommend for successfully building a data warehouseas well.

Purpose

There are numerous data warehousing books out there, so why is thisone different? Simply put: its DBA focus on implementation details. Infact, the mission statement for this book is:

To serve as the DBA's definitive and detailed reference regardingthe successful design, construction, tuning, and maintenance of starschema data warehouses in Oracle 8i and 9i.

So how is this different from what's already out there? In general,I've found that most data warehousing books fall into one of threecategories:

  • Conceptual--Primarily educational about theories andpractices, with very high-level information
  • Overview--Catalogs of hardware, software, and databaseoptions, with few specific recommendations
  • Cookbook--Detailed, DBA-oriented advice for all the datawarehouse development lifecycle stages

Respectively, "best-of-breed" examples for these three categoriesare:

  • Data Warehouse Tool Kit: Practical Techniques for BuildingDimensional Data Warehouses by Ralph Kimball
  • Oracle8 Data Warehousing by Gary Dodge and Tim Gorman

This book, primarily since no other book exists with this kind ofdetailed DBA advice

I mean no disrespect to these other categories or their books. Ihighly recommend Kimball's book to anyone new to data warehousing. Anduntil such time as this books debuts, I also highly recommend Dodge'sbook for DBAs.

Audience

This book is intended for physical DBAs--period, end of story. Thisbook assumes an extensive and detailed working knowledge of Oracletechnologies. Moreover, it presumes a keen awareness of hardware andsoftware options--often a skill possessed only by DBAs who also serve asat least the backup operating system (OS) administrator as well. Thatsaid, there are chapters that will be both applicable and beneficial toother members of the data warehousing team.

The sections on data modeling define how a DBA should interpret andextrapolate an entity relationship diagram (ERD) into a physicaldatabase design. So, this chapter would assist data modelers andapplication architects to understand how a DBA uses their input tocreate the underlying database structure.

Likewise, the sections on staging, promoting, and aggregating datadefine how a DBA should manage objects and processes to mostexpeditiously load massive amounts of data. So, this chapter would beboth educational and inspirational to extract, transform, and load (ETL)programmers tasked with loading a data warehouse.

And finally, the chapter on querying the data defines the indices,statistics, and plans necessary to deliver the best possible ad-hocquery runtimes. So, this chapter would assist business intelligencefront-end designers, who can appreciate how the database handles theircomplex, ad-hoc queries.

Read More Show Less

Introduction

I've written this book with the hope that it will serve as mylifetime technical contribution to my database administrator (DBA)brethren. It contains the sum knowledge and wisdom I've gathered thispast decade, both working on and speaking about data warehousing. Itdoes so purely from the DBA's perspective, solely for the DBA's needsand benefit.

While I've worked on many data warehousing projects, my three yearsat Electronic Data Systems (EDS) as the lead DBA for 7-ElevenCorporation's enterprise data warehouse provided my greatest learningexperience. 7-Eleven is a world leader in convenience retailing, withover 21,000 stores worldwide. The 7-Eleven enterprise data warehouse:

  • Is multi-terabyte in size, with tables having hundreds of millionsor billions of rows.
  • Is a true star schema design based on accurate business criteria andrequirements.
  • Has average and maximum report runtimes of seven minutes and fourhours, respectively.
  • Is operational 16X6 (i.e. the database is available 16 hours perday, 6 days per week).
  • Has base data and aggregations that are no more than 24 hours old(i.e., updated daily).

While the 7-Eleven enterprise data warehouse may sound impressive,it was not that way from Day One. We started with Oracle 7.2 and a smallHewlett–Packard (HP) K-class server. We felt like genuine explorers aswe charted new territory for both EDS and 7-Eleven. There were fewreference books or white papers at that time with any detailed datawarehousing techniques. Plus, there were few DBAs who had alreadysuccessfully built multi-terabyte data warehouses with whom to network.Fortunately, EDS and 7-Eleven recognizedthis fact and embraced thetruly iterative nature of data warehousing development.

Since you are reading this book, it's safe to assume we can agreethat data warehousing is radically different than traditional onlinetransaction processing (OLTP) applications. Whereas OLTP database andapplication development is generally well-defined and thus easy tocontrol via policies and procedures, data warehousing is more iterativeand experimental. You need the freedom, support, and longevity tointelligently experiment ad-infinitum. With few universal golden rulesto apply, often the method of finding what works best for a given datawarehouse is to:

  • Brainstorm for design or tuning ideas.
  • Add those ideas to a persistent list of ideas.
  • Try whichever ideas currently look promising.
  • Record a history of ideas attempted and their results.
  • Keep one good idea out of 10-20 tried per iteration.
  • Repeat the cycle with an ever growing list of new ideas

As Thomas Peters states, "Life is pretty simple: You do some stuff.Most fails. Some works. You do more of what works." That's some of thebest advice I can recommend for successfully building a data warehouseas well.

Purpose

There are numerous data warehousing books out there, so why is thisone different? Simply put: its DBA focus on implementation details. Infact, the mission statement for this book is:

To serve as the DBA's definitive and detailed reference regardingthe successful design, construction, tuning, and maintenance of starschema data warehouses in Oracle 8i and 9i.

So how is this different from what's already out there? In general,I've found that most data warehousing books fall into one of threecategories:

  • Conceptual--Primarily educational about theories andpractices, with very high-level information
  • Overview--Catalogs of hardware, software, and databaseoptions, with few specific recommendations
  • Cookbook--Detailed, DBA-oriented advice for all the datawarehouse development lifecycle stages

Respectively, "best-of-breed" examples for these three categoriesare:

  • Data Warehouse Tool Kit: Practical Techniques for BuildingDimensional Data Warehouses by Ralph Kimball
  • Oracle8 Data Warehousing by Gary Dodge and Tim Gorman

This book, primarily since no other book exists with this kind ofdetailed DBA advice

I mean no disrespect to these other categories or their books. Ihighly recommend Kimball's book to anyone new to data warehousing. Anduntil such time as this books debuts, I also highly recommend Dodge'sbook for DBAs.

Audience

This book is intended for physical DBAs--period, end of story. Thisbook assumes an extensive and detailed working knowledge of Oracletechnologies. Moreover, it presumes a keen awareness of hardware andsoftware options--often a skill possessed only by DBAs who also serve asat least the backup operating system (OS) administrator as well. Thatsaid, there are chapters that will be both applicable and beneficial toother members of the data warehousing team.

The sections on data modeling define how a DBA should interpret andextrapolate an entity relationship diagram (ERD) into a physicaldatabase design. So, this chapter would assist data modelers andapplication architects to understand how a DBA uses their input tocreate the underlying database structure.

Likewise, the sections on staging, promoting, and aggregating datadefine how a DBA should manage objects and processes to mostexpeditiously load massive amounts of data. So, this chapter would beboth educational and inspirational to extract, transform, and load (ETL)programmers tasked with loading a data warehouse.

And finally, the chapter on querying the data defines the indices,statistics, and plans necessary to deliver the best possible ad-hocquery runtimes. So, this chapter would assist business intelligencefront-end designers, who can appreciate how the database handles theircomplex, ad-hoc queries.
Read More Show Less

Customer Reviews

Average Rating 4
( 1 )
Rating Distribution

5 Star

(0)

4 Star

(1)

3 Star

(0)

2 Star

(0)

1 Star

(0)
Sort by: Showing 1 Customer Reviews
  • Anonymous

    Posted Thu Jul 31 00:00:00 EDT 2003

    Don't use Oracle 7!

    Scalzo gives a chatty, informal exposition of tuning large data warehouses using Oracle software. He uses a strong conversational tone that makes such issues as star schema query optimisations straightforward to follow, under the assumption that you are already an experienced Oracle database administrator. The book is independent of Oracle. What is unclear to me is how Oracle would regard this book. In several places, Scalzo describes the appallingly slow performance of Oracle 7 vis-a-vis Oracle 8i and 9i. He basically says that for certain tasks on large data sets, Oracle 7 was badly designed. No one who is currently using Oracle 7 will be thrilled to hear this. (Hee hee.) Cynically, one migh think Oracle 7 was overpromised and it underdelivered in some ways. But positively for Oracle, Scalzo gives typically much better results for 8i and 9i. An inducement to upgrade, if you have not done so already. The biggest quibble about this book is the chatty style. Doesn't bother me. But it may perturb some of you, if you prefere a more 'serious' timbre to the discussion of expensive design and deployment issues. If you can see your way clear past this, then the book may have merit in your DBA job.

    Was this review helpful? Yes  No   Report this review
Sort by: Showing 1 Customer Reviews

If you find inappropriate content, please report it to Barnes & Noble
Why is this product inappropriate?
Comments (optional)