DB2 High Performance Design and Tuning

Overview

Improve the performance of any IBM DB2 database with expert Version 6.x and 7.0 tuning techniques!

The only all-in-one, start-to-finish guide to maximizing DB2 performance!
Top 20 DB2 Performance Myths-and what to do about them
Includes exclusive, in-depth coverage of DB2 Versions 6.x and 7!
Tuning for the DB2 designers, developers and administrators
Design and tuning for robust e-commerce functionality

Foreword by Roger Miller, IBM Lead DB2 Strategist

This is the most comprehensive, practical DB2 optimization guide ever published! World-renowned DB2 consultants Richard Yevich and Susan Lawson cover all aspects of DB2 performance tuning: hardware, physical design, application coding, SQL, stored procedures, triggers, Java, and beyond. Review I/O, storage management, SMS usage, CPU tuning, memory tuning, and bufferpool design. Master every key DB2 database design optimization, including indexing, code and reference tables, and supporting the DB2 Catalog and Directory. Discover start-to-finish application development techniques proven to deliver outstanding performance, including commit strategies, application interfaces, and superior SQL coding techniques.

Top 20 DB2 Performance Myths—and what to do about them!
Based on hands-on experience at hundreds of DB2 installations
Includes hands-on, in-depth coverage of data sharing and DB2 Version 7!
For developers, designers, and DBAs

From OLAP to OLTP, e-business to high availability, troubleshooting to long-term monitoring, this book covers it all. You'll be amazed at how much more performance youcan squeeze out of your DB2 system—with DB2 High Performance Development & Tuning!

Read More Show Less

Product Details

ISBN-13: 9780132037952
Publisher: Pearson Education
Publication date: 8/24/2000
Edition number: 2
Pages: 697
Product dimensions: 6.90 (w) x 9.40 (h) x 1.57 (d)

Related Subjects

Read an Excerpt

PREFACE:

Preface

What's the one thing I can do to fix the performance problems
in my DB2 systems?

DB2 came, saw, and conquered on the mainframes or enterprise servers, better known today as the S/390. Just take a look at the some of the enormous database sizes and heavy workloads that DB2 is currently supporting in production. Here are a few examples: a single database table of six billion rows, an online system achieving over 1,200 transactions per second, batch systems exceeding 80 million update transactions per day and turning out approximately 1.3 million reports daily, and insert programs pushing 300+ inserts per day. These database systems are in existence today and are growing and pushing the envelope in new ways. These numbers may seem astonishing to some, but they show the capabilities of the DB2 engine and set the stage for larger databases and more complex applications. Take, for example, a major service firm that is in the planning stages of creating a 130-billion-row table, which we like to call a VLTB (very large table), to be implemented in the near future. These figures will probably be exceeded greatly by the time you read this material. Even with all these success stories and reports of the amazing accomplishments of these large-database and high-transaction-volume systems, many organizations are still failing to meet their service-level agreements. Why? It is not DB2 that has failed at these sites, but the design and implementation of the applications being implemented in DB2 and the care and feeding of the subsystem and its surroundings. To build a system of large magnitude such as those mentioned earlier,organizations are faced with several new challenges. Many of these challenges need to be met head on with knowledge and expertise that often is lacking, making this task virtually impossible.

Technical specialists such as database administrators, database designers, system programmers, and application programmers often are expected to perform miracles and are all too often limited in the resources available to accomplish the difficult task at hand. "Develop a large database with several large tables with no outages allowed, to support a complex application with several thousand SQL statements, running millions of transactions per day with excellent response time. By the way, we have no time for additional education, design reviews, or SQL performance tuning because we have a deadline to meet." Sound familiar? Many organizations have found that it is very easy to implement DB2 applications, which is one of the best features of the product, but without careful planning and attention to performance, these easy-to-implement applications can soon become performance nightmares. This fact could be best emphasized with a quote from IBM lead DB2 strategist Roger Miller: "The rule is not to bite the hand that you expect to feed you."

Many organizations have had their share of experience with poorly performing applications in the past, and the reason is probably that it is easy to implement a database application poorly when critical phases of system development are rushed or skipped altogether. No one goes into systems development with the goal of developing a poorly performing application. Often, however, there are design, development, and programming standards and practices in place that seem to force poor performance due to a lack of understanding about all the elements that affect performance in DB2. Even worse is the adherence to old performance myths that have been propagated through the years. This book is designed both to try to destroy the bad practices and to direct focus on more efficient and effective ways of viewing performance in DB2. DB2 works exactly the way it is directed, good or bad. The good news is that it is a proven fact that we can accomplish amazing feats with DB2 if we understand how it works and what is needed to make it perform.

Purpose of this Book

The title includes the words "high performance." Today, there are many books, papers, and especially IBM manuals that contain mountains of material on performance. But when technical people are faced with the development of a high-performance system, who has time to climb those mountains? As a result, many systems achieve what is deemed as acceptable performance based upon the best efforts of the technical staff with the resources available to them. The problem of less-than-optimal performance has existed in the past, but something is happening today to change the perspective of what is acceptable: the "e" stuff and to a lesser degree the "BI" stuff. For example, one company was very happy with its data-sharing complex's achieving 1,000 transactions per second, but the movement to support web-based clients changed all that. In a traditional application, a terminal operator may place an order using one or more transactions. But when an order is placed from the web, it may spawn 10 of those traditional back-end transactions. Prior to the web phenomenon, there were a fixed number of terminal operators, but the web opens up an unlimited number of users issuing transactions to place orders. What was acceptable performance in the past is now simply inadequate because you are not just dealing with your internal company personnel (who may have not complained or were ignored about system response time) taking orders. The rules have changed; with e-business you open the doors of your company to the world. You might say that you expose yourself, and any performance problems you had in the past will now be magnified.

So what can be done to take an existing application to a level far above what is achieved today, and how do we develop a new application that can perform at or above expectations? Too often this question is answered by adding additional CPUs. Sure, that makes vendors happy, but is that really how to address performance problems? This is short-term, quick fix, but it often is not the answer because the dollar costs related to this solution are often too high, and in the long run the application may not be scalable because the underlying problems have not been addressed.

What needs to be done is to remove code length at all of the 50 or more pressure points within the DB2 environment and within the application. It is the task of the person responsible for performance to find these key points and determine the best way to improve performance, through either subsystem tuning, database tuning, or application tuning. It is truly amazing what can be done by identifying a few key points and tuning them, and the benefits gained by such efforts are definitely worth it. We have seen simple system changes increase the number of concurrent threads by a factor of 4. SQL tuning dropped an application from 400 hours a month to less than 7. While some areas may seem miniscule by themselves, it is the combined benefits that will provide your database systems with optimal performance. By performing several small changes, it is possible for a 100-transaction-persecond OLTP application to become a 400-transaction-per-second OLTP application on the same hardware and operating environment. The challenge lies with the "bloodhound" skills to find the performance problems and the proper knowledge and resources to fix them. There are also many problems that can be fixed quite easily, bringing large performance gains immediately.

This book focuses on identifying these key pressure points, determining what is necessary to improve them, and taking a little code length out of each one. Suppose that you could remove 0.10 CPU seconds from 40 different pressure points. That equates to 4 CPU seconds for allowing more transactions to complete. Since the average OLTP system handles between 8,000 and 12,000 SQL statements per second, it is easy to figure out just how significant a series of minor changes can be. Add this type of change to other tuning efforts, and the results of the overall combined effort can be truly amazing.

Keep in mind there is no "silver bullet" in DB2 performance. Optimal performance comes from a combination of proper design, subsystem implementation, application coding, and customizing DB2 for your organization's specific requirements.

Maximum performance cannot be achieved without a full understanding of how each component interrelates with the others, such as how a simple SQL statement can use or abuse all the resources of the system and take literally hours to run but, with a minor change, take fewer resources and complete in seconds. Every design issue in DB2 is a trade-off. DB2 must be viewed as a delicate balance of issues and techniques, where pushing down on any single point increases pressure on another point somewhere else. High-performance design and tuning require a delicate balance, and therefore each balance/counterbalance situation will be discussed.-It is important that all technical support personnel understand how and why these trade-offs occur and, more important, what they can do in their particular area to achieve high performance.

There are many rules of thumb and guidelines for DB2. Unfortunately, sometimes these guidelines become standards of practice. Each of these "rules" must be examined, explained, and accepted or rejected. Mostly, they represent valid "day 1" starting points and nothing else. Worse, any "rule" is subject to change during any maintenance release of the DB2 software and must be reviewed or analyzed on an ongoing basis.

Often this question arises: "Since my system is relatively small with only a handful of users and a small amount of data, should I be concerned with all of these issues?" The answer is, "Yes, especially if you want to succeed by giving your applications the ability to integrate with others and to grow in both scope and scale." This can truly be achieved by striving for optimal performance in the implementation, which requires an overall understanding of the critical issues. Keep in mind that almost all systems expand, regardless of the initial plan or scope. It appears that the word "archive" has been removed from the dictionary, along with the phrase "We will never need that old data again." Once a system comes online in DB2, the data's life seems to become eternal. This is understandable, because we really don't deal with simple data anymore. We deal in information, and information must be viewed as a fixed asset with a real dollar value. For instance, when two old, seemingly unimportant pieces of data become "related" and then become an "atomic particle of information," the data will never again be just old and unnecessary; it now has become information that needs to be stored indefinitely.

Therefore, it is important that we strive to do it right the first time, or at least try to design the system to handle growth without degrading performance. Several years ago, a one-million-row table was considered enormous. Currently, a retailer has a six-billion-row table that has been estimated to reach ten billion rows (probably a very conservative estimate). A service organization is achieving 1,200 transactions per second, with each transaction comprising multiple SQL DML statements; however, these are designed transactions. A major financial company is viewing an EIS database that is being estimated in petabytes (1,000 TB or 1,000,000 GB, and too long to list in megabytes or bytes). Today, we talk about terabyte disk storage servers and storage area networks (SANs) housing massive amounts of data and turning that data into "information," and we are increasing the size of the data stores. Two traditional rules of thumb still apply:

 Rule of Thumb: It depends.


 Rule of Thumb: Always design for performance.

These two rules will never change, although there is an important corollary:

 Rule of Thumb: If you do not design for performance,


 then by default you are designing for poor performance.

Think about that for a while: designing for poor performance intentionally.

As an example of designing for poor performance, many places assume that all SQL gets written the best way possible, by all personnel, the first time! Unfortunately, this is not often the situation. In fact, the two most important areas of abuse that cause poor performance are improper use of SQL and improper physical design. Combine these two areas and you have a formula for disaster. This is not just opinion but comes after years of analysis by many who routinely address performance problems in real-world DB2 applications. Performance improvements of 240 to 1 have been achieved by simply correcting poor SQL and COMMIT strategies and minor tweaking of the relational design. Batch jobs have been reduced from 4 hours to 10 minutes and from 2 hours to 55 seconds. In one case an 85-95% reduction in run time was achieved on a system that would have required 400 hours of elapsed time to run each month (along with some hardware changes); this application now runs in 7 to 15 hours. These are significant numbers, and if you apply a cost factor to this, the dollar figures are impressive. A common problem in achieving a proper database design occurs when an improper system design is forced to be accepted, even though it could never support the application requirements. Even with this knowledge, the attitude seems to be, "Just fix it later; we have a deadline to meet, and we have already started coding." Unfortunately, and most of the time as predicted, the application's transactions cannot complete in the time allotted, and the database cannot handle the load. The most detrimental fact here is that design flaws are hardest, if not impossible, to fix, regardless of the amount of system or SQL application tuning. Sometimes this can result in the worst of all fates, moving the application logic into the program and simply using a DB2 file system. When this happens, the power of DB2 is completely inhibited, and any further growth of the application becomes a daunting task.

Without fail, in every organization where the issues and resources of DB2 have been discussed at any managerial level, a common reaction is, "DB2 implementation is easy and should not take many resources." Often this statement is based upon what is expected of DB2, without taking into consideration what needs to be given to the processes using DB2 to meet those expectations. For example, it is often said that "SQL is easy to learn and code; it must be DB2 that is causing the problem," when the truth is that SQL can be extremely difficult to learn to use properly. Today we are reviewing 10-page SQL statements that do the processing of several hundred lines of COBOL. This is a little more complicated than a single SQL select statement on a small PC-based database. This misperception often leads to SQL performance tuners being very undervalued, if they exist within the organization at all.

There is a big difference in performance between work done using SQL and work done after the SQL in the application code. Typically, SQL is used as "just another access method." SQL is not an access method! The number of ways that SQL can be used to navigate relational structures is mind-boggling. The DB2 SQL optimizer is one of the most powerful and robust optimizers in existence today, with amazing query rewrite capabilities. The more you can drive through the engine and allow the optimizer to help, the better your performance will be. Why have DB2 do a little work, incur all of the overhead to pass massive amounts of data back to the program, only to have to process it there? There have been several instances where pages—yes, pages—of C or COBOL code containing multiple cursors, fetching rows, and building complex linked lists for further processing other tables were replaced with one SQL statement. In every case where this was possible, there was a significant reduction in processing time.

These are some of the types of problems and issues that are addressed in the following chapters. All components of making DB2 work well are addressed, from system administration to physical design, down to a single SQL statement.

It is very difficult to investigate any type of problem in DB2 without offending hardworking technical support folks who have implemented what they truly believed to the correct method or who were limited by time or resources. It is important to understand that what was implemented in an earlier release of DB2 may not be optimal or applicable with the current release. This often is overlooked, resulting in less-than-optimal performance standards carried though DB2 releases. Every installation is different, and every installation has different problems. What is done at ABC Company, while it sounds wonderful, will not necessarily work at DEF Company and in fact might bring DEF Company to its knees if attempted! The topics presented in this book should help anyone make such determinations before disaster occurs. Remember the prime rule of DB2: "It depends!" This book helps identify what it depends upon, so that you can make informed performance decisions and achieve the desired results.

Organization of this Book

The book is organized from the broad to the narrow—from the overall environment to the inside of DB2 itself, from the application programs that use DB2 to the specific pieces used to build the applications. We found composing chapters on CPU, I/O, and memory difficult, since parts of this material come from so many different areas, covering all aspects of DB2 and its use. We decided to present this information in a hierarchical structure. The material is divided into six sections:

Section 1 is an overview of DB2 performance issues and a list of the top 20 performance myths that we most often encounter when identifying performance problems. These myths involve areas of performance that are most commonly misunderstood or misinterpreted. The resolutions of these myths are found throughout the various chapters, with additional information to help avoid problems associated with the myths and to fix existing problems.
Section 2 begins at the top of the DB2 hierarchy by discussing the system performance aspects of DB2, such as I/O and storage management, CPU, and memory. We look at such items as DFSMS usage, new disk storage devices, and detailed buffer pool design strategies.
Section 3 takes you into the aspects of physical design. Topics here include designing database objects, such as indexes, and discussing tricks and techniques for special situations, such as code and reference tables. This section also includes information on how to best support the DB2 catalog and directory, which contains all the information about physical objects.
Section 4 goes into application programming topics. These chapters cover proper application design, commit strategies, application interfaces, SQL coding tips, and examples of coding complex SQL statements. We also take a look at the latest enhancements in Version 6 and have a brief look at what is coming in Version 7.
Section 5 covers using several of the new, and not so new, features of DB2, such as stored procedures, triggers, user-defined functions, LOBS, and Java.
Section 6 is optional for many readers, because it covers an optional feature of DB2 called data sharing. This section addresses issues about migrating to a data-sharing environment, tuning your current environment prior to the migration, physical design considerations, and how to monitor and tune in this new environment.

Each chapter contains more information explaining the details of the areas we cover, as well as performance design strategies, tuning tips, and even critical roles and responsibilities critical for today's high-performance environment.

Audience for this Book

This book is for database designers, database administrators, architects, application designers, programmers, and anyone else who has the responsibility to ensure that DB2 systems meet or exceed their stated performance objectives. The four stages of building systems have always been

Make it work.
Make it work right.
Make it work fast.
Make it work all the time and be continuously available.

This book focuses mainly on the third category and a little on the fourth. The third category is often passed over in the real world of business applications. In most of these cases, failure to meet performance objectives often lies with the database design. Most often it lies with the physical database design and its implementation of programming paradigms and data access.

The book specifically addresses correctness of physical database design in DB2 and implementation strategies for application design and information access, along with some advanced SQL. We have attempted to leave no stone unturned. As you will see from the war stories throughout the text, we take no prisoners when it comes to eliminating performance problems.

We have presented this material several times all over the world and many times got the same two responses: either excitement from now knowing where to look for problems where tuning will pay off or concern about how to fix problems that have become imbedded in the organization throughout the years. The good news is that much of the information in this book comes from our real-life work and experience. We have worked with several large DB2 clients, we have seen what works and what does not, and in many cases we have seen tremendous improvements just by following high-performance design strategies.

Acknowledgments

There are many people to thank for assisting in the creation of this book, and many more who have provided information over the years, and it would impossible to name everyone. We would like to extend a very special thank you to all the IBM DB2 developers in the Silicon Valley Labs, and a special nod to Roger Miller and Jeff Josten for their technical reviews. For assisting in reviews, answering questions, and providing guidance, we thank Michael Harman, Joel Goldstein, Kathy Komer, Klaas Brant, Jan Henderyckx, and Jeff Vowell. We also would like to thank our many clients who have allowed us to share their experiences.

Read More Show Less

	Foreword
	Preface
	Setting the Stage	1
Sect. 1	Environment and Tuning	11
Ch. 1	I/O and Storage Management	13
Ch. 2	CPU	39
Ch. 3	Memory	55
Sect. 2	Database Design and Tuning	109
Ch. 4	Catalog and Directory	111
Ch. 5	Physical Database Objects	135
Ch. 6	VLDBs, VLTBs, and Warehouses	183
Ch. 7	Large Objects (LOBs)	209
Ch. 8	Special Tables	225
Ch. 9	Roles in Database Implementation	245
Sect. 3	Application Design and Tuning	253
Ch. 10	General Application Design	255
Ch. 11	Version 6 Late Additions and Version 7	279
Ch. 12	Program Design and Processing	299
Ch. 13	Locking and Concurrency	327
Ch. 14	Temporary Tables	343
Ch. 15	Enterprise Resource Planning	351
Sect. 4	SQL and Advanced Function Tuning	367
Ch. 16	SQL	369
Ch. 17	Triggers and User-Defined Functions	403
Ch. 18	LOBs and Extenders	423
Ch. 19	Stored Procedures	433
Ch. 20	REXX, Java, and SQL Procedure Language	463
Ch. 21	Parallelism	473
Ch. 22	Predictive Analysis, Governing and Tuning	495
Ch. 23	Influencing the Optimizer	543
Ch. 24	E-Business, the Web, and Networks	575
Sect. 5	Data Sharing Design and Tuning	585
Ch. 25	Data Sharing Overview and Processing Costs	587
Ch. 26	Hardware and Configuration Considerations	593
Ch. 27	Locking	601
Ch. 28	Group Buffer Pools	615
Ch. 29	Migration	635
Ch. 30	Application Tuning	649
Ch. 31	Physical Design	653
Ch. 32	Problem Diagnosing	659
Ch. 33	Long-Term Monitoring	667
	Index	681

Read More Show Less

Preface

PREFACE:

Preface

What's the one thing I can do to fix the performance problems
in my DB2 systems?

DB2 came, saw, and conquered on the mainframes or enterprise servers, better known today as the S/390. Just take a look at the some of the enormous database sizes and heavy workloads that DB2 is currently supporting in production. Here are a few examples: a single database table of six billion rows, an online system achieving over 1,200 transactions per second, batch systems exceeding 80 million update transactions per day and turning out approximately 1.3 million reports daily, and insert programs pushing 300+ inserts per day. These database systems are in existence today and are growing and pushing the envelope in new ways. These numbers may seem astonishing to some, but they show the capabilities of the DB2 engine and set the stage for larger databases and more complex applications. Take, for example, a major service firm that is in the planning stages of creating a 130-billion-row table, which we like to call a VLTB (very large table), to be implemented in the near future. These figures will probably be exceeded greatly by the time you read this material. Even with all these success stories and reports of the amazing accomplishments of these large-database and high-transaction-volume systems, many organizations are still failing to meet their service-level agreements. Why? It is not DB2 that has failed at these sites, but the design and implementation of the applications being implemented in DB2 and the care and feeding of the subsystem and its surroundings. To build a system of large magnitude such as those mentionedearlier,organizations are faced with several new challenges. Many of these challenges need to be met head on with knowledge and expertise that often is lacking, making this task virtually impossible.