Overview
The use of parallel processing technology in the next generation of Database Management Systems (DBMSs) makes it possible to meet new and challenging requirements. Database technology in rapidly expanding new application areas brings unique challenges such as increased functionality and efficient handling of very large heterogeneous databases.
Abdelguerfi and Wong present the latest techniques in parallel relational databases illustrating high-performance achievements in parallel database systems. The text is structured according to the overall architecture of a parallel database system presenting various techniques that may be adopted to the design of parallel database software and hardware execution environments. These techniques can directly or indirectly lead to high-performance parallel database implementation.
The book's main focus follows the authors' engineering model: A survey of parallel query optimization techniques for requests involving multi-way joins; A new technique for a join operation that can be adopted in the local optimization stage; A framework for recovery in parallel database systems using the ACTA formalism; The architectural details of NCR's new Petabyte multimedia database system; A description of the Super Database Computer (SDC-II); A case study for a shared-nothing parallel database server that analyzes and compares the effectiveness of five data placement techniques.
Editorial Reviews
Booknews
Presents techniques for designing parallel relational databases to meet the requirements of increased database sizes and the growing need for more sophisticated functionality, such as the support of object-oriented applications. The nine papers address the request manager, the data manager, the parallel machine architecture, and concerns about different techniques for the partitioned data store. No index. Annotation c. by Book News, Inc., Portland, Or.
Read an Excerpt
Parallel Database Techniques
By Mahdi Abdelguerfi Kam-Fai Wong
John Wiley & Sons
ISBN: 0-8186-8398-8

Chapter One

Introduction

Mahdi Abdelguerfi and Kam-Fai Wong
1.1 Background
There has been a continuing increase in the amount of data handled by database management systems (DBMSs) in recent years. Indeed, it is no longer unusual for a DBMS to manage databases ranging in size from hundreds of gigabytes to terabytes. This massive increase in database sizes is coupled with a growing need for DBMSs to exhibit more sophisticated functionality, such as support for object-oriented, deductive, and multimedia-based applications. In many cases, these new requirements have rendered existing DBMSs unable to provide the necessary system performance, especially given that many mainframe DBMSs already have difficulty meeting the I/O and CPU performance requirements of traditional information systems that service large numbers of concurrent users and/or handle massive amounts of data.
To achieve the required performance levels, database systems have increasingly been required to make use of parallelism. As has been noted, the traditional approach to parallelism for conventional DBMSs, which use industry-standard database models such as the relational model, can take one of two forms. The first is the use of massively parallel general-purpose hardware platforms. For example, commercial platforms such as nCube and SP1 now support Oracle's parallel server. Also, the Distributed Array Processor marketed by Cambridge Parallel Processing is now being used to produce a commercial massively parallel database system. The second approach makes use of arrays of off-the-shelf components to form custom massively parallel systems. For the most part, these hardware systems are based on MIMD parallel architectures. The NCR 3700 and the Super Database Computer II (SDC-II) are two such systems. The NCR 3700 uses a high-performance multistage interconnection network known as Bynet together with RAIDs (Redundant Arrays of Inexpensive Disks). This system can now run a parallel version of the Sybase relational DBMS. The SDC-II consists of eight data processing modules, where each module is composed of seven processors and five disk drives. The data processing modules communicate through an omega interconnection network.
The use of clusters of workstations as virtual parallel systems is a more recent approach that is already impacting the DBMS industry. These networks of workstations provide enormous amounts of aggregate computational power, often rivaling that of tightly coupled multiprocessor systems. They provide a viable high-performance computing environment and have several benefits over large, dedicated parallel machines, including cost, elimination of central point of failure, and scalability. Their use as a virtual parallel machine does not preclude the use of individual machines in more traditional ways. As an example of this, parallel versions of relational DBMSs, such as Oracle, are now even available on clusters of PC-compatible systems, thereby providing high performance at a relatively low cost.
The number of general-purpose and dedicated parallel database computers is increasing each year. It is not unrealistic to envisage that all high-performance database management systems in the year 2010 will support parallel processing. This high potential of parallel databases urges database vendors and practitioners alike to understand the concepts of parallel database systems in depth.
1.2 Parallel Database Systems
The parallelism in databases is inherited from the underlying data model. In particular, the relational data model (RDM) provides many opportunities for parallelization. For this reason, existing research projects on parallel databases, academic and industrial alike, are nearly exclusively centered on relational systems. In addition to the parallel potential of the relational data model, the worldwide utilization of relational database management systems has further justified the investment in parallel relational database research. It is, therefore, the objective of this book to review the latest techniques in parallel relational databases.
The topic of parallel databases is large and no single manuscript could be expected to cover this field in a comprehensive manner. In particular, this manuscript does not address parallel object-oriented database systems. However, it is noteworthy that several projects described in this manuscript make use of hybrid relational DBMSs. The emergence of hybrid relational DBMSs such as multimedia object/relational [10] database systems has been made necessary by new database applications as well as the need for commercial corporations to preserve their initial investment in the RDM. These hybrid systems require an extension to the underlying structure of the RDM to support unstructured data types such as text, audio, video, and object-oriented data. Towards this end, many commercial relational DBMSs are now offering support for large data types (also known as binary large objects, or BLOBs) that may require several gigabytes of storage space per row.
1.2.1 Computation Model
In relational databases, the mutual independence between tables, as well as between tuples within a table, makes simultaneous processing of multiple tables and/or tuples possible. A database request often involves the processing of multiple tables. The ways in which a request accesses these tables and combines the intermediate results are defined by the computation model. Based on this model, one can understand the potential parallelism embedded in a database request.
Computation of a database request is modeled by an extended dataflow graph (EDG). The idea is to extend a conventional dataflow graph (that is, data operation nodes interconnected by data communication arcs) with partition parallelism; as such, one data operation node may be made up of multiple subnodes. As a result, complex and time-consuming database operations, such as join, can be executed concurrently in a divide-and-conquer manner. For example, a join operation A join B can be executed by performing N subjoins (Ai join Bi, where i = 1 to N) in parallel and later combining the results to produce the final answer. The EDG model is the basis of FAD and LERA, the query languages of the BUBBA and EDS parallel database servers. To better understand the concept of an EDG, let us consider the SQL request below:
SELECT *
FROM employee, department
WHERE (employee.dept_no = department.dept_no)
  AND (employee.position = "manager")
Assuming that the employee relation is partitioned into three fragments (that is, E1, E2, and E3) and the department relation into two (that is, D1 and D2), this request can be represented by the EDG shown in Figure 1.1 (b). The execution of the EDG is completely data driven. Referring to the example, when a START signal is received by the SW (single wait) operator, it initiates the query execution process by sending a trigger signal to the SCAN operators, which then commence scanning the employee table, seeking tuples whose position attribute matches the string "manager." Tuples that satisfy the condition are distributed, by hashing on the dept_no attribute, to the JOIN operators. Conceptually, as soon as an employee tuple appears at any one of the JOIN operators, the join operation with the department tuples can proceed without delay. The results of the join operation are assumed to be stored on the same nodes that store the department table. Finally, once the last tuple on each JOIN operator is processed, each of them sends a signal to the GW (global wait) operator. Upon receiving these end signals, GW terminates the execution process. The above shows that the EDG is a self-scheduling structure: once the START signal is sent, the execution of the EDG runs to completion by itself, without any external control such as a program counter.
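The dataflow just described can be sketched in code. The following is a minimal in-memory sketch, not the book's implementation: the SCAN operators filter their employee fragments in parallel and hash-distribute qualifying tuples on dept_no, and each JOIN operator matches the tuples routed to it against its local department fragment. All function and attribute names beyond those in the query are illustrative assumptions.

```python
# Sketch of the example EDG: parallel SCAN with hash distribution to JOINs.
# Assumes dict-shaped tuples and in-memory fragments (illustrative only).
from concurrent.futures import ThreadPoolExecutor

N_JOIN_NODES = 2  # department is partitioned into D1 and D2


def scan(fragment):
    """SCAN operator: filter one employee fragment for managers, then
    hash-distribute the qualifying tuples on dept_no (one bucket per JOIN)."""
    buckets = [[] for _ in range(N_JOIN_NODES)]
    for tup in fragment:
        if tup["position"] == "manager":
            buckets[hash(tup["dept_no"]) % N_JOIN_NODES].append(tup)
    return buckets


def join(emp_tuples, dept_fragment):
    """JOIN operator: match routed employee tuples against one department fragment."""
    dept_index = {d["dept_no"]: d for d in dept_fragment}
    return [(e, dept_index[e["dept_no"]])
            for e in emp_tuples if e["dept_no"] in dept_index]


def run_query(employee_fragments, department_fragments):
    with ThreadPoolExecutor() as pool:
        # Intraoperation parallelism: scan E1..E3 concurrently.
        scanned = list(pool.map(scan, employee_fragments))
        # Route each hash bucket to the JOIN node holding the matching D fragment.
        results = []
        for i, dept_frag in enumerate(department_fragments):
            emp_for_node = [t for bkts in scanned for t in bkts[i]]
            results.extend(join(emp_for_node, dept_frag))
        return results
```

Note that the department fragments must be partitioned by the same hash function used in `scan`, so that matching tuples always meet at the same JOIN node.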
Furthermore, if the underlying execution platform is comprised of multiple processing units, a set of similar database requests can be executed in parallel as follows:
Interquery parallelism. More than one database request can be executed at a time. This leads to increased throughput, that is, the number of requests processed per second, an important metric in on-line transaction processing applications.
Interoperation parallelism. Within one request, more than one operation can be performed simultaneously. In the example above, both the SCAN and JOIN operations can be executed in parallel. Notice that this form of parallelism is also referred to as pipeline parallelism, since the result tuples from the SCAN operator can be processed by the JOIN operator as soon as they are available.
Intraoperation parallelism. This is the parallelism offered by data partitioning. An operation can be split into smaller suboperations if the data it processes are partitioned. This form of parallelism is the key to response time reduction.
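Pipeline (interoperation) parallelism can be sketched as a producer/consumer pair: a SCAN thread streams qualifying tuples through a queue, and a JOIN thread consumes them as they arrive rather than waiting for the scan to finish. This is an illustrative sketch, not code from the book; the predicate and lookup table are assumptions.

```python
# Minimal pipeline-parallelism sketch: SCAN and JOIN run as concurrent
# threads connected by a queue, so join work overlaps with scanning.
import queue
import threading

SENTINEL = object()  # end-of-stream marker, analogous to the GW signal


def pipeline(rows, predicate, lookup):
    q = queue.Queue()
    results = []

    def scan():
        for r in rows:
            if predicate(r):
                q.put(r)          # each tuple flows downstream immediately
        q.put(SENTINEL)           # signal that the scan is complete

    def join():
        while True:
            r = q.get()
            if r is SENTINEL:
                break
            if r["dept_no"] in lookup:
                results.append((r, lookup[r["dept_no"]]))

    t1 = threading.Thread(target=scan)
    t2 = threading.Thread(target=join)
    t1.start(); t2.start()
    t1.join(); t2.join()
    return results
```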
An EDG only shows the potential parallelism of a database request. The actual parallelism depends on the implementation. Realization of the parallelism is a complicated engineering task whose primary objective is to achieve high performance in a cost-effective manner.
1.2.2 Engineering Model
To achieve a high-performance parallel database system, one must provide an efficient environment for EDG execution. This requires a thorough understanding of the engineering model of parallel database systems, which defines how an EDG is physically executed in a parallel database environment. This book focuses on the engineering model and presents a number of state-of-the-art techniques that can either directly or indirectly lead to high-performance parallel database implementations.
In some practical situations, due to various physical constraints, certain forms of parallelism in an EDG are simply "restricted." In the above example, if there are only four processing units in the underlying parallel platform, two of the five data fragments, E1, E2, E3, D1, and D2 (see Figure 1.1 (b)), must be placed together on the same processing unit. This inevitably limits parallelism, but for cost-effectiveness reasons the parallel database designer may nevertheless choose to do so. In other situations, parallelism may be "reduced." In the above example, if the communication cost (for example, in setting up a message packet) is high, the SCAN operator may bundle several tuples together before sending them out to the downstream JOIN operator. Effectively, this reduces the pipeline parallelism. In yet other situations, parallelism may be "eliminated" deliberately by the system. In the above example, if the sizes of the employee and department relations are small, the communication cost between the SCAN and JOIN operators may dominate the processing time of the two operations. In that case, one may group the operators together to avoid the communication overhead. The above are just a few examples depicting how the execution environment can hinder the exploitation of the potential parallelism embedded in an EDG.
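The tuple-bundling tactic mentioned above (trading pipeline parallelism for lower communication cost) can be sketched as a simple batching generator. The bundle size is an illustrative parameter; in a real system it would be tuned against message setup cost.

```python
# Sketch of "reducing" pipeline parallelism by bundling tuples: instead of
# sending each tuple as its own message, the SCAN operator batches
# bundle_size tuples per message before shipping them downstream.
def bundle(tuples, bundle_size=64):
    """Group an iterable of tuples into fixed-size batches (last may be short)."""
    batch = []
    for t in tuples:
        batch.append(t)
        if len(batch) == bundle_size:
            yield batch          # one message carries many tuples
            batch = []
    if batch:
        yield batch              # flush the final partial batch
```

Larger bundles mean fewer messages but a longer delay before the first tuple reaches the JOIN operator, which is exactly the parallelism/communication trade-off described in the text.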
Software Execution Environment. The top-level architecture of a typical parallel database system (such as the EDS parallel database server), consisting of the software and hardware execution environments, is shown in Figure 1.2. The software part comprises the request manager and the data manager.
Request Manager. The role of the request manager is (a) to transform a user's request into a set of semantically equivalent EDGs; (b) to select the "best" EDG from the set; and (c) finally, to translate the chosen EDG into an executable object file. For a single request, there is usually more than one way of execution; query optimization is the process of determining the best execution path. This is a complicated process even for sequential databases. Conventional parallel database systems (such as XPRS) adopt a two-phase optimization approach: in the first phase, the "best" sequential execution plan is determined, and in the second, this plan is parallelized. Recent query optimization techniques are presented later in this book.
Data Manager. This can be regarded as the parallel database operating system, whose major responsibility is to execute the object file produced by the request manager. As pointed out by Stonebraker, conventional operating systems (OSs) are inefficient for database applications. Some traditional OS features that generally do not meet the needs of database applications are buffer management, load placement, and transaction processing. For instance, KEV is a special OS designed for the BUBBA parallel database machine. It has been observed that well-known load balancing methods at the OS level are generally unsuitable and that database-specific schemes are needed. Parallel transaction processing exploits interquery parallelism extensively. Due to its importance, the kernels of a number of existing commercial parallel database systems, such as NonStop SQL, Oracle Parallel Server, and DB2, already support interquery parallelism. Furthermore, parallel transaction processing has opened up many new technical issues, such as recovery management. Data partitioning in a parallel database system renders data more vulnerable to failure; for the same reason, recovery in parallel database platforms is more difficult than in conventional database systems. Researchers are actively investigating techniques to provide reliable yet fast recovery environments (see, for example, [30, 22]).
In addition to the classical database requirements, special features must be provided by the OS for high-performance parallel database implementations. These include interprocessor communications, distributed/shared memory management, multithreading, and group communications.
Hardware Execution Environment. To realize parallelism, suitable hardware must be employed. This includes both a machine with a parallel architecture and a data store supported with the data partition model.
Parallel Machine Architecture. For database applications, machine architectures are classified according to the way in which resources are shared. The classification scheme, first introduced by Stonebraker, includes shared nothing, shared memory, and shared disk. In a parallel database system based on the shared-nothing concept, each node has its own main memory and secondary storage devices. Communication between the different nodes is achieved through message passing across an interconnection network. This lack of resource sharing reduces node contention and permits a high degree of scalability. A parallel version of IBM's DB2 running on the RS/6000-based POWERparallel multiprocessor system is one such machine. This version of DB2 has the added advantage of providing advanced features such as support for binary large objects (for compressed video, images, and audio) and character large objects (for text documents). Despite their popularity, shared-nothing parallel database systems suffer, in general, from a load balancing problem. In shared-disk systems, each node has its private main memory, but secondary storage, usually a disk array, is a shared resource. Because the disk data is shared by all nodes, these systems tend to have limited load imbalance but require a global lock manager to preserve data consistency. Shared-memory systems make use of a global main memory shared by all of the system's nodes. The use of a global memory eliminates the load balancing problem; XPRS, running POSTGRES, is an example of such a system. However, these systems do not scale up well, as an increase in the number of nodes leads to more contention over the shared memory. Commercial database systems such as the NCR 3700 tend to have a hybrid architecture that combines some of the above features.
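In a shared-nothing system, declustering a relation across nodes typically comes down to computing a home node from a partitioning key, so that both data placement and exact-match query routing use the same function. The sketch below is illustrative; node counts and key names are assumptions, not details from the text.

```python
# Sketch of hash-based declustering for a shared-nothing architecture:
# each node owns its own partition, and a tuple's home node is derived
# from its partitioning key.
def home_node(key_value, n_nodes):
    """Return the node that owns tuples with this partitioning-key value."""
    return hash(key_value) % n_nodes


def place(rows, key, n_nodes):
    """Decluster rows across n_nodes by hashing the partitioning key."""
    nodes = [[] for _ in range(n_nodes)]
    for r in rows:
        nodes[home_node(r[key], n_nodes)].append(r)
    return nodes
```

Because `home_node` is deterministic, an exact-match query on the partitioning key can be routed to a single node with no broadcast, which is one reason hash placement is popular in shared-nothing designs.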
(Continues...)
Table of Contents
1 Introduction.
1.1 Background.
1.2 Parallel Database Systems.
1.2.1 Computation Model.
1.2.2 Engineering Model.
1.3 About this Manuscript.
Bibliography.
I: Request Manager.
2 Designing an Optimizer for Parallel Relational Systems.
2.1 Introduction.
2.2 Overall Design Issues.
2.2.1 Design a Simple Parallel Execution Model.
2.2.2 The Two-Phase Approach.
2.2.3 Parallelizing is Adding Information!
2.2.4 Two-Phase versus Parallel Approaches.
2.3 Parallelization.
2.3.1 Kinds of Parallelism.
2.3.2 Specifying Parallel Execution.
2.4 Search Space.
2.4.1 Slicing Hash Join Trees.
2.4.2 Search Space Size.
2.4.3 Heuristics.
2.4.4 The Two-Phase Heuristics.
2.5 Cost Model.
2.5.1 Exceptions to the Principle of Optimality.
2.5.2 Resources.
2.5.3 Skew and Size Model.
2.5.4 The Cost Function.
2.6 Search Strategies.
2.6.1 Deterministic Search Strategies.
2.6.2 Randomized Strategies.
2.7 Conclusion.
Bibliography.
3 New Approaches to Parallel Join Utilizing Page Connectivity Information.
3.1 Introduction.
3.2 The Environment and a Motivating Example.
3.3 The Methodology.
3.3.1 Definition of Parameters.
3.3.2 The Balancing Algorithm.
3.3.3 Schedules for Reading Join Components and Data Pages.
3.4 Performance Analysis.
3.4.1 The Evaluation Method.
3.4.2 Evaluation Results.
3.5 Concluding Remarks and Future Work.
Bibliography.
4 A Performance Evaluation Tool for Parallel Database Systems.
4.1 Introduction.
4.2 Performance Evaluation Methods.
4.2.1 Analytical Modeling.
4.2.2 Benchmarks.
4.2.3 Observations.
4.3 The Software Testpilot.
4.3.1 The Experiment Specification.
4.3.2 The Performance Assessment Cycle.
4.3.3 The System Interface.
4.4 The Software Testpilot and Oracle/Ncube.
4.4.1 Database System Performance Assessment.
4.4.2 The Oracle/Ncube Interface.
4.5 Preliminary Results.
4.6 Conclusion.
Bibliography.
5 Load Placement in Distributed High-Performance Database Systems.
5.1 Introduction.
5.2 Investigated System.
5.2.1 System Architecture.
5.2.2 Load Scenarios.
5.2.3 Trace Analysis.
5.2.4 Load Setup.
5.3 Load Placement Strategies Investigated.
5.4 Scheduling Strategies for Transactions.
5.5 Simulation Results.
5.5.1 Influence of Scheduling.
5.5.2 Evaluation of the Load Placement Strategies.
5.5.3 Lessons Learned.
5.5.4 Decision Parameters Used.
5.6 Conclusion and Open Issues.
Bibliography.
II: Parallel Machine Architecture.
6 Modeling Recovery in Client-Server Database Systems.
6.1 Introduction.
6.2 Uniprocessor Recovery and Formal Approach to Modeling Recovery.
6.2.1 Basic Formal Concepts.
6.2.2 Logging Mechanisms.
6.2.3 Runtime Policies for Ensuring Correctness.
6.2.4 Data Structures Maintained for Efficient Recovery.
6.2.5 Restart Recovery—The ARIES Approach.
6.3 LSN Sequencing Techniques for Multinode Systems.
6.4 Recovery in Client-Server Database Systems.
6.4.1 Client-Server EXODUS (ESM-CS).
6.4.2 Client-Server ARIES (ARIES/CSA).
6.4.3 Shared Nothing Clients with Disks (CD).
6.4.4 Summary of Recovery Approaches in Client-Server Architectures.
6.5 Conclusion.
Bibliography.
7 Parallel Strategies for a Petabyte Multimedia Database Computer.
7.1 Introduction.
7.2 Multimedia Data Warehouse, Databases, and Applications.
7.2.1 Three Waves of Multimedia Database Development.
7.2.2 National Medical Practice Knowledge Bank Application.
7.3 Massively Parallel Architecture, Infrastructure, and Technology.
7.3.1 Parallelism.
7.4 Teradata-MM Architecture, Framework, and New Concepts.
7.4.1 Teradata-MM Architecture.
7.4.2 Key New Concepts.
7.4.3 SQL3.
7.4.4 Federated Coordinator.
7.4.5 Teradata Multimedia Object Server.
7.5 Parallel UDF Execution Analysis.
7.5.1 UDF Optimizations.
7.5.2 PRAGMA Facility.
7.5.3 UDF Value Persistence Facility.
7.5.4 Spatial Indices for Content-Based Querying.
7.6 Conclusion.
Bibliography.
8 The MEDUSA Project.
8.1 Introduction.
8.2 Indexing and Data Partitioning.
8.2.1 Standard Systems.
8.2.2 Grid Files.
8.3 Dynamic Load Balancing.
8.3.1 Data Access Frequency.
8.3.2 Data Distribution.
8.3.3 Query Partitioning.
8.4 The MEDUSA Project.
8.4.1 The MEDUSA Architecture.
8.4.2 Software.
8.4.3 Grid File Implementation.
8.4.4 Load Balancing Strategy.
8.5 MEDUSA Performance Results.
8.5.1 Test Configuration.
8.5.2 Transaction Throughput.
8.5.3 Speedup.
8.5.4 Load Balancing Test Results.
8.6 Conclusions.
Bibliography.
III: Partitioned Data Store.
9 System Software of the Super Database Computer SDC-II.
9.1 Introduction.
9.2 Architectural Overview of the SDC-II.
9.3 Design and Organization of the SDC-II System Software.
9.3.1 Parallel Execution Model.
9.3.2 I/O Model and Buffer Management Strategy for Bulk Data Transfer.
9.3.3 Process Model and Efficient Flow Control Mechanism.
9.3.4 Structure of the System Software Components.
9.4 Evaluation of the SDC-II System.
9.4.1 Details of a Sample Query Processing.
9.4.2 Comparison with Commercial Systems.
9.5 Conclusion.
Bibliography.
10 Data Placement in Parallel Database Systems.
10.1 Introduction.
10.2 Overview of Data Placement Strategies.
10.2.1 Declustering and Redistribution.
10.2.2 Placement.
10.3 Effects of Data Placement.
10.3.1 STEADY and TPC-C.
10.3.2 Dependence on Number of Processing Elements.
10.3.3 Dependence on Database Size.
10.4 Conclusions.
Bibliography.
Contributors.