// -*- mode:c++; c-basic-offset:4 -*-
/*<std-header orig-src='shore' incl-file-exclusion='API_H'>

 $Id: api.h,v 1.5 2010/07/07 20:50:10 nhall Exp $

SHORE -- Scalable Heterogeneous Object REpository

Copyright (c) 1994-99 Computer Sciences Department, University of
                      Wisconsin -- Madison
All Rights Reserved.

Permission to use, copy, modify and distribute this software and its
documentation is hereby granted, provided that both the copyright
notice and this permission notice appear in all copies of the
software, derivative works or modified versions, and any portions
thereof, and that both notices appear in supporting documentation.

THE AUTHORS AND THE COMPUTER SCIENCES DEPARTMENT OF THE UNIVERSITY
OF WISCONSIN - MADISON ALLOW FREE USE OF THIS SOFTWARE IN ITS
"AS IS" CONDITION, AND THEY DISCLAIM ANY LIABILITY OF ANY KIND
FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE.

This software was developed with support by the Advanced Research
Project Agency, ARPA order number 018 (formerly 8230), monitored by
the U.S. Army Research Laboratory under contract DAAB07-91-C-Q518.
Further funding for this work was provided by DARPA through
Rome Research Laboratory Contract No. F30602-97-2-0247.

*/

/*  -- do not edit anything above this line --   </std-header>*/

/*
 * This file contains doxygen documentation only
 * Its purpose is to determine the layout of the SSMAPI
 * page, which is the starting point for the on-line
 * storage manager documentation.
 */

/**\defgroup SSMAPI SHORE Storage Manager Application Programming Interface (SSM API)
 *
 * Most of the SHORE Storage Manager functionality is presented in
 * two C++ classes, ss_m and smthread_t.
 * The ss_m is the storage manager, an instance of which must be
 * constructed before any storage manager methods may be used.
 * The construction of the single instance performs recovery.
 *
 * All storage manager methods must be called in the context of a
 * storage manager thread, smthread_t.  This means they must be called
 * (directly or indirectly) by the run() method of a class derived from
 * smthread_t.
 * See: smthread_t, \ref SSMINIT.
 *
 * The storage manager is parameterized with options and their associated
 * values, some of which have defaults and others of which must be given
 * values by the server.  An options-processing package is provided for this
 * purpose.
 *
 * See
 * - \ref SSMOPT for an inventory of the storage manager's options,
 * - \ref OPTIONS for a discussion of code to initialize the options,
 * - \ref SSMINIT for a discussion of how to initialize and start up
 *   a storage manager,
 * - \ref startstop.cpp for an example of the minimal required use of
 *   options in a server, and
 * - \ref create_rec.cpp together with \ref init_config_options.cpp for
 *   a more complete example.
 *
 */

/**\defgroup IDIOMS Programming Idioms
 * \ingroup SSMAPI
 */

/**\defgroup MACROS Significant C Preprocessor Macros
 * \ingroup SSMAPI
 */

/**\defgroup IDS Identifiers
 * \ingroup SSMAPI
 *
 * Identifiers for persistent storage entities are used throughout
 * the storage manager API. This page collects them for convenience of
 * reference.
 */

/**\defgroup SSMINIT Starting Up, Shutting Down, Thread Context
 * \ingroup SSMAPI
 *
 * \section SSMSTART Starting a Storage Manager
 * Starting the storage manager consists of two major steps:
 * - Initializing the options the storage manager expects to be set.
 *   See
 *   - \ref OPTIONS for a discussion of code to initialize the options
 *   - \ref SSMOPT for an inventory of the storage manager's options.
 * - Constructing an instance of the class ss_m.
 *   The constructor ss_m::ss_m performs recovery, and when
 *   it returns to the caller, the caller may begin
 *   using the storage manager.
 *
 * No more than one instance may exist at any time.
 *
 * Storage manager functions must be called in the context of
 * a run() method of an smthread_t.
 *
 * See \ref SSMVAS for an example of how this is done.
 *
 * See also \ref SSMLOGSPACEHANDLING and \ref LOGSPACE for discussions
 * relating to the constructor and its arguments.
 *
 * \section SSMSHUTDOWN Shutting Down a Storage Manager
 * Shutting down the storage manager consists of deleting the instance
 * of ss_m created above.
 *
 * The storage manager normally shuts down gracefully; if you want
 * to force an unclean shutdown (for testing purposes), you can do so.
 * See ss_m::set_shutdown_flag.
 *
 * \section SSMLOGSPACEHANDLING Handling Log Space
 * The storage manager contains a primitive mechanism for responding
 * to a potential inability to roll back or recover due to lack of log
 * space.
 * When it detects a potential problem, it can issue a callback to the
 * server, which can then deal with the situation as it sees fit.
 * The use of such a callback mechanism is entirely optional.
 *
 * The steps that are necessary are:
 * - The server constructs the storage manager ( ss_m::ss_m() ) with two callback function
 *   pointers,
 *   the first of type \ref ss_m::LOG_WARN_CALLBACK_FUNC, and
 *   the second of type \ref ss_m::LOG_ARCHIVED_CALLBACK_FUNC.
 * - The server is run with a value given to the sm_log_warn option,
 *   which determines the threshold at which the storage manager will
 *   invoke *LOG_WARN_CALLBACK_FUNC.  This is a percentage of the
 *   total log space in use by active transactions.
 *   This condition is checked when any thread calls a storage manager
 *   method that acts on behalf of a transaction.
 * - When the storage manager calls the given LOG_WARN_CALLBACK_FUNC, that function
 *   is given these arguments:
 *   - iter    Pointer to an iterator over all xcts.
 *   - victim  Victim will be returned here.
 *   - curr    Bytes of log consumed by active transactions.
 *   - thresh  Threshold just exceeded.
 *   - logfile Character string name of oldest file to archive.
 *
 * The initial value of the victim parameter is the transaction that
 * is attached to the running thread.  The callback function might choose
 * a different victim and this in/out parameter is used to convey its choice.
 *
 * The callback function can use the iterator to iterate over all
 * the transactions in the system. The iterator owns the transaction-list
 * mutex, and if this function is not using that mutex, or if it
 * invokes other static methods on xct_t, it must release the mutex by
 * calling iter->never_mind().
 *
 * The curr parameter indicates the bytes of log consumed by the
 * active transactions and the thresh parameter indicates the threshold
 * that was just exceeded.
 *
 * The logfile parameter is the name (including path) of the log file
 * that contains the oldest log record (minimum lsn) needed to
 * roll back any of the active transactions, so it is the first
 * log file candidate for archiving.
 *
 * If the server's policy is to abort a victim, it needs only set
 * the victim parameter and return eUSERABORT.  The storage manager
 * will then abort that transaction, and the storage manager
 * method that was called by the victim will return to the running
 * thread with eUSERABORT.
 *
 * If the server's policy is not to abort a victim, it can use
 * xct_t::log_warn_disable() to prevent the callback function
 * from being called with this same transaction as soon as
 * it re-enters the storage manager.
 *
 * If the policy is to archive the indicated log file, and an abort
 * of some long-running transaction ensues, that log file might be
 * needed again, in which case, a failure to open that log file will
 * result in a call to the second callback function, indicated by the
 * LOG_ARCHIVED_CALLBACK_FUNC pointer.  If this function returns \ref RCOK,
 * the log manager will re-try opening the file before it chokes.
 *
 * This is only a stub of an experimental handling of the problem.
 * It does not yet provide any means of resetting the counters that
 * cause the tripping of the LOG_WARN_CALLBACK_FUNC.
 * Nor does it handle the problem well in the face of true physical
 * media limits.  For example, if, in recovery undo, it needs to
 * restore archived log files, there is no automatic means of
 * setting aside the tail-of-log files to make room for the older
 * log files; and similarly, when undo is finished, it assumes that
 * the already-opened log files are still around.
 * If a callback function renames or unlinks a log file, because the
 * log might have the files opened, the rename/unlink will not
 * effect a removal of these files until the log is finished with them.
 * Thus, these hooks are just a start in dealing with the problem.
 * The system must be stopped and more disks added to enable the
 * log size to increase, or a fully-designed log-archiving feature
 * needs to be added.
 * Nor is this well-tested.
 *
 * The example \ref log_exceed.cpp is a primitive
 * example using these callbacks. That example shows how you must
 * compile the module that uses the API for xct_t.
 *
 */


/**\defgroup OPTIONS Run-Time Options
 * \ingroup SSMAPI
 */

/**\defgroup SSMOPT List of Run-Time Options
 * \ingroup OPTIONS
 */

/**\defgroup SSMSTG Storage Structures
 *
 * The modules below describe the storage manager's storage structures.
 * In summary,
 * - devices contain
 *   - volumes, which contain
 *     - stores, upon which are built
 *       - files of records,
 *       - conventional indexes (B+-trees), and
 *       - spatial indexes (R*-trees)
 *
 *
 * \ingroup SSMAPI
 */

/**\defgroup SSMVOL Devices and Volumes
 * \ingroup SSMSTG
 */

/**\defgroup SSMSTORE Stores
 * \ingroup SSMSTG
 */

/**\defgroup SSMFILE Files of Records
 * \ingroup SSMSTG
 */

/**\defgroup SSMPIN Pinning Records
 * \ingroup SSMFILE
 */

/**\defgroup SSMBTREE B+-Tree Indexes
 * \ingroup SSMSTG
 */

/**\defgroup SSMRTREE R*-Tree Indexes
 * \ingroup SSMSTG
 */

/**\defgroup SSMSCAN Scanning
 * \ingroup SSMSTG
 */

/**\defgroup SSMSCANF Scanning Files
 * \ingroup SSMSCAN
 * To iterate over the records in a file,
 * construct an instance of the class scan_file_i, q.v.
 * That page contains examples.
 */

/**\defgroup SSMSCANI Scanning B+-Tree Indexes
 * \ingroup SSMSCAN
 * To iterate over the {key,value} pairs in an index,
 * construct an instance of the class scan_index_i, q.v.
 * That page contains examples.
 */

/**\defgroup SSMSCANRT Scanning R*-Tree Indexes
 * \ingroup SSMSCAN
 * To iterate over the {key,value} pairs in a spatial index,
 * construct an instance of the class scan_rt_i, q.v.
 * That page contains examples.
 *
 */

/**\defgroup SSMBULKLD Bulk-Loading Indexes
 * \ingroup SSMSTG
 *
 * Bulk-loading indexes consists of the following steps:
 * - create the source of the data for the bulk-load, which can be
 *   - one or more file(s) of records, or
 *   - a sort_stream_i
 * - call a bulk-loading method in ss_m
 *
 * To avoid excessive logging of files that do not need to persist after
 * the bulk-load is done, use the sm_store_property_t property
 * t_load_file for the source files.
 */

/**\defgroup SSMSORT Sorting
 * \ingroup SSMSTG
 */
/**\example sort_stream.cpp */

/**\defgroup SSMXCT Transactions, Locking and Logging
 * \ingroup SSMAPI
 */

/**\defgroup SSMLOCK Locking
 * \ingroup SSMXCT
 */

/**\defgroup SSMSP Partial Rollback: Savepoints
 * \ingroup SSMXCT
 */

/**\defgroup SSMQK Early Lock Release: Quarks
 * \ingroup SSMXCT
 */

/**\defgroup SSM2PC Distributed Transactions: Two-Phase Commit
 * \ingroup SSMXCT
 */
/**\defgroup SSMMULTIXCT Multi-threaded Transactions
 * \ingroup SSMXCT
 */

/**\defgroup LOGSPACE Running Out of Log Space
 * \ingroup SSMXCT
 */

/**\defgroup LSNS How Log Sequence Numbers are Used
 * \ingroup SSMXCT
 */

/**\defgroup SSMSTATS Storage Manager Statistics
 * \ingroup SSMAPI
 *
 * The storage manager contains functions to gather statistics that
 * it collects. These are mostly counters and are described here.
 *
 * Volumes can be analyzed to gather usage statistics.
 * See ss_m::get_du_statistics and ss_m::get_volume_meta_stats.
 *
 * Bulk-loading indexes gathers statistics about the bulk-load activity.
 * See ss_m::bulkld_index and ss_m::bulkld_md_index.
 *
 * \note A Perl script facilitates modifying the statistics gathered by
 * generating much of the supporting code, including
 * structure definitions and output operators.
 * The server-writer can generate her own sets of statistics using
 * the same Perl tool.
 * See \ref STATS for
 * more information about how these statistics sets are built.
 *
 */

/**\defgroup SSMVTABLE Virtual Tables
 * \ingroup SSMAPI
 * \details
 *
 * Virtual tables are string representations of internal
 * storage manager tables.
 * These tables are experimental. If the tables get to be very
 * large, they might fail.
 * - lock table (see ss_m::lock_collect)
 *   Columns are:
 *   - mode
 *   - duration
 *   - number of children
 *   - id of owning transaction
 *   - status (granted, waiting)
 * - transaction table (see ss_m::xct_collect)
 *   Columns are:
 *   - number of threads attached
 *   - global transaction id
 *   - transaction id
 *   - transaction state (in integer form)
 *   - coordinator
 *   - forced-readonly (Boolean)
 * - threads table (see ss_m::thread_collect)
 *   Columns are:
 *   - sthread ID
 *   - sthread status
 *   - number of I/Os issued
 *   - number of reads issued
 *   - number of writes issued
 *   - number of syncs issued
 *   - number of truncates issued
 *   - number of writev issued
 *   - number of readv issued
 *   - smthread name
 *   - smthread thread type (integer)
 *   - smthread pin count
 *   - is in storage manager
 *   - transaction ID of any attached transaction
 */
/**\example vtable_example.cpp */

/**\defgroup MISC Miscellaneous
 * \ingroup SSMAPI
 */
/**\defgroup SSMSYNC Synchronization, Mutual Exclusion, Deadlocks
 * \ingroup MISC
 *
 * Within the storage manager are a variety of primitives
 * that provide for
 * ACID properties of transactions and for correct behavior of concurrent
 * threads.  These include:
 * - read-write locking primitives for concurrent threads  (occ_rwlock,
 *   mcs_rwlock)
 * - mutexes (pthread_mutex_t, queue_based_lock_t)
 * - condition variables (pthread_cond_t)
 * - latches (latch_t)
 * - database locks
 *
 * The storage manager uses database locks to provide concurrency control
 * among transactions;
 * latches are used to synchronize concurrent threads' accesses to pages in the
 * buffer pool.  The storage manager's threads use carefully-designed
 * orderings of the entities they "lock" with synchronization primitives
 * to avoid any sort of deadlock.  All synchronization primitives
 * except database locks are meant to be held for short durations; they
 * are not even held for the duration of a disk write, for example.
 *
 * Deadlock detection is done only for database locks.
 * Latches are covered by locks, which is
 * to say that locks are acquired before latches are requested, so that
 * deadlock detection in the lock manager is generally sufficient to prevent
 * deadlocks among concurrent threads in a properly-written server.
 *
 * Care must be taken, when writing server code, to avoid deadlocks of
 * other sorts such as latch-mutex, or latch-latch deadlocks.
 * For example, multiple threads may cooperate on behalf of the same
 * transaction; if they are trying to pin records without a well-designed
 * ordering protocol, they may deadlock with one thread holding page
 * A pinned (latched) and waiting to pin (latch) B, while the other holds
 * B pinned and waits for a pin of A.
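 *
 * One possible ordering protocol for the two-page case above is to always
 * latch the page with the lower page number first. The sketch below is
 * only an illustration under stated assumptions: page_t, pinned_pair_t,
 * pin_in_order, transfer and run_two_threads are hypothetical names that
 * do not exist in the storage manager, and std::mutex stands in for
 * latch_t. It does not use the storage manager's pin API.
 *

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a buffer-pool page; not a storage manager type.
struct page_t {
    std::uint32_t id;     // hypothetical page number, used only for ordering
    long          value;  // payload touched while the page is "pinned"
    std::mutex    latch;  // std::mutex stands in for a latch_t
};

// Holds both latches; they are released when this object is destroyed.
struct pinned_pair_t {
    std::unique_lock<std::mutex> first;
    std::unique_lock<std::mutex> second;
};

// Latch two DISTINCT pages in ascending id order, whatever order the
// caller names them in. Passing the same page twice is not supported.
inline pinned_pair_t pin_in_order(page_t& x, page_t& y) {
    page_t& lo = (x.id < y.id) ? x : y;
    page_t& hi = (x.id < y.id) ? y : x;
    pinned_pair_t pins;
    pins.first  = std::unique_lock<std::mutex>(lo.latch);
    pins.second = std::unique_lock<std::mutex>(hi.latch);
    return pins;
}

// Move `amount` from one page to the other with both pages latched.
inline void transfer(page_t& from, page_t& to, long amount) {
    pinned_pair_t pins = pin_in_order(from, to);
    from.value -= amount;
    to.value   += amount;
}

// Two cooperating threads that name the pages in opposite orders.
// Because both latch the lower id first, neither can hold one page
// while waiting for the other thread's page.
inline long run_two_threads(page_t& a, page_t& b, int rounds) {
    std::thread t1([&] { for (int i = 0; i < rounds; ++i) transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < rounds; ++i) transfer(b, a, 1); });
    t1.join();
    t2.join();
    return a.value + b.value;
}
```

 * With pages A (id 1) and B (id 2), both threads acquire A's latch before
 * B's, so the A-then-B versus B-then-A cycle described above cannot form.
 * Any total order agreed on by all cooperating threads would serve equally
 * well; page number is simply a convenient choice for this sketch.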
 */

/**\defgroup SSMAPIDEBUG Debugging the Storage Manager
 * \ingroup SSMDEBUG
 *
 * The storage manager contains a few methods that are useful for
 * debugging purposes. Some of these should be used for no other
 * purpose, as they are not thread-safe, or might be very expensive.
 */
/**\defgroup TLS Thread-Local Variables
 * \ingroup MISC
 */
/**\defgroup UNUSED Unused code
 * \ingroup MISC
 */


/**\defgroup OPT Configuring and Building the Storage Manager
 */

/**\defgroup IMPLGRP Implementation Notes
 * See \ref IMPLNOTES "this page" for some implementation details.
 */

/**\defgroup REFSGRP References
 * See \ref REFERENCES "this page" for references to selected papers
 * from which ideas are used in the Shore Storage Manager.
 */