Devices and Volumes
[Storage Structures]


Detailed Description

The storage manager was designed to permit multiple volumes on a device, with a volume analogous to a Unix partition and a device analogous to a disk, and the original SHORE contained symmetric peer servers. However good that intention, multiple volumes on a device were never implemented, times have changed, and the storage manager no longer has any notion of remote and local volumes. The notion of a volume, separate from a device, remains, but may some day disappear.

For the time being, a device contains at most one volume.

A device is either an operating system file or an operating system device (e.g., raw disk partition) and is identified by a path name (absolute or relative).

A device has a quota. A device is intended to have multiple volumes on it, but in the current implementation the maximum number of volumes is exactly 1.

A volume is where data are stored. Each volume is a header and a set of pages. All pages are the same size (this is a compile-time constant, the default being 8K and sizes up to 64K permissible).

A volume is identified uniquely and persistently by a long volume ID (lvid_t), which is stored in its header. Volumes can be used whenever the device they are located on is mounted by the SM. Volumes have a quota. The sum of the quotas of all the volumes on a device cannot exceed the device quota.
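
The quota rule above can be sketched as a small check. This is an illustration only: the function fits_device_quota and the alias KB are hypothetical names, not part of the storage manager, and the real check may differ in detail.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a size in kilobytes; not the storage manager's smksize_t.
using KB = std::uint64_t;

// Returns true if a new volume with quota new_quota_kb fits on the device,
// i.e. the quotas of the existing volumes plus the new one do not exceed
// the device quota.
bool fits_device_quota(KB device_quota_kb,
                       const std::vector<KB>& existing_volume_quotas_kb,
                       KB new_quota_kb) {
    const KB used = std::accumulate(existing_volume_quotas_kb.begin(),
                                    existing_volume_quotas_kb.end(), KB{0});
    // Subtract instead of adding so the comparison cannot overflow.
    return used <= device_quota_kb && new_quota_kb <= device_quota_kb - used;
}
```

With at most one volume per device, the list of existing quotas is either empty or holds a single entry.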

A volume contains a variety of data structures. All user data reside in stores. A store is a collection of the pages on the volume, allocated in extents of a size that is a compile-time constant. (The storage manager has only been tested with an extent-size of 8 pages. The compile-time constant can be changed, but it also requires changes elsewhere in the code to maintain alignment of persistent structures. See the comments in config/shore.def.) Thus, the minimum size of a store is one extent's worth of pages. Larger extents provide better clustering, but more wasted space if small files and small indexes will be common.
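
The extent arithmetic can be made concrete with a short sketch. It assumes the default page size (8K) and the tested extent size (8 pages) mentioned above; the names kPageSize, kExtentPages, extents_for_pages and reserved_bytes are hypothetical and do not appear in the storage manager.

```cpp
#include <cassert>
#include <cstddef>

// Assumed values taken from the text: the default page size and the only
// tested extent size. Both are compile-time constants in the real system.
constexpr std::size_t kPageSize = 8 * 1024;
constexpr std::size_t kExtentPages = 8;

// Number of extents needed to hold `pages` pages. A store occupies at least
// one extent, even when it holds no data pages yet.
constexpr std::size_t extents_for_pages(std::size_t pages) {
    return pages == 0 ? 1 : (pages + kExtentPages - 1) / kExtentPages;
}

// Bytes of volume space reserved for a store holding `pages` pages.
constexpr std::size_t reserved_bytes(std::size_t pages) {
    return extents_for_pages(pages) * kExtentPages * kPageSize;
}
```

Under these assumptions a store holding a single page still reserves 64K, which is the wasted space the paragraph above refers to.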

Stores are identified by a store number (snum_t).

Each volume contains a few stores that are "overhead":
0 -- reserved for an extent map and a store map
1 -- directory (dir_m)
2 -- root index

Beyond that, for each (user) file created, 2 stores are used, one for small objects, one for large objects, and for each index (btree, rtree) created 1 store is used.
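
The store accounting described above amounts to a simple formula, sketched here. The names kOverheadStores and stores_in_use are hypothetical and serve only to illustrate the count.

```cpp
#include <cassert>
#include <cstddef>

// Stores 0, 1 and 2 are overhead: extent/store maps, directory, root index.
constexpr std::size_t kOverheadStores = 3;

// Stores in use on a volume holding `files` user files and `indexes` indexes:
// each file takes two stores (small and large objects), each index takes one.
constexpr std::size_t stores_in_use(std::size_t files, std::size_t indexes) {
    return kOverheadStores + 2 * files + indexes;
}
```

For example, a volume with 4 files and 3 indexes uses 3 + 8 + 3 = 14 stores.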

Each volume is laid out thus:


Functions

static rc_t ss_m::set_disk_delay (u_int milli_sec)
 Set sleep time before I/O operations.
static rc_t ss_m::format_dev (const char *device, smksize_t quota_in_KB, bool force)
 Format a device.
static rc_t ss_m::mount_dev (const char *device, u_int &vol_cnt, devid_t &devid, vid_t local_vid=vid_t::null)
 Mount a device.
static rc_t ss_m::dismount_dev (const char *device)
 Dismount a device.
static rc_t ss_m::dismount_all ()
 Dismount all mounted devices.
static rc_t ss_m::list_devices (const char **&dev_list, devid_t *&devid_list, u_int &dev_cnt)
 Return a list of all mounted devices.
static rc_t ss_m::list_volumes (const char *device, lvid_t *&lvid_list, u_int &lvid_cnt)
 Return a list of all volumes on a device.
static rc_t ss_m::get_device_quota (const char *device, smksize_t &quota_KB, smksize_t &quota_used_KB)
 Get the device quota.
static rc_t ss_m::set_fake_disk_latency (vid_t vid, const int adelay)
 Change the fake disk latency before I/Os on this volume, for debugging purposes.
static rc_t ss_m::enable_fake_disk_latency (vid_t vid)
 Enable the fake disk latency before I/Os on this volume, for debugging purposes.
static rc_t ss_m::disable_fake_disk_latency (vid_t vid)
 Disable the fake disk latency before I/Os on this volume, for debugging purposes.
static rc_t ss_m::generate_new_lvid (lvid_t &lvid)
 Generate a unique long volume ID.
static rc_t ss_m::create_vol (const char *device_name, const lvid_t &lvid, smksize_t quota_KB, bool skip_raw_init=false, vid_t local_vid=vid_t::null, const bool apply_fake_io_latency=false, const int fake_disk_latency=0)
 Add a volume to a device.
static rc_t ss_m::destroy_vol (const lvid_t &lvid)
 Destroy a volume.
static rc_t ss_m::get_volume_quota (const lvid_t &lvid, smksize_t &quota_KB, smksize_t &quota_used_KB)
 Gets the quotas associated with the volume.
static rc_t ss_m::get_du_statistics (vid_t vid, sm_du_stats_t &du, bool audit=true)
 Analyze a volume and report statistics regarding disk usage.
static rc_t ss_m::get_du_statistics (const stid_t &stid, sm_du_stats_t &du, bool audit=true)
 Analyze a store and report statistics regarding disk usage.
static rc_t ss_m::get_volume_meta_stats (vid_t vid, SmVolumeMetaStats &volume_stats, concurrency_t cc=t_cc_none)
 Analyze a volume and collect brief statistics about its usage.
static rc_t ss_m::get_file_meta_stats (vid_t vid, w_base_t::uint4_t num_files, SmFileMetaStats *file_stats, bool batch_calculate=false, concurrency_t cc=t_cc_none)
 Analyze a volume and collect brief statistics about the usage of its files.
static rc_t ss_m::vol_root_index (const vid_t &v, stid_t &iid)
 Get the index ID of the root index of the volume.
static rc_t ss_m::lvid_to_vid (const lvid_t &lvid, vid_t &vid)
 Return the short volume ID of a volume.
static rc_t ss_m::vid_to_lvid (vid_t vid, lvid_t &lvid)
 Return the long volume ID of a volume.


Function Documentation

static rc_t ss_m::set_disk_delay ( u_int  milli_sec  )  [static, inherited]

Set sleep time before I/O operations.

This method sets a delay of milli_sec milliseconds to occur before each disk read/write operation. It is intended for debugging and is useful in discovering thread synchronization bugs. The delay applies to all threads.

static rc_t ss_m::format_dev ( const char *  device,
smksize_t  quota_in_KB,
bool  force 
) [static, inherited]

Format a device.

Parameters:
[in] device Operating-system file name of the "device".
[in] quota_in_KB Quota in kilobytes.
[in] force If true, format the device even if it already exists.
Since raw devices always "exist", force should be given as true for raw devices.

A device may not be formatted if it is already mounted.

Note:
This method should not be called in the context of a transaction.
Examples:
create_rec.cpp, log_exceed.cpp, sort_stream.cpp, and vtable_example.cpp.

static rc_t ss_m::mount_dev ( const char *  device,
u_int &  vol_cnt,
devid_t &  devid,
vid_t  local_vid = vid_t::null 
) [static, inherited]

Mount a device.

Parameters:
[in] device Operating-system file name of the "device".
[out] vol_cnt Number of volumes on the device.
[out] devid A local device id assigned by the storage manager.
[in] local_vid A local handle to the (only) volume on the device, to be used when a volume is mounted. The default, vid_t::null, indicates that the storage manager can choose a value for this.
Note:
It is fine to mount a device more than once, as long as device is always the same (you cannot specify a hard link or soft link to an entity mounted under a different path). Device mounts are not reference-counted, so a single dismount_dev renders the volumes on the device unusable.

This method should not be called in the context of a transaction.

Examples:
create_rec.cpp, log_exceed.cpp, sort_stream.cpp, and vtable_example.cpp.
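
The consequence of mounts not being reference-counted, as stated in the note above, can be shown with a toy model. MountTable is a hypothetical class written for this illustration; it is not the storage manager's mount table and ignores everything except the counting behavior.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy model of a mount table without reference counting.
class MountTable {
public:
    // Mounting an already-mounted path is accepted and changes nothing.
    void mount(const std::string& path) { mounted_.insert(path); }

    // A single dismount removes the device no matter how often it was mounted.
    // Returns true if the path was mounted.
    bool dismount(const std::string& path) { return mounted_.erase(path) > 0; }

    bool is_mounted(const std::string& path) const {
        return mounted_.count(path) > 0;
    }

private:
    std::set<std::string> mounted_;  // keyed by the exact path given at mount time
};
```

In this model, two calls to mount followed by one call to dismount leave the device unmounted, which mirrors the documented behavior of dismount_dev.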

static rc_t ss_m::dismount_dev ( const char *  device  )  [static, inherited]

Dismount a device.

Parameters:
[in] device Operating-system file name of the "device".
Note:
It is fine to mount a device more than once, as long as device is always the same (you cannot specify a hard link or soft link to an entity mounted under a different path). Device mounts are not reference-counted, so a single dismount_dev renders the volumes on the device unusable.

This method should not be called in the context of a transaction.

static rc_t ss_m::dismount_all (  )  [static, inherited]

Dismount all mounted devices.

Note:
This method should not be called in the context of a transaction.

static rc_t ss_m::list_devices ( const char **&  dev_list,
devid_t *&  devid_list,
u_int &  dev_cnt 
) [static, inherited]

Return a list of all mounted devices.

Parameters:
[out] dev_list Returned list of pointers directly into the mount table.
[out] devid_list Returned list of associated device ids.
[out] dev_cnt Returned number of entries in the two above lists.
The storage manager allocates the arrays returned with new[], and the caller must return these to the heap with delete[] if they are not null. They will be null if an error is returned or if no devices are mounted.

The strings to which dev_list[*] point are not to be deleted by the caller.
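
The ownership rules above (free the arrays, leave the strings alone) can be sketched as follows. Since the real headers are not assumed here, fake_list_devices is a hypothetical stand-in that follows the same contract as ss_m::list_devices(), with unsigned used in place of devid_t and u_int; mounted_device_names is likewise an illustrative helper.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in "mount table": the strings live here and are never owned by callers.
static const char* const kMountTable[] = {"./volumes/dev1", "./volumes/dev2"};

// Hypothetical stand-in with the ownership contract described above:
// both arrays are allocated with new[]; the strings are not copied.
void fake_list_devices(const char**& dev_list, unsigned*& devid_list,
                       unsigned& dev_cnt) {
    dev_cnt = 2;
    dev_list = new const char*[dev_cnt];
    devid_list = new unsigned[dev_cnt];
    for (unsigned i = 0; i < dev_cnt; ++i) {
        dev_list[i] = kMountTable[i];  // points into the table, not a copy
        devid_list[i] = i + 1;
    }
}

// Copies the device names into caller-owned strings, then frees only the arrays.
std::vector<std::string> mounted_device_names() {
    const char** dev_list = nullptr;
    unsigned* devid_list = nullptr;
    unsigned dev_cnt = 0;
    fake_list_devices(dev_list, devid_list, dev_cnt);

    std::vector<std::string> names;
    for (unsigned i = 0; i < dev_cnt; ++i) names.emplace_back(dev_list[i]);

    delete[] dev_list;    // safe when null; never delete dev_list[i] itself
    delete[] devid_list;
    return names;
}
```

Copying the names before releasing the arrays keeps the caller independent of the mount table's lifetime; whether that matters in a given server is a design choice.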

static rc_t ss_m::list_volumes ( const char *  device,
lvid_t *&  lvid_list,
u_int &  lvid_cnt 
) [static, inherited]

Return a list of all volumes on a device.

Parameters:
[in] device Operating-system file name of the "device".
[out] lvid_list Returned list of the long volume IDs of the volumes on the device.
[out] lvid_cnt Returned length of list lvid_list.
The storage manager allocates the array lvid_list with new[], and the caller must return it to the heap with delete[] if it is not null. It will be null if an error is returned.

Note:
This method should not be called in the context of a transaction.
Examples:
create_rec.cpp, log_exceed.cpp, sort_stream.cpp, and vtable_example.cpp.

static rc_t ss_m::get_device_quota ( const char *  device,
smksize_t &  quota_KB,
smksize_t &  quota_used_KB 
) [static, inherited]

Get the device quota.

Parameters:
[in] device Operating-system file name of the "device".
[out] quota_KB Returned quota in kilobytes
[out] quota_used_KB Returned portion of quota allocated to volumes
The quota_used_KB is the portion of the quota allocated to volumes on the device.

Note:
This method may be called in the context of a transaction.


static rc_t ss_m::set_fake_disk_latency ( vid_t  vid,
const int  adelay 
) [static, inherited]

Change the fake disk latency before I/Os on this volume, for debugging purposes.

Parameters:
[in] vid The ID of the volume of interest.
[in] adelay Nanoseconds to sleep with nanosleep()
This is for debugging only. Changing the value of the latency for a volume does not enable the delay.

static rc_t ss_m::enable_fake_disk_latency ( vid_t  vid  )  [static, inherited]

Enable the fake disk latency before I/Os on this volume, for debugging purposes.

Parameters:
[in] vid The ID of the volume of interest.
This is for debugging only. When this is enabled, it uses whatever disk latency was set with ss_m::create_vol() or the last applied ss_m::set_fake_disk_latency().

static rc_t ss_m::disable_fake_disk_latency ( vid_t  vid  )  [static, inherited]

Disable the fake disk latency before I/Os on this volume, for debugging purposes.

Parameters:
[in] vid The ID of the volume of interest.
This is for debugging only.
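
The interplay of the three fake-latency calls above can be modeled with a small class. FakeLatency is a hypothetical illustration of the documented semantics (setting a value does not enable the delay; enabling uses the last value set). It assumes that a value set while the delay is enabled takes effect for the next I/O, which the text does not state explicitly.

```cpp
#include <cassert>

// Toy model of the per-volume fake latency switches; names are illustrative.
class FakeLatency {
public:
    // Changes the stored delay only; it does not turn the delay on.
    void set(int delay_ns) { delay_ns_ = delay_ns; }
    void enable() { enabled_ = true; }
    void disable() { enabled_ = false; }

    // Delay that would be applied before the next I/O, in nanoseconds.
    int effective_delay_ns() const { return enabled_ ? delay_ns_ : 0; }

private:
    int delay_ns_ = 0;      // last value set
    bool enabled_ = false;  // off until explicitly enabled
};
```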

static rc_t ss_m::generate_new_lvid ( lvid_t &  lvid  )  [static, inherited]

Generate a unique long volume ID.

Parameters:
[out] lvid Long volume id to be used on ss_m::create_vol().
This generates a unique volume identifier to be written persistently on the volume when it is formatted. This enables us to avoid the mistake of doubly-mounting a volume. The identifier is constructed from the machine network address and the time of day.
Examples:
create_rec.cpp, log_exceed.cpp, sort_stream.cpp, and vtable_example.cpp.

static rc_t ss_m::create_vol ( const char *  device_name,
const lvid_t &  lvid,
smksize_t  quota_KB,
bool  skip_raw_init = false,
vid_t  local_vid = vid_t::null,
const bool  apply_fake_io_latency = false,
const int  fake_disk_latency = 0 
) [static, inherited]

Add a volume to a device.

Parameters:
[in] device_name Operating-system file name of the "device".
[in] lvid Long volume id to use when formatting the new volume.
[in] quota_KB Quota in kilobytes.
[in] skip_raw_init Do not initialize the volume if on a raw device.
[in] local_vid Short volume id by which to refer to this volume. If null, the storage manager will assign one.
[in] apply_fake_io_latency See ss_m::enable_fake_disk_latency()
[in] fake_disk_latency See ss_m::set_fake_disk_latency()
Note:
This method should not be called in the context of a transaction.
The pages on the volume must be zeroed; you can only use skip_raw_init = true if you have by some other means already initialized the volume.
Examples:
create_rec.cpp, log_exceed.cpp, sort_stream.cpp, and vtable_example.cpp.

static rc_t ss_m::destroy_vol ( const lvid_t &  lvid  )  [static, inherited]

Destroy a volume.

Parameters:
[in] lvid Long volume id by which the volume is known.
Note:
This method should not be called in the context of a transaction.

static rc_t ss_m::get_volume_quota ( const lvid_t &  lvid,
smksize_t &  quota_KB,
smksize_t &  quota_used_KB 
) [static, inherited]

Gets the quotas associated with the volume.

Parameters:
[in] lvid Long volume id by which the volume is known.
[out] quota_KB Quota given when the volume was created.
[out] quota_used_KB Portion of the quota that has been used by allocated extents.

static rc_t ss_m::get_du_statistics ( vid_t  vid,
sm_du_stats_t &  du,
bool  audit = true 
) [static, inherited]

Analyze a volume and report statistics regarding disk usage.

Parameters:
[in] vid The volume of interest.
[out] du The structure that will hold the collected statistics.
[in] audit If "true", the method acquires a share lock on the volume and then will check assertions about the correctness of the data structures on the volume. If the audit fails an internal fatal error is generated to facilitate debugging. (It will generate a core file if your shell permits such.) If "false" an IS lock is acquired, which means that the statistics will be fuzzy.
Using the audit feature is useful for debugging. It is the only safe way to use this method.
Note:
The statistics are added to the sm_du_stats_t structure passed in. This structure is not cleared by the storage manager.
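
Because the statistics are added to the structure rather than assigned, a caller that reuses one structure for several calls gets running totals. The sketch below illustrates this with hypothetical stand-ins: DuStats is a heavily reduced substitute for sm_du_stats_t, its single field pages_used is invented for the example, and fake_get_du_statistics only mimics the accumulate-don't-clear contract.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, heavily reduced stand-in for sm_du_stats_t.
struct DuStats {
    std::uint64_t pages_used = 0;
    // Resets every counter to zero; the caller decides when to do this.
    void clear() { *this = DuStats{}; }
};

// Mimics the documented contract: results are added to `du`, never assigned.
void fake_get_du_statistics(std::uint64_t pages_on_volume, DuStats& du) {
    du.pages_used += pages_on_volume;
}
```

A caller that wants per-volume figures would reset the structure between calls; a caller that wants totals across volumes would not.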

static rc_t ss_m::get_du_statistics ( const stid_t &  stid,
sm_du_stats_t &  du,
bool  audit = true 
) [static, inherited]

Analyze a store and report statistics regarding disk usage.

Parameters:
[in] stid The store of interest.
[out] du The structure that will hold the collected statistics.
[in] audit If "true", the method acquires a share lock on the store and then will check assertions about the correctness of the data structures on the store.
Using the audit feature is useful for debugging. It is the only safe way to use this method.

static rc_t ss_m::get_volume_meta_stats ( vid_t  vid,
SmVolumeMetaStats &  volume_stats,
concurrency_t  cc = t_cc_none 
) [static, inherited]

Analyze a volume and collect brief statistics about its usage.

Parameters:
[in] vid The volume of interest.
[out] volume_stats The statistics are written here.
[in] cc Indicates whether the volume is to be locked by this method. Acceptable values are t_cc_none and t_cc_volume.
If no lock is acquired, the method can fail with eRETRY.
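
One way a caller might cope with eRETRY when no lock is requested is a bounded retry loop, sketched below. Everything here is hypothetical: Status stands in for rc_t (only Retry corresponds to eRETRY), and retry_on_retry is an illustrative helper, not a storage manager function. Requesting t_cc_volume instead is the alternative the parameter description offers.

```cpp
#include <cassert>
#include <functional>

// Hypothetical status codes standing in for rc_t; only Retry mirrors eRETRY.
enum class Status { Ok, Retry, Error };

// Calls `op` until it stops reporting Retry or max_attempts is exhausted.
// Returns the last status seen (Retry if max_attempts is not positive).
Status retry_on_retry(const std::function<Status()>& op, int max_attempts) {
    Status s = Status::Retry;
    for (int i = 0; i < max_attempts && s == Status::Retry; ++i) {
        s = op();
    }
    return s;
}
```

The bound keeps a busy volume from stalling the caller indefinitely; the appropriate limit depends on the workload.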

static rc_t ss_m::get_file_meta_stats ( vid_t  vid,
w_base_t::uint4_t  num_files,
SmFileMetaStats *  file_stats,
bool  batch_calculate = false,
concurrency_t  cc = t_cc_none 
) [static, inherited]

Analyze a volume and collect brief statistics about the usage of its files.

Parameters:
[in] vid The volume of interest.
[in] num_files The size of the array file_stats.
[out] file_stats Preallocated array of structs into which to write the statistics for the individual files inspected.
[in] batch_calculate True means make one pass over the volume.
[in] cc Indicates whether the volume is to be locked by this method. Acceptable values are t_cc_none and t_cc_volume.
If no lock is acquired and batch_calculate is not set, the method can fail with eRETRY.

If batch_calculate is true then this works by making one pass over the meta data, but it looks at all the meta data. This should be the faster way to do the analysis when there are many files, and when files use a large portion of the volume.

If batch_calculate is false then each file is updated individually, only looking at the extent information for that particular file. This requires a pass over the volume for each file. (Seek-wise it is less efficient.)

static rc_t ss_m::vol_root_index ( const vid_t &  v,
stid_t &  iid 
) [inline, static, inherited]

Get the index ID of the root index of the volume.

Parameters:
[in] v Volume of interest.
[out] iid Store ID of the root index.
Each volume has a root index, which is a well-known index available to the server for bootstrapping a database.
Examples:
create_rec.cpp, log_exceed.cpp, sort_stream.cpp, and vtable_example.cpp.

Definition at line 1966 of file sm.h.

References RCOK, stid_t::store, and stid_t::vol.

static rc_t ss_m::lvid_to_vid ( const lvid_t &  lvid,
vid_t &  vid 
) [static, inherited]

Return the short volume ID of a volume.

Parameters:
[in] lvid Long (persistent) volume ID found on the volume's header.
[out] vid Short volume ID of a mounted volume.

static rc_t ss_m::vid_to_lvid ( vid_t  vid,
lvid_t &  lvid 
) [static, inherited]

Return the long volume ID of a volume.

Parameters:
[in] vid Short volume ID of a mounted volume.
[out] lvid Long (persistent) volume ID found on the volume's header.


Generated on Wed Jul 7 17:22:34 2010 for Shore Storage Manager by  doxygen 1.4.7