ssm_sort::sort_keys_t Class Reference
[Sorting]

Detailed Description

Parameter to control behavior of sort_file.

This class determines how the sort will behave, and it holds descriptions of the keys to be used for a sort. For details of the effect on a sort, see Behavior and Results of Sort. For details about sort keys, see Keys.

Whoever creates this data structure determines which key is the most significant, which is the next-most significant, and so on. Key 0 is compared first, key 1 next, and so on.
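The key ordering described above can be illustrated with a minimal, self-contained sketch. It does not use the ssm_sort API: Record, KeyCompare, compare_records and example_keys are hypothetical names introduced only for this example, and the real sort works on stored records rather than in-memory structs.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical record type, for illustration only (not an ssm_sort type).
struct Record {
    int dept;
    int salary;
};

// A per-key comparison: negative, zero, or positive, in the style of memcmp.
using KeyCompare = std::function<int(const Record&, const Record&)>;

// Three-way compare of two ints without risk of overflow.
int cmp_int(int x, int y) { return (x > y) - (x < y); }

// Compare two records key by key. Key 0 is the most significant;
// a later key is consulted only when all earlier keys compare equal.
int compare_records(const std::vector<KeyCompare>& keys,
                    const Record& a, const Record& b) {
    for (std::size_t i = 0; i < keys.size(); ++i) {
        int c = keys[i](a, b);
        if (c != 0) return c;
    }
    return 0;
}

// Key 0 = dept, key 1 = salary.
std::vector<KeyCompare> example_keys() {
    return {
        [](const Record& a, const Record& b) { return cmp_int(a.dept, b.dept); },
        [](const Record& a, const Record& b) { return cmp_int(a.salary, b.salary); },
    };
}

// Usage: dept decides first; salary only breaks ties within a dept.
void demo() {
    auto keys = example_keys();
    assert(compare_records(keys, Record{1, 900}, Record{2, 100}) < 0);
    assert(compare_records(keys, Record{1, 900}, Record{1, 100}) > 0);
}
```

Swapping the two entries returned by example_keys() would make salary the most significant key, which is the sense in which the creator of the structure decides key significance.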

For related types, see the ssm_sort namespace.

Definition at line 856 of file sort_s.h.

Public Types

typedef ssm_sort::LEXFUNC LEXFUNC

Public Member Functions

NORET sort_keys_t (int nkeys)
 Create a structure that's ready to be populated with nkeys keys.
NORET ~sort_keys_t ()
NORET sort_keys_t (const sort_keys_t &old)
 Copy constructor.
int nkeys () const
 Return number of keys.
int set_for_index ()
 Output is to be suitable for index bulk-load, index key is sort key.
int set_for_index (CSKF lfunc, key_cookie_t ck)
 Output is to be suitable for index bulk-load, index key is different from the sort key.
bool is_for_index () const
 Return true if set_for_index() was called.
bool is_for_file () const
 Return true if set_for_file() was called.
bool is_stable () const
 Return true if a stable sort was requested with set_stable().
void set_stable (bool val)
 Ensure stable sort. Cannot be used with set_for_index.
CSKF lexify_index_key () const
 Return the function that creates or locates and/or lexifies the key.
key_cookie_t lexify_index_key_cookie () const
 Pointer to input datum for the associated lexify() CSKF.
int set_for_file (bool deepcopy, bool keeporig, bool carry_obj)
 Output is to be the input records, sorted.
int set_object_marshal (MOF marshal, UMOF unmarshal, key_cookie_t c)
 Ensure marshaling and unmarshaling of the objects.
MOF marshal_func () const
 Return the marshal function or noMOF if none was given.
UMOF unmarshal_func () const
 Return the unmarshal function or noUMOF if none was given.
key_cookie_t marshal_cookie () const
 Pointer to datum for marshal and unmarshal functions.
bool is_ascending () const
 True if sort will be in ascending order.
int set_ascending (bool value=true)
 Ensure that sort will be in ascending order.
bool is_unique () const
 True if duplicate keys and their records will be removed.
int set_unique (bool value=true)
 Ensure that duplicate keys and their records will be removed.
bool null_unique () const
 True if duplicate null keys and their records will be removed.
int set_null_unique (bool value=true)
 Ensure that duplicate null keys and their records will be removed.
bool carry_obj () const
 True if the sort will copy the entire objects through each phase.
int set_carry_obj (bool value=true)
 Control whether the sort will copy the entire objects through each phase.
bool deep_copy () const
 True if the sort will copy the entire objects to the result file.
int set_deep_copy (bool value=true)
 Control whether the sort will copy the entire objects to the result file.
bool keep_orig () const
 Return true if the sort will not destroy the input file.
int set_keep_orig (bool value=true)
 Control whether the sort will not destroy the input file.
int set_sortkey_fixed (int keyindex, smsize_t off, smsize_t len, bool in_header, bool aligned, bool lexico, CF cfunc)
 Set attributes of key.
int set_sortkey_derived (int keyindex, CSKF gfunc, key_cookie_t cookie, bool in_header, bool aligned, bool lexico, CF cfunc)
 Set attributes of key.
key_location_t & get_location (int i)
 Return the key-location information for a given fixed-location key.
smsize_t offset (int i) const
 Return the offset for a given fixed-location key.
smsize_t length (int i) const
 Return the length for a given fixed-location key.
CSKF keycreate (int i) const
 Return the CSKF function for a given derived key.
key_cookie_t cookie (int i) const
 Return the argument to the CSKF function for a given derived key.
CF keycmp (int i) const
 Return the key-comparison function for a given key.
bool is_lexico (int i) const
 Return true if key i is in lexicographic format in the input record.
bool is_fixed (int i) const
 Return true if key i is in a fixed location in all input records.
bool is_aligned (int i) const
 Return true if key i is suitably aligned in the input record for the key-comparison function.
bool in_hdr (int i) const
 True if the key or the source of a derived key is to be found in the record header.

Static Public Member Functions

static w_rc_t noLEXFUNC (const void *source, smsize_t len, void *sink)
 Lexify callback function that does a simple memory copy.
static w_rc_t noCSKF (const rid_t &rid, const object_t &obj, key_cookie_t cookie, factory_t &f, skey_t *out)
 Vacuous callback function, does nothing.
static w_rc_t generic_CSKF (const rid_t &rid, const object_t &in_obj, key_cookie_t cookie, factory_t &f, skey_t *out)
 Either copies or lexifies a key.
static w_rc_t noMOF (const rid_t &, const object_t &, key_cookie_t, object_t *)
 Vacuous Marshal Object Function.
static w_rc_t noUMOF (const rid_t &, const object_t &, key_cookie_t, object_t *)
 Vacuous Unmarshal Object Function.
LEXFUNCs
LEXFUNC (q.v.) functions for fundamental types.

static w_rc_t f8_lex (const void *source, smsize_t len, void *sink)
static w_rc_t f4_lex (const void *source, smsize_t len, void *sink)
static w_rc_t u8_lex (const void *source, smsize_t len, void *sink)
static w_rc_t i8_lex (const void *source, smsize_t len, void *sink)
static w_rc_t u4_lex (const void *source, smsize_t len, void *sink)
static w_rc_t i4_lex (const void *source, smsize_t len, void *sink)
static w_rc_t u2_lex (const void *source, smsize_t len, void *sink)
static w_rc_t i2_lex (const void *source, smsize_t len, void *sink)
static w_rc_t u1_lex (const void *source, smsize_t len, void *sink)
static w_rc_t i1_lex (const void *source, smsize_t len, void *sink)
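To give a feel for what "lexifying" a fundamental type means, here is a small sketch of one common encoding: write the value most-significant byte first and, for signed types, flip the sign bit, so that a bytewise comparison agrees with numeric order. This is an assumption about the general technique, not a copy of the Shore LEXFUNCs; sketch_u4_lex, sketch_i4_lex and lex_less_i4 are hypothetical names and take simplified arguments.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Illustrative only: write a 32-bit unsigned value most-significant byte
// first, so that memcmp() on the output agrees with numeric order.
void sketch_u4_lex(std::uint32_t v, unsigned char* sink) {
    sink[0] = static_cast<unsigned char>(v >> 24);
    sink[1] = static_cast<unsigned char>(v >> 16);
    sink[2] = static_cast<unsigned char>(v >> 8);
    sink[3] = static_cast<unsigned char>(v);
}

// Illustrative only: for a signed value, flip the sign bit first so that
// negative numbers sort before non-negative ones under memcmp().
void sketch_i4_lex(std::int32_t v, unsigned char* sink) {
    std::uint32_t u = static_cast<std::uint32_t>(v) ^ 0x80000000u;
    sketch_u4_lex(u, sink);
}

// True if a sorts strictly before b when both are compared as lexified bytes.
bool lex_less_i4(std::int32_t a, std::int32_t b) {
    unsigned char ka[4], kb[4];
    sketch_i4_lex(a, ka);
    sketch_i4_lex(b, kb);
    return std::memcmp(ka, kb, 4) < 0;
}

// Usage: bytewise order of the lexified form matches numeric order.
void demo() {
    assert(lex_less_i4(-5, 3));
    assert(lex_less_i4(1, 256));
}
```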


Member Function Documentation

static w_rc_t ssm_sort::sort_keys_t::noLEXFUNC ( const void *  source,
smsize_t  len,
void *  sink 
) [static]

Lexify callback function that does a simple memory copy.

Parameters:
[in] source Pointer to start of key
[in] len Length of key
[out] sink Pointer to output buffer
Does no reformatting; simply copies from source to sink.

static w_rc_t ssm_sort::sort_keys_t::noCSKF ( const rid_t &  rid,
const object_t &  obj,
key_cookie_t  cookie,
factory_t &  f,
skey_t *  out 
) [static]

Vacuous callback function, does nothing.

Parameters:
[in] rid Ignored.
[in] obj Ignored.
[in] cookie Ignored.
[in] f Ignored.
[out] out Ignored.
This function should never be used. It is a default value. The sort_file checks for
 sort_keys_t::lexify() == sort_keys_t::noCSKF 
and if so, bypasses any code connected with key creation, using the object_t it would have passed in to this function as if it were the output of this function. This comparison and bypass is faster than executing the prologue and epilogue code to acquire space and release it, needed when a CSKF is called.
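The sentinel-and-bypass idea described above can be sketched as follows. This is a simplified illustration, not the sort_file code: KeyFunc, no_key_func, prefix_key_func and key_for are hypothetical, and the callback signature is deliberately much simpler than a real CSKF.

```cpp
#include <cassert>
#include <string>

// Hypothetical callback type standing in for a CSKF; not the real signature.
using KeyFunc = bool (*)(const std::string& record, std::string* key_out);

// Sentinel default: its address marks "no key creation needed".
// It is never meant to be invoked.
bool no_key_func(const std::string&, std::string*) { return false; }

// A real callback: uses the first 3 bytes of the record as the key.
bool prefix_key_func(const std::string& record, std::string* key_out) {
    *key_out = record.substr(0, 3);
    return true;
}

// Returns the key to sort on. When the callback is the sentinel, the input
// is used as if it were the callback's output, and no space is prepared.
std::string key_for(const std::string& record, KeyFunc f) {
    if (f == no_key_func) {
        return record;  // bypass: one pointer comparison, no callback call
    }
    std::string key;    // stands in for acquiring output space
    f(record, &key);
    return key;
}

// Usage: the sentinel passes the input through; a real callback derives a key.
void demo() {
    assert(key_for("abcdef", no_key_func) == "abcdef");
    assert(key_for("abcdef", prefix_key_func) == "abc");
}
```

The design choice is that a single pointer comparison replaces both the call and the surrounding setup and teardown work.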

Referenced by set_for_index(), and set_sortkey_fixed().

static w_rc_t ssm_sort::sort_keys_t::generic_CSKF ( const rid_t &  rid,
const object_t &  in_obj,
key_cookie_t  cookie,
factory_t &  f,
skey_t *  out 
) [static]

Either copies or lexifies a key.

Parameters:
[in] rid Record ID of the record containing the key.
[in] in_obj This refers to the record containing the key.
[in] cookie Must be a pointer to a generic_CSKF_cookie, which tells it which LEXFUNC function to call (noLEXFUNC indicates straight copy), and also tells it the length and location (offset) of the key.
[in] f A heap manager for allocating space.
[out] out Result is written here.
One normally expects the user to provide the entire function for this, but we have this generic version just for simplifying the handling of basic types for backward compatibility.
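As a rough illustration of a cookie-driven key function of this kind, the sketch below locates a key by offset and length and then either copies it or reformats it through a supplied function. All names here (LexFunc, SketchCookie, copy_lex, reverse_lex, sketch_generic_cskf) are hypothetical; the real generic_CSKF_cookie, LEXFUNC and skey_t are declared in sort_s.h and will differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a LEXFUNC-like callback.
using LexFunc = void (*)(const void* source, std::size_t len, void* sink);

// Hypothetical cookie: which function to call and where the key lives.
struct SketchCookie {
    LexFunc     func;    // how to reformat the key (copy_lex = straight copy)
    std::size_t offset;  // where the key starts within the record
    std::size_t length;  // key length in bytes
};

// Straight copy, analogous in spirit to noLEXFUNC.
void copy_lex(const void* source, std::size_t len, void* sink) {
    std::memcpy(sink, source, len);
}

// Reverse the bytes, e.g. to turn a little-endian unsigned integer
// into a most-significant-byte-first key.
void reverse_lex(const void* source, std::size_t len, void* sink) {
    const unsigned char* s = static_cast<const unsigned char*>(source);
    unsigned char* d = static_cast<unsigned char*>(sink);
    for (std::size_t i = 0; i < len; ++i) d[i] = s[len - 1 - i];
}

// Locate the key inside the record using the cookie, then copy or lexify it.
// Returns false if the key would extend past the end of the record.
bool sketch_generic_cskf(const std::vector<unsigned char>& record,
                         const SketchCookie& ck,
                         std::vector<unsigned char>* out) {
    if (ck.offset > record.size() || ck.length > record.size() - ck.offset)
        return false;
    out->resize(ck.length);
    ck.func(record.data() + ck.offset, ck.length, out->data());
    return true;
}

// Usage: the same record yields different keys depending on the cookie.
void demo() {
    std::vector<unsigned char> rec = {9, 9, 1, 2, 3, 4};
    std::vector<unsigned char> out;
    assert(sketch_generic_cskf(rec, SketchCookie{reverse_lex, 2, 4}, &out));
    assert(out.size() == 4 && out[0] == 4 && out[3] == 1);
}
```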

static w_rc_t ssm_sort::sort_keys_t::noMOF ( const rid_t &  ,
const object_t &  ,
key_cookie_t  ,
object_t *  
) [static]

Vacuous Marshal Object Function.

This function is never called; rather, the sort code checks for
 sort_keys_t::marshal_func() == sort_keys_t::noMOF 
and if so, bypasses any code specific to marshalling. This is done because the preparatory work for calling a marshal function includes allocating space for the results, and it is cheaper to bypass it altogether.

static w_rc_t ssm_sort::sort_keys_t::noUMOF ( const rid_t &  ,
const object_t &  ,
key_cookie_t  ,
object_t *  
) [static]

Vacuous Unmarshal Object Function.

This function is never called; rather, the sort code checks for
 sort_keys_t::unmarshal_func() == sort_keys_t::noUMOF 
and if so, bypasses any code specific to unmarshalling. This is done because the preparatory work for calling an unmarshal function includes allocating space for the results, and it is cheaper to bypass it altogether.

int ssm_sort::sort_keys_t::set_for_index (  )  [inline]

Output is to be suitable for index bulk-load, index key is sort key.

Call this if you want the output file to be written with objects of the form hdr == key, body == rid and the input file not to be destroyed. This file is suitable for bulk-loading an index, and the index key is the sort key.

You must provide conversion functions for the sort key to be converted to a lexicographic format string if it is not already in such format in the original record, if the index key (being the sort key) is to be used in a B+-Tree.

Only one sort key is supported when sorting for index bulk-load, but the key may be derived, and so the CSKF callback can combine multiple keys, and lexifying them ensures that they can be sorted as one. This is not entirely sufficient to cover all cases of multiple keys, but it will do for many cases, particularly where the sub-keys are of fixed length.
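The following sketch shows why combining lexified fixed-length sub-keys into one derived key works, and one plausible reason the approach does not cover every case (naive concatenation of variable-length sub-keys loses the boundary between them). The helper names append_u2_lex, composite_fixed and composite_variable are hypothetical and are not part of ssm_sort.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<unsigned char>;

// Append a 16-bit unsigned value most-significant byte first.
void append_u2_lex(std::uint16_t v, Bytes* key) {
    key->push_back(static_cast<unsigned char>(v >> 8));
    key->push_back(static_cast<unsigned char>(v & 0xFF));
}

// Build one composite key from two fixed-length sub-keys. Because each
// sub-key always occupies exactly two bytes, bytewise comparison of the
// result orders by major first and by minor second.
Bytes composite_fixed(std::uint16_t major, std::uint16_t minor) {
    Bytes key;
    append_u2_lex(major, &key);
    append_u2_lex(minor, &key);
    return key;
}

// Naive concatenation of variable-length sub-keys: the boundary between
// the two parts is lost, so distinct pairs can produce the same key.
std::string composite_variable(const std::string& a, const std::string& b) {
    return a + b;
}

// Usage: fixed-length parts compare correctly; variable-length parts collide.
void demo() {
    assert(composite_fixed(1, 300) < composite_fixed(2, 0));
    assert(composite_fixed(1, 2) < composite_fixed(1, 300));
    assert(composite_variable("ab", "c") == composite_variable("a", "bc"));
}
```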

Definition at line 1149 of file sort_s.h.

References noCSKF().

Referenced by set_for_index().

int ssm_sort::sort_keys_t::set_for_index ( CSKF  lfunc,
key_cookie_t  ck 
) [inline]

Output is to be suitable for index bulk-load, index key is different from the sort key.

Parameters:
[in] lfunc Key creation/location function for the index key.
[in] ck Datum for lfunc
Only one sort key can be used.

Call this if you want the output file to be written with objects of the form hdr == key, body==rid and the input file not to be destroyed, and you wish the index key to be different from the sort key.

The lfunc argument must produce an index key in lexicographic format if the index is to be a B+-Tree. This function is called when the record is first encountered (reading the input file), since the record is already pinned to gather a sort key.

Only one sort key is supported when bulk-loading for indexes, but the key may be derived, and so the CSKF callback can combine multiple keys, and lexifying them ensures that they can be sorted as one. This is not entirely sufficient to cover all cases of multiple keys, but it will do for many cases, particularly where the sub-keys are of fixed length.

Definition at line 1189 of file sort_s.h.

References noCSKF(), and set_for_index().

int ssm_sort::sort_keys_t::set_for_file ( bool  deepcopy,
bool  keeporig,
bool  carry_obj 
) [inline]

Output is to be the input records, sorted.

Parameters:
[in] deepcopy Use true if you want a deep copy.
[in] keeporig Use true if you want to retain the input file.
[in] carry_obj Use true if you want to carry along the entire objects through the scratch files and to the output file. Used only for is_for_file().
Call this if you want the output file to contain copies of the input file records, unadulterated, but in sorted order.

Multiple keys are supported. Use of a CSKF is not needed if the keys are embedded in the records, suitably aligned, and do not cross page boundaries (string comparisons excepted, of course, as string-comparison methods can be called repeatedly on successive corresponding portions of string keys).

Definition at line 1232 of file sort_s.h.

References set_carry_obj(), set_deep_copy(), and set_keep_orig().

int ssm_sort::sort_keys_t::set_object_marshal ( MOF  marshal,
UMOF  unmarshal,
key_cookie_t  c 
) [inline]

Ensure marshaling and unmarshaling of the objects.

Parameters:
[in] marshal MOF to be used when reading records from disk.
[in] unmarshal UMOF to be used to write records to disk.
[in] c Argument to marshal and unmarshal.
Call this if the objects in the file need to be byte-swapped or otherwise marshaled before use, and if they need to be unmarshaled before the output file is written. This may be used with set_for_index or set_for_file.

Definition at line 1249 of file sort_s.h.

bool ssm_sort::sort_keys_t::is_unique (  )  const [inline]

True if duplicate keys and their records will be removed.

When duplicates are encountered, they are sorted by record-id, and the larger of the two (per umemcmp) is removed.
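The duplicate-removal rule can be sketched like this: order by key, break ties by record id, and keep only the entry with the smallest record id in each group of equal keys. Entry and sort_unique are hypothetical; in particular, a real record id is a structure compared bytewise rather than a single integer.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical simplified entry; the integer rid stands in for a record id.
struct Entry {
    int      key;
    unsigned rid;
};

// Sort by key, break ties by record id, then keep only the first entry of
// each run of equal keys, i.e. the one with the smallest record id.
std::vector<Entry> sort_unique(std::vector<Entry> v) {
    std::sort(v.begin(), v.end(), [](const Entry& a, const Entry& b) {
        if (a.key != b.key) return a.key < b.key;
        return a.rid < b.rid;
    });
    auto last = std::unique(v.begin(), v.end(),
        [](const Entry& a, const Entry& b) { return a.key == b.key; });
    v.erase(last, v.end());
    return v;
}

// Usage: for key 5 the entry with rid 7 survives and rid 30 is dropped.
void demo() {
    std::vector<Entry> r = sort_unique({{5, 30}, {2, 11}, {5, 7}, {2, 40}});
    assert(r.size() == 2);
    assert(r[1].key == 5 && r[1].rid == 7);
}
```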

Definition at line 1275 of file sort_s.h.

bool ssm_sort::sort_keys_t::null_unique (  )  const [inline]

True if duplicate null keys and their records will be removed.

When duplicates are encountered, they are sorted by record-id, and the larger of the two (per umemcmp) is removed.

Definition at line 1286 of file sort_s.h.

bool ssm_sort::sort_keys_t::carry_obj (  )  const [inline]

True if the sort will copy the entire objects through each phase.

Used when is_for_file only.

Definition at line 1298 of file sort_s.h.

int ssm_sort::sort_keys_t::set_carry_obj ( bool  value = true  )  [inline]

Control whether the sort will copy the entire objects through each phase.

Parameters:
[in] value If true, ensure carry_obj().
Used when is_for_file only. This is useful if the keys are fixed and consume most of the original objects, in which case there is no need for the sort code to duplicate the key as well as the object in the temporary output files, or to re-pin the original records to copy them to the output file.

Definition at line 1311 of file sort_s.h.

Referenced by set_for_file().

int ssm_sort::sort_keys_t::set_deep_copy ( bool  value = true  )  [inline]

Control whether the sort will copy the entire objects to the result file.

Parameters:
[in] value If true, ensure deep_copy().
Used when is_for_file only. When large objects appear in the input file and the input (original) file is not to be kept, sort can copy only the metadata for the large objects and reassign the large-object store to the result file. This eliminates a lot of object creation and logging.

Definition at line 1330 of file sort_s.h.

Referenced by set_for_file().

bool ssm_sort::sort_keys_t::keep_orig (  )  const [inline]

Return true if the sort will not destroy the input file.

Used when is_for_file only. This is turned on automatically when set_for_index.

Definition at line 1340 of file sort_s.h.

int ssm_sort::sort_keys_t::set_keep_orig ( bool  value = true  )  [inline]

Control whether the sort will not destroy the input file.

Parameters:
[in] value If true, ensure keep_orig().
Used when is_for_file only. This is turned on automatically when set_for_index.

Definition at line 1347 of file sort_s.h.

Referenced by set_for_file().

int ssm_sort::sort_keys_t::set_sortkey_fixed ( int  keyindex,
smsize_t  off,
smsize_t  len,
bool  in_header,
bool  aligned,
bool  lexico,
CF  cfunc 
) [inline]

Set attributes of key.

Parameters:
[in] keyindex The ordinal number of the index whose attributes are to be set
[in] off Offset from beginning of record header or body where the key is to be found
[in] len Length of key in record
[in] in_header True indicates that the key is to be found in the record header rather than in the body.
[in] aligned True indicates that the key, as found in the record, is suitably aligned for key comparisons with the CF (key-comparison function) to be used. False means that sort has to make an aligned copy before doing a key comparison.
[in] lexico True indicates that the key, as found in the record, is already in lexicographic format
[in] cfunc Key comparison function to use on this key.
Return values:
0 if OK, 1 if error.
You must call this or set_sortkey_derived for each of the keys.

There must be implicit agreement between what the cfunc expects and the arguments aligned and lexico.
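To illustrate what the aligned argument implies, the sketch below uses a comparison function that assumes aligned 32-bit integers, together with a wrapper that first makes aligned copies when the keys may sit at arbitrary byte offsets. It is a generic illustration of the idea; cmp_i4_aligned and cmp_i4_at are hypothetical and do not match the real CF signature.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Comparison function that expects pointers to properly aligned int32 values.
int cmp_i4_aligned(const void* a, const void* b) {
    std::int32_t x = *static_cast<const std::int32_t*>(a);
    std::int32_t y = *static_cast<const std::int32_t*>(b);
    return (x > y) - (x < y);
}

// Compare two 4-byte keys. If the caller cannot promise alignment, copy each
// key into an aligned local before handing it to the comparison function.
int cmp_i4_at(const void* a, const void* b, bool aligned) {
    if (aligned) return cmp_i4_aligned(a, b);
    std::int32_t ca, cb;
    std::memcpy(&ca, a, sizeof ca);
    std::memcpy(&cb, b, sizeof cb);
    return cmp_i4_aligned(&ca, &cb);
}

// Usage: keys stored at odd offsets in a byte buffer need aligned == false.
void demo() {
    std::int32_t x = -4, y = 9;
    unsigned char buf[9] = {0};
    std::memcpy(buf + 1, &y, sizeof y);
    std::memcpy(buf + 5, &x, sizeof x);
    assert(cmp_i4_at(buf + 1, buf + 5, false) > 0);
    assert(cmp_i4_at(&x, &y, true) < 0);
}
```

Declaring a key aligned when it is not would let a function like cmp_i4_aligned dereference a misaligned pointer, which is the kind of disagreement between cfunc and the flags that the text warns about.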

Definition at line 1573 of file sort_s.h.

References is_for_index(), noCSKF(), and ssm_sort::key_cookie_t::null.

int ssm_sort::sort_keys_t::set_sortkey_derived ( int  keyindex,
CSKF  gfunc,
key_cookie_t  cookie,
bool  in_header,
bool  aligned,
bool  lexico,
CF  cfunc 
) [inline]

Set attributes of key.

Parameters:
[in] keyindex The ordinal number of the index whose attributes are to be set
[in] gfunc Key-creation (lexify) function.
[in] cookie Datum for gfunc.
[in] in_header True indicates that the key is to be found in the record header rather than in the body.
[in] aligned True indicates that the key, as found in the result of gfunc, is suitably aligned for key comparisons with the CF (key-comparison function) to be used. False means that sort has to make an aligned copy before doing a key comparison.
[in] lexico True indicates that the key, as found in the result of gfunc, is in lexicographic format.
[in] cfunc Key comparison function to use on this key.
Return values:
0 if OK, 1 if error.
You must call this or set_sortkey_fixed for each of the keys.

There must be implicit agreement between what the cfunc expects and the arguments aligned and lexico.

Definition at line 1594 of file sort_s.h.

References cookie(), and is_for_index().

key_location_t& ssm_sort::sort_keys_t::get_location ( int  i  )  [inline]

Return the key-location information for a given fixed-location key.

Parameters:
[in] i The ordinal number of the index of interest.
Only for fixed-location keys.

Definition at line 1427 of file sort_s.h.

smsize_t ssm_sort::sort_keys_t::offset ( int  i  )  const [inline]

Return the offset for a given fixed-location key.

Parameters:
[in] i The ordinal number of the index of interest.
Only for fixed-location keys.

Definition at line 1434 of file sort_s.h.

References ssm_sort::key_location_t::_off.

smsize_t ssm_sort::sort_keys_t::length ( int  i  )  const [inline]

Return the length for a given fixed-location key.

Parameters:
[in] i The ordinal number of the index of interest.
Only for fixed-location keys.

Definition at line 1442 of file sort_s.h.

References ssm_sort::key_location_t::_length.

CSKF ssm_sort::sort_keys_t::keycreate ( int  i  )  const [inline]

Return the CSKF function for a given derived key.

Parameters:
[in] i The ordinal number of the index of interest.
Only for derived keys.

Definition at line 1450 of file sort_s.h.

key_cookie_t ssm_sort::sort_keys_t::cookie ( int  i  )  const [inline]

Return the argument to the CSKF function for a given derived key.

Parameters:
[in] i The ordinal number of the index of interest.
Only for derived keys.

Definition at line 1459 of file sort_s.h.

Referenced by set_sortkey_derived().

CF ssm_sort::sort_keys_t::keycmp ( int  i  )  const [inline]

Return the key-comparison function for a given key.

Parameters:
[in] i The ordinal number of the index of interest.
For fixed-location and derived keys.

Definition at line 1468 of file sort_s.h.


The documentation for this class was generated from the following file:
Generated on Wed Jul 7 17:22:44 2010 for Shore Storage Manager by  doxygen 1.4.7