casacore
casacore::ISMBucket Class Reference

A bucket in the Incremental Storage Manager. More...

#include <ISMBucket.h>

Public Member Functions

 ISMBucket (ISMBase *parent, const char *bucketStorage)
 Create a bucket with the given parent.
 
 ~ISMBucket ()
 
uInt getInterval (uInt colnr, rownr_t rownr, rownr_t bucketNrrow, rownr_t &start, rownr_t &end, uInt &offset) const
 Get the row-interval for given column and row.
 
Bool canAddData (uInt leng) const
 Is the bucket large enough to add a value?
 
void addData (uInt colnr, rownr_t rownr, uInt index, const char *data, uInt leng)
 Add the data to the data part.
 
Bool canReplaceData (uInt newLeng, uInt oldLeng) const
 Is the bucket large enough to replace a value?
 
void replaceData (uInt &offset, const char *data, uInt newLeng, uInt fixedLength)
 Replace a data item.
 
const char * get (uInt offset) const
 Get a pointer to the data for the given offset.
 
uInt getLength (uInt fixedLength, const char *data) const
 Get the length of the data value.
 
uInt & getOffset (uInt colnr, rownr_t rownr)
 Get access to the offset of the data for given column and row.
 
Block< rownr_t > & rowIndex (uInt colnr)
 Get access to the index information for the given column.
 
Block< uInt > & offIndex (uInt colnr)
 Return the offsets of the values stored in the data part.
 
uInt & indexUsed (uInt colnr)
 Return the number of values stored.
 
rownr_t split (ISMBucket *&left, ISMBucket *&right, Block< Bool > &duplicated, rownr_t bucketStartRow, rownr_t bucketNrrow, uInt colnr, rownr_t rownr, uInt lengToAdd)
 Split the bucket in the middle.
 
Bool simpleSplit (ISMBucket *left, ISMBucket *right, Block< Bool > &duplicated, rownr_t &splitRownr, rownr_t rownr)
 Determine whether a simple split is possible.
 
uInt getSplit (uInt totLeng, const Block< uInt > &rowLeng, const Block< uInt > &cumLeng)
 Return the index where the bucket should be split to get two parts with almost identical length.
 
void shiftLeft (uInt index, uInt nr, Block< rownr_t > &rowIndex, Block< uInt > &offIndex, uInt &nused, uInt leng)
 Remove nr items from data and index part by shifting to the left.
 
void copy (const ISMBucket &that)
 Copy the contents of that bucket to this bucket.
 
void show (ostream &os) const
 Show the layout of the bucket.
 
Bool check (uInt &offendingCol, uInt &offendingIndex, rownr_t &offendingRow, rownr_t &offendingPrevRow) const
 Check that there are no repeated rowIds in the bucket.
 

Static Public Member Functions

static char * readCallBack (void *owner, const char *bucketStorage)
 Callback function when BucketCache reads a bucket.
 
static void writeCallBack (void *owner, char *bucketStorage, const char *bucket)
 Callback function when BucketCache writes a bucket.
 
static char * initCallBack (void *owner)
 Callback function when BucketCache adds a new bucket to the data file.
 
static void deleteCallBack (void *, char *bucket)
 Callback function when BucketCache removes a bucket from the cache.
 

Private Member Functions

 ISMBucket (const ISMBucket &)
 Forbid copy constructor.
 
ISMBucket & operator= (const ISMBucket &)
 Forbid assignment.
 
void removeData (uInt offset, uInt leng)
 Remove a data item with the given length.
 
uInt insertData (const char *data, uInt leng)
 Insert a data value by appending it to the end.
 
uInt copyData (ISMBucket &other, uInt colnr, rownr_t toRownr, uInt fromIndex, uInt toIndex) const
 Copy a data item from this bucket to the other bucket.
 
void read (const char *bucketStorage)
 Read the data from the storage into this bucket.
 
void write (char *bucketStorage) const
 Write the bucket into the storage.
 

Private Attributes

ISMBase * stmanPtr_p
 Pointer to the parent storage manager.
 
uInt uIntSize_p
 The size (in bytes) of an uInt and rownr_t (used in index, etc.).
 
uInt rownrSize_p
 
uInt dataLeng_p
 The size (in bytes) of the data.
 
uInt indexLeng_p
 The size (in bytes) of the index.
 
PtrBlock< Block< rownr_t > * > rowIndex_p
 The row index per column; each index contains the row number of each value stored in the bucket (for that column).
 
PtrBlock< Block< uInt > * > offIndex_p
 The offset index per column; each index contains the offset (in bytes) of each value stored in the bucket (for that column).
 
Block< uInt > indexUsed_p
 Nr of used elements in each index; i.e. the number of stored values per column.
 
char * data_p
 The data space (in external (e.g. canonical) format).
 

Detailed Description

A bucket in the Incremental Storage Manager.

Intended use:

Internal

Review Status

Reviewed By:
UNKNOWN
Date Reviewed:
before 2004/08/25

Prerequisite

Etymology

ISMBucket represents a bucket in the Incremental Storage Manager.

Synopsis

The Incremental Storage Manager uses a BucketCache object to read/write/cache the buckets containing the data. An ISMBucket object is the internal representation of the contents of a bucket. ISMBucket contains static callback functions which are called by BucketCache when reading/writing a bucket. These callback functions do the mapping of bucket data to ISMBucket object and vice-versa.

A bucket contains the values of several rows of all columns bound to this Incremental Storage Manager. A bucket is split into a data part and an index part. Each part has an arbitrary length but together they do not exceed the fixed bucket length.

The beginning of the data part contains the values of all columns bound. The remainder of the data part contains the values of the rows/columns with a changed value.
The index part contains an index per column. Each index contains the row number and an offset for a row with a stored value. The row numbers are relative to the beginning of the bucket, so the bucket has no knowledge about the absolute row numbers. In this way deletion of rows is much simpler.

The contents of a bucket look like:

-------------------------------------------------------------------
| index offset   | data part     | index part       | free        |
-------------------------------------------------------------------
0                4               4+length(data part)
<--------------------------bucketsize----------------------------->

The data part contains all data values belonging to the bucket. The index part contains for each column the following data:

-----------------------------------------------------------------------
| #values stored | row numbers of values | offset in data part of     |
| for column i   | stored for column i   | values stored for column i |
-----------------------------------------------------------------------
0                4                       4+4*nrval
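
The two layouts above can be read together as a small encode/decode exercise. The following is only a sketch under stated assumptions, not the casacore implementation: it handles a single column, uses native byte order, takes the index offset to be 4+length(data part), and stores row numbers in 4 bytes as the diagram shows (the rownr_t type and rownrSize_p member suggest the real width may differ). The names ColumnIndex, putU32, getU32, encodeBucket and decodeColumnIndex are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical in-memory form of one column's index.
struct ColumnIndex {
    std::vector<std::uint32_t> rows;     // bucket-relative row numbers
    std::vector<std::uint32_t> offsets;  // byte offsets into the data part
};

// Append a 4-byte unsigned value in native byte order.
void putU32(std::vector<char>& buf, std::uint32_t v) {
    char tmp[4];
    std::memcpy(tmp, &v, 4);
    buf.insert(buf.end(), tmp, tmp + 4);
}

// Read a 4-byte unsigned value in native byte order.
std::uint32_t getU32(const std::vector<char>& buf, std::size_t pos) {
    std::uint32_t v = 0;
    std::memcpy(&v, buf.data() + pos, 4);
    return v;
}

// Build a raw bucket: index offset, data part, then one column index.
std::vector<char> encodeBucket(const std::vector<char>& data,
                               const ColumnIndex& idx) {
    std::vector<char> buf;
    putU32(buf, static_cast<std::uint32_t>(4 + data.size()));  // index offset
    buf.insert(buf.end(), data.begin(), data.end());           // data part
    putU32(buf, static_cast<std::uint32_t>(idx.rows.size()));  // #values stored
    for (std::uint32_t r : idx.rows) putU32(buf, r);           // row numbers
    for (std::uint32_t o : idx.offsets) putU32(buf, o);        // offsets
    return buf;
}

// Locate the index part via the leading index offset and decode it.
ColumnIndex decodeColumnIndex(const std::vector<char>& buf) {
    const std::size_t indexStart = getU32(buf, 0);  // 4 + length(data part)
    const std::uint32_t nrval = getU32(buf, indexStart);
    ColumnIndex idx;
    for (std::uint32_t i = 0; i < nrval; ++i) {
        idx.rows.push_back(getU32(buf, indexStart + 4 + 4 * i));
        idx.offsets.push_back(getU32(buf, indexStart + 4 + 4 * nrval + 4 * i));
    }
    return idx;
}
```

Encoding a 24-byte data part with three index entries yields a 56-byte buffer, and decoding it returns the same row numbers and offsets.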

Note that the row numbers in the bucket start at 0, thus are relative to the beginning of the bucket. The main index kept in ISMIndex knows the starting row of each bucket. In this way bucket splitting and especially row removal is much easier.
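
To illustrate the division of labour described here, the sketch below maps an absolute row number to a bucket and a bucket-relative row. It assumes the main index can be modelled as a sorted list of bucket starting rows beginning at 0; that model and the name toBucketRow are hypothetical and are not taken from ISMIndex.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Map an absolute row to (bucket number, row relative to that bucket).
// bucketStartRows must be sorted ascending and begin with 0.
std::pair<std::size_t, std::uint64_t>
toBucketRow(const std::vector<std::uint64_t>& bucketStartRows,
            std::uint64_t absoluteRow) {
    // First bucket whose start row lies beyond absoluteRow ...
    auto it = std::upper_bound(bucketStartRows.begin(),
                               bucketStartRows.end(), absoluteRow);
    // ... so the bucket just before it contains the row.
    const std::size_t bucket =
        static_cast<std::size_t>(it - bucketStartRows.begin()) - 1;
    return {bucket, absoluteRow - bucketStartRows[bucket]};
}
```

Because only the list of starting rows holds absolute numbers, removing rows changes that list but leaves the relative row numbers inside the other buckets untouched.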

The bucket can be stored in canonical or local (i.e. native) data format. When a bucket is read into memory, its data are read, converted, and stored in the ISMBucket object. When flushed, the contents are written. ISMBucket takes care that the values stored in its object do not exceed the size of the bucket. When the bucket is full, the user can call a function to split it into a left and right bucket. When the new value has to be written at the end, the split merely consists of creating a new bucket. In any case, care is taken that a row is not split. Thus a row is always entirely contained in one bucket.

Class ISMColumn does the actual writing of data in a bucket and uses the relevant ISMBucket functions.

Motivation

ISMBucket encapsulates the data of a bucket.

Definition at line 132 of file ISMBucket.h.

Constructor & Destructor Documentation

casacore::ISMBucket::ISMBucket ( ISMBase *  parent,
const char *  bucketStorage 
)

Create a bucket with the given parent.

When bucketStorage is non-zero, reconstruct the object from it. It keeps the pointer to its parent (but does not own it).

casacore::ISMBucket::~ISMBucket ( )
casacore::ISMBucket::ISMBucket ( const ISMBucket &  )
private

Forbid copy constructor.

Member Function Documentation

void casacore::ISMBucket::addData ( uInt  colnr,
rownr_t  rownr,
uInt  index,
const char *  data,
uInt  leng 
)

Add the data to the data part.

It updates the bucket index at the given index. An exception is thrown if the bucket is too small.

Bool casacore::ISMBucket::canAddData ( uInt  leng) const

Is the bucket large enough to add a value?
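
One plausible reading of this check follows from the Synopsis, which states that the data part and index part together may not exceed the fixed bucket length. The sketch below is an assumption about the arithmetic, not the library's code: BucketUsage, canAdd and the indexEntrySize parameter (the assumed cost of one row number plus one offset) are hypothetical.

```cpp
#include <cstdint>

// Hypothetical byte counts describing how full a bucket is.
struct BucketUsage {
    std::uint32_t bucketSize;  // fixed total size of the bucket
    std::uint32_t dataLeng;    // bytes used by the data part
    std::uint32_t indexLeng;   // bytes used by the index part
};

// True when a value of `leng` bytes plus one new index entry still fits.
bool canAdd(const BucketUsage& u, std::uint32_t leng,
            std::uint32_t indexEntrySize) {
    // Sum in 64 bits so the addition cannot wrap around.
    const std::uint64_t needed =
        std::uint64_t{u.dataLeng} + u.indexLeng + leng + indexEntrySize;
    return needed <= u.bucketSize;
}
```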

Bool casacore::ISMBucket::canReplaceData ( uInt  newLeng,
uInt  oldLeng 
) const

Is the bucket large enough to replace a value?

Bool casacore::ISMBucket::check ( uInt &  offendingCol,
uInt &  offendingIndex,
rownr_t &  offendingRow,
rownr_t &  offendingPrevRow 
) const

Check that there are no repeated rowIds in the bucket.
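
A check of this kind could look as follows, assuming each column's row index is kept in ascending order so that a repeat shows up as a non-increasing neighbour. The function name rowsAreUnique, the use of std::vector, and the meaning of the return value are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true when every column's row index is strictly increasing.
// On failure, reports the first offending column and index position.
bool rowsAreUnique(
    const std::vector<std::vector<std::uint64_t>>& rowIndexPerColumn,
    std::size_t& offendingCol, std::size_t& offendingIndex) {
    for (std::size_t col = 0; col < rowIndexPerColumn.size(); ++col) {
        const std::vector<std::uint64_t>& rows = rowIndexPerColumn[col];
        for (std::size_t i = 1; i < rows.size(); ++i) {
            if (rows[i] <= rows[i - 1]) {  // repeated or out-of-order row
                offendingCol = col;
                offendingIndex = i;
                return false;
            }
        }
    }
    return true;
}
```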

void casacore::ISMBucket::copy ( const ISMBucket &  that)

Copy the contents of that bucket to this bucket.

This is used after a split operation.

uInt casacore::ISMBucket::copyData ( ISMBucket &  other,
uInt  colnr,
rownr_t  toRownr,
uInt  fromIndex,
uInt  toIndex 
) const
private

Copy a data item from this bucket to the other bucket.

static void casacore::ISMBucket::deleteCallBack ( void *  ,
char *  bucket 
)
static

Callback function when BucketCache removes a bucket from the cache.

This function deletes the ISMBucket bucket object.

const char * casacore::ISMBucket::get ( uInt  offset) const
inline

Get a pointer to the data for the given offset.

Definition at line 315 of file ISMBucket.h.

References data_p.

uInt casacore::ISMBucket::getInterval ( uInt  colnr,
rownr_t  rownr,
rownr_t  bucketNrrow,
rownr_t &  start,
rownr_t &  end,
uInt &  offset 
) const

Get the row-interval for given column and row.

It sets the start and end of the interval to which the row belongs and the offset of its current value. It returns the index where the row number can be put in the bucket index.
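
The interval lookup can be sketched with a binary search over one column's sorted row index. This is an illustration under assumptions, not the actual implementation: it presumes the index always has an entry for row 0 (the Synopsis says the beginning of the data part holds values for all columns), treats end as inclusive, and uses the hypothetical names Interval and findInterval.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result record for the lookup.
struct Interval {
    std::uint64_t start;      // first row sharing the value
    std::uint64_t end;        // last row sharing the value (inclusive)
    std::uint32_t offset;     // offset of that value in the data part
    std::size_t insertIndex;  // where rownr would go in the row index
};

// rows must be sorted ascending, start with 0, and match offsets in size.
Interval findInterval(const std::vector<std::uint64_t>& rows,
                      const std::vector<std::uint32_t>& offsets,
                      std::uint64_t rownr, std::uint64_t bucketNrrow) {
    auto it = std::lower_bound(rows.begin(), rows.end(), rownr);
    const std::size_t insertIndex =
        static_cast<std::size_t>(it - rows.begin());
    const bool found = (it != rows.end() && *it == rownr);
    // Without an exact hit, the value of the preceding entry still applies.
    const std::size_t cur = found ? insertIndex : insertIndex - 1;
    const std::uint64_t next =
        (cur + 1 == rows.size()) ? bucketNrrow : rows[cur + 1];
    return Interval{rows[cur], next - 1, offsets[cur], insertIndex};
}
```

With rows {0, 5, 9} in a 12-row bucket, row 6 falls in the interval 5 to 8 and would be inserted at index 2.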

uInt casacore::ISMBucket::getLength ( uInt  fixedLength,
const char *  data 
) const

Get the length of the data value.

It is fixedLength when non-zero, otherwise read it from the data value.
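
A minimal sketch of this rule follows. Where the length is stored inside a variable-length value is not stated here, so the code assumes a 4-byte unsigned prefix in native byte order at the start of the value; that assumption and the name valueLength are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Length of a stored value: fixedLength when non-zero, otherwise the
// (assumed) 4-byte length prefix at the start of the value itself.
std::uint32_t valueLength(std::uint32_t fixedLength, const char* data) {
    if (fixedLength != 0) {
        return fixedLength;
    }
    std::uint32_t leng = 0;
    std::memcpy(&leng, data, sizeof leng);  // avoids unaligned access
    return leng;
}
```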

uInt& casacore::ISMBucket::getOffset ( uInt  colnr,
rownr_t  rownr 
)

Get access to the offset of the data for given column and row.

The returned reference allows the caller to change the offset (used, for example, by replaceData).

uInt casacore::ISMBucket::getSplit ( uInt  totLeng,
const Block< uInt > &  rowLeng,
const Block< uInt > &  cumLeng 
)

Return the index where the bucket should be split to get two parts with almost identical length.
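
One way to pick such a split point is to scan the per-row lengths and keep the cut that minimises the difference between the two halves. The sketch below takes only the row lengths and is an assumed strategy for illustration; the real function also receives totLeng and cumLeng and may choose differently. The name balancedSplit is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns i such that rows [0, i) go left and rows [i, n) go right,
// with the byte totals of both sides as close as possible.
std::size_t balancedSplit(const std::vector<std::uint32_t>& rowLeng) {
    std::uint64_t total = 0;
    for (std::uint32_t l : rowLeng) total += l;

    std::uint64_t left = 0;
    std::uint64_t bestDiff = total;  // difference when nothing goes left
    std::size_t best = 0;
    for (std::size_t i = 0; i < rowLeng.size(); ++i) {
        left += rowLeng[i];
        // |left - right| where right = total - left
        const std::uint64_t diff =
            (2 * left > total) ? 2 * left - total : total - 2 * left;
        if (diff < bestDiff) {
            bestDiff = diff;
            best = i + 1;
        }
    }
    return best;
}
```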

uInt & casacore::ISMBucket::indexUsed ( uInt  colnr)
inline

Return the number of values stored.

Definition at line 327 of file ISMBucket.h.

References indexUsed_p.

static char* casacore::ISMBucket::initCallBack ( void *  owner)
static

Callback function when BucketCache adds a new bucket to the data file.

This function creates an empty ISMBucket object. It returns the pointer to the ISMBucket object, which becomes part of the cache. The object gets deleted by the deleteCallBack function.

uInt casacore::ISMBucket::insertData ( const char *  data,
uInt  leng 
)
private

Insert a data value by appending it to the end.

It returns the offset of the data value.

Block< uInt > & casacore::ISMBucket::offIndex ( uInt  colnr)
inline

Return the offsets of the values stored in the data part.

Definition at line 323 of file ISMBucket.h.

References offIndex_p.

ISMBucket& casacore::ISMBucket::operator= ( const ISMBucket &  )
private

Forbid assignment.

void casacore::ISMBucket::read ( const char *  bucketStorage)
private

Read the data from the storage into this bucket.

static char* casacore::ISMBucket::readCallBack ( void *  owner,
const char *  bucketStorage 
)
static

Callback function when BucketCache reads a bucket.

It creates an ISMBucket object and converts the raw bucketStorage to that object. It returns the pointer to the ISMBucket object, which becomes part of the cache. The object gets deleted by the deleteCallBack function.
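
The callback life cycle can be sketched with a toy bucket type and free functions that mirror the four static signatures listed on this page. Nothing here uses BucketCache itself: ToyBucket, kBucketSize and the toy* functions are hypothetical stand-ins, and the owner argument is ignored.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

constexpr std::size_t kBucketSize = 16;  // assumed fixed bucket size

// Stand-in for the in-memory bucket object.
struct ToyBucket {
    std::string payload;  // always kBucketSize bytes long
};

// Read: build an object from raw storage and hand it to the cache.
char* toyRead(void* /*owner*/, const char* bucketStorage) {
    ToyBucket* b = new ToyBucket{std::string(bucketStorage, kBucketSize)};
    return reinterpret_cast<char*>(b);
}

// Write: convert the object back into raw storage.
void toyWrite(void* /*owner*/, char* bucketStorage, const char* bucket) {
    const ToyBucket* b = reinterpret_cast<const ToyBucket*>(bucket);
    std::memcpy(bucketStorage, b->payload.data(), kBucketSize);
}

// Init: create an empty object for a bucket newly added to the file.
char* toyInit(void* /*owner*/) {
    return reinterpret_cast<char*>(
        new ToyBucket{std::string(kBucketSize, '\0')});
}

// Delete: destroy an object the cache no longer needs.
void toyDelete(void* /*owner*/, char* bucket) {
    delete reinterpret_cast<ToyBucket*>(bucket);
}
```

The cache only ever sees opaque char pointers; each callback casts back to the concrete bucket type, which is why creation and deletion must go through the matching pair of functions.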

void casacore::ISMBucket::removeData ( uInt  offset,
uInt  leng 
)
private

Remove a data item with the given length.

If the length is zero, its variable length is read first.

void casacore::ISMBucket::replaceData ( uInt &  offset,
const char *  data,
uInt  newLeng,
uInt  fixedLength 
)

Replace a data item.

When its length is variable (indicated by fixedLength=0), the old value will be removed and the new one appended at the end. An exception is thrown if the bucket is too small.
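
The two replacement paths can be sketched on a toy data part. This is an illustration under assumptions rather than the library's code: ToyBucket and replaceValue are hypothetical, the old length is passed in explicitly, and removing the old value is assumed to close the gap and lower every offset that pointed past it.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy bucket: a data part plus the offset of each stored value.
struct ToyBucket {
    std::vector<char> data;
    std::vector<std::uint32_t> offsets;
};

// Replace stored value number `item`. fixedLength == 0 means variable length.
void replaceValue(ToyBucket& b, std::size_t item, const std::string& newValue,
                  std::uint32_t oldLeng, std::uint32_t fixedLength) {
    const std::uint32_t off = b.offsets[item];
    if (fixedLength != 0) {
        // Fixed length: overwrite in place (newValue must be fixedLength bytes).
        std::copy(newValue.begin(), newValue.end(), b.data.begin() + off);
        return;
    }
    // Variable length: remove the old bytes and close the gap ...
    b.data.erase(b.data.begin() + off, b.data.begin() + off + oldLeng);
    for (std::uint32_t& o : b.offsets) {
        if (o > off) o -= oldLeng;  // values behind the gap moved left
    }
    // ... then append the new value at the end and record its offset.
    b.offsets[item] = static_cast<std::uint32_t>(b.data.size());
    b.data.insert(b.data.end(), newValue.begin(), newValue.end());
}
```

Appending rather than growing in place keeps the removal step uniform, at the cost of having to update the offset of the replaced value, which matches the offset being passed by reference in the signature above.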

Block< rownr_t > & casacore::ISMBucket::rowIndex ( uInt  colnr)
inline

Get access to the index information for the given column.

This is used by ISMColumn when putting the data.

Return the row numbers with a stored value.

Definition at line 319 of file ISMBucket.h.

References rowIndex_p.

void casacore::ISMBucket::shiftLeft ( uInt  index,
uInt  nr,
Block< rownr_t > &  rowIndex,
Block< uInt > &  offIndex,
uInt &  nused,
uInt  leng 
)

Remove nr items from data and index part by shifting to the left.

The rowIndex, offIndex, and nused get updated. The caller is responsible for removing data when needed (e.g. ISMIndColumn removes the indirect arrays from its file).
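
The index side of this operation can be sketched as below. The sketch covers only the shifting of the two index arrays and the used count; it does not touch the data part, and std::vector stands in for Block. The name shiftIndexLeft is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Remove `nr` entries starting at `index` by moving later entries left.
// Only the first `nused` entries of each array are meaningful.
void shiftIndexLeft(std::size_t index, std::size_t nr,
                    std::vector<std::uint64_t>& rowIndex,
                    std::vector<std::uint32_t>& offIndex,
                    std::size_t& nused) {
    for (std::size_t i = index + nr; i < nused; ++i) {
        rowIndex[i - nr] = rowIndex[i];
        offIndex[i - nr] = offIndex[i];
    }
    nused -= nr;  // the arrays keep their capacity; only the count shrinks
}
```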

void casacore::ISMBucket::show ( ostream &  os) const

Show the layout of the bucket.

Bool casacore::ISMBucket::simpleSplit ( ISMBucket *  left,
ISMBucket *  right,
Block< Bool > &  duplicated,
rownr_t &  splitRownr,
rownr_t  rownr 
)

Determine whether a simple split is possible.

If so, do it. This is possible if the new row is at the end of the last bucket, which will often be the case.
A simple split means adding a new bucket for the new row. If the old bucket already contains values for that row, those values are moved to the new bucket.
This function is only called by split, which creates the left and right buckets.

rownr_t casacore::ISMBucket::split ( ISMBucket *&  left,
ISMBucket *&  right,
Block< Bool > &  duplicated,
rownr_t  bucketStartRow,
rownr_t  bucketNrrow,
uInt  colnr,
rownr_t  rownr,
uInt  lengToAdd 
)

Split the bucket in the middle.

It returns the row number where the bucket was split and the new left and right bucket. The caller is responsible for deleting the newly created buckets. When possible a simple split is done.
The starting values in the right bucket may be copies of the values in the left bucket. The duplicated Block contains a switch per column indicating if the value is copied.

void casacore::ISMBucket::write ( char *  bucketStorage) const
private

Write the bucket into the storage.

static void casacore::ISMBucket::writeCallBack ( void *  owner,
char *  bucketStorage,
const char *  bucket 
)
static

Callback function when BucketCache writes a bucket.

It converts the ISMBucket bucket object to the raw bucketStorage.

Member Data Documentation

char* casacore::ISMBucket::data_p
private

The data space (in external (e.g. canonical) format).

Definition at line 311 of file ISMBucket.h.

Referenced by get().

uInt casacore::ISMBucket::dataLeng_p
private

The size (in bytes) of the data.

Definition at line 298 of file ISMBucket.h.

uInt casacore::ISMBucket::indexLeng_p
private

The size (in bytes) of the index.

Definition at line 300 of file ISMBucket.h.

Block<uInt> casacore::ISMBucket::indexUsed_p
private

Nr of used elements in each index; i.e. the number of stored values per column.

Definition at line 309 of file ISMBucket.h.

Referenced by indexUsed().

PtrBlock<Block<uInt>*> casacore::ISMBucket::offIndex_p
private

The offset index per column; each index contains the offset (in bytes) of each value stored in the bucket (for that column).

Definition at line 306 of file ISMBucket.h.

Referenced by offIndex().

PtrBlock<Block<rownr_t>*> casacore::ISMBucket::rowIndex_p
private

The row index per column; each index contains the row number of each value stored in the bucket (for that column).

Definition at line 303 of file ISMBucket.h.

Referenced by rowIndex().

uInt casacore::ISMBucket::rownrSize_p
private

Definition at line 296 of file ISMBucket.h.

ISMBase* casacore::ISMBucket::stmanPtr_p
private

Pointer to the parent storage manager.

Definition at line 293 of file ISMBucket.h.

uInt casacore::ISMBucket::uIntSize_p
private

The size (in bytes) of an uInt and rownr_t (used in index, etc.).

Definition at line 295 of file ISMBucket.h.


The documentation for this class was generated from the following file: