casacore
casacore::IncrementalStMan Class Reference

The Incremental Storage Manager. More...

#include <IncrementalStMan.h>

Inheritance diagram for casacore::IncrementalStMan:
casacore::ISMBase casacore::DataManager

Public Member Functions

 IncrementalStMan (uInt bucketSize=0, Bool checkBucketSize=True, uInt cacheSize=1)
 Create an incremental storage manager with the given name. More...
 
 IncrementalStMan (const String &dataManagerName, uInt bucketSize=0, Bool checkBucketSize=True, uInt cacheSize=1)
 
 ~IncrementalStMan ()
 
- Public Member Functions inherited from casacore::ISMBase
 ISMBase (uInt bucketSize=0, Bool checkBucketSize=True, uInt cacheSize=1)
 Create an incremental storage manager without a name. More...
 
 ISMBase (const String &dataManagerName, uInt bucketSize, Bool checkBucketSize, uInt cacheSize)
 Create an incremental storage manager with the given name. More...
 
 ISMBase (const String &aDataManName, const Record &spec)
 Create an incremental storage manager with the given name. More...
 
 ~ISMBase ()
 
virtual DataManager * clone () const
 Clone this object. More...
 
virtual String dataManagerType () const
 Get the type name of the data manager (i.e. More...
 
virtual String dataManagerName () const
 Get the name given to the storage manager (in the constructor). More...
 
virtual Record dataManagerSpec () const
 Return a record containing data manager specifications. More...
 
virtual Record getProperties () const
 Get data manager properties that can be modified. More...
 
virtual void setProperties (const Record &spec)
 Modify data manager properties. More...
 
uInt version () const
 Get the version of the class. More...
 
void setCacheSize (uInt cacheSize, Bool canExceedNrBuckets)
 Set the cache size (in buckets). More...
 
uInt cacheSize () const
 Get the current cache size (in buckets). More...
 
void clearCache ()
 Clear the cache used by this storage manager. More...
 
virtual void showCacheStatistics (ostream &os) const
 Show the statistics of all caches used. More...
 
void showIndexStatistics (ostream &os)
 Show the index statistics. More...
 
void showBucketLayout (ostream &os)
 Show the layout of the buckets. More...
 
uInt bucketSize () const
 Get the bucket size (in bytes). More...
 
uInt uIntSize () const
 Get the size of a uInt in external format (can be canonical or local). More...
 
uInt rownrSize () const
 Get the size of a rownr in external format (can be canonical or local). More...
 
ISMBucket * getBucket (rownr_t rownr, rownr_t &bucketStartRow, rownr_t &bucketNrrow)
 Get the bucket containing the given row. More...
 
ISMBucket * nextBucket (uInt &cursor, rownr_t &bucketStartRow, rownr_t &bucketNrrow)
 Get the next bucket. More...
 
char * tempBuffer () const
 Get access to the temporary buffer. More...
 
uInt uniqueNr ()
 Get a unique column number for the column (it is only unique for this storage manager). More...
 
rownr_t nrow () const
 Get the number of rows in this storage manager. More...
 
virtual Bool canAddRow () const
 Can the storage manager add rows? (yes) More...
 
virtual Bool canRemoveRow () const
 Can the storage manager delete rows? (yes) More...
 
virtual Bool canAddColumn () const
 Can the storage manager add columns? (not yet) More...
 
virtual Bool canRemoveColumn () const
 Can the storage manager delete columns? (not yet) More...
 
ISMColumn & getColumn (uInt colnr)
 Get access to the given column. More...
 
void addBucket (rownr_t rownr, ISMBucket *bucket)
 Add a bucket to the storage manager (i.e. More...
 
void setBucketDirty ()
 Make the current bucket in the cache dirty (i.e. More...
 
StManArrayFile * openArrayFile (ByteIO::OpenOption opt)
 Open (if needed) the file for indirect arrays with the given mode. More...
 
Bool checkBucketLayout (uInt &offendingCursor, rownr_t &offendingBucketStartRow, uInt &offendingBucketNrow, uInt &offendingBucketNr, uInt &offendingCol, uInt &offendingIndex, rownr_t &offendingRow, rownr_t &offendingPrevRow)
 Check that there are no repeated rowIds in the buckets comprising this ISM. More...
 
- Public Member Functions inherited from casacore::DataManager
 DataManager ()
 Default constructor. More...
 
virtual ~DataManager ()
 
void dataManagerInfo (Record &info) const
 Add SEQNR and SPEC (the DataManagerSpec subrecord) to the info. More...
 
virtual Bool isStorageManager () const
 Is the data manager a storage manager? The default is yes. More...
 
virtual Bool canReallocateColumns () const
 Tell if the data manager wants to reallocate the data manager column objects. More...
 
virtual DataManagerColumn * reallocateColumn (DataManagerColumn *column)
 Reallocate the column object if it is part of this data manager. More...
 
uInt sequenceNr () const
 Get the (unique) sequence nr of this data manager. More...
 
uInt ncolumn () const
 Get the nr of columns in this data manager (can be zero). More...
 
Bool asBigEndian () const
 Are the data to be stored in big or little endian canonical format? More...
 
const TSMOption & tsmOption () const
 Get the TSM option. More...
 
MultiFileBase * multiFile ()
 Get the MultiFile pointer (can be 0). More...
 
String keywordName (const String &keyword) const
 Compose a keyword name from the given keyword appended with the sequence number (e.g. More...
 
String fileName () const
 Compose a unique filename from the table name and sequence number. More...
 
ByteIO::OpenOption fileOption () const
 Get the AipsIO option of the underlying file. More...
 
virtual Bool isRegular () const
 Is this a regular storage manager? It is regular if it allows addition of rows and writing data in them. More...
 
Table & table () const
 Get the table this object is associated with. More...
 
virtual Bool canRenameColumn () const
 Does the data manager allow renaming columns? (default yes) More...
 
virtual void setMaximumCacheSize (uInt nMiB)
 Set the maximum cache size (in bytes) to be used by a storage manager. More...
 
virtual void showCacheStatistics (std::ostream &) const
 Show the data manager's IO statistics. More...
 
DataManagerColumn * createScalarColumn (const String &columnName, int dataType, const String &dataTypeId)
 Create a column in the data manager on behalf of a table column. More...
 
DataManagerColumn * createDirArrColumn (const String &columnName, int dataType, const String &dataTypeId)
 Create a direct array column. More...
 
DataManagerColumn * createIndArrColumn (const String &columnName, int dataType, const String &dataTypeId)
 Create an indirect array column. More...
 
DataManager * getClone () const
 Has the object already been cloned? More...
 
void setClone (DataManager *clone) const
 Set the pointer to the clone. More...
 

Private Member Functions

 IncrementalStMan (const IncrementalStMan &that)
 Copy constructor cannot be used. More...
 
IncrementalStMan & operator= (const IncrementalStMan &that)
 Assignment cannot be used. More...
 

Additional Inherited Members

- Static Public Member Functions inherited from casacore::ISMBase
static DataManager * makeObject (const String &dataManagerType, const Record &spec)
 Make the object from the type name string. More...
 
- Static Public Member Functions inherited from casacore::DataManager
static void registerCtor (const String &type, DataManagerCtor func)
 Register a mapping of a data manager type to its static construction function. More...
 
static DataManagerCtor getCtor (const String &dataManagerType)
 Get the "constructor" of a data manager (thread-safe). More...
 
static Bool isRegistered (const String &dataManagerType)
 Test if a data manager is registered (thread-safe). More...
 
static DataManager * unknownDataManager (const String &dataManagerType, const Record &spec)
 Serve as default function for theirRegisterMap, which catches all unknown data manager types. More...
 
- Static Public Attributes inherited from casacore::DataManager
static rownr_t MAXROWNR32
 Define the highest row number that can be represented as signed 32-bit. More...
 
- Protected Member Functions inherited from casacore::DataManager
void decrementNcolumn ()
 Decrement number of columns (in case a column is deleted). More...
 
void setEndian (Bool bigEndian)
 Tell the data manager if big or little endian format is needed. More...
 
void setTsmOption (const TSMOption &tsmOption)
 Tell the data manager which TSM option to use. More...
 
void setMultiFile (MultiFileBase *mfile)
 Tell the data manager that MultiFile can be used. More...
 
void throwDataTypeOther (const String &columnName, int dataType) const
 Throw an exception in case data type is TpOther, because the storage managers (and maybe other data managers) do not support such columns. More...
 

Detailed Description

The Incremental Storage Manager.

Intended use:

Public interface

Review Status

Reviewed By:
UNKNOWN
Date Reviewed:
before 2004/08/25
Test programs:
tIncrementalStMan

Prerequisite

Etymology

IncrementalStMan is the data manager storing values in an incremental way (similar to an incremental backup). A value is only stored when it differs from the previous value.

Synopsis

IncrementalStMan stores the data in a way that a value is only stored when it is different from the value in the previous row. This storage manager is very well suited for columns with slowly changing values, because the resulting file can be much smaller. It is not suited at all for columns with continuously changing data.
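
The storage idea described above can be illustrated with a small standalone sketch. It is a toy model only and not the casacore implementation: the class `IncrementalColumn` and its members are hypothetical names introduced here. A value is appended to the stored list only when it differs from the previous row, and a lookup finds the last stored entry at or before the requested row.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <stdexcept>
#include <utility>
#include <vector>

// Toy model of "store a value only when it differs from the previous row".
// Hypothetical illustration; this is not how casacore lays out its files.
class IncrementalColumn {
public:
    // Append the value for the next row. It is stored only if it differs
    // from the value of the previous row.
    void append(double value) {
        if (entries_.empty() || entries_.back().second != value) {
            entries_.emplace_back(nrow_, value);
        }
        ++nrow_;
    }

    // Return the value of a row: the last stored entry whose row is <= row.
    double get(std::size_t row) const {
        if (row >= nrow_) throw std::out_of_range("row out of range");
        auto it = std::upper_bound(
            entries_.begin(), entries_.end(), row,
            [](std::size_t r, const std::pair<std::size_t, double>& e) {
                return r < e.first;
            });
        return std::prev(it)->second;
    }

    std::size_t nrow() const { return nrow_; }
    std::size_t storedValues() const { return entries_.size(); }

private:
    std::vector<std::pair<std::size_t, double>> entries_;  // (start row, value)
    std::size_t nrow_ = 0;
};
```

A column that changes rarely needs few stored entries, while a column that changes on every row stores one entry per row plus the row index, which is why the scheme does not pay off for continuously changing data.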

In general it can be advantageous to use this storage manager when a value changes no more often than once every 4 rows (although the break-even point depends on the length of the data values themselves). The following simple example shows the approximate savings that can be achieved when storing a column with double values changing every CH rows.

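| #rows | CH  | normal length | ISM length | compression ratio |
|-------|-----|---------------|------------|-------------------|
| 50000 | 5   | 4000000       | 1606000    | 2.5               |
| 50000 | 50  | 4000000       | 164000     | 24.5              |
| 50000 | 500 | 4000000       | 32800      | 122               |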

There is a special test program nISMBucket in the Tables module doing a simple, but usually adequate, simulation of the amount of storage needed for a scenario.

IncrementalStMan stores the values (and associated indices) in fixed-length buckets. A BucketCache object is used to read/write the buckets. The default cache size is 1 bucket (which is fine for sequential access), but for random access it can make sense to increase the size of the cache. This can be done using the class ROIncrementalStManAccessor.
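
To see why a one-bucket cache suits sequential access but not random access, the following standalone sketch counts bucket reads for a sequence of row accesses. It assumes a simple least-recently-used replacement policy and a fixed number of rows per bucket; both are simplifying assumptions, and `countBucketReads` is a hypothetical helper, not part of casacore.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <vector>

// Count how many buckets have to be read for the given row accesses.
// Assumptions (not taken from casacore): every bucket holds rowsPerBucket
// rows and the cache evicts the least recently used bucket.
// A cache size of 0 is treated as 1.
std::size_t countBucketReads(const std::vector<std::size_t>& rows,
                             std::size_t rowsPerBucket,
                             std::size_t cacheSizeInBuckets) {
    std::list<std::size_t> cache;  // most recently used bucket at the front
    std::size_t reads = 0;
    for (std::size_t row : rows) {
        const std::size_t bucket = row / rowsPerBucket;
        auto it = std::find(cache.begin(), cache.end(), bucket);
        if (it != cache.end()) {
            cache.erase(it);  // hit: the bucket is moved to the front below
        } else {
            ++reads;          // miss: the bucket has to be read
            if (!cache.empty() && cache.size() >= cacheSizeInBuckets) {
                cache.pop_back();  // evict the least recently used bucket
            }
        }
        cache.push_front(bucket);
    }
    return reads;
}
```

With 100 rows per bucket, reading rows 0 to 399 in order costs 4 reads with a cache of 1 bucket. Alternating between rows 0 and 150 costs one read per access with a cache of 1 bucket, but only 2 reads in total with a cache of 2 buckets.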

The IncrementalStMan can hold values of any standard data type (thus from Bool to String). It can handle scalars, direct and indirect arrays. It can support an arbitrary number of columns. The values in each of them can vary at its own speed.
A bucket contains the values of several consecutive rows. At the beginning of a bucket the values of the starting row of all columns for this storage manager are repeated. In this way the value of a cell can always be found in the bucket and no references to previous buckets are needed.
A bucket should be big enough to hold all starting values and a reasonable number of other values. As a rule of thumb it should be big enough to hold at least 100 values of each column. In general the default bucket size will do. Only in special cases (e.g. when storing large variable length strings) should the bucket size be set explicitly. Giving a zero bucket size means that a suitable default bucket size will be calculated.
When a table is filled sequentially each bucket can be filled as much as possible. When writing in a random way, buckets can contain some unused space, because a bucket in the middle of the file has to be split when a new value has to be put in it.
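
The bucket layout described above can be sketched as a standalone toy model. The names `ToyBucket`, `buildBuckets` and `lookup` are hypothetical, the capacity is counted in stored values rather than bytes, and only sequential filling is modelled. The point it illustrates is that each new bucket repeats the starting value of every column, so a lookup never has to consult an earlier bucket.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

using Entry = std::pair<std::size_t, double>;  // (row, value)

// Toy bucket: entries[c] holds the stored values of column c, sorted by row.
struct ToyBucket {
    std::size_t startRow = 0;
    std::vector<std::vector<Entry>> entries;
    std::size_t used = 0;  // total number of stored values in this bucket
};

// Fill buckets sequentially; columns[c][row] is the value of column c.
std::vector<ToyBucket> buildBuckets(
        const std::vector<std::vector<double>>& columns,
        std::size_t nrow, std::size_t maxValues) {
    const std::size_t ncol = columns.size();
    std::vector<ToyBucket> buckets;
    for (std::size_t row = 0; row < nrow; ++row) {
        std::vector<std::size_t> changed;  // columns whose value changes here
        for (std::size_t c = 0; c < ncol; ++c) {
            if (row == 0 || columns[c][row] != columns[c][row - 1]) {
                changed.push_back(c);
            }
        }
        if (changed.empty()) continue;  // nothing is stored for this row
        if (buckets.empty() ||
            buckets.back().used + changed.size() > maxValues) {
            // Start a new bucket and repeat the starting value of ALL columns.
            ToyBucket b;
            b.startRow = row;
            b.entries.resize(ncol);
            for (std::size_t c = 0; c < ncol; ++c) {
                b.entries[c].emplace_back(row, columns[c][row]);
            }
            b.used = ncol;
            buckets.push_back(std::move(b));
        } else {
            for (std::size_t c : changed) {
                buckets.back().entries[c].emplace_back(row, columns[c][row]);
            }
            buckets.back().used += changed.size();
        }
    }
    return buckets;
}

// Look up a cell using only the bucket that covers the row.
// Precondition: buckets is non-empty and row is a valid row number.
double lookup(const std::vector<ToyBucket>& buckets,
              std::size_t col, std::size_t row) {
    auto b = std::upper_bound(
        buckets.begin(), buckets.end(), row,
        [](std::size_t r, const ToyBucket& x) { return r < x.startRow; });
    const std::vector<Entry>& e = std::prev(b)->entries[col];
    auto it = std::upper_bound(
        e.begin(), e.end(), row,
        [](std::size_t r, const Entry& p) { return r < p.first; });
    return std::prev(it)->second;
}
```

In this model a constant column still costs one stored value per bucket, which is one reason a bucket should be large compared to the set of starting values.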

Each column in the IncrementalStMan has the following properties to achieve the "store-different-values-only" behaviour.


Note: This class contains many public functions which are only used by other ISM classes. The only useful function for the user is the constructor.

Motivation

IncrementalStMan can save a lot of storage space. Unlike the old StManMirAIO, it stores the values directly in the file to save on memory usage.

Example

This example shows how to create a table and how to attach the storage manager to some columns.

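// tableDesc is assumed to be a TableDesc defined earlier that contains
// the columns "column1" and "column2".
SetupNewTable newtab("name.data", tableDesc, Table::New);
IncrementalStMan stman;                // define the storage manager
newtab.bindColumn ("column1", stman);  // bind column1 to the storage manager
newtab.bindColumn ("column2", stman);  // bind column2 to the storage manager
Table tab(newtab);                     // actually create the table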

Definition at line 181 of file IncrementalStMan.h.

Constructor & Destructor Documentation

casacore::IncrementalStMan::IncrementalStMan ( uInt  bucketSize = 0,
Bool  checkBucketSize = True,
uInt  cacheSize = 1 
)
explicit

Create an incremental storage manager with the given name.

If no name is used, it is set to an empty string. The name can be used to construct a ROIncrementalStManAccessor object (e.g. to set the cache size).
The bucket size has to be given in bytes and the cache size in buckets. Bucket size 0 means that the storage manager will set the bucket size such that it can contain about 100 rows (with a minimum size of 32768 bytes). However, if that results in a very large bucket size (> 327680 bytes), the storage manager makes it smaller. Note that it uses 32 bytes for the size of variable length strings, so this heuristic may fail when a column contains large strings.
When checkBucketSize is set and the bucket size is > 0, the storage manager throws an exception when the size is too small to hold the values of at least 2 rows. For this check it uses 0 for the length of variable length strings.
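
The rules above can be sketched as a standalone function. This is an approximation under stated assumptions, not the casacore code: `ColumnSize`, `rowBytes` and `chooseBucketSize` are hypothetical names, the text does not say how an oversized default is reduced (the sketch simply clamps it to 327680 bytes), and any per-bucket overhead is ignored.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical per-column description used only by this sketch.
struct ColumnSize {
    std::uint64_t fixedBytes;  // size of one value; ignored for strings
    bool variableString;       // true for variable length strings
};

// Bytes needed for one row, counting stringBytes per variable length string.
std::uint64_t rowBytes(const std::vector<ColumnSize>& cols,
                       std::uint64_t stringBytes) {
    std::uint64_t total = 0;
    for (const ColumnSize& c : cols) {
        total += c.variableString ? stringBytes : c.fixedBytes;
    }
    return total;
}

std::uint64_t chooseBucketSize(const std::vector<ColumnSize>& cols,
                               std::uint64_t bucketSize,
                               bool checkBucketSize) {
    const std::uint64_t minSize = 32768;
    const std::uint64_t maxSize = 327680;
    if (bucketSize == 0) {
        // Default: room for about 100 rows, strings counted as 32 bytes.
        std::uint64_t size = 100 * rowBytes(cols, 32);
        if (size < minSize) size = minSize;
        if (size > maxSize) size = maxSize;  // assumption: clamp to the limit
        return size;
    }
    // Explicit size: must hold at least 2 rows, strings counted as 0 bytes.
    if (checkBucketSize && bucketSize < 2 * rowBytes(cols, 0)) {
        throw std::invalid_argument("bucket size too small for 2 rows");
    }
    return bucketSize;
}
```

Because strings are counted as 32 bytes in the default and as 0 bytes in the check, neither rule protects against columns holding long strings, and in that case an explicit bucket size is the safer choice.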

casacore::IncrementalStMan::IncrementalStMan ( const String & dataManagerName,
uInt  bucketSize = 0,
Bool  checkBucketSize = True,
uInt  cacheSize = 1 
)
explicit
casacore::IncrementalStMan::~IncrementalStMan ( )
casacore::IncrementalStMan::IncrementalStMan ( const IncrementalStMan & that)
private

Copy constructor cannot be used.

Member Function Documentation

IncrementalStMan& casacore::IncrementalStMan::operator= ( const IncrementalStMan & that)
private

Assignment cannot be used.


The documentation for this class was generated from the following file: