casacore
The Incremental Storage Manager. More...
#include <IncrementalStMan.h>
Public Member Functions | |
IncrementalStMan (uInt bucketSize=0, Bool checkBucketSize=True, uInt cacheSize=1) | |
Create an incremental storage manager with the given name. More... | |
IncrementalStMan (const String &dataManagerName, uInt bucketSize=0, Bool checkBucketSize=True, uInt cacheSize=1) | |
~IncrementalStMan () | |
Public Member Functions inherited from casacore::ISMBase | |
ISMBase (uInt bucketSize=0, Bool checkBucketSize=True, uInt cacheSize=1) | |
Create an incremental storage manager without a name. More... | |
ISMBase (const String &dataManagerName, uInt bucketSize, Bool checkBucketSize, uInt cacheSize) | |
Create an incremental storage manager with the given name. More... | |
ISMBase (const String &aDataManName, const Record &spec) | |
Create an incremental storage manager with the given name. More... | |
~ISMBase () | |
virtual DataManager * | clone () const |
Clone this object. More... | |
virtual String | dataManagerType () const |
Get the type name of the data manager (i.e. More... | |
virtual String | dataManagerName () const |
Get the name given to the storage manager (in the constructor). More... | |
virtual Record | dataManagerSpec () const |
Return a record containing data manager specifications. More... | |
virtual Record | getProperties () const |
Get data manager properties that can be modified. More... | |
virtual void | setProperties (const Record &spec) |
Modify data manager properties. More... | |
uInt | version () const |
Get the version of the class. More... | |
void | setCacheSize (uInt cacheSize, Bool canExceedNrBuckets) |
Set the cache size (in buckets). More... | |
uInt | cacheSize () const |
Get the current cache size (in buckets). More... | |
void | clearCache () |
Clear the cache used by this storage manager. More... | |
virtual void | showCacheStatistics (ostream &os) const |
Show the statistics of all caches used. More... | |
void | showIndexStatistics (ostream &os) |
Show the index statistics. More... | |
void | showBucketLayout (ostream &os) |
Show the layout of the buckets. More... | |
uInt | bucketSize () const |
Get the bucket size (in bytes). More... | |
uInt | uIntSize () const |
Get the size of a uInt in external format (can be canonical or local). More... | |
uInt | rownrSize () const |
Get the size of a rownr in external format (can be canonical or local). More... | |
ISMBucket * | getBucket (rownr_t rownr, rownr_t &bucketStartRow, rownr_t &bucketNrrow) |
Get the bucket containing the given row. More... | |
ISMBucket * | nextBucket (uInt &cursor, rownr_t &bucketStartRow, rownr_t &bucketNrrow) |
Get the next bucket. More... | |
char * | tempBuffer () const |
Get access to the temporary buffer. More... | |
uInt | uniqueNr () |
Get a unique column number for the column (it is only unique for this storage manager). More... | |
rownr_t | nrow () const |
Get the number of rows in this storage manager. More... | |
virtual Bool | canAddRow () const |
Can the storage manager add rows? (yes) More... | |
virtual Bool | canRemoveRow () const |
Can the storage manager delete rows? (yes) More... | |
virtual Bool | canAddColumn () const |
Can the storage manager add columns? (not yet) More... | |
virtual Bool | canRemoveColumn () const |
Can the storage manager delete columns? (not yet) More... | |
ISMColumn & | getColumn (uInt colnr) |
Get access to the given column. More... | |
void | addBucket (rownr_t rownr, ISMBucket *bucket) |
Add a bucket to the storage manager (i.e. More... | |
void | setBucketDirty () |
Make the current bucket in the cache dirty (i.e. More... | |
StManArrayFile * | openArrayFile (ByteIO::OpenOption opt) |
Open (if needed) the file for indirect arrays with the given mode. More... | |
Bool | checkBucketLayout (uInt &offendingCursor, rownr_t &offendingBucketStartRow, uInt &offendingBucketNrow, uInt &offendingBucketNr, uInt &offendingCol, uInt &offendingIndex, rownr_t &offendingRow, rownr_t &offendingPrevRow) |
Check that there are no repeated rowIds in the buckets comprising this ISM. More... | |
Public Member Functions inherited from casacore::DataManager | |
DataManager () | |
Default constructor. More... | |
virtual | ~DataManager () |
void | dataManagerInfo (Record &info) const |
Add SEQNR and SPEC (the DataManagerSpec subrecord) to the info. More... | |
virtual Bool | isStorageManager () const |
Is the data manager a storage manager? The default is yes. More... | |
virtual Bool | canReallocateColumns () const |
Tell if the data manager wants to reallocate the data manager column objects. More... | |
virtual DataManagerColumn * | reallocateColumn (DataManagerColumn *column) |
Reallocate the column object if it is part of this data manager. More... | |
uInt | sequenceNr () const |
Get the (unique) sequence nr of this data manager. More... | |
uInt | ncolumn () const |
Get the nr of columns in this data manager (can be zero). More... | |
Bool | asBigEndian () const |
Have the data to be stored in big or little endian canonical format? More... | |
const TSMOption & | tsmOption () const |
Get the TSM option. More... | |
MultiFileBase * | multiFile () |
Get the MultiFile pointer (can be 0). More... | |
String | keywordName (const String &keyword) const |
Compose a keyword name from the given keyword appended with the sequence number (e.g. More... | |
String | fileName () const |
Compose a unique filename from the table name and sequence number. More... | |
ByteIO::OpenOption | fileOption () const |
Get the AipsIO option of the underlying file. More... | |
virtual Bool | isRegular () const |
Is this a regular storage manager? It is regular if it allows addition of rows and writing data in them. More... | |
Table & | table () const |
Get the table this object is associated with. More... | |
virtual Bool | canRenameColumn () const |
Does the data manager allow to rename columns? (default yes) More... | |
virtual void | setMaximumCacheSize (uInt nMiB) |
Set the maximum cache size (in MiB) to be used by a storage manager. More... | |
virtual void | showCacheStatistics (std::ostream &) const |
Show the data manager's IO statistics. More... | |
DataManagerColumn * | createScalarColumn (const String &columnName, int dataType, const String &dataTypeId) |
Create a column in the data manager on behalf of a table column. More... | |
DataManagerColumn * | createDirArrColumn (const String &columnName, int dataType, const String &dataTypeId) |
Create a direct array column. More... | |
DataManagerColumn * | createIndArrColumn (const String &columnName, int dataType, const String &dataTypeId) |
Create an indirect array column. More... | |
DataManager * | getClone () const |
Has the object already been cloned? More... | |
void | setClone (DataManager *clone) const |
Set the pointer to the clone. More... | |
Private Member Functions | |
IncrementalStMan (const IncrementalStMan &that) | |
Copy constructor cannot be used. More... | |
IncrementalStMan & | operator= (const IncrementalStMan &that) |
Assignment cannot be used. More... | |
Additional Inherited Members | |
Static Public Member Functions inherited from casacore::ISMBase | |
static DataManager * | makeObject (const String &dataManagerType, const Record &spec) |
Make the object from the type name string. More... | |
Static Public Member Functions inherited from casacore::DataManager | |
static void | registerCtor (const String &type, DataManagerCtor func) |
Register a mapping of a data manager type to its static construction function. More... | |
static DataManagerCtor | getCtor (const String &dataManagerType) |
Get the "constructor" of a data manager (thread-safe). More... | |
static Bool | isRegistered (const String &dataManagerType) |
Test if a data manager is registered (thread-safe). More... | |
static DataManager * | unknownDataManager (const String &dataManagerType, const Record &spec) |
Serve as default function for theirRegisterMap, which catches all unknown data manager types. More... | |
Static Public Attributes inherited from casacore::DataManager | |
static rownr_t | MAXROWNR32 |
Define the highest row number that can be represented as signed 32-bit. More... | |
Protected Member Functions inherited from casacore::DataManager | |
void | decrementNcolumn () |
Decrement number of columns (in case a column is deleted). More... | |
void | setEndian (Bool bigEndian) |
Tell the data manager if big or little endian format is needed. More... | |
void | setTsmOption (const TSMOption &tsmOption) |
Tell the data manager which TSM option to use. More... | |
void | setMultiFile (MultiFileBase *mfile) |
Tell the data manager that MultiFile can be used. More... | |
void | throwDataTypeOther (const String &columnName, int dataType) const |
Throw an exception in case data type is TpOther, because the storage managers (and maybe other data managers) do not support such columns. More... | |
The Incremental Storage Manager.
Public interface
IncrementalStMan is the data manager storing values in an incremental way (similar to an incremental backup). A value is only stored when it differs from the previous value.
This makes the storage manager very well suited for columns with slowly changing values, because the resulting file can be much smaller. It is not suited at all for columns whose values change continuously.
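To make the "store a value only when it differs from the previous row" idea concrete, here is a minimal, self-contained C++ sketch. It is not casacore code: the class name IncrementalColumn and all of its members are hypothetical, and it keeps everything in memory instead of in file buckets.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; not part of casacore.
// Stores a value only for rows where it differs from the previous row.
template <typename T>
class IncrementalColumn {
public:
    // Append one row. A new entry is stored only if the value changed.
    void appendRow(const T& value) {
        if (values_.empty() || !(values_.back() == value)) {
            startRows_.push_back(nrow_);
            values_.push_back(value);
        }
        ++nrow_;
    }

    // Return the value of a row: the last stored entry whose start row <= row.
    const T& get(std::size_t row) const {
        assert(row < nrow_);
        auto it = std::upper_bound(startRows_.begin(), startRows_.end(), row);
        return values_[static_cast<std::size_t>(it - startRows_.begin()) - 1];
    }

    std::size_t nrow() const { return nrow_; }            // logical number of rows
    std::size_t nstored() const { return values_.size(); } // values actually stored

private:
    std::vector<std::size_t> startRows_;  // first row of each run of equal values
    std::vector<T> values_;               // one value per run
    std::size_t nrow_ = 0;
};
```

With a column of 10 rows whose value changes once, the sketch stores only 2 values; with a value that changes on every row it would store all 10 plus their row indices, which is why such columns are a poor fit.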
In general it can be advantageous to use this storage manager when a value changes at most once every 4 rows (although it depends on the length of the data values themselves). The following simple example shows the approximate savings that can be achieved when storing a column with double values changing every CH rows.
There is a special test program nISMBucket in the Tables module doing a simple, but usually adequate, simulation of the amount of storage needed for a scenario.
IncrementalStMan stores the values (and associated indices) in fixed-length buckets. A BucketCache object is used to read/write the buckets. The default cache size is 1 bucket (which is fine for sequential access), but for random access it can make sense to increase the size of the cache. This can be done using the class ROIncrementalStManAccessor.
The IncrementalStMan can hold values of any standard data type (thus from Bool to String). It can handle scalars, direct arrays, and indirect arrays. It can support an arbitrary number of columns. The values in each column can vary at their own rate.
A bucket contains the values of several consecutive rows. At the beginning of a bucket the values of the starting row of all columns for this storage manager are repeated. In this way the value of a cell can always be found in the bucket and no references to previous buckets are needed.
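The bucket layout described above can be sketched as follows. This is a simplified, hypothetical model (SketchBucket, fillBuckets and lookup are illustrative names, not casacore's ISMBucket API): capacity is counted in stored entries rather than bytes, and all columns hold doubles. It shows that repeating every column's value at the start of a bucket lets a lookup succeed without consulting earlier buckets.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical types for illustration; they do not mirror casacore's ISMBucket.
struct SketchBucket {
    std::size_t startRow = 0;
    std::size_t used = 0;  // number of stored (row, value) entries
    // One list of (row, value) entries per column, ordered by row.
    std::vector<std::vector<std::pair<std::size_t, double>>> columns;
};

// Distribute rows over buckets holding at most `capacity` entries each.
// Assumes capacity >= number of columns and equally sized rows.
std::vector<SketchBucket> fillBuckets(const std::vector<std::vector<double>>& rows,
                                      std::size_t capacity) {
    std::vector<SketchBucket> buckets;
    for (std::size_t r = 0; r < rows.size(); ++r) {
        const std::size_t ncol = rows[r].size();
        std::size_t changed = 0;  // columns whose value differs from the previous row
        if (r > 0) {
            for (std::size_t c = 0; c < ncol; ++c)
                if (rows[r][c] != rows[r - 1][c]) ++changed;
        }
        if (buckets.empty() || buckets.back().used + changed > capacity) {
            // Start a new bucket and repeat the current value of every column.
            SketchBucket b;
            b.startRow = r;
            b.columns.resize(ncol);
            for (std::size_t c = 0; c < ncol; ++c)
                b.columns[c].emplace_back(r, rows[r][c]);
            b.used = ncol;
            buckets.push_back(std::move(b));
        } else {
            // Store only the columns that changed.
            SketchBucket& b = buckets.back();
            for (std::size_t c = 0; c < ncol; ++c) {
                if (rows[r][c] != rows[r - 1][c]) {
                    b.columns[c].emplace_back(r, rows[r][c]);
                    ++b.used;
                }
            }
        }
    }
    return buckets;
}

// Find a cell value using only the bucket that contains the row.
// Assumes buckets is non-empty and row/col are in range.
double lookup(const std::vector<SketchBucket>& buckets, std::size_t row, std::size_t col) {
    auto b = std::upper_bound(buckets.begin(), buckets.end(), row,
        [](std::size_t r, const SketchBucket& bk) { return r < bk.startRow; });
    --b;  // last bucket whose startRow <= row
    const auto& entries = b->columns[col];
    auto e = std::upper_bound(entries.begin(), entries.end(), row,
        [](std::size_t r, const std::pair<std::size_t, double>& p) { return r < p.first; });
    --e;  // last entry whose row <= requested row
    return e->second;
}
```

In this model a column that never changes still gets one entry per bucket, which is the price paid for never having to read a previous bucket.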
A bucket should be big enough to hold all starting values and a reasonable number of other values. As a rule of thumb it should be big enough to hold at least 100 values of each column. In general the default bucket size will do. Only in special cases (e.g. when storing large variable length strings) should the bucket size be set explicitly. Giving a zero bucket size means that a suitable default bucket size will be calculated.
When a table is filled sequentially each bucket can be filled as much as possible. When writing in a random way, buckets can contain some unused space, because a bucket in the middle of the file has to be split when a new value has to be put in it.
Each column in the IncrementalStMan has the following properties to achieve the "store-different-values-only" behaviour.
add 1 row; put value in row N; add M rows;
add M+1 rows; put value in row N;
The IncrementalStMan is optimized for sequential access to a table.
- A bucket is accessed only once, because a bucket contains consecutive rows.
- For each column a copy is kept of the last value read. So the value for the next rows (with that same value) is immediately available.
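The second point can be illustrated with a small sketch of a reader that remembers the last value read together with the row interval for which it is valid. The class CachedRunReader and its members are hypothetical names chosen for this example; the sketch assumes the column is given as sorted run start rows (the first being 0) with one value per run.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical reader over a column stored as (startRow, value) runs.
class CachedRunReader {
public:
    // startRows must be non-empty, sorted, and begin with 0.
    CachedRunReader(std::vector<std::size_t> startRows,
                    std::vector<double> values,
                    std::size_t nrow)
        : startRows_(std::move(startRows)), values_(std::move(values)), nrow_(nrow) {}

    // Return the value of a row (row must be < nrow).
    double get(std::size_t row) {
        if (row < cacheStart_ || row >= cacheEnd_) {
            // Cache miss: search for the run containing the row.
            ++searches_;
            auto it = std::upper_bound(startRows_.begin(), startRows_.end(), row);
            const std::size_t i = static_cast<std::size_t>(it - startRows_.begin()) - 1;
            cacheStart_ = startRows_[i];
            cacheEnd_ = (it == startRows_.end()) ? nrow_ : *it;
            cacheValue_ = values_[i];
        }
        // Cache hit: the last value read is still valid for this row.
        return cacheValue_;
    }

    std::size_t searches() const { return searches_; }

private:
    std::vector<std::size_t> startRows_;
    std::vector<double> values_;
    std::size_t nrow_;
    std::size_t cacheStart_ = 0;  // cached interval is [cacheStart_, cacheEnd_)
    std::size_t cacheEnd_ = 0;    // empty interval: nothing cached yet
    double cacheValue_ = 0.0;
    std::size_t searches_ = 0;
};
```

Reading 10 rows sequentially from a column with 2 runs triggers only 2 searches in this sketch; random access would defeat the cache far more often.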
For random access the performance can be improved by setting the cache size using class ROIncrementalStManAccessor.
Note: This class contains many public functions which are only used by other ISM classes. The only useful function for the user is the constructor.
IncrementalStMan can save a lot of storage space. Unlike the old StManMirAIO, it stores the values directly in the file to save on memory usage.
This example shows how to create a table and how to attach the storage manager to some columns.
Definition at line 181 of file IncrementalStMan.h.
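explicit casacore::IncrementalStMan::IncrementalStMan (uInt bucketSize=0, Bool checkBucketSize=True, uInt cacheSize=1)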
Create an incremental storage manager with the given name.
If no name is used, it is set to an empty string. The name can be used to construct a ROIncrementalStManAccessor object (e.g. to set the cache size).
The bucket size has to be given in bytes and the cache size in buckets. Bucket size 0 means that the storage manager will set the bucket size such that it can contain about 100 rows (with a minimum size of 32768 bytes). However, if that results in a very large bucket size (>327680 bytes), the storage manager will make it smaller. Note that it assumes 32 bytes for the size of variable length strings, so this heuristic may fail when a column contains large strings. When checkBucketSize is set and the bucket size is > 0, the storage manager throws an exception when the size is too small to hold the values of at least 2 rows. For this check it uses 0 for the length of variable length strings.
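The rules above can be sketched as a small function. This is an illustration under stated assumptions, not the library's implementation: ColumnSize, rowBytes and chooseBucketSize are hypothetical names, per-value index overhead is ignored, and because the text does not say how an oversized default is reduced, the sketch simply clamps it to 327680 bytes.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical column description: a fixed size in bytes, or a variable-length string.
struct ColumnSize {
    std::size_t fixedBytes;
    bool variableString;
};

// Bytes needed for one row, using `stringBytes` for each variable-length string.
std::size_t rowBytes(const std::vector<ColumnSize>& cols, std::size_t stringBytes) {
    std::size_t total = 0;
    for (const ColumnSize& c : cols)
        total += c.variableString ? stringBytes : c.fixedBytes;
    return total;
}

// Sketch of the bucket size rules described in the text.
std::size_t chooseBucketSize(const std::vector<ColumnSize>& cols,
                             std::size_t requested,
                             bool checkBucketSize) {
    if (requested == 0) {
        // Aim for about 100 rows, assuming 32 bytes per variable-length string.
        std::size_t size = 100 * rowBytes(cols, 32);
        size = std::max<std::size_t>(size, 32768);   // documented minimum
        size = std::min<std::size_t>(size, 327680);  // assumption: clamp oversized defaults
        return size;
    }
    // Explicit size: optionally check that at least 2 rows fit,
    // counting variable-length strings as 0 bytes.
    if (checkBucketSize && requested < 2 * rowBytes(cols, 0))
        throw std::invalid_argument("bucket size too small to hold 2 rows");
    return requested;
}
```

Because strings are counted as 32 bytes for the default and as 0 bytes for the check, a column holding long strings can pass both steps and still end up with buckets that hold very few rows; that is the case where an explicit bucket size is worth setting.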
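explicit casacore::IncrementalStMan::IncrementalStMan (const String &dataManagerName, uInt bucketSize=0, Bool checkBucketSize=True, uInt cacheSize=1)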
casacore::IncrementalStMan::~IncrementalStMan ()
private casacore::IncrementalStMan::IncrementalStMan (const IncrementalStMan &that)
Copy constructor cannot be used.
private IncrementalStMan & casacore::IncrementalStMan::operator= (const IncrementalStMan &that)
Assignment cannot be used.