casacore
casacore::StringDistance Class Reference

Class to compute the Levenshtein distance between strings.

#include <StringDistance.h>

Public Member Functions

 StringDistance ()
 Default constructor sets maxDistance to 0.
 
 StringDistance (const String &source, Int maxDistance=-1, Bool countSwaps=True, Bool ignoreBlanks=True, Bool caseInsensitive=False)
 Construct from the source string and maximum distance.
 
const string & source () const
 Get data members.
 
Int maxDistance () const
 
const Matrix< Int > & matrix () const
 
Bool match (const String &target) const
 Test if the given target string is within the maximum distance.
 
Int distance (const String &target) const
 Calculate the distance from the string to the string given in the constructor.
 

Static Public Member Functions

static Int distance (const String &source, const String &target, Bool countSwaps=True)
 Calculate the distance between the two strings.
 
static String removeBlanks (const String &source)
 Remove blanks from the given string.
 

Static Private Member Functions

static Int doDistance (const String &source, const String &target, Bool countSwaps, Matrix< Int > &matrix)
 Calculate the distance.
 

Private Attributes

String itsSource
 
Matrix< Int > itsMatrix
 
Int itsMaxDistance
 
Bool itsCountSwaps
 
Bool itsIgnoreBlanks
 
Bool itsCaseInsensitive
 

Detailed Description

Class to compute the Levenshtein distance between strings.

Synopsis

The Levenshtein Distance is a metric telling how similar strings are. It is also known as the Edit Distance.

The distance is the number of operations (character substitutions, insertions, and deletions) needed to transform one string into another.
There are several extensions to the basic definition:

This class optionally uses the swap extension. Furthermore one can optionally ignore blanks. By default both options are used.

The code is based on code written by Anders Sewerin Johansen. Calculating the distance is an expensive O(N^2) operation, so it should be used with care.
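As an illustration of where the quadratic cost comes from, the following is a minimal standalone sketch of an edit distance with an optional swap rule. It is not the casacore implementation: the function name `editDistance` is hypothetical, and the swap rule shown (an exchange of two adjacent characters counts as one operation) is an assumption about what the swap extension means.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Illustrative only; not the casacore implementation.
// Fills a (|a|+1) x (|b|+1) table, so time and memory are O(|a| * |b|).
// With countSwaps == true, exchanging two adjacent characters costs 1
// (assumed meaning of the swap extension).
int editDistance(const std::string& a, const std::string& b,
                 bool countSwaps = true) {
    const std::size_t n = a.size(), m = b.size();
    std::vector<std::vector<int>> d(n + 1, std::vector<int>(m + 1, 0));
    // Border cells: turning a prefix into the empty string, or the reverse.
    for (std::size_t i = 0; i <= n; ++i) d[i][0] = static_cast<int>(i);
    for (std::size_t j = 0; j <= m; ++j) d[0][j] = static_cast<int>(j);
    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            const int cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            int best = std::min({d[i - 1][j] + 1,          // deletion
                                 d[i][j - 1] + 1,          // insertion
                                 d[i - 1][j - 1] + cost}); // substitution
            if (countSwaps && i > 1 && j > 1 &&
                a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]) {
                best = std::min(best, d[i - 2][j - 2] + 1); // adjacent swap
            }
            d[i][j] = best;
        }
    }
    return d[n][m];
}
```

Under these assumptions, "abcd" and "abdc" are one operation apart when swaps are counted and two when they are not.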

The class is constructed with the source string to compare against. Thereafter its match or distance function can be used for each target string.

Definition at line 69 of file StringDistance.h.

Constructor & Destructor Documentation

casacore::StringDistance::StringDistance ( )

Default constructor sets maxDistance to 0.

casacore::StringDistance::StringDistance ( const String &  source,
Int  maxDistance = -1,
Bool  countSwaps = True,
Bool  ignoreBlanks = True,
Bool  caseInsensitive = False 
)
explicit

Construct from the source string and maximum distance.

If the maximum distance is negative, it defaults to 1+strlength/3. Note that maximum distance 0 means that the strings must match exactly.
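The default rule can be sketched as a small helper. The name `resolveMaxDistance` is hypothetical and not part of casacore. The sketch assumes that "strlength" is the length of the source string and that the division is integer division; whether the length is taken before or after blanks are removed is not stated here.

```cpp
#include <string>

// Hypothetical helper, not part of casacore.
// Assumption: "strlength/3" is integer division on the source length.
int resolveMaxDistance(const std::string& source, int maxDistance = -1) {
    if (maxDistance < 0) {
        return 1 + static_cast<int>(source.size()) / 3;
    }
    return maxDistance;  // 0 means only an exact match is accepted
}
```

With these assumptions an 8-character source gets a default maximum distance of 3, while an explicit 0 is kept as 0.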

Member Function Documentation

Int casacore::StringDistance::distance ( const String &  target) const

Calculate the distance from the string to the string given in the constructor.

If the length of target exceeds source length + maxDistance, the difference in lengths is returned.
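The length shortcut can be sketched as follows. The difference in lengths is a lower bound on the edit distance, so once it exceeds maxDistance the table does not need to be filled. Both function names are hypothetical, and the sketch uses a plain Levenshtein distance without swaps rather than the casacore code.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: plain Levenshtein distance (no swaps), two rolling rows.
int levenshtein(const std::string& a, const std::string& b) {
    std::vector<int> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = static_cast<int>(j);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = static_cast<int>(i);
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const int cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// Returns the length difference without computing the table when the target
// is longer than source length + maxDistance, as the text above describes.
int boundedDistance(const std::string& source, const std::string& target,
                    int maxDistance) {
    const int srcLen = static_cast<int>(source.size());
    const int tgtLen = static_cast<int>(target.size());
    if (tgtLen > srcLen + maxDistance) {
        return tgtLen - srcLen;  // lower bound only; already above maxDistance
    }
    return levenshtein(source, target);
}
```

In this sketch the shortcut value can be smaller than the true distance: for source "abc", target "xyzxyzxy", and maxDistance 2, it returns 5 although all 8 target characters differ. That is harmless when the result is only compared against maxDistance.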

static Int casacore::StringDistance::distance ( const String &  source,
const String &  target,
Bool  countSwaps = True 
)
static

Calculate the distance between the two strings.

This is slower than the distance member function, because it has to allocate the underlying Matrix for each invocation.
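The difference between the two call styles can be illustrated with a stand-in class that keeps its scratch table between calls. `ReusableDistance` is hypothetical and uses `std::vector` instead of `Matrix<Int>`; it only shows the reuse idea, not the casacore internals.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in, not casacore code. The scratch table lives in the
// object and only grows when a longer target arrives, so repeated calls
// avoid a fresh allocation each time.
class ReusableDistance {
public:
    explicit ReusableDistance(std::string source) : source_(std::move(source)) {}

    int distance(const std::string& target) const {
        const std::size_t n = source_.size(), m = target.size();
        const std::size_t cols = m + 1;
        if (table_.size() < (n + 1) * cols) table_.resize((n + 1) * cols);
        // Row-major view onto the flat buffer.
        auto at = [&](std::size_t i, std::size_t j) -> int& {
            return table_[i * cols + j];
        };
        for (std::size_t i = 0; i <= n; ++i) at(i, 0) = static_cast<int>(i);
        for (std::size_t j = 0; j <= m; ++j) at(0, j) = static_cast<int>(j);
        for (std::size_t i = 1; i <= n; ++i) {
            for (std::size_t j = 1; j <= m; ++j) {
                const int cost = (source_[i - 1] == target[j - 1]) ? 0 : 1;
                at(i, j) = std::min({at(i - 1, j) + 1, at(i, j - 1) + 1,
                                     at(i - 1, j - 1) + cost});
            }
        }
        return at(n, m);
    }

private:
    std::string source_;
    mutable std::vector<int> table_;  // scratch space, modified in a const call
};
```

One object built for "kitten" can then be queried with many targets in a loop. The scratch buffer is declared `mutable` so that a `const` member function can write to it.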

static Int casacore::StringDistance::doDistance ( const String &  source,
const String &  target,
Bool  countSwaps,
Matrix< Int > &  matrix 
)
static private

Calculate the distance.

Bool casacore::StringDistance::match ( const String &  target) const

Test if the given target string is within the maximum distance.

Referenced by casacore::TaqlRegex::match().

const Matrix<Int>& casacore::StringDistance::matrix ( ) const
inline

Definition at line 88 of file StringDistance.h.

References itsMatrix.

Int casacore::StringDistance::maxDistance ( ) const
inline

Definition at line 86 of file StringDistance.h.

References itsMaxDistance.

static String casacore::StringDistance::removeBlanks ( const String &  source)
static

Remove blanks from the given string.
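A possible equivalent in standard C++ is sketched below. The name `removeBlanksSketch` is hypothetical, and the sketch assumes that "blank" means the space character only; whether tabs or other whitespace are also removed is not stated here.

```cpp
#include <algorithm>
#include <string>

// Hypothetical equivalent, not the casacore implementation.
// Assumption: only the space character ' ' counts as a blank.
std::string removeBlanksSketch(const std::string& source) {
    std::string result(source);
    // std::remove shifts the kept characters forward; erase trims the tail.
    result.erase(std::remove(result.begin(), result.end(), ' '), result.end());
    return result;
}
```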

const string& casacore::StringDistance::source ( ) const
inline

Get data members.

Definition at line 84 of file StringDistance.h.

References itsSource.

Member Data Documentation

Bool casacore::StringDistance::itsCaseInsensitive
private

Definition at line 121 of file StringDistance.h.

Bool casacore::StringDistance::itsCountSwaps
private

Definition at line 119 of file StringDistance.h.

Bool casacore::StringDistance::itsIgnoreBlanks
private

Definition at line 120 of file StringDistance.h.

Matrix<Int> casacore::StringDistance::itsMatrix
mutable private

Definition at line 117 of file StringDistance.h.

Referenced by matrix().

Int casacore::StringDistance::itsMaxDistance
private

Definition at line 118 of file StringDistance.h.

Referenced by maxDistance().

String casacore::StringDistance::itsSource
private

Definition at line 116 of file StringDistance.h.

Referenced by source().


The documentation for this class was generated from the following file: