Class CountBasedNarrowingIntervalMatcher
A NarrowingIntervalMatcher refinement which precalculate the count of occurrences of each item at index i in BWT[0..(i - 1)], and the index of first occurrence of each char in SortedBWT, and later uses them to perform in constant time interval narrowing operations within the top-level loop of chars to match.
Implements
Inherited Members
Namespace: MoreStructures.BurrowsWheelerTransform.Matching
Assembly: MoreStructures.dll
Syntax
public class CountBasedNarrowingIntervalMatcher : NarrowingIntervalMatcher, IMatcher
Remarks
Precalculating counts requires iterating over all the chars of the BWT and populating a table of n rows and sigma
columns.
Precalculating first occurrences also requires iterating over the BWT, and storing a dictionary of n items.
Therefore the cost paid upfront is O(n) in time and O(n * sigma) in space.
Constructors
| Improve this Doc View SourceCountBasedNarrowingIntervalMatcher(RotatedTextWithTerminator, RotatedTextWithTerminator)
Declaration
public CountBasedNarrowingIntervalMatcher(RotatedTextWithTerminator bwt, RotatedTextWithTerminator sbwt)
Parameters
Type | Name | Description |
---|---|---|
RotatedTextWithTerminator | bwt | |
RotatedTextWithTerminator | sbwt |
Remarks
This specific implementation also precalculates two dictionaries: the counts of BWT and the first occurrence of each of the chars of SortedBWT. These two data structures makes single char matching a linear operation.
Properties
| Improve this Doc View SourceLinearSearch
The ISearch implementation to be used when looking for the 1st occurrence of each of the items of an enumerable.
Declaration
protected static ISearch LinearSearch { get; }
Property Value
Type | Description |
---|---|
ISearch |
OccurrencesCounter
The IOccurrencesCounter implementation to be used when counting the number of occurrences of each of the items of an enumerable.
Declaration
protected static IOccurrencesCounter OccurrencesCounter { get; }
Property Value
Type | Description |
---|---|
IOccurrencesCounter |
Methods
| Improve this Doc View SourceBuildLastFirstFinder()
Declaration
protected override ILastFirstFinder BuildLastFirstFinder()
Returns
Type | Description |
---|---|
ILastFirstFinder |
Overrides
Remarks
Unlike NarrowingIntervalMatcher, this implementation of IMatcher doesn't make
explicit calls to LastToFirst(Int32). Instead it solely uses its precomputed
structures and uses the last-first property implicitely when narrowing the current interval via such strucutes
in NarrowInterval(Char, ILastFirstFinder, Int32, Int32).
Because of that, it doesn't need an optimized ILastFirstFinder, and in particular one which does
precomputation (such as the PrecomputedFinder used by NarrowingIntervalMatcher),
and can just instantiate a NaiveFinder instead.
NarrowInterval(Char, ILastFirstFinder, Int32, Int32)
Declaration
protected override (bool success, int narrowedStartIndex, int narrowedEndIndex) NarrowInterval(char currentChar, ILastFirstFinder finder, int startIndex, int endIndex)
Parameters
Type | Name | Description |
---|---|---|
System.Char | currentChar | |
ILastFirstFinder | finder | |
System.Int32 | startIndex | |
System.Int32 | endIndex |
Returns
Type | Description |
---|---|
System.ValueTuple<System.Boolean, System.Int32, System.Int32> |
Overrides
Remarks
ALGORITHM
Narrowing is performed in three sub-steps (compared to the five in NarrowingIntervalMatcher):
1. The new start index is calculated as the 1st occurrence in SortedBWT of the
current char + the count of such char in BWT up to the current start index excluded
(i.e. the number of occurrences of the char up to the index before the current start index).
2. The new end index is calculated as the 1st occurrence in SortedBWT of the current
char + the count of such char in BWT up to the current end index included, short of
one (i.e. the number of occurrences of the char up to the current end index - 1).
3. The narrowed interval in Sorted BWT is returned.
COMPLEXITY
Total amortized cost is O(1), both in time and space.