Class PrecomputedFinder
A BinarySearchFinder refinement which precalculate an hash-map of all the positions by each char, for both BWT and its sorted version, which takes ~ 2 * n space and makes calls to FindIndexOfNthOccurrenceInBWT(Int32, Int32) and FindIndexOfNthOccurrenceInSortedBWT(Int32, Int32) executed in constant time.
Implements
Inherited Members
Namespace: MoreStructures.BurrowsWheelerTransform.Builders.LastFirstFinders
Assembly: MoreStructures.dll
Syntax
public class PrecomputedFinder : BinarySearchFinder, ILastFirstFinder
Remarks
Calls to FindOccurrenceRankOfCharInBWT(Int32) are still executed in O(n / sigma) time, where sigma is
the size of the alphabet, or the number of distinct values in the BWT. If sigma is constant w.r.t. n, complexity
is still linear.
Calls to FindOccurrenceRankOfCharInSortedBWT(Int32) are still executed in O(log(n / sigma)), which is
O(log(n)) when sigma is constant w.r.t. n.
Constructors
| Improve this Doc View SourcePrecomputedFinder(RotatedTextWithTerminator, RotatedTextWithTerminator)
Declaration
public PrecomputedFinder(RotatedTextWithTerminator lastBWMColumn, RotatedTextWithTerminator firstBWMColumn)
Parameters
Type | Name | Description |
---|---|---|
RotatedTextWithTerminator | lastBWMColumn | |
RotatedTextWithTerminator | firstBWMColumn |
Remarks
Properties
| Improve this Doc View SourceUnorderedListSearch
The ISearch implementation to be used when searching for items in lists not sorted in any order.
Declaration
protected static ISearch UnorderedListSearch { get; }
Property Value
Type | Description |
---|---|
ISearch |
Methods
| Improve this Doc View SourceFindIndexOfNthOccurrenceInBWT(Int32, Int32)
Declaration
public override int FindIndexOfNthOccurrenceInBWT(int indexOfCharInBWT, int occurrenceRank)
Parameters
Type | Name | Description |
---|---|---|
System.Int32 | indexOfCharInBWT | |
System.Int32 | occurrenceRank |
Returns
Type | Description |
---|---|
System.Int32 |
Overrides
Remarks
This implementation uses a precomputed hash-map of all the positions by each char.
Time Complexity = O(1). Space Complexity = O(1).
FindIndexOfNthOccurrenceInSortedBWT(Int32, Int32)
Declaration
public override int FindIndexOfNthOccurrenceInSortedBWT(int indexOfCharInBWT, int occurrenceRank)
Parameters
Type | Name | Description |
---|---|---|
System.Int32 | indexOfCharInBWT | |
System.Int32 | occurrenceRank |
Returns
Type | Description |
---|---|
System.Int32 |
Overrides
Remarks
FindOccurrenceRankOfCharInBWT(Int32)
Declaration
public override int FindOccurrenceRankOfCharInBWT(int indexOfCharInBWT)
Parameters
Type | Name | Description |
---|---|---|
System.Int32 | indexOfCharInBWT |
Returns
Type | Description |
---|---|
System.Int32 |
Overrides
Remarks
ALGORITHM
This implementation uses a precomputed hash-map of all the positions by each char.
However, unlike SortedBWT, BWT is not sorted,
so the precomputed list storing all the indexes where the char of BWT at
index indexOfCharInBWT
appears can be accessed in O(1) but has to be iterated over
linearly.
Such a list has in average n / sigma elements, where sigma is the number of distinct chars in the text.
If sigma is constant, the Time Complexity is O(n).
Space Complexity is always O(1), since O(n * sigma) space has already been allocated to host the result of
counts and first occurrences precomputation.
FindOccurrenceRankOfCharInSortedBWT(Int32)
Declaration
public override int FindOccurrenceRankOfCharInSortedBWT(int indexOfCharInSortedBWT)
Parameters
Type | Name | Description |
---|---|---|
System.Int32 | indexOfCharInSortedBWT |
Returns
Type | Description |
---|---|
System.Int32 |
Overrides
Remarks
ALGORITHM
This implementation uses a precomputed hash-map of all the positions by each char.
It also takes advantage of the fact that SortedBWT is sorted, by running a
Binary Search on it, which takes logarithmic time over the list of indexes for the char at position
indexOfCharInSortedBWT
in BWT.
Such list has average size = n / sigma, where n = number of chars in
SortedBWT and sigma = size of the alphabet of
SortedBWT.
If sigma is constant, the list has a size O(n).
COMPLEXITY
Therefore, Time Complexity = O(log(n)) and Space Complexity = O(1).