Class LastFirstPropertyBasedBuilder
An extension of NaiveBuilder which takes advantange of the last-first property to reduce the complexity of InvertTransform(RotatedTextWithTerminator).
Implements
Inherited Members
Namespace: MoreStructures.BurrowsWheelerTransform.Builders
Assembly: MoreStructures.dll
Syntax
public class LastFirstPropertyBasedBuilder : NaiveBuilder, IBuilder
Remarks
A ILastFirstFinder, built by FirstLastFinderBuilder is used to jump between the BWT and its sorted version.
Properties
| Improve this Doc View SourceFirstLastFinderBuilder
The strategy by which this builder finds chars in the BWT and its sorted version.
Declaration
public Func<RotatedTextWithTerminator, ILastFirstFinder> FirstLastFinderBuilder { get; set; }
Property Value
Type | Description |
---|---|
Func<RotatedTextWithTerminator, ILastFirstFinder> |
Methods
| Improve this Doc View SourceInvertTransform(RotatedTextWithTerminator)
Declaration
public override TextWithTerminator InvertTransform(RotatedTextWithTerminator lastBWMColumn)
Parameters
Type | Name | Description |
---|---|---|
RotatedTextWithTerminator | lastBWMColumn |
Returns
Type | Description |
---|---|
TextWithTerminator |
Overrides
Remarks
ALGORITHM
This implementation inverts the BWT by using the last-first property.
- First column of the matrix (sBWT) is just the last column (BWT), sorted.
- By last-first property, the 1-st (and only) occurrence of terminator in sBWT at sBWT[0] corresponds
to the 1st occurrence of terminator in BWT at BWT[i0]. BWTs[i0] is the 1-st char of the text.
- Again by last-first property, the n-th occurrence of c in BWTs at sBWTs[i0] corresponds to the n-th
occurrence of c in BWT at BWT[i1]. BWTs[i1] is the 2-st char of the text.
- And so on, until BWTs[i(n-1)], the terminator, is reached.
COMPLEXITY
- Before any iteration, Sorted BWT is computed, taking O(n * log(n)) time, where n is the length of
lastBWMColumn
. If the alphabet is of constant size sigma, Counting Sort reduces
the overall Time Complexity of this step to O(n).
- After that the finder may also preallocate other supporting structures, to speed up searches (such
the dictionary used in PrecomputedFinder. Although it depends on the specific
implementation built by FirstLastFinderBuilder, we may assume this cost to also be
linear with n.
- From terminator to terminator, there are n top-level iterations. Each iteration takes m1 + m2,
where m1 is the cost of FindIndexOfNthOccurrenceInBWT(Int32, Int32)
and m2 is the cost of FindOccurrenceRankOfCharInSortedBWT(Int32).
- Finally, the
- So total Time Complexity is O(n * (m1 + m2)) and Space Complexity is O(n).
Using NaiveFinder, m1 and m2 are both O(n), so Time Complexity is O(n^2).
Using BinarySearchFinder, m1 is O(n) and m2 is O(log(n)), so overall Time
Complexity is still O(n^2).
Using PrecomputedFinder, m1 is O(1), whereas m2 is O(log(n / sigma)) where
sigma is the size of the alphabet, so overall Time Complexity is O(n * log(n)) if sigma is constant.