Search Results for

    Show / Hide Table of Contents

    Class NaiveBuilder

    This implementation adopts the simplest approach at BWMatrix building, which results in a more than quadratic time and space. BWTransform is calculated via the BWMatrix, therefore same level of Time and Space Complexity.

    Inheritance
    System.Object
    NaiveBuilder
    LastFirstPropertyBasedBuilder
    Implements
    IBuilder
    Inherited Members
    System.Object.Equals(System.Object)
    System.Object.Equals(System.Object, System.Object)
    System.Object.GetHashCode()
    System.Object.GetType()
    System.Object.MemberwiseClone()
    System.Object.ReferenceEquals(System.Object, System.Object)
    System.Object.ToString()
    Namespace: MoreStructures.BurrowsWheelerTransform.Builders
    Assembly: MoreStructures.dll
    Syntax
    public class NaiveBuilder : IBuilder
    Remarks

    Check specific builder methods, such as BuildMatrix(TextWithTerminator), for further information about the complexity of each operation.

    Methods

    | Improve this Doc View Source

    BuildMatrix(BWTransform)

    Rebuilds the original BWMatrix from a BWTransform representing the last column of the Burrows-Wheeler Matrix (which is also the Burrows-Wheeler Transform).

    Declaration
    public BWMatrix BuildMatrix(BWTransform lastBWMColumn)
    Parameters
    Type Name Description
    BWTransform lastBWMColumn

    The last column of the Burrows-Wheeler Matrix.

    Returns
    Type Description
    BWMatrix

    The matrix, wrapped into a BWMatrix object.

    Remarks

    Because the entire Burrows-Wheeler Matrix is built from the text with an invertible function, and the same happens for the Burrows-Wheeler Transform of the text, it's possible to get back the entire matrix from its last column.

    | Improve this Doc View Source

    BuildMatrix(TextWithTerminator)

    Builds Burrows-Wheeler objects, such as BWMatrix and BWTransform of the provided TextWithTerminator.

    Declaration
    public virtual BWMatrix BuildMatrix(TextWithTerminator text)
    Parameters
    Type Name Description
    TextWithTerminator text

    The text to build the BWM, with its terminator (required).

    Returns
    Type Description
    BWMatrix

    The matrix, wrapped into a BWMatrix object.

    Remarks

    COMPLEXITY
    Since this operation requires computing a n * n matrix, where n is the Length of text, it can be intensive operation, both in time.
    Time Complexity:
    - Sorting a large number of strings on a large non-constant alphabet takes n * log(n) * m, where m is the cost of a comparison of two n-sized strings, which is O(n).
    - Therefore Time Complexity is O(n^2 * log(n)).
    - If the alphabet can be considered of constant size and comparison between two strings happens in constant time, the complexity is O(n * log(n)).
    Space Complexity:
    - The output is a n * n matrix of chars (all cyclic rotations of a n-sized string).
    - Therefore Space Complexity is O(n^2 * m), when no assumption is made on the size of a char being constant, where m = log(w, M), with w = size of a word in memory and M = size of the alphabet.
    - If the alphabet can be considered of constant size, the complexity is O(n^2).

    | Improve this Doc View Source

    BuildTransform(BWMatrix)

    Builds the Burrows-Wheeler Transform from the provided BWMatrix.

    Declaration
    public virtual BWTransform BuildTransform(BWMatrix matrix)
    Parameters
    Type Name Description
    BWMatrix matrix

    The matrix, whose BWT has to be calculated.

    Returns
    Type Description
    BWTransform

    The transform, wrapped into a BWTransform object.

    Remarks

    | Improve this Doc View Source

    BuildTransform(TextWithTerminator)

    Declaration
    public virtual BWTransform BuildTransform(TextWithTerminator text)
    Parameters
    Type Name Description
    TextWithTerminator text
    Returns
    Type Description
    BWTransform
    Remarks

    COMPLEXITY
    - Done without constructing the BWMatrix of text, which would requires O(n^2) space.
    - Instead, n VirtuallyRotatedTextWithTerminator objects are created (one per char of text), mapping a specific rotation of the original text and taking into account the rotation in its all its char-position dependent functionalities, such as CompareTo(VirtuallyRotatedTextWithTerminator), GetEnumerator() etc.

    | Improve this Doc View Source

    InvertMatrix(BWMatrix)

    Declaration
    public virtual TextWithTerminator InvertMatrix(BWMatrix matrix)
    Parameters
    Type Name Description
    BWMatrix matrix
    Returns
    Type Description
    TextWithTerminator
    Remarks

    COMPLEXITY
    - No computation to be done, except for building the string of the TextWithTerminator.
    - Time Complexity = O(n), Space Complexity = O(n), where n = edge of matrix.

    | Improve this Doc View Source

    InvertTransform(RotatedTextWithTerminator)

    Declaration
    public virtual TextWithTerminator InvertTransform(RotatedTextWithTerminator lastBWMColumn)
    Parameters
    Type Name Description
    RotatedTextWithTerminator lastBWMColumn
    Returns
    Type Description
    TextWithTerminator
    Remarks

    ALGORITHM
    This implementation inverts the BWT by iteratively building n+1-mers from n-mers.
    - 1-mers (first column of the matrix) is just the last column (BWT), sorted. That gives a matrix M0 of 1 columns and n rows (where n = length of lastBWMColumn).
    - 2-mers are derived from 1-mers, by juxtaposing side-by-side last column (BWT) and M0, sorted. That gives a matrix M1 of 2 columns and n rows.
    - 3-mers are derived from 2-mers, by juxtaposing side-by-side last column (BWT) and M1, sorted. That gives a matrix M2 of 3 columns and n rows.
    - And so on, up to (n - 1)-mers and matrix M(n - 2) of n - 1 columns and n rows.
    - The last column is already known (BWT), so the text can be extracted from the first line: the first char is the separator, the rest is the text without separator.

    COMPLEXITY
    - There are n top-level iterations, where n is the length of lastBWMColumn.
    - Each iteration takes n * log(n) * m time to sort, where m is the length of strings to compare = n.
    - So total Time Complexity is O(n^3 * log(n)) and Space Complexity is O(n^2).

    Implements

    IBuilder

    Extension Methods

    SuffixStructureNodeExtensions.GetAllSuffixesFor<TEdge, TNode>(TNode, TextWithTerminator)
    • Improve this Doc
    • View Source
    In This Article
    Back to top Generated by DocFX