Search Results for

    Show / Hide Table of Contents

    Class SuffixArrayBasedMatcher

    A IMatcher implementation which uses a MoreStructures.SuffixArrays.SuffixArray to perform text pattern matching.

    Inheritance
    System.Object
    SuffixArrayBasedMatcher
    Implements
    IMatcher
    Inherited Members
    System.Object.Equals(System.Object)
    System.Object.Equals(System.Object, System.Object)
    System.Object.GetHashCode()
    System.Object.GetType()
    System.Object.MemberwiseClone()
    System.Object.ReferenceEquals(System.Object, System.Object)
    System.Object.ToString()
    Namespace: MoreStructures.BurrowsWheelerTransform.Matching
    Assembly: MoreStructures.dll
    Syntax
    public class SuffixArrayBasedMatcher : IMatcher
    Remarks

    Text and its MoreStructures.SuffixArrays.SuffixArray has to be provided as an additional input.
    If not already available, it can be built via a SuffixStructureBasedSuffixArrayBuilder<TEdge, TNode>, if a Suffix Structure of the provided text is already available (such as a Suffix Tree or Trie), and via NaiveSuffixArrayBuilder otherwise.

    The algorithm performs two binary searches over the SortedBWT in sequence.
    - The first binary search looks for the first index i of the SortedBWT such that the provided pattern matches the i-th element of the MoreStructures.SuffixArrays.SuffixArray (meaning that the pattern is fully contained in the i-th suffix of Text in SuffixArray).
    - If i is not found, a failing Match is returned. Otherwise, a second binary search is performed.
    - The second binary search looks for the last index j of the SortedBWT respecting the same condition as the first binary search. This search starts from the index found in the first search (included).
    - The second search is guaranteed to succeed (as in the worst case the last occurrence corresponds to the first one, found in the first search.

    - The array of indexes is as long as the length n of SortedBWT, which is also the length of the text and the MoreStructures.SuffixArrays.SuffixArray.
    - Each binary search does a number of comparisons which is logarithmic over n.
    - Each comparison is of at most n chars.
    - So Time Complexity is O(n * log(n)) and Space Complexity is O(n).

    Constructors

    | Improve this Doc View Source

    SuffixArrayBasedMatcher(RotatedTextWithTerminator, TextWithTerminator, SuffixArray)

    Declaration
    public SuffixArrayBasedMatcher(RotatedTextWithTerminator sbwt, TextWithTerminator text, SuffixArray suffixArray)
    Parameters
    Type Name Description
    RotatedTextWithTerminator sbwt
    TextWithTerminator text
    MoreStructures.SuffixArrays.SuffixArray suffixArray
    Remarks

    Properties

    | Improve this Doc View Source

    BWT

    Declaration
    public RotatedTextWithTerminator BWT { get; }
    Property Value
    Type Description
    RotatedTextWithTerminator
    Remarks

    Unlike SortedBWT, BWT is not required to perform Pattern Matching against Text, and is not supported.

    | Improve this Doc View Source

    OrderedAscListSearch

    The ISearch implementation to be used when searching for items in lists sorted in ascending order.

    Declaration
    protected static ISearch OrderedAscListSearch { get; }
    Property Value
    Type Description
    ISearch
    | Improve this Doc View Source

    SortedBWT

    The sorted version of the Burrows-Wheeler Transform. Also, the first column of the Burrows-Wheeler Matrix.

    Declaration
    public RotatedTextWithTerminator SortedBWT { get; }
    Property Value
    Type Description
    RotatedTextWithTerminator
    | Improve this Doc View Source

    SuffixArray

    The MoreStructures.SuffixArrays.SuffixArray of Text.

    Declaration
    public SuffixArray SuffixArray { get; }
    Property Value
    Type Description
    MoreStructures.SuffixArrays.SuffixArray
    | Improve this Doc View Source

    Text

    The TextWithTerminator, to do pattern matching against.

    Declaration
    public TextWithTerminator Text { get; }
    Property Value
    Type Description
    TextWithTerminator
    Remarks

    Requires to get the actual suffix from the i-th element of SuffixArray, to be compared against the pattern.

    Methods

    | Improve this Doc View Source

    BuildComparer(IEnumerable<Char>)

    Builds a SuffixArrayAgainstPatternComparer instance, or a derivation of it, to be used in Match(IEnumerable<Char>), to find the start and end indexes in SortedBWT, corresponding to first and last matches of the pattern in Text.

    Declaration
    protected virtual SuffixArrayAgainstPatternComparer BuildComparer(IEnumerable<char> pattern)
    Parameters
    Type Name Description
    IEnumerable<System.Char> pattern

    Returns
    Type Description
    SuffixArrayAgainstPatternComparer

    An instance of SuffixArrayAgainstPatternComparer instance, or a derivation of it.

    | Improve this Doc View Source

    Match(IEnumerable<Char>)

    Tries to match the provided pattern against Text, via the SortedBWT and the SuffixArray of Text.

    Declaration
    public Match Match(IEnumerable<char> pattern)
    Parameters
    Type Name Description
    IEnumerable<System.Char> pattern
    Returns
    Type Description
    Match
    Remarks

    Implements

    IMatcher

    Extension Methods

    SuffixStructureNodeExtensions.GetAllSuffixesFor<TEdge, TNode>(TNode, TextWithTerminator)
    • Improve this Doc
    • View Source
    In This Article
    Back to top Generated by DocFX