Search Results for

    Show / Hide Table of Contents

    Interface IBuilder<TEdge, TNode>

    Builds objects, such as edges and nodes, for the ISuffixStructureNode<TEdge, TNode> concretion it is the builder of.

    Namespace: MoreStructures.SuffixStructures.Builders
    Assembly: MoreStructures.dll
    Syntax
    public interface IBuilder<TEdge, TNode>
        where TEdge : ISuffixStructureEdge<TEdge, TNode> where TNode : ISuffixStructureNode<TEdge, TNode>
    Type Parameters
    Name Description
    TEdge

    TNode

    Remarks

    This interface allows to have a shared construction interface for objects among all structures. Object construction is externalized into a Builder object as a workaround to the limitation of not having constructor signatures directly in interfaces. See https://codeblog.jonskeet.uk/2008/08/29/lessons-learned-from-protocol-buffers-part-4-static-interfaces/

    Methods

    | Improve this Doc View Source

    BuildTree(TextWithTerminator[])

    Build a ISuffixStructureEdge<TEdge, TNode> of the provided text(s), which is a n-ary search tree in which edges coming out of a node are substrings of at least one of the texts and nodes are not directly labeled instead.

    Declaration
    TNode BuildTree(params TextWithTerminator[] texts)
    Parameters
    Type Name Description
    TextWithTerminator[] texts

    The text(s) to build the Suffix Structure of, each one with its own unique terminator (required for traversal).

    Returns
    Type Description
    TNode

    The root node of the Suffix Structure.

    Remarks

    TERMINATORS
    Terminators have to be unique characters, not just on the single TextWithTerminator they are appended to, but to all TextWithTerminator items in texts.
    That means that if texts is (T1, ..., Tn), where each TextWithTerminator Ti has ti as Terminator, ti should not be in any of the Text of Tj, for any j.

    GENERALIZED SUFFIX STRUCTURES
    When multiple TextWithTerminator instances are passed into texts, the Suffix Structure built is known as Generalized Suffix Structure (e.g. Generalized Suffix Tree or Trie).
    It differs from a normal Suffix Structure built for the concatenation of items in texts by the fact that it doesn't contain any root-to-leaf path identifying a suffix which spans multiple texts.
    When a Suffix Structure is built for a single TextWithTerminator which a concatenation of multiple texts, each with its own terminator T1 || t1 || ... || Tn || tn, the resulting structure has branches which span over multiple texts, such as suffixOf[Ti] || ti || prefixOf[T(i+1)] or suffixOf[Ti] || ti || T(i+1) || t(i+1) || prefixOf[T(i+2)].
    A Generalized Suffix Structure trims those branches in construction, so that each node-to-leaf branch ends with any of the terminator t1, ..., tn (not always tn) and doesn't contain any other terminator except the ending one.


    EDGES
    Substrings of a text are identified by their start position in text and their length, rather than by a copy of the substring itself.
    The technique, known as Edge Compression allows to store edges information in constant space, and the entire tree in linear space w.r.t. the number of nodes in the tree (which can be linear or not in the input, depending on the type of suffix structure).
    The sequence of edges in root-to-node paths in the tree identify prefixes in common to multiple suffixes of the text(s).

    Extension Methods

    SuffixStructureNodeExtensions.GetAllSuffixesFor<TEdge, TNode>(TNode, TextWithTerminator)
    BuilderExtensions.BuildTree<TEdge, TNode>(IBuilder<TEdge, TNode>, String)
    • Improve this Doc
    • View Source
    In This Article
    Back to top Generated by DocFX