Search Results for

    Show / Hide Table of Contents

    Class OrderBasedSingleCharPcsClassifier

    A ISingleCharPcsClassifier implementation which uses an externally provided position list Order of the 1-char PCS of the Input, to calculate equivalence classes.

    Inheritance
    System.Object
    OrderBasedSingleCharPcsClassifier
    Implements
    ISingleCharPcsClassifier
    Inherited Members
    System.Object.Equals(System.Object)
    System.Object.Equals(System.Object, System.Object)
    System.Object.GetHashCode()
    System.Object.GetType()
    System.Object.MemberwiseClone()
    System.Object.ReferenceEquals(System.Object, System.Object)
    System.Object.ToString()
    Namespace: MoreStructures.SuffixArrays.CyclicShifts
    Assembly: MoreStructures.dll
    Syntax
    public class OrderBasedSingleCharPcsClassifier : ISingleCharPcsClassifier
    Remarks

    ADVANTAGES
    - Compared to the naive implementation of NaiveSingleCharPcsClassifier, it has better runtime and only allocates an array of n elements, where n is the length of the Input.
    - However, it requires the position list of the Input to be provided to the classifier at construction time.
    - Calculating the position list is a linear time operation only when some assumptions can be made on the input, such as an alphabet of limited size (which is the main scenario for CountingSortCharsSorter). In all other cases, the runtime has a worst case of O(n * log(n)).

    ALGORITHM
    - An array of n elements EC, to accomodate the equivalence classes of the result, is allocated upfront.
    - The item of EC with index P[0], where P is the list of positions, is set to 0. This is because P[0] is the index in the Input T, of the smallest char in T. Therefore, there are no smaller chars in T and the equivalence class of T[P[0]] is hence the smallest, i.e. 0.
    - EC[O[i + 1]] can be calculated from EC[O[i]]: two scenarios are possible.
    - If T[O[i + 1]] == T[O[i]], it means that the two chars T[O[i + 1]] and T[O[i]], which come one after the other one in the position list, are the same and should have the same equivalence class. Therefore, the equivalence class of T[O[i + 1]], EC[O[i + 1]] can be set to the equivalence class of T[O[i]], EQ[O[i]].
    - If, on the other hand, T[O[i + 1]] != T[O[i]], it means that the two chars T[O[i + 1]] and T[O[i]], which come one after the other one in the position list, are different and should have different equivalence classes. Therefore, the equivalence class of T[O[i + 1]], EC[O[i + 1]] can be set to the successor of the equivalence class of T[O[i]], EQ[O[i]].

    COMPLEXITY
    - There are as many iterations as items of the position list.
    - Within each iteration, direct accesses to items of Input and Order by index are done in constant time.
    - Therefore, Time and Space Complexity are O(n), excluding the cost of calculating the position list, which is externally provided. Check ICharsSorter implementations for the complexity of the algorithms calculating the position list of a string.

    Constructors

    | Improve this Doc View Source

    OrderBasedSingleCharPcsClassifier(String, IList<Int32>)

    Declaration
    public OrderBasedSingleCharPcsClassifier(string input, IList<int> order)
    Parameters
    Type Name Description
    System.String input

    IList<System.Int32> order

    Properties

    | Improve this Doc View Source

    Input

    The input text, whose 1-char PCS have to be classified.

    Declaration
    public string Input { get; }
    Property Value
    Type Description
    System.String
    | Improve this Doc View Source

    Order

    The position list of the 1-char PCS of the Input.

    Declaration
    public IList<int> Order { get; }
    Property Value
    Type Description
    IList<System.Int32>
    Remarks

    Has as many items as chars in the Input.
    It is specifically required by this implementation of the algorithm and has to be externally provided.
    It can be calculated with any implementation of Sort(String).

    Methods

    | Improve this Doc View Source

    Classify()

    Declaration
    public IList<int> Classify()
    Returns
    Type Description
    IList<System.Int32>

    A list of equivalence classes, with as many items as chars in the Input.

    Remarks

    Implements

    ISingleCharPcsClassifier

    Extension Methods

    SuffixStructureNodeExtensions.GetAllSuffixesFor<TEdge, TNode>(TNode, TextWithTerminator)
    • Improve this Doc
    • View Source
    In This Article
    Back to top Generated by DocFX