Search Results for

    Show / Hide Table of Contents

    Class Matcher

    Exposes utility methods to match a text against a pattern, using the Knuth-Morris-Pratt algorithm.

    Inheritance
    System.Object
    Matcher
    Inherited Members
    System.Object.Equals(System.Object)
    System.Object.Equals(System.Object, System.Object)
    System.Object.GetHashCode()
    System.Object.GetType()
    System.Object.MemberwiseClone()
    System.Object.ReferenceEquals(System.Object, System.Object)
    System.Object.ToString()
    Namespace: MoreStructures.KnuthMorrisPratt.Matching
    Assembly: MoreStructures.dll
    Syntax
    public static class Matcher

    Methods

    | Improve this Doc View Source

    Match(String, String, Char, Boolean)

    Tries to match the provided pattern against the provided text, using the Knuth-Morris-Pratt algorithm. Returns all matches found.

    Declaration
    public static IEnumerable<Match<int>> Match(string text, string pattern, char separator, bool validateSeparator = false)
    Parameters
    Type Name Description
    System.String text

    The text, to match the pattern against.

    System.String pattern

    The pattern to match.

    System.Char separator

    A special character, to be used as separator in the concatenated string, between pattern and text. The special character is required by the algorithm and must be absent from both the pattern and the text.

    System.Boolean validateSeparator

    Whether an additional validation check should be performed, before actually running the algorithm, on whether text and pattern contain separator or not. The check requires going through all chars of text and pattern, and can be expensive. This is why it has to be explicitely opted-in.

    Returns
    Type Description
    IEnumerable<Match<System.Int32>>

    A sequence of all matches, from the one starting at the lowest index in text to the one starting at the highest index.

    Remarks

    ALGORITHM
    The Knuth-Morris-Pratt algorithm:
    - First builds a concatenated string in the form "pattern-separator-text".
    - Then calculates the Prefix Function of the concatenated string.
    - Finally, iterates over the Prefix Function, retaining all indexes i such that the value of the Prefix Function at i is equal to the length of the pattern. That means that the prefix of the concatenated string from 0 to i (included) has a border of length equal to the length of the pattern, which means that the prefix contains the pattern.
    - For each retained index i, a successful match is emitted, starting at index i - 2 * m and matching exactly m chars, where m is the length of the pattern.

    COMPLEXITY
    - The only validation check which is not done in constant time is whether text and pattern contain the separator. This validation check is by default not done, and has to be opted in via the flag validateSeparator. When executed it takes O(n + m) time, since it has to compare the separator against each char of the text and of the pattern.
    - The concatenated string has length n + m + 1, where n is the length of the text and m is the length of the pattern. Building it requires iterating both over the pattern and the text.
    - Calculating the Prefix Function of the concatenated string via FastPrefixFunctionCalculator is a O(n + m + 1) operation, both in time and space.
    - Iterating over the indexes of the Prefix Function which refer to the text is an O(n) operation.
    - Each iteration makes a constant-time comparison and yield the result.
    - Therefore, both Time and Space Complexity are O(n + m).

    • Improve this Doc
    • View Source
    In This Article
    Back to top Generated by DocFX