Class EqClassesBasedDoubleLengthPcsClassifier
An implementation of IDoubleLengthPcsClassifier which uses the externally provided list of equivalence classes EqClassesPcsHalfLength, of the PCS of length PcsLength / 2, as well as the position list of the PCS of length PcsLength, and doesn't require the input string, to generate equivalence classes of the PCS of length PcsLength.
Inheritance
Implements
Inherited Members
Namespace: MoreStructures.SuffixArrays.CyclicShifts
Assembly: MoreStructures.dll
Syntax
public class EqClassesBasedDoubleLengthPcsClassifier : IDoubleLengthPcsClassifier
Remarks
ADVANTAGES AND DISADVANTAGES
- Compared to NaiveDoubleLengthPcsClassifier, it has way better runtime (linear time, instead of
cubic).
- Compared OrderBasedDoubleLengthPcsClassifier, it still has better runtime (linear time, instead
of quadratic).
- However, it requires both the position list Order and the equivalence class list
EqClassesPcsHalfLength, to be externally provided.
- Compared to other implementations, this algorithm requires PcsLength to be even.
- Unlike NaiveDoubleLengthPcsClassifier, this implementation requires specific data structures
to be provided, in alternative to the input, to calculate equivalence classes.
- However, unlike NaiveDoubleLengthPcsClassifier, it does not require to know whether
input contains a terminator or not.
- This is because such piece of information would only be needed when running comparisons between PCS.
- This classifier, on the other hand, uses an externally provided lists precisely in order to avoid the
need for costly PCS comparisons, which are "embedded" in the externally provided data structures.
ALGORITHM
- The externally provided ordered sequence of PCS Order is iterated over.
- A new equivalence class is generated (increasing the current counter) every time a PCS differs from the
previous one. The equivalence class is assigned to the output at the index of the PCS being checked.
- The comparison between a PCS and the previous one (both of length PcsLength), is not done by
comparing chars, rather by comparing equivalence classes of the PCS of length PcsLength / 2,
defined in the externally provided EqClassesPcsHalfLength.
- More in detail: to compare the PCS of even length L = PcsLength at index j1 =
Order[i] with the one at index j2 = Order[i - 1], the following two comparisons
are done: EqClassesPcsHalfLength[j1] == EqClassesPcsHalfLength[j2]
and
EqClassesPcsHalfLength[j1 + L / 2] == EqClassesPcsHalfLength[j2 + L / 2]
.
- Finally, the equivalence class list is returned.
COMPLEXITY
- Iterating over the n PCS of length L is an O(n) operation. Each iteration does O(1) work, since
direct access to the equivalence class list of PCS of half length is done in constant time and comparison
between the current PCS and the previous one is a comparison of the equivalence classes of the first and
second halves of both PCS, and requires two pairs of integers to be compared.
- Instantiating the equivalence class list of output is also an O(n) operation.
- Therefore, Time and Space Complexity are O(n).
Constructors
| Improve this Doc View SourceEqClassesBasedDoubleLengthPcsClassifier(Int32, IList<Int32>, IList<Int32>)
Declaration
public EqClassesBasedDoubleLengthPcsClassifier(int pcsLength, IList<int> eqClassesPcsHalfLength, IList<int> order)
Parameters
Type | Name | Description |
---|---|---|
System.Int32 | pcsLength | |
IList<System.Int32> | eqClassesPcsHalfLength | |
IList<System.Int32> | order |
Properties
| Improve this Doc View SourceEqClassesPcsHalfLength
The list of equivalence classes of the PCS of length PcsLength / 2, to be used to calculate the equivalence classes of the PCS of double the length (PcsLength).
Declaration
public IList<int> EqClassesPcsHalfLength { get; }
Property Value
Type | Description |
---|---|
IList<System.Int32> |
Order
The position list of the PCS of length PcsLength, to be used, in addition to the EqClassesPcsHalfLength, to calculate the equivalence classes of length PcsLength.
Declaration
public IList<int> Order { get; }
Property Value
Type | Description |
---|---|
IList<System.Int32> |
PcsLength
The length the PCS of input string to be classified.
Declaration
public int PcsLength { get; }
Property Value
Type | Description |
---|---|
System.Int32 |
Methods
| Improve this Doc View SourceClassify()
Declaration
public IList<int> Classify()
Returns
Type | Description |
---|---|
IList<System.Int32> |