Pattern matching

The text analysis classes include a set of iterator classes that search through part or all of a text object to find a specific character pattern. This family of iterators searches through styled text, ignoring any styling information when pattern matching. You always specify the pattern as a text object. However, each concrete iterator class interprets the pattern in a different way, as described below.

TTextPatternIterator is the abstract base class that provides the protocol for text pattern matching. The system provides a set of concrete derived classes that interpret the text pattern in these ways:

TStandardTextPatternIterator provides language-sensitive pattern matching. This iterator uses a TTableBasedTextOrder object to match characters based on the rules of a particular natural language. By default, this class uses the ordering object specified in the current locale, but you can explicitly specify a different ordering object.
TExactTextPatternIterator provides language-insensitive pattern matching based on Unicode character sequences. This iterator performs a bit-by-bit comparison, matching the pattern only where the Unicode values are identical.
TSpanTextPatternIterator locates the spans of contiguous characters that either include or exclude characters in the specified pattern. This iterator treats the pattern as a set of individual characters, each of which can match characters in the text object in any order.
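The three interpretations of a pattern can be contrasted with a minimal sketch in standard C++17. This is not the Taligent API: Match, FindExact, FindCaseFolded, and FindSpan are hypothetical names invented for illustration, the text is a plain std::string rather than a styled text object, and ASCII case folding is only a crude stand-in for the language rules that a TTableBasedTextOrder object would apply.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical result type: where a match starts and how long it is.
struct Match {
    std::size_t start;
    std::size_t length;
};

// Exact interpretation: the pattern matches only where every code unit
// in the text is identical to the corresponding code unit in the pattern.
std::optional<Match> FindExact(const std::string& text,
                               const std::string& pattern,
                               std::size_t from = 0) {
    const std::size_t pos = text.find(pattern, from);
    if (pos == std::string::npos) return std::nullopt;
    return Match{pos, pattern.size()};
}

// Order-based interpretation (assumption: ASCII case folding stands in
// for a real language-sensitive ordering object).
std::optional<Match> FindCaseFolded(const std::string& text,
                                    const std::string& pattern,
                                    std::size_t from = 0) {
    if (from > text.size()) return std::nullopt;
    auto sameLetter = [](unsigned char a, unsigned char b) {
        return std::tolower(a) == std::tolower(b);
    };
    const auto first = text.begin() + static_cast<std::ptrdiff_t>(from);
    const auto it = std::search(first, text.end(),
                                pattern.begin(), pattern.end(), sameLetter);
    if (it == text.end()) return std::nullopt;
    return Match{static_cast<std::size_t>(it - text.begin()), pattern.size()};
}

// Span interpretation: the pattern is a set of characters. With
// including == true, find the next run of characters that are all in the
// set; with including == false, the next run of characters not in the set.
std::optional<Match> FindSpan(const std::string& text,
                              const std::string& charSet,
                              bool including,
                              std::size_t from = 0) {
    const std::size_t start = including
        ? text.find_first_of(charSet, from)
        : text.find_first_not_of(charSet, from);
    if (start == std::string::npos) return std::nullopt;
    const std::size_t end = including
        ? text.find_first_not_of(charSet, start)
        : text.find_first_of(charSet, start);
    const std::size_t stop = (end == std::string::npos) ? text.size() : end;
    return Match{start, stop - start};
}
```

For the text "Call 555-0199 now", FindExact with the pattern "call" finds nothing because the first letter differs in case, while FindCaseFolded matches at offset 0. FindSpan with the pattern "0123456789" and including set to true returns the run "555" at offset 5; with including set to false it returns the leading run "Call ".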
