Interface | Description |
---|---|
IcuTokenizerConfig | Allows for tailored Unicode Text Segmentation on a per-writing-system basis. |
Class | Description |
---|---|
DefaultIcuTokenizerConfig | Default IcuTokenizerConfig that is generally applicable to many languages. |
IcuTokenizer | Breaks text into words according to UAX #29: Unicode Text Segmentation (http://www.unicode.org/reports/tr29/). |
IcuTokenizerFactory | Factory for the ICU-based tokenizer, optionally using ICU RBBI (rule-based break iterator) rules files. |
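
To give a feel for the word segmentation that the classes above perform, here is a minimal sketch using the JDK's own `java.text.BreakIterator`. It does not use this package's API: `WordBreakSketch` and its `words` helper are hypothetical names introduced only for illustration, and the JDK's word-break rules are an approximation of UAX #29 rather than the ICU implementation that IcuTokenizer relies on.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; not part of this package.
class WordBreakSketch {

    // Splits text at word boundaries and keeps only segments that
    // contain at least one letter or digit (dropping spaces and punctuation).
    static List<String> words(String text) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        it.setText(text);

        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            if (segment.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(segment);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints the word-like segments found in the sample sentence.
        System.out.println(words("Hello, world"));
    }
}
```

A tokenizer built on ICU follows the same boundary-iteration pattern, but with ICU's break rules, which an IcuTokenizerConfig can tailor per writing system.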