| Interface | Description |
|---|---|
| IcuTokenizerConfig | Class that allows for tailored Unicode Text Segmentation on a per-writing system basis. |

| Class | Description |
|---|---|
| DefaultIcuTokenizerConfig | Default IcuTokenizerConfig that is generally applicable to many languages. |
| IcuTokenizer | Breaks text into words according to UAX #29: Unicode Text Segmentation (http://www.unicode.org/reports/tr29/). |
| IcuTokenizerFactory | ICU-based tokenizer, optionally using ICU rbbi rules files. |
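
As a rough illustration of the word-boundary segmentation that IcuTokenizer is described as performing, the sketch below splits text into word-like segments. It is a stand-in only: it uses the JDK's `java.text.BreakIterator`, not ICU and not any class from this package, and the JDK's boundary rules are not guaranteed to match UAX #29 or the ICU rules exactly. The class name `WordBreakSketch` and the method `words` are hypothetical names chosen for this example.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; not part of this package.
class WordBreakSketch {

    // Returns the segments between word boundaries that contain at least
    // one letter or digit, dropping whitespace and punctuation segments.
    static List<String> words(String text) {
        // JDK word-boundary iterator, used here as a stand-in for ICU.
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        it.setText(text);

        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            if (segment.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(segment);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("Hello, world!"));
    }
}
```

A tokenizer built on ICU would apply the same idea (iterate over boundaries, keep the segments that qualify as tokens), with the boundary rules supplied per writing system by a configuration such as IcuTokenizerConfig.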