Character clustering for Thai text
Clusters Thai text into indivisible units. A character
cluster is defined as the smallest recognizable unit. The character
string is clustered so that later processing never operates on invalid
Thai character units.
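
The idea of a character cluster can be illustrated with a minimal Python sketch. The rule used here is an assumption made for illustration, not the tool's actual rule set: a cluster is an optional leading vowel, one base consonant, any combining vowels or tone marks, and an optional following vowel. The name thai_clusters is hypothetical.

```python
import re

# Simplified, assumed cluster rule (real rule sets are more detailed):
#   [leading vowel] consonant {combining marks} [following vowel]
_CLUSTER = re.compile(
    "[\u0e40-\u0e44]?"                      # optional leading vowel
    "[\u0e01-\u0e2e]"                       # base consonant
    "[\u0e31\u0e34-\u0e3a\u0e47-\u0e4e]*"   # above/below vowels, tone marks, signs
    "[\u0e30\u0e32\u0e33]?"                 # optional following vowel
    "|.",                                   # anything else: one character per unit
    re.DOTALL,
)


def thai_clusters(text):
    """Split text into units that should not be divided any further."""
    return _CLUSTER.findall(text)


# Consonant + tone mark + following vowel stay together as one unit.
print(thai_clusters("\u0e19\u0e49\u0e33"))
```

Because every character falls into exactly one unit, joining the returned list reproduces the input string, and a combining vowel or tone mark is never separated from its base consonant.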
KWIC for Thai text
KWIC (KeyWord In Context) for both
segmented and unsegmented Thai text. It is used to create a concordance
of Thai text for studying the occurrences of a word in question.
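
A concordance of this kind can be sketched in Python as follows. This is an illustrative sketch, not the tool's implementation: the function names kwic_tokens and kwic_string and the width parameter are hypothetical. Segmented text is assumed to be a list of words; unsegmented text is assumed to be a plain string searched by substring.

```python
def kwic_tokens(tokens, keyword, width=3):
    """KWIC over segmented text: context is measured in tokens."""
    rows = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            rows.append((left, tok, right))
    return rows


def kwic_string(text, keyword, width=10):
    """KWIC over unsegmented text: context is measured in characters."""
    rows = []
    if not keyword:
        return rows
    start = text.find(keyword)
    while start != -1:
        end = start + len(keyword)
        left = text[max(0, start - width):start]
        right = text[end:end + width]
        rows.append((left, keyword, right))
        # Continue one position later so overlapping hits are also reported.
        start = text.find(keyword, start + 1)
    return rows


for left, key, right in kwic_string("xxabyyabzz", "ab", width=2):
    print(f"{left:>2} [{key}] {right}")
```

In the unsegmented case the context window is cut by character count, so it may start or end in the middle of a character cluster; trimming the window to cluster boundaries would avoid displaying invalid units.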