A Two-Pass Search Algorithm for Thai Morphological Analysis
Canasai Kruengkrai and Hitoshi Isahara
Abstract
When Thai morphological analysis is treated as a search problem, the task is to find the most likely path among all candidate paths in the word lattice.
However, the search space may not contain all possible word hypotheses due to the unknown word problem.
This paper describes an efficient algorithm, called the two-pass search algorithm, that first recovers missing word hypotheses and then searches for the most likely path in the expanded search space.
Experimental results show that the two-pass search algorithm improves the performance of the standard search by 3.23 F1 in word segmentation and 2.92 F1 in the combination of word segmentation and POS tagging.
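The two-pass idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how missing hypotheses are recovered or how paths are scored, so the coverage-based recovery heuristic, the unknown-word penalty, the toy lexicon scores, and all function names below are assumptions made for illustration only.

```python
import math

def build_lattice(text, lexicon):
    """Collect an edge (start, end, word, score) for every lexicon word in text."""
    edges = []
    for i in range(len(text)):
        for j in range(i + 1, len(text) + 1):
            word = text[i:j]
            if word in lexicon:
                edges.append((i, j, word, lexicon[word]))
    return edges

def recover_missing(text, edges, max_len=4, unk_base=-5.0, unk_per_char=-2.0):
    """Pass 1 (assumed heuristic): add unknown-word edges over characters
    that no lexicon edge covers. The penalty values are arbitrary."""
    covered = [False] * len(text)
    for start, end, _, _ in edges:
        for k in range(start, end):
            covered[k] = True
    expanded = list(edges)
    i = 0
    while i < len(text):
        if covered[i]:
            i += 1
            continue
        j = i
        while j < len(text) and not covered[j]:
            j += 1
        # Uncovered span is text[i:j]; hypothesize its substrings up to max_len.
        for s in range(i, j):
            for e in range(s + 1, min(j, s + max_len) + 1):
                score = unk_base + unk_per_char * (e - s)
                expanded.append((s, e, text[s:e], score))
        i = j
    return expanded

def best_path(n, edges):
    """Pass 2: Viterbi search for the highest-scoring path from 0 to n.
    Returns the word sequence, or None if no complete path exists."""
    by_start = {}
    for edge in edges:
        by_start.setdefault(edge[0], []).append(edge)
    best = [-math.inf] * (n + 1)
    back = [None] * (n + 1)
    best[0] = 0.0
    for i in range(n):
        if best[i] == -math.inf:
            continue
        for start, end, word, score in by_start.get(i, []):
            if best[i] + score > best[end]:
                best[end] = best[i] + score
                back[end] = (start, word)
    if best[n] == -math.inf:
        return None
    words, pos = [], n
    while pos > 0:
        pos, word = back[pos]
        words.append(word)
    return words[::-1]

def two_pass_search(text, lexicon):
    """Expand the lattice with recovered hypotheses, then search it."""
    edges = build_lattice(text, lexicon)
    return best_path(len(text), recover_missing(text, edges))

# Toy example with made-up log-scores; "z" is not in the lexicon.
lexicon = {"the": -1.0, "cat": -2.0, "sat": -2.0}
standard = best_path(10, build_lattice("thecatzsat", lexicon))
two_pass = two_pass_search("thecatzsat", lexicon)
```

In this toy run, `standard` is `None` because the unexpanded lattice has no complete path through the unknown character, while `two_pass` is `["the", "cat", "z", "sat"]`. Note that this simple heuristic recovers only characters that no lexicon word covers; an unknown word whose characters all happen to be covered by dictionary entries would need a different recovery strategy.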