Browsing Computer Science by Subject "Universal dependencies"
Multilingual Word Segmentation: Training Many Language-Specific Tokenizers Smoothly Thanks to the Universal Dependencies Corpus
(European Language Resources Association (ELRA), 2018) This paper describes how a tokenizer can be trained from any dataset in the Universal Dependencies 2.1 corpus (UD2) (Nivre et al., 2017). A software tool, which relies on Elephant (Evang et al., 2013) to perform the ...