Data files for ucto, the rule-based tokenization package used to
tokenize texts in different languages.
