I have found some problems trying to implement case mapping. I am making
some assumptions and have some questions.
#1 It is unclear which languages other than Turkish use the dotless I. I
assume the full list is:
Turkish, Azeri, Tatar, and Bashkir.
#2 What are the rules for title case and spacing? I assume that a
non-breaking space is a joiner and does not mark the following
alphabetic character as the start of a new title-cased word. I also
assume that the zero width non-breaking space (U+FEFF, the BOM) is neutral.
#3 French also has other articles such as d'. Are there prescribed rules
for their capitalization? Are there other languages to consider?
#4 There is no mention of stop words.
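
To make assumptions #1 and #2 concrete, here is a minimal Python sketch.
It is only an illustration of what I am assuming, not a statement of the
correct rules. The language codes in DOTLESS_I_LANGUAGES, the function
names, and the handling of punctuation are my own placeholders. French
elision (#3) and stop words (#4) are deliberately not handled.

```python
# Assumption #1: languages that pair dotted/dotless I (hypothetical list).
DOTLESS_I_LANGUAGES = {"tr", "az", "tt", "ba"}

NBSP = "\u00a0"    # no-break space, assumed to act as a joiner
ZWNBSP = "\ufeff"  # zero width no-break space (BOM), assumed neutral


def to_upper(text: str, lang: str) -> str:
    """Uppercase text, mapping i -> U+0130 and U+0131 -> I if needed."""
    if lang in DOTLESS_I_LANGUAGES:
        text = text.replace("i", "\u0130").replace("\u0131", "I")
    return text.upper()


def to_lower(text: str, lang: str) -> str:
    """Lowercase text, mapping U+0130 -> i and I -> U+0131 if needed."""
    if lang in DOTLESS_I_LANGUAGES:
        text = text.replace("\u0130", "i").replace("I", "\u0131")
    return text.lower()


def to_title(text: str) -> str:
    """Title-case text under assumption #2 (language-neutral)."""
    out = []
    at_word_start = True
    for ch in text:
        if ch == ZWNBSP:
            # Neutral: copy it through and leave the state unchanged.
            out.append(ch)
        elif ch == NBSP:
            # Joiner: the next letter does not start a new word.
            out.append(ch)
            at_word_start = False
        elif ch.isalpha():
            out.append(ch.title() if at_word_start else ch.lower())
            at_word_start = False
        else:
            # Simplification: only ordinary whitespace starts a new word.
            out.append(ch)
            at_word_start = ch.isspace()
    return "".join(out)
```

With these assumptions, to_title("hello\u00a0world") leaves "world" in
lower case, while to_title("hello world") capitalizes both words.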
Carl