Swapcase for Titlecase characters
Martin J. Dürst
duerst at it.aoyama.ac.jp
Fri Mar 18 02:43:56 CDT 2016
I'm working on extending the case conversion methods for the programming
language Ruby from the current ASCII-only support to cover all of Unicode.
Ruby comes with four methods for case conversion. Three of them, upcase,
downcase, and capitalize, are quite clear. But we have hit a question
for the fourth method, swapcase.
What swapcase does is swap upper and lower case, so that e.g.
'Unicode Standard'.swapcase => 'uNICODE sTANDARD'
I'm not sure myself where this method is actually used, but it also
exists in Python (and maybe Ruby got it from there).
Now the question I have is: What to do for titlecase characters? Several
possibilities have already been floated:
a) Leave as is, because they are neither upper nor lower case.
b) Convert to upper (or lower), which may simplify implementation.
c) Decompose the character into upper and lower case components, and
apply swapcase to these.
For example, 'ǅinsi' (jeans) would become 'ǅINSI' with a), 'ǄINSI' (or
'ǆINSI') with b), and 'dŽINSI' with c). For another example, 'ᾨδή' would
become 'ᾨΔΉ' with a), 'ὨΙΔΉ' (or 'ᾠΔΉ') with b), and 'ὠΙΔΉ' with c).
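To make the three options concrete, here is a minimal sketch in Python (used only for illustration; it is not the Ruby implementation). The function name swapcase_titlecase, the policy labels, and the small TITLECASE_SWAPPED table are all hypothetical. The table is hand-written for the two examples above, since there is no Unicode data file for option c).

```python
import unicodedata

# Hand-written, illustrative table for option c): the titlecase character is
# split into its components and the case of each component is swapped.
# This is NOT data from the Unicode Standard; it covers only the two examples.
TITLECASE_SWAPPED = {
    "\u01C5": "d\u017D",       # ǅ -> d + Ž
    "\u1FA8": "\u1F60\u0399",  # ᾨ -> ὠ + Ι
}


def swapcase_titlecase(text, policy):
    """Swap case, treating titlecase (Lt) characters according to `policy`.

    policy is one of "a" (leave as is), "b_upper", "b_lower", or "c".
    """
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Lt":
            if policy == "a":
                out.append(ch)
            elif policy == "b_upper":
                out.append(ch.upper())
            elif policy == "b_lower":
                out.append(ch.lower())
            elif policy == "c":
                # Fall back to leaving the character alone if it is not in
                # the illustrative table.
                out.append(TITLECASE_SWAPPED.get(ch, ch))
            else:
                raise ValueError("unknown policy: " + policy)
        else:
            # Ordinary characters: plain per-character case swap.
            out.append(ch.swapcase())
    return "".join(out)


for policy in ("a", "b_upper", "b_lower", "c"):
    print(policy, swapcase_titlecase("\u01C5insi", policy))
```

A real implementation of c) would need a complete table for all titlecase characters, which would have to be derived or written by hand.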
It looks like Python 3 (3.4.3 in my case) is doing a). My guess is that
from a user-expectation point of view, c) is best, so I'm leaning toward
c). There is no existing data from the Unicode Standard for this,
but it seems pretty straightforward.
But before I just implement something, I'd appreciate additional input,
in particular from users closer to the affected language communities.