On Fri, Mar 18, 2016, 08:43:56, Martin J. Dürst wrote:
> I'm working on extending the case conversion methods for the programming
> language Ruby from the current ASCII only to cover all of Unicode.
>
> Ruby comes with four methods for case conversion. Three of them, upcase,
> downcase, and capitalize, are quite clear. But we have hit a question
> for the fourth method, swapcase.
>
> What swapcase does is swap upper and lower case, so that e.g.
>
> 'Unicode Standard'.swapcase => 'uNICODE sTANDARD'
>
> I'm not sure myself where this method is actually used, but it also
> exists in Python (and maybe Ruby got it from there).
>
>
> Now the question I have is: What to do for titlecase characters? Several
> possibilities already have been floated:
>
> a) Leave as is, because they are neither upper nor lower case.
>
> b) Convert to upper (or lower), which may simplify implementation.
>
> c) Decompose the character into upper and lower case components, and
> apply swapcase to these.
>
>
> For example, 'Džinsi' (jeans) would become 'DžINSI' with a), 'DŽINSI' (or
> 'džinsi') with b), and 'dŽINSI' with c). For another example, 'ᾨδή' would
> become 'ᾨΔΉ' with a), 'ὨΙΔΉ' (or 'ᾠΔΉ') with b), and 'ὠΙΔΉ' with c).
>
> It looks like Python 3 (3.4.3 in my case) is doing a). My guess is that
> from a user expectation point of view, c) is best, so I'm tending to go
> for c). There is no existing data from the Unicode Standard for this,
> but it seems pretty straightforward.
>
> But before I just implement something, I'd appreciate additional input,
> in particular from users closer to the affected language communities.
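For reference, the three options quoted above can be sketched in Ruby roughly as follows. This is only an illustration under stated assumptions: the TITLECASE table, swap_simple, and swapcase_with are hypothetical names, the table is written by hand for the two titlecase characters in the examples, and none of this is meant to describe what Ruby's own swapcase does.

```ruby
# Hypothetical, hand-written table covering only the two titlecase
# characters from the examples. A real implementation would derive
# this data from the Unicode Character Database.
TITLECASE = {
  "\u01C5" => { upper: "\u01C4", lower: "\u01C6", swapped: "d\u017D" },            # Dž
  "\u1FA8" => { upper: "\u1F68\u0399", lower: "\u1FA0", swapped: "\u1F60\u0399" }  # ᾨ
}.freeze

# Swap the case of one ordinary (non-titlecase) character.
def swap_simple(ch)
  ch == ch.upcase ? ch.downcase : ch.upcase
end

# strategy selects how titlecase characters are treated:
#   :keep             option a) leave as is
#   :upper or :lower  option b) convert to upper (or lower) case
#   :swapped          option c) decompose, then swap each component
def swapcase_with(str, strategy)
  str.each_char.map do |ch|
    entry = TITLECASE[ch]
    if entry.nil?
      swap_simple(ch)
    elsif strategy == :keep
      ch
    else
      entry.fetch(strategy)
    end
  end.join
end

jeans = "\u01C5insi" # 'Džinsi'
puts swapcase_with(jeans, :keep)
puts swapcase_with(jeans, :upper)
puts swapcase_with(jeans, :swapped)
```

The strings are written with \u escapes so that the precomposed titlecase characters are unambiguous.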
As far as I can tell from my limited experience, the swapcase method is used only to convert “inverted titlecase” to titlecase. By “inverted titlecase” I mean text typed while the caps lock toggle is accidentally on, so that words come out “inversely capitalized” wherever the user pressed the shift modifier. Examples of that kind would therefore be the most useful.
Having said that, I know that this never occurs on the many keyboards of English-speaking users who have remapped that key to another action such as backspace, compose, or kana lock. Since I live in a country where the caps lock toggle is indispensable, I may be considered part of the targeted user communities, though unfortunately I speak neither Croatian nor Greek.
Looking at your examples, I would add the cases where swapcase is typically applied: ‘ᾠΔΉ’ (cited [erroneously] as a result of option b), which is to be converted to ‘ᾨδή’, and ‘džINSI’, which is to become ‘Džinsi’.
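A minimal Ruby sketch of that repair, assuming a hand-written table: LOWER_TO_TITLE and fix_inverted_titlecase are hypothetical names, and the table lists only the two lowercase characters from these examples whose titlecase form differs from their uppercase form. It works on a single word and is not meant as a general implementation.

```ruby
# Hypothetical table: lowercase characters whose titlecase form is a
# distinct character (only the two from the examples are listed).
LOWER_TO_TITLE = {
  "\u01C6" => "\u01C5", # dž => Dž
  "\u1FA0" => "\u1FA8"  # ᾠ => ᾨ
}.freeze

# Turn one "inverted titlecase" word back into titlecase: the first
# character goes to its titlecase form when the table has one, and
# every other character simply has its case swapped.
def fix_inverted_titlecase(word)
  word.each_char.with_index.map do |ch, i|
    if i.zero? && LOWER_TO_TITLE.key?(ch)
      LOWER_TO_TITLE[ch]
    elsif ch == ch.upcase
      ch.downcase
    else
      ch.upcase
    end
  end.join
end

puts fix_inverted_titlecase("\u01C6INSI")         # input is 'džINSI'
puts fix_inverted_titlecase("\u1FA0\u0394\u0389") # input is 'ᾠΔΉ'
```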
As for decomposing digraphs and ypogegrammeni in order to apply swapcase: that would probably do no good, as itʼs unnecessary and users wonʼt expect it.
I hope that helps.
Kind regards,
Marcel
Received on Fri Mar 18 2016 - 14:35:42 CDT