Re: Titlecasing words starting with numeric glyphs and period as word separator

From: Mark Davis ☕ (mark@macchiato.com)
Date: Wed Feb 23 2011 - 09:57:39 CST

Next message: Mark Rosa: "Re: Kaida font (work in progress)"

Previous message: Koji Ishii: "RE: Titlecasing words starting with numeric glyphs and period as word separator"
In reply to: Koji Ishii: "RE: Titlecasing words starting with numeric glyphs and period as word separator"

I didn't take what you said as at all brash - you and others at CSS are
looking for a solution to your issue, and there is no reason for you to know
the structure and process used in the Unicode Consortium. Such a solution
could involve use of structure and properties already defined (by the UTC
and CLDR-TC), or result in improvements or extensions to those structures.

I should have also mentioned that the W3C has a liaison relationship with
the Unicode Consortium, and you can also work through knowledgeable people
in the i18n group in the W3C, such as Richard Ishida and Addison Phillips.

Mark

*— Il meglio è l’inimico del bene —*

On Tue, Feb 22, 2011 at 20:14, Koji Ishii <kojiishi@gluesoft.co.jp> wrote:

> Thank you Mark for leading me.
>
>
>
> I apologize for any brashness, as I’m new here.
>
>
>
> I didn’t state what I want very clearly, and I’m sorry about that. All I
> want for now is to present what was discussed at CSS, hear what people
> here have to say, and hopefully have some discussion.
>
>
>
> I’m not sure at this point whether I want it on the next agenda, but I’ll
> follow your instructions if I do.
>
>
>
>
>
> Regards,
>
> Koji
>
>
>
> *From:* mark.edward.davis@gmail.com [mailto:mark.edward.davis@gmail.com] *On
> Behalf Of *Mark Davis ☕
> *Sent:* Tuesday, February 22, 2011 4:56 PM
> *To:* Koji Ishii
> *Cc:* unicode@unicode.org
> *Subject:* Re: Titlecasing words starting with numeric glyphs and period
> as word separator
>
>
>
> The default Unicode rules cannot cover all languages or circumstances
> properly. It is worth bringing up to the Unicode technical committee any
> proposals (and/or problem cases) with the default rules, but bear in mind
> that those default rules will never be able to cover all languages well. Acronyms,
> hyphenations, and contractions present particular problems: there are some
> notes on some of them in http://www.unicode.org/reports/tr29/.
>
>
>
> You can have discussions here or on the http://unicode.org/forum/, but to
> get on the next agenda (May) for the UTC, make sure that there is a proposal
> filed by a member or by you on http://www.unicode.org/reporting.html.
>
>
>
> > "word separating rules optimized for titlecasing" could be slightly
> > different from general word separating rules
>
>
>
> Language-specific rules, such as those for titlecasing, fall under the CLDR
> technical committee <http://cldr.unicode.org/>. Tickets were filed some
> time ago for adding structure and data for language-specific titlecasing,
> but the work hasn't reached a high enough relative priority for the
> committee to take on. Having such "word separating rules optimized for
> titlecasing" was the direction the committee was thinking of. I put it on
> the agenda for the next CLDR meeting (that committee meets weekly by phone),
> and you can file a ticket with additional information and/or example problem
> cases that you'd like to see handled:
> http://unicode.org/cldr/trac/newticket
>
>
>
> Mark
>
> *— Il meglio è l’inimico del bene —*
>
> On Mon, Feb 21, 2011 at 23:15, Koji Ishii <kojiishi@gluesoft.co.jp> wrote:
>
> Hello,
>
> There's a discussion going on in the W3C CSS mailing list[1] about the
> specification of the text-transform property[2], specifically how the
> "capitalize" value titlecases the specified span of text.
>
> During the discussion, two cases were presented:
>
> 1. Titlecasing words that start with numeric glyphs (e.g., "99ers") can
> produce "99Ers" if we follow the rules defined in 5.18 Case Mappings. Has
> this been discussed here, and is it up to implementations to decide which
> words titlecasing applies to, or should this be fixed in the Unicode spec?
>
> 2. We're thinking of using UAX #29 to separate words and then applying
> Titlecase_Mapping to every word. But doing so turns "a.m." into "A.m.",
> which contradicts general publication rules[3]. I understand that both
> word separation and titlecasing are ambiguous, cannot be perfect, and
> require compromises. But since Unicode defines these two rules
> separately, I guess it's possible that "word separating rules
> optimized for titlecasing" could be slightly different from general word
> separating rules. I haven't thought much about counter-cases against doing
> so, but I wonder if anyone on this ML has ideas, including whether we
> should do it or not, or whether we should cover other cases as well.
>
> Any feedback is greatly appreciated.
>
>
> Regards,
> Koji
>
> [1] http://lists.w3.org/Archives/Public/www-style/2011Feb/0621.html
> [2] http://dev.w3.org/csswg/css3-text/#text-transform
> [3]
> http://www.businesswritingblog.com/business_writing/2009/06/what-is-the-correct-time-am-pm-am-pm-am-pm-.html
>
>
>
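
The two cases raised in the original message can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: `WORD_RE` is a hand-written regular expression that roughly imitates how UAX #29 keeps a period between letters inside one word, not an implementation of the full word-boundary algorithm, and `titlecase_word` and `naive_capitalize` are hypothetical helper names, not anything defined by Unicode or CSS.

```python
import re

# Simplified stand-in for UAX #29 word boundaries (an assumption, not the
# full algorithm): runs of letters/digits, where a single "." or "'" that
# sits between two alphanumerics does not break the word.
WORD_RE = re.compile(r"[^\W_]+(?:[.'][^\W_]+)*")


def titlecase_word(word: str) -> str:
    """Titlecase the first cased character and lowercase what follows it."""
    for i, ch in enumerate(word):
        if ch.lower() != ch.upper():  # crude test for a cased character
            return word[:i] + ch.title() + word[i + 1:].lower()
    return word  # no cased character at all, e.g. "2011"


def naive_capitalize(text: str) -> str:
    """Apply titlecase_word to every word matched by WORD_RE."""
    return WORD_RE.sub(lambda m: titlecase_word(m.group(0)), text)


if __name__ == "__main__":
    for sample in ("99ers", "a.m.", "meet at 9 a.m. with the 99ers"):
        print(repr(sample), "->", repr(naive_capitalize(sample)))
```

With this sketch, "99ers" comes out as "99Ers" because the leading digits are skipped as uncased, and "a.m." comes out as "A.m." because "a.m" is treated as one word. A segmentation tuned for titlecasing could, for example, break at the period or skip words that begin with a digit; which of those is right is the open question in this thread.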


This archive was generated by hypermail 2.1.5 : Wed Feb 23 2011 - 10:03:19 CST