The Unicode Consortium Discussion Forum


All times are UTC - 6 hours [ DST ]





 Post subject: Combining marks & ® sign.
Posted: Tue Jul 03, 2012 6:39 pm

Joined: Mon Feb 01, 2010 6:18 pm
Posts: 79
UTR #50 revision 5 wrote:
There is actually one character for which a contextual determination would be useful and reliable: U+00AE ® REGISTERED SIGN, which can occur both following terms in kanji/kana and following terms in Latin. An occurrence of ® should be assigned the same class as the character it follows. Others? Enough to warrant the complexity of contextual rules?


I'm wondering whether a value of "I" (inherited) would work properly not only for the ® sign, but also for any combining marks. You could then set NBSP as SVO=R and the ideographic space as SVO=U to allow combining marks to be exhibited in isolation, and their orientation would simply follow from the inherited value. We could even go with In/Ip (inherit next/previous) to accommodate open/close quotation marks. Does that seem like a reasonable solution to anyone else?
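
To make the idea concrete, here is a rough Python sketch of how In/Ip resolution might work. Everything in it is an assumption for illustration: the `toy_svo` table, the `resolve_svo` function, and the class assignments (including treating the closing quote as Ip and the opening quote as In) are hypothetical and are not taken from the UTR #50 data. Unresolvable inherited values fall back to R, which is also just a guess.

```python
import unicodedata

UPRIGHT, ROTATED = "U", "R"
INHERIT_PREV, INHERIT_NEXT = "Ip", "In"


def toy_svo(ch):
    """Toy property lookup. Assumed values only, not the real UTR #50 data."""
    if ch == "\u00A0":  # NBSP: carrier for a rotated isolated mark
        return ROTATED
    if ch == "\u3000":  # ideographic space: carrier for an upright isolated mark
        return UPRIGHT
    if ch in ("\u00AE", "\u201D") or unicodedata.category(ch).startswith("M"):
        return INHERIT_PREV  # registered sign, closing quote, combining marks
    if ch == "\u201C":
        return INHERIT_NEXT  # opening quote
    cp = ord(ch)
    if 0x3040 <= cp <= 0x30FF or 0x4E00 <= cp <= 0x9FFF:
        return UPRIGHT  # kana and a block of CJK ideographs
    return ROTATED  # everything else (e.g. Latin) in this toy table


def resolve_svo(text, default=ROTATED):
    """Resolve Ip/In to U or R; `default` is used when there is no neighbor."""
    classes = [toy_svo(ch) for ch in text]
    # Forward pass: Ip takes the class of the nearest preceding resolved character.
    prev = default
    for i, c in enumerate(classes):
        if c == INHERIT_PREV:
            classes[i] = prev
        elif c != INHERIT_NEXT:
            prev = c
    # Backward pass: In takes the class of the nearest following resolved character.
    nxt = default
    for i in range(len(classes) - 1, -1, -1):
        if classes[i] == INHERIT_NEXT:
            classes[i] = nxt
        else:
            nxt = classes[i]
    return classes
```

With this toy table, `resolve_svo("東京®")` gives `['U', 'U', 'U']` and `resolve_svo("Tokyo®")` gives all `'R'`, so the ® simply follows the term before it. A combining mark directly after an opening quote is not handled specially here.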


 Post subject: Re: Combining marks & ® sign.
Posted: Wed Jul 04, 2012 8:09 am

Joined: Wed Dec 07, 2011 3:01 am
Posts: 71
While I agree that context-dependent orientation is a nice feature, I'm not a big fan of defining it in Unicode.

I think it's a good application feature. Applications face users and can confirm whether an automatic translation matches the user's intention. Once an application has determined that the translation is correct, it should be able to persist the confirmed orientation, and that orientation should stay unchanged once persisted. I think Unicode is responsible for the persistence, so that it won't change for decades or even longer.

Consider the smart quotes that many applications implement today. The user types U+0022. The application suggests an opening or closing quote depending on context, and then persists it as U+201C or U+201D. It never changes once persisted.
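
As a rough illustration of that flow, here is a minimal Python sketch. The function name `smarten_quotes` and the heuristic (opening quote at the start of the text or after whitespace or an opening bracket, closing quote otherwise) are my own assumptions, not how any particular application does it.

```python
def smarten_quotes(text):
    """Replace each U+0022 with U+201C or U+201D using a simple context guess."""
    out = []
    for i, ch in enumerate(text):
        if ch == '"':
            # Treat the start of the text like a preceding space.
            prev = text[i - 1] if i > 0 else " "
            is_opening = prev.isspace() or prev in "([{"
            out.append("\u201C" if is_opening else "\u201D")
        else:
            # U+201C / U+201D that were already persisted pass through untouched.
            out.append(ch)
    return "".join(out)
```

Running the function a second time on its own output changes nothing, because no U+0022 is left to reinterpret. That is the persistence point: the contextual guess happens once, in the application, and the stored text stays stable afterwards.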

Because UTR #50 is the lowest building block of the rendering layers, I think not being too smart works better.

This separation lets applications improve their logic without involving Unicode. When MS Word introduced smart quotes and other AutoCorrect features, Microsoft kept improving them across multiple versions. Such logic might look simple at first glance, but it becomes much more complicated once you consider more use cases and multiple scripts. I think it's an area in which to improve and compete, rather than to standardize.

