Re: Announcing The Unicode® Standard, Version 8.0
charupdate at orange.fr
Sat Jun 20 04:32:47 CDT 2015
This is intrinsically the nicest version announcement in the whole history of Unicode, because of the opportune use of the newly encoded emoji U+1F37E BOTTLE WITH POPPING CORK. Even if I wouldnʼt drink whatʼs in it, nor eat any more U+1F9C0 CHEESE WEDGE (you know Iʼve become a vegan between my beta feedback* and now), I was already very pleased when Unicode started adding emoji, and Iʼm even more pleased now that emoji are thriving and covering the complete cultural range.
I had intended not to write to the List for a while, but Iʼve got some other topics I need to discuss. And, first, it would be a pity if there were no follow-up in this 8.0.0 version announcement thread (even if it wasnʼt sent as a "new topic" to discuss).
* On Fri Apr 24 12:51:50 CDT 2015, I wrote:
> U+1F9C0 CHEESE WEDGE and Translations of the Code Charts
> Dear Unicode Consortium,
> I'm pleased to read the Feedback from Mr Lawson and would join my
> congratulations to his' [apostrophe mistake; read: his].
> The Cheese Wedge symbol he underscores, recalls me
> the new sets have already been translated to French [...].
> More precisely about the Cheese Wedge, I'm glad to see unbloody, no-slaughter
> food is now strongly promoted and is given a fabulous opportunity of becoming
> a wide-spread cultural phenomenon.
[Alas! That turned out not to be so pleasing at all.]
I know, the purpose of this Mailing List is encoding and implementation, not civilisation. Thatʼs why Iʼve made up a new keyboard layout for the United Kingdom. It should help British users to produce fully processable Unicode text files straight away. In particular, quotation marks can be converted to US usage simply by running two search-and-replace-all operations.
Iʼve called this keyboard layout ‘typographic’ because U+02BC MODIFIER LETTER APOSTROPHE is now inserted by default, while U+0027 (for smart quotes and names of archive files) is obtained with AltGr. That shift state is already present in the shipped layout, and it now also holds all comma (and angle) quotation marks for use in English and Welsh, along with em and en dashes. (As is well known, Welsh is the locale of the UK extended keyboard layout shipped with Windows, which this driver is based on.)
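As an illustration of the conversion to US usage: once every apostrophe is U+02BC, each U+2019 in the text can only be a closing quotation mark, so single and double comma quotation marks can be exchanged mechanically. The following is only a sketch in Python, not part of the driver. The function name and the sample sentence are made up for the example, and it assumes British nesting (single marks outside, double marks inside). It does the exchange in one pass with a translation table, so that nested quotations are swapped rather than overwritten.

```python
# Sketch only: swap single and double comma quotation marks in one pass.
# Assumes apostrophes are U+02BC, so U+2019 is always a closing quote.
UK_TO_US = str.maketrans({
    "\u2018": "\u201C",  # opening single -> opening double
    "\u2019": "\u201D",  # closing single -> closing double
    "\u201C": "\u2018",  # opening double -> opening single
    "\u201D": "\u2019",  # closing double -> closing single
})


def uk_to_us_quotes(text: str) -> str:
    """Return text with British quote nesting exchanged for US nesting."""
    # U+02BC is not in the table, so apostrophes pass through untouched.
    return text.translate(UK_TO_US)


# Hypothetical sample: outer single quotes, inner double quotes, U+02BC apostrophe.
sample = "\u2018I don\u02BCt say \u201Cnever\u201D,\u2019 she said."
print(uk_to_us_quotes(sample))
```

The sketch leaves the position of commas and full stops relative to the closing mark as it is; moving them to follow US practice would be a separate editorial step.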
If a header or a readme can be provided with the input text, the use of U+02BC for the apostrophe should be mentioned there, until all software has been updated (by adding U+02BC to the equivalence class for U+0027). As collateral damage of legacy practice, a search for an apostrophe-containing word currently fails when U+0027 is typed in the search bar while the matching words in the text are accurately spelled with U+02BC.
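What adding U+02BC to the equivalence class for U+0027 could look like in a search routine is sketched below in Python. This is an assumption about one possible approach, not a description of any existing software: both the search term and the text are folded to U+0027 before comparison. The names APOSTROPHE_EQUIVALENTS, fold_apostrophes and find_all are hypothetical.

```python
# Sketch only: treat U+02BC as equivalent to U+0027 when searching.
# Further characters could be appended to this string if desired.
APOSTROPHE_EQUIVALENTS = "\u02BC"
FOLD_TABLE = str.maketrans({ch: "'" for ch in APOSTROPHE_EQUIVALENTS})


def fold_apostrophes(text: str) -> str:
    """Map every equivalent apostrophe to U+0027 (a one-to-one mapping)."""
    return text.translate(FOLD_TABLE)


def find_all(needle: str, haystack: str) -> list[int]:
    """Return the offsets of needle in haystack, ignoring apostrophe variants."""
    folded_needle = fold_apostrophes(needle)
    folded_haystack = fold_apostrophes(haystack)
    hits = []
    start = folded_haystack.find(folded_needle)
    while start != -1:
        hits.append(start)
        start = folded_haystack.find(folded_needle, start + 1)
    return hits


text = "I wouldn\u02BCt drink it, and you wouldn\u02BCt either."
print(text.find("wouldn't"))       # plain search with U+0027 finds nothing
print(find_all("wouldn't", text))  # folded search finds both occurrences
```

Because the folding replaces one character with one character, the offsets returned refer equally to the original text.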
For future readers: For more information about MODIFIER LETTER APOSTROPHE, please look up the thread ‘A new take on the English apostrophe in Unicode’.
This time, the layout is released for the UK only, not for the USA, because disambiguating the apostrophe and the single closing quote seems not to be worthwhile in US English, where single quotation marks are presumably seldom used.
The (again unlicensed) drivers (several architecture versions for all current Windows versions) are available for free download at:
For more information about keyboard drivers and the Microsoft Keyboard Layout Creator, please download your free copy of MSKLC at:
http://www.microsoft.com/en-us/download/details.aspx?id=22339 and look up the Help.
Compatibility extends to Windows 7 and 8:
> Message of 17/06/15 23:17
> From: announcements at unicode.org
> To: announcements at unicode.org
> Cc:
> Subject: Announcing The Unicode® Standard, Version 8.0
> Version 8.0 of the Unicode Standard is now available. It includes 41 new emoji characters (including five modifiers for diversity), 5,771 new ideographs for Chinese, Japanese, and Korean, the new Georgian lari currency symbol, and 86 lowercase Cherokee syllables. It also adds letters to existing scripts to support Arwi (the Tamil language written in the Arabic script), the Ik language in Uganda, Kulango in the Côte d’Ivoire, and other languages of Africa. In total, this version adds 7,716 new characters and six new scripts.
> The first version of Unicode Technical Report #51, Unicode Emoji is being released at the same time. That document describes the new emoji characters. It provides design guidelines and data for improving emoji interoperability across platforms, gives background information about emoji symbols, and describes how they are selected for inclusion in the Unicode Standard. The data is used to support emoji characters in implementations, specifying which symbols are commonly displayed as emoji, how the new skin-tone modifiers work, and how composite emoji can be formed with joiners. The Unicode website now supplies charts of emoji characters, showing vendor variations and providing other useful information.
> The 41 new emoji in Unicode 8.0 include the following:
> five emoji modifiers
> Faces and Hands
> NERD FACE, FACE WITH ROLLING EYES, ROBOT FACE
> HOT DOG, TACO, CHEESE WEDGE, POPCORN
> CRICKET BAT AND BALL, VOLLEYBALL, BOW AND ARROW
> UNICORN FACE, LION FACE, CRAB, SCORPION
> MOSQUE, SYNAGOGUE, PRAYER BEADS
> (For the full list, including images, see emoji additions for Unicode 8.0.)
> Phones and computers often need operating system updates to support new emoji, which may take some time. It is also now clear which existing characters, such as the often requested SHOPPING BAGS, can be used as emoji. Once phones and computers support these characters, people will be able to see colorful images such as the BOTTLE WITH POPPING CORK above.
> Three other important Unicode specifications are updated for Version 8.0:
> UTS #10, Unicode Collation Algorithm, for sorting Unicode text
> UTS #39, Unicode Security Mechanisms, for reducing Unicode spoofing
> UTS #46, Unicode IDNA Compatibility Processing, for compatible processing of non-ASCII URLs
> Some of the changes in Version 8.0 and associated Unicode technical standards may require modifications in implementations. For more information, see Unicode 8.0 Migration and the migration sections of UTS #10, UTS #39, and UTS #46. For full details on Version 8.0, see Unicode 8.0.