Re: Concise term for non-ASCII Unicode characters

From: Steve Swales <steve_at_swales.us>
Date: Sun, 20 Sep 2015 10:59:52 -0700

Exactly. I think the reason that "non-ASCII" feels non-concise is that there is widespread confusion between ASCII and Latin-1/ISO 8859-1 (which in turn is widely confused with Windows-1252).
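
To make that confusion concrete, here is a minimal Python sketch (an illustration only, not anything from a standard). It assumes Python's built-in "ascii", "latin-1" and "cp1252" codecs, and the byte 0x80 is just an arbitrary example of a value above the 7-bit range. The same byte is invalid in ASCII and decodes to different characters in the other two encodings:

```python
raw = b"\x80"  # a single byte just outside the 7-bit ASCII range (0x00-0x7F)

# ASCII defines only 0x00-0x7F, so this byte cannot be decoded as ASCII.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

latin1 = raw.decode("latin-1")  # ISO 8859-1: U+0080, a C1 control character
cp1252 = raw.decode("cp1252")   # Windows-1252: U+20AC, the euro sign

print(ascii_ok, f"U+{ord(latin1):04X}", f"U+{ord(cp1252):04X}")
# prints: False U+0080 U+20AC
```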

-steve


> On Sep 20, 2015, at 10:05 AM, Phillips, Addison <addison_at_lab126.com> wrote:
>
> I agree, although I note that sometimes the additional (redundant) specificity of "non-7-bit-ASCII characters" is needed when talking to people unclear on what "ASCII" means.
>
> Addison
>
>> -----Original Message-----
>> From: Unicode [mailto:unicode-bounces_at_unicode.org] On Behalf Of Peter
>> Constable
>> Sent: Sunday, September 20, 2015 9:52 AM
>> To: Sean Leonard; unicode_at_unicode.org
>> Subject: RE: Concise term for non-ASCII Unicode characters
>>
>> You already have been using "non-ASCII Unicode", which is about as concise
>> and sufficiently accurate as you'll get. There's no term specifically defined in
>> any standard or conventionally used for this.
>>
>>
>> Peter
>>
>> -----Original Message-----
>> From: Unicode [mailto:unicode-bounces_at_unicode.org] On Behalf Of Sean
>> Leonard
>> Sent: Sunday, September 20, 2015 7:48 AM
>> To: unicode_at_unicode.org
>> Subject: Concise term for non-ASCII Unicode characters
>>
>> What is the most concise term for characters or code points outside of the
>> US-ASCII range (U+0000 - U+007F)? Sometimes I have referred to these as
>> "extended characters" or "non-ASCII Unicode" but I do not find those terms
>> precise. We are talking about the code points U+0080 - U+10FFFF. I suppose
>> that this also refers to code points/scalar values that are not formally
>> Unicode characters, such as U+FFFF. Basically, I am looking for a concise term
>> for values that would require multiple UTF-8 octets if encoded in UTF-8
>> (without referring to UTF-8 encoding specifically).
>> "Non-ASCII" is not precise enough since character sets like Shift-JIS are
>> non-ASCII.
>>
>> Also a citation to a relevant standard (whether Unicode or otherwise) would
>> be helpful.
>>
>> The terms "supplementary character" and "supplementary code point" are
>> defined in the Unicode standard, referring to characters or code points
>> above U+FFFF. I am looking for something like those, but for characters or
>> code points above U+007F.
>>
>> Thank you,
>>
>> Sean
>
>
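
For reference, the range in the original question (U+0080 - U+10FFFF) can be checked mechanically. The following is a rough Python sketch under stated assumptions: the helper name is_non_ascii is hypothetical, and the sample characters are arbitrary. It shows that a code point is above U+007F exactly when its UTF-8 form needs more than one octet. Surrogate code points are left out because Python's UTF-8 codec refuses to encode them.

```python
def is_non_ascii(ch: str) -> bool:
    """Hypothetical helper: True if the code point lies above U+007F."""
    return ord(ch) > 0x7F

# Arbitrary samples: two ASCII code points, then several above U+007F,
# including the noncharacter U+FFFF and one supplementary code point.
samples = ["A", "\u007f", "\u0080", "\u00e9", "\u20ac", "\uffff", "\U0001F600"]

for ch in samples:
    octets = len(ch.encode("utf-8"))
    # Above U+007F <=> more than one UTF-8 octet.
    assert is_non_ascii(ch) == (octets > 1)
    print(f"U+{ord(ch):04X}: {octets} octet(s)")
```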
Received on Sun Sep 20 2015 - 13:01:22 CDT
