From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu Jan 29 2004 - 17:40:38 EST
RE: problem - non-ASCII characters on Windows command line
From: Mike Ayers
To: Deepak Chand Rathore ; unicode
Sent: Thursday, January 29, 2004 7:34 PM
Subject: RE: problem - non-ASCII characters on Windows command line
> From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]On
> Behalf Of Markus Scherer
> Sent: Thursday, January 29, 2004 8:51 AM
> As I said in my earlier email, I would try the Windows
> command line window (DOS prompt window) and
> set it to Unicode mode via "chcp 10000".
>
> I just tried this on Windows 2000, and pasting Unicode
> characters (that are not in the OEM codepage)
> from the character map does not work. It appears to perform a
> conversion from Unicode to the OEM
> codepage (and then back out).
I see a similar thing on Win2K server.
> My other machine has Windows XP. There, the same experiment
> works - I can paste non-Latin-1 accented
> Latin characters, Greek, the Euro symbol, etc.
It does not work in XP either: the default console codepage set by my French
keyboard driver is CP-850. If I paste an "é" after switching with
"CHCP 10000", what I see is an "Ä", i.e. the pasted code point U+00E9 (Latin
small letter e with acute) is displayed as if it were the CP-850 code 0x8E
(U+00C4: Latin capital letter a with diaeresis).
Note that even displaying the current codepage status uses the wrong
characters:
C:\>MODE CON /STATUS
âtat du périphérique CON:
-------------------------
Lignes?: 300
Colonnes?: 80
Vitesse clavier?: 31
DÄlai clavier?: 1
Page de codes?: 10000
where "?" is the box-drawing character coded 0xCA in codepage 850 (i.e.
U+2569, box drawings double up and horizontal), which appears instead of the
expected non-breaking space U+00A0 (if someone understands why this
box-drawing character appears, please explain; I can't find the rationale).
Note also the other wrong characters: "É" is incorrectly displayed as "â",
and "é" is incorrectly displayed as "Ä".
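One possible rationale can be checked with a short sketch. It is not a
confirmed diagnosis. It assumes that code page 10000 is the Windows
identifier for Mac Roman (not a Unicode mode), that the console converts each
stored character to that code page, and that the raster font then treats the
resulting byte as a CP-850 glyph index. The helper name shown_by_raster_font
is hypothetical and exists only for this illustration.

```python
# Assumptions (not stated in the original message):
#   - "CHCP 10000" selects Mac Roman, Python codec "mac_roman"
#   - the legacy raster font is indexed by CP-850 byte values, codec "cp850"

def shown_by_raster_font(ch, active="mac_roman", font="cp850"):
    """Encode ch in the active code page, then read that byte as a font index."""
    return ch.encode(active).decode(font)

# The three characters discussed above: É, é and the non-breaking space.
for ch in ("\u00c9", "\u00e9", "\u00a0"):
    byte = ch.encode("mac_roman")[0]
    glyph = shown_by_raster_font(ch)
    print(f"U+{ord(ch):04X} -> 0x{byte:02X} -> U+{ord(glyph):04X}")
```

Under these assumptions the sketch maps U+00C9 to 0x83 (shown as U+00E2, "â"),
U+00E9 to 0x8E (shown as U+00C4, "Ä"), and U+00A0 to 0xCA (shown as U+2569),
which matches all three glyphs observed on screen.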
Even stranger, I can select and copy what is displayed on screen, paste it
into a Windows GUI app, such as the email program I'm using to compose this
message, and get the correct characters:
État du périphérique CON:
-------------------------
Lignes : 300
Colonnes : 80
Vitesse clavier : 31
Délai clavier : 1
Page de codes : 10000
So it seems that although the characters are not displayed correctly, they
are stored correctly in the Console display buffer.
This seems to be an effect of the currently selected font in the Console
display: if this font is the default legacy raster font built for Console
apps (built for CP-850 on my system), it will always incorrectly display
Unicode characters stored in the display buffer.
So I suppose that the console stores the Unicode characters correctly, but
fails to convert them into font indices when the font is a legacy raster
font for console apps (and I don't understand how it can produce such a bogus
display, given that the raster font really contains the correct characters,
even if it requires a conversion from Unicode to the default OEM codepage
for which the font was designed).
The bug then remains with the display of the Windows console with legacy
raster fonts.
A solution is to select a monospaced TrueType font (such as "Lucida
Console", which is clean to read in Bold style at 12 point size) in the
Console properties menu. Does Microsoft know about this bug in rendering with
its own legacy raster fonts, which are selected by default for its own
Windows console?