On 30/10/2013 15:34, Frédéric Grosshans wrote:
> On 29/10/2013 17:15, "Jörg Knappen" wrote:
>> After running this script, a few more things remained:
>> non-normalised accents and some really strange
>> encodings I could not really explain, only guess at their meanings,
>> like
>> s/Ãœ/Ü/g
>> s/Ã‰/É/g
>> s/AÌ€/À/g
>> s/aÌ€/à/g
>> s/EÌ€/È/g
>> s/eÌ€/è/g
>> s/â€ž/„/g
>> s/â€œ/“/g
>> s/ÃŸ/ß/g
>> s/â€™/’/g
>> s/Ã„/Æ/g
>
> It was probably not UTF-8 read as Latin-1 and re-encoded as UTF-8, but
> UTF-8 read as Windows-1252 (
> http://en.wikipedia.org/wiki/Windows-1252 ) and re-encoded as UTF-8.
> Each of the combinations above contains a character absent from Latin-1
> (œ‰€žŸ™„), and some of them are present only in Windows-1252 (‰™„) and
> not in ISO 8859-15 (Latin-9), the other possible mistake.
>
> I've checked that this is consistent with Ü, É and ß, but not with your
> Æ. This double encoding would give Ä:
> Ã„ = Win1252(C3 84) = 110.00011 10.000100 = UTF-8(00011 000100) = Unicode
> 00C4 = Ä (and not Æ)
>
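The byte arithmetic above can be checked mechanically. Here is a minimal sketch in Python; the helper name decode_two_byte_utf8 is made up for illustration and is not from the thread. It unpacks a two-byte UTF-8 sequence by hand, following the 110xxxxx 10yyyyyy layout:

```python
def decode_two_byte_utf8(b1: int, b2: int) -> int:
    """Decode a two-byte UTF-8 sequence 110xxxxx 10yyyyyy by hand."""
    if b1 >> 5 != 0b110 or b2 >> 6 != 0b10:
        raise ValueError("not a two-byte UTF-8 sequence")
    # Keep the 5 payload bits of the lead byte and the 6 of the trail byte.
    return ((b1 & 0b00011111) << 6) | (b2 & 0b00111111)

# The garbled pair "Ã„", mapped back to bytes through Windows-1252.
raw = "Ã„".encode("cp1252")

# Reassemble the code point from the two bytes.
code_point = decode_two_byte_utf8(raw[0], raw[1])
print(raw.hex(" "), f"U+{code_point:04X}", chr(code_point))
```

The two bytes carry 5 + 6 = 11 payload bits, which is why the result fits in 00C4.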
I've also checked the other combinations, including Ì€ = U+0300
COMBINING GRAVE ACCENT, and everything is consistent with Windows-1252,
except your Æ, which should be Ä.
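
For anyone who wants to repeat the check, here is a rough sketch in Python, assuming the damage really is UTF-8 bytes read as Windows-1252 (Python's "cp1252" codec). The names garble, repair and intended are made up for this example. The accented letters are given in decomposed form (base letter plus U+0300), since that is what the garbled sequences correspond to:

```python
# Characters as they presumably stood before the damage (an assumption):
# the accented vowels are base letter + U+0300 COMBINING GRAVE ACCENT.
intended = ["Ü", "É", "A\u0300", "a\u0300", "E\u0300", "e\u0300",
            "„", "“", "ß", "’", "Ä"]

def garble(text: str) -> str:
    """Simulate the damage: UTF-8 bytes misread as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")

def repair(text: str) -> str:
    """Undo it: back to bytes through Windows-1252, then decode as UTF-8."""
    return text.encode("cp1252").decode("utf-8")

garbled = [garble(s) for s in intended]

# True where the garbled form contains a character outside Latin-1,
# i.e. a code point above U+00FF.
needs_cp1252 = [any(ord(c) > 0xFF for c in g) for g in garbled]

for original, damaged in zip(intended, garbled):
    print(repr(damaged), "->", repr(repair(damaged)), repair(damaged) == original)
```

Note that repair only works while every character of the damaged text exists in Windows-1252, and Python's cp1252 codec leaves the bytes 81, 8D, 8F, 90 and 9D undefined, so garble raises an error on text whose UTF-8 form contains one of them. None of the sequences in this thread do.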
     Frédéric
Received on Wed Oct 30 2013 - 09:55:31 CDT