RE: Unicode file reading Problem??

From: Rick Cameron (Rick.Cameron@businessobjects.com)
Date: Fri Dec 09 2005 - 11:31:56 CST


    Can you explain your situation some more?
     
    How is the data encoded in the file - as UTF-16 or UTF-32?
    How do you want to store the data in memory - as UTF-16 or UTF-32?
     
    If the answer to both questions is UTF-16, I don't think there is a real
    problem here. wifstream will read in UTF-16 code units and store them
    in (16-bit) wchar_t, without interpreting them. If there are surrogate
    pairs in the text, it is up to you to interpret them correctly after
    they've been read in.
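
    As a rough sketch of what interpreting surrogate pairs involves once
    the 16-bit code units are in memory: the function below combines each
    high/low surrogate pair into a single code point. The name
    decode_utf16 and the choice to substitute U+FFFD for an unpaired
    surrogate are assumptions made for illustration; this is not code
    from ICU or from the Microsoft runtime.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: turn UTF-16 code units into Unicode code points.
// Policy assumed here: an unpaired surrogate becomes U+FFFD.
std::vector<std::uint32_t> decode_utf16(const std::vector<std::uint16_t>& units) {
    std::vector<std::uint32_t> out;
    for (std::size_t i = 0; i < units.size(); ++i) {
        std::uint32_t u = units[i];
        if (u >= 0xD800 && u <= 0xDBFF) {
            // High (lead) surrogate: it must be followed by a low surrogate.
            if (i + 1 < units.size() &&
                units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
                std::uint32_t lo = units[++i];
                out.push_back(0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00));
            } else {
                out.push_back(0xFFFD);  // lead surrogate with no partner
            }
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            out.push_back(0xFFFD);      // low surrogate with no lead before it
        } else {
            out.push_back(u);           // ordinary BMP code unit
        }
    }
    return out;
}
```

    For example, the code units 0xD834 0xDD1E decode to the single code
    point U+1D11E, while anything outside the surrogate range passes
    through unchanged.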
     
    You may want to look at ICU (http://icu.sourceforge.net/). It provides a
    lot of help in dealing with Unicode, and is much more up-to-date than
    the Microsoft runtime libraries.
     
    Cheers
     
    - rick

    ________________________________

            From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf Of Sajal Maity
            Sent: Friday, 9 December 2005 2:42
            To: unicode@unicode.org
            Subject: Unicode file reading Problem??
            
            
            Hi All,
            I am having a problem reading a Unicode file using
            wistream/wifstream in VC6. VC's wchar_t doesn't support Unicode
            characters that need more than 2 bytes, i.e. those in the
            surrogate area. My task is to read any type of UTF text file. I
            got it working using CFile in MFC, but there I used
            WideCharToMultiByte and its reverse, MultiByteToWideChar. I am
            still unsure whether those functions convert correctly for every
            UTF.
             
            Is there any other way to read Unicode characters?
            Does the Unicode Consortium itself provide any ready-made functions?
             
            Bye...
             


