The Unicode Consortium Discussion Forum


All times are UTC - 6 hours [ DST ]




 Post subject: Unicode in different machine
Posted: Thu Jul 11, 2013 5:16 am
I would like to know whether Unicode has to be installed on a machine before downloading text from a Unicode-based web site. I developed a program that extracts text (Unicode text, in this case) from the web. It works well on my laptop (Windows), since my laptop has Unicode installed. I ran the same program on another machine (Linux) and extracted the text. I am not sure whether the Linux machine has Unicode installed, but since it runs a recent Linux release I presume that it does. However, when I open the file on the Linux machine, I see strange characters (maybe because Unicode is not installed). I then copied the downloaded file to my laptop and opened it there; it shows the same strange characters rather than the Unicode text I expected.
My questions are:
1. Is Unicode machine dependent?
2. Are UTF-8 and UTF-16 different character encodings?

Thanks in advance,
Shrestha


 Post subject: Re: Unicode in different machine
Posted: Tue Sep 24, 2013 2:15 am
nshresthan wrote:
2. Are UTF-8 and UTF-16 different character encodings?

Yes and no.

As a set of numbers representing characters (code points), there is only one encoding. However, since these numbers need up to 21 bits, there are different ways of storing them: as a sequence of 8-bit units (UTF-8) or as a sequence of 16-bit units (UTF-16). To that extent, they are different character encodings.
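
To make this concrete, here is a small Python sketch (my own illustration, not anything from the Unicode Standard itself). The sample characters and the helper name code_units are arbitrary choices. It prints the same code points stored both ways:

```python
# Sample characters are arbitrary; escapes are used so the source file
# itself does not depend on any particular encoding.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

def code_units(ch):
    """Return (code point, UTF-8 bytes, UTF-16 big-endian bytes) for one character."""
    return ord(ch), ch.encode("utf-8"), ch.encode("utf-16-be")

for ch in samples:
    cp, u8, u16 = code_units(ch)
    # Same code point, two different byte sequences.
    print(f"U+{cp:04X}  UTF-8: {u8.hex()}  UTF-16: {u16.hex()}")
```

The code point is identical in both columns; only the stored bytes differ. A file written in one form and read as the other will usually show up as strange characters.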

nshresthan wrote:
1. Is Unicode machine dependent?

The code points themselves are the same on every machine; what can vary is how they are stored. When viewed as a sequence of bytes, there are two main ways of storing a 16- or 32-bit number: most significant byte first ('big-endian'), or least significant byte first ('little-endian'). If the order is not specified, for example by a byte order mark or by what Unicode calls a 'higher-level protocol', software may fall back on a machine-dependent default.
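
As a rough illustration of the byte-order point, the following Python sketch encodes one arbitrarily chosen character (the euro sign) in both orders. It assumes Python 3, whose plain "utf-16" codec writes a byte order mark followed by the machine's native order:

```python
import sys

text = "\u20ac"  # EURO SIGN, U+20AC (arbitrary sample character)

big = text.encode("utf-16-be")     # most significant byte first: 20 ac
little = text.encode("utf-16-le")  # least significant byte first: ac 20
marked = text.encode("utf-16")     # byte order mark + native byte order

print("big-endian:   ", big.hex())
print("little-endian:", little.hex())
print("with BOM:     ", marked.hex(), "(native order:", sys.byteorder + ")")

# With a byte order mark, a reader on any machine recovers the text.
roundtrip = marked.decode("utf-16")

# Without one, guessing the wrong order silently gives another character.
wrong = big.decode("utf-16-le")
print(roundtrip == text, wrong == text)
```

UTF-8 does not have this problem, because its units are single bytes and so there is no byte order to choose.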

