Re: BOCU patent (was: Re: Medievalist ligature character in the PUA)

From: Doug Ewell (doug@ewellic.org)
Date: Fri Dec 18 2009 - 08:39:28 CST

Next message: verdy_p: "re: Is there a Japanese character for the word Unicode? (from Re: Unicode Haiku Contest)"

Previous message: Otto Stolz: "Re: Medievalist ligature character in the PUA"
In reply to: verdy_p: "re: BOCU patent (was: Re: Medievalist ligature character in the PUA)"
Next in thread: verdy_p: "Re: BOCU patent (was: Re: Medievalist ligature character in the PUA)"
Reply: verdy_p: "Re: BOCU patent (was: Re: Medievalist ligature character in the PUA)"
Reply: Michael D'Errico: "Re: BOCU patent"
Reply: Peter Krefting: "HTML5 encodings (was: Re: BOCU patent)"

"verdy_p" <verdy underscore p at wanadoo dot fr> wrote:

> Separate ranges has a benefit: it allows fast text search algorithms
> to work reliably as it allows easy resynchronisation from random
> positions.

Separate ranges for lead and trail code units are a fundamental feature
of UTF-8 and UTF-16. I don't remember seeing a claim about separate
ranges in the BOCU patent, but one would think an attempt to claim them
as an innovation would be untenable.
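
To illustrate the resynchronisation property, here is a minimal Python
sketch, not taken from any specification or library. The function names
resync_utf8 and resync_utf16 are made up for this example. It relies only
on the facts that UTF-8 continuation bytes always lie in 0x80..0xBF and
that UTF-16 trail surrogates always lie in 0xDC00..0xDFFF, so a reader
dropped at a random offset can skip forward to the next boundary.

```python
from typing import Sequence


def resync_utf8(data: bytes, pos: int) -> int:
    """Return the first code point boundary at or after pos (hypothetical helper)."""
    # Continuation bytes have the bit pattern 10xxxxxx (0x80..0xBF).
    # ASCII bytes and lead bytes never fall in that range, so skipping
    # continuation bytes is enough to land on a boundary.
    while pos < len(data) and (data[pos] & 0xC0) == 0x80:
        pos += 1
    return pos


def resync_utf16(units: Sequence[int], pos: int) -> int:
    """Same idea for a sequence of 16-bit code units (hypothetical helper)."""
    # A trail (low) surrogate is 0xDC00..0xDFFF; at most one has to be skipped.
    if pos < len(units) and 0xDC00 <= units[pos] <= 0xDFFF:
        pos += 1
    return pos


if __name__ == "__main__":
    data = "naïve €uro".encode("utf-8")
    start = resync_utf8(data, 8)  # offset 8 is inside the 3-byte euro sign
    print(start, data[start:].decode("utf-8"))
```

The same test cannot be made on a single byte of BOCU-1 or SCSU, because
in those encodings the meaning of a byte depends on what came before it.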

> I did not know that HTML5 *forbidded* supporting some MIME-registered
> charsets.
>
> Do you mean instead that it forbids recognizing automatically when the
> charset is unknown (not specified by the resource server, and not
> specified with the source link) and must be guessed from the bytes
> content of the stream ?

From
http://www.w3.org/TR/html5/infrastructure.html#character-encodings-0 :

"User agents must not support the CESU-8, UTF-7, BOCU-1 and SCSU
encodings."

Amazing, isn't it? So thoughtful of the HTML 5 WG to protect
developers' time by prohibiting a handful of selected encodings. I can
support Fieldata or PTTC/EBCD in my user agent if I want to, but not
UTF-7 or SCSU.
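
For what it's worth, a user agent following that sentence to the letter
might contain something like the sketch below. This is only an
illustration: the names FORBIDDEN_ENCODINGS and may_support are invented
here, and real encoding-label matching in a browser is more involved
than a lowercase comparison.

```python
# The four encodings named in the quoted HTML5 sentence, lowercased.
FORBIDDEN_ENCODINGS = frozenset({"cesu-8", "utf-7", "bocu-1", "scsu"})


def may_support(label: str) -> bool:
    """Return False for the encodings the quoted text prohibits (hypothetical helper)."""
    # Assumption: labels are compared case-insensitively after trimming
    # surrounding whitespace. Aliases are not handled in this sketch.
    return label.strip().lower() not in FORBIDDEN_ENCODINGS


if __name__ == "__main__":
    for label in ("UTF-8", "UTF-7", "SCSU", "windows-1252"):
        print(label, may_support(label))
```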

> You don't have to use ICU actually. ICU components can be fully
> isolated and rewritten in any other language. But you have to include
> its licence as your new work will be a derived work based on a
> copyrighted work, even if it does not use any piece of its source
> code.

Right. So suppose I want to implement BOCU-1 from scratch, possibly in
an attempt to speed up encoding or decoding. I can't do it without
asking IBM for a license. (Note that I haven't actually looked at the
ICU code to see if it is already optimally fast. You get the point.)

> Almost all softwares today include several copyright notices

I'm not interested, for the moment, in the copyright notices attached to
software or libraries or other development tools. BOCU-1 is a
compression encoding, a relatively straightforward way (compared to gzip
and such) to represent Unicode characters as a sequence of bytes,
similar to UTF-8 and -7 and -16 and -32 and SCSU and
ASCII-with-XML-entities and all the rest. But only BOCU-1 among these
requires me to even think about licenses.
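
As a small illustration of "characters as a sequence of bytes", the
Python sketch below encodes one string with several of the schemes
mentioned above, using only codecs that ship with Python. The function
name encode_all and the sample string are made up for this example.
Python's standard library has no BOCU-1 or SCSU codec, so those two are
left out.

```python
def encode_all(text: str) -> dict:
    """Encode text with several standard-library codecs (hypothetical helper)."""
    return {
        "UTF-8": text.encode("utf-8"),
        "UTF-7": text.encode("utf-7"),
        # The -le variants are used so that no byte order mark is prepended.
        "UTF-16": text.encode("utf-16-le"),
        "UTF-32": text.encode("utf-32-le"),
        # Non-ASCII characters become numeric character references.
        "ASCII with XML entities": text.encode("ascii", errors="xmlcharrefreplace"),
    }


if __name__ == "__main__":
    for name, data in encode_all("Ünïcödé").items():
        print(f"{name:24} {len(data):3} bytes  {data!r}")
```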

> For this reason, I don't consider the ICU licence intrusive and
> blocking, and BOCU-1 as provided through ICU, is both a free (FSF
> definition) and open (OSI definition) software which does not restrict
> rewriting it completely.

I haven't read the ICU license thoroughly, but I'd be surprised if
"rewriting it completely" is allowed.

--
Doug Ewell  |  Thornton, Colorado, USA  |  http://www.ewellic.org
RFC 5645, 4645, UTN #14  |  ietf-languages @ http://is.gd/2kf0s

This archive was generated by hypermail 2.1.5 : Fri Dec 18 2009 - 08:40:45 CST