RE: UTF-7: help me understand - a-ha!

From: Mike Brown (
Date: Wed Apr 12 2000 - 14:20:16 EDT

Deborah Goldsmith wrote:
> 3. Divide into sextets, padding to a *sextet* boundary:
> 001000 000010 0010oo
> Does that make it more clear?

Yes, definitely.

My confusion stems in part from the terminology used in the RFCs. The use of
the words "quantum" and "integral" only tells me that I didn't pay enough
attention in Calculus class 12 years ago.

Also, there is this passage in the UTF-7 RFCs:

"...the octet stream is encoded by applying the Base64 content transfer
encoding algorithm as defined in RFC 1521, modified to omit the "=" pad
character. Instead, when encoding, zero bits are added to pad to a Base64
character boundary."

The Base64 encoding algorithm says to first divide the octet stream into
sextets (although they don't use that word for some reason) and then to
group those sextets into input groups of 24 bits (4 sextets). Then it says:

"When fewer than 24 input bits are available in an input group, zero bits
are added (on the right) to form an integral number of 6-bit groups.
Padding at the end of the data is performed using the '=' character."
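
For comparison, here is a rough Python sketch of the standard Base64 behavior
described in that passage: octets are taken in 24-bit input groups, a short
final group is zero-filled to a whole number of sextets, and '=' fills out the
4-character output group. The function name base64_rfc1521 and the constant
ALPHABET are made up for illustration; this is one reading of the RFC text,
not a reference implementation.

```python
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def base64_rfc1521(octets: bytes) -> str:
    """Illustrative standard Base64: 24-bit input groups with '=' padding."""
    out = []
    for i in range(0, len(octets), 3):
        group = octets[i:i + 3]                      # up to 24 input bits
        bits = "".join(f"{b:08b}" for b in group)
        # Zero bits are added on the right to reach a whole number of sextets.
        bits += "0" * (-len(bits) % 6)
        chars = [ALPHABET[int(bits[j:j + 6], 2)]
                 for j in range(0, len(bits), 6)]
        # '=' pads the output group out to 4 characters.
        out.append("".join(chars).ljust(4, "="))
    return "".join(out)

print(base64_rfc1521(b"\x20\x22"))  # the two octets of U+2022 in UTF-16BE
```

For the two octets 0x20 0x22 this yields three Base64 characters followed by
one '=' pad, which is exactly the pad character the UTF-7 RFC says to omit.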

So to me it is not obvious that the padding talked about in the UTF-7 RFC is
supposed to be applied at the divide-into-sextets step, nor is it obvious
that there is no group-into-24-bit requirement.
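
The reading Deborah describes can be sketched in Python as follows: take the
UTF-16 (big-endian) octets, treat them as one bit string, zero-pad only as far
as the next sextet boundary, and map each sextet to the Base64 alphabet, with
no 24-bit grouping and no '=' characters. The name utf7_base64_run is
hypothetical, and the sketch covers only the Base64 portion of a shifted
sequence, not a complete UTF-7 encoder.

```python
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def utf7_base64_run(text: str) -> str:
    """Illustrative modified Base64 for one shifted run of UTF-7 text."""
    octets = text.encode("utf-16-be")
    bits = "".join(f"{b:08b}" for b in octets)
    # Pad with zero bits to a sextet boundary only; no '=' is ever emitted.
    bits += "0" * (-len(bits) % 6)
    return "".join(ALPHABET[int(bits[j:j + 6], 2)]
                   for j in range(0, len(bits), 6))

# U+2022 is 0010000000100010, i.e. sextets 001000 000010 0010oo
print(utf7_base64_run("\u2022"))  # prints "ICI"
```

In a real UTF-7 stream this run would sit between the '+' shift character and
a terminating '-', giving "+ICI-".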

I'm not arguing; just rationalizing my confusion. :)


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:01 EDT