Re: When to validate?

From: Doug Ewell (dewell@adelphia.net)
Date: Fri Dec 10 2004 - 10:20:18 CST

    Arcane Jill <arcanejill at ramonsky dot com> wrote:

    > Here's something that's been bothering me. Suppose I write a function
    > - let's call it trim(), which removes leading and trailing spaces from
    > a string, represented as one of the UTFs. If I've understood this
    > correctly, I'm supposed to validate the input, yes?
    >
    > Okay, now suppose I write a second function - let's call it tolower(),
    > which lowercases a string, again represented as one of the UTFs.
    > Again, I guess I'm supposed to validate the input, yes?...

    This is one reason why I work with "strings" of code points, and only
    convert strings of UTF code units when I read them in and write them
    out. The read and write functions do the necessary validation, allowing
    the rest of the code to focus on characters. If you operate directly on
    strings of UTF-8 bytes, you have to worry about things like this.
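    A minimal sketch of that arrangement, in Python purely for
    illustration. The function names read_utf8, write_utf8, trim and
    to_lower are hypothetical, not from any particular library; the only
    assumption is that the strict UTF-8 codec rejects ill-formed input,
    which Python's built-in codec does.

```python
def read_utf8(data: bytes) -> str:
    """Input boundary: decode and validate in one step.

    The strict codec raises UnicodeDecodeError on ill-formed UTF-8
    (overlong forms, truncated sequences, encoded surrogates).
    """
    return data.decode("utf-8", errors="strict")


def write_utf8(text: str) -> bytes:
    """Output boundary: encode code points back to UTF-8 bytes.

    Raises UnicodeEncodeError if the string holds a lone surrogate.
    """
    return text.encode("utf-8", errors="strict")


def trim(text: str) -> str:
    """Remove leading and trailing U+0020 spaces. No validation needed:
    the argument is already a string of code points."""
    return text.strip(" ")


def to_lower(text: str) -> str:
    """Lowercase a string of code points. Again, no validation here."""
    return text.lower()


# Validation happens once on the way in and once on the way out.
raw = b"  \xc3\x89COLE  "          # "  ÉCOLE  " as UTF-8 bytes
result = write_utf8(to_lower(trim(read_utf8(raw))))
```

    Everything between the two boundary calls works on characters and
    never has to look at a byte.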

    To answer your question, if you've already validated your input, and you
    generate only valid output (which I hope is the case :-), and your
    second function ONLY gets (valid) data from your first function, then
    you probably don't need to re-validate it. But I'd hate to have to do
    tolower() for non-Basic-Latin on strings of UTF-8 bytes.
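    To illustrate why byte-level lowercasing is awkward outside Basic
    Latin, here is a small Python sketch. The helper names are
    hypothetical. It relies on two facts: Python's bytes.lower() maps
    only ASCII letters, and U+023A lowercases to U+2C65, which needs
    three UTF-8 bytes instead of two.

```python
def ascii_only_lower(data: bytes) -> bytes:
    """Byte-level lowercasing: bytes.lower() touches only ASCII A-Z,
    so every multi-byte sequence passes through unchanged."""
    return data.lower()


def code_point_lower(data: bytes) -> bytes:
    """Decode (which also validates), lowercase the code points,
    then re-encode."""
    return data.decode("utf-8").lower().encode("utf-8")


e_acute = "\u00c9".encode("utf-8")    # U+00C9, two bytes: C3 89
a_stroke = "\u023a".encode("utf-8")   # U+023A, two bytes: C8 BA

# The byte-level version silently leaves U+00C9 uppercase.
unchanged = ascii_only_lower(e_acute)

# The code-point version lowercases it to U+00E9 (C3 A9).
lowered = code_point_lower(e_acute)

# Lowercasing can also change the encoded length, so an in-place
# rewrite of the byte buffer is not possible in general.
grown = code_point_lower(a_stroke)    # U+2C65, three bytes: E2 B1 A5
```

    A tolower() that works directly on UTF-8 bytes would have to
    decode each sequence, look up the mapping, and cope with the
    result being longer or shorter than the input.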

    For me, conversion from any CES (character encoding scheme) or TES
    (transfer encoding syntax) always implies validation.

    -Doug Ewell
     Fullerton, California
     http://users.adelphia.net/~dewell/