Description of Issue:
This proposed draft Unicode Technical Standard specifies a standard mechanism for detecting URLs embedded in plain text — in particular, detecting URLs containing non-ASCII characters. It also defines the minimally necessary escaping of non-ASCII code points in the Path, Query, and Fragment portions of a URL that aligns with the mechanism for detecting URLs.
Linkification is the process of adding links to URLs in plain text input, such as in emails, text messaging, or video meeting chats. The first step in this process is link detection, which is determining the boundaries of the spans of text that contain URLs. Each such span can then have a link applied to it in the output text. The functions that perform these operations are called a linkifier and a link detector, respectively.
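The split between a link detector and a linkifier can be sketched roughly as follows. This is only an illustration and not the mechanism the draft specifies: the function names `detect_links` and `linkify`, the naive regular expression, and the set of trailing punctuation characters are all assumptions made for the example.

```python
import html
import re

# Assumption: a deliberately naive pattern (scheme, then any run of
# non-whitespace). It is not the detection rule from the draft.
_URL_RE = re.compile(r"https?://\S+")

# Assumption: characters this sketch treats as sentence punctuation
# when they end a match, so they are left outside the link.
_TRAILING = ".,;:!?\"')"


def detect_links(text):
    """Link detector: return (start, end) spans of candidate URLs."""
    spans = []
    for match in _URL_RE.finditer(text):
        start, end = match.span()
        # Trim trailing punctuation that likely belongs to the sentence.
        while end > start and text[end - 1] in _TRAILING:
            end -= 1
        spans.append((start, end))
    return spans


def linkify(text):
    """Linkifier: wrap each detected span in an HTML anchor."""
    out = []
    pos = 0
    for start, end in detect_links(text):
        out.append(html.escape(text[pos:start]))
        url = text[start:end]
        out.append('<a href="{}">{}</a>'.format(html.escape(url, quote=True),
                                                html.escape(url)))
        pos = end
    out.append(html.escape(text[pos:]))
    return "".join(out)


sample = "See https://example.com/résumé, then reply."
spans = detect_links(sample)
linked = linkify(sample)
```

Here the detector keeps the non-ASCII path characters inside the span but drops the comma that follows the URL. A heuristic this simple also trims a closing parenthesis that really belongs to the URL, which is the kind of boundary question a shared specification is meant to settle.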
The URL specifications do not address link detection, since they are concerned only with the structure of a URL in isolation, not with a URL embedded within flowing text. The lack of a clear specification for link detection also causes many implementations to overuse percent escaping for non-ASCII characters when converting URLs into plain text.
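As a rough illustration of that over-escaping, the sketch below uses `quote` and `unquote` from Python's standard `urllib.parse` module. The example path is an assumption chosen for the demonstration, and the snippet does not implement the minimal escaping that the draft defines.

```python
from urllib.parse import quote, unquote

# Assumption: an example path containing two non-ASCII characters.
path = "/wiki/東京"

# quote() percent-encodes every byte of the UTF-8 form of each
# non-ASCII character, leaving "/" untouched by default.
escaped = quote(path)

# unquote() recovers the readable form, showing that no information
# was gained by escaping, only readability lost.
restored = unquote(escaped)

print(escaped)   # /wiki/%E6%9D%B1%E4%BA%AC
print(restored)  # /wiki/東京
```

The escaped form is three times as long for the non-ASCII portion and is unreadable in running text, even though both forms identify the same resource.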
How to Provide Feedback: For information about how to discuss this Public Review Issue and how to supply formal feedback, please see the feedback and discussion instructions. The accumulated feedback received so far on this issue is shown below, or you can look at a full page view.
Feedback is reviewed by the relevant committee according to their meeting schedule.