Commit Graph

2 Commits

Author SHA1 Message Date
Shimeng (Simon) Wang
3207e29d10 Enhance URL regular expression to match more Unicode chars.
Enhance the URL regular expression to match legal one-byte Unicode characters in
Internationalized Resource Identifiers, as detailed in RFC 3987.  Specifically,
two-byte Unicode characters are not included.  Not everything in RFC 3987 is
implemented; this is only an enhancement for recognizing the more commonly used
one-byte Unicode characters.

This change helps the Browser address bar identify more valid URLs typed
without a scheme, such as 현금영수증.kr

make-iana-tld-pattern.py is modified to contain only the Top Level Domain
regular expression generation.  The other parts of the WEB_URL pattern now
live solely in Patterns.java for better consistency and maintenance.
2010-02-11 14:07:44 -08:00
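
To illustrate the kind of matching the commit above describes, here is a minimal sketch in Python. It is not the pattern from Patterns.java. The character ranges are the Basic Multilingual Plane ranges of "ucschar" in RFC 3987, and the names IRI_CHAR, LABEL, TLD, HOST and looks_like_host, as well as the three-entry TLD list, are assumptions made for this example.

```python
import re

# Assumption: BMP ranges of RFC 3987 "ucschar" plus ASCII letters and digits.
# Characters outside the BMP are deliberately left out, as in the commit text.
IRI_CHAR = r"a-zA-Z0-9\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF"

# A host label: starts and ends with an IRI character, hyphens allowed inside.
LABEL = "[" + IRI_CHAR + "](?:[" + IRI_CHAR + r"\-]{0,61}[" + IRI_CHAR + "])?"

# Hypothetical, tiny TLD list; a real list would be generated from IANA data.
TLD = "(?:com|org|kr)"

# One or more dot-terminated labels followed by a TLD, with no scheme required.
HOST = re.compile("(?:" + LABEL + r"\.)+" + TLD)


def looks_like_host(text):
    """Return True if the whole string looks like a scheme-less host name."""
    return HOST.fullmatch(text) is not None
```

With these assumptions, looks_like_host("현금영수증.kr") is accepted because Hangul syllables fall inside the first Unicode range, while a label made of a character outside the BMP is rejected.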
Shimeng (Simon) Wang
56811abc37 Add back lost python script.
The script generates the regular expressions for top level domains.
It has been enhanced and used to regenerate the pattern for the new top level domains.

	new file:   common/tools/make-iana-tld-pattern.py
2010-02-10 11:22:01 -08:00
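
As a rough idea of what generating a top level domain pattern can look like, the following Python sketch builds one alternation from a list of names. It is not the contents of make-iana-tld-pattern.py; the function name make_tld_pattern and the sample input are hypothetical, and reading the names from IANA data is left out.

```python
import re


def make_tld_pattern(tlds):
    """Build a non-capturing alternation that matches any of the given TLDs."""
    # Normalize case and drop duplicates.
    names = {t.lower() for t in tlds}
    # Put longer names first so "com" is tried before "co" in the alternation.
    ordered = sorted(names, key=lambda n: (-len(n), n))
    return "(?:" + "|".join(re.escape(n) for n in ordered) + ")"


# Hypothetical sample input, not the real IANA list.
pattern = make_tld_pattern(["kr", "com", "co", "org", "COM"])
```

Ordering the alternatives longest first matters when the pattern is embedded in a larger expression and used for searching, since the regex engine takes the first alternative that matches.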