Re: The URL regex.



On 2001.05.22 04:24 Brian Stafford wrote:
> On Mon, 21 May 18:22 Albrecht Dreß wrote:
[snip]
> If the not blank character class is retained, some extra characters
> will need adding to the class to improve the reliability of a
> valid match, e.g. the double quote '"', consider
> "http://some.host/some/path".
[snip]
To supplement a carefully designed regex, Balsa could look at the context
of the URL to guess where it ends.  I get mail like this (see
http://www.gnome.org/applist/view.php3?name=Balsa or
http://www.balsa.net/main.html), which is never parsed appropriately (well,
Netscape Messenger trims off the trailing parenthesis, but it does that
even when the parenthesis is part of the URL!).  I could post some code for
matching up delimiter pairs such as "()" and "\"\"", if (a) it doesn't
violate some other principle and (b) you think it's worthwhile.

The idea would be to keep a URL from running past the end of the paren- or
quote-delimited block that contains it.  Other delimiters could be used,
though "<>", "{}", and "[]" are problematic because of their use in quoting.
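
To make the idea concrete, here is a rough sketch.  It is in Python rather
than Balsa's C purely for brevity, and it is not the code I offered to post.
The names CANDIDATE, PAIRS and find_urls are made up for illustration, and
the candidate pattern is just an assumed "scheme plus non-blank characters"
regex, not the one under discussion.

```python
import re

# Assumed candidate pattern: a scheme followed by non-blank characters.
CANDIDATE = re.compile(r"(?:https?|ftp)://\S+")

# Closing delimiter -> opening delimiter.  Only "()" and quotes are handled;
# "<>", "{}" and "[]" are left out deliberately.
PAIRS = {")": "(", '"': '"'}


def find_urls(text):
    """Return candidate URLs, cut off where they would leave an open block."""
    urls = []
    for match in CANDIDATE.finditer(text):
        url = match.group()
        before = text[:match.start()]
        for closer, opener in PAIRS.items():
            # Is a block of this kind still open where the URL starts?
            if opener == closer:
                still_open = before.count(opener) % 2 == 1
            else:
                still_open = before.count(opener) > before.count(closer)
            if not still_open:
                continue
            # Walk the URL and stop at the first closer that is not matched
            # by an opener inside the URL itself.
            depth = 0
            for i, ch in enumerate(url):
                if ch == opener and opener != closer:
                    depth += 1
                elif ch == closer:
                    if depth == 0:
                        url = url[:i]
                        break
                    depth -= 1
        urls.append(url)
    return urls
```

With this, the URL at the end of a "(see ... or ...)" block loses its
trailing parenthesis, while a parenthesis that belongs to a URL outside any
block is kept.  It does nothing about trailing punctuation such as "." or
",", and a stray ")" from a smiley earlier in the message would confuse the
counting, so it is only a starting point.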




