[polari] util: Change URL matching (again)



commit a08ed09fcef81e22805ff144e9442f1984f2c38b
Author: Florian Müllner <fmuellner gnome org>
Date:   Thu Mar 2 00:38:12 2017 +0100

    util: Change URL matching (again)
    
    Commit 094d7d1cc11 changed the URL regex to match any scheme under
    the assumption that false positives would be rare. While this may
    be true for scheme://, it turns out that matching colon-only schemes
    generically often produces matches for random text that is not a
    URL: someone omitting the space after a mention, someone explaining
    that server:port can be used to configure non-default ports, ...
    
    So probably the best we can do is to keep allowing every scheme://,
    and to whitelist a small set of colon-only schemes. (Of course we
    could go back to generating the list from the installed applications
    when not running under flatpak, but let's try to keep the differences
    between sandboxed and unsandboxed runs as small as possible.)
    
    https://bugzilla.gnome.org/show_bug.cgi?id=779449

 src/utils.js |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)
---
diff --git a/src/utils.js b/src/utils.js
index 8e68521..93dc483 100644
--- a/src/utils.js
+++ b/src/utils.js
@@ -54,11 +54,19 @@ const _balancedParens = '\\((?:[^\\s()<>]+|(?:\\(?:[^\\s()<>]+\\)))*\\)';
 const _leadingJunk = '[\\s`(\\[{\'\\"<\u00AB\u201C\u2018]';
 const _notTrailingJunk = '[^\\s`!()\\[\\]{};:\'\\".,<>?\u00AB\u00BB\u201C\u201D\u2018\u2019]';
 
+// schemes that only use a colon cannot be matched generically without producing
+// a lot of false positives, so whitelist some useful ones and hope nobody complains :-)
+const _schemeWhitelist = ['geo', 'mailto', 'man', 'info', 'ghelp', 'help'];
+
 const _urlRegexp = new RegExp(
     '(^|' + _leadingJunk + ')' +
     '(' +
         '(?:' +
-            '(?:[a-z]+):' +                       // scheme:
+            '(?:[a-z]+)://' +                     // scheme://
+            '|' +
+            '(?:' +
+                _schemeWhitelist.join('|') +      // scheme:
+            '):' +
             '|' +
             'www\\d{0,3}[.]' +                    // www.
             '|' +
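
To illustrate the effect of the change, here is a minimal sketch that compares the old generic "scheme:" alternative with the new "scheme://" plus whitelist alternatives. It is not the full regex from src/utils.js: the names schemeWhitelist, genericScheme, restrictedScheme and findUrl are hypothetical, and the leading-junk and URL-body parts are reduced to plain whitespace checks for brevity.

```javascript
// Simplified sketch only; the real _urlRegexp has more alternatives
// (www., balanced parentheses, trailing-junk handling) than shown here.
const schemeWhitelist = ['geo', 'mailto', 'man', 'info', 'ghelp', 'help'];

// Old behavior (simplified): any lowercase word followed by a colon
// is treated as a scheme.
const genericScheme = new RegExp('(^|\\s)((?:[a-z]+):[^\\s]+)');

// New behavior (simplified): either any scheme followed by "://",
// or one of the whitelisted colon-only schemes.
const restrictedScheme = new RegExp(
    '(^|\\s)' +
    '(' +
        '(?:' +
            '(?:[a-z]+)://' +                     // scheme://
            '|' +
            '(?:' +
                schemeWhitelist.join('|') +       // scheme:
            '):' +
        ')' +
        '[^\\s]+' +
    ')');

// Return the first URL-like match in text, or null if there is none.
function findUrl(regexp, text) {
    const match = regexp.exec(text);
    return match ? match[2] : null;
}

console.log(findUrl(genericScheme, 'use server:6697 for TLS'));       // server:6697
console.log(findUrl(restrictedScheme, 'use server:6697 for TLS'));    // null
console.log(findUrl(restrictedScheme, 'write to mailto:user@example.org'));
console.log(findUrl(restrictedScheme, 'see irc://irc.gnome.org/polari'));
```

With the generic pattern, "server:6697" and "nick:hello" are both reported as URLs; with the restricted pattern only scheme:// URLs and the whitelisted colon-only schemes are.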

