[polari] util: Change URL matching (again)
- From: Florian Müllner <fmuellner src gnome org>
- To: commits-list gnome org
- Cc:
- Subject: [polari] util: Change URL matching (again)
- Date: Sat, 4 Mar 2017 21:50:29 +0000 (UTC)
commit a08ed09fcef81e22805ff144e9442f1984f2c38b
Author: Florian Müllner <fmuellner gnome org>
Date: Thu Mar 2 00:38:12 2017 +0100
util: Change URL matching (again)
Commit 094d7d1cc11 changed the URL regex to match any scheme under
the assumption that false positives would be rare. While this may
be true for scheme://, it turns out that matching colon-only schemes
generically often produces matches for text that is not a URL:
someone omitting the space after a mention, someone explaining
that server:port can be used to configure non-default ports, ...
So probably the best we can do is to keep allowing every scheme://, and
then whitelist a small selection of colon-only schemes. (Of course we
could go back to generating the list from the installed applications
when not running under flatpak, but let's try to keep the differences
between sandboxed and unsandboxed as small as possible.)
https://bugzilla.gnome.org/show_bug.cgi?id=779449
src/utils.js | 10 +++++++++-
1 files changed, 9 insertions(+), 1 deletions(-)
---
diff --git a/src/utils.js b/src/utils.js
index 8e68521..93dc483 100644
--- a/src/utils.js
+++ b/src/utils.js
@@ -54,11 +54,19 @@ const _balancedParens = '\\((?:[^\\s()<>]+|(?:\\(?:[^\\s()<>]+\\)))*\\)';
 const _leadingJunk = '[\\s`(\\[{\'\\"<\u00AB\u201C\u2018]';
 const _notTrailingJunk = '[^\\s`!()\\[\\]{};:\'\\".,<>?\u00AB\u00BB\u201C\u201D\u2018\u2019]';
+// schemes that only use a colon cannot be matched generically without producing
+// a lot of false positives, so whitelist some useful ones and hope nobody complains :-)
+const _schemeWhitelist = ['geo', 'mailto', 'man', 'info', 'ghelp', 'help'];
+
 const _urlRegexp = new RegExp(
     '(^|' + _leadingJunk + ')' +
     '(' +
         '(?:' +
-            '(?:[a-z]+):' +                   // scheme:
+            '(?:[a-z]+)://' +                 // scheme://
+            '|' +
+            '(?:' +
+                _schemeWhitelist.join('|') +  // scheme:
+            '):' +
             '|' +
             'www\\d{0,3}[.]' +                // www.
             '|' +
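
The effect of the scheme change can be illustrated in isolation. The following is only a rough sketch, not Polari's actual code: it reduces the pattern to the scheme prefix and anchors it at the start of the string, and the names genericScheme, restrictedScheme and looksLikeUrlStart are made up for this example. Only the whitelist entries are taken from the patch.

```javascript
// Simplified sketch: covers only the scheme prefix, not the full _urlRegexp.
// All identifiers below are hypothetical; the whitelist values come from the patch.
const schemeWhitelist = ['geo', 'mailto', 'man', 'info', 'ghelp', 'help'];

// Previous behaviour, reduced: any lowercase word followed by a colon.
const genericScheme = new RegExp('^(?:[a-z]+):');

// New behaviour, reduced: any scheme://, or a whitelisted colon-only scheme.
const restrictedScheme = new RegExp(
    '^(?:' +
        '(?:[a-z]+)://' +                            // scheme://
        '|' +
        '(?:' + schemeWhitelist.join('|') + '):' +   // whitelisted scheme:
    ')');

// Returns true if the text starts with something the pattern treats as a scheme.
function looksLikeUrlStart(regexp, text) {
    return regexp.test(text);
}

const samples = [
    'https://gnome.org',        // scheme:// URL
    'mailto:user@example.com',  // whitelisted colon-only scheme
    'nick:hello',               // mention without a following space
    'localhost:6667',           // server:port
];

for (const sample of samples) {
    console.log(sample,
                looksLikeUrlStart(genericScheme, sample),
                looksLikeUrlStart(restrictedScheme, sample));
}
```

With the generic pattern all four samples count as a URL start, while the restricted pattern accepts only the first two. The mention and the server:port text are exactly the kind of false positives the commit message describes.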