[orca/gnome-3-38] Web: Fix performance issue resulting from detecting offscreen text brokenness

From: Joanmarie Diggs <joanied src gnome org>
To: commits-list gnome org
Cc:
Subject: [orca/gnome-3-38] Web: Fix performance issue resulting from detecting offscreen text brokenness
Date: Tue, 19 Jan 2021 16:01:09 +0000 (UTC)

commit a2bdcacba24c8c0db4844bf53a22a4a988a45efb
Author: Joanmarie Diggs <jdiggs igalia com>
Date:   Tue Jan 19 16:52:30 2021 +0100

    Web: Fix performance issue resulting from detecting offscreen text brokenness
    
    When authors hide text offscreen so that only screen readers will find
    and present it, they think they are being helpful. Unfortunately, as a
    side effect, their techniques can break what we get for the accessible
    text (e.g. asking for a line at an offset returns only a single char or
    word). Thus we have to sanity check all text in order to work around
    this. Normally this is not a performance problem because we can bail
    after checking the first line. But in a giant text object whose contents
    consist almost entirely of embedded object chars, we can get quite laggy.
    Therefore, if the accessible text is more than 30% embedded object chars,
    bail on the lines-are-single-words sanity check.

 src/orca/scripts/web/script_utilities.py | 10 ++++++++++
 1 file changed, 10 insertions(+)
---
diff --git a/src/orca/scripts/web/script_utilities.py b/src/orca/scripts/web/script_utilities.py
index 636191a19..ad54cb251 100644
--- a/src/orca/scripts/web/script_utilities.py
+++ b/src/orca/scripts/web/script_utilities.py
@@ -3008,6 +3008,16 @@ class Utilities(script_utilities.Utilities):
         if not nChars:
             return False
 
+        # If we have a series of embedded object characters, there's a reasonable chance
+        # they'll look like the one-word-per-line CSSified text we're trying to detect.
+        # We don't want that false positive. By the same token, the one-word-per-line
+        # CSSified text we're trying to detect can have embedded object characters. So
+        # if we have more than 30% EOCs, don't use this workaround. (The 30% is based on
+        # testing with problematic text.)
+        eocs = re.findall(self.EMBEDDED_OBJECT_CHARACTER, text.getText(0, -1))
+        if len(eocs)/nChars > 0.3:
+            return False
+
         try:
             obj.clearCache()
             state = obj.getState()
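
The early bail-out added above can be sketched outside of Orca as follows.
This is only an illustration, not the code in script_utilities.py: it
assumes the accessible text has already been fetched as a plain Python
string, that the embedded object character is U+FFFC, and the names
eoc_ratio and should_skip_single_word_check are hypothetical. It also
counts with str.count rather than the re.findall call used in the patch,
which should give the same count for a single literal character.

```python
# Assumption: U+FFFC (OBJECT REPLACEMENT CHARACTER) is the placeholder
# used for embedded objects in the accessible text.
EMBEDDED_OBJECT_CHARACTER = "\ufffc"


def eoc_ratio(text: str) -> float:
    """Return the fraction of characters in text that are EOCs."""
    if not text:
        # Mirrors the "if not nChars" guard: nothing to measure.
        return 0.0
    return text.count(EMBEDDED_OBJECT_CHARACTER) / len(text)


def should_skip_single_word_check(text: str, threshold: float = 0.3) -> bool:
    """Hypothetical helper: True when the expensive check should be skipped.

    The 0.3 default matches the 30% figure in the commit message; the
    comparison is strict, so exactly 30% EOCs does not trigger the skip.
    """
    return eoc_ratio(text) > threshold


if __name__ == "__main__":
    mostly_objects = EMBEDDED_OBJECT_CHARACTER * 4 + "abcdef"
    mostly_words = "one two three " + EMBEDDED_OBJECT_CHARACTER
    print(should_skip_single_word_check(mostly_objects))
    print(should_skip_single_word_check(mostly_words))
```

Because the ratio is computed in a single pass over the string before any
per-line queries, a text object dominated by embedded object characters is
rejected without walking its lines.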
