[orca] Web: Fix performance issue resulting from detecting offscreen text brokenness



commit 6cf5160d3e2bc1a39932d7ceda84d6c239184d23
Author: Joanmarie Diggs <jdiggs igalia com>
Date:   Tue Jan 19 16:52:30 2021 +0100

    Web: Fix performance issue resulting from detecting offscreen text brokenness
    
    When authors hide text offscreen so that only screen readers will find
    and present it, they think they are being helpful. Unfortunately, as a
    side effect, their techniques can break what we get for the accessible
    text (e.g. asking for the line at an offset returns only a single char
    or word). Thus we have to sanity check all text in order to work around
    this. Normally this is not a performance problem because we can bail
    after checking the first line. But in a giant text object whose contents
    consist almost entirely of embedded object chars, we can get quite laggy.
    Therefore, if more than 30% of the accessible text is embedded object
    chars, bail on the lines-are-single-words sanity check.
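
The 30% bail-out described above can be sketched in isolation as follows.
This is a minimal illustration, not the Orca code itself: it works on a
plain Python string rather than an accessible text object, the helper name
too_many_eocs is hypothetical, and it assumes the embedded object character
is U+FFFC (OBJECT REPLACEMENT CHARACTER).

```python
import re

# Assumption: U+FFFC (OBJECT REPLACEMENT CHARACTER) is the placeholder used
# for embedded objects in accessible text.
EMBEDDED_OBJECT_CHARACTER = "\ufffc"


def too_many_eocs(text, threshold=0.3):
    """Return True if more than `threshold` of `text` is embedded object chars.

    Hypothetical helper: `text` is a plain string here, not an accessible
    text interface.
    """
    n_chars = len(text)
    if not n_chars:
        # Empty text: nothing to measure, so do not bail on its account.
        return False
    eocs = re.findall(EMBEDDED_OBJECT_CHARACTER, text)
    return len(eocs) / n_chars > threshold


mostly_objects = EMBEDDED_OBJECT_CHARACTER * 8 + "ab"        # 80% EOCs
mostly_words = "one two three" + EMBEDDED_OBJECT_CHARACTER   # about 7% EOCs
print(too_many_eocs(mostly_objects))  # True
print(too_many_eocs(mostly_words))    # False
```

The comparison is strict, so text that is exactly 30% embedded object
characters still goes through the sanity check. The ratio relies on
Python 3 true division.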

 src/orca/scripts/web/script_utilities.py | 10 ++++++++++
 1 file changed, 10 insertions(+)
---
diff --git a/src/orca/scripts/web/script_utilities.py b/src/orca/scripts/web/script_utilities.py
index ba6f2073c..435ea74e6 100644
--- a/src/orca/scripts/web/script_utilities.py
+++ b/src/orca/scripts/web/script_utilities.py
@@ -3029,6 +3029,16 @@ class Utilities(script_utilities.Utilities):
         if not nChars:
             return False
 
+        # If we have a series of embedded object characters, there's a reasonable chance
+        # they'll look like the one-word-per-line CSSified text we're trying to detect.
+        # We don't want that false positive. By the same token, the one-word-per-line
+        # CSSified text we're trying to detect can have embedded object characters. So
+        # if we have more than 30% EOCs, don't use this workaround. (The 30% is based on
+        # testing with problematic text.)
+        eocs = re.findall(self.EMBEDDED_OBJECT_CHARACTER, text.getText(0, -1))
+        if len(eocs)/nChars > 0.3:
+            return False
+
         try:
             obj.clearCache()
             state = obj.getState()

