[orca] More refinement in the detection of one-char-per-line CSSified brokenness



commit 8698054dc202dc92771fdf98183171b7e5cfd97b
Author: Joanmarie Diggs <jdiggs igalia com>
Date:   Wed Aug 21 13:50:18 2019 -0400

    More refinement in the detection of one-char-per-line CSSified brokenness
    
    Previously we did not examine text containing embedded object characters
    because of the possibility of false positives. But CSSified brokenness
    can also occur in content with embedded object characters. So if no more
    than 30% of the text is embedded object characters, proceed with the
    examination.

 src/orca/scripts/web/script_utilities.py | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
---
diff --git a/src/orca/scripts/web/script_utilities.py b/src/orca/scripts/web/script_utilities.py
index 1387b32f7..954f83153 100644
--- a/src/orca/scripts/web/script_utilities.py
+++ b/src/orca/scripts/web/script_utilities.py
@@ -2720,6 +2720,16 @@ class Utilities(script_utilities.Utilities):
         if not nChars:
             return False
 
+        # If we have a series of embedded object characters, there's a reasonable chance
+        # they'll look like the one-char-per-line CSSified text we're trying to detect.
+        # We don't want that false positive. By the same token, the one-char-per-line
+        # CSSified text we're trying to detect can have embedded object characters. So
+        # if we have more than 30% EOCs, don't use this workaround. (The 30% is based on
+        # testing with problematic text.)
+        eocs = re.findall(self.EMBEDDED_OBJECT_CHARACTER, text.getText(0, -1))
+        if len(eocs)/nChars > 0.3:
+            return False
+
         try:
             obj.clearCache()
             state = obj.getState()
@@ -2735,7 +2745,7 @@ class Utilities(script_utilities.Utilities):
             boundary = pyatspi.TEXT_BOUNDARY_LINE_START
             for i in range(nChars):
                 string, start, end = text.getTextAtOffset(i, boundary)
-                if len(string) > 1 or string == self.EMBEDDED_OBJECT_CHARACTER:
+                if len(string) > 1:
                     rv = False
                     break
 

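The heuristic in this patch can be tried outside Orca with a minimal sketch such as the one below. It is not the Orca implementation: `too_many_eocs` and `looks_like_one_char_per_line` are hypothetical names, the `lines` argument stands in for the strings the real code fetches one offset at a time with `text.getTextAtOffset()`, and `len(string)` is assumed to equal the character count that the patch calls `nChars`. The embedded object character is assumed to be U+FFFC.

```python
import re

# Assumption: the embedded object character is U+FFFC, the Unicode
# object replacement character.
EMBEDDED_OBJECT_CHARACTER = "\ufffc"

# The 30% threshold comes from the patch; the constant name is made up here.
MAX_EOC_RATIO = 0.3


def too_many_eocs(string, max_ratio=MAX_EOC_RATIO):
    """Return True if EOCs make up more than max_ratio of the string."""
    if not string:
        return False
    eocs = re.findall(EMBEDDED_OBJECT_CHARACTER, string)
    return len(eocs) / len(string) > max_ratio


def looks_like_one_char_per_line(string, lines):
    """Rough stand-in for the check in the patch.

    `lines` is a hypothetical list of the line strings reported for the
    text. The real method also inspects object state, which is omitted.
    """
    if not string or too_many_eocs(string):
        # Mostly EOCs: skip the workaround to avoid a false positive.
        return False
    # After the patch, a line consisting of a single EOC no longer
    # disqualifies the text; only a line longer than one character does.
    return all(len(line) <= 1 for line in lines)
```

For instance, `looks_like_one_char_per_line("abcd\ufffc", ["a", "b", "c", "d", "\ufffc"])` is accepted because only one character in five is an EOC, whereas a string that is two-thirds EOCs is rejected before the per-line loop runs.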
