[meld] dirdiff: Fix display of encoding errors when scanning folders (#235)



commit 924ac31d039504e17096fc9bca2a3d723c187f26
Author: Kai Willadsen <kai willadsen gmail com>
Date:   Sun Oct 28 08:54:46 2018 +1000

    dirdiff: Fix display of encoding errors when scanning folders (#235)
    
    The existing handling dated from the Python 2 era. In the current code,
    the root passed to `os.listdir()` is always a `str`, so the entries
    are always `str` as well. Under Python 3, a file name that is not
    valid in the filesystem encoding arrives as a surrogate-escaped `str`,
    so this patch checks whether each entry re-encodes cleanly as UTF-8
    and treats any failure as an encoding error.

 meld/dirdiff.py | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
---
diff --git a/meld/dirdiff.py b/meld/dirdiff.py
index e4d9ce79..5cebb116 100644
--- a/meld/dirdiff.py
+++ b/meld/dirdiff.py
@@ -778,11 +778,11 @@ class DirDiff(MeldDoc, Component):
 
                 for e in entries:
                     try:
-                        if not isinstance(e, str):
-                            e = e.decode('utf8')
-                    except UnicodeDecodeError:
-                        approximate_name = e.decode('utf8', 'replace')
-                        encoding_errors.append((pane, approximate_name))
+                        e.encode('utf8')
+                    except UnicodeEncodeError:
+                        invalid = e.encode('utf8', 'surrogatepass')
+                        printable = invalid.decode('utf8', 'backslashreplace')
+                        encoding_errors.append((pane, printable))
                         continue
 
                     try:
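
For readers who want to try the new check outside Meld, here is a minimal
standalone sketch. It is not part of the patch: the helper name
`printable_name` and the sample bytes are made up for illustration, and the
surrogate-escaped name is built by hand on the assumption of a UTF-8
filesystem encoding rather than read from a real directory.

```python
def printable_name(name: str) -> str:
    """Return `name` unchanged if it is valid UTF-8 text, otherwise a
    printable approximation (hypothetical helper mirroring the patch)."""
    try:
        # Surrogate-escaped names contain lone surrogates, which the
        # strict UTF-8 encoder rejects.
        name.encode('utf8')
        return name
    except UnicodeEncodeError:
        # Same two steps as the patch: force the surrogates through the
        # encoder, then decode with the invalid bytes shown as escapes.
        invalid = name.encode('utf8', 'surrogatepass')
        return invalid.decode('utf8', 'backslashreplace')


# Assumption: a Latin-1 encoded file name on a UTF-8 filesystem. Decoding
# with 'surrogateescape' approximates what os.listdir() would return.
raw = b'caf\xe9.txt'
name = raw.decode('utf8', 'surrogateescape')

printable = printable_name(name)
print(repr(name))
print(printable)
```

Note that the escapes in the printable form describe the encoded surrogate,
not the original on-disk byte, so the result is suitable for display only
and cannot be used to reopen the file.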

