[kupfer] kupferstring: Expand germanic letters to two char transliteration



commit 62dba7642db4c118a94350e5cd13dbebf65e4ad4
Author: Ulrik Sverdrup <ulrik sverdrup gmail com>
Date:   Sat Sep 12 19:57:27 2009 +0200

    kupferstring: Expand germanic letters to two char transliteration
    
    Common transliteration uses å = aa, ä = ae and so on, which might be
    a less surprising transliteration for some people, although I fear
    this expectation differs between locales. However, the added letter
    does not affect existing matches, whereas leaving it out would lose
    matches when you search for "ae".

 kupfer/kupferstring.py |   11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)
---
diff --git a/kupfer/kupferstring.py b/kupfer/kupferstring.py
index 3e52cb4..95d3807 100644
--- a/kupfer/kupferstring.py
+++ b/kupfer/kupferstring.py
@@ -4,13 +4,20 @@ from unicodedata import normalize, category
 
 def _folditems():
 	_folding_table = {
+		# general non-decomposing characters
+		# FIXME: This is not complete
 		u"ł" : u"l",
-		u"æ" : u"ae",
-		u"ø" : u"o",
 		u"œ" : u"oe",
 		u"ð" : u"d",
 		u"þ" : u"th",
 		u"ß" : u"ss",
+		# germano-scandinavic canonical transliterations
+		u"ü" : u"ue",
+		u"å" : u"aa",
+		u"ä" : u"ae",
+		u"æ" : u"ae",
+		u"ö" : u"oe",
+		u"ø" : u"oe",
 	}
 
 	for c, rep in _folding_table.iteritems():
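
The reasoning in the commit message can be illustrated with a small sketch. It is not part of the patch: it is written for Python 3, `FOLDING_TABLE`, `fold` and `is_subsequence` are hypothetical names, and it assumes that matching treats the query as an in-order subsequence of the folded string, which this diff does not show.

```python
from unicodedata import normalize, category

# Hypothetical stand-in for the patched table; only some entries are shown.
FOLDING_TABLE = {
    "ß": "ss",
    "ü": "ue",
    "å": "aa",
    "ä": "ae",
    "æ": "ae",
    "ö": "oe",
    "ø": "oe",
}


def fold(text):
    """Expand table entries, then strip any remaining combining marks."""
    # Compose first so that "a" + combining diaeresis is looked up as "ä".
    composed = normalize("NFC", text)
    expanded = "".join(FOLDING_TABLE.get(ch, ch) for ch in composed)
    decomposed = normalize("NFKD", expanded)
    return "".join(ch for ch in decomposed if category(ch) != "Mn")


def is_subsequence(query, text):
    """Return True if the characters of query occur in text, in order."""
    remaining = iter(text)
    return all(ch in remaining for ch in query)


folded = fold("bär")                     # "baer" with the two-letter expansion
print(is_subsequence("bar", folded))     # a query without the extra letter
print(is_subsequence("baer", folded))    # a query using the transliteration
print(is_subsequence("baer", "bar"))     # the same query against a one-letter folding
```

Under that assumption, the two-letter form still matches the shorter query "bar", while a one-letter folding ("bar") cannot match the query "baer".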


