Question to beagled & character encodings
- From: Stephan Hegel <stephan hegel gmx de>
- To: dashboard-hackers gnome org
- Subject: Question to beagled & character encodings
- Date: Mon, 31 Oct 2005 09:59:19 +0100
Hi all,
I have indexed a mix of English and German HTML documents saved via
Firefox as "Web Page, complete". Indexing itself works fine and the pages
get picked up by beagled.
However, there seems to be a problem with German umlauts, or with
character encodings in general. For example:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<meta name="Author" content="Thomas Frenzel">
<meta name="Generator" content="NetObjects Fusion 4.0.1 für Windows">
<meta name="Keywords" content="DH, Downhill, Enduro, Enduro - Zschopau, Mountainbike,
Mountainbiketouren, geführte Touren, Stülpner, "><title>Löwenkopftrails</title></head>
...
...
Even though the page is clearly marked with "charset=ISO-8859-1", the
title "Löwenkopftrails" is displayed in "Beagle-Best" only as
"Lwenkopftrails": the German "ö" is missing. Likewise, only a search for
"Lwenkopftrails" returns a result; "Löwenkopftrails" returns nothing.
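My guess, and it is only a guess, is that the file's bytes get decoded as
UTF-8 (matching my locale) instead of the declared ISO-8859-1, with invalid
bytes being discarded. The small Python sketch below does not use any Beagle
code; it only shows that this assumption would produce exactly the symptom
I see:

```python
# Bytes of the page title as stored in an ISO-8859-1 file ("ö" is 0xF6).
raw = "Löwenkopftrails".encode("iso-8859-1")

# Decoding with the charset declared in the <meta> tag keeps the umlaut.
declared = raw.decode("iso-8859-1")

# Assumption: the reader treats the bytes as UTF-8 and silently drops
# anything invalid. 0xF6 is not a valid UTF-8 byte, so the "ö" vanishes.
assumed_utf8 = raw.decode("utf-8", errors="ignore")

print(declared)      # title with the umlaut intact
print(assumed_utf8)  # title with the umlaut dropped
```

If that is what happens, the indexed term would lack the "ö", which would
also explain why only the search without it matches.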
I'm running a Novell/SuSE 10.0 system with English as the primary language
(env: LANG=en_US.UTF-8). Is there a way to prevent this, work around it, or
fix it through configuration?
Thanks & kind regards,
Stephan.