Re: Problem with Non ASCII character and F-Spot.



Hi

I had the same problem and tracked down the issue.  Here is the deal:

- In XmpTagsMetadata.cs, the function Read() loops over the
SemWeb.Statement objects in the store and reads what is stored in each
statement's "object":
  string stmt_obj_str = stmt.Object.ToString();

- Now this "object" is a Resource, more precisely here the Literal
implementation of Resource (see semweb/Resource.cs).

- Let's look at what the ToString() function does for a Literal:
 [...]
ret.Append('"');
ret.Append(N3Writer.Escape(Value));
ret.Append('"');
 [...]

 Two things are interesting here: the surrounding quotes, which the
XMP patch uses to locate the string and strip the Language and DataType
that ToString() appends (see the end of the function), and the call to
N3Writer.Escape.

- The original string (Value) is escaped as specified by the N-Triples
format for RDF [1].  So, for example, 'é', which is stored in the
XML as 'é', is changed into '\u00E9' :-(.
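
To make the effect concrete, here is a minimal sketch of this kind of
escaping.  EscapeNonAscii is a hypothetical stand-in written for
illustration, not SemWeb's actual N3Writer.Escape, and it simplifies
the N-Triples rules: every character outside printable ASCII becomes
\uXXXX, one UTF-16 code unit at a time.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not SemWeb's code.
static class NTriplesEscapeDemo
{
    public static string EscapeNonAscii(string value)
    {
        StringBuilder ret = new StringBuilder();
        foreach (char c in value) {
            if (c == '\\')
                ret.Append("\\\\");              // backslash -> \\
            else if (c == '"')
                ret.Append("\\\"");              // quote -> \"
            else if (c < 0x20 || c > 0x7E)
                // Anything outside printable ASCII -> \uXXXX (uppercase hex)
                ret.Append("\\u").Append(((int) c).ToString("X4"));
            else
                ret.Append(c);
        }
        return ret.ToString();
    }
}
```

With this sketch, a tag such as "café" comes back as the ten-character
string caf\u00E9, which matches what shows up in the imported tags.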

The bad news is that the getter for Value is not accessible, so there
is no easy fix... I was able to get the tags imported properly by
simply changing the line:

ret.Append(N3Writer.Escape(Value));

into:

ret.Append(Value);

I am not sure modifying SemWeb is the best thing to do though.  Any ideas?
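
One alternative that would leave SemWeb untouched is to undo the
escaping on the F-Spot side.  The following is only a sketch under
that assumption: NTriplesText.Unescape is a hypothetical helper (it
exists in neither SemWeb nor F-Spot), and I have not tried it against
the XMP patch.  It reverses the N-Triples escapes \t, \n, \r, \", \\,
\uXXXX and \UXXXXXXXX, and leaves malformed sequences as they are.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper; not part of SemWeb or F-Spot.
static class NTriplesText
{
    public static string Unescape(string s)
    {
        StringBuilder ret = new StringBuilder(s.Length);
        for (int i = 0; i < s.Length; i++) {
            char c = s[i];
            if (c != '\\' || i + 1 >= s.Length) {
                ret.Append(c);                   // ordinary char or trailing backslash
                continue;
            }
            char kind = s[++i];                  // character after the backslash
            switch (kind) {
            case 't': ret.Append('\t'); break;
            case 'n': ret.Append('\n'); break;
            case 'r': ret.Append('\r'); break;
            case '"': ret.Append('"'); break;
            case '\\': ret.Append('\\'); break;
            case 'u':
            case 'U':
                int len = (kind == 'u') ? 4 : 8; // number of hex digits expected
                int code = 0;
                bool ok = i + len < s.Length
                    && int.TryParse(s.Substring(i + 1, len),
                                    NumberStyles.AllowHexSpecifier,
                                    CultureInfo.InvariantCulture, out code);
                if (ok && kind == 'u') {
                    ret.Append((char) code);     // single UTF-16 code unit
                } else if (ok && code >= 0 && code <= 0x10FFFF
                           && (code < 0xD800 || code > 0xDFFF)) {
                    ret.Append(char.ConvertFromUtf32(code));
                } else {
                    ret.Append('\\').Append(kind); // malformed: keep as-is
                    break;
                }
                i += len;                        // skip the hex digits
                break;
            default:
                ret.Append('\\').Append(kind);   // unknown escape: keep as-is
                break;
            }
        }
        return ret.ToString();
    }
}
```

In Read(), this would presumably be applied to the text between the
quotes after it has been extracted from stmt.Object.ToString(), not
before, since unescaping first would turn \" into a bare quote and
confuse the quote matching.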

--
Cosme

----
[1] http://www.w3.org/TR/rdf-testcases/#ntriples

On 7/7/06, Stephane Delcroix <stephane delcroix org> wrote:
Bengt,

confirmed also for french tags (like ones with 'ç')
i'll also try to find out why...

S
[...]

