Re: Importing XMP data with SemWeb



On Wed, 2006-04-19 at 19:31 -0400, Warren Baird wrote:
> Larry Ewing wrote:
> > It is following the XMP specification for keywords.  It creates a triple
> > with a dc:subject predicate and points that to the rdf bag with the
> > keywords.  We could easily extend the way it stores this by defining some
> > custom predicates that hold more info but what is there now is the most
> > interoperable way of storing tags/keywords.
> 
> That doesn't seem to be what is happening, unless I'm missing something. 
>    I added some code that does the following:
> 
> 	XmpFile xmp = Photo.UpdateXmp(photo,null);
> 	string xmppath = photo.Path + ".xmp";
> 	// File.Create truncates an existing sidecar; OpenWrite would leave stale bytes after a shorter write
> 	using (System.IO.FileStream stream = System.IO.File.Create (xmppath)) {
> 		System.Console.WriteLine("Saving xmp to " + xmppath);
> 		xmp.Save(stream);
> 	}
> 
> To see what xmp is being generated for a photo with 3 tags, and the 
> resulting xmp data looked like this (with whitespace added to make it 
> more readable):
> <?xpacket begin="" id="testing"?>
> <x:xmpmeta xmlns:x="adobe:ns:meta/">
> <rdf:RDF 
> xmlns:Iptc4xmpCore="http://iptc.org/std/Iptc4xmpCore/1.0/xmlns/"
> xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/"
> xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
> xmlns:xmpBJ="http://ns.adobe.com/xap/1.0/bj/"
> xmlns:xmpidq="http://ns.adobe.com/xmp/Identifier/qual/1.0"
> xmlns:dc="http://purl.org/dc/elements/1.1/"
> xmlns:xmp="http://ns.adobe.com/xap/1.0/"
> xmlns:tiff="http://ns.adobe.com/tiff/1.0/"
> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> xmlns:xmpRights="http://ns.adobe.com/xap/1.0/rights/"
> xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
> xmlns:exif="http://ns.adobe.com/exif/1.0/">
>    <rdf:Bag rdf:nodeID="anon0">
>      <rdf:li>People</rdf:li>
>      <rdf:li>Places</rdf:li>
>      </rdf:Bag>
>    <rdf:Description rdf:about="">
>        <dc:subject rdf:nodeID="anon0" />
>    </rdf:Description></rdf:RDF>
> </x:xmpmeta>
> <?xpacket end="r"?>
> 
> so it seems to be creating a description node with an empty subject 
> node, and then putting the bag at the top level.  Possibly I'm doing 
> something wrong here.  I'm currently working with the 0.1.11 snapshot - 
> I haven't tried the cvs version yet...
> 

Sadly, as far as I know, what I said and what you pasted here mean the
same thing.  The dc:subject element is not empty: its rdf:nodeID points
at the blank node anon0, which is the bag.  It would definitely be
preferable to nest the blank node in the subject and avoid the nodeID
altogether, but I think this is a valid serialization (I'd be happy to
be proven wrong).  Improving the SemWeb output would be welcome, of
course.
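
To make that concrete, here is a rough sketch of the statements I would
expect either form (top-level bag with a nodeID, or bag nested inside
dc:subject) to reduce to once parsed.  The Triple class is a made-up
stand-in, not SemWeb's own statement type, and I'm using an empty string
for the rdf:about="" resource and "_:anon0" for the blank node purely
for illustration.

```csharp
using System;

// Hypothetical stand-in for a parsed statement; this is NOT the SemWeb API.
class Triple {
	public readonly string Subject, Predicate, Object;

	public Triple (string subject, string predicate, string obj)
	{
		Subject = subject;
		Predicate = predicate;
		Object = obj;
	}
}

class SameGraph {
	// Namespace URIs as declared in the pasted packet.
	const string Rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
	const string Dc = "http://purl.org/dc/elements/1.1/";

	static void Main ()
	{
		// The blank node is only a connection point between the
		// description and the bag; where the bag element sits in
		// the XML does not change this list.
		Triple [] triples = {
			new Triple ("", Dc + "subject", "_:anon0"),
			new Triple ("_:anon0", Rdf + "type", Rdf + "Bag"),
			new Triple ("_:anon0", Rdf + "_1", "People"),
			new Triple ("_:anon0", Rdf + "_2", "Places"),
		};

		foreach (Triple t in triples)
			Console.WriteLine ("<{0}> <{1}> {2}", t.Subject, t.Predicate, t.Object);
	}
}
```

The label anon0 itself carries no meaning; a parser is free to pick any
name for the blank node as long as the four statements stay connected.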

> However, with Bengt's pointer to the iptc site, I found a few sample 
> jpegs with xmp data in them already, and it looks like you are right --- 
> I think the 'approved' way to do something like this would look like the 
> attached xmp file.   I basically sucked the xmp out of one of the 
> samples, deleted most of the IPTC entities and simplified the list of 
> subjects.  This is what I'm trying to get importing now.  You can see an 
> example in the attachment IMG_5299.jpg.xmp
> 
> > With semweb this is definitely a pain (look at the places in f-spot that
> > use the XMP directly for the lame hacks I did), but the semweb code for
> > reading and writing handled what I needed.  Someone could easily use the
> > triples to create a walkable graph and I'd be very happy if they did. I
> > haven't gotten to it yet.
> 
> Well - I've been trying to do that, but I'm not having much luck at 
> all...   I've written code that reads an xmp sidecar while importing 
> (<file>.xmp) - it wasn't that hard, I'm just doing
> 
> 	// check for an xmp "sidecar" file
> 	System.Console.WriteLine("looking for xmp file :" + origPath + ".xmp");
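> 	if (System.IO.File.Exists(origPath + ".xmp")) {
> 		// dispose the stream once the sidecar has been parsed and dumped
> 		using (System.IO.Stream stream = System.IO.File.OpenRead(origPath + ".xmp")) {
> 			XmpFile sidecar = new XmpFile(stream);
> 			sidecar.Dump();
> 		}
> 	}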
> 				
> 
> The output I'm getting is in IMG_5299.jpg.xmp.dump --- it looks to me 
> like it's creating a triple for the close of the creator node, and for 
> the opening of the bag, but *not* for the opening of the dc:subject 
> containing the bag...
> 

Welcome to the blank node again.  Read the RDF primer at
http://www.w3.org/TR/rdf-primer/#newresources and pay close attention
to Figure 14, then look at the dump output.

> I'm not sure where to go from here.   How attached are we to SemWeb? 
> What I'm trying to do seems like it'd be a heck of a lot easier with a 
> stock XML parser...   anything that provides DOM style access should 
> make this trivial.
> 

Again, sadly, this just isn't true.  You are running headlong into why
it isn't true.  RDF has several different ways of representing the same
graph with wildly different DOM structures.  The shorthand form and
nodeIDs in particular make it impossible to rely on the element
nesting.  The whole reason SemWeb is there in the first place is to
deal with that fact, and yes it makes me sad.

Like I said earlier, the way forward is probably to take the triples
from the SemWeb output and construct a navigable graph.  Alternatively,
I'd happily use a different RDF library with an API more appropriate for
XMP as long as it didn't add a lot of hard-to-satisfy dependencies to
the build.
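
For the first option, something along these lines is roughly what I have
in mind.  It is an untested sketch: Triple and Graph are names I made up
for this mail, the statements are typed in by hand rather than pulled
from SemWeb, and it cheats by keeping literals and node names in the
same string field.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical statement type; real code would copy these out of the RDF store.
class Triple {
	public readonly string Subject, Predicate, Object;

	public Triple (string subject, string predicate, string obj)
	{
		Subject = subject;
		Predicate = predicate;
		Object = obj;
	}
}

// Indexes statements by subject so callers can walk from node to node.
class Graph {
	const string Rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
	const string Dc = "http://purl.org/dc/elements/1.1/";

	Dictionary<string, List<Triple>> bySubject = new Dictionary<string, List<Triple>> ();

	public void Add (Triple t)
	{
		List<Triple> list;
		if (!bySubject.TryGetValue (t.Subject, out list)) {
			list = new List<Triple> ();
			bySubject [t.Subject] = list;
		}
		list.Add (t);
	}

	// Objects of every statement with the given subject and predicate.
	public List<string> Objects (string subject, string predicate)
	{
		List<string> result = new List<string> ();
		List<Triple> list;
		if (bySubject.TryGetValue (subject, out list))
			foreach (Triple t in list)
				if (t.Predicate == predicate)
					result.Add (t.Object);
		return result;
	}

	// Follows dc:subject to its container node and collects the
	// rdf:_1, rdf:_2, ... members in the order they were added.
	public List<string> Keywords (string resource)
	{
		List<string> keywords = new List<string> ();
		foreach (string container in Objects (resource, Dc + "subject")) {
			List<Triple> members;
			if (!bySubject.TryGetValue (container, out members))
				continue;
			foreach (Triple t in members)
				if (t.Predicate.StartsWith (Rdf + "_", StringComparison.Ordinal))
					keywords.Add (t.Object);
		}
		return keywords;
	}

	static void Main ()
	{
		Graph g = new Graph ();
		g.Add (new Triple ("", Dc + "subject", "_:anon0"));
		g.Add (new Triple ("_:anon0", Rdf + "type", Rdf + "Bag"));
		g.Add (new Triple ("_:anon0", Rdf + "_1", "People"));
		g.Add (new Triple ("_:anon0", Rdf + "_2", "Places"));

		foreach (string keyword in g.Keywords (""))
			Console.WriteLine (keyword);
	}
}
```

Because the lookup goes through the blank node rather than the element
nesting, it should not care which serialization the file used.  A real
version would need to tell literals from resources and probably index
by predicate as well.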

--Larry

p.s. I just wanted to take this opportunity to say the whole thing
makes me sad again.





