Re: [xml] Generating XML from Schema definitions (again, sorry)

On Fri, Jul 27, 2007 at 11:28:36AM +1000, Callum Gibson wrote:
I've been acquainting myself with XML and gradually getting to know
libxml2 and the functionality it provides, and I'm wondering if it's
able to do what I want.

Basically, I have a large document schema spread across several xsd files
which are imported into each other. I know xsd is used primarily for
validation, but I would like to use the schema definition to generate a
valid XML document from my data. I think this is basically how things
work in the java world with xmlbeans, etc where you can generate your
class definitions from the xsd files and then populate them with set

It seems I could read the individual schema files as documents and build
up an overall schema definition myself, but xmlSchemaParse seems to do
that for me so it would be silly not to use it.

I've managed to suck in the top level schema file and it imports all the
sub-schemas fine, but it seems to store it all obscurely internally, and
I'm not familiar enough with the API to know if what I want to do is
possible. I can grunge around with a debugger and see where things are
stored but many of the data structures aren't exposed and the imported
schemas don't seem to be linked to the main one in any way that allows
me to traverse the xsd "document", look up imported schemas and types
to build my xml output.

I would envisage being able to traverse the schema and check the xpath
of each element against a mapping into my own data which would allow
me to populate/generate the output as xml. I don't really want to just
manually code to a specific xml schema since I don't have control of
the schema, hence my desire to programmatically generate it based on the
schema files which would be read in up front.

Can anyone tell me if this is possible with libxml2 and give me a rough
outline of how to go about it, which APIs I should look at, or example
programs which do a similar thing?

  We don't really have what you expect, for a variety of reasons:
    - first, you can't systematically derive one kind of document from
      a schema; it's like trying to derive one string from a complex
      regexp. You need schemas or regexps precisely because your
      document or string can mutate in various ways. It's a completely
      open problem if looked at from a generic POV.
    - second, the results of the XSD compilation are really internal, and
      only the strict minimum needed for validation is exported as API
      from libxml2. The reasons are that there is no guarantee that the
      internals won't change over the long term, and to some extent we are
      not sure the processing is right, so changes may frequently be needed.
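
  To make the regexp analogy in the first point concrete, here is a minimal
sketch in Python. It is not libxml2 code: the tuple-based content model and
the instances() helper are hypothetical, invented purely for illustration.
It enumerates every child-element sequence that a tiny model allows, to show
that even a trivial schema does not determine a single document.

```python
from itertools import product

# Hypothetical toy content model, loosely analogous to an XSD complex type:
#   order := id, (card | invoice), note?
MODEL = ("seq",
         ("elem", "id"),
         ("choice", ("elem", "card"), ("elem", "invoice")),
         ("opt", ("elem", "note")))

def instances(model):
    """Return every child-element sequence the toy model allows."""
    kind = model[0]
    if kind == "elem":
        return [(model[1],)]
    if kind == "opt":
        # Either the element is absent or it appears once.
        return [()] + instances(model[1])
    if kind == "choice":
        return [seq for alt in model[1:] for seq in instances(alt)]
    if kind == "seq":
        # Concatenate one alternative from each part, in order.
        parts = [instances(part) for part in model[1:]]
        return [sum(combo, ()) for combo in product(*parts)]
    raise ValueError(f"unknown node kind: {kind}")

ALL = instances(MODEL)
for seq in ALL:
    print(" ".join(seq))
```

  One choice and one optional element already give four valid sequences.
Unbounded repetition, wildcards, and type derivation make the set infinite,
so something outside the schema has to pick the instance you want.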

For something as simple as DTD we were able to expose entry points to help
generators/editors, but for XSD or RNG this gets way too complex.

  Sorry, but unless you basically copy some of the internal definitions into
your own code and try to hack your way in, it's just not possible, as all
the data structures are opaque from an API viewpoint. Even then, it
may be really hard or impossible to get what you want, because the internal
structures are heavily processed toward validation, not meant to be exposed as
'how the schemas look'. For example, we build regexp-like internal validation
structures, which are compressed to binary tables to express element content
models, and getting from there to a possible instance would be far from trivial.
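
  As a rough illustration of why a validation table is a poor starting point
for generation, here is a toy sketch in Python. The table layout, state
numbers, and function names are assumptions made up for this example and do
not reflect libxml2's actual internal structures. It encodes the content
model a, (b | c)*, d as a transition table, validates against it, and then
searches the table for the shortest accepted sequence.

```python
from collections import deque

# Hypothetical compiled form of the content model  a, (b | c)*, d
# States are plain integers; the table knows nothing about groups,
# types, or occurrence bounds, only which symbol moves where.
TABLE = {
    0: {"a": 1},
    1: {"b": 1, "c": 1, "d": 2},
    2: {},
}
START, ACCEPTING = 0, {2}

def accepts(children):
    """Validation: walk the table, fail on any missing transition."""
    state = START
    for name in children:
        state = TABLE[state].get(name)
        if state is None:
            return False
    return state in ACCEPTING

def shortest_instance():
    """Breadth-first search for the shortest accepted sequence."""
    queue, seen = deque([(START, [])]), {START}
    while queue:
        state, path = queue.popleft()
        if state in ACCEPTING:
            return path
        for name, nxt in TABLE[state].items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [name]))
    return None
```

  Validation is a simple walk, but going the other way only recovers some
sequence of names (here a followed by d). The table says nothing about which
of the valid sequences matches your data, nor about attributes, text
content, or the types behind each name.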

  Sorry, that code was really designed for validation, not exposure.


Red Hat Virtualization group
Daniel Veillard      | virtualization library
veillard redhat com  | libxml GNOME XML XSLT toolkit | Rpmfind RPM search engine
