[xml] Xmltextreader jumping to section in the stream



Hi,

I have been using the libxml library for a while now and I'm loving it. I recently started using the text reader (xmlTextReader) to parse files larger than 2 GB. It works great for parsing an entire document in one pass.

But I have run into a small problem, and what I am trying to do may not be possible. I want to read up to a specific node and split the reader there, so that a copy of the reader can read through all the nodes inside that node once. Then I want to continue from the same position again and read the nodes I skipped in the first run.

Maybe it's not really clear what I want, so here is a small example:
<root>
    <group id="1">                <- I want to split the reader here
        <value1>foo</value1>      <- extracted in the first run
        <item>                    <- occurs multiple times; its values are extracted in the second run
            ....
        </item>
        <item>                    <- occurs multiple times; its values are extracted in the second run
            ....
        </item>
        ...
        <value2>foo2</value2>     <- extracted in the first run
    </group>
.... Etc

I tried several approaches, such as creating an I/O-based reader and jumping back to the start offset of the node where I intend to split/duplicate the reader. But I can't seem to get the right byte offset for the start node. The idea is to effectively duplicate the <group> for the reader: read it once to extract the first-run values, then jump back in the file and extract the <item> nodes. This will not work if I can't get the right byte offset of the start node, and I also don't know how to reset the reader's buffer and offset position. Expanding the current node is not an option because it contains too much data to hold in memory.
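
To make the intended two-run flow concrete, here is a minimal sketch of the idea. It is not libxml2 code: it uses Python's standard xml.parsers.expat module purely as a stand-in, because its CurrentByteIndex attribute reports the byte offset of the tag that triggered the current event. The helper names scan_groups and items_in_span, the sample data, and the assumption that each <item> holds only text are all hypothetical and exist only for illustration.

```python
import io
import xml.parsers.expat

CHUNK = 64 * 1024  # bytes fed to the parser at a time during the second run

# Hypothetical sample shaped like the example above (items hold plain text here).
SAMPLE = b"""<root>
    <group id="1">
        <value1>foo</value1>
        <item>a</item>
        <item>b</item>
        <value2>foo2</value2>
    </group>
    <group id="2">
        <value1>bar</value1>
        <item>c</item>
        <value2>bar2</value2>
    </group>
</root>
"""

def scan_groups(stream):
    """First run: collect value1/value2 per <group> plus the group's byte span."""
    groups, stack, text = [], [], []
    capturing = False
    parser = xml.parsers.expat.ParserCreate()

    def on_start(name, attrs):
        nonlocal capturing
        if name == "group":
            # CurrentByteIndex is the offset of the '<' that opened this tag.
            stack.append({"id": attrs.get("id"), "values": {},
                          "start": parser.CurrentByteIndex})
        elif name in ("value1", "value2") and stack:
            capturing = True
            text.clear()

    def on_chars(data):
        if capturing:
            text.append(data)  # text may arrive in several pieces

    def on_end(name):
        nonlocal capturing
        if name in ("value1", "value2") and stack:
            stack[-1]["values"][name] = "".join(text)
            capturing = False
        elif name == "group":
            group = stack.pop()
            group["end"] = parser.CurrentByteIndex  # offset of '</group>'
            groups.append(group)

    parser.StartElementHandler = on_start
    parser.CharacterDataHandler = on_chars
    parser.EndElementHandler = on_end
    parser.ParseFile(stream)
    return groups

def items_in_span(stream, start, end):
    """Second run: seek back and re-parse bytes [start, end) for <item> text."""
    items, text = [], []
    in_item = False
    # Assumes UTF-8; a fragment has no XML declaration to announce anything else.
    parser = xml.parsers.expat.ParserCreate()

    def on_start(name, attrs):
        nonlocal in_item
        if name == "item":
            in_item = True
            text.clear()

    def on_chars(data):
        if in_item:
            text.append(data)

    def on_end(name):
        nonlocal in_item
        if name == "item":
            items.append("".join(text))
            in_item = False

    parser.StartElementHandler = on_start
    parser.CharacterDataHandler = on_chars
    parser.EndElementHandler = on_end

    stream.seek(start)
    remaining = end - start
    while remaining > 0:
        data = stream.read(min(CHUNK, remaining))
        if not data:
            break
        parser.Parse(data, False)
        remaining -= len(data)
    parser.Parse(b"</group>", True)  # close the fragment so it is well-formed
    return items

stream = io.BytesIO(SAMPLE)
for group in scan_groups(stream):
    print(group["id"], group["values"],
          items_in_span(stream, group["start"], group["end"]))
```

The second run feeds the byte range in fixed-size chunks, so the <group> is never held in memory as a whole; only the small per-group results are kept. Whether xmlTextReader can report an equally exact start offset for the current node is precisely the open question here.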

I am a bit lost and can't see the right way to do this. Maybe someone else has had this problem before and can suggest a good solution.

Thanks in advance,
Michael

