Re: [xml] getElementsByName is available?



On Thu, 2007-06-28 at 11:00 +0200, Stefan Behnel wrote:

kalyanasundaram wrote:
On Thu, 2007-06-28 at 09:10 +0200, Stefan Behnel wrote:
kalyanasundaram s wrote:
  I need to parse a huge XML file for a specific set of nodes. Is there
any method like getElementByName available in libxml? I could not
find one in the documentation. Does it exist under some other name?
Otherwise I will have to traverse the entire tree, which is inefficient.
No it's not. It just depends on the implementation of getElementByName. :)

You can always use XPath to find the tag. However, if your XML tree is really
so big (note that XPath is pretty fast, so I'd try it first) you may consider
building an index of the tree, i.e. some kind of data structure that maps tag
names to a list of node pointers.

Note that there is also a hash map implementation in libxml2, see hash.c.

Stefan
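
To make the index idea above concrete, here is a minimal sketch. It uses Python's standard xml.etree.ElementTree rather than the libxml2 C API, purely so the example is self-contained; the function name build_tag_index and the SAMPLE document are made up for illustration. In C the same shape would presumably be a hash table keyed by tag name, filled in during one walk over the tree.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document, not taken from the thread.
SAMPLE = (
    "<lib>"
    "<book id='1'><title>A</title></book>"
    "<book id='2'><title>B</title></book>"
    "</lib>"
)


def build_tag_index(root):
    """Map each tag name to the list of elements with that tag.

    One full pass over the tree; later lookups are plain dict accesses.
    """
    index = defaultdict(list)
    for elem in root.iter():  # visits root and all descendants in document order
        index[elem.tag].append(elem)
    return index


root = ET.fromstring(SAMPLE)
index = build_tag_index(root)

# Lookup through the index (no further traversal needed).
books_from_index = index["book"]

# The same query expressed as an XPath-style search, which walks the tree again.
books_from_xpath = root.findall(".//book")
```

The index only pays off if the tree is queried many times; for a single lookup, the XPath-style search does the same amount of work as building the index.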

  Thanks for the information. I need to parse the XML file only once.

Do you mean: parse it once, keep it in memory and keep doing lots of things
with it? Or rather: parse it once, extract what you need and then throw it away?

Yes, in my case: parse it once, update a few nodes, and save it as
another document. Nothing more than that.

In the first case: build an index. In the latter: Either read it in, traverse
it and extract what you need (possibly with XPath), or read it in with SAX and
extract what you need while parsing. Depends on whether you need a tree to
know what you need or not.
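
As a rough illustration of the "extract what you need while parsing" option, here is a sketch using Python's standard xml.sax module as a stand-in for libxml2's SAX interface. The class TagCollector and the helper collect_text are hypothetical names, and the sketch assumes the wanted element is not nested inside itself.

```python
import xml.sax


class TagCollector(xml.sax.ContentHandler):
    """Collect the text content of every element with a given tag name.

    No tree is built; only the text of matching elements is kept.
    """

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.found = []
        self._buf = None  # a list of text chunks while inside a wanted element

    def startElement(self, name, attrs):
        if name == self.wanted:
            self._buf = []

    def characters(self, content):
        # The parser may deliver text in several chunks, so accumulate them.
        if self._buf is not None:
            self._buf.append(content)

    def endElement(self, name):
        if name == self.wanted and self._buf is not None:
            self.found.append("".join(self._buf))
            self._buf = None


def collect_text(xml_bytes, tag):
    """Parse xml_bytes with SAX and return the text of each <tag> element."""
    handler = TagCollector(tag)
    xml.sax.parseString(xml_bytes, handler)
    return handler.found


titles = collect_text(
    b"<lib><book><title>A</title></book><book><title>B</title></book></lib>",
    "title",
)
```

This fits the case where the decision about what to keep depends only on the element being parsed at the moment, not on its position in a finished tree.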


So which would be better, XPath or a linear traversal?
I don't know much about the XPath implementation. (Doesn't it traverse the
tree at least once?) The file size is about 500 KB. :)

That sounds rather small. Just parse it in and walk through it, that's what
I'd do.
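
For the use case described here (parse once, update a few nodes, save as another document), the whole job can be sketched in a few lines. Again this uses Python's standard xml.etree.ElementTree instead of the libxml2 C API, and the function name update_nodes, the tag "price" and the sample data are all invented for the example.

```python
import io
import xml.etree.ElementTree as ET


def update_nodes(src, dst, tag, new_text):
    """Parse src, set the text of every <tag> element, and write the result to dst.

    src and dst may be file names or binary file objects.
    Returns the number of elements that were changed.
    """
    tree = ET.parse(src)
    changed = 0
    for elem in tree.getroot().iter(tag):  # single walk over the tree
        elem.text = new_text
        changed += 1
    tree.write(dst, encoding="utf-8", xml_declaration=True)
    return changed


# In-memory stand-ins for the input and output files.
src = io.BytesIO(
    b"<lib><book><price>10</price></book><book><price>12</price></book></lib>"
)
dst = io.BytesIO()
count = update_nodes(src, dst, "price", "0")
```

At a few hundred kilobytes, one parse plus one walk like this should be cheap enough that the choice between XPath and manual traversal is mostly a matter of convenience.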

Really! I thought 500 KB was big. How large a file can it handle?
At what size should I switch to XPath?

Thanks for all your help,
 -"kalyan"




