Tool to convert malformed XML into valid XML

xml-request gnome org wrote:
Message: 1
Date: 20 Jul 2013 11:07:22 -0000
From: "Subrata Dasgupta" <subrata_usha rediffmail com>
To: <xml gnome org>
Subject: [xml] Tool to convert malformed XML into valid XML
Message-ID: <20130720110722 3438 qmail f5mail-224-156 rediffmail com>
Content-Type: text/plain; charset="utf-8"

Respected Sir,

While working on a project I have faced huge problems with malformed XML files. Most of the time a few opening or closing tags are missing from those files, and sometimes the XML is well-formed but does not match the DTD. It is very hard to fix this by hand because the files are very big, ranging from over 30 MB to 2 GB. So I am looking for an open source tool which can detect and fix malformed XML automatically with the help of a DTD or XSD (at least where there is no ambiguity). So far I have been unable to find such a tool. After googling, though, it seems to me that such a tool could be written using the GNU libxml open source library.
Honestly, I don't think you can write a tool that reliably fixes the problems you've described. Offhand, I can think of several situations where you simply cannot decide what is missing, or how malformed content should be rearranged or otherwise modified to match an XML schema. An effort to avoid creating malformed XML in the first place would be a much better option.

As for doing it by hand: there are several XML editors around that will show where an error occurs. If the file size causes problems, it should be possible to write a simple tool that splits a big file into chunks.

-W
But I am not sure how to implement this or which API functions I should use. Please help me to write such an application. I am proficient in C and C++. It would be very helpful if you could provide some information on this. If a free or open source tool for this purpose already exists, please let me know as well.

Thanks,
Subrata Dasgupta
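To make the point about ambiguity concrete, here is a minimal sketch in plain C (standard library only, not libxml) that reports the line where the first tag mismatch is detected. The function name first_tag_mismatch and the limits MAX_DEPTH and MAX_NAME are made up for this illustration. It assumes the whole document is held in one NUL-terminated string and treats DOCTYPE declarations only roughly, so it is not a substitute for a real parser.

```c
#include <assert.h>
#include <string.h>

#define MAX_DEPTH 256   /* deepest element nesting this sketch tracks */
#define MAX_NAME  64    /* longest element name kept (longer names are truncated) */

/* Advance past the terminator `end`, counting newlines into *line.
 * Returns a pointer to the terminating NUL if `end` never occurs. */
static const char *skip_past(const char *p, const char *end, int *line)
{
    size_t n = strlen(end);
    while (*p && strncmp(p, end, n) != 0) {
        if (*p == '\n') (*line)++;
        p++;
    }
    return *p ? p + n : p;
}

/* Returns 0 if all start and end tags pair up, otherwise the 1-based
 * line on which the first offending tag ends (or the last line of the
 * input if elements are still open at the end). */
int first_tag_mismatch(const char *xml)
{
    char stack[MAX_DEPTH][MAX_NAME];   /* names of currently open elements */
    int depth = 0, line = 1;
    const char *p = xml;

    assert(xml != NULL);
    while (*p) {
        if (*p == '\n') { line++; p++; continue; }
        if (*p != '<') { p++; continue; }

        /* Comments, CDATA, processing instructions and declarations hold no element tags. */
        if (strncmp(p, "<!--", 4) == 0) { p = skip_past(p + 4, "-->", &line); continue; }
        if (strncmp(p, "<![CDATA[", 9) == 0) { p = skip_past(p + 9, "]]>", &line); continue; }
        if (p[1] == '?') { p = skip_past(p + 2, "?>", &line); continue; }
        if (p[1] == '!') { p = skip_past(p + 2, ">", &line); continue; }

        int closing = (p[1] == '/');
        p += closing ? 2 : 1;

        /* Read the element name. */
        char name[MAX_NAME];
        size_t len = 0;
        while (*p && !strchr(" \t\r\n/>", *p)) {
            if (len < MAX_NAME - 1) name[len++] = *p;
            p++;
        }
        name[len] = '\0';

        /* Scan to the closing '>', ignoring any '>' inside quoted attribute values. */
        char quote = 0, prev = 0;
        while (*p && (quote || *p != '>')) {
            if (*p == '\n') line++;
            if (quote) { if (*p == quote) quote = 0; }
            else if (*p == '"' || *p == '\'') quote = *p;
            prev = *p;
            p++;
        }
        if (*p != '>') return line;        /* input ended inside a tag */
        p++;

        if (closing) {
            if (depth == 0 || strcmp(stack[depth - 1], name) != 0)
                return line;               /* end tag does not match the open element */
            depth--;
        } else if (prev != '/') {          /* not a self-closing <x/> */
            if (depth == MAX_DEPTH) return line;
            strcpy(stack[depth++], name);
        }
    }
    return depth == 0 ? 0 : line;
}
```

For example, first_tag_mismatch("<a>\n<b>text\n</a>") returns 3. The sketch can say where the mismatch shows up, but not whether a </b> is missing before the </a> or the </a> itself is a typo for </b>, which is exactly the kind of case where an automatic fix would have to guess.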
Attachment:
wolfgang_laun.vcf