Hi Liam, I misdiagnosed the problem. The problem actually seems to be that the XML file I am parsing has a file entity whose path contains a Unicode character that needs to be escaped.
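A minimal sketch of what escaping the high-order bytes of such a path could look like, written in C since the application uses libxml2. The helper name escape_high_bytes is hypothetical and is not part of libxml2. The sketch assumes the path is encoded as UTF-8 and that the parser would accept a percent-escaped system identifier, which is an assumption rather than something confirmed here.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a libxml2 function): returns a newly allocated
 * copy of a UTF-8 path in which every byte outside 7-bit ASCII is replaced
 * by its %XX escape. ASCII bytes, including '/' and '.', are copied
 * unchanged. The caller frees the result. Returns NULL if allocation fails. */
char *escape_high_bytes(const char *path)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t len = strlen(path);
    /* Worst case: every byte expands to three characters, plus the NUL. */
    char *out = malloc(len * 3 + 1);
    char *dst = out;

    if (out == NULL)
        return NULL;

    for (const unsigned char *src = (const unsigned char *)path; *src; ++src) {
        if (*src >= 0x80) {
            /* Non-ASCII byte: emit '%' followed by two hex digits. */
            *dst++ = '%';
            *dst++ = hex[*src >> 4];
            *dst++ = hex[*src & 0x0F];
        } else {
            *dst++ = (char)*src;
        }
    }
    *dst = '\0';
    return out;
}
```

With UTF-8 input, the pound sign (U+00A3, bytes C2 A3) in "./uc£_html_files/image-000-chapter.xfrag" would come out as "%C2%A3", giving "./uc%C2%A3_html_files/image-000-chapter.xfrag". Only non-ASCII bytes are touched here; reserved ASCII characters such as spaces or '%' would need separate handling.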
Here is the XML I am trying to parse:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" "W:/matlab/sys/namespace/docbook/v4/dtd/docbookx.dtd" [
<!ENTITY sect-002 SYSTEM "./uc£_html_files/image-000-chapter.xfrag">
]>
<book lang="en">
<?dbhtml filename="uc£.html"?>
<bookinfo><title></title><subtitle></subtitle><pubdate>31-Jul-2022 11:08:41</pubdate></bookinfo>&sect-002;</book>

Here is the error returned by the parser:

"Entity 'sect-002' failed to parse\n"

The parser escapes high-order characters in the URL for the main XML file but apparently does not do the same for file entities declared in the DTD.

I am currently trying to convert a Xerces-c/Xalan-c application to libxml2/libxslt because Xalan-c is unable to execute the DocBook FO stylesheet. My Xerces-c implementation uses a custom entity resolver to resolve file entities, and I might need one to fix the problem in the libxml2 implementation as well. However, libxml2 does not seem to support custom entity resolvers. At least, I have not been able to find this feature in the documentation or in the libxml2 code base on GitHub.

I would appreciate any help you can give in finding a solution.

Regards,
Paul

From: Liam R E Quin <liam holoweb net>
On Sat, 2022-07-30 at 17:15 +0000, Paul Kinnucan via xml wrote: