[xslt] libxslt adding unwanted attributes



I'm new to libxml2 and libxslt, having previously made extensive use of
the MSXML2 utilities in a Windows environment. So while I am experienced
with XML and XSLT, I am pretty green with these two libraries.


What I'm attempting to do is manage a website by using XML tools to
standardize web pages and to prepare a site map. In Windows, I used the
DOM to open the XHTML files, check for and add CSS references and
reconcile the internal links between pages in the website. I started by
trying to do the same things with libxml2, but, unfortunately, libxml2
was frustrating me by either (in HTML mode) not properly closing the
empty elements like <link> and <meta>, or (in XML mode) closing the
empty elements but adding redundant xmlns attributes to the <html>
element as well as redundant <?xml ?> and <!DOCTYPE> declarations.

After I gave up on libxml2 for this task, I decided that I could do the
same thing with a stylesheet. The only remaining problem there is that
libxslt is adding unwanted attributes to my <a> elements.  In one web
page, I am using intra-page links to named <a> elements. When I run the
XHTML file through libxslt or xsltproc, those same <a> elements have the
name attribute and an extra id attribute with the same value. I wouldn't
really care about this except that the next time I try to parse the file
for updates, the parser chokes, in an unrecoverable way, on what it
calls redundant attributes.
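
For illustration, a post-processing workaround could look like the
sketch below. It assumes the serialized output is available as a string,
that attribute values are double-quoted, and that the unwanted id always
has the same value as name. The function name strip_duplicate_id is
hypothetical and is not part of libxml2 or libxslt; it uses only the
Python standard library.

```python
import re

# Matches one <a ...> start tag; group 1 is the raw attribute text.
A_TAG = re.compile(r'<a\b([^<>]*)>')

def strip_duplicate_id(markup):
    """Drop id="X" from <a> start tags that also carry name="X"."""
    def fix(match):
        attrs = match.group(1)
        name = re.search(r'\sname="([^"]*)"', attrs)
        if name is None:
            # No name attribute, so leave the tag untouched.
            return match.group(0)
        # Remove only an id whose value equals the name value.
        dup = r'\sid="%s"' % re.escape(name.group(1))
        return '<a%s>' % re.sub(dup, '', attrs)
    return A_TAG.sub(fix, markup)

sample = '<p><a name="bogusName" id="bogusName">named link</a></p>'
print(strip_duplicate_id(sample))
```

An id that differs from name is left alone, as are <a> elements that
have no name attribute at all.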

I am using the Python bindings (libxml2.py, libxslt.py). I would be
happy with a workaround. I tried to work with the result object of the
transformation, which seems to be a somewhat crippled xmlDoc object. I
tried xpathEval() to find the broken <a> tags, but it returns no nodes.
I can get to the <html> node with result.children, but I can't get it
with result.xpathEval("/html").
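
One possible explanation for the empty XPath result, offered as an
assumption rather than a confirmed diagnosis, is namespaces. The
stylesheet below declares xmlns="http://www.w3.org/1999/xhtml", so the
result elements are in the XHTML namespace, and in XPath 1.0 an
unprefixed name test such as /html only matches elements in no
namespace. The sketch below shows the effect with the standard-library
xml.etree.ElementTree module instead of the libxml2 bindings; the prefix
"h" and the inline sample document are illustrative choices.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Stand-in for the transformation result: every element is in the
# XHTML namespace because of the default xmlns declaration.
doc = ET.fromstring(
    '<html xmlns="%s"><body><p>'
    '<a name="bogusName">named link</a>'
    '</p></body></html>' % XHTML_NS
)

# An unprefixed name only matches elements in no namespace,
# so this search finds nothing.
unqualified = doc.findall(".//a")

# Binding a prefix to the namespace makes the same search succeed.
qualified = doc.findall(".//h:a", {"h": XHTML_NS})

print(len(unqualified), len(qualified))
```

If this is the cause, the equivalent step with the libxml2 bindings
would presumably be to bind a prefix to the XHTML namespace on an XPath
context and then evaluate a prefixed expression such as /h:html.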


I can reproduce the problem with the two simple files listed below. Note
that the last two templates in the xsl file are the special case for
processing the <a> element, where the problem occurs. Run the example
with the following command:

xxx$ xsltproc --html demo.xsl demo.htm

I would appreciate any advice.
Thanks,
Chuck Jungmann


demo.htm:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
	<head>
		<meta http-equiv="Content-Type"
			content="text/html; charset=ISO-8859-1" />
		<title>Demo libxslt Error</title>
	</head>
	<body>
		<h1>Demo libxslt Error</h1>
		<p>
			Here's a <a href="bogusURL.htm">bogus link</a>
			with href attribute that should be changed
		</p>
		<p>
			Here's a <a name="bogusName">named link</a>
			that is improperly copied.
		</p>
	</body>
</html>

demo.xsl:

<?xml version="1.0"?>
<xsl:stylesheet
	version="1.0"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
	xmlns="http://www.w3.org/1999/xhtml"
	xmlns:html="http://www.w3.org/1999/xhtml"
	exclude-result-prefixes="html">

	<xsl:output method="xml"
		doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
		doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
		version="1.0"
		indent="yes"
		omit-xml-declaration="yes"
		encoding="ISO-8859-1"/>

	<xsl:template match="/">
		<xsl:apply-templates select="*" />
	</xsl:template>

	<xsl:template match="html">
		<html>
			<xsl:apply-templates
				select="child::node()"
				mode="copy" />
		</html>
	</xsl:template>

	<xsl:template match="node()" mode="copy">
		<xsl:choose>
			<xsl:when test="self::*">
				<xsl:element name="{name()}">
					<xsl:apply-templates select="@*"
						mode="copy" />
					<xsl:apply-templates
						select="child::node()"
						mode="copy" />
				</xsl:element>
			</xsl:when>
			<xsl:when test="self::text()">
				<xsl:value-of select="." />
			</xsl:when>
			<xsl:when test="self::comment()">
				<xsl:comment>
					<xsl:value-of select="." />
				</xsl:comment>
			</xsl:when>
			<xsl:when test="self::processing-instruction()">
				<xsl:processing-instruction
					name="{name()}">
					<xsl:value-of select="." />
				</xsl:processing-instruction>
			</xsl:when>
		</xsl:choose>
	</xsl:template>

	<xsl:template match="@*" mode="copy">
		<xsl:attribute name="{name()}">
			<xsl:value-of select="." />
		</xsl:attribute>
	</xsl:template>

	<xsl:template match="a" mode="copy">
		<xsl:element name="a">
			<xsl:apply-templates
				select="@*[not(name()='href')]"
				mode="copy" />
			<xsl:apply-templates
				select="@href"
				mode="copy" />
			<xsl:apply-templates
				select="child::node()"
				mode="copy" />
		</xsl:element>
	</xsl:template>

	<xsl:template match="a/@href" mode="copy">
		<xsl:attribute name="href">
			<xsl:value-of
				select="concat('boguspath/', .)" />
		</xsl:attribute>
	</xsl:template>

</xsl:stylesheet>




