[xslt] Re: question for the libxsl experts



*** NOTE: I've taken the liberty of moving this discussion onto the libxslt
mailing list. AFAIK, Todd isn't on it, so please CC him on replies.

On Thu, 20 Sep 2001 tlewis@mindspring.com wrote:

> Fellas,
>
> libxml continues, unsurprisingly, to surprise me with its quality.
> My Network Intrusion Detection System (NIDS) hank is using xml (and
> libxml) for its rule format.  I have this week been reading up on xslt
> and am using a stylesheet to take a generic ruleset, which specifies the
> signature characteristics of various attacks, and transform it according
> to a site policy into a working rule file, with such local details as
> to whom reports should be sent, if matching packets should be dropped
> by the OS's firewall (if available), etc.  I don't know anyone else
> who can do stuff like this, and I owe it all to Xmlsoft.  8^)
>
> This part is going well.  However, in the formulation of the reports, I
> have a question to which I fear wasting a good deal of time finding the
> right answer, so I thought that I would see if I could cheat by asking here.
> I asked it in a more primitive form last month and got this answer
> from Ignacio:
>
> On Sat, 18 Aug 2001, Ignacio Vazquez-Abrams wrote:
>
> > Have you looked at XSLT? You're basically translating one type of XML document
> > into another, even if it's as simple as replacing elements with text. XSLT
> > excels at manipulating XML.
> >
> > You could even change your original C code to output an XSLT stylesheet
> > instead of doing search-and-replace on an xmlDoc, then use libxslt to apply
> > the stylesheet to any report you choose, copying any elements it doesn't know
> > about, and replacing ones it does know about with the appropriate value.
>
> This was excellent advice.  So now I have looked at and used XSLT, and
> it certainly is, generally, the tool that I want to use to solve this
> problem.  My question comes in the specifics of how I use it.  I will
> have a set of zero or more report snippets which will be populated with
> actual values from the network traffic, like this:
>
> <SNIPPET>
> ILLEGAL DNS PACKET: opcode is <REPORT proto="dns" field="opcode"/>.
> </SNIPPET>
>
> '<REPORT proto="dns" field="opcode"/>' will be replaced with something
> poetic like '4'; that code I have ready to go once I know where to plug
> it in.
>
> I then want to take these (zero or more) snippets and stick them in a
> report, like this:
>
> <report>
> 	Aiee!  Major attack detected!  Packet matches following attacks:
> 	<snippet-all/>
> </report>
>
> So, I have three constituent elements: the snippet, the dynamic values
> used to populate the snippet, and then the report template, which is
> combined with the post-processed snippets to generate the final report.
>
> Now, I have an article from IBM that explains rather well how to use
> XSLTs to generate themselves, and I'm sure that I could do that given
> enough effort, but from the libxslt documentation, composing (potentially
> several) new stylesheets for each report sounds expensive.  I really
> don't care which part is the document and which is the stylesheet;
> judging from the literature that I have read thus far, this is a common
> ambivalence.  So, not caring either way, I ask:
>
> 1) Which parts should be stylesheet and which should be document?
>
> I have thought about just generating a document with the protocol values
> and transforming the hell out of that with one or several stylesheets,
> or generating a dynamic stylesheet with the protocol values and using the
> snippet as the document, or having a series of stylesheets.  I really
> don't know why one would adopt one approach over another, and so I am
> seriously floundering at this point.

The approach I would suggest is to put the protocol values in the stylesheet
and use the snippet document as the input it is applied to:

---
 ...
<!-- This rule isn't required but I like to code it explicitly regardless -->
<xsl:template match="/">
  <xsl:apply-templates />
</xsl:template>

<xsl:template match="SNIPPET">
  <xsl:copy>
    <xsl:apply-templates />
  </xsl:copy>
</xsl:template>

<xsl:template match="REPORT[@proto='dns' and @field='opcode']">
  4
</xsl:template>

<xsl:template match="REPORT[@proto='nis' and @field='opcode']">
  5
</xsl:template>

<xsl:template match="REPORT[@proto='smtp' and @field='opcode']">
  6
</xsl:template>
 ...
---

Then have each report be a stylesheet that is applied to the transformed
document:

---
 ...
<xsl:template match="/">
  <report>
    Aiee!  Major attack detected!  Packet matches following attacks:
    <xsl:apply-templates />
  </report>
</xsl:template>

<xsl:template match="SNIPPET">
   ...
</xsl:template>
 ...
---

> 2) Are there any tricks recommended to get the final document produced
> with a minimum of cost?
>
> I'm pretty sure that generating several new stylesheets for each report I
> want to generate would be expensive, but I don't know that to be the case.

Each report is already an XML document. All you have to do is turn each
report into a stylesheet; you will then have n+1 stylesheets (one per
report, plus the protocol stylesheet) and m XML files to be transformed
(the snippet files).

> 3) Code-wise, how does one shuffle all of these xml documents around in
> memory and combine them?  Can I just re-parent the DOM tree in memory
> underneath where it needs to go in the new document?  Even if I come
> up with an XML strategy (this stylesheet applied to that document,
> resulting subtree stuck here...) I really have no idea how to perform
> cross-document integration like this using libxml when it comes to the
> actual code.

---
 ...
xmlDocPtr protoDoc=NULL, snippetDoc=NULL, reportDoc=NULL;
xsltStylesheetPtr protoSheet=NULL, reportSheet=NULL;
xmlDocPtr midresult=NULL, finalresult=NULL;

protoDoc=xmlParse*(...);
snippetDoc=xmlParse*(...);
reportDoc=xmlParse*(...);

protoSheet=xsltParseStylesheetDoc(protoDoc);
reportSheet=xsltParseStylesheetDoc(reportDoc);

midresult=xsltApplyStylesheet(protoSheet, snippetDoc, NULL);
finalresult=xsltApplyStylesheet(reportSheet, midresult, NULL);

xmlDocDump*(finalresult, ...);

cleanup();
 ...
---

> 4) Is there anyone else out there who is dealing with dynamic data to
> compose reports like this who can offer advice, or software that does
> this that I could review?
>
> If this were something that I had been doing for a while, then I would
> just start playing with it and see where I end up, but I am still new
> enough to the entire XML universe that I am afraid of doing something
> really performance-killing without even knowing it.
>
> Any and all advice is welcome.  If this is just something that I should
> experiment with and stop bothering people about, then my feelings won't
> be hurt if people say so.

I would definitely suggest experimenting, even if for nothing more than
getting comfortable with libxml and libxslt.

-- 
Ignacio Vazquez-Abrams  <ignacio@openservices.net>









