XSLT performance when mapping large documents in BizTalk
Recently I had to map a document with many thousands of rows. I could not split the document, because its nodes had to be sorted before any split was possible.
With files this large you generally test with a small subset to avoid waiting for maps to complete. I built an XSLT that worked great, or so I thought.
When you use a select filter such as "not(keyValue=preceding-sibling::row/keyValue)", every row is compared against all the rows before it, so the performance hit grows sharply as the document gets larger. My map went from 2 seconds for 50 rows to 10 minutes for a few thousand.
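For illustration, the slow pattern might look like the following in a stylesheet. This is a minimal sketch, not the original map: it assumes unqualified row elements directly under the document element, each with a keyValue child, and the Rows and Data output names are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="no" />
  <xsl:template match="/">
    <Rows>
      <!-- Keeps a row only if no earlier sibling row has the same keyValue.
           The predicate walks the preceding-sibling axis for every row, so
           the number of comparisons grows with the square of the row count
           unless the processor optimizes it. -->
      <xsl:for-each select="/*/row[not(keyValue = preceding-sibling::row/keyValue)]">
        <Data>
          <keyValue>
            <xsl:value-of select="keyValue" />
          </keyValue>
        </Data>
      </xsl:for-each>
    </Rows>
  </xsl:template>
</xsl:stylesheet>
```

On a 50-row test file the repeated scans are barely noticeable, which is why the problem only shows up with the full document.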
How do you improve performance when you have large XML files to map that you can’t split? Try using xsl:key instead, which builds an index of keys that you can select from much more efficiently.
Here is a sample XSLT that demonstrates how to use xsl:key:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:ns0="http://Conversion.schemas">
  <xsl:output method="xml" indent="no" />
  <!-- Index every row by its keyValue so lookups do not rescan the document -->
  <xsl:key name="NumberKey"
           match="/*[local-name()='top' and namespace-uri()='http://biztalk/Conversion.schemas']/*[local-name()='row' and namespace-uri()='']"
           use="keyValue" />
  <xsl:template match="/">
    <ns0:Rows>
      <!-- Select only the first row of each keyValue group -->
      <xsl:for-each select="/*[local-name()='top' and namespace-uri()='http://biztalk/Conversion.schemas']/*[local-name()='row' and namespace-uri()='' and generate-id(.) = generate-id(key('NumberKey', keyValue)[1])]">
        <xsl:variable name="current_Number" select="keyValue" />
        <Data>
          <keyValue>
            <xsl:value-of select="$current_Number" />
          </keyValue>
          <!-- Fetch the rows of this group from the key index rather than scanning with //row -->
          <xsl:for-each select="key('NumberKey', $current_Number)">
            <Part>
              <PartID>
                <xsl:value-of select="nr_data" />
              </PartID>
            </Part>
          </xsl:for-each>
        </Data>
      </xsl:for-each>
    </ns0:Rows>
  </xsl:template>
</xsl:stylesheet>
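To make the sample concrete, here is a hypothetical input document that fits the match patterns in the stylesheet: a top element in the http://biztalk/Conversion.schemas namespace with unqualified row children. The values A, B, 1, 2 and 3 are invented for illustration and do not come from the original data.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<ns0:top xmlns:ns0="http://biztalk/Conversion.schemas">
  <!-- Two rows share keyValue A, one row has keyValue B -->
  <row>
    <keyValue>A</keyValue>
    <nr_data>1</nr_data>
  </row>
  <row>
    <keyValue>B</keyValue>
    <nr_data>2</nr_data>
  </row>
  <row>
    <keyValue>A</keyValue>
    <nr_data>3</nr_data>
  </row>
</ns0:top>
```

Against this input the stylesheet should emit one Data element per distinct keyValue, in order of first appearance: a Data element for A containing Part entries with PartID 1 and 3, followed by a Data element for B containing a Part with PartID 2.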
1 comment:
Could you elaborate a little more?
What does your input look like, and what does the output look like?