Wednesday, February 08, 2006

XSLT performance when mapping large documents in BizTalk

Recently I had to map a document with many thousand rows. I could not split the document because before I could split it, the document’s nodes had to be sorted.

With such large files you generally test it using a small subset to avoid waiting for maps to complete, I built an XSLT which worked great, I thought.

When you use a select filter such as "not(KeyValue=preceding-sibling::row/ KeyValue)" you end up with a huge performance hit the larger the document gets. My map went from 2 seconds for 50 rows to 10 minutes for a few thousand.

How to improve performance when you have large XML files to map that you can’t split? Try using xsl:key instead, which builds an index of keys from which you can much more efficiently select.

Here is a sample XSLT that demonstrates how to use the xsl:key:

<?xml version="1.0" encoding="UTF-8" ?>

<xsl:stylesheet xmlns:xsl="" version="1.0" xmlns:ns0="http://Conversion.schemas">

    <xsl:output method="xml" indent="no" />

    <xsl:key name="NumberKey" match="/*[local-name()='top' and namespace-uri()='http://biztalk/Conversion.schemas']/*[local-name()='row' and namespace-uri()='']"

        use="keyValue" />

    <xsl:template match="/">


            <xsl:for-each select="/*[local-name()='top' and namespace-uri()='http://biztalk/Conversion.schemas']/*[local-name()='row' and namespace-uri()='' and generate-id(.) = generate-id(key('NumberKey', keyValue)[1])]">

                <xsl:variable name="current_Number" select="keyValue" />



                        <xsl:value-of select="$current_Number" />


                    <xsl:for-each select="//row[keyValue=$current_Number]">



                                <xsl:value-of select="nr_data" />









1 comment:

Anonymous said...

Could you elaborate a little bit more....

How does your input looks like, and how does the output look like.