Flat file disassembler difference between BizTalk 2004 and 2006

We are currently upgrading a medium-to-large solution from BizTalk 2004 to BizTalk 2006 R2 for a client. We have run into a couple of interesting and unexpected issues, which I’ll describe in this post and the next. The first “issue” is that the flat file disassembler in BizTalk 2006 and 2006 R2 now complains when there is unaccounted-for data (data not described in the schema) at the end of a flat file. I put quotes around “issue” because it isn’t really one; the disassembler is doing what it should. It’s BizTalk 2004 that lets the extra data slip through.

Many changes have been made to the flat file disassembler pipeline component that ships with BizTalk 2006 and BizTalk 2006 R2 compared to its BizTalk 2004 version. Most of them relate to the recoverable interchange feature.

While everything was working fine in BizTalk 2004, we kept getting the following error message from the flat file disassembler component in BizTalk 2006:

“The remaining stream has unrecognizable data”

So, let’s have a closer look at what we have here. Let’s say we have the following flat file schema:

Flat file schema

Record1 has a max occurs of unbounded and a min occurs of 0, and Record2 has a max occurs of 1 and a min occurs of 0. All four elements are restricted to a max length of 8. The default child delimiter is ; and both Record1 and Record2 use it. The default child order is Postfix, but the child order of Record1 and Record2 is Infix. The tag name of Record1 is “Record1;” and the tag name of Record2 is “Record2;”. Here’s the schema code:

<?xml version="1.0" encoding="utf-16"?>
<xs:schema xmlns="http://TestBizTalkProject.TestSchema" xmlns:b="http://schemas.microsoft.com/BizTalk/2003" targetNamespace="http://TestBizTalkProject.TestSchema" xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:annotation>
        <xs:appinfo>
            <b:schemaInfo count_positions_by_byte="false" parser_optimization="complexity" lookahead_depth="3" suppress_empty_nodes="false" generate_empty_nodes="true" allow_early_termination="false" standard="Flat File" root_reference="Root" child_delimiter_type="char" default_child_delimiter=";" default_child_order="postfix" />
            <schemaEditorExtension:schemaInfo namespaceAlias="b" extensionClass="Microsoft.BizTalk.FlatFileExtension.FlatFileExtension" standardName="Flat File" xmlns:schemaEditorExtension="http://schemas.microsoft.com/BizTalk/2003/SchemaEditorExtensions" />
        </xs:appinfo>
    </xs:annotation>
    <xs:element name="Root">
        <xs:annotation>
            <xs:appinfo>
                <b:recordInfo structure="delimited" preserve_delimiter_for_empty_data="true" suppress_trailing_delimiters="false" sequence_number="1" child_order="postfix" child_delimiter_type="default" />
            </xs:appinfo>
        </xs:annotation>
        <xs:complexType>
            <xs:sequence>
                <xs:annotation>
                    <xs:appinfo>
                        <b:groupInfo sequence_number="0" />
                    </xs:appinfo>
                </xs:annotation>
                <xs:element minOccurs="0" maxOccurs="unbounded" name="Record1">
                    <xs:annotation>
                        <xs:appinfo>
                            <b:recordInfo sequence_number="1" structure="delimited" preserve_delimiter_for_empty_data="true" suppress_trailing_delimiters="false" tag_name="Record1;" child_delimiter_type="default" child_order="infix" />
                        </xs:appinfo>
                    </xs:annotation>
                    <xs:complexType>
                        <xs:sequence>
                            <xs:annotation>
                                <xs:appinfo>
                                    <b:groupInfo sequence_number="0" />
                                </xs:appinfo>
                            </xs:annotation>
                            <xs:element name="Element1">
                                <xs:annotation>
                                    <xs:appinfo>
                                        <b:fieldInfo sequence_number="1" justification="left" />
                                    </xs:appinfo>
                                </xs:annotation>
                                <xs:simpleType>
                                    <xs:restriction base="xs:string">
                                        <xs:maxLength value="8" />
                                    </xs:restriction>
                                </xs:simpleType>
                            </xs:element>
                            <xs:element name="Element2">
                                <xs:annotation>
                                    <xs:appinfo>
                                        <b:fieldInfo sequence_number="2" justification="left" />
                                    </xs:appinfo>
                                </xs:annotation>
                                <xs:simpleType>
                                    <xs:restriction base="xs:string">
                                        <xs:maxLength value="8" />
                                    </xs:restriction>
                                </xs:simpleType>
                            </xs:element>
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>
                <xs:element minOccurs="0" maxOccurs="1" name="Record2" nillable="true">
                    <xs:annotation>
                        <xs:appinfo>
                            <b:recordInfo sequence_number="2" structure="delimited" preserve_delimiter_for_empty_data="true" suppress_trailing_delimiters="false" child_order="infix" child_delimiter_type="default" tag_name="Record2;" />
                        </xs:appinfo>
                    </xs:annotation>
                    <xs:complexType>
                        <xs:sequence>
                            <xs:annotation>
                                <xs:appinfo>
                                    <b:groupInfo sequence_number="0" />
                                </xs:appinfo>
                            </xs:annotation>
                            <xs:element name="Element3">
                                <xs:annotation>
                                    <xs:appinfo>
                                        <b:fieldInfo sequence_number="1" justification="left" />
                                    </xs:appinfo>
                                </xs:annotation>
                                <xs:simpleType>
                                    <xs:restriction base="xs:string">
                                        <xs:maxLength value="8" />
                                    </xs:restriction>
                                </xs:simpleType>
                            </xs:element>
                            <xs:element name="Element4">
                                <xs:annotation>
                                    <xs:appinfo>
                                        <b:fieldInfo sequence_number="2" justification="left" />
                                    </xs:appinfo>
                                </xs:annotation>
                                <xs:simpleType>
                                    <xs:restriction base="xs:string">
                                        <xs:maxLength value="8" />
                                    </xs:restriction>
                                </xs:simpleType>
                            </xs:element>
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

Now let’s say we have a text file with the following data:

Record1;Element1;Element2;Record2;Element3;Element4;

Copy the schema file (say Test.xsd) and the text file above (say Test.txt) to the PipelineTools folder in the BizTalk SDK. Run the following in the Visual Studio command prompt:

FFDasm.exe Test.txt -bs Test.xsd

This command tests the flat file disassembler, and it succeeds in both versions of BizTalk. Now if we change the data to:

Record1;Element1;Element2;Record2;Element3;Element4;gibberish

and test again, you will see that BizTalk 2004 still processes the message with no errors and disregards the gibberish at the end of the file, whereas BizTalk 2006 throws the “The remaining stream has unrecognizable data” error.
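To make it easier to see where the unaccounted-for data begins, here is a toy Python sketch that walks the sample data using the tags and delimiters described above. It is only an illustration: `split_known_records` and `TAGS` are hypothetical names, the sketch is not the BizTalk parser, and it ignores the occurrence limits and the max length of 8.

```python
# Illustrative sketch only: a toy tokenizer for the sample layout,
# not the BizTalk flat file parsing engine.
TAGS = ("Record1;", "Record2;")  # tag names taken from the schema above


def split_known_records(data):
    """Consume records shaped like '<tag>field;field;' from the start of data.

    Returns (records, leftover). Whatever follows the last recognised
    record is returned untouched as leftover.
    """
    records = []
    pos = 0
    while True:
        # Find a tag that matches at the current position, if any.
        tag = next((t for t in TAGS if data.startswith(t, pos)), None)
        if tag is None:
            break
        body_start = pos + len(tag)
        # Two fields: an infix ';' between them, then the Root's postfix ';'.
        first_end = data.find(";", body_start)
        second_end = data.find(";", first_end + 1) if first_end != -1 else -1
        if second_end == -1:
            break
        fields = [data[body_start:first_end], data[first_end + 1:second_end]]
        records.append((tag.rstrip(";"), fields))
        pos = second_end + 1
    return records, data[pos:]


if __name__ == "__main__":
    sample = "Record1;Element1;Element2;Record2;Element3;Element4;gibberish"
    records, leftover = split_known_records(sample)
    print(records)
    print(repr(leftover))
```

In this sketch the first data file leaves an empty leftover, while the second leaves the trailing gibberish, which is the part the two BizTalk versions treat differently.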

This difference in behaviour comes from the “GetNext2” method of the FFDasmComp class inside Microsoft.BizTalk.Pipeline.Components.dll. In BizTalk 2004 the error is only thrown when a trailer schema is supplied and the extra data can’t be probed against it; if no trailer schema is supplied, any extra data is disregarded. In BizTalk 2006 the code has an extra if statement that checks for leftover data, so the extra data must be accounted for.
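The difference described above can be sketched in Python as follows. This is an assumption-laden illustration of the behaviour as I understand it, not the actual code of GetNext2: `check_remaining`, `trailer_matches`, `strict` and `UnrecognizableDataError` are all hypothetical names.

```python
# Illustrative sketch only: models the described behaviour,
# not the real FFDasmComp.GetNext2 implementation.
class UnrecognizableDataError(Exception):
    """Stands in for 'The remaining stream has unrecognizable data'."""


def check_remaining(leftover, trailer_matches, strict):
    """Decide what to do with data left after the body has been parsed.

    leftover        -- text remaining after the last recognised record
    trailer_matches -- None when no trailer schema is configured, otherwise
                       a callable returning True if leftover fits the trailer
    strict          -- False mimics the BizTalk 2004 behaviour described
                       above, True mimics BizTalk 2006
    """
    if not leftover:
        return  # nothing left over, nothing to check
    if trailer_matches is None:
        if strict:
            # 2006-style: leftover data must be accounted for.
            raise UnrecognizableDataError("The remaining stream has unrecognizable data")
        return  # 2004-style: extra data is silently disregarded
    if not trailer_matches(leftover):
        # Both versions complain when a trailer schema exists but does not fit.
        raise UnrecognizableDataError("The remaining stream has unrecognizable data")


if __name__ == "__main__":
    check_remaining("gibberish", None, strict=False)            # accepted
    check_remaining("gibberish", lambda text: True, strict=True)  # accepted
    try:
        check_remaining("gibberish", None, strict=True)
    except UnrecognizableDataError as exc:
        print(exc)
```

Passing a callable that always returns True plays the role of a catch-all trailer schema: the leftover data is accounted for, so the strict check no longer fails.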

We only ran into the issue because the messages coming into our client’s BizTalk Server actually contain trailer data, and the original developers did not define that trailer in the schema or as a trailer schema (probably because BizTalk 2004 didn’t complain about it).

The fix in BizTalk 2006 is simple: create a catch-all flat file schema (with a record and a string element with no restrictions) and set it as the trailer schema in the flat file disassembler. This has already been described here and here.

So for example create the following schema:

Trailer flat file schema

<?xml version="1.0" encoding="utf-16"?>
<xs:schema xmlns="http://BizTalk_Server_Project1.Trailer" xmlns:b="http://schemas.microsoft.com/BizTalk/2003" targetNamespace="http://BizTalk_Server_Project1.Trailer" xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="Root">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Element1" type="xs:string" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

Say it’s called “Trailer.xsd”. Running the following then works correctly in BizTalk 2006:

FFDasm.exe Test.txt -bs Test.xsd -ts Trailer.xsd

Note that this problem is not picked up by validating the sample file against the schema inside Visual Studio, hence the use of FFDasm.exe.

10 thoughts on “Flat file disassembler difference between BizTalk 2004 and 2006”

Johan says:

April 18, 2008 at 8:33 pm

Thanks for writing this! I’ve been pulling my hair out over this for some time now, and then I stumbled upon your post which solved my problem right away!! 🙂
Thiago Almeida says:

April 19, 2008 at 11:49 am

No problem Johan, I’m glad it helped you!
Bobby says:

June 4, 2008 at 5:08 am

Nice post, and thanks for the explanation.
Nergock says:

June 10, 2008 at 12:29 pm

I have an issue where if I include the trailer schema, it solves the problem when I have an extra CrLf. However if there isn’t an extra CrLf, then I get the following error:

The trailer specification specified on the pipeline component properties does not contain an interchange trailer.

Strangely enough, it passes FFDasm but fails when it goes through the actual pipeline with the Flat File Disassembler. Am I doing something wrong here?
Thiago Almeida says:

June 10, 2008 at 12:46 pm

Nergock,

I think that by having the trailer specified in the pipeline it expects something to be there, so when you put a file through with no trailer it complains. What happens if you change the min occurs of the element on the trailer schema to 0?
Also, try using pipeline.exe to test your pipeline.
Dennis says:

October 22, 2009 at 12:55 am

Great post. This did the trick for us. Thanks for all your help.
Zeeshan says:

February 2, 2010 at 3:08 pm

Great post Thiago.. This is one of the very common scenarios with flat file messages (SWIFT included)..
Ritesh says:

November 16, 2010 at 7:45 am

Excellent post!!!!!
I have been struggling with this for a month. My flat file validates against the schema in Visual Studio, but when the same flat file is received through the flat file receive pipeline it gives me the infamous error – “The remaining stream has unrecognizable data”
Once I used this dummy trailer schema, it worked!! Thanks for sharing this piece of info. Greatly appreciated.