We are currently upgrading a medium-to-large solution from BizTalk 2004 to BizTalk 2006 R2 for a client. We have run into a couple of interesting and unexpected issues, which I’ll describe in this post and the next. The first “issue” is that the flat file disassembler in BizTalk 2006 and BizTalk 2006 R2 now complains when there is unaccounted data (data not described in the schema) at the end of a flat file. I put quotes around “issue” because it isn’t really one; the disassembler is doing what it should. It’s BizTalk 2004 that lets the extra data slip through.
The flat file disassembler pipeline component that ships with BizTalk 2006 and BizTalk 2006 R2 has changed a lot compared to its BizTalk 2004 version. Most of the changes relate to the recoverable interchange feature.
While everything was working fine in BizTalk 2004, we kept getting the following error message from the flat file disassembler component in BizTalk 2006:
“The remaining stream has unrecognizable data”
So, let’s have a closer look at what we have here. Let’s say we have a flat file schema with a Root record that contains two child records: Record1 (with Element1 and Element2) and Record2 (with Element3 and Element4).
Record1 has max occurs unbounded and min occurs of 0, and Record2 has max occurs of 1 and min occurs of 0. All four elements are restricted to a max length of 8. The default child delimiter is ; and both Record1 and Record2 use the default child delimiter. The default child order is Postfix, but the child order of Record1 and Record2 is Infix. The tag name of Record1 is “Record1;” and the tag name of Record2 is “Record2;”. Here’s the schema code:
<?xml version="1.0" encoding="utf-16"?>
<xs:schema xmlns="http://TestBizTalkProject.TestSchema" xmlns:b="http://schemas.microsoft.com/BizTalk/2003" targetNamespace="http://TestBizTalkProject.TestSchema" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:annotation>
    <xs:appinfo>
      <b:schemaInfo count_positions_by_byte="false" parser_optimization="complexity" lookahead_depth="3" suppress_empty_nodes="false" generate_empty_nodes="true" allow_early_termination="false" standard="Flat File" root_reference="Root" child_delimiter_type="char" default_child_delimiter=";" default_child_order="postfix" />
      <schemaEditorExtension:schemaInfo namespaceAlias="b" extensionClass="Microsoft.BizTalk.FlatFileExtension.FlatFileExtension" standardName="Flat File" xmlns:schemaEditorExtension="http://schemas.microsoft.com/BizTalk/2003/SchemaEditorExtensions" />
    </xs:appinfo>
  </xs:annotation>
  <xs:element name="Root">
    <xs:annotation>
      <xs:appinfo>
        <b:recordInfo structure="delimited" preserve_delimiter_for_empty_data="true" suppress_trailing_delimiters="false" sequence_number="1" child_order="postfix" child_delimiter_type="default" />
      </xs:appinfo>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:annotation>
          <xs:appinfo>
            <b:groupInfo sequence_number="0" />
          </xs:appinfo>
        </xs:annotation>
        <xs:element minOccurs="0" maxOccurs="unbounded" name="Record1">
          <xs:annotation>
            <xs:appinfo>
              <b:recordInfo sequence_number="1" structure="delimited" preserve_delimiter_for_empty_data="true" suppress_trailing_delimiters="false" tag_name="Record1;" child_delimiter_type="default" child_order="infix" />
            </xs:appinfo>
          </xs:annotation>
          <xs:complexType>
            <xs:sequence>
              <xs:annotation>
                <xs:appinfo>
                  <b:groupInfo sequence_number="0" />
                </xs:appinfo>
              </xs:annotation>
              <xs:element name="Element1">
                <xs:annotation>
                  <xs:appinfo>
                    <b:fieldInfo sequence_number="1" justification="left" />
                  </xs:appinfo>
                </xs:annotation>
                <xs:simpleType>
                  <xs:restriction base="xs:string">
                    <xs:maxLength value="8" />
                  </xs:restriction>
                </xs:simpleType>
              </xs:element>
              <xs:element name="Element2">
                <xs:annotation>
                  <xs:appinfo>
                    <b:fieldInfo sequence_number="2" justification="left" />
                  </xs:appinfo>
                </xs:annotation>
                <xs:simpleType>
                  <xs:restriction base="xs:string">
                    <xs:maxLength value="8" />
                  </xs:restriction>
                </xs:simpleType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element minOccurs="0" maxOccurs="1" name="Record2" nillable="true">
          <xs:annotation>
            <xs:appinfo>
              <b:recordInfo sequence_number="2" structure="delimited" preserve_delimiter_for_empty_data="true" suppress_trailing_delimiters="false" child_order="infix" child_delimiter_type="default" tag_name="Record2;" />
            </xs:appinfo>
          </xs:annotation>
          <xs:complexType>
            <xs:sequence>
              <xs:annotation>
                <xs:appinfo>
                  <b:groupInfo sequence_number="0" />
                </xs:appinfo>
              </xs:annotation>
              <xs:element name="Element3">
                <xs:annotation>
                  <xs:appinfo>
                    <b:fieldInfo sequence_number="1" justification="left" />
                  </xs:appinfo>
                </xs:annotation>
                <xs:simpleType>
                  <xs:restriction base="xs:string">
                    <xs:maxLength value="8" />
                  </xs:restriction>
                </xs:simpleType>
              </xs:element>
              <xs:element name="Element4">
                <xs:annotation>
                  <xs:appinfo>
                    <b:fieldInfo sequence_number="2" justification="left" />
                  </xs:appinfo>
                </xs:annotation>
                <xs:simpleType>
                  <xs:restriction base="xs:string">
                    <xs:maxLength value="8" />
                  </xs:restriction>
                </xs:simpleType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Now let’s say we have a text file with the following data:
Record1;Element1;Element2;Record2;Element3;Element4;
Copy the schema file (say Test.xsd) and the text file above (say Test.txt) to the PipelineTools folder in the BizTalk SDK. Then run the following from the Visual Studio command prompt:
FFDasm.exe Test.txt -bs Test.xsd
This runs the file through the flat file disassembler. It succeeds in both versions of BizTalk. Now if we change the data to:
Record1;Element1;Element2;Record2;Element3;Element4;gibberish
and test again, you will see that BizTalk 2004 still processes the message with no errors and disregards the gibberish at the end of the file, whereas BizTalk 2006 throws the “The remaining stream has unrecognizable data” error.
The difference in behaviour lies in the “GetNext2” method of the FFDasmComp class in Microsoft.BizTalk.Pipeline.Components.dll. In BizTalk 2004 the error is only thrown when a trailer schema is supplied and the extra data cannot be probed against that schema. If no trailer schema is supplied, any extra data is disregarded. In BizTalk 2006 the code has an additional if statement that checks for leftover data, so the extra data must be accounted for.
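To make the rule concrete, here is a toy model in Python. It is not the disassembler's code: the regular expression is only a rough stand-in for the Test.xsd body layout, and `UnrecognizableDataError`, `split_body`, `check_leftover`, `trailer_probe` and `strict` are all hypothetical names invented for this sketch. It also assumes that a supplied trailer schema which fails to match is an error in both versions, which the above only confirms for BizTalk 2004.

```python
import re

# Toy stand-in for the body layout in Test.xsd: any number of Record1
# records followed by at most one Record2, each field up to 8 characters.
BODY = re.compile(
    r"(?:Record1;[^;]{0,8};[^;]{0,8};)*"
    r"(?:Record2;[^;]{0,8};[^;]{0,8};)?"
)


class UnrecognizableDataError(Exception):
    """Stands in for the disassembler's error; the name is hypothetical."""


def split_body(data):
    """Return (body, leftover) for the toy layout above."""
    end = BODY.match(data).end()
    return data[:end], data[end:]


def check_leftover(leftover, trailer_probe=None, strict=True):
    """Apply the end-of-stream rule described in the post.

    trailer_probe: hypothetical callable standing in for a trailer schema;
                   returns True if it recognises the leftover data.
    strict:        False models BizTalk 2004, True models BizTalk 2006.
    """
    message = "The remaining stream has unrecognizable data"
    if trailer_probe is not None:
        # A trailer schema was supplied but cannot match the leftover data
        # (confirmed for 2004 in the post, assumed unchanged in 2006).
        if leftover and not trailer_probe(leftover):
            raise UnrecognizableDataError(message)
    elif strict and leftover:
        # The additional check: no trailer schema, but data is left over.
        raise UnrecognizableDataError(message)
```

With the gibberish sample, `split_body` leaves `gibberish` as the leftover. `check_leftover(leftover, strict=False)` returns quietly, while `check_leftover(leftover, strict=True)` raises unless a probe that accepts the leftover is passed in.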
We only ran into the issue because the messages coming into our client’s BizTalk Server really do contain trailer data, and the original developers did not define that trailer in the schema or as a trailer schema (probably because BizTalk 2004 didn’t complain about it).
The fix in BizTalk 2006 is simple: create a catch-all flat file schema (with a record and a string element with no restrictions) and set it as the trailer schema in the flat file disassembler. This approach has already been described by others.
For example, create the following schema:
<?xml version="1.0" encoding="utf-16"?>
<xs:schema xmlns="http://BizTalk_Server_Project1.Trailer" xmlns:b="http://schemas.microsoft.com/BizTalk/2003" targetNamespace="http://BizTalk_Server_Project1.Trailer" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Element1" type="xs:string" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Say it’s called “Trailer.xsd”. If you run the following, it will work correctly in BizTalk 2006:
FFDasm.exe Test.txt -bs Test.xsd -ts Trailer.xsd
Note that this problem is not picked up by validating the sample file against the schema inside Visual Studio, hence the use of FFDasm.exe.
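If you need to repeat these runs across many sample files, a small wrapper can help. The following is only a sketch: it assumes FFDasm.exe can be found from the working directory (for example by running from the PipelineTools folder), uses only the -bs and -ts switches shown above, and the function names are made up for illustration. How FFDasm.exe reports a failure is not covered in this post, so the wrapper just hands back the raw process result for you to inspect.

```python
import subprocess


def ffdasm_command(flat_file, body_schema, trailer_schema=None, exe="FFDasm.exe"):
    """Build the FFDasm command line used in this post as an argument list."""
    command = [exe, flat_file, "-bs", body_schema]
    if trailer_schema is not None:
        # Optional trailer schema, e.g. the catch-all Trailer.xsd.
        command += ["-ts", trailer_schema]
    return command


def run_ffdasm(flat_file, body_schema, trailer_schema=None, exe="FFDasm.exe"):
    """Run FFDasm and return the CompletedProcess without interpreting it."""
    command = ffdasm_command(flat_file, body_schema, trailer_schema, exe)
    return subprocess.run(command, capture_output=True, text=True)
```

For example, `run_ffdasm("Test.txt", "Test.xsd")` corresponds to the first command in this post and `run_ffdasm("Test.txt", "Test.xsd", "Trailer.xsd")` to the second.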