Sunday, June 7, 2015

SSIS: Excel Import 255 character Truncation issue and Solution

Few days ago we have created a SSIS package to load excel data in our SQL Server database. When we run the package for a sample excel file, it runs perfectly. But when we run the package for original excel file it fails.




To find out the reason of the error, first we investigate in the Progress tab of the SSIS package, why the package fails to run. We see the below error messages in the Progress tab. One of the error message says "failed because truncation occurred". 

So our first understanding about the error was, may be one of the columns of the excel file contains long string and cannot insert that value to the database. So we revised our table structure and find the below schema  for the table.

CREATE TABLE [dbo].[TurncationProblemTest](
       [Code] [nvarchar](100) NULL,
       [Description] [nvarchar](1200) NULL
) ON [PRIMARY]

GO

In this stage we see that our understanding was wrong because column length is maximum which is sufficient to store the value of the excel file.

After some more investigation we found the below explanation of the error in SSIS documentation.

"Truncated text:  When the driver determines that an Excel column contains text data, the driver selects the data type (string or memo) based on the longest value that it samples (The Excel driver reads a certain number of rows (by default, 8 rows) in the specified source to guess at the data type of each column). If the driver does not discover any values longer than 255 characters in the rows that it samples, it treats the column as a 255-character string column instead of a memo column. Therefore, values longer than 255 characters may be truncated." 

So that is the actual reason that cause the SSIS package fail to run, because our excel file contains longer string after 8 rows and the truncation error occurs.


Solutions

The documentation  proposed two solutions for resolving the error.

Solution1: To import data from a memo column without truncation, you must make sure that the memo column in at least one of the sampled rows (i.e. within first 8 rows of the excel sheet ) contains a value longer than 255 characters.

Solution2: You must increase the number of rows sampled by the driver to include such a row. You can increase the number of rows sampled by increasing the value of TypeGuessRows under the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel registry key.

Note: I can't find the registry key TypeGuessRows in the mentioned location. So I search for the key in the registry editor and found the key in the below location.

HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Excel. But change the registry key does not work perfectly.

Besides these two solution there are two more solutions. In the below section we try to apply these two solutions and see the result.


Solution 1( Not Recommended):
We can just ignore the error. For this case the package will not fail to run but it truncate the long string to 255 character long and insert the data to database. To ignore the error, right click the excel source and click the Show Advanced Editor option.


It will launches the Advanced Editor for Excel Source dialog box. In the dialog box we see Input and Output Properties tab. In this tab we can set the data type and length etc of the External Columns & Output Columns. In this dialog box we can set TruncationRowDisposition property to RD_IgnoreFailure. Follow the below image to set ignore failure.




Solution 2 (Recommended):
We can use Flat File Connection Manager instead of Excel Connection Manager. When we create Flat File Connection Manager we can set data type and length explicitly. To do so first we need to save the excel file as csv file or tab delimated file. Then we can use this file to create Flat File Connection. Drag and drop a Flat File Source in  the Data Flow tab. In the Flat File Source Editor dialog box click New button and it will launches  Flat File Connection Manager Editor dialog box. In the General tab specify the file full path and click Advanced tab. Then put data type and column width like below image.

Click OK and close the dialog box, this will create our connection manager. Now the connection manager can successfully read the full length data but we have to set the data type & length of the Output Columns so that we can get the data in the output pipeline. To do that right click on the Flat File Source and click Show Advanced Editor option. Then follow the below image instruction.



When we finish we run our package and it run successfully without any truncation error and insert all the data in our target database.

2 comments:

  1. Lo intente, pero no funciona, mi archivo excel tiene muchas hojas y cuando lo transformo a cvs este simplemente no pasa la informacion completa.

    ReplyDelete