Changing Directions

The other day, while I was discussing some of the challenges we were facing with Lysine with a co-worker, he made a statement that forced me to stop and think: "Don't be afraid of throwing everything away if you think another direction is better."

It got me thinking about our original intent of doing the EBCDIC to ASCII conversion inside a System.IO.Stream subclass. The thinking was that it would be reusable in other applications besides just being consumed by an SSIS component (e.g. we could wrap it around an FTP stream and convert the bytes before they even hit the disk).

The major problem we have been facing over the last couple of months is that forcing this solution into a Stream is like forcing a square peg into a round hole. The solution just doesn't fit the problem. With a Stream you are typically dealing with some type of input being converted or decorated on its way to becoming consumable output. It's a one-to-one flow, meaning that a single stream of input will yield a single stream of output.
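To make the single-input, single-output flow concrete, here is a minimal sketch of a converting stream wrapper. It is written in Python rather than as the .NET System.IO.Stream subclass discussed here, and it is not Lysine's code: the class name `EbcdicToAsciiReader` and the choice of code page `cp037` are assumptions made purely for illustration.

```python
import io


class EbcdicToAsciiReader(io.RawIOBase):
    """Hypothetical wrapper: converts EBCDIC bytes to ASCII as they are read."""

    def __init__(self, inner, codepage="cp037"):
        # 'inner' is any binary stream; cp037 is an assumed EBCDIC code page.
        self._inner = inner
        self._codepage = codepage

    def readable(self):
        return True

    def readinto(self, buffer):
        # Pull at most as many bytes as the caller's buffer can hold.
        chunk = self._inner.read(len(buffer))
        if not chunk:
            return 0  # end of the underlying stream
        # cp037 is a single-byte code page, so one input byte becomes exactly
        # one output byte; characters with no ASCII equivalent become '?'.
        converted = chunk.decode(self._codepage).encode("ascii", errors="replace")
        buffer[:len(converted)] = converted
        return len(converted)


ebcdic_bytes = "HELLO".encode("cp037")
ascii_bytes = EbcdicToAsciiReader(io.BytesIO(ebcdic_bytes)).read()
```

This works nicely as long as every input byte maps to one output byte. It says nothing about where one record ends and the next begins, which is where the trouble starts.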

However, part of the processing of EBCDIC data to ASCII involves dealing with a binary format that, in its own unique way, is part normalized and part denormalized data. To keep it on a single stream of output, we'd need to invent some encoding mechanism that lets us reliably read all the different rows and types of records that might be found within a single record/line of EBCDIC data. Whether that be some XML structure or some delimiting scheme, it involves quite a bit of extra processing and overhead just to get the data into an intermediate stage.
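As a rough illustration of why a single record does not map cleanly onto a single output stream, consider a made-up layout: a 10-byte name, a 2-digit count, and then that many 5-byte item codes. The layout, the function name `split_record`, and the `cp037` code page are all hypothetical and are not taken from any real file we process.

```python
def split_record(record, codepage="cp037"):
    """Split one hypothetical EBCDIC record into a parent row and child rows."""
    # Fixed header: 10-byte name followed by a 2-digit repeat count.
    name = record[0:10].decode(codepage).rstrip()
    count = int(record[10:12].decode(codepage))

    parent = {"name": name}
    children = []
    offset = 12
    for _ in range(count):
        # Each repeating group is a 5-byte item code belonging to the parent.
        item = record[offset:offset + 5].decode(codepage).rstrip()
        children.append({"name": name, "item": item})
        offset += 5
    return parent, children


sample = ("ACME".ljust(10) + "02" + "BOLT " + "NUT  ").encode("cp037")
parent_row, child_rows = split_record(sample)
```

One input record here produces rows for two different destinations. Pushing both through a single byte stream would mean inventing exactly the kind of XML or delimiter scheme described above, just so the consumer could tell them apart again.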

The intermediate stage would stay in memory, but it would still be an extra step. It's an idea that we haven't given up on and will table for a future release. Our primary goal right now is to get a usable SSIS component out the door by the end of December 2006. With this component the user should be able to define and process an EBCDIC file straight into a normalized database without having to write any code or scripts. To get there, we have made some fundamental shifts in how we are approaching this problem.

We will no longer be using the Stream. Instead, we will work directly with the data structure that represents the EBCDIC data, which will be more tightly integrated with the SSIS component. I finished up the code last night to get this working, and all that is left to do is to wire it up in our SSIS test component (a rudimentary version of what the releasable version will be) and test it. We'll spend the next 8 to 10 weeks working out any bugs found through this testing as well as building a usable UI for the SSIS component.

Tags: software, ebcdic, ssis, integration services, amino software