I'm trying to "componentize" an SSIS package with multiple Data Flows, and I can't figure out how to get the results of one Data Flow into another.
So, for example, at the end of one Data Flow, I have a Recordset destination, and I'm storing the results into a variable. But I can't figure out how to access the contents of that variable in the following Data Flow. Can this be done, or am I going about this the wrong way?|||The best approach is to use a Raw File destination and source.
You could use a script component to store the data in a variable however you want, and then use another script component to retrieve it; a rough sketch of the storing side is shown below.
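For illustration, one way the storing side might look, written as a Script Component acting as a destination (sketched in C#; in the SSIS 2005 script component you would write the equivalent in VB.NET). The variable name User::BufferedRows and the column names are hypothetical, not taken from the thread:

```csharp
// Sketch of a Script Component (destination) that buffers incoming pipeline rows
// in a DataTable and hands the table to an SSIS Object variable in PostExecute.
// Assumes the component lists User::BufferedRows as a read/write variable and the
// input carries CustomerId (int) and Name (string) columns (all hypothetical names).
using System.Data;

public class ScriptMain : UserComponent
{
    private DataTable table;

    public override void PreExecute()
    {
        base.PreExecute();
        table = new DataTable();
        table.Columns.Add("CustomerId", typeof(int));
        table.Columns.Add("Name", typeof(string));
    }

    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // Copy each pipeline row into the in-memory table.
        table.Rows.Add(Row.CustomerId, Row.Name);
    }

    public override void PostExecute()
    {
        base.PostExecute();
        // Read/write variables can only be assigned here, after all rows are processed.
        Variables.BufferedRows = table;
    }
}
```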
|||If you want to try the second option that Simon mentioned, there's some code here that might help you extract data from a variable:
http://blogs.conchango.com/jamiethomson/archive/2005/02/08/960.aspx
-Jamie|||
Jamie, thanks for the link, but it doesn't seem to apply in my situation, as I'm not sending the recordset to a Script Task, but rather a Script Component (as a source) in a Data Flow.
What I'm trying to do is break a package down into four or five separate Data Flows that I can enable/disable and route around as necessary--essentially creating a template package that I can apply to many different scenarios.
It seems like it should be a trivial thing to dump the data at the end of one Data Flow into a recordset and retrieve it at the beginning of another, but I can't figure out how to do it.
I suppose using raw files is one answer, but it seems like there has to be a more elegant solution, without writing files to disk.
|||John, use the Recordset destination to dump the data into an in-memory recordset. The code I linked to above can easily be used in a script source component to loop through that recordset and put the values into the pipeline.
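For illustration, a minimal sketch of what such a script source component might look like, along the lines of the code in the link above (sketched in C#; the SSIS 2005 script component would use VB.NET). The variable name User::ResultsRecordset and the output columns are hypothetical:

```csharp
// Sketch of a Script Component (source) that shreds the ADO recordset stored by a
// Recordset Destination in an Object variable and pushes its rows into the data flow.
// Assumes a read-only variable User::ResultsRecordset and an output "Output0" with
// CustomerId (int) and Name (string) columns (all hypothetical names).
using System.Data;
using System.Data.OleDb;

public class ScriptMain : UserComponent
{
    public override void CreateNewOutputRows()
    {
        // Fill a DataTable from the ADO recordset object held in the variable.
        DataTable table = new DataTable();
        OleDbDataAdapter adapter = new OleDbDataAdapter();
        adapter.Fill(table, Variables.ResultsRecordset);

        // Emit one pipeline row per recordset row.
        foreach (DataRow row in table.Rows)
        {
            Output0Buffer.AddRow();
            Output0Buffer.CustomerId = (int)row["CustomerId"];
            Output0Buffer.Name = (string)row["Name"];
        }
    }
}
```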
Your next question may be, "If there's a Recordset Destination, why isn't there a Recordset Source?" The reason is that SSIS needs to know the metadata of the data flow at design time, which wouldn't be possible with a Recordset Source.
-Jamie|||Jamie
But what's the point of being able to put data in and not get it out? Essentially you're working around that limitation, so why can't SSIS?
What's wrong with having a recordset source where you define the columns yourself? OK, so it can't validate until execution time, and that needs to be understood. By not providing a source component, people are just going to end up writing their own, as you have sort of done.|||
Will there be a "typed dataset source" somewhere in the near future? The current situation of writing data, say half a gig, to disk only to read it again seconds later does not sound very performant. And writing one's own metadata in a script component or script task (in a language other than one's preferred) without IDE support isn't a killer feature either :-/
With best regards,
Artus
|||ArtusKG wrote:
Will there be a "typed dataset source" somewhere in the near future? The current situation of writing data, say half a gig, to disk only to read it again seconds later does not sound very performant. And writing one's own metadata in a script component or script task (in a language other than one's preferred) without IDE support isn't a killer feature either :-/
With best regards,
Artus
It may not sound very performant, but have you tried it? It's a proprietary file format and is very, very fast indeed.
Even an out-of-the-box recordset source wouldn't be able to pick up its metadata from anywhere because that metadata doesn't exist until runtime.
If there's a way around that problem then I don't know about it.
-Jamie
The problem I am currently struggling with is the vast amount of data unnecessarily written to disk. Let's keep the example with 500 MB in the recordset. If I had to write it out, the disk would be the bottleneck for several seconds. Thereafter some logic occurs (prior to my next data flow task). And when I then read the data back from the raw file, I am again wasting time by reading data that was unnecessarily written to disk (and blocking every other disk I/O on the machine).
I don't mind the data being held in RAM as I want it there for further processing later on anyway.
And a design-time component where I could define my recordset's columns using drag and drop would be a great improvement. Think of a "write schema to file" option on the Recordset destination and a "read schema from file" option in the yet-to-be-built Recordset source...
With best regards,
Artus
|||What is the scenario you are trying to enable, Artus? The first post in this thread was about componentization of data flows; we will be addressing that in the future, probably through other mechanisms. In your case you want to persist the results of a data flow in memory and then have another data flow pick up that data? Why? You really want to persist 500 MB of data in RAM?
Thanks
Donald