Multiple output files from a single input file

Is there any way to create a set of output files from a single input file?

For example, I would like to take an XML file and based on the content split it up into a set of markdown files.

Just add as many output file nodes as you need. I think almost every transform I’ve ever put together has multiple outputs.

My favorite trick is to name the same file in each output with a different sheet name - whatever.xlsx[sheet name] - to create named sheets in a single output file.

If you use overwrite mode, you need to set “sheet” instead of “file” when you do that.

I see the Mac version of EDT offers Markdown output - I need to experiment with that. Sounds intriguing.

As @Amontillado says, you can add as many output nodes as you like to an input or transform.

One of the things we are hoping to add in future is the ability to define the output file name in a column, so you can change the file(s) you are outputting to dynamically in the data. For example, you could use this to split a big file into 1000-line chunks without knowing in advance how big the input file is. But we don’t have that yet.
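Until something like that exists, the 1000-line example can be done outside the tool. A minimal Python sketch, assuming a plain text or CSV file with one header line and no quoted fields that span several lines. The function name `split_into_chunks` and the `_0001` style part numbering are only illustrative:

```python
from itertools import islice
from pathlib import Path


def split_into_chunks(source, out_dir, lines_per_file=1000):
    """Split source into numbered files of at most lines_per_file data lines."""
    source = Path(source)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(source, encoding="utf-8") as src:
        header = src.readline()  # repeated at the top of every chunk
        part = 1
        while True:
            # islice reads the next block of lines without loading the whole file.
            chunk = list(islice(src, lines_per_file))
            if not chunk:
                break
            target = out_dir / f"{source.stem}_{part:04d}{source.suffix}"
            with open(target, "w", encoding="utf-8") as dst:
                dst.write(header)
                dst.writelines(chunk)
            written.append(target.name)
            part += 1
    return written
```

Because the file is read one block at a time, the total number of lines does not need to be known in advance.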

Thank you both for your reply. I think Andy hit the nail on the head. I am looking to create a new output file for each “row” (or XML equivalent). I can’t do that with multiple outputs in the transform since I don’t know how many rows there are and it would be very tedious.

Guess I’ll have to wait for the new functionality (or write some code :frowning: )

It is quite high on the wishlist. But we need to smooth off a few rough edges before we start adding new features.

Where there’s a will, there’s probably a way.

How about creating a column in a single output file that contains the file name you want that line written to?

Write the output to CSV, and read it in, say, Python. Write each line to the filename specified in the filename column.
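Something along these lines, as a rough sketch rather than a tested recipe. It assumes the CSV has a header row and a column holding the target file name; the column name `filename` and the function name `split_by_filename_column` are made up for the example:

```python
import csv
from pathlib import Path


def split_by_filename_column(source_csv, out_dir, filename_column="filename"):
    """Write each row of source_csv to the file named in its filename column."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    handles = {}  # target file name -> (open file, csv writer)
    try:
        with open(source_csv, newline="", encoding="utf-8") as src:
            reader = csv.DictReader(src)
            # Every column except the filename column is carried over.
            fields = [f for f in reader.fieldnames if f != filename_column]
            for row in reader:
                # .name drops any directory part, so a row cannot
                # write outside out_dir.
                name = Path(row[filename_column]).name
                if not name:
                    continue  # skip rows with no usable file name
                if name not in handles:
                    fh = open(out_dir / name, "w", newline="", encoding="utf-8")
                    writer = csv.DictWriter(fh, fieldnames=fields)
                    writer.writeheader()
                    handles[name] = (fh, writer)
                handles[name][1].writerow({f: row[f] for f in fields})
    finally:
        for fh, _ in handles.values():
            fh.close()
    return sorted(handles)
```

This writes the rows back out as CSV. For Markdown you would replace the writer with whatever formatting each row needs. Every target file stays open until the end, so a very large number of distinct file names could hit the operating system's open-file limit.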

I could be missing the whole point, too. That happens - good luck!

I have a similar request. For example, I have a big file containing records from, say, 50 weeks of a survey. There is a column indicating which week a response is from. I want to split the file so that each output file contains responses from one week only. I can use the filter function, but with 50 weeks of survey data that is quite a lot of work.
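As a stopgap outside the tool, this kind of split is a few lines of Python. A minimal sketch, assuming the data is exported to CSV with a header row; the column name `week` and the function name `split_by_column` are placeholders for whatever the real file uses:

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_by_column(source_csv, out_dir, column="week"):
    """Write one CSV per distinct value found in the given column."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    groups = defaultdict(list)  # column value -> rows with that value
    with open(source_csv, newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src)
        fields = reader.fieldnames
        for row in reader:
            groups[row[column]].append(row)
    written = []
    for value, rows in groups.items():
        # Keep only letters and digits in the file name. Two values that
        # differ only in punctuation would collide; this sketch ignores that.
        safe = "".join(c if c.isalnum() else "_" for c in value) or "blank"
        target = out_dir / f"{column}_{safe}.csv"
        with open(target, "w", newline="", encoding="utf-8") as dst:
            writer = csv.DictWriter(dst, fieldnames=fields)
            writer.writeheader()
            writer.writerows(rows)
        written.append(target.name)
    return sorted(written)
```

This holds the whole file in memory, which is fine for a survey-sized file but not for very large inputs.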

This is still high on the ‘wishlist’. But we haven’t done it yet.

A major issue is: what if someone selects the wrong column for the file names and this creates thousands of files accidentally? Some of them could even overwrite important files. We need to try to stop that happening. But putting up a confirmation window every time would be really annoying. It needs some thought.

If you are concerned about this, here is my suggestion: by default, this function creates a new empty folder and saves the output there. So the worst outcome of using this function is simply producing a lot of wrong files, but it would not overwrite other important files.

Possibly. But they also need to be able to specify the folder (as well as the file name) in the dataset.

I mean, make a new empty folder the default setting. The user can always change the folder, and the user can specify the file name. If you’re concerned about overwriting, you can simply not offer overwriting in this function. If there is a duplicate file name, let the program add a suffix, for example, (1).
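To illustrate the idea (this is only a sketch of the suggestion, not how the tool does or would implement it), here is the suffix logic in Python. The function names `unique_path` and `write_without_overwrite` and the exact "name (1).ext" format are assumptions for the example:

```python
from pathlib import Path


def unique_path(folder, filename):
    """Return a path in folder that does not exist yet, adding (1), (2), ... if needed."""
    folder = Path(folder)
    candidate = folder / filename
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        candidate = folder / f"{stem} ({counter}){suffix}"
        counter += 1
    return candidate


def write_without_overwrite(folder, filename, text):
    """Write text to folder/filename, never replacing an existing file."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)  # create the output folder if missing
    target = unique_path(folder, filename)
    # Mode "x" raises FileExistsError if the file appeared after the check above.
    with open(target, "x", encoding="utf-8") as fh:
        fh.write(text)
    return target
```

Writing "week.csv" three times into the same folder would give "week.csv", "week (1).csv" and "week (2).csv", so a wrongly chosen column can create clutter but cannot replace an existing file.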

Certainly some options to consider.

@CRC @alancai This is now available in the latest snapshot. See:

Thank you very much for your efforts.