Extracting key/value pairs from column into new columns?

Hello,

I have a csv file with one column that contains several key/value pairs, not always the same for all rows. I need to extract them into additonal column names with column name = key and column content = value.

Using the “Split Col” transform, I can do this with “Delimiter” = key, but this only works if “key” is not at the start of the column.

Is there a way to do this?

Thanks!

sample of input csv (one column, no headers, no row names):

"key1: value1"
"text key2: value2"
"key1: value1 key3: value3"
"text key2: value2 key1: value1"

desired output csv (with column names):

"common","key1","key2","key3"
"","value1","",""
"text","","value2",""
"","value1","","value3"
"text","value1","value2",""

It depends a bit on the details. Can the text, key or value contains spaces or semi-colons?

Also are the key values known in advance?

“text” and “value” can contain anything except “:”
“key” can be one or more words (letters only), separated by space, terminated by ": "

Yes (but subject to change depending on the whims of my bank’s IT department)

If text and key can both contains spaces, then I don’t think there is any way to reliably parse it - using Easy Data Transform or any other system.

Having given this some thought, I am afraid you are right.
I guess I will have to develop a program that does the extraction…

Thanks!

Is there any possibility of getting it in a more structured form?

Unfortunately, aside from the unusable csv file, they only provide pdf versions of the bank statements.

I will complain to them about the essentially unparseable structure of their csv, but still need to look for alternatives…

If you can email us (or post here) a few lines that represent a real statement (with confidential stuff modified) I will take a look and see if I can suggest anything else.