I have some irregular input data from which I want to extract two cells in each row. I can do this by writing a program, but was wondering whether anyone had quick suggestions about how it might be done efficiently in EDT.
The following is typical of the input (not an actual data sample but sufficient for the question):
Problem input.csv (155 Bytes)
Note that the header names are unhelpful and the columns are not aligned, so Schema cannot be used to organise it. The capture rule: each row has text in at most one cell, and never in column 1. Wherever text occurs, capture that text and the numeric value in the column immediately before it, and nothing else. This would be the output captured from the above:
Desired output.csv (99 Bytes)
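For reference, this is roughly what the fallback script would look like: a minimal sketch only, assuming comma-delimited input with a single header row, and treating any non-empty cell that does not parse as a number as "text". The sample rows and the names `is_number` and `capture_pairs` are made up for illustration and are not the real data.

```python
import csv
import io


def is_number(cell):
    """Return True if the cell parses as a number (assumed test for 'not text')."""
    try:
        float(cell)
        return True
    except ValueError:
        return False


def capture_pairs(rows):
    """For each row, capture (value in the previous column, text) for the text cell."""
    pairs = []
    for row in rows:
        cells = [c.strip() for c in row]
        # Start at index 1 because the text is never in column 1.
        for i in range(1, len(cells)):
            cell = cells[i]
            if cell and not is_number(cell):
                pairs.append((cells[i - 1], cell))
                break  # at most one text cell per row
    return pairs


# Hypothetical sample, not the actual data.
sample = "a,b,c,d\n0.12,red,,\n,0.40,0.55,blue\n0.10,0.20,,\n"
rows = list(csv.reader(io.StringIO(sample)))[1:]  # skip the header row
result = capture_pairs(rows)
print(result)
```

Rows with no text cell are simply skipped, which matches the "no other data" part of the rule.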
I did not spot anything in the transforms that would help, but perhaps a regex would? All numbers are decimals between 0.00 and 0.99, and the text has only half a dozen possible values.
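To show the kind of regex I have in mind, here is a rough sketch in Python applied to one raw line at a time. It assumes a comma delimiter, numbers written with a leading zero (such as 0.37), and text that starts with a letter; the pattern name `PAIR` and the function `extract` are hypothetical. Whether the same pattern can be used inside EDT is exactly what I am unsure about.

```python
import re

# Assumptions: comma-delimited line, numbers look like 0.37, text starts with a letter.
# Group 1 is the number immediately before the text cell; group 2 is the text.
PAIR = re.compile(r"(?:^|,)\s*(0\.\d+)\s*,\s*([A-Za-z][^,]*)")


def extract(line):
    """Return (number, text) for the first number-then-text pair, or None."""
    m = PAIR.search(line)
    if m is None:
        return None
    return (m.group(1), m.group(2).strip())


# Hypothetical lines, not the actual data.
print(extract("0.12,red,,"))
print(extract(",0.40,0.55,blue"))
print(extract("0.10,0.20,,"))
```

Since the text has only a handful of possible values, the second group could instead be an explicit alternation of those values, which would make the match stricter.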
Is there something fairly straightforward I have missed? Otherwise I will write a separate script.