Split a column based on length

Hello,
I came across a column split that I cannot easily solve:
I have a column that contains a text and date YYYY-MM-DD at the end. I would like to split the column in two: C1 (text), C2 (date). I tried splitting identifying the date with regex

(^(19|20)\d\d[- \/.](0[1-9]|1[012])[- \/.](0[1-9]|[12][0-9]|3[01])$)

But it does not work.

Date length is always 10, and thinking of it, it may be useful to be able to split columns based on number of character from start or end.
Below data in which the first column has a size 10, second size 5, third 8
TEXT 1 123 34567
EXAMPLE 1 3 9875
T1 890 456678
In this case to get 3 columns one should to be able to split from start at position 11 and at 16 to get three columns. (#feature-request)

I believe you can do what you want with Extract and Chop transforms.

split-on-length.transform (2.4 KB)

1 Like

You can also do it with regular expressions, but the approach above is simpler!

1 Like

Thank you VERY much for your workaround.
Still, it may be useful to have a split option with characters positions from start/end (#feature-request)

We could add something like that to ‘Split Cols’. But I’m not sure what the cleanest, and most easy to understand, way to do that would be.

Enter “10,5,8” to split into 3 columns that correspond to:
the first 10 characters
the next 5 character
the next 8 characters

or

Enter “10,15” to split between character 10 and 11 and characters 15 onward

1 Like

Note, depending on the data in the other columns, if the data is from a text file (e.g. CSV) you could read in the file as a fixed width file.

It will be in the next snapshot release.

3 Likes

Now available here:

1 Like

Thank you for the reply and thank you for the implementation! Works great!!

1 Like