Removing HTML tags

Hi All,

I am migrating data from a legacy system to a new system and have a source field (Comments) that is wrapped in HTML.
html comments.txt (688 Bytes)

Is there any way to remove these tags and simplify the output?

Attached is the source and result spreadsheet.

I welcome all ideas!

thanks
html comments.csv (708 Bytes)

Hi,

Here is the idea as per your request.

Transform file.
RemoveHTMLTags.transform (2.0 KB)
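
For anyone reading without the attachment, here is a minimal Python sketch of the general regex idea. It is not the contents of the .transform file, which are not shown in this thread. The pattern `<[^>]+>` and the helper name `strip_html` are assumptions for illustration only.

```python
import html
import re

# Hypothetical pattern: anything from "<" up to the next ">" is treated as a tag.
TAG_PATTERN = re.compile(r"<[^>]+>")


def strip_html(text: str) -> str:
    """Remove HTML tags, decode entities, and collapse runs of whitespace."""
    # Swap each tag for a space so words separated only by tags do not run together.
    no_tags = TAG_PATTERN.sub(" ", text)
    # Turn entities such as &amp; back into plain characters.
    unescaped = html.unescape(no_tags)
    # Collapse repeated whitespace and trim both ends.
    return " ".join(unescaped.split())


print(strip_html("<p>Hello <b>world</b></p>"))
```

A simple pattern like this is usually fine for plain comment fields, but it can misfire on unusual markup (for example a ">" inside an attribute value), and replacing tags with spaces will split a word that has a tag in the middle of it.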

Thanks Anon, regex is always a mystery to me!

Can you retry with the attached file? Your transform didn't seem to work using a source sheet, as all the data is in a single cell (cell A2).
html comments.csv (405 Bytes)

Remove the $ in the second term.

To give:

There are lots of resources online for learning regex, and https://regex101.com/ is useful for testing patterns.
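
As a rough illustration of why the $ matters when everything sits in one cell: $ anchors a match to the end of the text, so a term ending in $ can only fire once, at the very end of the cell. The sample cell value and the `</p>` pattern below are made up for the example and are not taken from the transform or the attached file.

```python
import re

# Hypothetical cell value: several comments packed into a single cell.
cell = "<p>First comment</p><p>Second comment</p>"

# With "$" the closing tag is only matched at the very end of the text.
anchored = re.sub(r"</p>$", "", cell)

# Without "$" every closing tag in the cell is matched.
unanchored = re.sub(r"</p>", "", cell)

print(anchored)
print(unanchored)
```

The anchored version leaves the first `</p>` in place, which is the kind of leftover you would expect to see when the data is not split into one comment per row.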

Hi @simonj

Here is the solution as per your change requirement.

Transform file.
RemoveHTMLTags2.transform (1.5 KB)

Data file.
html comments.csv (405 Bytes)

This is great, many thanks for the help!