Deleting subsequent identical values in different rows in the same column

ANTONIA · March 22, 2024, 4:33pm

Hi, i need to delete subsequent identical values in different rows in the same column, if value in other column is x

Thanks in advance

Olaf · March 22, 2024, 5:39pm

Hi Antonia,

if I understand it right you want to remove duplicates in rows if a check value is “x”.

I created a table with a column “Value_multi” which have such duplicates. If it is duplicated in the next row and the “Check “x”” is “x” the new column “Value_multi_new” includes now an empty value. If you want to delete the rows you can set with the if transformation a special value and use this to filter (remove) such rows out.

I hope it gives an idea.

Antionia_remove_duplicates.transform (3.5 KB)

Admin · March 22, 2024, 7:08pm

If @Olaf 's answer isn’t what you are looking for, then please can you supply a simple input and the correspoding output you want.

Olaf · March 22, 2024, 8:10pm

I had to correct the Direction in the Offset transformation and now it fits really to my understanding:

Antionia_remove_duplicates.transform (3.5 KB)

Anonymous · March 23, 2024, 3:15am

Since you did not provide which column is representing x and how you get that x, there is no way for us to provide you with the solution.

As @Admin said, it would be easy for us to provide solution if the input and expected out is provided.

Anyway, what I get is that you want to remove the duplicate rows and in the absence of how to identify x, here is my solution to remove the duplicates based on the data you have provided.

Transform file.
DeleteSubsequentDuplicateValues.transform (4.4 KB)

ANTONIA · March 25, 2024, 11:55am

thanks a lot to everybody,
I want to delete all the identical values where there is no “abito” text in multiple columns.
tanks in advance for your invaluable support

Admin · March 25, 2024, 12:01pm

We really need a simple example input and the expected output. A word description is too ambiguous.

ANTONIA · March 25, 2024, 12:03pm

Hi Olaf, thanks a lot for your support, I need to delete all the identical values in various columns deleting values in the rows where there is no abito txt in a certain column.
hope to have been clearer
Thanks in advance

ANTONIA · March 25, 2024, 12:06pm

Thanks I’ll prepare and provide you a sample.

Olaf · March 25, 2024, 12:57pm

I think I get it, Assume the column Check “x” as your column with the abitio txt. You need just to change in the if transformation the second argument form “=” to “!=” in the example I supported. And the “x” must be replaces by abitio txt.

Then I expect it will work as you want. If you are sure the abitio txt is only in the first occurrence then the infill transformation is the more efficient choice.

ANTONIA · March 25, 2024, 1:01pm

here is the sample

col1	col2	col3	col4		col1	col2	col3	col4
abcdefg	abc	aaaa	abiti		abcdefg	abc	aaaa	abiti
abcdefg	abc	aaaa
abcdefg	abc	aaaa
abcdefg	abc	aaaa
abcdefg	abc	aaaa
dadttg	abcfg	abcuy	abiti		dadttg	abcfg	abcuy	abiti
dadttg	abcfg	abcuy		to
dadttg	abcfg	abcuy
dadttg	abcfg	abcuy
dsds	dfghghgf	dsyrrr	abiti		dsds	dfghghgf	dsyrrr	abiti
dsds	dfghghgf	dsyrrr
dsds	dfghghgf	dsyrrr
dsds	dfghghgf	dsyrrr

Olaf · March 25, 2024, 1:08pm

This look, like the abiti txt is every time in the first line, so use unfill and select the column where the values shall be removed.

Otherwise adapt the example I provided to the columns you need.

If you want to delete the rows with the duplicates you can add a filter and keep only columns where after the infill a value exists.

ANTONIA · March 25, 2024, 1:55pm

Thanks Olaf, that’s working quite well, but it also deletes duplicates where “abito” is not present.
Is there a way to execute a more granular and precise deletion?

Olaf · March 25, 2024, 2:01pm

but it also deletes duplicates where “abito” is not present

The example followed your first request, remove the value where no abito txt is … I struggle, for me your two remarks are contradicting

If it doesn’t fit 100% play with the condition in the if transformation

ANTONIA · March 25, 2024, 2:11pm

[quote=“Olaf, post:14, topic:1118”]
but it also deletes duplicates where “abito” is not present
excuse me. i was saying that if there are multiple values repeating in a serie where abito is not the first of the serie but is not even present duplicates wil be deleted leaving the first row where “abito” is not present
dedupe should work only when the first row of duplicates is containing “abito”, if duplicates check column does not contain “abito” the corresponding similar values should not be touched

col1	col2	col3	col4		col1	col2	col3	col4
abcdefg	abc	aaaa	abiti		abcdefg	abc	aaaa	abiti
abcdefg	abc	aaaa
abcdefg	abc	aaaa
abcdefg	abc	aaaa
abcdefg	abc	aaaa
dadttg	abcfg	abcuy	abiti		dadttg	abcfg	abcuy	abiti
dadttg	abcfg	abcuy		to
dadttg	abcfg	abcuy
dadttg	abcfg	abcuy
dsds	dfghghgf	dsyrrr	gonna		dsds	dfghghgf	dsyrrr	gonna
dsds	dfghghgf	dsyrrr			dsds	dfghghgf	dsyrrr
dsds	dfghghgf	dsyrrr			dsds	dfghghgf	dsyrrr
dsds	dfghghgf	dsyrrr			dsds	dfghghgf	dsyrrr

Anonymous · March 25, 2024, 4:16pm

Here is your solution as per your Input and Output.

Transform file.
DeleteSubsequentDuplicateValues2.transform (3.8 KB)

Olaf · March 25, 2024, 4:31pm

@Anonymous was faster

But if you want to keep the rows and just delete some values, here is my solution.

Antionia_remove_duplicates2.transform (4.1 KB)

ANTONIA · March 26, 2024, 2:15pm

thanks a lot Olaf.
I only have to adapt it to my data file now.

Anonymous · March 26, 2024, 2:56pm

If you like to have empty lines, instead of removing them

Transform file.
DeleteSubsequentDuplicateValues3.transform (5.1 KB)

ANTONIA · March 28, 2024, 1:19pm

Thanks a lot! I have to understand all the process but is really astounding!