Filling in empty cells in the table based on the key field

Vadim · September 23, 2023, 1:07pm

Good day everyone!

I apologize for bothering with my question on a Saturday afternoon, but something is not working out to solve one problem. My question is not urgent, but I ask for practical help if possible.

After converting the table in the “SPREAD” mode, the following file is obtained.

I need to fill in the missing fields. The key field is “client_id”.

At the output, you need to get the file in the following form.

I will be grateful for the hints, since I am almost sure that there is a solution, but I can’t find it yet.

P.S. In the screenshots, I highlighted the entries with matching “client_id” with colors.

result after spread.transform (3.8 KB)

Admin · September 23, 2023, 2:21pm

Here is one way to do it usign Unique and New Rows:

vadim.transform (3.5 KB)

It would be a bit easier if Unique had a Keep first non-empty option.

Admin · September 23, 2023, 2:49pm

This is with the data you supplied.
vadim2.transform (4.0 KB)

Vadim · September 23, 2023, 3:35pm

Thank you so much! The UNIQUE mode works just fine! I apologize for bothering you and for not being able to solve this problem myself.
I would like to take this opportunity to say that your program has saved me a lot of time and money. Previously, the programmer had to write or tell what I want to get as a result of file conversion. He recorded the task and wrote the necessary script. Now, thanks to EDT, I do almost everything myself. The result is faster and better.
Over the past year, I have been turning to programmers only with large files. Unfortunately, there is currently not enough processing in EDT with the number of rows of 10 million.
If you make it possible to process tables with 100 million records, then I will do everything myself, and I will fire the programmers)))

Admin · September 23, 2023, 4:05pm

Thanks. Can I put that on the testimonial page?

It depends on how many columns you have and the data profile, but 10 million rows+ should be possible. Is it running out memory?

Vadim · September 23, 2023, 4:16pm

Admin
Of course you can post a review. For me, this is a real saving of time, nerves and money! I am very grateful to you for EDT!

That’s the thing, that memory does not end. I allocate 120+ GB of RAM for EDT and files up to ten million records are handled perfectly. There are not many columns (from 5 to 10).
But here is a file in which 50-70 million records are no longer loaded.
And this is sometimes very lacking. It remains only to hope that soon it will be possible to process them on EDT)))

Admin · September 23, 2023, 4:38pm

We have managed to process 100 million rows in testing.

Is Optimize processing set to Minimum memory use in Preferences?

Is it failing on input or after multiple transforms?

In v2 I hope to look at ways of being more efficient with RAM, so we can solve bigger problems.

Vadim · September 23, 2023, 4:45pm

I’ll take some screenshots with errors and share them with you

Anonymous · September 24, 2023, 6:30am

@Vadim ,

Here is the solution, that you required as per your screen shot. Data is from your uploaded transform file result after spread.transform

@Admin , there is only first id, while the user wanted the id’s to be preserved as shown in the desired output screen shot.

Learned something new in the process that you can leave concatenate field blank, though it highlight’s it with red, but still it let you continue, I did not know about it, as in other transforms it won’t let you continue if you don’t chose or provide value for the options.

Transform file.
result after spread.transform (4.2 KB)

Anonymous · September 24, 2023, 6:37am

I second that, it would be excellent addition to Unique Transform.

Admin · September 24, 2023, 8:03am

Missed that. :0)

I think it will generally be an error to leave out the delimiter, but not always. So we try not to be too prescriptive.

Anonymous · September 25, 2023, 5:45am

It is just a suggestion, don’t know if you accept it or not, It would be great if you can make Fill and Unfill transform to have option to fill or unfill by using Column, like for example in this case

id	client_id	full_name	birthday	email	phone
522	40714	Maxim Smith
524	40714		02/02/2000
526	40714			ms@gmail.com	mailto:ms@gmail.com
545	27649		03/03/1999
547	27649	James Brown
548	27649				88888888
549	46427		04/04/1998
550	46427			jb@gmail.com	mailto:jb@gmail.com
551	38638			bill@gmail.com	mailto:bill@gmail.com
552	38638	Bill Williams
553	38638		05/05/1997
554	38638				99999999
555	31830	Ace Miller

We could have set in the Fill transform column to be used for it’s range or group of records on client_id and chose the columns that need to be filled from single value in the chosen fields wherever it might by in the group or range of records.

So if we have something like that with Fill then above solution would have gone from six different transforms to just twol transform one Replace to cleanup the email and second Fill that makes it fast and consume less resources.

id	client_id	full_name	birthday	email	phone
522	40714	Maxim Smith	2000-02-02	ms@gmail.com
524	40714	Maxim Smith	2000-02-02	ms@gmail.com
526	40714	Maxim Smith	2000-02-02	ms@gmail.com
545	27649	James Brown	1999-03-03		88888888
547	27649	James Brown	1999-03-03		88888888
548	27649	James Brown	1999-03-03		88888888
549	46427		1998-04-04	jb@gmail.com
550	46427		1998-04-04	jb@gmail.com
551	38638	Bill Williams	1997-05-05	bill@gmail.com	99999999
552	38638	Bill Williams	1997-05-05	bill@gmail.com	99999999
553	38638	Bill Williams	1997-05-05	bill@gmail.com	99999999
554	38638	Bill Williams	1997-05-05	bill@gmail.com	99999999
555	31830	Ace Miller

For Unfill in another thread I had to create blank row, to stop Unfill from removing the values in another set of records.

Example:

Area	Flow	Option	Condition	Sequence
Area A	Flow 3	Optional	Condition 10	1
Area A	Flow 3	Optional	Condition 8	2
Area A	Flow 3	Optional	Condition 7	3
Area B	Flow 2	Optional	Condition 2	1
Area B	Flow 2	Optional	Condition 5	2
Area C	Flow 1	Optional	Condition 1	1
Area C	Flow 1	Optional	Condition 2	2
Area C	Flow 1	Optional	Condition 3	3
Area C	Flow 2	Optional	Condition 2	1
Area C	Flow 2	Optional	Condition 5	2

In the above to make it more readable and remove repeating value Unfill in current state will remove Optional from all the records and will keep only for the first Record

Area	Flow	Option	Condition	Sequence
Area A	Flow 3	Optional	Condition 10	1
			Condition 8	2
			Condition 7	3
			Condition 9	4
			Condition 5	5
Area B	Flow 2		Condition 2	1
			Condition 5	2
			Condition 6	3
			Condition 4	4
Area C	Flow 1		Condition 1	1
			Condition 2	2
			Condition 3	3
Area C	Flow 2		Condition 2	1
			Condition 5	2
			Condition 6	3
			Condition 4	4

While it should be like this

Area	Flow	Option	Condition	Sequence
Area A	Flow 3	Optional	Condition 10	1
			Condition 8	2
			Condition 7	3
			Condition 9	4
			Condition 5	5
Area B	Flow 2	Optional	Condition 2	1
			Condition 5	2
			Condition 6	3
			Condition 4	4
Area C	Flow 1	Optional	Condition 1	1
			Condition 2	2
			Condition 3	3
Area C	Flow 2	Optional	Condition 2	1
			Condition 5	2
			Condition 6	3
			Condition 4	4

Also we could make option available in Slide transform for the group of records as well, that based on the group slide the values in the columns to the first record of the group for each column if they have value in any place within the group and slide that to the first record, after which we could use fill for example to fill up the group with same value.

Just few thoughts I had and thought why not put it out in open and either to be accepted or shot down.

Admin · September 25, 2023, 7:37am

I was actually thinking much the same thing. It would only work when Direction is set to Up or Down.

Perhaps if Direction is set to Up or Down it shows a For field with drop-down options All rows and Matching values for. If you select Matching values for you can choose a column.

I will put it on the wishlist for v2.

prashant · September 25, 2023, 7:49am

V2 wishlist is getting fatter and fatter

Vadim · September 26, 2023, 3:12pm

Many thanks to dear @Admin and @Anonymous for participating in the proposed solutions!
For me, the solution to the problem was to use the UNIQUE function, since duplicates of information are immediately deleted.
I am very grateful to everyone, now I know the solution!

Admin · September 28, 2023, 4:18pm

Unique now has Keep first non-empty and Keep last non-empty options.

Anonymous · September 29, 2023, 2:06pm

With that, transform reduced from Six to Three only.

Once @Admin makes option available in Fill transform, to fill group of records with first non empty value based on a column that represents group of records, like in this example client_id and let us add columns that we want to be filled from the first non empty value (like when you want to filter you can choose which fields you want to filter on and choose conditions for filter) then it will reduce further to two transforms only i.e Fill and Replace in this case and if there was no need to remove additional value in the email field, then only one transform Fill.

That is from Six transforms to One transform will be quite a big change.

Updated Transform file.
result after spread.transform (3.3 KB)

prashant · September 29, 2023, 3:53pm

I am already so spoiled I cannot think of the Pre-Keep First Non-Empty era. It just fits Unique options so well

Admin · March 10, 2025, 1:28pm

Note that Fill with Direction=Down and matching on the client_id column would convert:

id	client_id	full_name
522	40714	Maxim Smith
524	40714
526	40714
545	27649
547	27649	James Brown
548	27649
549	46427
550	46427
551	38638
552	38638	Bill Williams
553	38638
554	38638
555	31830	Ace Miller

To:

id	client_id	full_name
522	40714	Maxim Smith
524	40714	Maxim Smith
526	40714	Maxim Smith
545	27649
547	27649	James Brown
548	27649	James Brown
549	46427
550	46427
551	38638
552	38638	Bill Williams
553	38638	Bill Williams
554	38638	Bill Williams
555	31830	Ace Miller

You would then have to do a Fill with Direction=Up and matching on the client_id to get:

id	client_id	full_name
522	40714	Maxim Smith
524	40714	Maxim Smith
526	40714	Maxim Smith
545	27649	James Brown
547	27649	James Brown
548	27649	James Brown
549	46427
550	46427
551	38638	Bill Williams
552	38638	Bill Williams
553	38638	Bill Williams
554	38638	Bill Williams
555	31830	Ace Miller

But you could Sort the rows first, so as only to need 1 Fill.

Anonymous · March 12, 2025, 2:23pm

@Admin

id client_id full_name
. . .
551 38638
552 38638 Bill Williams
553 38638
554 38638
555 31830 Ace Miller

Why not add third choice Both direction Up and Down as well, since now Filling or Un Filling using key. This will save extra step of sort or any other step to get data in correct order.