Video: How to compare Excel sheets

Name: How to compare Excel sheets
Uploaded: 2025-02-25T16:52:03Z
Description: We have created a new 4 minute video to show you how to use the powerful new Compare Data and Compare Headers features in Easy Data Transform v2: [How to compare Excel sheets] Youtube ‘l...

Admin · February 25, 2025, 4:52pm

We have created a new 4 minute video to show you how to use the powerful new Compare Data and Compare Headers features in Easy Data Transform v2:

Youtube ‘likes’ help us to be more prominent and are much appreciated.

Subscribe to our Youtube channel to be notified of future videos.

prashant · February 26, 2025, 5:37am

Holy crap,

I had no idea this feature exists !

Comparing header in itself was good enough but data is icing on cake!

N8AKTIV · July 13, 2025, 3:47pm

It would be helpful if we could export this result directly.
(only the rows with differences and only the columns with their different values).

Yes, we can do all the comparison in a flow with “subtract” left from right and vice versa, but an easy-to-use compare-data-component that does this for us or an export function within this compare-data-screen would be great

Admin · July 13, 2025, 4:27pm

We have thought about doing a Diff transform. I’m not sure how you would best represent the results as a single table though. Especially when the inputs might have different numbers of rows and different columns.

Note that you can Join 2 dataset by key or row number and then use Filter to remove rows where 2 values match in particular columns. This doesn’t really scale when you want to compare lots of columns though. Although I guess you could Concat Cols the columns you care about.

Perhaps you could give some information about a real world example where you need to do this?

N8AKTIV · July 23, 2025, 3:56pm

That is exactly my challenge

We have a big Material-Masterdata-list with lots of attributes (50+).
When I compare it today with the previous month’s list (based on the material no. as key), I merge all columns and export them if there are differences in any of the columns.
( the number and names of the columns stays the same ! )

In my wishful thinking, the perfect differences-list will only contain the attribute-name and the old and new values per material-no.

Example:

In “Compare data” you are only highlighting the differences, so it would be nice if that info can be exported in the structure of my “diff” example

Thanks for your helpful tips.

Admin · July 23, 2025, 5:38pm

Your ‘diff’ example might be a good way of putting the differences into 1 table and not replying on fonts/colors etc.

Possibly we could add:

A new transform called Diff that takes 2 inputs. The top input is ‘old’ and the bottom input is ‘new’ (or possibly we could refer to them as ‘top’ and ‘bottom’).
Options to compare by row number or key-based (as in Compare Data).
Key comparison can optionally be case sensitive (as in Compare Data).
Only outputs differences, something like:

Material	Change	Column	Old Value	New Value
1001	Value changed	Attribut1	A	B
1001	Value changed	Attribut3	1,5	1,18
1002	Row deleted
1003	Value changed	Attribut2		X
1004	Value changed	Attribut3	4,5	6,5
1005	Row inserted

Would have to handle duplicate row keys for key-based.
Would have to handle differences in column name/order. The columns are the same in your example, but we have to handle cases when they aren’t. How do you output the differences for this?:

Perhaps something like this?

V1	Change	Column	Old Value
A	Value removed	V3	4
B	Value removed	V3	5
C	Value removed	V3	6

Olaf · July 23, 2025, 6:09pm

as long as the “DIFF” transformation is not available I have a suggestion for an interim solution, which can/should be adapted to get more close to the expectation. But I think it is a step to come to an overview in a table form as wished.

N8AKTIV compare.transform (6.7 KB)

Specially if this is a monthly activity it can be given some effort.

N8AKTIV · July 23, 2025, 6:16pm

Wow, that would be the perfect solution.

Since I currently also concatenate the contents of all fields, there is a difference for all rows as soon as a column is removed or added.
I have to edit this manually if I don’t want to see this.
If these differences could also be seen in the difference list, that would be the icing on the cake.

In the final difference list, the users could then even use the “Change” column to filter which differences they want to keep: “Value changed”, “Value removed”, “Row deleted” or “Row inserted”

Olaf · July 23, 2025, 7:32pm

I just played around a little more and now it creates exactly the wished out put.

for larger input tables the selection in the Gather transformations need to be adapted, but if the structure of the file is very month the same it should work with 50+ attributes, too.

N8AKTIV compare 2.transform (13.1 KB)

Admin · July 23, 2025, 8:31pm

Impressive! Would be nice if you could do it in one transform though. ;0)

Olaf · July 24, 2025, 6:35am

Sure, but as long it is not available and you have to do it monthly it worse to run the 25 transformation job. For a one-time activity it is a little much. But it was a nice finger exercise.

N8AKTIV · July 24, 2025, 9:05am

Danke Olaf, thanks Andy,

the interim solution was almost perfect, I only made small adjustments at the end to improve the display a little. The big solution was not necessary at all.
Now my 2 files with 100K rows and 59 columns each could be compared and only the differing columns are displayed.
It requires a whopping 12GB of memory, but it’s worth it

Admin · July 24, 2025, 9:13am

That sounds a lot. Have you got Preferences>General>Optimize processing for set to Minimum memory use? If not, you probably should. In many cases this is actually faster than Maximum speed, due to the reduced memory use.

N8AKTIV · July 24, 2025, 9:36am

… I tried both, but it didn’t make much difference.
At the “Gather” step we are already at 8GB and as soon as the two input streams with 5.4M lines each are combined via full join, another 3GB are required.
With “Maximum Speed” even 1GB less RAM is required

Admin · July 24, 2025, 9:43am

That is surprising because Maximum Speed doesn’t use memory compression.

Admin · August 18, 2025, 8:04pm

We are working on a new Difference transform. It can output a dataset of differences between 2 input datasets. The two input datasets can be compared by row number or using key columns. Columns are compared by column number (not name).

It gives the same results to the Compare Data window, but output in a tabular form.

It should be pretty fast.

You can download a snapshot release with the new transform here:

Windows: https://www.easydatatransform.com/downloads/EasyDataTransform_2_8_1_snapshot_1.exe
Mac: https://www.easydatatransform.com/downloads/EasyDataTransform_2_8_1_snapshot_1.dmg

Let us know what you think.

N8AKTIV · August 19, 2025, 10:30pm

… tested with these small datasets:

… it work perfectly when column-structure is equal
and also when columns (e.g. Attribut3) are missing at the end

When comparing by column-no. it would be helpful if the column-names were also visible next to the column-number, so that we can see which attribute we are talking about in this line.

Is it possible to add an option to compare by column-names ?

If a column is missing in the middle (e.g. Attribut2), this is declared as a missing column and also as renaming the column and all the rows are shown as a difference …

Difference_Test1.transform (3.4 KB)

Thanks a lot in advance !

Admin · August 19, 2025, 10:41pm

I did think about that. But:

there is no limit on how long a column name can be
the column name might be different in the 2 datasets

So that makes it tricky.

Admin · August 19, 2025, 10:46pm

Perhaps. Until then you need to make sure the column ordering is the same in the 2 datasets. Note that you can do this by using Sample with Rows=0 to remove all the rows from one dataset (so it is only the header), then Stack the other dataset under it by column name.

Amontillado · August 21, 2025, 2:28pm

I noticed the compare function is only available for files, so you can’t compare datasets if the datasets are transforms.

Then I noticed if you attach an output file to a transform, that output is a valid source for the compare feature.

There are always other ways to compare data, of course. EDT is truly cool.