Getting the Top 10

Is there an easy way, say from a list that has:

date, account#, amount

to get top 10 amounts and associated account each day?

I know I can sort by date and then by highest amount.

If your Sort had an ability to optionally add a rank number in a new column, then I’d be able to filter for the rank being <= 10.

But with EDT as it currently is, what would be the easiest way for me to do this?

I would use a Sort then use Sample to get the top 10.

image

But if I have 500 different dates in the file, and I want the top 10 for each date, then to use Sample, I’d first have to sort my file by date and by highest amount, filter my file into 500 files - one for each date, Sample 10 from each, and then combine them again. That’s not very convenient.

hmm, haven‘t tried it or thought through in detail, but I would build two transformation files, one to split the entry file into multiple files and another one read in the created files in a batch an sample each one and write the output into into one file (append). This should be controlled by a command line script deleting first the old or temporary files and run the transformation scripts afterwards. EDT support to export the command line commands.
Have similar jobs implemented, doing complex for a project about 3 to 4 times per day.
First implementation cost some tim and effort but it is quite efficient in execution afterwards. I even trigger script execution via keystroke on my Mac, but the process works generally on Win as well.

1 Like

I missed that there were multiple days in the file.

You can do it with a few transforms.

See attached for a simple example that gets the top 2 amounts for each day and account.

top2.transform (3.0 KB)

Thanks Andy. Not exactly what I was asking, but close enough that it gives me a good idea how to do it.

If you submit a small example input dataset and the output you want for that input, then it is easier to understand what you are trying to do.

Note also that my answer may not do what you want when multiple accounts have the same amount for the same day.

Share a sample file and I will show you how to do this.

1 Like

I’ve got an example for you Dan, but when I try to upload it, it says “Sorry, new users cannot upload files”.

But here’s what the first 21 lines of the input file looks like:

Date,AccountNum,Amount
2022-09-01,1001,177
2022-09-01,1002,0
2022-09-01,1003,0
2022-09-01,1004,212.7
2022-09-01,1005,0
2022-09-01,1006,0
2022-09-01,1007,3463.3
2022-09-01,1008,36.8
2022-09-01,1009,39.3
2022-09-01,1010,0
2022-09-01,1011,38800.1
2022-09-01,1012,19.1
2022-09-01,1013,43.5
2022-09-01,1014,2.3
2022-09-01,1015,0
2022-09-01,1016,29483.1
2022-09-01,1017,0
2022-09-01,1018,0
2022-09-01,1019,1839.2

And here’s what the first 21 lines of the output file should look like:

Date,Rank,Amount,AccountNum
2022-09-01,1,38800.1,1011
2022-09-01,2,29483.1,1016
2022-09-01,3,20294.1,1024
2022-09-01,4,16030.5,1032
2022-09-01,5,14092.4,1039
2022-09-01,6,4015.9,1022
2022-09-01,7,3463.3,1007
2022-09-01,8,2370.9,1036
2022-09-01,9,1839.2,1019
2022-09-01,10,1502.5,1059
2022-09-02,1,41631.8,1009
2022-09-02,2,15988.7,1033
2022-09-02,3,14452.5,1021
2022-09-02,4,13514.6,1040
2022-09-02,5,11585.2,1027
2022-09-02,6,9619.5,1061
2022-09-02,7,8487.8,1066
2022-09-02,8,8433.2,1050
2022-09-02,9,7764.7,1074
2022-09-02,10,6796.6,1052

I don’t care whether the ranks increment or not on tied amounts.

Hi,

You can follow the tip number 4, to upload the data set while you are still new user and cannot attach files.

1 Like

Thank you Anony,

Here’s a link to my sample files:
https://www.swisstransfer.com/d/68bd426f-7836-49b4-9e9f-f7b9a5181bb8

Hi,

Please try out the following transformation.

Top10.transform (3.2 KB)

2 Likes

It is on the wishlist to be able to add a column with the rank of each row. This would probably be relatively easy to add to Sort as an optional extra column.

1 Like

Oh… this is good! Much simpler and better than my solution.

Nice job Anonymous.

2 Likes

Anonymous: That works perfect. Thank you!

Very ingenious of you, to create a cumulative total of 1’s by date.

1 Like

One of the tags seemed to be unclosed in your .transform file when I downloaded it. Here is another version with that fixed.

Top10-fixed.transform (3.2 KB)

The latest snapshot includes the ability to add a Rank column to the Sort transform. Which now means you can do it in only 3 transforms:

Top10-ranking.transform (2.2 KB)

Also ties can now be correctly handled.

This is great! Thanks, Andy.

1 Like