Non exact JOIN operations (e.g. key A contains or starts with key B)

Typically Join operation are done on matches between key columns in 2 datasets and the key values need to match exactly, including case and whitespace.

However, sometimes you may need to do a more exotic join, for example matching when the key value in column A contains, starts with or is a fuzzy match with the key value in column B. You can do this in Easy Data Transform using a Cross transform to add a row for every possible combination of rows, followed by a Filter to remove any non-matches.

join-contains.transform (2.1 KB)

Note that Cross can create very large datasets. A Cross of 2 one thousand row datasets will create a 1 million row dataset. So this technique may not work with large datasets, due to insufficient memory.

1 Like

So why not simply enhance Join transform and provide either in drop down or as a radio button options to choose from that will join on Key1 and Key2.

Key1 and Key2 match options

  • Exact
  • Contains
  • Starts with
  • Fuzzy match
    … and so on.
    and Exact could be set as default.

It is a possibility. But not much demand for it so far.