‘Schema drift’ is when the columns in an input change over time. For example:
- new columns are added
- existing columns are deleted
- the column order changes
It can be a real nuisance for transformations that you run regularly.
In Easy Data Transform you can handle this using Stack, but it can be quite tedious.
So we are adding a new Schema feature to inputs in v2. This gives you the option to store an ordered list of column names with each input and to choose what should happen if the input does not match the schema:
- add missing columns (in the schema, but not in the input) with empty values
- rearrange input columns into the same order as the schema
- add or ignore extra columns (in the input, but not in the schema)
- stop with an error
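
To make the options above concrete, here is a minimal Python sketch of the same idea. It is only an illustration, not how Easy Data Transform implements the feature. The function `apply_schema`, its parameters and the `SchemaMismatchError` class are hypothetical names invented for this example, and it assumes column names are unique.

```python
from typing import List, Sequence, Tuple


class SchemaMismatchError(Exception):
    """Raised when the input columns do not match the schema (hypothetical)."""


def apply_schema(
    columns: Sequence[str],
    rows: Sequence[Sequence[str]],
    schema: Sequence[str],
    *,
    add_missing: bool = True,
    keep_extra: bool = False,
    strict: bool = False,
) -> Tuple[List[str], List[List[str]]]:
    """Reshape an input table so its columns follow the schema.

    Assumes column names are unique within `columns` and within `schema`.
    """
    # Option: stop with an error on any difference, including column order.
    if strict and list(columns) != list(schema):
        raise SchemaMismatchError(f"expected {list(schema)}, got {list(columns)}")

    missing = [name for name in schema if name not in columns]
    extra = [name for name in columns if name not in schema]

    if missing and not add_missing:
        raise SchemaMismatchError(f"missing columns: {missing}")

    # Schema columns come first, in schema order; extra columns are
    # appended at the end or dropped, depending on keep_extra.
    out_columns = list(schema) + (extra if keep_extra else [])

    # Map each input column name to its position so rows can be rearranged.
    position = {name: i for i, name in enumerate(columns)}
    out_rows = [
        [row[position[name]] if name in position else "" for name in out_columns]
        for row in rows
    ]
    return out_columns, out_rows


# Example: 'email' is missing, 'notes' is extra, and the order differs.
cols, data = apply_schema(
    ["name", "id", "notes"],
    [["Ann", "1", "x"]],
    schema=["id", "name", "email"],
)
print(cols)  # ['id', 'name', 'email']
print(data)  # [['1', 'Ann', '']]
```

Passing `keep_extra=True` would append the `notes` column after the schema columns instead of dropping it, and `strict=True` would raise an error for this input.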
We hope to have a beta version that customers can try before too long.