How are you running processes using EDT ? Or welcome to Prashant's low cost ETL heaven

Some knowledge sharing which might be helpful to others

Problem we solved

  • Individualised reports to each employee regarding their task from project managers
  • Individualised reports to each salesman regarding their orders from ERP

What we did

  • We currently have tonnes of processes which require EDT to run on daily / weekly basis. Once we discovered batch processes we decided to use some automation to help out
  • all files are downloaded from SQL in csv format using AutoSQL (One time purchase and just works fab)
  • using chatGPT we’ve made json files for each processes , it’s general structure is below
  1. Input Files from AutoSQL are renamed by AutoSQL
  2. Non SQL files are copied in 1 single input directory via python
  3. Python runs the CLI batch process for EDT
  4. EDT Processes files and outputs them
  5. Using python(xlwings) we copy the csv items in a templatised (is this a word?) ready XLSX file which contains ready made table objects
  6. Using XLWings solves the biggest problems I have with V1 of EDT , no templates
  7. Files are emailed to users using postmark api

Learnings

  • the most tedious part was back and forth with chatGPT by team to make modular jsons, in hindsight we should have hired a programmer from fiverr
  • We found more value in having some manual process , to verify data once downloaded , EDT is sensitive to column changes.
  • XLWings was a game changer , it allows us to interact with excel tables as an object and add data to it directly without messing up ranges .

Questions

  • what do your starting / end processes look like ?
  • what tool you recommend ?
  • Post not making sense ? well it’s 1:15 AM at time of posting this , what do you expect :stuck_out_tongue:


2 Likes

Hoping to make some of this easier in v2:

  • import from SQL databases
  • use existing XLSX files as templates when outputting to Excel
  • handle Excel ranges/tables
  • built in data validation
  • email outputs (maybe)
3 Likes
  • Oooh exciting , outputting to excel templates will really help produce pixel perfect excel files
  • The massive advantage of handling excel tables as an object is ,
    • you don’t need to know where the table is on the given sheet , we can refer to it directly via [FileName][SheetName][TableName] .
    • You can add items / delete items from Table directly as an object
    • Multiple tables can exist on one sheet
  • We found more value in having some manual process , to verify data once downloaded , EDT is sensitive to column changes.

In any transformation which I use productive (repeatedly) I start with the stack transaction to bring the columns in the required order and even remove with this columns which are not needed for the processing. This saves in case users change the order of columns. It doesn’t help if a user change the column Header description, then you lost.

So I do some manuell steps, too, to check the input files and/or the out put ones, In case a column is missing in the output or the result are in wrong columns can be exactly the header name change issue.

2 Likes