Mac Automator - Folder action : process large TXT files, filter data, copy to excel and move process

Posted:
in Mac Software

Every day I want to process one very large txt file (up to 5 million rows, up to 400 MB) with following layout:

Prefix|Code|Date|Time|Type|Size|Price|Flag 1|Flag 2
Z|X|20210903|07:00:00|S|33|27.71||
Z|ABCD|20210903|07:00:00|S|97|5.26||
Z|ABCD|20210903|07:00:00|S|75|3.14||
Z|EFG|20210903|07:00:01|S|300|9.21||
Z|HIJKL|20210903|07:00:23|S|17|5.76||
Z|MN|20210903|07:01:14|S|8|5.76||
Z|EFG|20210903|07:09:00|S|300|9.16||
Z|ABCD|20210903|07:14:35|S|100|120.07||
Z|OPQRSTU|20210903|07:29:10|S|100|5.76||
All files have a specific file name convention:
  • CHARshYYYYMMDD.txt, where YYYY is the year, MM the month, DD the day

I created 3 folders to start. I store all data on an external disk drive as the files are taking too much space on my laptop (Mac OS):

  • [path]/01-toProcess
  • [path]/02-processed
  • [path]/03-reporting

I want to create a folder action which does the following:

  • When a file starting with "CHARsh" and of filetype .txt is dropped in folder 01-toProcess,
  • Process the file(s) and filter all records where 'Code' matches 'ABCD' and 'EFG'
  • For each code, create a separate txt file with filename CHARsh-[code]-YYYYMMDD.txt, where [code] is the code name, for example "CHARsh-ABCD-YYYYMMDD.txt"
  • The file should have the following layout, with the header included, and one extra column Example for code ABCD:
Prefix|Code|Date|Time|Type|Size|Price|Flag 1|Flag 2|Timestamp
Z|ABCD|20210903|07:00:00|S|97|5.26||03-09-2021 07:00:00
Z|ABCD|20210903|07:00:00|S|75|3.14||03-09-2021 07:00:00
Z|ABCD|20210903|07:14:35|S|100|120.07||03-09-2021 07:14:35
  • So, I added a column "Timestamp" with format 'dd-mm-yyyy hh:mm:ss'. I think this is the best format to make better line charts in excel...
  • Save that file into a separate folder called 04-[code]. If the folder would not yet exist, the automator should create one... So, in the example 2 folders/files would be created:
    • folder 04-ABCD, and file(s) CHARsh-ABCD-YYYYMMDD.txt
    • folder 04-EFG, and file(s) CHARsh-EFG-YYYYMMDD.txt
  • Import the filtered data into an excel file per code, so
    • ABCD.xslx
    • EFG.xslx
  • Every day, the new data should be added in the same excel File and Tab, below the data of the previous day (so I have only one table). So I am creating a history of data in excel per 'Type'
  • At the end of the process, the original input file is moved to the '02-processed' folder

Once I have the excel file(s), I can start creating line graphs with the data...

I think this is quite a challenge, especially as the files are extremely big, and the data is not sorted on type, but on date/time... I hope this is feasible...

I hope some smart brains can help me out... Thanks in advance ! 

Anna

Sign In or Register to comment.