Whew, that was a lot to read!
Most of it looks like basic string manipulation, and I agree with @Stranger_On_A_Train that it could be drastically reduced to a few lines of any modern programming language, Python or otherwise (especially the whole section dedicated to “make this letter uppercase”, which is a single function call in any modern language).
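Just to show what I mean — in Python, that entire uppercasing section collapses to a built-in string method (the sample value here is made up from your earlier example, not your real data):

```python
# Hypothetical sample value -- substitute your actual field
name = "johnny's builders and contractors inc"

print(name.upper())  # uppercase the whole string

# One gotcha: str.title() mangles apostrophes ("Johnny'S"),
# so for word-capitalization use string.capwords instead
from string import capwords
print(capwords(name))
```

That's the whole feature, more or less.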
But what I don’t understand is what you said in the other thread (which is from 2017? have the app requirements changed since then?):
How does your Easytrieve program (or some other part of your workflow) “know” that `JOHNNY'S BUILDERS AND` belongs together with `CONTRACTORS INC` as two parts of a name, while `8910 HARVARD` is part of the street? And not every incoming row of data will be structured exactly that way, right, with the second “field” consisting of part of a name and part of a street?
Like you’re not just moving certain characters around from one column to another, you actually have to have some sort of heuristic there to try to detect “these words & characters together are probably a name, and that other part is probably part of the street address”. I don’t see that in the existing Easytrieve code, unless I’m missing something?
Is that feature (string token detection & name/address normalization) part of your requirements for the new app or not?
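To be concrete about what I mean by a heuristic, it would be something like this — and to be clear, this is pure guesswork on my part about your data, not a rule I found in your code:

```python
import re

# Guessed heuristic: a street address usually starts with a house
# number, so split the combined field at the first standalone run
# of digits. This is illustrative, not your actual business rule.
STREET_START = re.compile(r"\b\d+\b")

def split_name_street(field: str) -> tuple[str, str]:
    m = STREET_START.search(field)
    if not m:
        return field.strip(), ""  # no digits -> assume it's all name
    return field[:m.start()].strip(), field[m.start():].strip()

print(split_name_street("CONTRACTORS INC 8910 HARVARD"))
```

And note how fragile that is: a business named “4 SEASONS LLC” would fool it instantly, which is exactly why I’m asking whether that detection is a real requirement.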
You can probably port the current Easytrieve code you have to Python or another modern language with ChatGPT, Gemini, Claude, or another coding-centric LLM. If you can provide sample input & output files, it can also write the tests for you to help ensure that the resulting conversion is correct. Here’s the beginning of a ChatGPT convo, for example, but I can’t properly test it without input & output samples: ChatGPT - Easytrieve Program Analysis
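Just to make the porting point concrete: the CPEG output-record section of your code maps almost line-for-line onto plain string concatenation in Python. (The field widths and the MMDDCCYY date order below are my guesses, not taken from your actual copybook.)

```python
from datetime import date

# Rough sketch of building the CPEG output record in Python;
# widths and date-field order are assumptions, not your layout.
today = date.today()
record = (
    "C "                        # CPEG-REC-CD
    + "802364"                  # CPEG-CONTRIB#
    + "1250"                    # CPEG-BUS-CAT
    + today.strftime("%m%d%Y")  # CPEG-EXTDT as MMDDCCYY
)
print(record)  # a 20-character fixed-width record
```

An LLM can churn out this kind of translation for the whole program pretty reliably, *if* you give it samples to verify against.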
However, I don’t know if this would fully solve your problem… it’s one thing to just reach feature parity with your existing Easytrieve code, but it’s much harder if you also need to try to clean up the resulting names & addresses.
I also don’t believe Tableau would be able to do this kind of thing in its GUI alone; you would probably also need to write some sort of script, or a very convoluted query, to match your current transformation.
This sort of “ETL” (extract, transform, load) work is very common in today’s workflows, and it would probably be easier to maintain over time (especially for anyone who comes after you) if you did port it to Python, learning the basics with LLM or YouTube help… it is the de facto standard language for data processing now, especially with all the AI stuff. There are also many, many online tutorials and videos for Python, both free and paid.
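The whole ETL shape fits in a screenful of standard-library Python — this is a toy sketch with invented column offsets, just to show the structure, not your real layout:

```python
import csv, io

# Invented fixed-width layout: name in cols 1-30, street in 31-55.
# Replace with the offsets from your actual record definition.
FIELDS = {"name": (0, 30), "street": (30, 55)}

def extract(line):
    """Slice one fixed-width line into named fields."""
    return {k: line[a:b].strip() for k, (a, b) in FIELDS.items()}

def transform(rec):
    """Whatever cleanup you need -- here, just uppercase the name."""
    rec["name"] = rec["name"].upper()
    return rec

# Load: write the result out as CSV (in-memory here for the demo)
raw = "johnny's builders and contrs".ljust(30) + "8910 harvard".ljust(25)
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(FIELDS))
writer.writeheader()
writer.writerow(transform(extract(raw)))
print(out.getvalue())
```

Point it at real files instead of `StringIO` and that’s basically the whole pipeline.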
But I’m not sure what your underlying goals are here. Your existing programs read like they’re translating between several archaic systems; in this new version, will the output still have to go into other archaic systems (so it’s necessary to maintain the current formatting), or is it ultimately used for some other purpose? Cuz you could drastically simplify the logic if you’re able to use modern data formats that aren’t character-delimited like this.
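For instance, the same record as self-describing JSON (field names invented for illustration) needs no column counting at all:

```python
import json

# Illustrative only -- these field names aren't from your layout
record = {
    "name": "JOHNNY'S BUILDERS AND CONTRACTORS INC",
    "street": "8910 HARVARD",
}
print(json.dumps(record, indent=2))
```

Every modern tool (Python, Tableau, databases) can ingest that directly.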
Can you provide more details?
And FYI, for readability, you can make code blocks with tags like [code]code blocks[/code], or just surround them with triple backticks like:
```
your code goes here
```
becomes:
```
** Write CPEG-format output file **
CPEG-OUT-REC = ' '
CPEG-REC-CD = 'C '
CPEG-CONTRIB# = '802364'
CPEG-BUS-CAT = '1250'
*CPEG-EXTDT = CPEG-A-EXT-DATE
CPEG-EXTDT = WS-DATE
CPEG-EXTDT-MM = WS-MM
CPEG-EXTDT-DD = WS-DD
CPEG-EXTDT-CC = '20'
CPEG-EXTDT-YY = WS-YY
```
etc.
You can also hide the whole thing inside a details block that won’t expand until someone clicks on it:
[details="Summary"]
This text will be hidden
[/details]