Whew, that was a lot to read!
Most of it looks like basic string manipulation, and I agree with @Stranger_On_A_Train that it could be drastically reduced to a few lines of any modern programming language, Python or otherwise (especially the whole section dedicated to “make this letter uppercase”, which is a single function call in any modern language).
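Just to show what I mean — in Python, that entire uppercasing section collapses to a built-in string method (the sample value here is made up from your earlier example, not your real data):

```python
# Hypothetical sample value -- substitute your actual field
name = "johnny's builders and contractors inc"

print(name.upper())  # uppercase the whole string

# One gotcha: str.title() mangles apostrophes ("Johnny'S"),
# so for word-capitalization use string.capwords instead
from string import capwords
print(capwords(name))
```

That's the whole feature, more or less.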
But what I don’t understand is what you said in the other thread (which is from 2017? have the app requirements changed since then?):
How does your Easytrieve program (or some other part of your workflow) “know” that `JOHNNY'S BUILDERS AND` belongs together with `CONTRACTORS INC` as two parts of a name, while `8910 HARVARD` is part of the street? And not every incoming row of data will be structured exactly that way, right, with the second “field” consisting of part of a name and part of a street?
Like you’re not just moving certain characters around from one column to another, you actually have to have some sort of heuristic there to try to detect “these words & characters together are probably a name, and that other part is probably part of the street address”. I don’t see that in the existing Easytrieve code, unless I’m missing something?
Is that feature (string token detection & name/address normalization) part of your requirements for the new app or not?
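To be concrete about what I mean by a heuristic, it would be something like this — and to be clear, this is pure guesswork on my part about your data, not a rule I found in your code:

```python
import re

# Guessed heuristic: a street address usually starts with a house
# number, so split the combined field at the first standalone run
# of digits. This is illustrative, not your actual business rule.
STREET_START = re.compile(r"\b\d+\b")

def split_name_street(field: str) -> tuple[str, str]:
    m = STREET_START.search(field)
    if not m:
        return field.strip(), ""  # no digits -> assume it's all name
    return field[:m.start()].strip(), field[m.start():].strip()

print(split_name_street("CONTRACTORS INC 8910 HARVARD"))
```

And note how fragile that is: a business named “4 SEASONS LLC” would fool it instantly, which is exactly why I’m asking whether that detection is a real requirement.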
You can probably port the current Easytrieve code you have to Python or another modern language with ChatGPT, Gemini, Claude, or another coding-centric LLM. If you can provide sample input & output files, it can also write the tests for you to help ensure that the resulting conversion is correct. Here’s the beginning of a ChatGPT convo, for example, but I can’t properly test it without input & output samples: ChatGPT - Easytrieve Program Analysis
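Just to make the porting point concrete: the CPEG output-record section of your code maps almost line-for-line onto plain string concatenation in Python. (The field widths and the MMDDCCYY date order below are my guesses, not taken from your actual copybook.)

```python
from datetime import date

# Rough sketch of building the CPEG output record in Python;
# widths and date-field order are assumptions, not your layout.
today = date.today()
record = (
    "C "                        # CPEG-REC-CD
    + "802364"                  # CPEG-CONTRIB#
    + "1250"                    # CPEG-BUS-CAT
    + today.strftime("%m%d%Y")  # CPEG-EXTDT as MMDDCCYY
)
print(record)  # a 20-character fixed-width record
```

An LLM can churn out this kind of translation for the whole program pretty reliably, *if* you give it samples to verify against.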
However, I don’t know if this would fully solve your problem… it’s one thing to just reach feature parity with your existing Easytrieve code, but it’s much harder if you also need to try to clean up the resulting names & addresses.
I also don’t believe Tableau would be able to do this kind of thing in its GUI alone; you would probably also need to write some sort of script, or a very convoluted query, to match your current transformation.
This sort of “ETL” (extract, transform, load) work is very common in today’s workflows, and it would probably be easier to maintain over time (especially for anyone who comes after you) if you did port it to Python, learning the basics with LLM or YouTube help… it is the de facto standard language for data processing now, especially with all the AI stuff. There are also many, many online tutorials and videos for Python, both free and paid.
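The whole ETL shape fits in a screenful of standard-library Python — this is a toy sketch with invented column offsets, just to show the structure, not your real layout:

```python
import csv, io

# Invented fixed-width layout: name in cols 1-30, street in 31-55.
# Replace with the offsets from your actual record definition.
FIELDS = {"name": (0, 30), "street": (30, 55)}

def extract(line):
    """Slice one fixed-width line into named fields."""
    return {k: line[a:b].strip() for k, (a, b) in FIELDS.items()}

def transform(rec):
    """Whatever cleanup you need -- here, just uppercase the name."""
    rec["name"] = rec["name"].upper()
    return rec

# Load: write the result out as CSV (in-memory here for the demo)
raw = "johnny's builders and contrs".ljust(30) + "8910 harvard".ljust(25)
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(FIELDS))
writer.writeheader()
writer.writerow(transform(extract(raw)))
print(out.getvalue())
```

Point it at real files instead of `StringIO` and that’s basically the whole pipeline.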
But I’m not sure what your underlying goals are here. Your existing programs read like they’re translating between several archaic systems; in this new version, will the output still have to go into other archaic systems (so it’s necessary to maintain the current formatting), or is it ultimately used for some other purpose? Cuz you could drastically simplify the logic if you’re able to use modern data formats that aren’t character-delimited like this.
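For instance, the same record as self-describing JSON (field names invented for illustration) needs no column counting at all:

```python
import json

# Illustrative only -- these field names aren't from your layout
record = {
    "name": "JOHNNY'S BUILDERS AND CONTRACTORS INC",
    "street": "8910 HARVARD",
}
print(json.dumps(record, indent=2))
```

Every modern tool (Python, Tableau, databases) can ingest that directly.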
Can you provide more details?
And FYI, for readability, you can make code blocks with tags like [code]code blocks[/code], or just surround them with triple backticks like:
```
your code goes here
```
becomes:
```
** Write CPEG-format output file **
CPEG-OUT-REC = ' '
CPEG-REC-CD = 'C '
CPEG-CONTRIB# = '802364'
CPEG-BUS-CAT = '1250'
*CPEG-EXTDT = CPEG-A-EXT-DATE
CPEG-EXTDT = WS-DATE
CPEG-EXTDT-MM = WS-MM
CPEG-EXTDT-DD = WS-DD
CPEG-EXTDT-CC = '20'
CPEG-EXTDT-YY = WS-YY
```
etc.
You can also hide the whole thing inside a details block that won’t expand until someone clicks on it:
[details="Summary"]
This text will be hidden
[/details]