In our accounting business, we use Google Workspace for client files.
Had a prospect upload AMEX statements, 4 years' worth, for review.
One of our tasks is simply counting transactions, by month, to estimate how much work the catch-up bookkeeping will take. Most statements include a transaction count; AMEX's do not. Usually we have an assistant for this, but this week (and next) she is out of office, so it's up to me.
So I decided to make things easy for my lazy ass. Ask Gemini how many transactions there are in a statement. There were two ways to do this:
- In the folder itself, asking it to give me a count, by month, of the transactions in all statements.
- In each statement, one-by-one.
(I manually counted the first 6 months of statements to check Gemini's accuracy.)
The first was a basic failure. It only counted half of the statements, mis-monthed them (is that a word?), for example telling me the first statement was from December 2020 when the statements actually start in January 2021, and got the counts horribly wrong. Revising the prompt did nothing, so I gave this up after 10 minutes.
When I asked Gemini month-by-month, with each statement open, it was more accurate but rarely exact. February actually had 64 transactions; Gemini told me there were 58. Revising my prompt helped, but only one month's count matched the actual number.
This doesn't really seem to be a training-data problem in the model or whatever (at least, not as far as I can tell). I'm just asking the thing to count the number of transactions, telling it how to identify what counts as a transaction, and so on.
It just can't do it, and yet you would think this is exactly the sort of task AI could handle easily.
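For what it's worth, the counting itself is trivially deterministic once the data is structured. A minimal sketch of the kind of thing I mean, assuming the statements could be exported to CSV with a date column (the column name `Date` and the date format here are assumptions for illustration, not AMEX's actual export layout):

```python
import csv
from collections import Counter
from datetime import datetime

def count_transactions_by_month(csv_path, date_column="Date", date_format="%m/%d/%Y"):
    """Count transaction rows per (year, month) in a statement CSV export.

    Assumes one transaction per row and a parseable date column;
    both are assumptions about the export format, not guarantees.
    """
    counts = Counter()
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            d = datetime.strptime(row[date_column], date_format)
            counts[(d.year, d.month)] += 1
    # Sorted by (year, month) so the output reads chronologically
    return dict(sorted(counts.items()))
```

A script like this gives an exact count every time, which is exactly the property the LLM approach lacked.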