AI is wonderful and will make your life better! (not)

I don’t accept the existence of the new Battlestar Galactica. Cylons are evil toasters and that’s all there is to it.

Yeah, the old one may have been bad, but it was our bad.

I prefer Commander Cartwright and my Starbuck to be a guy. I’m old school. John Colicos was a great Baltar. And there’s no Cain like Lloyd Bridges. Imperious Leader can only be Patrick Macnee.

If only the plots were better.

And the less said about Robbie Rist, the better!

I am very good at spending a whole morning doing something “efficiently” instead of just taking 5 minutes to do it old school.

I doubt it. Either the AI fad will collapse, or the organizations/societies that use it will collapse and be unable to build or maintain them. They are just too incompetent to sustainably run everything they’ve been given power over.

There’s “stupid” and there’s “doesn’t work”, and this so-called AI is the latter.

Well, OCR would come in handy if there was an image in the PDF file, but the content of a PDF is at its heart a bunch of text. In addition to the text visible in the document, it has a bunch of instructions for how to render that text and/or any images embedded in the PDF. If you can interpret that properly and render it correctly, then you can create an image that OCR can take a crack at.

But interpreting a PDF can be a trial. Some of them use the layout language to create the text as curves, basically rendering the text in vector graphics. Keeps you from having to include fonts in your PDF, but whoa, that makes interpreting its text a bear. Interpreting the formatting of columns and making the text within them actually make sense is also a problem, even when the information contained within them is actually stored as text.
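For anyone curious what “a bunch of text plus rendering instructions” looks like, here’s a toy sketch in Python. Big assumptions: the content streams are hand-written and uncompressed, `extract_text` is a helper I made up for illustration, and it only understands the `Tj` and `TJ` text-showing operators with simple literal strings. Real PDFs add compression, font encodings, and positioning that this ignores entirely, so treat it as a picture of the idea and not a parser.

```python
import re

# A literal PDF string such as (Hello); nested parentheses are not handled.
LITERAL = rb"\((?:\\.|[^\\()])*\)"

# Either "(string) Tj" or "[ (str) num (str) ... ] TJ".
SHOW = re.compile(rb"(" + LITERAL + rb")\s*Tj|\[(.*?)\]\s*TJ", re.DOTALL)


def unescape(raw):
    """Strip the outer parentheses and undo the \\( \\) \\\\ escapes."""
    body = raw[1:-1]
    return re.sub(rb"\\([()\\])", rb"\1", body).decode("latin-1")


def extract_text(stream):
    """Collect the strings shown by Tj/TJ operators, in stream order."""
    parts = []
    for match in SHOW.finditer(stream):
        if match.group(1) is not None:
            parts.append(unescape(match.group(1)))
        else:
            # TJ arrays mix strings with kerning numbers; keep only strings.
            for piece in re.findall(LITERAL, match.group(2)):
                parts.append(unescape(piece))
    return "".join(parts)


# Hypothetical stream where the text is stored as text.
TEXT_STREAM = rb"""
BT
/F1 12 Tf
72 720 Td
(Hello, ) Tj
[(Wor) -20 (ld)] TJ
ET
"""

# Hypothetical stream where a glyph was drawn as a path: only move/line/curve
# operators, so there is no string for a parser to find.
CURVES_STREAM = rb"10 10 m 20 20 l 30 10 40 40 50 10 c S"

print(repr(extract_text(TEXT_STREAM)))
print(repr(extract_text(CURVES_STREAM)))
```

The second stream is the “text as curves” case: the page would still show letters, but the extractor comes back empty-handed, which is where rendering to an image and running OCR becomes the only option.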

There are some specialized AI tools for data extraction from PDFs that claim to be as near as perfect as possible, but I’ve never used one. From my experience with other tools, as the document in question becomes more complex, the edge cases creep in.

Yeah, it’s like if you’re really good at translating from Latin, but instead of getting Latin you got somebody else’s bad translation from Latin. So now your translation skills aren’t very useful. You’re instead dealing with someone else’s mess.

And you reminded me of a recent news article I read about some tech CEO that fired 80% of his staff because they didn’t fully embrace AI. You would have got to keep your job. :)

Well, I’m not “techy”, so “OCR” may not have been the correct wording. Regardless, I upload PDFs of Franchise Disclosure Documents to Gemini/GPT all the time to extract data and they rarely fail at the task.

Only some PDFs contain “a bunch of text.” Plenty are just an image of a scanned document, and that’s where OCR is useful.

Well, yeah. Most PDFs aren’t weird.

However, one of the products I support is an email security product that scans literally hundreds of thousands of them a day. The parser doesn’t have a problem with most of them, but every few days or so we get one that it can’t process properly. Sometimes the doc doesn’t actually conform to the PDF spec, sometimes it’s within the spec but uses it in a way that our parser can’t handle. We file a bug on the parser in the latter case.

The next line in the same paragraph covers that.

Even the ones that only have an image visible still contain a bunch of text describing how to present that image.

Not really germane to the conversation: I fucking hate PDFs that are just an image of text. A scan of an old hard copy of a document? Fine, I’ll deal with that, no way around it. But some geniuses will go out of their way to make both our lives harder by taking a screenshot of a PDF that really is nothing but text that you could copy and paste (or a similar document format), and embedding that image into a new PDF. I’m so lucky I’m not close enough to strangle these people.

OCR can usually figure it out, but it doesn’t always. Frankly, if it’s an image of a document that I know is actually copy/pasteable text, I won’t even try OCR. I’ll just send them a message saying “Nope, not playing your dumb game, here’s how you export that doc in a useful way.”

Fun article from back in March (practically ancient history) about the pitfalls of using LLMs as OCR engines:

I wish I were still working so I could do this.

“That said, I predict — counter-intuitively — that it will increase the birth rate!” he continued without explanation. “Mark my words. Also, we’re gonna program it that way.”

Stranger

Maybe he just got through streaming Demon Seed?

I’m pretty sure that this film is Elon Musk’s inspiration, and I don’t mean for artificial intelligence.

Stranger

I heard variations of this when Photoshop was launched in the early 1990s.

So…

How do I calculate pi, ChatGPT?

Did anyone ever evaluate Adobe as having astronomically inflated market valuations or predict that it would put the majority of white collar workers on the dole?

Only a ponce would use the Leibniz series to calculate π. Twenty points higher than me, doesn’t know to use a spigot algorithm to calculate transcendental numbers? *shakes head*
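For the non-ponces, here’s roughly what the difference looks like, as a sketch in Python. The function names `leibniz_pi` and `pi_digits` are mine, and the second one is the unbounded spigot usually credited to Jeremy Gibbons, which is just one spigot algorithm among several. The Leibniz series crawls toward the answer with an error of roughly 1/terms, while the spigot hands out exact decimal digits one at a time using only integer arithmetic.

```python
import math
from itertools import islice


def leibniz_pi(terms):
    """Approximate pi with the first `terms` terms of 4 * (1 - 1/3 + 1/5 - ...)."""
    total = 0.0
    for k in range(terms):
        total += (-1) ** k / (2 * k + 1)
    return 4 * total


def pi_digits():
    """Yield the decimal digits of pi one by one (Gibbons-style unbounded spigot)."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is settled: emit it and rescale by 10.
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            # Not enough information yet: fold in another term of the series.
            q, r, t, k, n, l = (
                q * k,
                (2 * q + r) * l,
                t * l,
                k + 1,
                (q * (7 * k + 2) + r * l) // (t * l),
                l + 2,
            )


# After 1000 terms Leibniz is still off in the third decimal place.
print(abs(leibniz_pi(1000) - math.pi))

# The spigot produces exact digits with no floating point involved.
print(list(islice(pi_digits(), 10)))
```

The integers in the spigot grow without bound as it runs, so it is not free either, but every digit it emits is final.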

Stranger

Nobody tried to put Photoshop in charge of anything, so that analogy doesn’t work.

My comparison with Photoshop was WRT how things could be altered, and would most people be able to tell the difference.

Deceptive bullshit like this really pisses me off. I’m not saying this didn’t happen, but I’m saying there’s either a great deal of unstated information about what really happened, or some really major glitches in GPT-5, which is a software development issue and not an AI issue.

I just asked ChatGPT the exact same question – to name the first 12 US presidents – and the answer to this simple question was predictably unsurprising:

Q: Who were the first 12 US presidents?

ChatGPT: Here’s the list of the first twelve U.S. presidents in order, with their years in office:

  1. George Washington (1789–1797)
  2. John Adams (1797–1801)
  3. Thomas Jefferson (1801–1809)
  4. James Madison (1809–1817)
  5. James Monroe (1817–1825)
  6. John Quincy Adams (1825–1829)
  7. Andrew Jackson (1829–1837)
  8. Martin Van Buren (1837–1841)
  9. William Henry Harrison (1841) — served only one month before dying in office
  10. John Tyler (1841–1845)
  11. James K. Polk (1845–1849)
  12. Zachary Taylor (1849–1850)

Would you like me to also give you a quick note on each (like their main accomplishments or controversies), or just the list is enough?

This is why I continue to use ChatGPT* for a wide variety of genuinely useful functions, within reason and with an understanding of its limitations, while all the uninformed luddites rage about how “useless” and even “dangerous” it is.

But the OP is really about a different issue: the significant potential for abuse of AI by corporations. I completely agree, but it should be recognized that this has been going on since the dawn of the computer revolution, when corporations took away large amounts of control and autonomy from their customer-facing agents and turned it over to their computers. In a single move they managed to demoralize their employees, deny them any empowerment, and piss off their customers. AI will undoubtedly take this to the next level, but fundamentally corporations simultaneously shitting on both rank-and-file employees and customers is nothing new.

* Well, that’s not really why. That’s a trivial example that’s easily Googled. But the information you can get from extended interactive conversations is entirely at a different level.