Is AI overhyped?

This is the sort of thing AI should be doing.

Instead it seems like they’re trying to replace Hollywood and Google with ignorant AI bots, which is just asinine.

They’re doing many things with AI. You don’t have to like them all. I’ve been using GPT for complex searches lately, and it answers my questions better than the morass of ads and shopping links Google has become. I would not have said this about publicly available LLMs a year ago. They’ve improved that much in that time.

:thinking:

OpenAI Revenue: $3.7B in 2024

It is true that they spent $6 billion in 2024 on computing power, but the company’s revenue is on track to improve from here.

As pointed out, OpenAI is not the best example to look at for profitability. OpenAI and ChatGPT may well turn out to be the first mouse that dies in the mousetrap; it’s the second mouse that gets the cheese.

DeepMind Revenue and Growth Statistics (2024) - SignHouse.

DeepMind’s revenue tripled from £265.5 million in 2019 to £826.2 million in 2020, reaching £889.4 million in 2021.
After several years, DeepMind made its first profit of £43.8 million ($59.6 million) in 2020.
As of 2022, there were 1,567 people working for DeepMind.

Lol, yup.

I don’t have any disagreement with what you’re saying, but the LLM developers (and divisions within larger companies) have in fact been losing a lot of money, and we don’t know how they are eventually going to monetize or what the typical consumer will pay.

Uh, what makes you think we are not paying now?

You know what I mean. I don’t have the expertise that would allow me to argue the details, but things like ChatGPT, Gemini, etc., are not currently priced so as to recoup their costs, and they usually have a free tier as well.

Which is why I did not use them as examples. Remember, we are here because it was declared that “AI as it currently exists is not a cost effective tool to do anything”. (Bold mine)

To that you replied:

“This.”

So, not quite, as pointed out: yes, early adopters are burning money unsustainably, but that really is not “this”.

One has to remember that, as Veritasium pointed out, it took British biochemist John Kendrew 12 years to determine the first protein structure. It was more expensive then, but even now, experimentally confirming a protein’s structure can cost tens of thousands of dollars for a single one. In the last few years DeepMind has resolved more than 200,000 of them.

That is a huge chunk of money saved, and it is leading to great progress and cheaper ways to find new cures for diseases and to treat poisonings.

Darn it,

I meant to say that besides making profits, that was a huge chunk of time and money saved.

I stand corrected. I wanted to validate the spirit of that post, but the statement you cite is not strictly true, and the things you are pointing out are indeed valid. Thank you.

Sam Altman has shared a new roadmap for OpenAI’s offerings. I think the main takeaways are that Orion, which many thought was set to finally yield GPT-5, will now be released as ‘GPT-4.5’, and will be the final ‘non-reasoning’ model; and furthermore, that GPT-5 apparently won’t be a single model, but a ‘system that integrates a lot of our technology, including o3’ (which will not receive an independent release). A lot of people are interpreting this as essentially saying that the pure-scaling approach is dead: Orion hasn’t yielded the expected advancement over previous models to justify calling it GPT-5, prompting a pivot to alternative approaches. A further worry is that going forward, you might not really know what you’re actually getting with any call to ‘GPT-5’, which could impact the trustworthiness of its results.

Maybe people are saying this, but it doesn’t follow. LLMs are basically trained on the entire corpus of written human information (and then some). They do some very remarkable things with this, but also have some obvious limitations, and clearly aren’t replicating what a smart human could do with the same access.

The clearest limitation is just that (until recently) they couldn’t “think” for an extended period of time. There is a fixed amount of processing that goes into each output token and no more. So any type of reasoning that inherently needs more than that cannot be expressed. There’s no “mull on this for 8 hours and give me a yes or no answer.”

Hence why we’ve seen for some time that tricks like “write out your thoughts step by step” actually improve the output: they allow the LLM to spend more computation on the answer. And recently we’ve seen “reasoning” models, which are really just the same trick, but with a model that’s a little more fine-tuned for the purpose.
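To give a concrete (and purely illustrative) sense of the trick, here’s a minimal sketch assuming the OpenAI Python SDK; the model name and prompts are placeholders, not a claim about how any vendor actually builds its reasoning models:

```python
# Minimal sketch: the same question asked directly vs. with an explicit
# "write out your reasoning" instruction. The second prompt lets the model
# emit intermediate tokens, i.e. spend more computation, before answering.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

QUESTION = "A train leaves at 9:40 and arrives at 13:05. How long is the trip?"

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

direct = ask(QUESTION + " Answer with just the duration.")
stepwise = ask(QUESTION + " Write out your reasoning step by step, then give the duration.")

print("Direct:", direct)
print("Step-by-step:", stepwise)
```

The second prompt typically produces many more tokens before the final answer, and that extra output is exactly the extra compute being spent.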

So there is a move to do more computation at inference time, but that’s no surprise at all; if anything, the surprise is that we could get as far as we did without that. There is some argument as to the best way forward–some say the models should be trained to reason in “latent space”; that is, without having to go back to tokens. Maybe true, maybe not–we’ll see. But regardless, we’ll see models spend more compute on inference.

None of this IMO is equivalent to saying “pure-scaling is dead”, unless they mean just scaling the training set. In terms of human-generated text input, that’s probably true. But there are many other areas to scale, including areas that just haven’t been scaled at all. They’ll all benefit from more compute and more data; it just won’t be quite as easy as increasing the parameter counts and throwing more text at them.

Sure, but the argument is that if they could’ve gotten to something worthy of the name GPT-5 by mere scaling (of training data, parameters, and compute), they presumably just would’ve done so; that they haven’t, and are now essentially abandoning that approach, does at least give evidence that things aren’t progressing as anticipated. Add to that the widespread reporting of diminishing returns since last November, and it seems to be at least a reasonable extrapolation.

I’m sure they always knew that scaling the data and parameters would run out of steam at some point–all technologies hit a point of diminishing returns eventually. Maybe OpenAI hit that point a bit earlier than they’d hoped, but it’s not like GPT-5 was ever promised to be just a larger version of GPT-4. It was always expected that they’d incorporate new features like a reasoning model. And all of those new things are at the beginning of their scaling curves.

Obviously I don’t know who knew what when, but at least in their public-facing communications, they’ve been quite explicitly preaching the gospel of scale, with Altman tweeting ‘there is no wall’ in response to the November reports, and stipulating that the scaling laws are ‘decided by God’.

Well, you can read pretty much whatever you like into a post that short. I would read it as “there is no wall [because there are half a dozen things that we already know will unlock further gains and we haven’t even begun scaling those]”. But it’s impossible to say for sure.

Recent research by the BBC shows “AI” is incapable of summarising news.
Which raises the question: can it do anything if the results are looked at critically by experts?

I do think AI is an excellent tool to suss out dumb examinations: if AI scores a passing grade on your test, your test sucks.

In the early 90s, I worked in the translation industry as well, although in sales and coordination. I did some rewriting of English translations done by Japanese translators.

There were few skilled native English translators then and many really bad ones. I’m very familiar with the frustrations and problems associated with terrible translators.

I’m in a couple of social media groups that have automatic translation, and good god, the translations are terrible. Much worse than Google Translate.

However, the nice thing about automatic translation is that it can give a general idea of the discussion. Not everything needs to be perfectly translated.

Part of my wife’s job involves translating contracts and ChatGPT does a great job. It’s so much faster making a few corrections rather than having to generate everything herself.
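Purely as an illustration of that draft-then-correct workflow (not the actual setup described above), a minimal sketch assuming the OpenAI Python SDK might look like this; the model name, prompt, and sample clause are placeholders:

```python
# Sketch of a draft-then-review translation pass: the model produces a first
# draft, and a human translator corrects it before anything is delivered.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def draft_translation(source_text: str, source_lang: str = "Japanese",
                      target_lang: str = "English") -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{
            "role": "user",
            "content": (
                f"Translate the following {source_lang} contract clause into "
                f"{target_lang}. Preserve defined terms and omit nothing:\n\n"
                + source_text
            ),
        }],
    )
    return resp.choices[0].message.content

draft = draft_translation("甲は乙に対し、本契約に基づき製品を供給する。")
print(draft)  # a human reviews and corrects this draft before it is used
```

The point is that the human stays in the loop; the model only supplies a first draft to be corrected.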

A lot of “AI can’t do nothing” criticism feels like the LLM equivalent of someone saying that AI image generation is all laughably crude based on the first pictures people were posting from Night Cafe a couple years ago or the garbage posted to Facebook.

I’ve mentioned before that part of my wife’s job is translation and she has also incorporated AI into that portion and finds it to be a real help and time saver. She’s been interpreting for decades on the corporate, medical and legal level so I trust her judgment on how useful it can be in that sphere.

Thanks for the interesting discussion, guys! I’m breaking this out into a separate point, as I have a question for you. Or for your wives, if you care to ask:

I have experience looking at machine translations, as well as correcting the translations of humans who were basically competent but were making mistakes. The problem I have is that, if I can’t trust a translation 100%, then I have to read the whole thing myself. And if I’m doing that, then the difference between “reading and correcting” and “translating from scratch” is very little. That’s why I haven’t found machine translation to be useful in my work, but I could be behind the times.

In the case of a contract or any document of importance, I would be scared that there would be something wrong beyond inept turns of phrase. But maybe your wives have their own routine to deal with that.

Another thing in my case is that I typically have been doing ad copy-level translations, which require output with a specific tone, cadence, etc. It can’t just be “correct.” A video script needs to sound like a video script, a press release needs to sound like a press release. And it even needs to have a kind of generic personal style to it without sounding too generic or too personal, if that makes sense. A certain kind of intention needs to come through in the writing: we really want to sell you this, convey this to you.

AI is bad at all of this. In fact, whereas a year ago I could be fooled rather easily by, say, AI-written scripts on YouTube, now I recognize them instantly. It’s as if everything has the same cadence and habits of style. (Needless to say, the AI narration is now also painfully obvious. If I hear AI narration on a video, I block the channel entirely.) By the way, I think the same thing is probably becoming apparent in AI art and video. Yes, it’s amazing what can be done. But it all has a same-y-ness to it, a certain cloying and annoying quality that you can’t unsee once you start to see it. My kid is a genuinely talented artist (dad brag) who has a love-hate relationship with AI art, leaning more into the hate.

There are other big reasons why a thinking and understanding human is much preferable to AI in many cases. I think people tend to assume that the texts we translators are given are written at a professional level and free of mistakes. I don’t know about other countries/languages, but in the case of Japan, I would say maybe 10% of what I have been given to translate over the years, even by big companies, has been at a truly professional level, maybe 50% is by someone in that company who is moderately but not completely competent (in the case of automotive, imagine engineers trying to write, etc.), and 40% is something less than that, all the way down to complete garbage.

Both AI and 99% of native Japanese translators translating into English have the same algorithm: translate literally, word for word. Maybe a human Japanese translator will attempt to correct an objective error or ask an absolutely needed question, but, on the whole, they do not consider the quality of the text their problem. Literal translation = CYA. (The same thing goes for Japanese interpreters, unfortunately. They don’t get “involved.” I see myself as a value-adder to the conversation and will ask follow-up questions, remind people of things they might have forgotten, etc. etc. All of which needs to be done so as not to become the center of attention myself, etc. etc.)

In contrast, my philosophy has always been to fight the text. Garbage in, perfection out. We would put a ton of notes in our translations (I and my teammates in Japan, who checked my work, which was also important and value-adding, and who interfaced with the client) and ask questions as needed. AI is not capable of this and probably won’t be for a long time.

Thank you for indulging my dilation upon this topic!