The new ChatGPT/Sora combo really is game-changing
Where are you seeing that the new image engine inside ChatGPT-4o is called Sora?
I was mistaken. I thought ChatGPT was linking to Sora because it was first discussed in the Sora thread. It actually seems to go in the opposite direction, though: Sora now taps into ChatGPT for image generation.
It is strange, though: ChatGPT talks as if it is tapping into a completely different system to generate images. As I quoted it saying above:
Looks like the image generation tool is still having issues on the backend—it’s throwing a server error when trying to create the comic. Totally on the system’s side, not your end.
It will (very) often attempt to run an image and, after a couple of minutes, come back with an apology that it isn’t allowed to make that type of image, as if it is accessing something external to itself.
Here’s a hat I’d like to see become popular:
ChatGPT-4o, natch. Asked for the larger text and sans-serif, it delivered.
What makes you think it’s lying? Every indication I see is that it’s calling something else, similar to how it used to call Dall-E. The major differences are that it now throws your entire context at the new engine instead of generating a paragraph of text to throw at it, and that you can specifically show it whole images too. It groks them much more fully than Dall-E ever could, but it still has limits on how much fine detail it can manage when making a new version. This separate image generator has much higher guardrails. When I argue with ChatGPT that what I asked for should be allowed, it usually agrees with me, but it has no control over the image generator’s censor.
OpenAI says
Because image generation is now native to GPT‑4o
From
I still don’t know what I asked for that was wrong in this fun conversation. (It spent a minute or two trying to make each image.)
I think this is indicating ChatGPT is throwing the entire context at its image generator. The generator’s censor saw what you were trying to do earlier in the conversation and it’s thinking “Nope, it’s now clear you want something bad, so I’m not gonna let you try anything close to that.” The initial censor trip might have been from something in the earlier context, or in your custom instructions or the stored memories (if active) that it didn’t like. You tend to get better results if you start a new branch in the conversation tree without any censor or error results in it.
Cool. My purpose for that prompt was twofold: one, playing with the idea of a multi-panel comic with the actual dialogue generated by ChatGPT (as was the purpose of the ET telling a joke one), and two, seeing if the model had accurate ideas of what a kappa and a tanuki are. (Other AIs do NOT.) ChatGPT got the creatures right. And the text is somewhat sensical. It may or may not be better than what Google Lens translates.
It did fail at ukiyo-e styling, though.
I was wondering if my earlier prompt and its answer had helped or hindered the generator in any way, so I tried the 0-shot version, duplicating your original prompt exactly, including the typo. I don’t have any memories or custom instructions currently active.
I have also tried creating comics where I supplied the text. All three of these failed in some way, such as failing to create three distinctly separate speaking characters.
A four panel comic with three dinosaurs. They are looking up at a huge comet in the sky.
In the first panel, one dinosaur asks “Is that a CV chondrite?”
In the second panel a second panel a second dinosaur says “No, I think it’s a CM.”
In the third panel a third dinosaur says “Don’t be silly it’s clearly—”
In the fourth panel there is a fiery mushroom cloud that has wiped out the dinosaurs.
(The humor is inside baseball—the target audience would get it.)
That session started off with me trying to create a crayfish mermaid like in Jophel’s Midjourney post
Create an image a crayfish woman hybrid mermaid. Like a regular mermaid but with a tail like a crayfish has. Make it as realistic and photographic as you can.
Which it refused.
So I gave up on that and tried to convert that photo of mine of a Joro spider into the style of Junji Ito, which it refused to do (as I posted a few posts back). I then went on to the comic request.
Maybe my spider request was tainted by the bad session, too. I’m going to try it again.
(In the same session I later made the dino comics and the meat asterisk I posted in a different thread.)
(ETA: it still refuses to do the spider photo in a fresh window.)
Yeah, it tends to be terrible at showing the source of word balloons. I tried to do these in response to an earlier post in this thread, but gave it up as a bad job.
Any of these could be easily fixed with a modicum of Photoshop or GIMP skills.
That last one kinda works.
“Are you coming to bed?”
“WHAT?! Someone is WRONG on the internet!”
Hmm, for some reason, a traditional attribute of the tanuki is absent: extremely large testicles.
Historical portrayal of a tanuki:
I’ve noticed that its censor can act differently depending on whether you’re using a source image or not.
Success!
I decided to try the dinosaur comic again, but at the end added:
Each time a different dinosaur speaks. Make sure it is three different, distinct dinosaurs. And make the comic in widescreen.
I like the Uniceratops (Monoceratops?)