I was not familiar, no.
I did generate this image:
Bing doesn’t even work for me right now.
I recently got this image in one of my prompts that generates a wide variety of weird stuff. It seems like the face was influenced by Beetlejuice.
I ran it through Clipdrop Uncrop (which works without a prompt), and the best result was close to this. (I had to tweak it slightly with infill at Playground because the arm on the right merged into the body.)
I then decided to play more with the head. I masked away the body and prompted “demon man in forest”. I liked this one because it looks like he has a severe burn, which seems appropriate.
I then ran this result back through Clipdrop Uncrop. I’ve hidden it even though he is the beast with the least.
Based on the Taylor Swift thread, I asked SD 1.5 for “Taylor Swift listening party” and got this not-at-all unsettling event. You DO like Taylor Swift, don’t you?
When it comes to the failures of AI, I wonder how much comes from limited or poorly understood training sets and how much comes from limited or poorly understood prompts. I like thinking up “stress tests” that are unlikely to be closely matched in training images. Recently I tried “A photo of an anthropomorphic sandwich riding a catfish in a mountain stream. The catfish is green with yellow polkadots.”
Dall-E 3 handles it pretty well:
SDXL, not so much:
So I try again with a regular fish, no cat:
Still missing critical details. I wondered if SDXL even understood “anthropomorphic” at all, so I went with a minimal prompt of “anthropomorphic sandwich”:
So SDXL understands it to some extent, but differently from DE3.
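For anyone who wants to poke at these prompts themselves, the SDXL side of the test looks roughly like this through the diffusers library. Just a sketch: I'm assuming the base SDXL 1.0 checkpoint, a CUDA GPU, and default settings, which won't exactly match whatever web UI you're using.

    # Minimal sketch of running the stress-test prompt against SDXL locally.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
    ).to("cuda")

    prompt = ("A photo of an anthropomorphic sandwich riding a catfish in a "
              "mountain stream. The catfish is green with yellow polkadots.")
    image = pipe(prompt, num_inference_steps=30).images[0]
    image.save("sandwich_catfish.png")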
Before I tried sandwiches and catfish, I tried putting an anthropomorphic fingerprint on a Komodo Dragon:
The fingerprint tended to inherit the color of the mount, so I tried for a monochromatic fingerprint, hoping everything else would be in color, but it didn’t work out that way.
Still, all impressively close to the details of unusual prompts. I wonder how much more could be squeezed out of Stable Diffusion, and even out of earlier AIs like Disco Diffusion, if GPT-4 could be welded to the front end?
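Just to sketch what that welding might look like (purely hypothetical: the model name and system instruction are my own guesses, and the expanded prompt would then be handed to whatever Stable Diffusion pipeline you're running, like the one in the earlier snippet):

    # Hypothetical front-end: ask GPT-4 to expand a terse idea into a
    # detailed image prompt before it ever reaches the diffusion model.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def expand_prompt(short_prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system",
                 "content": "Rewrite the user's idea as a single detailed, "
                            "concrete image-generation prompt."},
                {"role": "user", "content": short_prompt},
            ],
        )
        return resp.choices[0].message.content

    detailed = expand_prompt("anthropomorphic sandwich riding a catfish")
    print(detailed)  # this is what would get fed to the diffusion pipeline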
That looks about like an accurate depiction of suburban teenage girls. They’re a little more dressed up (by their standards) than they usually would be at school, but that’s probably to be expected.
A new version of Runway is out. So far my free trials have been…not spectacular. Here’s a cat attempting to catch a rhinoceros beetle on a kitchen table.
Some have been more successful, if not accurate to the prompt. I got the redhead, but no ice armor, and it isn’t especially evident that it’s the moon.
(For me it says “video can’t play because file is corrupt” for everything, but the videos are fine if you download them.)
The case against Stable Diffusion, Midjourney and DeviantArt was largely thrown out.
The primary issue is that the three plaintiffs never copyrighted most of their works. Two had filed no copyrights at all; the other had filed copyrights on only 16 of her works. The judge said the case on those 16 could still potentially move forward. However, the judge seemed unconvinced that the AI is actually committing any sort of copyright infringement in generating new works:
“The other problem for plaintiffs is that it is simply not plausible that every Training Image used to train Stable Diffusion was copyrighted (as opposed to copyrightable), or that all DeviantArt users’ Output Images rely upon (theoretically) copyrighted Training Images, and therefore all Output images are derivative images.
Even if that clarity is provided and even if plaintiffs narrow their allegations to limit them to Output Images that draw upon Training Images based upon copyrighted images, I am not convinced that copyright claims based [on] a derivative theory can survive absent ‘substantial similarity’ type allegations. The cases plaintiffs rely on appear to recognize that the alleged infringer’s derivative work must still bear some similarity to the original work or contain the protected elements of the original work.”
Okay, this is an image that I generated in Bing/Dall-E 3 and outpainted in Stable Diffusion. I used that image as an input for Runway ML Gen 2 and ran it for 8 seconds (no prompt, image only). I then sped it up by 4x and did the reverse section in an off-line video editor. I converted the video to an animated gif (because I haven’t bothered setting up a new Youtube channel after losing the old one).
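If you’d rather script the post-processing than fire up an editor, something like this does the same speed-up/reverse/gif conversion with moviepy. Just a sketch: the filenames are placeholders, and I actually did it in the editor.

    # Rough scripted equivalent of the editing steps, using moviepy 1.x.
    from moviepy.editor import VideoFileClip, concatenate_videoclips, vfx

    clip = VideoFileClip("runway_gen2_output.mp4")          # placeholder name
    fast = clip.fx(vfx.speedx, 4)                           # 4x speed-up
    loop = concatenate_videoclips([fast, fast.fx(vfx.time_mirror)])  # forward + reversed
    loop.write_gif("output.gif", fps=12)                    # placeholder name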
This is pretty impressive. Especially how it accurately picks out both Yoda in the foreground and the creature in the background (supposed to be a rancor) as active objects.
(ETA: hid the gif because it could get obnoxious infinitely repeating.)
Midjourney has a new feature which is pretty cool and hard to explain. It’s a “style builder”: you enter a prompt and it gives you 32, 64, or 128 images (depending on how much time credit you want to burn). Then you go through and select favorites, either in A/B pairs or A/B grids of four (your choice). At the end, the system gives you a style code which you can append to your future prompts to influence the results.
You can go back to your page of 32/64/128 images as often as you want, pick new favorites to tweak the style, and get a new code. Or share the page so other people can use it and get style codes.
This is an example of the selection screen based on my earlier prompt: Lenticular photograph of a woman in a flowering garden, Forced depth perception, Aerochrome image, 8k UHD, saturated colors, separated colors, separate layers of depth of field, by Miles Aldridge and Mary Quant and Franco Fontana
It then gives me a style code based on my picks (2zzJiByCVoG63JbtEZMHWYh7G in this case) that changes this: Brightly colored Photograph of angelic rapture
…to this, when the code is appended…
Or this for “Retro Housewife in a Nebula”
Another news tidbit: Stability is releasing a 3D model generator.
This is another clear example of ChatGPT adding things to prompts behind the scenes. I did four runs of “photo of an android dreaming of a robot sheep” before getting the image closest to what I wanted (top left). As you can see, in one run ChatGPT told Dall-E to use a thought bubble and a park bench, in one a grassy hill and The Milky Way, in one a copy of Do Androids Dream of Electric Sheep? and a glass of water (with nice caustics), and in one, VR goggles and roboboobies.
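The same rewriting happens if you go through the OpenAI API, where it isn’t hidden at all: the dall-e-3 response includes a revised_prompt field showing what was actually sent to the model. Minimal sketch, assuming the standard OpenAI Python client:

    # Generate with DALL-E 3 and inspect the behind-the-scenes rewrite.
    from openai import OpenAI

    client = OpenAI()
    result = client.images.generate(
        model="dall-e-3",
        prompt="photo of an android dreaming of a robot sheep",
        size="1024x1024",
        n=1,
    )
    print(result.data[0].revised_prompt)  # the rewritten prompt actually used
    print(result.data[0].url)             # link to the generated image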
I feel like I just uploaded my DNA/Identity to a shadowy conglomerate which will soon rule the world, but I fed Night Cafe some pictures of myself to train a MyFace model. I won’t post all the hilarious results I’m coming up with so far (if hilarious is seeing my own face, slightly skewed) but I will post proof positive that Travis Kelce is a beard.
Works pretty well.
Looks like Frank Zappa is verboten again on Bing.