I have noticed that min-dalle is better with “likely” images than “unlikely” ones. For example, it does a pretty good job (ignoring the usual mangling) with the classic astronaut riding a horse, but astronaut riding a cat makes cat astronauts and astronaut riding a mantis makes a mess. Horses are things you expect to be ridden; cats and mantises are not. (Astronaut riding a camel works, but it puts it in the desert, not space.) For another example, cat wearing a helmet works but cat wearing a colander confuses it (and of course hats are worn and colanders aren’t, unless you are a Pastafarian). Jesus eating a cheeseburger? It makes a good try. Jesus eating a Chihuahua? Confusion.
Show up in what feed?
I’ve been trying to find a way to get min-dalle to compose an image for widescreen with letterboxing. It is easy to get 4:3 letterboxing: just ask for widescreen, HDTV, 16:9, sixteen-by-nine, 1.85:1, or anamorphic widescreen, and they all get matted to 4:3. But no widescreen. I did find this interesting article, though:
In Night Cafe it is actually, literally named My Feed. It’s where all the people I follow show up.
So I got into the (actual primary) DALL-E beta and burned through my 50 freebie credits, mostly making images based on D&D campaigns for my friends. Just like the other lesser generators, it sometimes fixates weirdly on a word in your prompt and outputs nonsense. Out of 50 prompts, I probably had a dozen outputs I’d categorize as “unqualified successes.”
But when it works, holy shit does it work well. And it’s fast.
The DALL-E prompt book linked upthread was a huge help in getting started, and there are lots of other resources online to help generating useful verbiage.
After your free uses, it’s fifteen bucks per 115 prompts, and they’re definitely going to be getting some of my money. I just need to be super careful - it feels exactly like online gambling. You click a button, wait a few seconds, and maybe get a fabulous digital prize. I’m sure there are already people in the beta who have dumped hundreds of dollars into it.
I don’t think there is a way to get your published pictures in your feed since (as you mentioned) that’s where the pictures of those you follow show up and you can’t follow yourself.
If you just want all your published pics in one place, you can click in the upper right and click ‘view profile’. Your profile has all the published pics. You can even sort by most recent or most liked.
Thanks, I was afraid of that.
You should use the lesser DALL-Es to test out your prompts first. They seem to understand the same prompts in the guide, and should give you an idea of whether one will work in the real deal.
Also, before spending money you might want to see if Night Cafe’s upcoming edition is DALL-E 2 and if it gets 5 free credits per day.
Still playing with min-dalle. I tried “pet rock” and it indeed generated some rocks that had bits of “paint” on them. I wanted them to be more elaborate so I tried “painted pet rock”. That was more detailed, but after looking at a number of images I realized that there was a theme—it was making paintings of pets. So “painted x rock” is a prompt. I have tried a number of variations, here’s a sample:
It works for me because you expect hand-painted rocks to be a bit sloppy and goofy.
I tried this and it wasn’t super helpful for the kind of testing I wanted. No big deal, though, it’s still loads of fun to play with.
I’m onto a new suite of pieces (I’m an artist, man, I talk like that now).
The Alphabet Suite
I’m putting in six lines of modifiers, one each for these categories: Artists; Art Movement & Style; Mediums & Techniques; Photography; Design Tools & Communities; Descriptive Terms.
Then the piece gets modifiers that begin with A for each category. The next piece will get modifiers that begin with B. Et cetera. I’ll come up with some sort of alliterative title for each and see what I get. I’m dedicating 5 credits to each piece - a 2X art run (thumbnail) and a 3X-accuracy coherent run.
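For anyone who wants to try the same scheme, here is a rough Python sketch of how the six modifier lines for a given letter could be assembled. The six category names are the ones listed above; everything else (the tiny modifier pool, the function names, the sample title) is a placeholder I made up for illustration, not anything built into the app.

```python
# Categories as listed in the post, in order: one modifier line per category.
CATEGORIES = [
    "Artists",
    "Art Movement & Style",
    "Mediums & Techniques",
    "Photography",
    "Design Tools & Communities",
    "Descriptive Terms",
]

# Hypothetical modifier pool, only here to make the sketch runnable.
# A real pool would be much longer and hand-picked.
MODIFIERS = {
    "Artists": ["Ansel Adams", "Banksy"],
    "Art Movement & Style": ["Art Deco", "Baroque"],
    "Mediums & Techniques": ["Acrylic", "Batik"],
    "Photography": ["Aerial view", "Bokeh"],
    "Design Tools & Communities": ["ArtStation", "Blender"],
    "Descriptive Terms": ["Abstract", "Bold"],
}


def modifiers_for(letter, pool):
    """Return six lines, one per category, holding the modifiers
    from the pool that start with the given letter."""
    lines = []
    for category in CATEGORIES:
        matches = [
            m for m in pool.get(category, [])
            if m.upper().startswith(letter.upper())
        ]
        lines.append(", ".join(matches))
    return lines


def build_prompt(title, letter, pool):
    """Join the piece title and its six modifier lines into one text block."""
    return "\n".join([title] + modifiers_for(letter, pool))


if __name__ == "__main__":
    # "Alliterative A" is a made-up title for the example.
    print(build_prompt("Alliterative A", "A", MODIFIERS))
```

A category with no matching modifiers just yields an empty line, which is roughly what happened with the numbers piece having no artists.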
My first piece was for the numbers, so no actual artists. I decided to name my piece the first 500 or so digits of Pi. The art run went fine, but I broke the app on the coherent run a couple of times before I got it to run after lopping off 450 digits or so.
First results:
… and a new wrinkle. Second run I’m doing only a 2X accuracy run so I can save one last credit to fold it in half - I’m getting a lot of interesting things with the symmetrical modifier.
I’m going to be buying a pile of DALL-E 2 credits for some personal projects. If anybody would like to request a prompt or two, feel free to put it here or message me. Each unique input is cheap but not free, so I’m not going to go wild with it. But I’m happy to turn some Doper thoughts into pictures, especially if it’s going to be in some way useful for a project of your own.
The Dallery Gallery resources are very very useful.
Sometimes I wonder what “ideas” the AIs come up with. For instance, I recently tried “cyclopia” with min-dalle, curious if it had any conception of the nightmare-inducing fetal development disorder. Across four 25-image runs, it gave me…this.
Obviously it is pulling something out of there that it thinks means a type of circular pattern. And it isn’t just what it creates for a generic unknown word, because, for instance “dlyoptdobl” is apparently 48% likely to be the name of a molecule, 12% likely to be an animal, and 4% likely to be porn.
I wish Night Cafe would go ahead and publish that new algorithm already. I’m storing up credits in case it is a good one.
I’d gotten distracted by other things and haven’t been keeping up on the AI art scene, but I recently joined the MidJourney beta on Discord. It’s approximately a bajillion times better at faces than Nightcafe but can’t really pair eyes well unless it’s a straight-on portrait. The first couple are me creating D&D characters; the second two are from previously “failed” Nightcafe prompts:
Attractive Hobbit Woman in Leather Armor Wearing a Purple and Yellow Hat
Elderly Renaissance-Era Professor, Clean-shaven, Casting a magic spell from a book
Cyberpunk Elf Woman in 1900s Gibson Girl Style
Zombie Girl in the Style of Henri de Toulouse-Lautrec, Belle Epoque, Colored Pencil
Impressive as those are (even the gimpy-eyed ones would be salvageable with a little time and Photoshop finesse), I was less impressed with its ability to blend terms. I think Nightcafe is still better at your “Cat racecar driver firing a marshmallow machinegun” style prompts in taking a solid crack at what you asked. And MidJourney failed to give me a good Eldritch Cheerleader Carwash.
The diffusion repo on github does not seem to have any kind of improved parser, but on any of the web sites you can mess with the prompt; e.g.
Eldritch Cheerleader Carwash, oil painting, by H.R. Giger
At least the third one is holding some kind of hose or cleaning implement or something… or maybe that is just a baton.
There is a new algorithm (with the opportunity for some free credits as well). I don’t really get what it does. You get one line for modifiers - you can’t use a start image - you can output 1, 4, 9 or 16 images. What pops out looks like a photo search result if you use normal modifiers.
But if you try to confuse it, it still comes up with something:
More experimentation needed.