AI image generation is getting crazy good

Playing around with Flux Krea.

Imgur

Imgur

Imgur

Imgur

Imgur

Imgur

Summary

A vast alien canyon stretching into the horizon, towering crystalline mountains with jagged peaks glowing faintly, a wide river of shimmering light carving through the valley floor, a futuristic city of luminous spires and crystalline towers built along the cliffs, floating bridges connecting the structures, flying creatures soaring through the skies, flocks circling above the glowing towers, a small boat drifting along the radiant river far below, ethereal clouds drifting between the cliffs, cinematic establishing shot, sweeping epic perspective, golden twilight casting long shadows, mystical energy shimmering in the atmosphere, a surreal fusion of science fiction and heroic fantasy

Summary

Waterfall, lush forest, night and day, miniature crescent sun, morning sky, surrounding clouds, mysterious glow, perfectly round sun, sunlight.

Summary

A futuristic cityscape. The city is made up of various buildings and structures, all of which appear to be made of stone or concrete. The buildings are of different sizes and shapes, with some having intricate designs and patterns. Some of the buildings have large domes or domes on top, while others have smaller ones. The sky is a pale blue with a hint of orange and yellow, and there are several hot air balloons floating above the city. The overall mood of the image is surreal and dreamlike.

Summary

a vast plateau rises above an ocean of fog, jagged cliffs piercing through layers of silver mist BREAK distant monoliths shimmer faintly, half-buried in moss, their surfaces etched with forgotten runes BREAK background: pale sun rays scatter through drifting clouds, painting the scene in muted golds and grays BREAK mood: timeless isolation, beauty of silence, a world held between sky and abyss.

Summary

a colossal labyrinthine metropolis clinging to a sheer cliff face, glowing with warm amber light from windows and lanterns, blending medieval timber-framed houses with gritty steampunk elements like exposed gears, riveted pipes, and spiraling stairways threading through misty streets, centered around a thunderous waterfall cascading from the cliff’s heart and echoing through narrow alleys and bridges, while in the foreground a modest wooden skiff with two cloaked travelers glides through dark, rock-strewn waters under an overcast sky that diffuses faint sunlight through fractured clouds, casting a soft otherworldly glow over earthy browns, weathered grays, and cool blues in a richly cinematic, ultra-detailed atmosphere

Summary

a floating woman in a long dress, with long wavy hair, made of smoke, translucent, silhouette passionately playing the violin, fantasy, ethereal, emerging from wisps of smoke, close-up portrait, dark background, best quality, captivating, rich in detail, maximalist style, superb composition, highest aesthetic, whimsical, fantastical, splash art, rich in detail, hyper-detailed, maximalist style, conceptual art, sharp focus, harmony, serenity, tranquility, mysterious glow, superb composition, sharp focus, high contrast, stylized, clear, ultra quality, award-winning best quality, a masterpiece, made of smoke

Interesting. What it’s doing, it’s doing well, but it seems to just completely ignore any elements of the prompt it finds inconvenient.

Some of them feel contradictory, or at least very hard to reconcile. Take the city as an example: a city that clings to a cliffside and has misty winding streets and alleyways, but is also seen with a recognizable boat and figures in the foreground, just doesn’t really seem to work at the scale you’re rendering at.

I made this with SDXL just for funsies but, again, it’s hard to take an angle like this and add in misty streets and alleyways.

Midjourney did a good job (IMO) of conveying scale and mood but no “Glue some gears on it and call it Steampunk” decorations.

Flux Krea in a more realistic style, with simple plain language prompts:

Imgur

Summary

a grizzled old new england fisherman, standing on the deck of his old fishing boat in his old worn peacoat and navy cap, pipe in his mouth, smoking. the boat’s white paint is peeling but overall the vessel is well cared for. the sky is gray and overcast. the sea is churning, wind swirling the fog.

Imgur

Summary

Eerie scary scene of a female figure. Pale deathly skin. Black and white. Wearing a metal half-mask covering her mouth. Wearing revealing robes. Cleavage. Pale eyes. Soft features. Heavy gothic makeup. Emerging from fog, mist, shadows.

Imgur

Summary

a basset hound puppy chewing on a rawhide bone on the carpet.

More than Flux Krea, though. So far as I can tell, the only steampunk in its entire image is the wheel hanging from the bridge.

The three prompts run through Copilot and Sora

Copilot halfway did the cleavage one then rejected it.

Summary

A grizzled old New England fisherman, is standing on the deck of his old fishing boat in his old worn peacoat and navy cap, pipe in his mouth, smoking. A basset hound puppy is chewing on a rawhide bone on the deck at his feet. The boat’s white paint is peeling but overall the vessel is well cared for. The sky is gray and overcast. The sea is churning, wind swirling the fog. Emerging from fog, mist, shadows, an eerie scary scene of a female figure. Pale deathly skin, dressed in black. Wearing a metal half-mask covering her mouth. Wearing revealing robes. Cleavage. Pale eyes. Soft features. Heavy gothic makeup.

Playing around with cliffside towns some more. Customized SDXL with a bit of selective inpainting.

Fun with depth perception (SDXL, various models)

A gallery of Copilot/Sora conversions of photos of objects that I’ve found online

A sample

Your album link didn’t work for me. I got a “not found” error and was redirected to the main imgur page. Did you make your album public? I don’t have an imgur login.

Cat’s eye view

B&W, extreme foreshortening, shallow DoF.

No, not public. I thought it would work with a direct link even if it wasn’t listed (like with YouTube). I stopped making things public on Imgur very early on because the culture there is extremely toxic, sort of like the comments section on YouTube with the addition of a zero-sum game of upvoting and downvoting as people try to push their own images to the top. I’m only interested in having an image host for here, not in commentary from adolescent trolls. I posted the gallery because I didn’t want to flood the thread with 11 images in a row. (Well, 11 images in a column.) Here they are individually:

Summary

Imgur: The magic of the Internet

Imgur: The magic of the Internet

Imgur: The magic of the Internet

https://i.postimg.cc/PfFtL2MJ/Pink-moonface-grid.jpg

Imgur: The magic of the Internet

Imgur: The magic of the Internet

Imgur: The magic of the Internet

Imgur: The magic of the Internet

Imgur: The magic of the Internet

Imgur: The magic of the Internet

Imgur: The magic of the Internet

(Imgur apparently keeps deleting that 4th file that I now have at a different host.)

I might have used the wrong term. What I meant was: is there a way you can set the permissions on the album so that anyone who has the link, like the one you posted here, can click on it and see all the pictures there? I’m guessing that would be more convenient for you when you’ve got a bunch to share at once, rather than linking them individually here.

There is only one way of making an album/gallery “public” on Imgur, and it turns it into a visible post on the Imgur forums. It is exactly like (and exactly is) going to a different forum where image posts are allowed, making a new thread of your images there, then posting a link to it here. But the people there are going to comment on it too, neither knowing nor caring that you actually intended it for people on a completely different forum.

Ah, yeah, I wouldn’t want that either. Thanks for explaining.

In other news, Grok recently added more options for free users to control what’s in the videos it makes from the source images you upload. Or maybe they were there all along, but a bug kept me from seeing them. I’ve noticed other issues cleared up recently too.

ETA: This free place I uploaded the video to will expire it in a month. But if I share it on Grok, I think you need your own login there to see it.

I had never explored Grok before because I thought I would have to do it through Twitter. But now I’ve downloaded the app and am experimenting with it, so far doing only image-to-video. It automatically begins generating a video as soon as you click on an image, and the custom prompt is a menu item that then produces a second video, which is pretty inefficient. I don’t see an option to hold off generating a clip until I say so. So far it hasn’t been very good at understanding prompts for actions, though. It is mostly pretty good at creating coherent, plausible movement, but it progressively loses image detail faster than some of the other image-to-video options.

Here’s my first try:

And a later try involving Batman’s arch-nemeses the Joker, the Riddler, the Puzzler, and the Danny Rebus. (For some reason the bad AI audio didn’t download on that one.)

It did a good job on this with no prompt (other than some extra jar noise at the end)

But not a good job of following the instruction that she drop a jar and break it

With and without a prompt, it made the ball stick to her face instead of bouncing off.

I’m using Grok via its web interface.

Yeah, that happens in the web interface too. It could end up wasting your limited daily generations. I found something of a workaround for that, though, with a bit of planning. After you’ve used up your cap for the day, any further attempts won’t generate, but they do go into your favorites. When your quota replenishes, you can go in via the imagine tab, then to favorites (the pics next to the entry field at the bottom), and use the option menu directly to choose the type you want. I don’t know if that workaround is available in the app. I’m only using the web interface.

Up until yesterday, they were not showing me the options to do voice or custom prompt; my only choices were normal, spicy, and fun. I assumed that was a limitation on free users, but now I’m not so sure. I was definitely annoyed at the lack of prompting, but I had a flash of inspiration: What if I added a caption to my original image, explaining what I wanted to happen? That worked amazingly often.
Even now that they’ve activated the prompt option for me (was it there for everybody, even free users, before then? I saw hints in other online discussions that it was), the captions I put in seem to work slightly better than the prompt. And the two can work together. I varied how I did the caption: sometimes it was overlay text, sometimes I painted the directions on a sign appearing within the scene itself. Suboptimal if you want there to be any surprises in the video.
I just got the idea that maybe I should try writing a caption in a space outside the original frame of the source pic, using ImageMagick, then after the video gets generated, find some crop tool to get rid of the caption. I haven’t tried that yet.
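Roughly what I have in mind, as an untested sketch in Python that only builds the two command lines. Everything specific here is my own assumption: the 80-pixel strip, the font size, the file names, ImageMagick 7's `magick` binary (older installs use `convert`), and ffmpeg as the "crop tool". I also have no idea whether Grok keeps the frame size or aspect ratio, so the crop is expressed as a fraction of the video height rather than a fixed pixel count.

```python
STRIP_PX = 80  # assumed height of the caption strip added below the picture


def caption_command(src, dst, caption, strip_px=STRIP_PX):
    """Build an ImageMagick command that extends the canvas downward
    and writes the caption into the new strip."""
    return [
        "magick", src,
        "-gravity", "south",
        "-background", "white",
        "-splice", f"0x{strip_px}",     # add blank rows at the bottom edge
        "-fill", "black",
        "-pointsize", "28",
        "-annotate", "+0+20", caption,  # text placed near the bottom, in the strip
        dst,
    ]


def crop_command(src, dst, image_height, strip_px=STRIP_PX):
    """Build an ffmpeg command that trims the caption strip off the clip.

    image_height is the height of the ORIGINAL picture, before the strip.
    The kept height is rounded down to an even number, since some
    encoders reject odd dimensions.
    """
    keep = image_height / (image_height + strip_px)
    video_filter = f"crop=iw:trunc(ih*{keep:.4f}/2)*2:0:0"
    return ["ffmpeg", "-i", src, "-vf", video_filter, dst]
```

To actually run either one, pass the list to `subprocess.run(cmd, check=True)`, for example `caption_command("source.png", "captioned.png", "She drops the jar and it breaks")` before uploading and `crop_command("clip.mp4", "clip_cropped.mp4", image_height=720)` on the downloaded clip. If the generator rescales or pads the frame in some other way, the crop fraction would need adjusting by eye.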

My options on the app are custom, speech, fun, and normal.