O Romeo, Romeo, wherefore art thou Romeo?

I'm describing the process of creating this image here.

The initial prompt was rather simple. It consisted of two parts:

  • what I wanted to see
    full body portrait, old beggar rugged man in rags (pressed against glass, looking at reflection in window, cold, desolate city, looking to the side:1.2)
  • color theme to add character, I used wildcards because I didn't know what exactly I was looking for
    ({charcoal black|white|vibrant {red|orange|yellow|green|blue}|pastel {pink|blue|green|yellow|beige}|dark {brown|red|purple|gray}} theme:1.4)
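To make the wildcard syntax a bit more concrete, here is a rough sketch of how a nested {a|b|c} group can be resolved into a single prompt. This is not the code the wildcard extension actually runs; `expand_wildcards` and `THEME` are names I made up for illustration, and I'm assuming each option is picked uniformly at random.

```python
import random

# The color theme from the prompt above, copied verbatim.
THEME = ("({charcoal black|white|vibrant {red|orange|yellow|green|blue}"
         "|pastel {pink|blue|green|yellow|beige}"
         "|dark {brown|red|purple|gray}} theme:1.4)")

def expand_wildcards(template: str, rng: random.Random) -> str:
    """Resolve every {a|b|c} group, innermost first, by picking one option."""
    while True:
        close = template.find("}")
        if close == -1:
            if "{" in template:
                raise ValueError("unbalanced braces")
            return template
        # The last "{" before the first "}" opens an innermost group.
        start = template.rfind("{", 0, close)
        if start == -1:
            raise ValueError("unbalanced braces")
        choice = rng.choice(template[start + 1:close].split("|"))
        template = template[:start] + choice + template[close + 1:]

# Prints one randomly chosen theme, in the same form as "(vibrant green theme:1.4)".
print(expand_wildcards(THEME, random.Random()))
```

Every generated image gets its own roll, which is how one prompt ended up producing a whole range of color themes to choose from.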

I used my photoreal merge with four loras:

The negative wasn't anything fancy (I didn't even clean up the keywords that made no sense here, like turtle or pubic hair). There are four embeddings in there, but they don't do anything specific.
EasyNegative, negative_hand-neg, (turtle, multiple views, text, title, panties, bangs, blunt bangs, pubic hair, fake animal ears, ribbed sweater:1.3), BadDream, UnrealisticDream

Other settings:
Steps: 36, Sampler: Restart, CFG scale: 6, Size: 512x768, VAE: vae-ft-mse-840000-ema-pruned.vae.pt, Denoising strength: 0.45, Clip skip: 2, Hires upscale: 2, Hires upscaler: ESRGAN_4x
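As a quick sanity check on those numbers, the settings line can be split into key/value pairs and the hires output size worked out from Size and Hires upscale. A minimal sketch; `parse_settings` is a made-up helper, and it assumes no value contains a comma (true for this line, not in general).

```python
def parse_settings(line: str) -> dict[str, str]:
    """Split 'Key: value, Key: value' into a dict (assumes no commas inside values)."""
    return dict(part.split(": ", 1) for part in line.split(", "))

settings = parse_settings(
    "Steps: 36, Sampler: Restart, CFG scale: 6, Size: 512x768, "
    "VAE: vae-ft-mse-840000-ema-pruned.vae.pt, Denoising strength: 0.45, "
    "Clip skip: 2, Hires upscale: 2, Hires upscaler: ESRGAN_4x"
)

# Base resolution times the hires factor gives the size after the hires pass.
width, height = (int(v) for v in settings["Size"].split("x"))
scale = float(settings["Hires upscale"])
hires_size = (round(width * scale), round(height * scale))
print(hires_size)  # (1024, 1536)
```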

Some of the images:

I ended up going with the green one because it had that "life kicking you in the balls" feel. At this point the resolution was 1024×1536px. The next step was to img2img it into an anime girl. I switched to nu element and modified the prompt a bit:
full body anime drawing, cute loli girl wearing a dress with frills (pressed against glass, looking at reflection in window, warm flower field, looking to the side:1.2), (vibrant green theme:1.4)

I also removed add detail and 3dmm loras, adding <lora:urachan1629-ArtStyle-53:0.5> instead. Denoising was set to 0.5.

Left side is the original txt2img. Right side is after img2img and some additional inpainting on the face, which was probably pointless considering I was going to upscale and img2img it multiple times anyway.

Next I manually composited the two images (left side), passed the result through img2img with the photorealistic settings from before to make the reflection feel more like a part of the world, and pasted the anime girl back in, this time fading her more into the background (right side). Denoising was set to 0.5 again.
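The "fading" here is plain opacity blending. The compositing itself was done by hand, so the snippet below is only an illustration of the math for a single pixel; `blend` and the sample colors are made up.

```python
def blend(bg, fg, opacity):
    """Mix a foreground RGB pixel over a background one; opacity runs from 0.0 to 1.0."""
    return tuple(round(f * opacity + b * (1 - opacity)) for b, f in zip(bg, fg))

# Half-transparent foreground over a black background.
print(blend((0, 0, 0), (200, 100, 50), 0.5))  # (100, 50, 25)
```

A lower opacity lets more of the photorealistic window show through the reflection.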

The next step was to upscale the image to 2048×3072px twice, once with the photorealistic settings (top) and once with the anime settings (middle), and then composite the two together (bottom). Before compositing I inpainted the faces at a higher resolution, which was probably pointless again.

The upscaling was done without any tiling (praise the high amount of VRAM on my GPU) at 0.35 denoising, using the UltraSharp upscaler. I'm only showing the faces because the rest of the image is similar to what it was.

Then I repeated the whole process again, this time upscaling to 4096×6144px, which was way too big both for the model and for my poor GPU. I used the UltraSharp upscaler, 0.35 denoising, and a tile size of 1600×1600px. I upscaled the whole image for the photorealistic part, but for the anime one I only upscaled a cropped image to save time. After compositing yet again I was left with this. Left side is from right before the upscales for comparison, right side is after all the upscaling and compositing.
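For a sense of scale, here is a rough count of how many 1600×1600px tiles it takes to cover a 4096×6144px image. The `overlap` parameter is hypothetical (I'm not listing the overlap actually used), so with the default of 0 this is a lower bound rather than the exact number of tiles that were rendered.

```python
import math

def tile_grid(width, height, tile, overlap=0):
    """Return (columns, rows) of square tiles needed to cover the image."""
    step = tile - overlap  # how far each new tile advances
    cols = max(1, math.ceil((width - overlap) / step))
    rows = max(1, math.ceil((height - overlap) / step))
    return cols, rows

cols, rows = tile_grid(4096, 6144, 1600)
print(cols, rows, cols * rows)  # 3 4 12
```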

All that was left was color correction, denoising, sharpening, contrast, etc. Left side is before corrections, right side is after.

Since this was supposed to be gritty and naturalistic, I went ahead and added lens distortions, chromatic aberration, and noise (mostly to the dark areas), and faded the whole thing a bit more. Left side is the previous image, right side is after distortions. Did it make the image better? I don't know. I like the effect, some might not. I might have made it too reddish/purple, but it was 3am and I just wanted it done already. Purple isn't that bad either.
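Adding noise "mostly to the dark areas" boils down to scaling the noise by how dark each pixel is. The sketch below shows that idea on a flat list of grayscale values; it is an assumption about one way to do it, not the exact filter I used, and `add_shadow_noise` and `strength` are made-up names.

```python
import random

def add_shadow_noise(gray, strength, rng):
    """Add Gaussian noise to 0-255 grayscale values, weighted toward dark pixels."""
    out = []
    for v in gray:
        weight = 1 - v / 255  # 1.0 for black, 0.0 for white
        noisy = v + rng.gauss(0, strength) * weight
        out.append(min(255, max(0, round(noisy))))  # clamp to the valid range
    return out

# Dark pixels get jittered the most; pure white stays untouched.
print(add_shadow_noise([0, 64, 128, 255], strength=12, rng=random.Random()))
```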

That's it. Why did you even read this? Go and make some cute anime girls and upload them somewhere so that I can look.