User seeks AI model recommendations for image, audio, and video generation

By PulseAugur Editorial · [1 source] · 2026-06-22 19:36

A Reddit user returning to Stable Diffusion is asking for current model and workflow recommendations for image, audio, and video generation on a GPU with 12GB of VRAM. For portraits, they ask about techniques such as IPAdapter or LoRA training; for adult content, they ask whether Lustify SDXL remains the best option. They also want alternatives to ElevenLabs for text-to-speech with more emotional range, and better image-captioning models for AI agents. Finally, they ask how to get started with AI video generation on their hardware and which inference models to use for general tasks, coding, and roleplaying, naming Hermes and DeepSeek as their current options.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

r/StableDiffusion TIER_2 English(EN) · /u/ReadNFO · 2026-06-22 19:36

Back to SD after a while - Questions about the current image / audio / video meta

Hey guys! I'm back to messing with Stable Diffusion after a while and was wondering what are the current models that you guys recommend for image, video and audio (TTS) generation. I run a 4070 RTX so only 12GB VRAM. I plan to get somethin…
