From 7d9af1692e8a360bf322d01d1b64640e91c5208f Mon Sep 17 00:00:00 2001 From: Norbert Schmidt Date: Mon, 12 Jan 2026 13:19:50 +0100 Subject: [PATCH] Update README.md --- README.md | 100 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 100 insertions(+) diff --git a/README.md b/README.md index c14bb7a..862002a 100644 --- a/README.md +++ b/README.md @@ -97,6 +97,95 @@ freqs_dtype = torch.float32 freqs_dtype = torch.float32 ``` +## prompting guide + +LTX-2 works best with detailed, flowing paragraph prompts rather than comma-separated tags. describe what happens in the video like you're writing a screenplay. + +### prompt structure + +write prompts as flowing paragraphs that include: + +1. **scene setting** - location, time of day, weather +2. **camera work** - shot type, movement, framing +3. **subject action** - what's happening, how it moves +4. **visual style** - lighting, colors, atmosphere +5. **audio cues** - ambient sounds, music mood (LTX-2 generates audio too!) + +### example prompts + +**bad prompt:** +``` +wolf, snow, forest, walking, cinematic +``` + +**good prompt:** +``` +EXT. SNOWY FOREST - DUSK. A cinematic tracking shot follows a lone grey wolf +walking through deep powder snow between towering pine trees. The camera moves +alongside at eye level as soft blue twilight filters through the branches. +The wolf's breath is visible in the cold air, paws crunching softly in the snow. +Atmospheric and moody, shallow depth of field with gentle film grain. +``` + +### cinematography terms that work well + +- **shot types:** wide establishing shot, medium shot, close-up, extreme close-up, overhead shot +- **camera movement:** tracking shot, dolly in/out, pan, crane up, handheld, steadicam +- **framing:** shallow depth of field, rack focus, silhouette, rule of thirds +- **lighting:** golden hour, blue hour, rim light, volumetric light, natural lighting +- **style:** cinematic, documentary style, film grain, anamorphic, photorealistic + +### negative prompts + +always include a negative prompt to avoid common issues: + +``` +blurry, low quality, distorted, deformed, ugly, bad anatomy, text, watermark, signature +``` + +if you're getting unwanted artistic styles, add: + +``` +cartoon, anime, illustration, painting, drawing, sketch, cgi, 3d render, digital art, stylized +``` + +## multi-scene films with image-to-video + +LTX-2 supports image-to-video generation using `LTX2ImageToVideoPipeline`. you can create continuity between scenes by using the last frame of scene N as the input image for scene N+1. + +### important warnings + +- **style corruption can propagate** - if one scene produces artifacts or wrong style, it will affect all subsequent scenes +- **the prompt still applies** but the input image has strong influence on visual style +- **use higher guidance_scale (5.0+)** to give the prompt more weight over the image +- **if a scene goes wrong**, use the last frame from an earlier good scene instead + +### example workflow + +```python +from diffusers import LTX2Pipeline, LTX2ImageToVideoPipeline + +# scene 1: text-to-video +t2v_pipe = LTX2Pipeline.from_pretrained("Lightricks/LTX-2", torch_dtype=torch.bfloat16) +result1 = t2v_pipe(prompt="...", guidance_scale=4.0, ...) +last_frame = result1.frames[0][-1] + +# scene 2+: image-to-video for continuity +i2v_pipe = LTX2ImageToVideoPipeline.from_pretrained("Lightricks/LTX-2", torch_dtype=torch.bfloat16) +result2 = i2v_pipe( + image=last_frame, + prompt="...", # prompt still matters! + guidance_scale=5.0, # higher to enforce prompt style + ... +) +``` + +## distilled model warning + +there's a distilled version available (`blanchon/LTX-2-Distilled-diffusers`) that promises faster generation with fewer steps. + +**do not use it for production** - in our testing it produces severe artifacts, cartoon-style corruption, and generally unusable output. stick with the full `Lightricks/LTX-2` model. + ## troubleshooting **out of memory** - reduce resolution/frames or close other apps @@ -105,6 +194,17 @@ freqs_dtype = torch.float32 **import errors** - make sure you installed diffusers from git, not pip +**cartoon/artistic style when you wanted photorealistic:** +- add "photorealistic, cinematic film look, real world footage" to your prompt +- add "cartoon, anime, illustration, painting, drawing" to negative prompt +- increase guidance_scale to 5.0 or higher +- if using image-to-video, check if the input image has style issues + +**scene continuity problems in multi-scene films:** +- check each scene individually before combining +- if a scene has artifacts, regenerate it with text-to-video or use a different input frame +- style corruption from bad frames propagates to all subsequent scenes + ## credits - [lightricks](https://github.com/Lightricks) for ltx-2