Update README.md
README.md
@@ -97,6 +97,95 @@ freqs_dtype = torch.float32

```
freqs_dtype = torch.float32
```

## prompting guide

LTX-2 works best with detailed, flowing paragraph prompts rather than comma-separated tags. describe what happens in the video like you're writing a screenplay.

### prompt structure

write prompts as flowing paragraphs that include:

1. **scene setting** - location, time of day, weather
2. **camera work** - shot type, movement, framing
3. **subject action** - what's happening, how it moves
4. **visual style** - lighting, colors, atmosphere
5. **audio cues** - ambient sounds, music mood (LTX-2 generates audio too!)
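
the five parts above can be kept separate while drafting and joined into one paragraph at the end. this is a minimal sketch in plain python, not part of LTX-2 or diffusers: the `ScenePrompt` class and its field names are made up for illustration, and the audio sentence is an invented example.

```python
from dataclasses import dataclass


@dataclass
class ScenePrompt:
    """one field per item in the prompt structure list (names are illustrative)."""

    scene_setting: str
    camera_work: str
    subject_action: str
    visual_style: str
    audio_cues: str

    def to_paragraph(self) -> str:
        parts = [
            self.scene_setting,
            self.camera_work,
            self.subject_action,
            self.visual_style,
            self.audio_cues,
        ]
        # skip empty parts and end each one with exactly one period,
        # so the result reads as a flowing paragraph instead of tags
        sentences = [p.strip().rstrip(".") + "." for p in parts if p.strip()]
        return " ".join(sentences)


prompt = ScenePrompt(
    scene_setting="EXT. SNOWY FOREST - DUSK",
    camera_work="A cinematic tracking shot follows a lone grey wolf at eye level",
    subject_action="The wolf walks through deep powder snow between towering pine trees",
    visual_style="Soft blue twilight, shallow depth of field, gentle film grain",
    audio_cues="Paws crunch softly in the snow while wind moves through the branches",
).to_paragraph()
print(prompt)
```

keeping the parts separate makes it easy to change only the camera work or only the audio between scenes while the rest of the description stays the same.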

### example prompts

**bad prompt:**
```
wolf, snow, forest, walking, cinematic
```

**good prompt:**
```
EXT. SNOWY FOREST - DUSK. A cinematic tracking shot follows a lone grey wolf
walking through deep powder snow between towering pine trees. The camera moves
alongside at eye level as soft blue twilight filters through the branches.
The wolf's breath is visible in the cold air, paws crunching softly in the snow.
Atmospheric and moody, shallow depth of field with gentle film grain.
```

### cinematography terms that work well

- **shot types:** wide establishing shot, medium shot, close-up, extreme close-up, overhead shot
- **camera movement:** tracking shot, dolly in/out, pan, crane up, handheld, steadicam
- **framing:** shallow depth of field, rack focus, silhouette, rule of thirds
- **lighting:** golden hour, blue hour, rim light, volumetric light, natural lighting
- **style:** cinematic, documentary style, film grain, anamorphic, photorealistic

### negative prompts

always include a negative prompt to avoid common issues:

```
blurry, low quality, distorted, deformed, ugly, bad anatomy, text, watermark, signature
```

if you're getting unwanted artistic styles, add:

```
cartoon, anime, illustration, painting, drawing, sketch, cgi, 3d render, digital art, stylized
```
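
if you switch between the two lists often, a small helper can build the string for you. this is a sketch in plain python: `build_negative_prompt` and the constant names are hypothetical, and the terms are copied from the two blocks above.

```python
# terms copied from the two negative prompt blocks above
BASE_NEGATIVE = [
    "blurry", "low quality", "distorted", "deformed", "ugly",
    "bad anatomy", "text", "watermark", "signature",
]
STYLE_NEGATIVE = [
    "cartoon", "anime", "illustration", "painting", "drawing",
    "sketch", "cgi", "3d render", "digital art", "stylized",
]


def build_negative_prompt(block_styles: bool = False) -> str:
    """join the base terms, optionally followed by the artistic-style terms."""
    terms = list(BASE_NEGATIVE)
    if block_styles:
        terms += STYLE_NEGATIVE
    # dict.fromkeys drops duplicate terms while keeping their order
    return ", ".join(dict.fromkeys(terms))


print(build_negative_prompt(block_styles=True))
```

the resulting string would go wherever your pipeline call takes its negative prompt. most diffusers pipelines name that argument `negative_prompt`, but check the signature of the pipeline you are using.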

## multi-scene films with image-to-video

LTX-2 supports image-to-video generation using `LTX2ImageToVideoPipeline`. you can create continuity between scenes by using the last frame of scene N as the input image for scene N+1.

### important warnings

- **style corruption can propagate** - if one scene produces artifacts or wrong style, it will affect all subsequent scenes
- **the prompt still applies** but the input image has strong influence on visual style
- **use higher guidance_scale (5.0+)** to give the prompt more weight over the image
- **if a scene goes wrong**, use the last frame from an earlier good scene instead

### example workflow

```python
import torch
from diffusers import LTX2Pipeline, LTX2ImageToVideoPipeline

# scene 1: text-to-video
t2v_pipe = LTX2Pipeline.from_pretrained("Lightricks/LTX-2", torch_dtype=torch.bfloat16)
result1 = t2v_pipe(
    prompt="...",
    guidance_scale=4.0,
    # ... your other generation arguments
)
# last frame of the first video in the batch
last_frame = result1.frames[0][-1]

# scene 2+: image-to-video for continuity
i2v_pipe = LTX2ImageToVideoPipeline.from_pretrained("Lightricks/LTX-2", torch_dtype=torch.bfloat16)
result2 = i2v_pipe(
    image=last_frame,
    prompt="...",  # prompt still matters!
    guidance_scale=5.0,  # higher to enforce prompt style
    # ... your other generation arguments
)
```
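
for more than two scenes, the "fall back to an earlier good scene" rule from the warnings can be written as a loop. this is a sketch under assumptions, not part of diffusers: `chain_scenes`, `generate`, and `looks_ok` are hypothetical names. `generate` stands in for a wrapper around the two pipelines above (text-to-video when the seed frame is `None`, image-to-video otherwise), and `looks_ok` stands in for whatever check you use, which in practice is usually watching the clip yourself.

```python
from typing import Any, Callable, List, Optional

Frame = Any  # e.g. a PIL image; left abstract so the sketch runs without a GPU


def chain_scenes(
    prompts: List[str],
    generate: Callable[[str, Optional[Frame]], List[Frame]],
    looks_ok: Callable[[List[Frame]], bool],
) -> List[Optional[List[Frame]]]:
    """generate scenes in order, seeding each one from the last good scene."""
    scenes: List[Optional[List[Frame]]] = []
    seed: Optional[Frame] = None  # None means the scene starts from text only
    for prompt in prompts:
        frames = generate(prompt, seed)
        if looks_ok(frames):
            scenes.append(frames)
            seed = frames[-1]  # the next scene continues from this last frame
        else:
            # mark the scene for regeneration and keep the earlier good seed,
            # so a corrupted frame is never passed on to later scenes
            scenes.append(None)
    return scenes
```

entries that come back as `None` are the scenes to regenerate. because the seed only advances on good scenes, one bad result does not spread its style to everything after it.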

## distilled model warning

there's a distilled version available (`blanchon/LTX-2-Distilled-diffusers`) that promises faster generation with fewer steps.

**do not use it for production** - in our testing it produces severe artifacts, cartoon-style corruption, and generally unusable output. stick with the full `Lightricks/LTX-2` model.

## troubleshooting

**out of memory** - reduce resolution/frames or close other apps

@@ -105,6 +194,17 @@ freqs_dtype = torch.float32

**import errors** - make sure you installed diffusers from git, not pip

**cartoon/artistic style when you wanted photorealistic:**
- add "photorealistic, cinematic film look, real world footage" to your prompt
- add "cartoon, anime, illustration, painting, drawing" to negative prompt
- increase guidance_scale to 5.0 or higher
- if using image-to-video, check if the input image has style issues
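
the first three fixes in this list are mechanical, so they can be applied in one step before you regenerate. a minimal sketch in plain python: `apply_photoreal_fixes` is a hypothetical helper, and the default `guidance_scale` of 4.0 is only taken from the example workflow, not from any library default.

```python
def apply_photoreal_fixes(prompt, negative_prompt="", guidance_scale=4.0):
    """return (prompt, negative_prompt, guidance_scale) with the fixes applied."""
    style_terms = "photorealistic, cinematic film look, real world footage"
    blocked = "cartoon, anime, illustration, painting, drawing"

    # append the style terms only once
    if style_terms not in prompt:
        prompt = f"{prompt.rstrip()} {style_terms}"

    # extend the negative prompt, or start one if it was empty
    if blocked not in negative_prompt:
        parts = (negative_prompt.strip(), blocked)
        negative_prompt = ", ".join(p for p in parts if p)

    # raise guidance to at least 5.0 but never lower a higher value
    return prompt, negative_prompt, max(guidance_scale, 5.0)


print(apply_photoreal_fixes("a wolf walks through snow.", "blurry"))
```

the fourth fix, checking the input image, still has to be done by eye.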

**scene continuity problems in multi-scene films:**
- check each scene individually before combining
- if a scene has artifacts, regenerate it with text-to-video or use a different input frame
- style corruption from bad frames propagates to all subsequent scenes

## credits

- [lightricks](https://github.com/Lightricks) for ltx-2