Commit Graph

19 Commits

Author SHA1 Message Date
Prince Canuma
ac67ee8b1e Remove the generate_dev.py file, consolidating its functionality into generate.py. Enhance the video generation pipeline to support both distilled and dev models, integrating dynamic sigma scheduling and classifier-free guidance (CFG) for improved video quality. Update command-line interface to accommodate new pipeline options and refactor related functions for better maintainability. 2026-01-19 02:13:00 +01:00
Prince Canuma
0538af6554 Enhance video generation pipeline by integrating Rich for styled console output and progress tracking. Update dependencies in pyproject.toml to include Rich. Refactor print statements to use console methods for improved user experience during video and audio processing. 2026-01-19 01:43:14 +01:00
Prince Canuma
cae11291a9 Remove the audio-video generation pipeline from generate_av.py and integrate audio capabilities into generate.py. This includes adding audio position grid creation, audio frame computation, and updating the denoising function to handle audio latents. Enhance the command-line interface to support audio generation options and update the model configuration accordingly. 2026-01-19 01:28:53 +01:00
Prince Canuma
b1bf9e2dc0 Enhance video generation with progress bar for streaming and remove debug prints from tiling decoder 2026-01-17 23:53:53 +01:00
Prince Canuma
7f20840dc7 Add streaming support to video generation 2026-01-17 23:17:08 +01:00
Prince Canuma
78244a2d66 Cast dtype to bf16 in video and audio generation processes 2026-01-17 17:20:22 +01:00
Prince Canuma
e4cdbb7eab add vae tiling 2026-01-17 07:51:54 +01:00
Prince Canuma
f607112407 Refactor video and audio latent generation in generate_video and generate_video_with_audio
- Removed direct initialization of latents with random noise, replacing it with a conditional approach based on I2V (Image-to-Video) conditioning.
- Introduced a structured flow for applying noise during the latent state creation, enhancing the conditioning process for both video and audio.
- Updated the noise application logic to ensure proper handling of conditioned and unconditioned frames in both stages of video generation.
- Improved code clarity and maintainability by consolidating latent shape definitions and restructuring noise application logic.
2026-01-17 01:38:53 +01:00
Prince Canuma
146f5d2981 Add image-to-video (I2V) conditioning support
- Introduced `load_image`, `prepare_image_for_encoding`, and `apply_conditioning` functions for handling image inputs and conditioning during video generation.
- Enhanced `generate_video` and `denoise_av` functions to accept optional image inputs for I2V conditioning.
- Updated command-line interface to include parameters for image conditioning, such as `--image`, `--image-strength`, and `--image-frame-idx`.
- Added new `VideoConditionByLatentIndex` and `LatentState` classes for managing latent states with conditioning.
- Implemented VAE encoder loading and image encoding for conditioning in the video generation process.
2026-01-17 00:19:52 +01:00
Prince Canuma
e1bff927df Auto-detect timestep_cond from model metadata () 2026-01-16 14:55:50 +01:00
Prince Canuma
a658911f98 add audio 2026-01-16 01:15:22 +01:00
Prince Canuma
81daf3f67d Add prompt enhancement feature to video generation
- Introduced `enhance_prompt`, `max_tokens`, and `temperature` parameters in `generate_video` function for improved prompt handling.
- Implemented prompt enhancement logic using the new `enhance_t2v` method in the text encoder.
- Added command-line arguments for prompt enhancement options.
- Created new system prompt files for T2V and I2V generation to guide the enhancement process.
2026-01-15 14:31:00 +01:00
Prince Canuma
fc6ef20c1b Add custom text encoder with quantization
Co-authored-by: HimanshU Mourya <40685364+codingstark-dev@users.noreply.github.com>
2026-01-13 22:56:51 +01:00
Prince Canuma
01d895bc77 Add frame number validation in video generation and update Gemma3 text encoder to use validated mlx-vlm implementation 2026-01-13 17:12:11 +01:00
Prince Canuma
4f6fc8252c Add example usage to README and enhance console output in generate.py with ANSI colors 2026-01-12 16:45:09 +01:00
Prince Canuma
7eac6ae7de Replace imageio with OpenCV for video saving in generate.py; updated default frame count to 100. 2026-01-12 16:12:41 +01:00
Prince Canuma
666e1f2e0c Refactor model path handling: moved get_model_path function to utils.py and updated generate.py to use the new import. 2026-01-12 15:54:32 +01:00
Prince Canuma
75511a0b17 Remove main.py and refactor video generation logic into generate.py. 2026-01-12 14:23:02 +01:00
Prince Canuma
d1ca36a315 initial commit (LTX-2) 2026-01-11 23:48:33 +01:00