35 Commits

Author SHA1 Message Date
Prince Canuma
f607112407 Refactor video and audio latent generation in generate_video and generate_video_with_audio
- Removed direct initialization of latents with random noise, replacing it with a conditional approach based on I2V (Image-to-Video) conditioning.
- Introduced a structured flow for applying noise during the latent state creation, enhancing the conditioning process for both video and audio.
- Updated the noise application logic to ensure proper handling of conditioned and unconditioned frames in both stages of video generation.
- Improved code clarity and maintainability by consolidating latent shape definitions and restructuring noise application logic.
2026-01-17 01:38:53 +01:00
Prince Canuma
d52e567c56 Enhance precision in denormalization and normalization processes
- Updated `denormalize` and `pixel_norm` methods in `LTX2VideoDecoder` and `PerChannelStatistics` classes to cast mean and standard deviation to float32 for improved precision.
- Ensured that the output of normalization operations retains the original data type of the input tensor.
2026-01-17 01:14:29 +01:00
Prince Canuma
146f5d2981 Add image-to-video (I2V) conditioning support
- Introduced `load_image`, `prepare_image_for_encoding`, and `apply_conditioning` functions for handling image inputs and conditioning during video generation.
- Enhanced `generate_video` and `denoise_av` functions to accept optional image inputs for I2V conditioning.
- Updated command-line interface to include parameters for image conditioning, such as `--image`, `--image-strength`, and `--image-frame-idx`.
- Added new `VideoConditionByLatentIndex` and `LatentState` classes for managing latent states with conditioning.
- Implemented VAE encoder loading and image encoding for conditioning in the video generation process.
2026-01-17 00:19:52 +01:00
Prince Canuma
5f86e881d7 Update top_p parameter in sampler function to 1.0 for enhanced sampling control in LTX2TextEncoder 2026-01-16 21:08:14 +01:00
Prince Canuma
f6e0e5d5a4 Update av_ca_timestep_scale_multiplier to 1000 in model configuration for consistency across modules 2026-01-16 15:59:22 +01:00
Prince Canuma
e1bff927df Auto-detect timestep_cond from model metadata () 2026-01-16 14:55:50 +01:00
Prince Canuma
a658911f98 add audio 2026-01-16 01:15:22 +01:00
Prince Canuma
81daf3f67d Add prompt enhancement feature to video generation
- Introduced `enhance_prompt`, `max_tokens`, and `temperature` parameters in `generate_video` function for improved prompt handling.
- Implemented prompt enhancement logic using the new `enhance_t2v` method in the text encoder.
- Added command-line arguments for prompt enhancement options.
- Created new system prompt files for T2V and I2V generation to guide the enhancement process.
2026-01-15 14:31:00 +01:00
Prince Canuma
f5134fa172 adjust gelu and precision 2026-01-15 12:49:21 +01:00
Prince Canuma
349a82f763 Refactor GroupNorm3d: Optimize data type handling by casting input, weight, and bias to float32 for consistency and performance 2026-01-15 04:46:56 +01:00
Prince Canuma
09c2b460a7 Refactor LTX2VideoDecoder and ResBlockGroup: Change up_blocks and res_blocks from lists to dictionaries for better parameter tracking in MLX 2026-01-15 03:48:16 +01:00
Prince Canuma
3fcd8f90be Refactor LTXModel: Change transformer_blocks from list to dictionary 2026-01-15 03:47:52 +01:00
Prince Canuma
e7067fea11 Refactor LTX2VideoDecoder: Remove redundant comments for residual parameter 2026-01-14 01:21:43 +01:00
Prince Canuma
957093c29b use numpy for improved float64 precision and performance 2026-01-14 00:03:00 +01:00
Prince Canuma
74af04718d Remove commented-out code and clean up text encoder initialization 2026-01-13 23:31:54 +01:00
Prince Canuma
ea063f7550 Cast LM weights to bfloat16 2026-01-13 23:30:26 +01:00
Prince Canuma
fc6ef20c1b Add custom text encoder with quantization
Co-authored-by: HimanshU Mourya <40685364+codingstark-dev@users.noreply.github.com>
2026-01-13 22:56:51 +01:00
Prince Canuma
01d895bc77 Add frame number validation in video generation and update Gemma3 text encoder to use validated mlx-vlm implementation 2026-01-13 17:12:11 +01:00
Prince Canuma
61b003ff2c Revise README for text-to-video generation example
Updated example prompt and parameters for video generation.
2026-01-12 17:21:54 +01:00
Prince Canuma
535dd9a066 Update README for macOS requirements 2026-01-12 17:19:32 +01:00
Prince Canuma
070edc0de6 Replace poodles.mp4 with poodles.gif in examples directory 2026-01-12 17:14:12 +01:00
Prince Canuma
74f7f22a0f remove ref 2026-01-12 17:11:05 +01:00
Prince Canuma
084927c74a Remove optional dependencies section from README 2026-01-12 16:48:28 +01:00
Prince Canuma
ebe4e74585 Add pre-commit configuration for code formatting and linting with Black, isort, and autoflake 2026-01-12 16:47:34 +01:00
Prince Canuma
4f6fc8252c Add example usage to README and enhance console output in generate.py with ANSI colors 2026-01-12 16:45:09 +01:00
Prince Canuma
28417fe126 fix git ignore 2026-01-12 16:35:41 +01:00
Prince Canuma
54fb1ed076 fix uv lock 2026-01-12 16:35:15 +01:00
Prince Canuma
c7f94052e8 fix toml 2026-01-12 16:35:03 +01:00
Prince Canuma
28d03d8846 setup 2026-01-12 16:17:45 +01:00
Prince Canuma
7eac6ae7de Replace imageio with OpenCV for video saving in generate.py; updated default frame count to 100. 2026-01-12 16:12:41 +01:00
Prince Canuma
666e1f2e0c Refactor model path handling: moved get_model_path function to utils.py and updated generate.py to use the new import. 2026-01-12 15:54:32 +01:00
Prince Canuma
75511a0b17 Remove main.py and refactor video generation logic into generate.py. 2026-01-12 14:23:02 +01:00
Prince Canuma
7114b023bd Refactor video generation script
- Introduced argparse for parameter handling, streamlined model loading, and enhanced denoising functions.
- Updated VAE weight sanitization for compatibility and improved activation function handling in text projection.
- Added support for saving individual frames and refined output video generation process.
2026-01-12 14:04:53 +01:00
Prince Canuma
d1ca36a315 initial commit (LTX-2) 2026-01-11 23:48:33 +01:00
Prince Canuma
9f01d22750 Initial commit 2025-05-07 12:21:09 +02:00
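
The image-conditioning options named in commit 146f5d2981 (`--image`, `--image-strength`, `--image-frame-idx`) could be wired up with argparse roughly as follows. This is a minimal sketch, not the repository's code: the function name `build_i2v_parser`, the argument types, the defaults, and the help strings are all assumptions. Only the three flag names come from the commit message.

```python
import argparse


def build_i2v_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing the three I2V flags from commit 146f5d2981."""
    parser = argparse.ArgumentParser(description="I2V conditioning options (sketch)")
    # Path to the conditioning image; None is assumed to mean plain text-to-video.
    parser.add_argument("--image", type=str, default=None,
                        help="Path to an image used for I2V conditioning")
    # Assumed to be a float weight; the default of 1.0 is a guess.
    parser.add_argument("--image-strength", type=float, default=1.0,
                        help="How strongly the image conditions the video")
    # Assumed to be the index of the frame the image is attached to; default 0 is a guess.
    parser.add_argument("--image-frame-idx", type=int, default=0,
                        help="Frame index at which the image is applied")
    return parser


if __name__ == "__main__":
    args = build_i2v_parser().parse_args()
    print(args.image, args.image_strength, args.image_frame_idx)
```

argparse converts the dashes in flag names to underscores, so the parsed values are read as `args.image`, `args.image_strength`, and `args.image_frame_idx`.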