Prince Canuma
ac67ee8b1e
Remove the generate_dev.py file, consolidating its functionality into generate.py. Enhance the video generation pipeline to support both distilled and dev models, integrating dynamic sigma scheduling and classifier-free guidance (CFG) for improved video quality. Update command-line interface to accommodate new pipeline options and refactor related functions for better maintainability.
2026-01-19 02:13:00 +01:00
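For readers unfamiliar with the classifier-free guidance (CFG) named in commit ac67ee8b1e, the sketch below shows the usual CFG combination step. The function name `apply_cfg`, the `guidance_scale` parameter, and the use of plain Python lists are assumptions made for illustration; the code in generate.py is not shown in this log and may differ.

```python
from typing import List, Sequence


def apply_cfg(
    cond: Sequence[float],
    uncond: Sequence[float],
    guidance_scale: float,
) -> List[float]:
    """Combine conditional and unconditional predictions (hypothetical helper).

    Standard CFG formula: uncond + guidance_scale * (cond - uncond).
    A scale of 1.0 returns the conditional prediction unchanged.
    """
    if len(cond) != len(uncond):
        raise ValueError("cond and uncond must have the same length")
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]
```

A scale above 1.0 pushes the result further along the direction from the unconditional to the conditional prediction, which is why CFG needs two model evaluations per denoising step.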
Prince Canuma
0538af6554
Enhance video generation pipeline by integrating Rich for styled console output and progress tracking. Update dependencies in pyproject.toml to include Rich. Refactor print statements to use console methods for improved user experience during video and audio processing.
2026-01-19 01:43:14 +01:00
Prince Canuma
cae11291a9
Remove the audio-video generation pipeline from generate_av.py and integrate audio capabilities into generate.py. This includes adding audio position grid creation, audio frame computation, and updating the denoising function to handle audio latents. Enhance the command-line interface to support audio generation options and update the model configuration accordingly.
2026-01-19 01:28:53 +01:00
Prince Canuma
749762a0b9
Update audio decoder configuration to use an empty set for attention resolutions in both generate_av.py and generate_dev.py. Add a print statement for loading audio VAE decoder weights in generate_dev.py.
2026-01-18 21:55:38 +01:00
Prince Canuma
7069cc39c9
Add audio generation capabilities to video pipeline, including audio position grid creation, audio frame computation, and integration of audio VAE and vocoder. Update tests to cover new audio functionalities.
2026-01-18 21:28:56 +01:00
Prince Canuma
b36ad1e22d
add tests
2026-01-18 11:18:18 +01:00
Prince Canuma
e483eab039
Optimize positional embedding handling in TransformerArgsPreprocessor and improve RoPE frequency computation in _precompute_freqs_cis_double_precision for enhanced performance and precision.
2026-01-18 11:13:32 +01:00
Prince Canuma
62fc4805a0
Add LTX-2 Dev Model video generation pipeline
2026-01-18 11:13:11 +01:00
Prince Canuma
b1bf9e2dc0
Enhance video generation with progress bar for streaming and remove debug prints from tiling decoder
2026-01-17 23:53:53 +01:00
Prince Canuma
f256c5fb25
add tests
2026-01-17 23:36:39 +01:00
Prince Canuma
7f20840dc7
Add streaming support to video generation
2026-01-17 23:17:08 +01:00
Prince Canuma
f33f496fba
Merge branch 'main' into pc/add-vae-tiling
2026-01-17 19:37:21 +01:00
Prince Canuma
e692b7a6b3
Add i2v
2026-01-17 19:37:06 +01:00
Prince Canuma
785b0b955d
Merge branch 'main' into pc/add-i2v
2026-01-17 19:36:28 +01:00
Prince Canuma
26fa8919ed
Merge pull request #13 from Blaizzy/Blaizzy-patch-1
...
Update actions
2026-01-17 19:36:14 +01:00
Prince Canuma
c89de996eb
Update GitHub Sponsors username in FUNDING.yml
2026-01-17 19:35:24 +01:00
Prince Canuma
0669998e15
Add audio support
2026-01-17 19:31:21 +01:00
Prince Canuma
61c56cd989
Add RoPE tests and warning for bfloat16 precision loss in RoPE calculations
2026-01-17 19:28:05 +01:00
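The bfloat16 precision loss that commit 61c56cd989 warns about can be illustrated without any ML framework. The sketch below emulates bfloat16 rounding with the standard library and measures the error in a RoPE-style angle `position * freq`. The helper names `to_bfloat16` and `rope_angle_error` are hypothetical and are not taken from the repository; the sketch assumes finite inputs within float32 range.

```python
import math
import struct


def to_bfloat16(x: float) -> float:
    """Round x to the nearest bfloat16 value (ties to even), returned as a float.

    bfloat16 keeps the top 16 bits of a float32, i.e. 8 significant bits.
    """
    if not math.isfinite(x):
        return x  # inf/nan are passed through unchanged in this sketch
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add a rounding bias so that truncating the low 16 bits rounds to nearest even.
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def rope_angle_error(position: int, freq: float) -> float:
    """Absolute error of position * freq when every value is held in bfloat16."""
    exact = position * freq
    rounded = to_bfloat16(to_bfloat16(float(position)) * to_bfloat16(freq))
    return abs(exact - rounded)
```

With only 8 significant bits, integers above 256 are no longer all representable, so position indices for longer sequences are already rounded before the angle is computed.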
Prince Canuma
78244a2d66
Cast dtype to bf16 in video and audio generation processes
2026-01-17 17:20:22 +01:00
Prince Canuma
883c6b0ad8
ensure dtype cast
2026-01-17 13:03:48 +01:00
Prince Canuma
e4cdbb7eab
add vae tiling
2026-01-17 07:51:54 +01:00
Prince Canuma
f607112407
Refactor video and audio latent generation in generate_video and generate_video_with_audio
...
- Removed direct initialization of latents with random noise, replacing it with a conditional approach based on I2V (Image-to-Video) conditioning.
- Introduced a structured flow for applying noise during the latent state creation, enhancing the conditioning process for both video and audio.
- Updated the noise application logic to ensure proper handling of conditioned and unconditioned frames in both stages of video generation.
- Improved code clarity and maintainability by consolidating latent shape definitions and restructuring noise application logic.
2026-01-17 01:38:53 +01:00
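One way to picture the conditional noise application described in commit f607112407 is a per-frame blend between clean latents and noise. The sketch below is an assumption-heavy illustration, not the repository's code: the function name `init_latents`, the blend formula, and the convention that a conditioned frame uses a mask of `1 - strength` are all hypothetical.

```python
import random
from typing import List, Set


def init_latents(
    clean: List[List[float]],
    conditioned: Set[int],
    strength: float,
    sigma: float,
    rng: random.Random,
) -> List[List[float]]:
    """Blend noise into each frame's latent values (hypothetical helper).

    Assumed rule: weight = mask * sigma, where mask is 1.0 for unconditioned
    frames and (1 - strength) for conditioned frames, and
    noised = noise * weight + clean * (1 - weight).
    """
    out = []
    for idx, frame in enumerate(clean):
        mask = 1.0 - strength if idx in conditioned else 1.0
        weight = mask * sigma
        out.append([rng.gauss(0.0, 1.0) * weight + v * (1.0 - weight) for v in frame])
    return out
```

Under these assumptions, a conditioned frame with `strength=1.0` keeps its encoded image latent untouched, while every other frame starts from pure noise when `sigma=1.0`.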
Prince Canuma
d52e567c56
Enhance precision in denormalization and normalization processes
...
- Updated `denormalize` and `pixel_norm` methods in `LTX2VideoDecoder` and `PerChannelStatistics` classes to cast mean and standard deviation to float32 for improved precision.
- Ensured that the output of normalization operations retains the original data type of the input tensor.
2026-01-17 01:14:29 +01:00
Prince Canuma
ecda6d10e5
Merge pull request #9 from Blaizzy/pc/fix-text-encoder
...
Fix text encoder
2026-01-17 01:10:36 +01:00
Prince Canuma
146f5d2981
Add image-to-video (I2V) conditioning support
...
- Introduced `load_image`, `prepare_image_for_encoding`, and `apply_conditioning` functions for handling image inputs and conditioning during video generation.
- Enhanced `generate_video` and `denoise_av` functions to accept optional image inputs for I2V conditioning.
- Updated command-line interface to include parameters for image conditioning, such as `--image`, `--image-strength`, and `--image-frame-idx`.
- Added new `VideoConditionByLatentIndex` and `LatentState` classes for managing latent states with conditioning.
- Implemented VAE encoder loading and image encoding for conditioning in the video generation process.
2026-01-17 00:19:52 +01:00
Prince Canuma
5f86e881d7
Update top_p parameter in sampler function to 1.0 for enhanced sampling control in LTX2TextEncoder
2026-01-16 21:08:14 +01:00
Prince Canuma
f6e0e5d5a4
Update av_ca_timestep_scale_multiplier to 1000 in model configuration for consistency across modules
2026-01-16 15:59:22 +01:00
Prince Canuma
e1bff927df
Auto-detect timestep_cond from model metadata ()
2026-01-16 14:55:50 +01:00
Prince Canuma
a658911f98
add audio
2026-01-16 01:15:22 +01:00
Prince Canuma
81daf3f67d
Add prompt enhancement feature to video generation
...
- Introduced `enhance_prompt`, `max_tokens`, and `temperature` parameters in `generate_video` function for improved prompt handling.
- Implemented prompt enhancement logic using the new `enhance_t2v` method in the text encoder.
- Added command-line arguments for prompt enhancement options.
- Created new system prompt files for T2V and I2V generation to guide the enhancement process.
2026-01-15 14:31:00 +01:00
Prince Canuma
f5134fa172
adjust gelu and precision
2026-01-15 12:49:21 +01:00
Prince Canuma
349a82f763
Refactor GroupNorm3d: Optimize data type handling by casting input, weight, and bias to float32 for consistency and performance
2026-01-15 04:46:56 +01:00
Prince Canuma
09c2b460a7
Refactor LTX2VideoDecoder and ResBlockGroup: Change up_blocks and res_blocks from lists to dictionaries for better parameter tracking in MLX
2026-01-15 03:48:16 +01:00
Prince Canuma
3fcd8f90be
Refactor LTXModel: Change transformer_blocks from list to dictionary
2026-01-15 03:47:52 +01:00
Prince Canuma
e7067fea11
Refactor LTX2VideoDecoder: Remove redundant comments for residual parameter
2026-01-14 01:21:43 +01:00
Prince Canuma
957093c29b
use numpy for improved float64 precision and performance
2026-01-14 00:03:00 +01:00
Prince Canuma
74af04718d
Remove commented-out code and clean up text encoder initialization
2026-01-13 23:31:54 +01:00
Prince Canuma
ea063f7550
Cast LM weights to bfloat16
2026-01-13 23:30:26 +01:00
Prince Canuma
fc6ef20c1b
Add custom text encoder with quantization
...
Co-authored-by: HimanshU Mourya <40685364+codingstark-dev@users.noreply.github.com>
2026-01-13 22:56:51 +01:00
Prince Canuma
01d895bc77
Add frame number validation in video generation and update Gemma3 text encoder to use validated mlx-vlm implementation
2026-01-13 17:12:11 +01:00
Prince Canuma
61b003ff2c
Revise README for text-to-video generation example
...
Updated example prompt and parameters for video generation.
2026-01-12 17:21:54 +01:00
Prince Canuma
535dd9a066
Update README for macOS requirements
2026-01-12 17:19:32 +01:00
Prince Canuma
070edc0de6
Replace poodles.mp4 with poodles.gif in examples directory
2026-01-12 17:14:12 +01:00
Prince Canuma
74f7f22a0f
remove ref
2026-01-12 17:11:05 +01:00
Prince Canuma
084927c74a
Remove optional dependencies section from README
2026-01-12 16:48:28 +01:00
Prince Canuma
ebe4e74585
Add pre-commit configuration for code formatting and linting with Black, isort, and autoflake
2026-01-12 16:47:34 +01:00
Prince Canuma
4f6fc8252c
Add example usage to README and enhance console output in generate.py with ANSI colors
2026-01-12 16:45:09 +01:00
Prince Canuma
28417fe126
fix git ignore
2026-01-12 16:35:41 +01:00
Prince Canuma
54fb1ed076
fix uv lock
2026-01-12 16:35:15 +01:00
Prince Canuma
c7f94052e8
fix toml
2026-01-12 16:35:03 +01:00