Daniel
1cf878f5e0
More poodles
2026-03-11 09:24:06 +01:00
Daniel
d207275fea
fix(wan): Fix scheduler sigma schedule and add debug flags
2026-03-11 09:18:01 +01:00
Daniel
afd15018b7
chore: Cleanup -- reorganize README and docs
2026-03-11 09:17:25 +01:00
Daniel
061ae4407c
feat(wan): Add chunked VAE encoding and TI2V-5B support
2026-03-11 09:16:52 +01:00
Daniel
967218b7c1
feat(wan): Add diagnostic scripts and porting guide
2026-03-11 09:16:22 +01:00
Daniel
9bdda9f22e
feat(wan): Add tiled VAE decoding and fix TI2V quality
2026-03-11 09:16:22 +01:00
Daniel
9597b7c9c5
perf(wan): Add mx.compile and fix first-frame artifacts
2026-03-11 09:14:43 +01:00
Daniel
849cc45d84
feat(wan): Add LoRA with improved quantization pipeline
2026-03-11 09:13:20 +01:00
Daniel
dbab95ec45
fix(wan): Fix RoPE frequency construction
2026-03-11 09:12:19 +01:00
Daniel
f4195f0118
feat(wan): Add I2V-14B dual-model support
2026-03-11 09:12:19 +01:00
Daniel
2bb95c61ed
feat(wan): Add Wan2.2 I2V support
2026-03-11 09:08:10 +01:00
Daniel
93da550f65
feat(wan): Add DPM++ 2M and UniPC schedulers
2026-03-11 09:08:10 +01:00
Daniel
e64483a66a
feat(wan): Add Wan2.1/2.2 T2V with quantization support
2026-03-11 09:08:10 +01:00
Prince Canuma
7a74946c57
Merge pull request #14 from Blaizzy/pc/add-streaming
...
Add --stream flag and chunked conv memory optimization for VAE decoding
2026-01-21 15:42:55 +01:00
Prince Canuma
ffdeec72a6
Merge branch 'main' into pc/add-streaming
2026-01-21 15:42:16 +01:00
Prince Canuma
7ad14e18ca
Merge pull request #12 from Blaizzy/pc/add-vae-tiling
...
Add VAE Tiling + BFloat16 Support for Memory-Efficient Video Generation
2026-01-21 15:41:46 +01:00
Prince Canuma
b1bf9e2dc0
Enhance video generation with progress bar for streaming and remove debug prints from tiling decoder
2026-01-17 23:53:53 +01:00
Prince Canuma
f256c5fb25
add tests
2026-01-17 23:36:39 +01:00
Prince Canuma
7f20840dc7
Add streaming support to video generation
2026-01-17 23:17:08 +01:00
Prince Canuma
f33f496fba
Merge branch 'main' into pc/add-vae-tiling
2026-01-17 19:37:21 +01:00
Prince Canuma
e692b7a6b3
Add i2v
...
Add i2v
2026-01-17 19:37:06 +01:00
Prince Canuma
785b0b955d
Merge branch 'main' into pc/add-i2v
2026-01-17 19:36:28 +01:00
Prince Canuma
26fa8919ed
Merge pull request #13 from Blaizzy/Blaizzy-patch-1
...
Update actions
2026-01-17 19:36:14 +01:00
Prince Canuma
c89de996eb
Update GitHub Sponsors username in FUNDING.yml
2026-01-17 19:35:24 +01:00
Prince Canuma
0669998e15
Add audio support
...
Add audio support
2026-01-17 19:31:21 +01:00
Prince Canuma
61c56cd989
Add RoPE tests and warning for bfloat16 precision loss in RoPE calculations
2026-01-17 19:28:05 +01:00
Prince Canuma
78244a2d66
Cast dtype to bf16 in video and audio generation processes
2026-01-17 17:20:22 +01:00
Prince Canuma
883c6b0ad8
ensure dtype cast
2026-01-17 13:03:48 +01:00
Prince Canuma
e4cdbb7eab
add vae tiling
2026-01-17 07:51:54 +01:00
Prince Canuma
f607112407
Refactor video and audio latent generation in generate_video and generate_video_with_audio
...
- Removed direct initialization of latents with random noise, replacing it with a conditional approach based on I2V (Image-to-Video) conditioning.
- Introduced a structured flow for applying noise during the latent state creation, enhancing the conditioning process for both video and audio.
- Updated the noise application logic to ensure proper handling of conditioned and unconditioned frames in both stages of video generation.
- Improved code clarity and maintainability by consolidating latent shape definitions and restructuring noise application logic.
2026-01-17 01:38:53 +01:00
Prince Canuma
d52e567c56
Enhance precision in denormalization and normalization processes
...
- Updated `denormalize` and `pixel_norm` methods in `LTX2VideoDecoder` and `PerChannelStatistics` classes to cast mean and standard deviation to float32 for improved precision.
- Ensured that the output of normalization operations retains the original data type of the input tensor.
2026-01-17 01:14:29 +01:00
Prince Canuma
ecda6d10e5
Merge pull request #9 from Blaizzy/pc/fix-text-encoder
...
Fix text encoder
2026-01-17 01:10:36 +01:00
Prince Canuma
146f5d2981
Add image-to-video (I2V) conditioning support
...
- Introduced `load_image`, `prepare_image_for_encoding`, and `apply_conditioning` functions for handling image inputs and conditioning during video generation.
- Enhanced `generate_video` and `denoise_av` functions to accept optional image inputs for I2V conditioning.
- Updated command-line interface to include parameters for image conditioning, such as `--image`, `--image-strength`, and `--image-frame-idx`.
- Added new `VideoConditionByLatentIndex` and `LatentState` classes for managing latent states with conditioning.
- Implemented VAE encoder loading and image encoding for conditioning in the video generation process.
2026-01-17 00:19:52 +01:00
Prince Canuma
5f86e881d7
Update top_p parameter in sampler function to 1.0 for enhanced sampling control in LTX2TextEncoder
2026-01-16 21:08:14 +01:00
Prince Canuma
f6e0e5d5a4
Update av_ca_timestep_scale_multiplier to 1000 in model configuration for consistency across modules
2026-01-16 15:59:22 +01:00
Prince Canuma
e1bff927df
Auto-detect timestep_cond from model metadata ()
2026-01-16 14:55:50 +01:00
Prince Canuma
a658911f98
add audio
2026-01-16 01:15:22 +01:00
Prince Canuma
81daf3f67d
Add prompt enhancement feature to video generation
...
- Introduced `enhance_prompt`, `max_tokens`, and `temperature` parameters in `generate_video` function for improved prompt handling.
- Implemented prompt enhancement logic using the new `enhance_t2v` method in the text encoder.
- Added command-line arguments for prompt enhancement options.
- Created new system prompt files for T2V and I2V generation to guide the enhancement process.
2026-01-15 14:31:00 +01:00
Prince Canuma
f5134fa172
adjust gelu and precision
2026-01-15 12:49:21 +01:00
Prince Canuma
349a82f763
Refactor GroupNorm3d: Optimize data type handling by casting input, weight, and bias to float32 for consistency and performance
2026-01-15 04:46:56 +01:00
Prince Canuma
09c2b460a7
Refactor LTX2VideoDecoder and ResBlockGroup: Change up_blocks and res_blocks from lists to dictionaries for better parameter tracking in MLX
2026-01-15 03:48:16 +01:00
Prince Canuma
3fcd8f90be
Refactor LTXModel: Change transformer_blocks from list to dictionary
2026-01-15 03:47:52 +01:00
Prince Canuma
e7067fea11
Refactor LTX2VideoDecoder: Remove redundant comments for residual parameter
2026-01-14 01:21:43 +01:00
Prince Canuma
957093c29b
use numpy for improved float64 precision and performance
2026-01-14 00:03:00 +01:00
Prince Canuma
74af04718d
Remove commented-out code and clean up text encoder initialization
2026-01-13 23:31:54 +01:00
Prince Canuma
ea063f7550
Cast LM weights to bfloat16
2026-01-13 23:30:26 +01:00
Prince Canuma
fc6ef20c1b
Add custom text encoder with quantization
...
Co-authored-by: HimanshU Mourya <40685364+codingstark-dev@users.noreply.github.com>
2026-01-13 22:56:51 +01:00
Prince Canuma
01d895bc77
Add frame number validation in video generation and update Gemma3 text encoder to use validated mlx-vlm implementation
2026-01-13 17:12:11 +01:00
Prince Canuma
61b003ff2c
Revise README for text-to-video generation example
...
Updated example prompt and parameters for video generation.
2026-01-12 17:21:54 +01:00
Prince Canuma
535dd9a066
Update README for macOS requirements
2026-01-12 17:19:32 +01:00