| Author | Commit | Message | Date |
|---|---|---|---|
| Prince Canuma | 95d7c81b20 | Remove deprecated stubs for video conversion and generation; introduce new weight conversion and generation scripts for Wan2.2 models in MLX. | 2026-03-18 17:20:36 +01:00 |
| Prince Canuma | 7b9d0a5e44 | Merge branch 'main' into pc/unify-apis | 2026-03-18 17:14:17 +01:00 |
| Prince Canuma | fea0f87df9 | Fix token handling in LTX-2 text encoder by directly appending response tokens to the generated tokens list, improving clarity and consistency in token generation. | 2026-03-18 13:50:33 +01:00 |
| Prince Canuma | f5e311a77c | Update default values for STG and modality scales in LTX-2 video generation; enhance help descriptions for command-line arguments | 2026-03-18 12:17:47 +01:00 |
| Prince Canuma | f8e371e9ce | Enhance upsampler weight detection logic in LTX-2 model; improve clarity in comments and streamline spatial scale determination for x1.5 and x2 formats | 2026-03-17 15:14:57 +01:00 |
| Prince Canuma | 57f66bcae2 | Add custom spatial upscaling support to LTX-2 video generation; introduce spatial_upscaler parameter and enhance resolution handling for two-stage pipelines | 2026-03-17 02:23:47 +01:00 |
| Prince Canuma | cc302d79b0 | Refactor comments and optimize key skipping logic in LTX-2 model conversion; improve clarity in code documentation | 2026-03-17 00:39:52 +01:00 |
| Prince Canuma | 643f250195 | Update README.md with installation instructions, supported models, and usage examples; add new LTX-2 model documentation for pipelines and features. | 2026-03-16 23:03:05 +01:00 |
| Prince Canuma | f9880a0683 | Add audio encoder sanitization and configuration inference to LTX-2 model conversion process; update conversion print statements for new encoder step | 2026-03-16 22:35:27 +01:00 |
| Prince Canuma | 7a576bfbf4 | Refactor weight loading and utility functions for LTX-2 model; remove deprecated weight loading file and update imports accordingly | 2026-03-16 22:25:22 +01:00 |
| Prince Canuma | dd573d53d2 | Refactor audio VAE directory structure and update related paths in conversion and loading functions | 2026-03-16 21:53:37 +01:00 |
| Prince Canuma | a6a6bb2166 | Move weight loading functions to a new file for better organization and maintainability | 2026-03-16 17:28:06 +01:00 |
| Prince Canuma | 3a0da19adb | Refactor LTX-2 model structure | 2026-03-16 14:50:01 +01:00 |
| Prince Canuma | 6f6105b715 | Add audio to video conditioning | 2026-03-16 01:42:11 +01:00 |
| Prince Canuma | f53b9e0807 | Add Dev Two-Stage HQ pipeline mode | 2026-03-16 00:34:13 +01:00 |
| Prince Canuma | df81bc852f | fix save tensors | 2026-03-15 23:08:12 +01:00 |
| Prince Canuma | cecd68197c | fix tiling, rope precision and weights | 2026-03-15 22:58:55 +01:00 |
| Prince Canuma | ebcd5dd4e4 | optimize memory usage by batching weight updates | 2026-03-15 03:12:47 +01:00 |
| Prince Canuma | 53bae534e7 | fix LTX-2.3 audio | 2026-03-15 02:06:35 +01:00 |
| Prince Canuma | eb0d1355e4 | Fix LTX-2.3 decoder grainy bug | 2026-03-14 21:56:03 +01:00 |
| Prince Canuma | 5644492f7d | Update generate.py to enhance denoising functionality with optional Spatiotemporal Guidance (STG) support. Modify DEFAULT_NEGATIVE_PROMPT for improved clarity and detail. Implement auto-detection of STG blocks based on transformer configuration. Refactor denoise_dev function to incorporate STG parameters, allowing for more flexible audio-visual integration during video generation. | 2026-03-14 20:02:42 +01:00 |
| Prince Canuma | ffe271699a | Refactor LoRA loading for v2.3 in generate.py to prioritize distilled-lora files over full model weights, enhancing model flexibility. Update key sanitization logic to utilize a replacement list for improved readability and maintainability. Modify denoise_dev_av function to include sigma parameters for audio and video modalities, ensuring consistent handling of latent variables during processing. Adjust Vocoder weight loading to allow for non-strict loading, accommodating additional keys in model weights. | 2026-03-14 15:24:50 +01:00 |
| Prince Canuma | 9cba2ea7cd | Enhance README.md with new usage examples for STG and modality scale parameters in video generation. Update generate.py to support STG and modality guidance in the denoising process, allowing for improved audio-visual integration. Refactor attention mechanisms in the transformer to include options for skipping self-attention, facilitating STG perturbation and modality isolation. Update LTXModel and transformer block processing to accommodate new parameters for enhanced flexibility in model configurations. | 2026-03-14 10:26:12 +01:00 |
| Prince Canuma | f346e09de4 | Refactor audio handling in generate_video function to preserve stage 1 audio latents during stage 2 processing. Remove redundant audio re-denoising steps, ensuring audio integrity while refining video output. Update comments for clarity on audio processing logic. | 2026-03-13 16:09:07 +01:00 |
| Prince Canuma | 387d4fc301 | improve dev color and quality | 2026-03-13 09:51:24 +01:00 |
| Prince Canuma | 835ba33202 | Enhance README.md with detailed descriptions of LTX-2 features, pipeline options, and usage examples for text-to-video, image-to-video, and audio-video generation. Update generate.py to improve LoRA loading functionality, allowing for local files, directories, or HuggingFace repos. This update improves flexibility in model configurations and enhances user guidance in the documentation. | 2026-03-13 01:39:39 +01:00 |
| Prince Canuma | 7435facc52 | Add support for DEV_TWO_STAGE pipeline and implement LoRA merging functionality in generate.py. Enhance video generation capabilities by allowing LoRA weights to be loaded and merged into the model, improving flexibility in model configurations. Update pipeline handling to accommodate the new two-stage generation process. | 2026-03-13 01:22:45 +01:00 |
| Prince Canuma | e0aafd72fc | Refactor generate.py to ensure temporal coordinates and position grids are processed in bfloat16 for consistency with PyTorch's precision behavior. Update denoise_dev_av function to apply standard ratio rescaling for audio and video guidance, enhancing numerical fidelity and model compatibility. | 2026-03-12 21:26:38 +01:00 |
| Prince Canuma | b07b1e3213 | Update .gitignore to exclude additional configuration and model files. Modify generate.py to enhance console output with rescale parameter and adjust default values for inference steps and CFG scale. Refactor text encoder to align positional embedding max position with PyTorch defaults, improving compatibility and performance. | 2026-03-12 17:13:43 +01:00 |
| Prince Canuma | d1fa47722b | Fix timestep_conditioning logic in infer_vae_decoder_config to ensure consistent behavior based on has_timestep flag. | 2026-03-11 18:30:29 +01:00 |
| Daniel | 33dd3c2edd | Revert small change to mlx_video/generate.py | 2026-03-11 12:41:44 +01:00 |
| Daniel | 281750f0a9 | Revert changes to existing files by copying some code. | 2026-03-11 12:35:47 +01:00 |
| Daniel | ae410f3121 | Update Wan2.1/Wan2.2 README.md | 2026-03-11 12:24:59 +01:00 |
| Daniel | c144c8817c | refactor(wan): move causal_temporal tiling to wan/tiling.py Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> | 2026-03-11 12:02:54 +01:00 |
| Daniel | 1cf878f5e0 | More poodles | 2026-03-11 09:24:06 +01:00 |
| Daniel | d207275fea | fix(wan): Fix scheduler sigma schedule and add debug flags | 2026-03-11 09:18:01 +01:00 |
| Daniel | afd15018b7 | chore: Cleanup -- reorganize README and docs | 2026-03-11 09:17:25 +01:00 |
| Daniel | 061ae4407c | feat(wan): Add chunked VAE encoding and TI2V-5B support | 2026-03-11 09:16:52 +01:00 |
| Daniel | 9bdda9f22e | feat(wan): Add tiled VAE decoding and fix TI2V quality | 2026-03-11 09:16:22 +01:00 |
| Daniel | 9597b7c9c5 | perf(wan): Add mx.compile and fix first-frame artifacts | 2026-03-11 09:14:43 +01:00 |
| Daniel | 849cc45d84 | feat(wan): Add LoRA with improved quantization pipeline | 2026-03-11 09:13:20 +01:00 |
| Daniel | dbab95ec45 | fix(wan): Fix RoPE frequency construction | 2026-03-11 09:12:19 +01:00 |
| Daniel | f4195f0118 | feat(wan): Add I2V-14B dual-model support | 2026-03-11 09:12:19 +01:00 |
| Daniel | 2bb95c61ed | feat(wan): Add Wan2.2 I2V support | 2026-03-11 09:08:10 +01:00 |
| Daniel | 93da550f65 | feat(wan): Add DPM++ 2M and UniPC schedulers | 2026-03-11 09:08:10 +01:00 |
| Daniel | e64483a66a | feat(wan): Add Wan2.1/2.2 T2V with quantization support | 2026-03-11 09:08:10 +01:00 |
| Prince Canuma | 207c223354 | Add LTX-2.3 model architecture with prompt-conditioned adaptive layer normalization (adaln) support. Introduce gating mechanisms in attention modules and update transformer configurations to accommodate new parameters. Refactor video and audio processing to utilize adaptive normalization, improving model flexibility and performance. Update weight loading and initialization logic to support dynamic block structures in the decoder. | 2026-03-10 16:47:36 +01:00 |
| Prince Canuma | d028b239fb | Update LTX conversion script to support LTX-2.3 safetensors format. Enhance documentation and improve file matching logic for variant detection in local directories. | 2026-03-10 08:01:26 +01:00 |
| Prince Canuma | 576e01da14 | Implement linking of text encoder and tokenizer directories in conversion process. Enhance error handling in LTX2TextEncoder for tokenizer loading, providing a fallback model if the specified path is unavailable. | 2026-03-09 18:25:32 +01:00 |
| Prince Canuma | 41ed62f7e8 | Add LTX-2 conversion script for safetensors to MLX directory layout. Implement modular structure | 2026-03-09 18:16:20 +01:00 |