mlx-video

Author	SHA1	Message	Date
Daniel	1cf878f5e0	More poodles	2026-03-11 09:24:06 +01:00
Daniel	d207275fea	fix(wan): Fix scheduler sigma schedule and add debug flags	2026-03-11 09:18:01 +01:00
Daniel	afd15018b7	chore: Cleanup -- reorganize README and docs	2026-03-11 09:17:25 +01:00
Daniel	061ae4407c	feat(wan): Add chunked VAE encoding and TI2V-5B support	2026-03-11 09:16:52 +01:00
Daniel	967218b7c1	feat(wan): Add diagnostic scripts and porting guide	2026-03-11 09:16:22 +01:00
Daniel	9bdda9f22e	feat(wan): Add tiled VAE decoding and fix TI2V quality	2026-03-11 09:16:22 +01:00
Daniel	9597b7c9c5	perf(wan): Add mx.compile and fix first-frame artifacts	2026-03-11 09:14:43 +01:00
Daniel	849cc45d84	feat(wan): Add LoRA with improved quantization pipeline	2026-03-11 09:13:20 +01:00
Daniel	dbab95ec45	fix(wan): Fix RoPE frequency construction	2026-03-11 09:12:19 +01:00
Daniel	f4195f0118	feat(wan): Add I2V-14B dual-model support	2026-03-11 09:12:19 +01:00
Daniel	2bb95c61ed	feat(wan): Add Wan2.2 I2V support	2026-03-11 09:08:10 +01:00
Daniel	93da550f65	feat(wan): Add DPM++ 2M and UniPC schedulers	2026-03-11 09:08:10 +01:00
Daniel	e64483a66a	feat(wan): Add Wan2.1/2.2 T2V with quantization support	2026-03-11 09:08:10 +01:00
Prince Canuma	7a74946c57	Merge pull request #14 from Blaizzy/pc/add-streaming: Add --stream flag and chunked conv memory optimization for VAE decoding	2026-01-21 15:42:55 +01:00
Prince Canuma	ffdeec72a6	Merge branch 'main' into pc/add-streaming	2026-01-21 15:42:16 +01:00
Prince Canuma	7ad14e18ca	Merge pull request #12 from Blaizzy/pc/add-vae-tiling: Add VAE Tiling + BFloat16 Support for Memory-Efficient Video Generation	2026-01-21 15:41:46 +01:00
Prince Canuma	b1bf9e2dc0	Enhance video generation with progress bar for streaming and remove debug prints from tiling decoder	2026-01-17 23:53:53 +01:00
Prince Canuma	f256c5fb25	add tests	2026-01-17 23:36:39 +01:00
Prince Canuma	7f20840dc7	Add streaming support to video generation	2026-01-17 23:17:08 +01:00
Prince Canuma	f33f496fba	Merge branch 'main' into pc/add-vae-tiling	2026-01-17 19:37:21 +01:00
Prince Canuma	e692b7a6b3	Add i2v	2026-01-17 19:37:06 +01:00
Prince Canuma	785b0b955d	Merge branch 'main' into pc/add-i2v	2026-01-17 19:36:28 +01:00
Prince Canuma	26fa8919ed	Merge pull request #13 from Blaizzy/Blaizzy-patch-1: Update actions	2026-01-17 19:36:14 +01:00
Prince Canuma	c89de996eb	Update GitHub Sponsors username in FUNDING.yml	2026-01-17 19:35:24 +01:00
Prince Canuma	0669998e15	Add audio support	2026-01-17 19:31:21 +01:00
Prince Canuma	61c56cd989	Add RoPE tests and warning for bfloat16 precision loss in RoPE calculations	2026-01-17 19:28:05 +01:00
Prince Canuma	78244a2d66	Cast dtype to bf16 in video and audio generation processes	2026-01-17 17:20:22 +01:00
Prince Canuma	883c6b0ad8	ensure dtype cast	2026-01-17 13:03:48 +01:00
Prince Canuma	e4cdbb7eab	add vae tiling	2026-01-17 07:51:54 +01:00
Prince Canuma	f607112407	Refactor video and audio latent generation in `generate_video` and `generate_video_with_audio` - Removed direct initialization of latents with random noise, replacing it with a conditional approach based on I2V (Image-to-Video) conditioning. - Introduced a structured flow for applying noise during the latent state creation, enhancing the conditioning process for both video and audio. - Updated the noise application logic to ensure proper handling of conditioned and unconditioned frames in both stages of video generation. - Improved code clarity and maintainability by consolidating latent shape definitions and restructuring noise application logic.	2026-01-17 01:38:53 +01:00
Prince Canuma	d52e567c56	Enhance precision in denormalization and normalization processes - Updated `denormalize` and `pixel_norm` methods in `LTX2VideoDecoder` and `PerChannelStatistics` classes to cast mean and standard deviation to float32 for improved precision. - Ensured that the output of normalization operations retains the original data type of the input tensor.	2026-01-17 01:14:29 +01:00
Prince Canuma	ecda6d10e5	Merge pull request #9 from Blaizzy/pc/fix-text-encoder: Fix text encoder	2026-01-17 01:10:36 +01:00
Prince Canuma	146f5d2981	Add image-to-video (I2V) conditioning support - Introduced `load_image`, `prepare_image_for_encoding`, and `apply_conditioning` functions for handling image inputs and conditioning during video generation. - Enhanced `generate_video` and `denoise_av` functions to accept optional image inputs for I2V conditioning. - Updated command-line interface to include parameters for image conditioning, such as `--image`, `--image-strength`, and `--image-frame-idx`. - Added new `VideoConditionByLatentIndex` and `LatentState` classes for managing latent states with conditioning. - Implemented VAE encoder loading and image encoding for conditioning in the video generation process.	2026-01-17 00:19:52 +01:00
Prince Canuma	5f86e881d7	Update top_p parameter in sampler function to 1.0 for enhanced sampling control in LTX2TextEncoder	2026-01-16 21:08:14 +01:00
Prince Canuma	f6e0e5d5a4	Update av_ca_timestep_scale_multiplier to 1000 in model configuration for consistency across modules	2026-01-16 15:59:22 +01:00
Prince Canuma	e1bff927df	Auto-detect timestep_cond from model metadata ()	2026-01-16 14:55:50 +01:00
Prince Canuma	a658911f98	add audio	2026-01-16 01:15:22 +01:00
Prince Canuma	81daf3f67d	Add prompt enhancement feature to video generation - Introduced `enhance_prompt`, `max_tokens`, and `temperature` parameters in `generate_video` function for improved prompt handling. - Implemented prompt enhancement logic using the new `enhance_t2v` method in the text encoder. - Added command-line arguments for prompt enhancement options. - Created new system prompt files for T2V and I2V generation to guide the enhancement process.	2026-01-15 14:31:00 +01:00
Prince Canuma	f5134fa172	adjust gelu and precision	2026-01-15 12:49:21 +01:00
Prince Canuma	349a82f763	Refactor GroupNorm3d: Optimize data type handling by casting input, weight, and bias to float32 for consistency and performance	2026-01-15 04:46:56 +01:00
Prince Canuma	09c2b460a7	Refactor LTX2VideoDecoder and ResBlockGroup: Change up_blocks and res_blocks from lists to dictionaries for better parameter tracking in MLX	2026-01-15 03:48:16 +01:00
Prince Canuma	3fcd8f90be	Refactor LTXModel: Change transformer_blocks from list to dictionary	2026-01-15 03:47:52 +01:00
Prince Canuma	e7067fea11	Refactor LTX2VideoDecoder: Remove redundant comments for residual parameter	2026-01-14 01:21:43 +01:00
Prince Canuma	957093c29b	use numpy for improved float64 precision and performance	2026-01-14 00:03:00 +01:00
Prince Canuma	74af04718d	Remove commented-out code and clean up text encoder initialization	2026-01-13 23:31:54 +01:00
Prince Canuma	ea063f7550	Cast LM weights to bfloat16	2026-01-13 23:30:26 +01:00
Prince Canuma	fc6ef20c1b	Add custom text encoder with quantization Co-authored-by: HimanshU Mourya <40685364+codingstark-dev@users.noreply.github.com>	2026-01-13 22:56:51 +01:00
Prince Canuma	01d895bc77	Add frame number validation in video generation and update Gemma3 text encoder to use validated mlx-vlm implementation	2026-01-13 17:12:11 +01:00
Prince Canuma	61b003ff2c	Revise README for text-to-video generation example: updated example prompt and parameters for video generation.	2026-01-12 17:21:54 +01:00
Prince Canuma	535dd9a066	Update README for macOS requirements	2026-01-12 17:19:32 +01:00

65 Commits