mlx-video

Author	SHA1	Message	Date
Prince Canuma	ac67ee8b1e	Remove the generate_dev.py file, consolidating its functionality into generate.py. Enhance the video generation pipeline to support both distilled and dev models, integrating dynamic sigma scheduling and classifier-free guidance (CFG) for improved video quality. Update command-line interface to accommodate new pipeline options and refactor related functions for better maintainability.	2026-01-19 02:13:00 +01:00
Prince Canuma	0538af6554	Enhance video generation pipeline by integrating Rich for styled console output and progress tracking. Update dependencies in pyproject.toml to include Rich. Refactor print statements to use console methods for improved user experience during video and audio processing.	2026-01-19 01:43:14 +01:00
Prince Canuma	cae11291a9	Remove the audio-video generation pipeline from generate_av.py and integrate audio capabilities into generate.py. This includes adding audio position grid creation, audio frame computation, and updating the denoising function to handle audio latents. Enhance the command-line interface to support audio generation options and update the model configuration accordingly.	2026-01-19 01:28:53 +01:00
Prince Canuma	749762a0b9	Update audio decoder configuration to use an empty set for attention resolutions in both generate_av.py and generate_dev.py. Add a print statement for loading audio VAE decoder weights in generate_dev.py.	2026-01-18 21:55:38 +01:00
Prince Canuma	7069cc39c9	Add audio generation capabilities to video pipeline, including audio position grid creation, audio frame computation, and integration of audio VAE and vocoder. Update tests to cover new audio functionalities.	2026-01-18 21:28:56 +01:00
Prince Canuma	b36ad1e22d	add tests	2026-01-18 11:18:18 +01:00
Prince Canuma	e483eab039	Optimize positional embedding handling in TransformerArgsPreprocessor and improve RoPE frequency computation in _precompute_freqs_cis_double_precision for enhanced performance and precision.	2026-01-18 11:13:32 +01:00
Prince Canuma	62fc4805a0	Add LTX-2 Dev Model video generation pipeline	2026-01-18 11:13:11 +01:00
Prince Canuma	b1bf9e2dc0	Enhance video generation with progress bar for streaming and remove debug prints from tiling decoder	2026-01-17 23:53:53 +01:00
Prince Canuma	f256c5fb25	add tests	2026-01-17 23:36:39 +01:00
Prince Canuma	7f20840dc7	Add streaming support to video generation	2026-01-17 23:17:08 +01:00
Prince Canuma	f33f496fba	Merge branch 'main' into pc/add-vae-tiling	2026-01-17 19:37:21 +01:00
Prince Canuma	e692b7a6b3	Add i2v	2026-01-17 19:37:06 +01:00
Prince Canuma	785b0b955d	Merge branch 'main' into pc/add-i2v	2026-01-17 19:36:28 +01:00
Prince Canuma	26fa8919ed	Merge pull request #13 from Blaizzy/Blaizzy-patch-1 Update actions	2026-01-17 19:36:14 +01:00
Prince Canuma	c89de996eb	Update GitHub Sponsors username in FUNDING.yml	2026-01-17 19:35:24 +01:00
Prince Canuma	0669998e15	Add audio support	2026-01-17 19:31:21 +01:00
Prince Canuma	61c56cd989	Add RoPE tests and warning for bfloat16 precision loss in RoPE calculations	2026-01-17 19:28:05 +01:00
Prince Canuma	78244a2d66	Cast dtype to bf16 in video and audio generation processes	2026-01-17 17:20:22 +01:00
Prince Canuma	883c6b0ad8	ensure dtype cast	2026-01-17 13:03:48 +01:00
Prince Canuma	e4cdbb7eab	add vae tiling	2026-01-17 07:51:54 +01:00
Prince Canuma	f607112407	Refactor video and audio latent generation in `generate_video` and `generate_video_with_audio` - Removed direct initialization of latents with random noise, replacing it with a conditional approach based on I2V (Image-to-Video) conditioning. - Introduced a structured flow for applying noise during the latent state creation, enhancing the conditioning process for both video and audio. - Updated the noise application logic to ensure proper handling of conditioned and unconditioned frames in both stages of video generation. - Improved code clarity and maintainability by consolidating latent shape definitions and restructuring noise application logic.	2026-01-17 01:38:53 +01:00
Prince Canuma	d52e567c56	Enhance precision in denormalization and normalization processes - Updated `denormalize` and `pixel_norm` methods in `LTX2VideoDecoder` and `PerChannelStatistics` classes to cast mean and standard deviation to float32 for improved precision. - Ensured that the output of normalization operations retains the original data type of the input tensor.	2026-01-17 01:14:29 +01:00
Prince Canuma	ecda6d10e5	Merge pull request #9 from Blaizzy/pc/fix-text-encoder Fix text encoder	2026-01-17 01:10:36 +01:00
Prince Canuma	146f5d2981	Add image-to-video (I2V) conditioning support - Introduced `load_image`, `prepare_image_for_encoding`, and `apply_conditioning` functions for handling image inputs and conditioning during video generation. - Enhanced `generate_video` and `denoise_av` functions to accept optional image inputs for I2V conditioning. - Updated command-line interface to include parameters for image conditioning, such as `--image`, `--image-strength`, and `--image-frame-idx`. - Added new `VideoConditionByLatentIndex` and `LatentState` classes for managing latent states with conditioning. - Implemented VAE encoder loading and image encoding for conditioning in the video generation process.	2026-01-17 00:19:52 +01:00
Prince Canuma	5f86e881d7	Update top_p parameter in sampler function to 1.0 for enhanced sampling control in LTX2TextEncoder	2026-01-16 21:08:14 +01:00
Prince Canuma	f6e0e5d5a4	Update av_ca_timestep_scale_multiplier to 1000 in model configuration for consistency across modules	2026-01-16 15:59:22 +01:00
Prince Canuma	e1bff927df	Auto-detect timestep_cond from model metadata ()	2026-01-16 14:55:50 +01:00
Prince Canuma	a658911f98	add audio	2026-01-16 01:15:22 +01:00
Prince Canuma	81daf3f67d	Add prompt enhancement feature to video generation - Introduced `enhance_prompt`, `max_tokens`, and `temperature` parameters in `generate_video` function for improved prompt handling. - Implemented prompt enhancement logic using the new `enhance_t2v` method in the text encoder. - Added command-line arguments for prompt enhancement options. - Created new system prompt files for T2V and I2V generation to guide the enhancement process.	2026-01-15 14:31:00 +01:00
Prince Canuma	f5134fa172	adjust gelu and precision	2026-01-15 12:49:21 +01:00
Prince Canuma	349a82f763	Refactor GroupNorm3d: Optimize data type handling by casting input, weight, and bias to float32 for consistency and performance	2026-01-15 04:46:56 +01:00
Prince Canuma	09c2b460a7	Refactor LTX2VideoDecoder and ResBlockGroup: Change up_blocks and res_blocks from lists to dictionaries for better parameter tracking in MLX	2026-01-15 03:48:16 +01:00
Prince Canuma	3fcd8f90be	Refactor LTXModel: Change transformer_blocks from list to dictionary	2026-01-15 03:47:52 +01:00
Prince Canuma	e7067fea11	Refactor LTX2VideoDecoder: Remove redundant comments for residual parameter	2026-01-14 01:21:43 +01:00
Prince Canuma	957093c29b	use numpy for improved float64 precision and performance	2026-01-14 00:03:00 +01:00
Prince Canuma	74af04718d	Remove commented-out code and clean up text encoder initialization	2026-01-13 23:31:54 +01:00
Prince Canuma	ea063f7550	Cast LM weights to bfloat16	2026-01-13 23:30:26 +01:00
Prince Canuma	fc6ef20c1b	Add custom text encoder with quantization Co-authored-by: HimanshU Mourya <40685364+codingstark-dev@users.noreply.github.com>	2026-01-13 22:56:51 +01:00
Prince Canuma	01d895bc77	Add frame number validation in video generation and update Gemma3 text encoder to use validated mlx-vlm implementation	2026-01-13 17:12:11 +01:00
Prince Canuma	61b003ff2c	Revise README for text-to-video generation example Updated example prompt and parameters for video generation.	2026-01-12 17:21:54 +01:00
Prince Canuma	535dd9a066	Update README for macOS requirements	2026-01-12 17:19:32 +01:00
Prince Canuma	070edc0de6	Replace poodles.mp4 with poodles.gif in examples directory	2026-01-12 17:14:12 +01:00
Prince Canuma	74f7f22a0f	remove ref	2026-01-12 17:11:05 +01:00
Prince Canuma	084927c74a	Remove optional dependencies section from README	2026-01-12 16:48:28 +01:00
Prince Canuma	ebe4e74585	Add pre-commit configuration for code formatting and linting with Black, isort, and autoflake	2026-01-12 16:47:34 +01:00
Prince Canuma	4f6fc8252c	Add example usage to README and enhance console output in generate.py with ANSI colors	2026-01-12 16:45:09 +01:00
Prince Canuma	28417fe126	fix git ignore	2026-01-12 16:35:41 +01:00
Prince Canuma	54fb1ed076	fix uv lock	2026-01-12 16:35:15 +01:00
Prince Canuma	c7f94052e8	fix toml	2026-01-12 16:35:03 +01:00

57 Commits