feat(wan): Add Wan2.1/2.2 T2V with quantization support

2026-02-26 16:16:07 +01:00
parent 7a74946c57
commit e64483a66a
21 changed files with 5309 additions and 35 deletions
--- a/.github/skills/fast-mlx/SKILL.md
+++ b/.github/skills/fast-mlx/SKILL.md
@@ -0,0 +1,26 @@
+---
+name: fast-mlx
+description: Optimize MLX code for performance and memory. Use when asked to implement or speed up MLX models or algorithms, reduce latency/throughput bottlenecks, tune lazy evaluation, type promotion, fast ops, compilation, memory use, or profiling.
+---
+
+# Fast MLX
+
+## Workflow
+
+- Look for opportunities to compile functions that consist mostly of elementwise operations.
+- For models with fixed-shape inputs, or where the shapes rarely change, compile the entire graph.
+- Replace slow implementations with MLX fast ops.
+- Identify evaluation boundaries and unintended sync points (`mx.eval`, `item()`, NumPy conversions).
+- Check dtype promotion and scalar usage; keep precision consistent with intent.
+- Review compilation strategy; avoid unnecessary recompiles and closure captures.
+- Reduce peak memory via lazy loading order and releasing temporaries before `mx.eval`.
+- Suggest profiling steps if the bottleneck is unclear.
+
+## References
+
+- Read `references/fast-mlx-guide.md` for detailed tips and examples. Use it as the source of truth.
+
+## Output expectations
+
+- Provide concrete code changes with brief rationale.
+- Call out changes that need user confirmation (e.g., enabling async eval or shapeless compile).