v0.4.2

Fine-tune LLMs on your Mac with Apple Silicon

Prototype locally, scale to cloud. Same code, just change the import.

$ pip install mlx-tune

Fine-tune Locally

Train LLMs on M1–M5 Macs natively with Apple’s MLX framework. No cloud GPU required.

Unified Memory

Access up to 512GB unified RAM on Mac Studio. Load larger models than discrete GPU VRAM allows.
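A quick back-of-envelope calculation (plain arithmetic, not part of the library) shows why unified memory matters for model size:

```python
def model_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight memory only -- ignores KV cache and activations."""
    return n_params * bits_per_weight / 8 / 1e9

# A 70B model in fp16 needs ~140 GB for weights alone -- beyond any single
# discrete GPU's VRAM, but within a high-end Mac Studio's unified RAM.
print(model_memory_gb(70e9, 16))  # 140.0
print(model_memory_gb(70e9, 4))   # 35.0 with 4-bit quantization
```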

Same API as Unsloth

Write once, run on Mac or CUDA. Just change the import line—your training code stays the same.

Export Anywhere

Save to HuggingFace format, GGUF for Ollama and llama.cpp, or merged weights for deployment.

One import change. That’s it.

Your Unsloth training scripts work on Apple Silicon with a single line change.

Unsloth (CUDA):

    from unsloth import FastLanguageModel
    from trl import SFTTrainer, SFTConfig

MLX-Tune (Apple Silicon):

    from mlx_tune import FastLanguageModel
    from mlx_tune import SFTTrainer, SFTConfig

The rest of your code stays exactly the same.
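If you want one script that runs unchanged on both platforms, the swap can be automated. A minimal sketch, assuming only the two import names shown above (the helper name and dynamic import are illustrative, not part of either library):

```python
import platform

def pick_backend(system=None, machine=None):
    """Return the module to import: 'mlx_tune' on Apple Silicon, else 'unsloth'."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return "mlx_tune" if (system == "Darwin" and machine == "arm64") else "unsloth"

# e.g.: FastLanguageModel = importlib.import_module(pick_backend()).FastLanguageModel
print(pick_backend("Darwin", "arm64"))   # mlx_tune
print(pick_backend("Linux", "x86_64"))   # unsloth
```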

Up and running in minutes

A complete fine-tuning pipeline in about two dozen lines.

from mlx_tune import FastLanguageModel, SFTTrainer, SFTConfig
from datasets import load_dataset

# Load any HuggingFace model (1B model for quick start)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="mlx-community/Llama-3.2-1B-Instruct-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Add LoRA adapters
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)

# Load a dataset
dataset = load_dataset("yahma/alpaca-cleaned", split="train[:100]")

# Train with SFTTrainer (same API as TRL!)
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    tokenizer=tokenizer,
    args=SFTConfig(
        output_dir="outputs",
        per_device_train_batch_size=2,
        learning_rate=2e-4,
        max_steps=50,
    ),
)
trainer.train()

# Save (same API as Unsloth!)
model.save_pretrained("lora_model")           # Adapters only
model.save_pretrained_merged("merged", tokenizer)  # Full model
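Rows in alpaca-cleaned carry instruction, input, and output columns, while SFT trainers generally want a single text string per example. A formatting helper along these lines can be mapped over the dataset first (the prompt template here is an illustrative assumption, not something the library mandates):

```python
# Hypothetical template -- adjust to match your base model's chat format.
ALPACA_PROMPT = """### Instruction:
{instruction}

### Input:
{input}

### Response:
{output}"""

def to_text(row):
    """Collapse one alpaca-cleaned row into a single training string."""
    return ALPACA_PROMPT.format(
        instruction=row["instruction"],
        input=row.get("input", ""),
        output=row["output"],
    )

row = {"instruction": "Add the numbers.", "input": "2 and 3", "output": "5"}
print(to_text(row))
```

You would then apply it with something like `dataset = dataset.map(lambda r: {"text": to_text(r)})` before handing the dataset to the trainer.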

Get MLX-Tune

Using uv (recommended)

uv pip install mlx-tune

Using pip

pip install mlx-tune

From source (development)

git clone https://github.com/ARahim3/mlx-tune.git
cd mlx-tune
uv pip install -e .

Requirements

Hardware   Apple Silicon Mac (M1 / M2 / M3 / M4 / M5)
OS         macOS 13.0+
Memory     16 GB+ unified RAM (32 GB+ for 7B+ models)
Python     3.9+

Supported trainers

All trainers use native MLX — no subprocess calls or CUDA wrappers.

Method    Trainer         Use Case
SFT       SFTTrainer      Instruction fine-tuning
DPO       DPOTrainer      Preference learning
ORPO      ORPOTrainer     Combined SFT + odds-ratio preference
GRPO      GRPOTrainer     Reasoning with multi-generation (DeepSeek-R1 style)
KTO       KTOTrainer      Kahneman-Tversky optimization
SimPO     SimPOTrainer    Simple preference optimization
VLM SFT   VLMSFTTrainer   Vision-language model fine-tuning
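To give a sense of what the preference trainers optimize, the core DPO objective is small enough to write out directly. A pure-Python sketch of the published loss, independent of any trainer implementation (all arguments are per-example log-probabilities):

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss: -log sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r)))."""
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# No preference signal yet: loss is ln(2) ~ 0.693
print(round(dpo_loss(-10.0, -10.0, -10.0, -10.0), 3))
# Chosen answer pulled above the reference: loss drops below ln(2)
print(dpo_loss(-8.0, -10.0, -10.0, -10.0) < math.log(2))
```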

MLX-Tune vs Unsloth

Feature    Unsloth (CUDA)        MLX-Tune
Platform   NVIDIA GPUs           Apple Silicon
Backend    Triton kernels        MLX framework
Memory     VRAM (limited)        Unified (up to 512 GB)
API        Original              100% compatible
Best for   Production training   Local dev & large models
Note

MLX-Tune is not a replacement for Unsloth. It’s a bridge: prototype on your Mac, then deploy to CUDA with the original Unsloth for production training.