
LLMs

Running local models on Macs gets faster with Ollama’s MLX support

Apple Silicon Macs get a performance boost thanks to better unified memory usage.

Samuel Axon

Mar 31, 2026 7:00 pm


A graphic made by Ollama to announce MLX support. Credit: Ollama


Ollama, a runtime system for operating large language models on a local computer, has introduced support for Apple’s open source MLX framework for machine learning. Additionally, Ollama says it has improved caching performance and now supports Nvidia’s NVFP4 format for model compression, making for much more efficient memory usage in certain models.
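The announcement itself doesn’t quantify the savings, but a back-of-the-envelope sketch shows why a 4-bit format matters for memory. This assumes Nvidia’s published NVFP4 layout (4-bit weights plus one 8-bit scale per 16-element block, roughly 4.5 bits per weight) and an illustrative 7-billion-parameter model; neither figure comes from Ollama’s announcement.

```python
# Back-of-the-envelope weight-memory footprint for a hypothetical
# 7-billion-parameter model. NVFP4 is assumed here to store 4-bit
# weights with one 8-bit scale per 16-element block (~4.5 bits/weight).
PARAMS = 7e9  # illustrative model size, not from the article

def weight_gib(bits_per_weight: float) -> float:
    """Total weight storage in GiB at the given effective precision."""
    return PARAMS * bits_per_weight / 8 / 2**30

fp16 = weight_gib(16)            # ~13.0 GiB
nvfp4 = weight_gib(4 + 8 / 16)   # 4-bit values + block scales ~= 4.5 bits

print(f"FP16:  {fp16:.1f} GiB")
print(f"NVFP4: {nvfp4:.1f} GiB ({fp16 / nvfp4:.1f}x smaller)")
```

On those assumptions, the weights shrink from roughly 13 GiB to under 4 GiB, which is the difference between a model that fits comfortably in a Mac’s unified memory and one that doesn’t.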

Combined, these developments promise significantly improved performance on Macs with Apple Silicon chips (M1 or later)—and the timing couldn’t be better, as local models are starting to gain steam in ways they haven’t before outside researcher and hobbyist communities.
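For readers who haven’t tried a local model, here is a minimal sketch of what running one through Ollama looks like, using the official Python client. The model name is a placeholder, and the choice of MLX versus another backend is made inside the Ollama runtime rather than in this code.

```python
# Minimal local-inference sketch using Ollama's official Python client
# (pip install ollama; requires a running Ollama server and a model
# pulled beforehand, e.g. `ollama pull llama3.2`).
import ollama

response = ollama.chat(
    model="llama3.2",  # placeholder; any locally pulled model works
    messages=[
        {"role": "user", "content": "Why does unified memory help local LLMs?"},
    ],
)

# The generated text lives under message -> content.
print(response["message"]["content"])
```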

The recent runaway success of OpenClaw—which raced its way to over 300,000 stars on GitHub, made headlines with experiments like Moltbook, and became an obsession in…
