One of the most powerful features of Bwocks is that it can run AI locally on your computer.

No internet required.

No per-prompt billing.

No surprise usage limits.

If that sounds intimidating — don’t worry. You don’t need to be a developer, and you don’t need to understand how models work internally. This guide walks through everything step by step.

Why run AI locally at all?

Cloud AI is fast and convenient. But it also comes with tradeoffs you don’t always notice at first:

  • You’re dependent on pricing changes
  • Usage limits can kick in at inconvenient times
  • Model behavior can change without warning
  • You need a stable internet connection

Local models flip that around.

What local models give you

  • Predictable cost – free once downloaded
  • Offline use – planes, trains, bad Wi-Fi, no problem
  • Consistency – the model you download stays the same
  • Freedom to experiment – no “is this prompt worth it?” anxiety

Local models aren’t a replacement for cloud AI. They’re a second gear — especially good for cleanup, rewriting, extraction, classification, and structured work inside spreadsheets.

What you’ll install (just one thing)

To run local models, you’ll install a small helper app called Ollama.

Ollama handles:

  • downloading models
  • running them in the background
  • making them available to apps like Bwocks

You don’t need to configure servers or write code.

Step 1: Download Ollama

Go to: 👉 https://ollama.com

Download the version for your operating system:

  • macOS: download the .dmg
  • Windows: download the installer
  • Linux: follow the instructions on the site

Install it like any other app.

Once installed:

  • On macOS: open Ollama from Applications
  • On Windows: launch Ollama from the Start menu

Ollama should now be running quietly in the background.

Step 2: Open your terminal (don’t worry)

You’ll use the terminal once to download a model.

On macOS

  1. Open Spotlight (⌘ + Space)
  2. Type Terminal
  3. Press Enter

On Windows

  1. Open the Start menu
  2. Search for Command Prompt or PowerShell
  3. Open it

That’s it — you’re in.

Step 3: Download a model

In the terminal, type:

ollama pull gemma3:4b

(gemma3:4b here is just an example; the model name is the part of the command you'd swap out to download a different model. We'll get into choosing a model a bit farther down the page.)

Press Enter.

This will:

  • download the model
  • store it locally on your computer
  • make it available instantly once finished

Depending on your internet speed, this can take a few minutes.

💡 You only do this once per model.
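If you'd like to confirm the download worked, `ollama list` shows every model installed on your machine. (The fallback message below is just a hint for this guide, not something Ollama prints.)

```shell
# Confirm the model downloaded correctly.
# The name shown by `ollama list` is exactly what you'll type into Bwocks later.
model="gemma3:4b"
ollama list 2>/dev/null | grep -q "$model" \
  && echo "$model is installed" \
  || echo "$model not found yet -- try: ollama pull $model"
```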

Step 4: Make sure Ollama is running

Ollama needs to be running in the background.

  • On macOS: make sure the Ollama app is open
  • On Windows: make sure Ollama is running in the system tray

If it’s running, you’re good.
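If you'd like a definitive check, Ollama serves a tiny status page on your own machine. Port 11434 is Ollama's default; nothing here touches the internet.

```shell
# Ollama listens locally on port 11434 by default.
# If the service is up, this prints "Ollama is running".
url="http://localhost:11434"
curl -s "$url" || echo "No response -- open the Ollama app and try again."
```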

Step 5: Enable local models in Bwocks

Now switch to Bwocks.

  1. Open Settings
  2. Enable Local LLMs
  3. Enter the model name exactly as downloaded
    • Example: gemma3:4b
  4. Save

That’s it.

From now on, you can select that local model in:

  • AI Columns
  • AI Cleanup
  • Image or text generation (depending on the model)

Local and cloud models work the same way inside Bwocks.

What performance should you expect?

Local models are slower than cloud AI — that’s normal. To understand what that means, here’s a quick explanation.

A token is roughly:

  • 3–4 characters of English text, or
  • about ¾ of a word

So “tokens per second” is basically “how fast text appears on screen.”
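If you want to translate a tokens-per-second figure into something tangible, the arithmetic is simple. The 3.5 characters and 0.75 words per token below are rough English-text averages taken from the estimates above:

```shell
# Convert tokens/sec into chars/sec and words/sec.
# Assumes ~3.5 characters and ~0.75 words per token (rough English averages).
tps=10
chars=$(awk -v t="$tps" 'BEGIN { printf "%.0f", t * 3.5 }')
words=$(awk -v t="$tps" 'BEGIN { printf "%.1f", t * 0.75 }')
echo "$tps tokens/sec is about $chars chars/sec, or $words words/sec"
```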

Typical local model performance (7–8B models)

Machine                                | Tokens/sec | Approx. chars/sec | What it feels like
MacBook Air (M1 / M2)                  | 5–15       | ~15–60            | Text appears steadily, line by line
MacBook Pro (M1 / M2 Pro / Max)        | 10–25      | ~30–100           | Smooth and readable
Mid-range Windows laptop (CPU-only)    | 3–10       | ~10–40            | Slower, but usable
Gaming PC / Alienware (dedicated GPU)  | 25–60+     | ~75–240+          | Feels fast, close to cloud

For comparison, cloud models often run at 50–150+ tokens per second.

Local models trade speed for:

  • zero per-token cost
  • offline reliability
  • predictable behavior

In spreadsheets — where you’re often running AI across many rows — this tradeoff is usually worth it.
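For a concrete sense of that tradeoff, you can estimate how long a batch job will take. The row count, tokens per row, and speed below are illustrative assumptions; plug in your own numbers:

```shell
# Estimate total runtime for an AI column across many rows.
# 100 output tokens per row and 15 tokens/sec are illustrative assumptions.
rows=200
tokens_per_row=100
tps=15
minutes=$(awk -v r="$rows" -v t="$tokens_per_row" -v s="$tps" \
  'BEGIN { printf "%.1f", (r * t) / s / 60 }')
echo "~$minutes minutes for $rows rows"   # prints: ~22.2 minutes for 200 rows
```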

Choosing the right model (simple guidance)

You don’t need to try everything. Start small.

A great default: gemma3:4b

This is a fantastic all-around model:

  • fast on most machines
  • surprisingly creative
  • excellent at cleanup, rewriting, extraction, and classification
  • runs very well on gaming laptops and modern Macs

This is the model many people stick with for day-to-day work.

Other common choices (optional)

  • Mistral 7B – solid instruction following, lightweight
  • Llama 3 (larger variants) – better reasoning, slower
  • Code-focused models – useful if you’re generating or analyzing code
  • Vision models – for image + text workflows

Rule of thumb:

  • Smaller models = faster, cheaper, more predictable
  • Larger models = slower, smarter, heavier

You can always download more later.
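One practical way to apply that rule of thumb: a model's memory footprint is roughly its parameter count times the bytes stored per weight (about 0.5 bytes per weight for the 4-bit quantizations Ollama typically ships). This is a rough heuristic, not an exact figure:

```shell
# Rough RAM/VRAM estimate: parameters (billions) x bytes per weight.
# 4-bit quantization stores ~0.5 bytes per weight; real usage adds overhead.
params_billions=4
bytes_per_weight=0.5
gb=$(awk -v p="$params_billions" -v b="$bytes_per_weight" \
  'BEGIN { printf "%.1f", p * b }')
echo "A ${params_billions}B model needs roughly ${gb} GB, plus overhead"
```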

When local models shine (and when they don’t)

Local models are especially good for:

  • AI Cleanup
  • text normalization
  • extraction into columns
  • rewriting and summarizing
  • structured, repeatable tasks

Cloud models are better when:

  • speed matters most
  • you need deep reasoning
  • you’re doing complex, one-off work

Bwocks lets you use both, side by side.

Final thoughts

Running AI locally used to feel like something only developers did.

That’s no longer true.

With tools like Ollama and Bwocks:

  • setup takes minutes
  • usage feels familiar
  • and the benefits compound quickly

You don’t have to switch everything to local models. But having the option — especially for cleanup, iteration, and offline work — changes how freely you can think and experiment.

And that’s the real win.