Construct Your Personal Native AI Coding Agent with Gemma 4 and OpenCode

Encoding Categorical Knowledge for Outlier Detection

Reconstructing the Desk of Contents a PDF Forgot to Ship, So RAG Can Scope by Part

at the moment are a part of regular growth work.

Many individuals use them via cloud-hosted fashions, because it’s simply handy, and really succesful fashions can be utilized.

However in the case of value management, or for those who don’t wish to ship your code to the cloud for privateness issues, or you’re experimenting and wish to higher perceive how the agent stack really works, you would possibly wish to strive a neighborhood setup.

That is what this publish is about. Right here, we’ll arrange a neighborhood coding agent with three items:

Ollama, for serving the mannequin;
Gemma 4, because the native LLM;
OpenCode, because the agent interface.

By the tip, we’ll have OpenCode related to a neighborhood LLM.

Determine 1. The general structure. (Picture by writer)

1. Set up Ollama

We begin by putting in Ollama, which is able to serve the Gemma 4 mannequin domestically.

In the event you haven’t used it earlier than, Ollama is a runtime for downloading, working, and serving native language fashions from your personal machine. As soon as it’s arrange, Ollama exposes a neighborhood API endpoint. This fashion, different instruments (e.g., OpenCode) can discuss to the mannequin instantly.

On Home windows machines, you are able to do that from the official installer:

https://ollama.com/obtain

Alternatively, you can too set up it from PowerShell through the use of winget:

winget set up Ollama.Ollama

After set up, you need to have the ability to see the Ollama from the Home windows Begin menu. You’ll be able to launch it like every other app. As soon as it’s working, you need to see the Ollama icon within the system tray, and this implies the native Ollama service is working within the background.

Determine 2. Ollama App interface. (Picture by writer)

As well as, you possibly can open a brand new PowerShell window and test if the Ollama CLI is obtainable:

ollama --version

If you’re on a Linux machine, you possibly can set up Ollama with:

"curl ‒fsSL https://ollama.com/set up.sh | sh"

After set up, test if Ollama is obtainable:

ollama --version

As soon as Ollama is put in, it runs a neighborhood server in your machine. Later, OpenCode will discuss to this native Ollama server as a substitute of calling a cloud mannequin supplier.

2. Obtain Gemma 4

Subsequent, we put together a neighborhood LLM. For this publish, we’ll use Gemma 4.

Gemma 4 is a brand new open mannequin launched by Google on April 2, 2026. This mannequin is designed for reasoning, coding, multimodal understanding, and agentic workflows.

It is available in a number of sizes, together with smaller edge-oriented variants and bigger workstation-oriented variants. Since this publish is about working the mannequin domestically on a laptop computer, we’ll arrange the edge-friendly variants, i.e., the E2B (gemma4:e2b) and E4B (gemma4:e4b) variants.

In Ollama’s naming, the E stands for “efficient” parameters.

For this walkthrough, I take advantage of the E4B mannequin because it offers extra functionality. In PowerShell:

ollama pull gemma4:e4b

On Linux, use the identical command:

ollama pull gemma4:e4b

You’ll be able to test the downloaded mannequin:

ollama listing

On my machine, Ollama reviews the next:

gemma4:e4b    9.6 GB

For reference, my laptop computer has an Intel i7-13800H CPU, 32 GB RAM, and an NVIDIA RTX 2000 Ada Laptop computer GPU with about 8 GB VRAM. You’ll be able to select gemma4:e2b as a substitute if E4B feels too sluggish.

A couple of technical notes right here. The model of gemma4:e4b that we downloaded earlier is a 4-bit quantized mannequin, with GGUF because the native mannequin format utilized by Ollama runtimes. On my machine, Ollama reviews gemma4:e4b helps with a 128K context size.

Earlier than shifting to the following step, we will do a fast take a look at:

ollama run gemma4:e4b "what is the capital of France?"

In the event you get “Paris” again, then congratulations, Gemma 4 is now out there in your native machine via Ollama.

Observe that the primary name might be sluggish as a result of Ollama has to load the mannequin. As soon as the mannequin is heat, the following prompts ought to reply quicker.

3. Set up OpenCode

Subsequent, we’d like an agent interface. We’ll use OpenCode for that.

You probably have used instruments like Claude Code or Codex, OpenCode belongs to the identical broad class. You’ll be able to consider it as an agent runtime that may function inside a neighborhood repo, examine information, run instructions, and carry out numerous duties.

An necessary distinction that issues for us is that OpenCode is open-source and agnostic about LLM suppliers. You’ll be able to join it to cloud fashions (e.g., Claude/GPT/Gemini fashions), or you possibly can join it to a neighborhood mannequin served by Ollama.

That’s precisely what we’ll do right here.

If you’re on a Home windows machine, you’d must first set up Node.js. You are able to do so by way of:

winget set up OpenJS.NodeJS.LTS

On Linux, you are able to do:

sudo apt replace
sudo apt set up -y nodejs npm

After set up, you need to open a brand new PowerShell window and confirm if each node and npm can be found:

node --version
npm --version

Now we will set up OpenCode:

npm set up -g opencode-ai

Then confirm the set up:

opencode --version

At this level, OpenCode is put in. You’ll be able to merely launch the interactive OpenCode TUI (terminal UI) from any challenge folder by working:

opencode

Determine 3. OpenCode TUI. (Picture by writer)

4. Join OpenCode to Gemma 4

By default, OpenCode doesn’t know which mannequin we wish to use. Due to this fact, we have to level it to the Gemma 4 mannequin, served by Ollama.

Let’s first create an Ollama mannequin tag with the total context window (128K) enabled. That is necessary as a result of we wish to make certain the agent can work correctly with out being truncated in context.

We are able to try this with a small Ollama Modelfile. Particularly, we will create a file known as gemma4-e4b-128k.Modelfile within the folder/repo we wish to work with:

FROM gemma4:e4b
PARAMETER num_ctx 131072

Then, within the command line, we create a brand new Ollama tag by:

ollama create gemma4:e4b-128k -f gemma4-e4b-128k.Modelfile

One thing to level out: this may not set off a brand new mannequin downloading! It simply creates an Ollama profile that makes use of the identical Gemma 4 E4B mannequin, however explicitly units the runtime context window to 128K.

Okay, we will proceed to attach OpenCode to the Gemma 4 mannequin. For that, we have to create an opencode.json file within the challenge folder:

{
  "$schema": "https://opencode.ai/config.json",
  "supplier": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "title": "Ollama (native)",
      "choices": {
        "baseURL": "http://localhost:11434/v1"
      },
      "fashions": {
        "gemma4:e4b-128k": {
          "title": "Gemma 4 E4B 128K"
        }
      }
    }
  },
  "mannequin": "ollama/gemma4:e4b-128k"
}

Two necessary items right here:

First, OpenCode talks to Ollama via Ollama’s native OpenAI-compatible endpoint:

http://localhost:11434/v1

Second, notice that we set the mannequin title by following OpenCode’s supplier/mannequin format:

ollama/gemma4:e4b-128k

You utilize our newly created mannequin tag above.

Now, for those who launch OpenCode from the identical challenge folder by way of:

opencode

You must see gemma4:e4b-128k listed.

Determine 4. OpenCode related to the native Gemma 4 mannequin. (Picture by writer)

Now we’re all arrange!

5. What Can You Do With This Setup?

With OpenCode TUI launched, you possibly can take a look at your setup by asking the agent to do just a few duties. For instance, you possibly can ask the agent to write down a README file, clarify particular features, create testing scripts, and so on.

Actually, past coding, you can too ask the agent to do many workspace duties, comparable to file manipulations, content material extractions, and so forth.

OpenCode additionally offers you room to develop the setup. You may also join instruments to the agent, set up agent abilities with SKILL.md, and outline specialised brokers with AGENTS.md.

What’s extra, you possibly can run duties from the command line with:

opencode run "Summarize this repository."

For extra programmatic use, OpenCode also can run as a server, so the TUI isn’t the one interface.

And right here is a very powerful factor: all of your knowledge stays totally native.

You could find related OpenCode docs right here:

CLI: https://opencode.ai/docs/cli/

Abilities: https://opencode.ai/docs/abilities/

MCP: https://opencode.ai/docs/mcp-servers/

Server mode: https://opencode.ai/docs/server/