Turn old phones into AI infrastructure

Phonon turns old Android phones into a private, local AI inference cluster. Alpha — the project works, and we need your help getting it to beta. Test it, break it, tell us what's missing.

Alpha

Works, evolving — help us ship

Open

MIT, fully permissive

Self-hosted, always free

Help

Wanted — testers, phones, code

What it is

Your phones, your inference

No cloud, no subscription, no data leaving your network. A cluster built from phones you already own — or old ones gathering dust in a drawer.

🔒

Total Privacy

Inference runs entirely on your local network. Your prompts, data, and models never touch an external server unless you choose to join the public pool.

♻️

Zero E-Waste

Every old phone with an NPU is a compute node waiting to work. Phonon gives devices a second life as inference hardware instead of landfill.

⚡

NPU-Accelerated

LiteRT-LM runs models directly on Pixel Tensor and Qualcomm Hexagon NPUs, so inference uses the accelerator already built into the phone.

🔌

OpenAI-Compatible API

Point any agent framework at your Phonon cluster via the standard /v1/chat/completions API. No special integration needed.

📱

Any Android Phone

Pixels, Galaxies, OnePlus — if it runs Android 14+ with an NPU, it participates. Mix and match models across devices.

🌐

Stays on Your Network

No telemetry, no phone-home, no cloud dependency. Your prompts and models never touch an external server.

This is early days

We need you to make this work

Phonon works. The coordinator runs, phones pair, inference happens. But alpha software needs people testing it on real hardware in real networks. That's where you come in.

Alpha

Test it, break it, report it

$0

To try it yourself

Open

MIT, fully permissive

Now

The best time to start

How to help

Three ways to contribute

The fastest path to beta is people like you using it, testing it, and helping us improve it.

🔧

Test and Report

Set up a cluster on your hardware. Try different phones, different networks, different models. Open issues for everything that doesn't work — and everything that does.

📱

Donate a Phone

Have an old Android phone gathering dust? Send it in. More hardware diversity means more regression coverage and a faster path to beta.

💻

Contribute Code

The coordinator is Go, the sidecar is Kotlin, the visualizer is Compose Canvas. Pick an issue, open a PR. MIT — no CLA, no friction.

Try it. Break it. Tell us.

Alpha software is a conversation. Run it on your hardware, open issues for everything you find, and help us ship something solid together.

Architecture

How Phonon works

A coordinator orchestrates a fleet of Android phones. Each phone runs a sidecar service that loads models, accepts inference requests, and reports health — all over your local network.

1 The Coordinator

A single Go binary runs on any Linux device on your network — a Raspberry Pi, an Odroid, a spare laptop, or even a cloud VM. It serves an OpenAI-compatible API, discovers phones via mDNS, load-balances requests across available devices, and exposes a web UI for management.

$ phonon coordinator --config cluster.yaml
# Coordinator starts, begins mDNS discovery

OpenAI-compatible API (/v1/chat/completions)
mDNS service discovery — phones announce themselves
Round-robin load balancing across phones
Health monitoring with Prometheus metrics
YAML-based declarative configuration
Embedded web UI
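
The round-robin balancing in the list above can be pictured with a short sketch. This is an illustration in Python, not the coordinator's Go code; the `Phone` and `RoundRobinPool` names, the `healthy` flag, and the device names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Phone:
    """A hypothetical record for one paired phone."""
    name: str
    healthy: bool = True


class RoundRobinPool:
    """Hands out phones in rotation, skipping any marked unhealthy."""

    def __init__(self, phones: List[Phone]) -> None:
        self.phones = phones
        self._next = 0  # index of the phone to try first on the next call

    def pick(self) -> Optional[Phone]:
        # Try each phone at most once per call, starting at the cursor.
        for offset in range(len(self.phones)):
            index = (self._next + offset) % len(self.phones)
            phone = self.phones[index]
            if phone.healthy:
                # Advance the cursor past the phone we just chose.
                self._next = (index + 1) % len(self.phones)
                return phone
        return None  # empty pool, or every phone is unhealthy


pool = RoundRobinPool([Phone("pixel-8"), Phone("galaxy-s24"), Phone("oneplus-12")])
pool.phones[1].healthy = False  # pretend this phone stopped reporting health
order = [pool.pick().name for _ in range(4)]
print(order)  # ['pixel-8', 'oneplus-12', 'pixel-8', 'oneplus-12']
```

Skipping unhealthy phones inside the rotation keeps the request order predictable while a device is offline; whether the real coordinator combines health and rotation this way is not stated here.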

2 The Phone Sidecar

Each phone runs a Kotlin foreground service that connects to the coordinator, downloads models, and runs inference using LiteRT-LM — Google's NPU-first inference runtime. The sidecar reports battery level, temperature, and processing state every heartbeat.

Runs as a persistent foreground service with a notification
Inference via LiteRT-LM on Pixel Tensor and Qualcomm Hexagon NPUs
Heartbeat and health reporting every 30 seconds
Model download, cache, and management
Battery-aware: reduces load when unplugged and below 20%
Pre-flight readiness checks before joining the pool
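
To make the battery and heartbeat rules above concrete, here is a small sketch in Python. The 20% threshold and the 30-second interval come from the list; everything else is an assumption for illustration, including the `Heartbeat` field names and the choice to treat a phone as stale after three missed heartbeats. The actual sidecar is Kotlin and may work differently.

```python
from dataclasses import dataclass

HEARTBEAT_INTERVAL_S = 30      # from the feature list above
LOW_BATTERY_PERCENT = 20       # from the feature list above
MISSED_BEATS_BEFORE_STALE = 3  # assumption: not specified by the project


@dataclass
class Heartbeat:
    """Hypothetical heartbeat payload; the field names are illustrative."""
    battery_percent: int
    charging: bool
    temperature_c: float
    busy: bool
    received_at: float  # seconds, on the receiver's clock


def should_reduce_load(beat: Heartbeat) -> bool:
    """True when the phone is unplugged AND below the low-battery threshold."""
    return (not beat.charging) and beat.battery_percent < LOW_BATTERY_PERCENT


def is_stale(beat: Heartbeat, now: float) -> bool:
    """True when too many heartbeat intervals have passed without an update."""
    return now - beat.received_at > HEARTBEAT_INTERVAL_S * MISSED_BEATS_BEFORE_STALE


unplugged_low = Heartbeat(15, charging=False, temperature_c=31.0, busy=False, received_at=0.0)
plugged_low = Heartbeat(15, charging=True, temperature_c=31.0, busy=False, received_at=0.0)

print(should_reduce_load(unplugged_low))    # True
print(should_reduce_load(plugged_low))      # False
print(is_stale(unplugged_low, now=60.0))    # False
print(is_stale(unplugged_low, now=120.0))   # True
```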

3 Deployment Modes

Pool Mode — Available Now

Each phone runs independently. The coordinator load-balances requests round-robin. All phones work in parallel. Best for high-throughput inference with smaller models.

Shard Mode — In Development

Multiple phones collectively run one large model via pipelined-ring parallelism. The coordinator routes to the group as a single logical device. Enables running models larger than any single phone can handle.
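
Shard Mode is still in development, so the following is only a way to picture pipelined sharding, not a description of how Phonon does it. The sketch assumes a model is a stack of numbered layers, splits it into contiguous stages (one per phone), and links the phones so each forwards to the next and the last wraps back to the first. The function names and the even-split rule are hypothetical.

```python
from typing import Dict, List, Tuple


def split_layers(num_layers: int, phones: List[str]) -> Dict[str, Tuple[int, int]]:
    """Assign each phone a contiguous half-open layer range [start, end).

    Layers are spread as evenly as possible; earlier phones take one extra
    layer when the count does not divide evenly.
    """
    if not phones:
        raise ValueError("at least one phone is required")
    base, extra = divmod(num_layers, len(phones))
    ranges: Dict[str, Tuple[int, int]] = {}
    start = 0
    for i, phone in enumerate(phones):
        size = base + (1 if i < extra else 0)
        ranges[phone] = (start, start + size)
        start += size
    return ranges


def ring_successors(phones: List[str]) -> Dict[str, str]:
    """Map each phone to the next one in the ring; the last wraps to the first."""
    return {p: phones[(i + 1) % len(phones)] for i, p in enumerate(phones)}


phones = ["phone-a", "phone-b", "phone-c"]
print(split_layers(26, phones))
# {'phone-a': (0, 9), 'phone-b': (9, 18), 'phone-c': (18, 26)}
print(ring_successors(phones))
# {'phone-a': 'phone-b', 'phone-b': 'phone-c', 'phone-c': 'phone-a'}
```

A real scheduler would likely weight the split by each phone's memory and speed rather than dividing evenly.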

4 The Visualization Engine

Every phone screen becomes a live telemetry display. The visualization engine renders device state — inference activity, token rate, temperature, battery — as real-time animations that make a rack of phones informative and beautiful.

Theme packs are community-contributed Kotlin files that implement the VisualizationPack interface. Each pack receives a VizState snapshot every frame and renders using that data. Built-in packs include Synthwave, Honeycomb, Veil, Neon Ring, and more.

Try the live visualizer to see them in action.
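
As a rough picture of what a pack does each frame, the sketch below maps a state snapshot to drawing parameters. The real interface is Kotlin (`VisualizationPack`, `VizState`) and its fields are not documented on this page, so the Python field names, value ranges, and the 30 to 45 °C temperature window are all assumptions. The mapping is loosely modeled on how the gallery describes Matrix Rain: speed follows workload, brightness follows battery, and hue shifts toward red as the phone heats up.

```python
from dataclasses import dataclass


@dataclass
class VizState:
    """Hypothetical snapshot; the real Kotlin VizState fields may differ."""
    load: float             # 0.0 (idle) to 1.0 (fully busy)
    battery_percent: float  # 0 to 100
    temperature_c: float


def clamp01(x: float) -> float:
    """Clamp a value into the range [0, 1]."""
    return max(0.0, min(1.0, x))


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation from a (t=0) to b (t=1)."""
    return a + (b - a) * t


def rain_params(state: VizState) -> dict:
    """Map device state to drawing parameters for one frame."""
    # Assumed window of 30-45 C; hotter readings saturate at 1.0.
    heat = clamp01((state.temperature_c - 30.0) / 15.0)
    return {
        "speed": lerp(0.2, 1.0, clamp01(state.load)),  # faster when busy
        "brightness": lerp(0.3, 1.0, clamp01(state.battery_percent / 100.0)),
        "hue_degrees": lerp(120.0, 0.0, heat),  # green when cool, red when hot
    }


print(rain_params(VizState(load=0.0, battery_percent=100.0, temperature_c=30.0)))
print(rain_params(VizState(load=1.0, battery_percent=0.0, temperature_c=45.0)))
```

Keeping the mapping a pure function of the snapshot makes a pack easy to prototype and test away from a phone.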

Visualization Packs

Theme gallery

Each pack is a self-contained Canvas 2D renderer in the sidecar APK, mirrored in the JS bench for in-browser prototyping. Screenshots below show each theme in its mid-activity state.

[Neon Ring screenshot]

Neon Ring

A living reactor core: palette, turbulence, orbital collisions and energy arcs shift smoothly from calm idle to a roiling high-inference state.

[Bioluminescent Dreamscape screenshot]

Bioluminescent Dreamscape

Free-swimming bioluminescent organisms drift through the abyss, their luminous bodies tracing sinuous paths in the dark.

[Matrix Rain screenshot]

Matrix Rain

CRT phosphor rain whose speed tracks workload, brightness tracks battery, and hue bends toward red as the phone heats up.

[Cyber HUD screenshot]

Cyber HUD

Tactical heads-up display: perspective wireframe, reactive corner brackets, a radar mapping live peers, and an oscilloscope tracing inference load.

[LCARS screenshot]

LCARS

A structurally faithful Star Trek: TNG Okudagram console — unit-grid elbows, half-pill end-caps, numbered elements, and bracket-grouped readouts.

Build your own pack

Grab the JS bench, prototype something, port it to Kotlin, and open a PR.

Bench on GitHub → Theme Pack Issues →

Contribute

Help us get to beta

Phonon is alpha software that works. The fastest path to a stable, reliable beta is people like you testing it, reporting issues, donating hardware, and contributing code.

🔧 Test and report

This is the single most valuable thing you can do. Set up a Phonon cluster on whatever hardware you have. Try different phones, network configurations, and AI models. Open an issue for everything you find: every crash, every rough edge, every unclear error message.

Even reporting "it worked perfectly on my hardware" is valuable. It tells us what's stable and what combinations need more testing.

Open an issue on GitHub →

📱 Donate a phone

Have an old Android phone sitting in a drawer? Every phone we receive gets wiped, tested, and added to our testing cluster. More hardware diversity means better regression coverage.

For now, reach out via GitHub Issues to coordinate shipping.

💻 Contribute code

The full stack is open source under MIT:

Coordinator — Go HTTP server, mDNS discovery, load balancing
Sidecar — Kotlin foreground service, LiteRT-LM inference
Visualizer — Compose Canvas theme packs
Website — right here, single-file HTML, solarpunk

Browse the open issues, pick one, and open a PR. No CLA, no bureaucracy.

View the repo on GitHub →

📢 Spread the word

Star the repo. Tell your friends. Write about what you built with your phone cluster. The more people know about Phonon, the more testers and contributors we'll find.

Ready to jump in?

The project is small, the issues are real, and every contribution matters.

Documentation

Getting started with Phonon

Overview

Phonon turns old Android phones into a private AI inference cluster. A single Go coordinator binary runs on any Linux device on your network (Raspberry Pi, Odroid, spare laptop, cloud VM). One or more Android phones with the Phonon sidecar APK connect to the coordinator and form an inference pool.

The coordinator serves an OpenAI-compatible API. Any tool that can call /v1/chat/completions can use Phonon — no special integration needed.

Hardware Requirements

Coordinator

Any Linux device (Raspberry Pi, Odroid, x86 machine)
1 GB RAM minimum, 2 GB+ recommended
Network access to your phone fleet (same LAN or Tailscale)
Docker or Go runtime

Phones

Android 14+
NPU recommended (Pixel 6+, Galaxy S24+, OnePlus 12+ — any device with a Tensor or Hexagon NPU)
4 GB+ RAM
Available via ADB or network pairing

Installation

1. Install the coordinator

docker pull chezgoulet/phonon:latest
docker run -d \
  --name phonon \
  -p 8080:8080 \
  -v /path/to/config.yaml:/etc/phonon/config.yaml \
  chezgoulet/phonon:latest

Or run the binary directly:

# Download the latest arm64 release binary from GitHub
wget https://github.com/chezgoulet/phonon/releases/latest/download/phonon-linux-arm64
chmod +x phonon-linux-arm64
./phonon-linux-arm64 coordinator --config config.yaml

2. Install the APK on each phone

Recommended (Obtainium): Add the GitHub releases URL to Obtainium for automatic updates. The APK is release-signed and compatible with sideloading.

Manual: Download the APK from the latest GitHub Release and sideload via ADB.

3. Pair phones to the coordinator

Open the Phonon sidecar app. It discovers the coordinator via mDNS automatically. Accept the pairing prompt. The phone is now part of your cluster.

Configuration

The coordinator uses a YAML configuration file. Here's a minimal example:

cluster:
  name: "home-pool"
  listen: ":8080"

discovery:
  mdns: true

models:
  - name: "gemma-4-e2b"
    url: "https://huggingface.co/google/gemma-4-e2b-litert"
    min_ram_mb: 4096

telemetry:
  prometheus: true
  event_log: "sqlite:///var/lib/phonon/events.db"

Usage

Once your cluster is running, point any OpenAI-compatible client at the coordinator:

curl http://phonon.local:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-4-e2b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
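
The same request can be made from code. Below is a minimal sketch using only Python's standard library. It assumes the coordinator is reachable at `http://phonon.local:8080`, as in the curl example, and that the response follows the usual OpenAI chat-completions shape (`choices[0].message.content`). The helper names are ours, not part of Phonon.

```python
import json
import urllib.request

BASE_URL = "http://phonon.local:8080/v1"  # same host and port as the curl example


def build_request(prompt: str, model: str = "gemma-4-e2b") -> urllib.request.Request:
    """Build (but do not send) a POST to /chat/completions."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body: bytes) -> str:
    """Pull the assistant text out of an OpenAI-style response body."""
    payload = json.loads(response_body)
    return payload["choices"][0]["message"]["content"]


def ask(prompt: str, timeout_s: float = 120.0) -> str:
    """Send the prompt to the cluster and return the reply text."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout_s) as resp:
        return extract_reply(resp.read())


# With a running cluster: print(ask("Hello!"))
```

Splitting request building from sending makes it easy to inspect the payload without a cluster. The generous default timeout is a guess, on the assumption that phone inference can be slow.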

Theme Packs

Phonon's sidecar includes a visualization engine that renders phone state (inference load, temperature, battery) as animated displays. Theme packs are community-contributed Kotlin files that implement the VisualizationPack interface.

Built-in themes: Synthwave, Honeycomb, Veil, Cyber HUD, Neon Ring, Matrix Rain, Bioluminescent Dreamscape, LCARS. Submit your own via PR.

Troubleshooting

Phone not discovered

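Check that the phone and coordinator are on the same network. Verify that your router or access point passes multicast (mDNS) traffic between devices; guest networks and client isolation often block it. Try restarting the sidecar app.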
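
One way to separate discovery problems from basic connectivity problems is to check, from another machine on the same LAN, whether the coordinator's name resolves and its port accepts connections. The sketch below assumes the `phonon.local` hostname and port 8080 from the Usage example; resolving a `.local` name also depends on the operating system having an mDNS resolver. The function name is ours.

```python
import socket


def coordinator_reachable(host: str = "phonon.local", port: int = 8080,
                          timeout_s: float = 3.0) -> bool:
    """Return True if `host` resolves and accepts a TCP connection on `port`."""
    try:
        # create_connection resolves the name and tries each returned address.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers name-resolution failures, refused connections, and timeouts.
        return False


# Example: print(coordinator_reachable())
```

If this returns True but the phone still does not appear, the coordinator is up and the issue is more likely multicast discovery between the phone and the coordinator.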

Model download fails

Ensure the phone has internet access for the initial download. Models are cached after download. Check available storage space.

Poor inference performance

Check battery level — phones below 20% and not charging enter degraded mode. Check temperature — thermal throttling reduces performance. Verify the phone has an NPU and that LiteRT-LM is using it.

Blog

Latest from Phonon

Engineering deep-dives, release notes, and the philosophy behind turning e-waste into AI infrastructure.

Coming soon.

Release notes, deep-dives, and project essays will live here.

About

Why Phonon exists

AI inference should be private, sustainable, and accessible. Phonon uses hardware that already exists to make that possible.

🌱 The problem

AI inference is expensive and centralized, and it generates e-waste. Cloud providers charge per token at margins that reflect datacenter costs, not actual compute value. Hardware turns over every few years, and each upgrade cycle creates more specialized waste.

Meanwhile, billions of smartphones sit in drawers. They have screens, batteries, networking, cooling, and NPUs — all the infrastructure needed to run inference. They're already paid for. They already exist.

💡 The insight

Phonon started with a simple observation: the hardware needed for AI inference is already in people's pockets and drawers. The question isn't whether phones can do inference — it's whether we can build the software to make it practical.

That's what Phonon is: a Go coordinator and an Android sidecar APK that turn a supported Android phone into a private inference node. No cloud required. No new hardware needed.

🔧 The architecture

Phonon is two components: a Go coordinator and an Android sidecar APK. The coordinator serves an OpenAI-compatible API and load-balances across phones. The sidecar runs inference via LiteRT-LM. Everything communicates over your local network. No telemetry. No phone-home.

👤 The team

Phonon is built by a single human and two AI agents — Vigilant (systems) and Dev (implementation). A one-person operation running on unfashionable hardware.

Based in Burlington, Vermont. Moving to Sherbrooke, Quebec. The server runs on a Linode VPS in Toronto and an Odroid at home. The phones come from eBay and Swappa.

⚖️ License

Phonon is MIT — fully permissive. The coordinator, sidecar, theme packs, and website are all open source. No CLA, no friction. Self-hosted is always fully functional.

Quickstart

Get your cluster running

One coordinator, a few phones, and you'll have private inference running on your local network in about 15 minutes.

1. Get the coordinator running

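Pull the Docker image on any Linux device and start it:
docker pull chezgoulet/phonon:latest
docker run -d --name phonon -p 8080:8080 \
  -v /path/to/config.yaml:/etc/phonon/config.yaml \
  chezgoulet/phonon:latest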

2. Install the APK

Download the latest release APK from GitHub and sideload it onto each phone. Use Obtainium for automatic updates.

3. Pair and verify

Open the app on each phone. It discovers the coordinator via mDNS automatically. Accept the pairing prompt. Your cluster is live.

4. Point your agent at it

Any tool that speaks the OpenAI API can use Phonon. Point it at http://phonon.local:8080/v1.

5. Watch it work

Each phone screen turns into a live telemetry display. Pick a theme pack or write your own.

6. Tell us how it went

Open an issue with what worked, what didn't, and what confused you. Every report makes the project better.

Need help?

Open an issue on GitHub or browse the existing ones. The project is small — you'll get a response.
