Turn old phones into AI infrastructure

Phonon turns old Android phones into a private, local AI inference cluster. Alpha — the project works, and we need your help getting it to beta. Test it, break it, tell us what's missing.

Alpha
Works, evolving — help us ship
Open
MIT, fully permissive
$0
Self-hosted, always free
Help
Wanted — testers, phones, code

Your phones, your inference

No cloud, no subscription, no data leaving your network. A cluster built from phones you already own — or old ones gathering dust in a drawer.

🔒

Total Privacy

Inference runs entirely on your local network. Your prompts, your data, your models — never touch an external server unless you choose to join the public pool.

♻️

Zero E-Waste

Every old phone with an NPU is a compute node waiting to work. Phonon gives devices a second life as inference hardware instead of landfill.

NPU-Accelerated

LiteRT-LM runs models directly on Pixel Tensor and Qualcomm Hexagon NPUs. Inference that uses the hardware already in the phone.

🔌

OpenAI-Compatible API

Point any agent framework at your Phonon cluster via the standard /v1/chat/completions API. No special integration needed.

📱

Any Android Phone

Pixels, Galaxies, OnePlus — if it runs Android 14+ with an NPU, it participates. Mix and match models across devices.

🌐

Stays on Your Network

No telemetry, no phone-home, no cloud dependency. Prompts and models stay on your local network unless you opt in to the public pool.

We need you to make this work

Phonon works. The coordinator runs, phones pair, inference happens. But alpha software needs people testing it on real hardware in real networks. That's where you come in.

Alpha

Test it, break it, report it

$0

To try it yourself

Open

MIT, fully permissive

Now

The best time to start

Three ways to contribute

The fastest path to beta is people like you using it, testing it, and helping us improve it.

🔧

Test and Report

Set up a cluster on your hardware. Try different phones, different networks, different models. Open issues for everything that doesn't work — and everything that does.

📱

Donate a Phone

Have an old Android phone gathering dust? Send it in. More hardware diversity means more regression coverage and a faster path to beta.

💻

Contribute Code

The coordinator is Go, the sidecar is Kotlin, the visualizer is Compose Canvas. Pick an issue, open a PR. MIT — no CLA, no friction.

Try it. Break it. Tell us.

Alpha software is a conversation. Run it on your hardware, open issues for everything you find, and help us ship something solid together.


How Phonon works

A coordinator orchestrates a fleet of Android phones. Each phone runs a sidecar service that loads models, accepts inference requests, and reports health — all over your local network.

1 The Coordinator

A single Go binary runs on any Linux device on your network — a Raspberry Pi, an Odroid, a spare laptop, or even a cloud VM. It serves an OpenAI-compatible API, discovers phones via mDNS, load-balances requests across available devices, and exposes a web UI for management.

$ phonon coordinator --config cluster.yaml
# Coordinator starts, begins mDNS discovery

  • OpenAI-compatible API (/v1/chat/completions)
  • mDNS service discovery — phones announce themselves
  • Round-robin load balancing across phones
  • Health monitoring with Prometheus metrics
  • YAML-based declarative configuration
  • Embedded web UI
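
The round-robin behaviour listed above can be sketched in a few lines. This is an illustration in Python rather than the coordinator's Go code; the `Phone` record, the `healthy` flag, and the `RoundRobinPool` class are hypothetical names chosen for the example, not Phonon's actual data model.

```python
from dataclasses import dataclass


@dataclass
class Phone:
    """Hypothetical record for a paired phone (not Phonon's real type)."""
    name: str
    healthy: bool = True


class RoundRobinPool:
    """Hands out healthy phones in rotating order."""

    def __init__(self, phones):
        self.phones = list(phones)
        self._next = 0  # index of the phone to try first on the next call

    def pick(self):
        """Return the next healthy phone, or None if no phone is healthy."""
        n = len(self.phones)
        for offset in range(n):
            idx = (self._next + offset) % n
            if self.phones[idx].healthy:
                # Resume after the chosen phone so requests keep rotating.
                self._next = (idx + 1) % n
                return self.phones[idx]
        return None


pool = RoundRobinPool([Phone("pixel-7"), Phone("galaxy-s24"), Phone("oneplus-12")])
pool.phones[1].healthy = False  # simulate a failed health check
order = [pool.pick().name for _ in range(4)]
print(order)
```

Skipping unhealthy phones inside the rotation keeps the remaining devices evenly loaded, which is one plausible way to combine the health monitoring and load balancing bullets.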

2 The Phone Sidecar

Each phone runs a Kotlin foreground service that connects to the coordinator, downloads models, and runs inference using LiteRT-LM, Google's NPU-first inference runtime. The sidecar reports battery level, temperature, and processing state with every heartbeat.

  • Runs as a persistent foreground service with a notification
  • Inference via LiteRT-LM on Pixel Tensor and Qualcomm Hexagon NPUs
  • Heartbeat and health reporting every 30 seconds
  • Model download, cache, and management
  • Battery-aware: reduces load when unplugged and below 20%
  • Pre-flight readiness checks before joining the pool
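
To make the battery-aware rule concrete, here is a small sketch of a heartbeat record and the degraded-mode check. It is written in Python for illustration only (the sidecar itself is Kotlin), and the `Heartbeat` field names and the `is_degraded` helper are assumptions; only the 20% threshold and the "unplugged" condition come from the list above.

```python
import json
from dataclasses import asdict, dataclass

LOW_BATTERY_PERCENT = 20  # threshold stated in the text


@dataclass
class Heartbeat:
    """Hypothetical heartbeat payload; field names are illustrative."""
    battery_percent: int
    charging: bool
    temperature_c: float
    processing: bool


def is_degraded(hb: Heartbeat) -> bool:
    """True when the phone is unplugged and below the low-battery threshold."""
    return (not hb.charging) and hb.battery_percent < LOW_BATTERY_PERCENT


hb = Heartbeat(battery_percent=15, charging=False, temperature_c=34.5, processing=True)
# Serialise the snapshot the way a heartbeat message might carry it.
print(json.dumps(asdict(hb)), is_degraded(hb))
```

A phone that is charging is never treated as degraded in this sketch, even at a low battery level, which matches the "unplugged and below 20%" wording.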

3 Deployment Modes

Pool Mode — Available Now

Each phone runs independently. The coordinator load-balances requests round-robin. All phones work in parallel. Best for high-throughput inference with smaller models.

Shard Mode — In Development

Multiple phones collectively run one large model via pipelined-ring parallelism. The coordinator routes to the group as a single logical device. Enables running models larger than any single phone can handle.
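
Shard Mode is still in development and its scheduler is not described here, so the following is only a generic sketch of the underlying idea: splitting a model's layers into contiguous stages, one per phone, with each stage handing its output to the next phone in the ring. The function names and the 26-layer example are hypothetical.

```python
def partition_layers(num_layers: int, num_phones: int) -> list[range]:
    """Split layer indices 0..num_layers-1 into contiguous stages, one per phone."""
    if num_phones < 1 or num_layers < num_phones:
        raise ValueError("need at least one phone and one layer per phone")
    base, extra = divmod(num_layers, num_phones)
    stages, start = [], 0
    for i in range(num_phones):
        # The first `extra` stages take one additional layer each.
        size = base + (1 if i < extra else 0)
        stages.append(range(start, start + size))
        start += size
    return stages


def next_stage(stage_index: int, num_phones: int) -> int:
    """Index of the phone that receives this stage's output (wraps around the ring)."""
    return (stage_index + 1) % num_phones


stages = partition_layers(26, 4)
for i, layers in enumerate(stages):
    print(f"phone {i}: layers {layers.start}-{layers.stop - 1} -> phone {next_stage(i, len(stages))}")
```

A real implementation would likely weigh stages by each phone's memory and speed rather than splitting evenly; the even split is simply the easiest case to reason about.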

4 The Visualization Engine

Every phone screen becomes a live telemetry display. The visualization engine renders device state — inference activity, token rate, temperature, battery — as real-time animations that make a rack of phones informative and beautiful.

Theme packs are community-contributed Kotlin files that implement the VisualizationPack interface. Each pack receives a VizState snapshot every frame and renders using that data. Built-in packs include Synthwave, Honeycomb, Veil, Neon Ring, and more.

Try the live visualizer to see them in action.
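
As a rough mental model of the pack contract, the sketch below shows a state snapshot going in and a drawing decision coming out on every frame. Real packs are Kotlin files implementing VisualizationPack and drawing on a Compose Canvas; this Python version is only a conceptual stand-in, and the `VizState` fields, the `PulsePack` class, and the RGB return value are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VizState:
    """Illustrative snapshot; the real VizState fields may differ."""
    inference_load: float  # 0.0 (idle) to 1.0 (saturated)
    battery_percent: int
    temperature_c: float
    processing: bool


class PulsePack:
    """Toy pack: maps state to an RGB color instead of drawing to a canvas."""

    name = "pulse"

    def render(self, state: VizState) -> tuple[int, int, int]:
        # Clamp the load so out-of-range inputs cannot produce invalid colors.
        load = min(max(state.inference_load, 0.0), 1.0)
        red = round(255 * load)           # warmer under load
        blue = round(255 * (1.0 - load))  # cooler when idle
        green = 64 if state.processing else 0
        return (red, green, blue)


pack = PulsePack()
busy = pack.render(VizState(inference_load=1.0, battery_percent=80, temperature_c=36.0, processing=True))
idle = pack.render(VizState(inference_load=0.0, battery_percent=80, temperature_c=30.0, processing=False))
print(busy, idle)
```

Keeping the pack a pure function of the snapshot is what lets a tool replay arbitrary states against it without a phone attached.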

Theme packs, live

This is the same benchmark tool used to develop theme packs. Tweak VizState sliders and watch packs react in real time. Each pack is a self-contained Kotlin file in the phonon repo.

🧩 What you're seeing

The bench simulates the theme pack runtime in your browser. Use the controls on the left to change VizState fields — inference load, battery level, temperature, processing status — and watch the selected pack react in real time at ~60 fps.

Each pack you see here is a VisualizationPack implementation in the phonon repo. Prototype new packs in the bench, then port them to Kotlin and open a PR.

  • Pack selector — switch between available themes
  • State sliders — inference load, battery, temperature, queue depth
  • Toggles — processing mode, charging status, low-power mode
  • Live output — rendered at full frame rate

Build your own pack

Grab the bench, prototype something, port it to Kotlin, and open a PR.


Help us get to beta

Phonon is alpha software that works. The fastest path to a stable, reliable beta is people like you testing it, reporting issues, donating hardware, and contributing code.

🔧 Test and report

This is the single most valuable thing you can do. Set up a Phonon cluster on whatever hardware you have. Try different phone models, different network configurations, and different AI models. Open an issue for everything you find: every crash, every rough edge, every unclear error message.

Even reporting "it worked perfectly on my hardware" is valuable. It tells us what's stable and what combinations need more testing.

Open an issue on GitHub →

📱 Donate a phone

Have an old Android phone sitting in a drawer? Every phone we receive gets wiped, tested, and added to our testing cluster. More hardware diversity means better regression coverage.

For now, reach out via GitHub Issues to coordinate shipping.

💻 Contribute code

The full stack is open source under MIT:

  • Coordinator — Go HTTP server, mDNS discovery, load balancing
  • Sidecar — Kotlin foreground service, LiteRT-LM inference
  • Visualizer — Compose Canvas theme packs
  • Website — right here, single-file HTML, solarpunk

Browse the open issues, pick one, and open a PR. No CLA, no bureaucracy.

View the repo on GitHub →

📢 Spread the word

Star the repo. Tell your friends. Write about what you built with your phone cluster. The more people know about Phonon, the more testers and contributors we'll find.

Ready to jump in?

The project is small, the issues are real, and every contribution matters.


Getting started with Phonon

Overview

Phonon turns old Android phones into a private AI inference cluster. A single Go coordinator binary runs on any Linux device on your network (Raspberry Pi, Odroid, spare laptop, cloud VM). One or more Android phones with the Phonon sidecar APK connect to the coordinator and form an inference pool.

The coordinator serves an OpenAI-compatible API. Any tool that can call /v1/chat/completions can use Phonon — no special integration needed.

Hardware Requirements

Coordinator

  • Any Linux device (Raspberry Pi, Odroid, x86 machine)
  • 1 GB RAM minimum, 2 GB+ recommended
  • Network access to your phone fleet (same LAN or Tailscale)
  • Docker, or the standalone release binary

Phones

  • Android 14+
  • NPU recommended (Pixel 6 or newer, Galaxy S24 or newer, OnePlus 12 or newer; any device with a Tensor or Hexagon NPU)
  • 4 GB+ RAM
  • Available via ADB or network pairing

Installation

1. Install the coordinator

docker pull chezgoulet/phonon:latest
docker run -d \
  --name phonon \
  -p 8080:8080 \
  -v /path/to/config.yaml:/etc/phonon/config.yaml \
  chezgoulet/phonon:latest

Or run the binary directly:

# Download the latest release from GitHub (arm64 build shown; use the asset that matches your CPU)
wget https://github.com/chezgoulet/phonon/releases/latest/download/phonon-linux-arm64
chmod +x phonon-linux-arm64
./phonon-linux-arm64 coordinator --config config.yaml

2. Install the APK on each phone

Recommended (Obtainium): Add the GitHub releases URL to Obtainium for automatic updates. The APK is release-signed and compatible with sideloading.

Manual: Download the APK from the latest GitHub Release and sideload via ADB.

3. Pair phones to the coordinator

Open the Phonon sidecar app. It discovers the coordinator via mDNS automatically. Accept the pairing prompt. The phone is now part of your cluster.

Configuration

The coordinator uses a YAML configuration file. Here's a minimal example:

cluster:
  name: "home-pool"
  listen: ":8080"

discovery:
  mdns: true

models:
  - name: "gemma-4-e2b"
    url: "https://huggingface.co/google/gemma-4-e2b-litert"
    min_ram_mb: 4096

telemetry:
  prometheus: true
  event_log: "sqlite:///var/lib/phonon/events.db"
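
One way to read the `min_ram_mb` field is as a gate on which phones may load a model. The sketch below mirrors the example configuration as a Python dict (after parsing the YAML with a library of your choice) and shows a basic sanity check plus that gating logic. Which keys are actually required, and how the coordinator interprets `min_ram_mb`, are assumptions made for illustration.

```python
# The minimal example above, as it might look after YAML parsing.
config = {
    "cluster": {"name": "home-pool", "listen": ":8080"},
    "discovery": {"mdns": True},
    "models": [
        {
            "name": "gemma-4-e2b",
            "url": "https://huggingface.co/google/gemma-4-e2b-litert",
            "min_ram_mb": 4096,
        },
    ],
}


def validate(cfg: dict) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    if not cfg.get("cluster", {}).get("listen"):
        problems.append("cluster.listen is missing")
    for i, model in enumerate(cfg.get("models", [])):
        for key in ("name", "url"):
            if not model.get(key):
                problems.append(f"models[{i}].{key} is missing")
    return problems


def eligible_models(cfg: dict, phone_ram_mb: int) -> list[str]:
    """Names of models whose min_ram_mb fits within the phone's RAM."""
    return [
        m["name"]
        for m in cfg.get("models", [])
        if phone_ram_mb >= m.get("min_ram_mb", 0)
    ]


print(validate(config), eligible_models(config, 8192), eligible_models(config, 3072))
```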

Usage

Once your cluster is running, point any OpenAI-compatible client at the coordinator:

curl http://phonon.local:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-4-e2b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
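
The same request can be made from Python using only the standard library. This sketch reuses the host, port, and model name from the curl example; the helper names are made up for illustration, and the response parsing assumes the coordinator returns the standard OpenAI chat completion shape (`choices[0].message.content`).

```python
import json
import urllib.request

BASE_URL = "http://phonon.local:8080/v1"  # host and port taken from the curl example


def build_chat_request(prompt: str, model: str = "gemma-4-e2b") -> urllib.request.Request:
    """Build (but do not send) a POST request to the chat completions endpoint."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str, timeout: float = 60.0) -> str:
    """Send the prompt to the cluster and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as resp:
        payload = json.load(resp)
    # Assumes the standard OpenAI response shape.
    return payload["choices"][0]["message"]["content"]


request = build_chat_request("Hello!")
print(request.get_method(), request.full_url)
```

With a cluster running, `print(chat("Hello!"))` sends the request. Client libraries that accept a custom base URL can be pointed at the same `/v1` address instead.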

Theme Packs

Phonon's sidecar includes a visualization engine that renders phone state (inference load, temperature, battery) as animated displays. Theme packs are community-contributed Kotlin files that implement the VisualizationPack interface.

Built-in themes: Synthwave, Honeycomb, Veil, Cyber HUD, Neon Ring, Matrix Rain, Bioluminescent, LCARS. Submit your own via PR.

Troubleshooting

Phone not discovered

Check that the phone and coordinator are on the same network. Verify that your router or access point is not blocking multicast (mDNS) traffic; client or AP isolation settings commonly do this. Try restarting the sidecar app.
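
A quick check from another machine on the same LAN can help narrow things down. The sketch below tests whether a hostname resolves and whether a TCP port accepts connections. It assumes the coordinator is reachable as `phonon.local` on port 8080, as in the usage example, and that your operating system resolves `.local` names via mDNS; neither is guaranteed on every setup. It does not test the phone's own discovery path.

```python
import socket


def resolve(host: str, port: int = 8080) -> list[str]:
    """Return the distinct IP addresses the host resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def can_connect(host: str, port: int = 8080, timeout: float = 3.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    host = "phonon.local"  # hostname assumed from the usage example
    print("resolves to:", resolve(host) or "nothing")
    print("port 8080 reachable:", can_connect(host))
```

If the name does not resolve but connecting by IP address works, the problem is likely mDNS on that network segment rather than the coordinator itself.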

Model download fails

Ensure the phone has internet access for the initial download. Models are cached after download. Check available storage space.

Poor inference performance

Check battery level — phones below 20% and not charging enter degraded mode. Check temperature — thermal throttling reduces performance. Verify the phone has an NPU and that LiteRT-LM is using it.

Latest from Phonon

Engineering deep-dives, release notes, and the philosophy behind turning e-waste into AI infrastructure.

Coming soon.

Release notes, deep-dives, and project essays will live here.

Why Phonon exists

AI inference should be private, sustainable, and accessible. Phonon uses hardware that already exists to make that possible.

🌱 The problem

AI inference is expensive, centralized, and generates e-waste. Cloud providers charge per-token at margins that reflect datacenter costs, not actual compute value. Hardware turns over every few years, and each upgrade cycle creates more specialized waste.

Meanwhile, billions of smartphones sit in drawers. They have screens, batteries, networking, cooling, and NPUs — all the infrastructure needed to run inference. They're already paid for. They already exist.

💡 The insight

Phonon started with a simple observation: the hardware needed for AI inference is already in people's pockets and drawers. The question isn't whether phones can do inference — it's whether we can build the software to make it practical.

That's what Phonon is: a Go coordinator and an Android sidecar APK that turn any phone into a private inference node. No cloud required. No new hardware needed.

🔧 The architecture

Phonon is two components: a Go coordinator and an Android sidecar APK. The coordinator serves an OpenAI-compatible API and load-balances across phones. The sidecar runs inference via LiteRT-LM. Everything communicates over your local network. No telemetry. No phone-home.

👤 The team

Phonon is built by a single human and two AI agents — Vigilant (systems) and Dev (implementation). A one-person operation running on unfashionable hardware.

Based in Burlington, Vermont. Moving to Sherbrooke, Quebec. The server runs on a Linode VPS in Toronto and an Odroid at home. The phones come from eBay and Swappa.

⚖️ License

Phonon is MIT — fully permissive. The coordinator, sidecar, theme packs, and website are all open source. No CLA, no friction. Self-hosted is always fully functional.

Get your cluster running

One coordinator, a few phones, and you'll have private inference running on your local network in about 15 minutes.

1. Get the coordinator running

Pull the Docker image for any Linux device:
docker pull chezgoulet/phonon:latest

2. Install the APK

Download the latest release APK from GitHub and sideload it onto each phone. Use Obtainium for automatic updates.

3. Pair and verify

Open the app on each phone. It discovers the coordinator via mDNS automatically. Accept the pairing prompt. Your cluster is live.

4. Point your agent at it

Any tool that speaks the OpenAI API can use Phonon. Point it at http://phonon.local:8080/v1.

5. Watch it work

Each phone screen turns into a live telemetry display. Pick a theme pack or write your own.

6. Tell us how it went

Open an issue with what worked, what didn't, and what confused you. Every report makes the project better.

Need help?

Open an issue on GitHub or browse the existing ones. The project is small — you'll get a response.
