Bare Metal Presence — a private, local, photoreal AI avatar

Bare Metal Presence · preview · look it in the eye

A face to talk to.
100% local.

A photorealistic avatar that hears you, thinks, and answers out loud — a real face and voice rendered entirely on your own GPU. It lip-syncs to its own speech in real time, and nothing — audio, video, or transcript — ever leaves the box. No cloud avatar service. No per-minute meter.


Live loop “Hey — can you summarize what we talked about yesterday?”

Hear

Whisper transcribes your speech on-device — no audio is uploaded.

Think

Your local model composes the reply on your own GPU — with memory of past chats.

Speak

On-device text-to-speech voices the answer — no audio leaves your machine.

Face

A photoreal face is rendered frame-by-frame on your GPU, lip-synced to the reply as it speaks.

Whisper STT · Your local LLM · On-device TTS · GPU-rendered face · 0 bytes to the cloud

Private by construction

No per-minute charge: your GPU, your model, no meter. Cloud avatar agents bill by the minute.

0 bytes leave your machine: audio, video and transcript stay on-device.

1 GPU

Runs the whole avatar — speech, model, voice and face on a single card

100%

Offline-capable — pull the network cable and it still talks back
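One way to check the offline claim for yourself, sketched in Python. This is an illustration only, not part of the Bare Metal Presence runtime: `no_network`, `NetworkBlocked`, and `run_turn_offline` are hypothetical names, and the guard only catches sockets opened from Python code in the same process, so pulling the cable remains the stronger test.

```python
import socket
from contextlib import contextmanager


class NetworkBlocked(RuntimeError):
    """Raised when code tries to open a socket inside the guard."""


@contextmanager
def no_network():
    """Temporarily make every new Python socket creation fail in this process."""
    original = socket.socket

    def blocked(*args, **kwargs):
        raise NetworkBlocked("network access attempted while offline guard is active")

    socket.socket = blocked
    try:
        yield
    finally:
        # Always restore the real socket class, even if the turn raised.
        socket.socket = original


def run_turn_offline(turn, *args):
    """Run one avatar turn (any callable, hypothetical here) with sockets disabled."""
    with no_network():
        return turn(*args)


# A purely local turn passes; one that tries to open a socket is caught.
local_result = run_turn_offline(lambda text: text.upper(), "still talking")

try:
    run_turn_offline(lambda: socket.socket())
    leaked = True
except NetworkBlocked:
    leaked = False

print(local_result, "| network attempt leaked:", leaked)
```

If a stage of the loop ever reached for the network from Python, the guard would raise instead of letting the bytes out.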

Hear · think · speak · appear — one stack

The whole avatar
runs on your GPU.

Cloud avatar platforms stream a face from their servers — your voice and video ride to the cloud, get rendered there, and come back as a video feed you don’t control. Bare Metal Presence runs the entire loop — speech, model, voice, and the photoreal face — on the same box. The face is rendered frame-by-frame on your own GPU and lip-synced to its own speech, so nothing you say and no frame it draws ever leaves the machine.
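The loop described above can be pictured as four local functions chained in one process. The Python sketch below is a minimal illustration under stated assumptions, not the actual runtime: `LocalLoop` and the `fake_*` stages are hypothetical stand-ins for Whisper, your local model, the on-device voice, and the face renderer, so it runs with no model installed.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stage signatures: each stage is a plain local function.
Transcribe = Callable[[bytes], str]        # Hear: audio -> text
Compose = Callable[[str, List[str]], str]  # Think: text + memory -> reply
Synthesize = Callable[[str], bytes]        # Speak: reply -> audio
Render = Callable[[bytes], List[str]]      # Face: audio -> frames


@dataclass
class LocalLoop:
    transcribe: Transcribe
    compose: Compose
    synthesize: Synthesize
    render: Render
    memory: List[str] = field(default_factory=list)

    def turn(self, mic_audio: bytes) -> dict:
        """Run one hear -> think -> speak -> face turn, all in-process."""
        heard = self.transcribe(mic_audio)
        reply = self.compose(heard, self.memory)
        voice = self.synthesize(reply)
        frames = self.render(voice)
        # Memory lives in this object, on this machine.
        self.memory.extend([heard, reply])
        return {"heard": heard, "reply": reply, "voice": voice, "frames": frames}


# Stand-in stages so the sketch runs without any model installed.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_compose(text: str, memory: List[str]) -> str:
    return f"You said: {text} (turns remembered: {len(memory) // 2})"


def fake_synthesize(reply: str) -> bytes:
    return reply.encode("utf-8")


def fake_render(voice: bytes) -> List[str]:
    # One placeholder "frame" per 8 bytes of audio.
    return [f"frame-{i}" for i in range(0, len(voice), 8)]


loop = LocalLoop(fake_transcribe, fake_compose, fake_synthesize, fake_render)
first = loop.turn(b"hello")
second = loop.turn(b"what did I say?")
print(second["reply"])
```

Every stage takes and returns plain in-memory values, which is the point of the design: there is no step where audio, text, or frames would need to cross a network boundary.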

Hear

On-device speech

Whisper runs locally and transcribes as you talk. Your voice is never streamed to a server.

Think

Your local model

Any model in the catalog, running on your own GPU — with memory of your past conversations.

Speak

On-device voice

Natural text-to-speech renders the reply on your own GPU. The whole round-trip stays offline.

Face

Rendered on your GPU

A photoreal face is generated frame-by-frame and lip-synced to the speech — no cloud video feed.
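To make "lip-synced to the speech" concrete, here is a rough Python sketch of one common approach: take the timing of each speech sound and pick a mouth shape for every video frame. The `Phoneme` type, the `VISEMES` table, and `frame_visemes` are assumptions made for illustration; the page does not say how the renderer actually does this.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phoneme:
    symbol: str   # e.g. "AA", "M"; the symbol set here is illustrative
    start: float  # seconds from the start of the utterance
    end: float


# Hypothetical phoneme -> mouth-shape table; a real renderer would use its own.
VISEMES = {"M": "closed", "B": "closed", "P": "closed",
           "AA": "open", "OW": "round", "F": "teeth-on-lip"}


def frame_visemes(phonemes: List[Phoneme], fps: int = 30) -> List[str]:
    """Return one mouth shape per video frame, sampled at each frame's start time."""
    if not phonemes:
        return []
    duration = max(p.end for p in phonemes)
    n_frames = int(round(duration * fps))
    shapes = []
    for i in range(n_frames):
        t = i / fps
        shape = "rest"  # default when no phoneme covers this instant
        for p in phonemes:
            if p.start <= t < p.end:
                shape = VISEMES.get(p.symbol, "rest")
                break
        shapes.append(shape)
    return shapes


timeline = [Phoneme("M", 0.0, 0.1), Phoneme("AA", 0.1, 0.3)]
shapes = frame_visemes(timeline, fps=10)
print(shapes)  # ['closed', 'open', 'open']
```

Because the voice is synthesized on the same machine, its timing is available the moment the audio exists, so the face can be scheduled against it without waiting on a remote video feed.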
