Careers

Build the fastest, most capable local AI engine on the planet.

We run data-center-grade inference on the GPUs people already own — no cloud, no API fees, no data leaving the device. It's a hard systems problem: CUDA kernels, a ground-up tensor-parallel transport that runs over ordinary networking, and a consumer product that has to just work on Windows. We're a tiny, founder-led team, we ship every week, and we're hiring the people who will define the next ten years of the company.

Own the outcome

Small team, enormous surface area. You'll own a system end-to-end and ship it to real users.

Bare metal, literally

We go all the way down — kernels, memory, the wire format. No layers of abstraction to hide behind.

Ship weekly

Fast loops, real hardware, real customers. We'd rather learn from production than debate in a doc.

Founding team

Founding Engineer, Inference & Kernels · Cofounder track · Full-time · Equity-heavy

Own the engine. You'll design and build the inference stack that lets a $300 consumer GPU punch far above its weight: writing and tuning custom CUDA kernels, pushing TensorRT-LLM past where it's meant to go, and using quantization and attention optimizations to squeeze 7B–70B-class models onto cards that "shouldn't" run them. This is a true cofounder-track seat: you'll set the technical direction of the core product alongside the founder and build the team behind it.

What you'll own

Custom CUDA / attention kernels and low-level perf work — latency, throughput, memory.
Our TensorRT-LLM fork: quantization (INT4/FP8/NVFP4), paged attention, the executor path.
Model coverage — bringing new architectures up and making them fit consumer VRAM tiers.
Technical roadmap and early engineering hires as we grow.

You might be a fit if

You've written production CUDA and can reason about a GPU at the warp/SM level.
Deep C/C++ and Python; comfortable with concurrency, memory management, and the systems layer.
You've worked with an inference engine — TensorRT-LLM, vLLM, SGLang, or llama.cpp.
You want the ownership of a startup CTO and the rigor of a staff engineer.

Apply →

Founding GTM Lead · Cofounder track · Full-time · Equity-heavy

Be the first commercial hire and the business half of the founding team. You'll take go-to-market from founder-led selling to a repeatable motion — owning positioning, pricing, the early sales playbook, and our first enterprise relationships. You're comfortable without a playbook because you'll write it: define the ICP, run outbound and inbound experiments, close the first six-figure contracts, and stand up the operations that turn a product people love into a company.

What you'll own

Go-to-market strategy with the founder — market sizing, segmentation, pricing, partnerships.
The early sales playbook: outbound + inbound experiments, ICP definition, the first deals.
Positioning and messaging in partnership with product — how the world understands us.
Early company operations and the commercial team we hire behind you.

You might be a fit if

You've been a founder or early commercial hire at a seed / Series A startup.
You can close and deliver six-figure contracts and build the motion behind them.
You bring some mix of enterprise sales, startup ops, and early-stage marketing, and you figure out the rest.
You're technical enough to sell a real systems product to real engineers.

Apply →

Go-to-market

Technical BDR (Enterprise) · Full-time

Open the doors. You'll run outbound into enterprise accounts — researching the right teams, writing the cold outreach that actually lands, and booking qualified demos for the founder, GTM lead, and account executives. You're the top of the funnel: every pipeline conversation starts with a meeting you set. It's a high-ownership seat with a clear path into full-cycle enterprise sales as we grow.

What you'll own

Outbound prospecting into enterprise accounts — account research, list-building, multi-channel outreach.
Cold email, calls, and LinkedIn that get replies — and the experiments to make them convert.
Booking and qualifying demos, then handing off a warm, well-briefed meeting.
Keeping the CRM and pipeline honest — every touch, every next step tracked.

You might be a fit if

You've hit outbound quota as a BDR/SDR, ideally selling technical or developer products.
You're relentless and organized — high volume without dropping the thread.
You write crisp, human outreach and can speak credibly to engineers.
You want a path from booking demos to owning full-cycle enterprise deals.

Apply →

Enterprise Account Executive · Full-time

Close the deals. You'll own full-cycle enterprise sales — taking the demos the BDR team books, running technical evaluations alongside our engineers, and closing six-figure on-prem contracts. You're selling a real systems product to technical buyers who care about privacy, performance, and owning their own stack, so you lead with substance, not spin. As an early commercial hire you'll shape the playbook you sell from.

What you'll own

Full-cycle enterprise deals — from qualified demo to signed six-figure contract.
Technical evaluations and POCs run with our engineers and the customer's team.
Navigating procurement, security review, and on-prem deployment requirements.
Forecasting, pipeline hygiene, and the repeatable motion the GTM lead is building.

You might be a fit if

You've carried and hit an enterprise quota closing technical or infrastructure software.
You're comfortable selling to engineers and CISOs — privacy, on-prem, performance.
You thrive without a finished playbook and help write the one that scales.
You'd rather win a few high-stakes accounts than spray a thousand low-intent leads.

Apply →

Engineering

Forward Deployed Engineer · Full-time

Embed with our biggest customers and solve their hardest deployment problems in person. You'll sit inside a partner organization, map the real-world problem, design the solution, and ship production code that makes Bare Metal AI work on their hardware and in their workflows. It's a hybrid of software, platform, and field engineering — customer-facing, but the core of the job is building. You carry no sales quota; you carry the deployment.

What you'll own

End-to-end deployments at flagship customers, from first call to production.
Production integration code: getting the daemon, models, and transport running on their fleet.
Translating messy real-world requirements into a concrete technical solution.
The feedback loop back to the core engineering team — you are the product's eyes in the field.

You might be a fit if

Strong generalist engineer who ships real code and likes being in the room with customers.
You operate with founder-level autonomy and staff-level technical rigor.
Comfortable with Windows, GPUs, networking, and debugging someone else's environment.
You'd rather own one high-stakes account than ten tickets.

Apply →

Distributed Systems Engineer · Full-time

Own the transport. Our tensor-parallel mesh runs over ordinary networking across multiple machines on a home or office network — no NVLink, no InfiniBand, no NCCL. You'll design and run the foundational layer that keeps GPUs in lockstep across the wire: collective communication over standard networks, rank negotiation, fault tolerance, and the latency budget that makes multi-machine inference feel local.

What you'll own

The collective network transport — the wire protocol that keeps multiple GPUs in sync.
Latency, concurrency, fault tolerance, and consistency across nodes that can drop or stall.
Rank negotiation, peer discovery, and the health/heartbeat layer for the mesh.
Performance work that keeps cross-machine tensor parallelism (TP) within striking distance of a single box.

You might be a fit if

You've built low-latency networked systems and know RPC, sockets, and gRPC/WebSockets/TCP in depth.
Strong CS fundamentals and real large-scale or real-time system design experience.
Proficient in C++/Go/Python and happy living in the protocol stack.
You sweat tail latency and treat a dropped packet as a design problem, not an exception.

Apply →

Product / Full-Stack Engineer · Full-time

Own the experience. The desktop app, the self-managing daemon, the web client — the surfaces real people touch. You'll ship a local-AI product that installs in one click, manages GPUs and models on its own, and just works on Windows. From the installer to the chat UI to the OpenAI/Anthropic-compatible API, you turn a hard systems engine into something people love to use.

What you'll own

The Windows desktop app and installer — one-click, self-contained, self-updating.
The daemon's control plane: model management, GPU reconciliation, voice, the API shims.
The web client and chat experience — streaming, model switching, settings.
The polish that separates a demo from a product people pay for.

You might be a fit if

Strong full-stack chops — Python backend, modern JS/HTML/CSS front end, no framework dogma.
You care about the last 10% — latency, error states, install-time edge cases.
Bonus: Windows packaging, PyInstaller / native desktop, or audio/voice experience.
You ship, measure, and iterate instead of polishing in private.

Apply →

Headquarters

Built in Palm Springs, California.

Our home base sits at the foot of the San Jacinto Mountains in the California desert: sunshine, mid-century calm, and the kind of focus that's hard to find in a crowded tech hub. We build in person, we ship every week, and we do it hundreds of miles from the noise of the Bay.

350+ days of sunshine a year and palm-lined streets to clear your head between deploys.
About two hours from both LA and San Diego: close to the action, far from the rent.
Mid-century-modern design capital with a serious food, hiking, and tennis scene.
Desert quiet that's made for deep work, plus a relocation package to get you here.

Apply

Pick the role that fits — or choose “Something else” if you're exceptional at something adjacent and want to build the fastest, most capable local AI engine on the planet. A founder reads every application.

Prefer email? Write us at [email protected]