We run data-center-grade inference on the GPUs people already own — no cloud,
no API fees, no data leaving the device. It's a hard systems problem: CUDA kernels,
a ground-up tensor-parallel transport that runs over ordinary networking, and a consumer
product that has to just work on Windows. We're a tiny, founder-led team, we ship
every week, and we're hiring the people who will define the next ten years of the company.
Own the outcome
Small team, enormous surface area. You'll own a system end-to-end and ship it to real users.
Bare metal, literally
We go all the way down — kernels, memory, the wire format. No layers of abstraction to hide behind.
Ship weekly
Fast loops, real hardware, real customers. We'd rather learn from production than debate in a doc.
Founding Engineer — Inference & Kernels
Cofounder track
Full-time · Equity-heavy
Own the engine. You'll design and build the inference stack that lets a $300 consumer
GPU punch far above its weight: writing and tuning custom CUDA kernels, pushing
TensorRT-LLM past where it's meant to go, and using quantization and attention work to
squeeze 7B–70B-class models onto cards that "shouldn't" run them. This is a true
cofounder-track seat: you'll set the technical direction of the core product alongside
the founder and build the team behind it.
What you'll own
- Custom CUDA / attention kernels and low-level perf work — latency, throughput, memory.
- Our TensorRT-LLM fork: quantization (INT4/FP8/NVFP4), paged attention, the executor path.
- Model coverage — bringing new architectures up and making them fit consumer VRAM tiers.
- Technical roadmap and early engineering hires as we grow.
You might be a fit if
- You've written production CUDA and can reason about a GPU at the warp/SM level.
- Deep C/C++ and Python; comfortable in concurrency, memory management, and the systems layer.
- You've worked with an inference engine — TensorRT-LLM, vLLM, SGLang, or llama.cpp.
- You want the ownership of a startup CTO and the rigor of a staff engineer.
Founding GTM Lead
Cofounder track
Full-time · Equity-heavy
Be the first commercial hire and the business half of the founding team. You'll take
go-to-market from founder-led selling to a repeatable motion — owning positioning,
pricing, the early sales playbook, and our first enterprise relationships. You're
comfortable without a playbook because you'll write it: define the ICP, run outbound and
inbound experiments, close the first six-figure contracts, and stand up the operations
that turn a product people love into a company.
What you'll own
- Go-to-market strategy with the founder — market sizing, segmentation, pricing, partnerships.
- The early sales playbook: outbound + inbound experiments, ICP definition, the first deals.
- Positioning and messaging in partnership with product — how the world understands us.
- Early company operations and the commercial team we hire behind you.
You might be a fit if
- You've been a founder or early commercial hire at a seed / Series A startup.
- You can close and deliver six-figure contracts and build the motion behind them.
- Some mix of enterprise sales, startup ops, and early-stage marketing — and you figure things out.
- You're technical enough to sell a real systems product to real engineers.
Technical BDR (Enterprise)
Full-time
Open the doors. You'll run outbound into enterprise accounts — researching the right
teams, writing the cold outreach that actually lands, and booking qualified demos for
the founder, GTM lead, and account executives. You're the top of the funnel: every
pipeline conversation starts with a meeting you set. It's a high-ownership seat with a
clear path into full-cycle enterprise sales as we grow.
What you'll own
- Outbound prospecting into enterprise accounts — account research, list-building, multi-channel outreach.
- Cold email, calls, and LinkedIn that get replies — and the experiments to make them convert.
- Booking and qualifying demos, then handing off a warm, well-briefed meeting.
- Keeping the CRM and pipeline honest — every touch, every next step tracked.
You might be a fit if
- You've hit outbound quota as a BDR/SDR, ideally selling technical or developer products.
- You're relentless and organized — high volume without dropping the thread.
- You write crisp, human outreach and can speak credibly to engineers.
- You want a path from booking demos to owning full-cycle enterprise deals.
Enterprise Account Executive
Full-time
Close the deals. You'll own full-cycle enterprise sales — taking the demos the BDR team
books, running technical evaluations alongside our engineers, and closing six-figure
on-prem contracts. You're selling a real systems product to technical buyers who care
about privacy, performance, and owning their own stack, so you lead with substance, not
spin. As an early commercial hire you'll shape the playbook you sell from.
What you'll own
- Full-cycle enterprise deals — from qualified demo to signed six-figure contract.
- Technical evaluations and POCs run with our engineers and the customer's team.
- Navigating procurement, security review, and on-prem deployment requirements.
- Forecasting, pipeline hygiene, and the repeatable motion the GTM lead is building.
You might be a fit if
- You've carried and hit an enterprise quota closing technical or infrastructure software.
- You're comfortable selling to engineers and CISOs — privacy, on-prem, performance.
- You thrive without a finished playbook and help write the one that scales.
- You'd rather win a few high-stakes accounts than spray a thousand low-intent leads.
Forward Deployed Engineer
Full-time
Embed with our biggest customers and solve their hardest deployment problems in person.
You'll sit inside a partner organization, map the real-world problem, design the solution,
and ship production code that makes Bare Metal AI work on their hardware and in their
workflows. It's a hybrid of software, platform, and field engineering — customer-facing, but
the core of the job is building. You carry no sales quota; you carry the deployment.
What you'll own
- End-to-end deployments at flagship customers, from first call to running in production.
- Production integration code: getting the daemon, models, and transport running on their fleet.
- Translating messy real-world requirements into a concrete technical solution.
- The feedback loop back to the core engineering team — you are the product's eyes in the field.
You might be a fit if
- Strong generalist engineer who ships real code and likes being in the room with customers.
- You operate with founder-level autonomy and staff-level technical rigor.
- Comfortable on Windows, GPUs, networking, and debugging someone else's environment.
- You'd rather own one high-stakes account than ten tickets.
Distributed Systems Engineer
Full-time
Own the transport. Our tensor-parallel mesh spans multiple machines on an ordinary home
or office network, with no NVLink, no InfiniBand, and no NCCL. You'll design and run the
foundational layer that keeps GPUs in lockstep across the wire: collective communication
over standard networks, rank negotiation, fault tolerance, and the latency budget that
makes multi-machine inference feel local.
What you'll own
- The collective network transport — the wire protocol that keeps multiple GPUs in sync.
- Latency, concurrency, fault tolerance, and consistency across nodes that can drop or stall.
- Rank negotiation, peer discovery, and the health/heartbeat layer for the mesh.
- Performance work that keeps cross-machine tensor parallelism within striking distance of a single box.
You might be a fit if
- You've built low-latency networked systems and know sockets, TCP, and RPC layers such as gRPC or WebSockets in depth.
- Strong CS fundamentals and real large-scale or real-time system design experience.
- Proficient in C++/Go/Python and happy living in the protocol stack.
- You sweat tail latency and treat a dropped packet as a design problem, not an exception.
Product / Full-Stack Engineer
Full-time
Own the experience. The desktop app, the self-managing daemon, the web client — the
surfaces real people touch. You'll ship a local-AI product that installs in one click,
manages GPUs and models on its own, and just works on Windows. From the installer to the
chat UI to the OpenAI/Anthropic-compatible API, you turn a hard systems engine into
something people love to use.
What you'll own
- The Windows desktop app and installer — one-click, self-contained, self-updating.
- The daemon's control plane: model management, GPU reconciliation, voice, the API shims.
- The web client and chat experience — streaming, model switching, settings.
- The polish that separates a demo from a product people pay for.
You might be a fit if
- Strong full-stack chops — Python backend, modern JS/HTML/CSS front end, no framework dogma.
- You care about the last 10% — latency, error states, install-time edge cases.
- Bonus: Windows packaging, PyInstaller / native desktop, or audio/voice experience.
- You ship, measure, and iterate instead of polishing in private.
Headquarters
Built in Palm Springs, California.
Our home base sits at the foot of the San Jacinto mountains in the California
desert — sunshine, mid-century calm, and the kind of focus that's hard to find
in a crowded tech hub. We build in person, we ship every week, and we do it hundreds of
miles from the noise of the Bay Area.
- 350+ days of sunshine a year and palm-lined streets to clear your head between deploys.
- Roughly two hours by car from LA and a little more from San Diego, close to the action and far from the rent.
- Mid-century-modern design capital with a serious food, hiking, and tennis scene.
- Desert quiet that's made for deep work, plus a relocation package to get you here.
Apply
Pick the role that fits — or choose “Something else” if you're exceptional at something
adjacent and want to build the fastest, most capable local AI engine on the planet. A founder
reads every application.