LLM Integration
Newton Technology is the practice of Joe Newton — twenty years building for the web, now focused on AI-native products: integration, retrieval, agents, evals, and the unglamorous engineering that turns a clever demo into a reliable system.
AI work in 2026 is overflowing with confidence and underflowing with measurement. The practice below is what separates a feature that ships from a demo that lingers.
A weekend prototype is the easy 20%. The hard 80% is latency budgets, retries, idempotency, observability, cost ceilings, and graceful failure on the worst inputs your users will ever produce. That’s the work I do.
If we can’t measure it, we can’t ship it — and we certainly can’t improve it. Every AI surface I build starts with an eval harness, a regression set, and a number that has to go up before anything goes out.
Streaming, caching, optimistic UI, and graceful degradation aren’t back-of-the-sprint optimizations — they’re the product. Users feel every millisecond, and a fast wrong-ish answer often beats a slow perfect one.
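As a small illustration of what a latency budget means for retries, here is a minimal sketch that plans exponential-backoff delays and stops scheduling retries once the total wait would exceed the budget. The function name, option names, and numbers are all hypothetical, chosen only for this example; a production version would likely add jitter and tie each retry to an idempotency key.

```typescript
// Hypothetical helper (not from any library): plan retry delays that fit a latency budget.
interface RetryPlanOptions {
  baseDelayMs: number; // wait before the first retry
  maxDelayMs: number;  // cap on any single wait
  budgetMs: number;    // total time we are willing to spend waiting
  maxRetries: number;  // hard upper bound on the number of retries
}

function planRetryDelays(opts: RetryPlanOptions): number[] {
  const delays: number[] = [];
  let spentMs = 0;
  for (let attempt = 0; attempt < opts.maxRetries; attempt++) {
    // Double the delay on each attempt, but never exceed the per-wait cap.
    const delay = Math.min(opts.baseDelayMs * Math.pow(2, attempt), opts.maxDelayMs);
    // Stop as soon as the next wait would blow the overall budget.
    if (spentMs + delay > opts.budgetMs) break;
    delays.push(delay);
    spentMs += delay;
  }
  return delays;
}

const examplePlan = planRetryDelays({
  baseDelayMs: 100,
  maxDelayMs: 1000,
  budgetMs: 2000,
  maxRetries: 6,
});
console.log(examplePlan); // [ 100, 200, 400, 800 ]
```

With these illustrative numbers the fifth wait would be capped at 1000 ms and push the total past 2000 ms, so the plan stops at four retries and the caller can fall back to a degraded answer instead.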
A working list, not a brochure. If your problem is shaped roughly like one of these, we should talk.
Claude, GPT, and open models. Tool use, structured output, streaming, function calling, prompt caching, model routing, and the boring plumbing that makes them production-safe.
Vector search, hybrid retrieval, document pipelines, chunking strategies that don’t butcher your data, and citations users can actually trust.
Multi-step reasoning, tool orchestration, human-in-the-loop checkpoints, and the guardrails that keep an agent from running up a four-figure bill on a Tuesday.
Eval harnesses, regression suites, golden datasets, traces, and cost dashboards. A feature without an eval is a wish; a feature with an eval is a system.
Full-stack work on modern web stacks: TypeScript, React, Next.js, Postgres, edge runtimes. I’ve been doing this part since long before it was fashionable.
Bringing AI into an existing product without ripping it apart. Surgical changes, careful rollouts, real measurement — not a rewrite dressed up as a feature.
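One of the agent guardrails mentioned in the list above, a spend ceiling, can be sketched in a few lines. This is an assumption-laden toy: `tryCharge`, `runAgentSteps`, and the per-step costs are hypothetical names and values, and costs are tracked in integer cents to avoid floating-point drift. A real agent would estimate cost from token counts and tool pricing.

```typescript
// Hypothetical spend guard for an agent loop; all names and amounts are illustrative.
interface Budget {
  ceilingCents: number;
  spentCents: number;
}

// Records the charge and returns true only if it fits under the ceiling.
function tryCharge(budget: Budget, estimatedCents: number): boolean {
  if (budget.spentCents + estimatedCents > budget.ceilingCents) return false;
  budget.spentCents += estimatedCents;
  return true;
}

interface AgentRunSummary {
  stepsRun: number;
  spentCents: number;
  halted: boolean; // true when the guard stopped the run early
}

// Simulates an agent whose steps have known estimated costs.
function runAgentSteps(stepCostsCents: number[], ceilingCents: number): AgentRunSummary {
  const budget: Budget = { ceilingCents: ceilingCents, spentCents: 0 };
  let stepsRun = 0;
  for (let i = 0; i < stepCostsCents.length; i++) {
    // Check the budget before doing the work, not after.
    if (!tryCharge(budget, stepCostsCents[i])) {
      return { stepsRun: stepsRun, spentCents: budget.spentCents, halted: true };
    }
    stepsRun++;
  }
  return { stepsRun: stepsRun, spentCents: budget.spentCents, halted: false };
}

console.log(runAgentSteps([40, 40, 40], 100));
```

The design choice worth noting is that the check happens before each step, so the run halts with money left over rather than discovering the overrun afterwards.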
Five steps. No deck-ware, no “discovery phases” that last a quarter, no surprises in week ten.
Understand the problem, not the prompt. We agree on what success looks like, what we’re measuring, and what we’re explicitly not doing yet.
Working code by the end of week one. Always. Real models, real data, real interface — even if it’s ugly and held together with TypeScript.
Build the measurement before the polish. The eval becomes the spec; the spec becomes the regression suite; the regression suite is what lets us move fast later without breaking things.
Production from day one, behind a flag if needed. No six-month staging environments. Users find things evals never will.
Real traces, real failures, real wins. We tune prompts, swap models, add tools, prune scope. The eval number goes up. We do it again next week.
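The "eval number goes up" loop in the steps above can be made concrete with a tiny gate. This is a minimal sketch under stated assumptions: the golden set is a list of input and expected-output pairs, scoring is exact match, and the candidate is a synchronous stub standing in for a real model call. `passRate`, `shouldShip`, and the sample data are hypothetical.

```typescript
// Hypothetical eval gate; the types, names, and data are illustrative only.
interface EvalCase {
  input: string;
  expected: string;
}

// Stand-in for a model call. A real candidate would be async and call an LLM.
type Candidate = (input: string) => string;

// Fraction of golden cases where the candidate output matches exactly.
function passRate(cases: EvalCase[], candidate: Candidate): number {
  if (cases.length === 0) return 0;
  let passed = 0;
  for (let i = 0; i < cases.length; i++) {
    if (candidate(cases[i].input).trim() === cases[i].expected) passed++;
  }
  return passed / cases.length;
}

// Ship only when the candidate strictly beats the recorded baseline.
function shouldShip(baselineScore: number, candidateScore: number): boolean {
  return candidateScore > baselineScore;
}

const goldenSet: EvalCase[] = [
  { input: "a", expected: "A" },
  { input: "b", expected: "B" },
  { input: "c", expected: "C" },
  { input: "d", expected: "x" }, // a case the stub candidate gets wrong
];

const upperCaseCandidate: Candidate = (input) => input.toUpperCase();
const candidateScore = passRate(goldenSet, upperCaseCandidate);
console.log(candidateScore, shouldShip(0.5, candidateScore)); // 0.75 true
```

Exact match is the simplest possible scorer; graded or model-judged scoring slots into `passRate` without changing the gate.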
“If I have seen further, it is by standing on the shoulders of giants.” — Isaac Newton, 1675
I started programming on a beige tower in elementary school and never quite stopped. Twenty years professionally now — the web before it had standards, JavaScript before it had types, AI before it could write a paragraph on its own.
Newton Technology is the name I give to the work I most want to do: careful, measured engineering applied to a field that has more hype than rigor. If that sounds like the kind of partner you’re looking for, the contact section is right below.