CI for AI features
Fail the build when an AI feature regresses.
PromptGate is a GitHub App that reports every prompt and model change as a CI status check, so a regression fails the build the same way a failing unit test does. Catch silent regressions before your users do.
Who PromptGate is for
- Engineering managers shipping AI features
You own a repo with one or more AI-touching code paths and you merge prompt or model changes weekly. Your evals live in a notebook. You have no automated regression bar between "merge to main" and "ship to prod."
- Platform engineers and staff engineers
You want CI to fail loudly when a colleague swaps a model or tweaks a system prompt and quality drops. You do not have budget to build an in-house eval platform.
- Teams of 20–200 engineers
You are past the prototype phase. AI features are in production, customers depend on them, and "we tested it locally" is no longer a release strategy.
How it works
- Install the GitHub App
Grant access to the repos with AI code paths.
- Drop in promptgate.yaml
Point at one or more eval suites. Deterministic judges (string, regex, JSON schema, function-call shape) and LLM judges (BYO key) both work out of the box.
- Every PR gets a required status check
Regressions fail the build. The dashboard shows the diff between the PR run and main, so reviewers can decide whether the regression is intentional.
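To make the idea concrete, here is a minimal Python sketch of what deterministic judges and a PR-versus-main comparison could look like. It is an illustration only, not PromptGate's implementation: the names `contains`, `matches`, `json_has_keys`, `pass_rate`, and `is_regression` are hypothetical, the key check stands in for full JSON schema validation, and the `tolerance` parameter is an assumption.

```python
import json
import re
from typing import Callable

# Hypothetical judge type: takes one model output, returns True on pass.
Judge = Callable[[str], bool]


def contains(expected: str) -> Judge:
    """String judge: passes when the output contains the expected text."""
    return lambda output: expected in output


def matches(pattern: str) -> Judge:
    """Regex judge: passes when the pattern is found anywhere in the output."""
    compiled = re.compile(pattern)
    return lambda output: compiled.search(output) is not None


def json_has_keys(required: set[str]) -> Judge:
    """Simplified stand-in for a JSON schema judge: the output must parse
    as a JSON object that contains every required key."""
    def judge(output: str) -> bool:
        try:
            parsed = json.loads(output)
        except json.JSONDecodeError:
            return False
        return isinstance(parsed, dict) and required.issubset(parsed.keys())
    return judge


def pass_rate(judge: Judge, outputs: list[str]) -> float:
    """Fraction of outputs that the judge accepts."""
    return sum(judge(o) for o in outputs) / len(outputs)


def is_regression(main_rate: float, pr_rate: float, tolerance: float = 0.0) -> bool:
    """Assumed rule: the PR regresses when its pass rate falls below
    the pass rate on main by more than the tolerance."""
    return pr_rate < main_rate - tolerance


# Example: outputs collected from the same eval cases on main and on the PR.
judge = json_has_keys({"answer", "confidence"})
main_outputs = ['{"answer": "yes", "confidence": 0.9}', '{"answer": "no", "confidence": 0.4}']
pr_outputs = ['{"answer": "yes"}', "not json"]

failed = is_regression(pass_rate(judge, main_outputs), pass_rate(judge, pr_outputs))
```

In this example `failed` is true, which is the point at which a status check would go red and a reviewer would look at the diff to decide whether the change is intentional.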
Join the waitlist
We are gauging demand before we ship. Tell us which AI feature in your product is hardest to test today and we will email you when PromptGate is open for installs.
FAQ
Does PromptGate run my LLM calls for me?
No. You bring your own LLM provider key (OpenAI today, more providers on the roadmap). PromptGate orchestrates the eval runs and reports pass/fail. Your prompts, your data, your provider, your bill.
What does the free tier cover?
One repo, one eval suite, 100 LLM-judge runs per month, and 30 days of run history. That is enough to prove the workflow on a side project or a single production path.
What about GitLab or Bitbucket?
GitHub at launch. GitLab support is on the roadmap for later this year. Bitbucket is demand-driven.
How do you store my evals?
Eval configs live in your repo as promptgate.yaml. We store run results (pass/fail, judge output, durations) in our managed Postgres with an S3-compatible artifact store behind it. Inputs and outputs you mark as sensitive can be hashed before storage.
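As an illustration of hashing sensitive fields before storage, the Python sketch below replaces marked fields with a keyed SHA-256 digest. The page does not specify which hash or scheme PromptGate uses; the choice of HMAC-SHA256, the function name `redact_sensitive`, the field names, and the `hmac-sha256:` prefix are all assumptions made for this example.

```python
import hashlib
import hmac


def redact_sensitive(record: dict[str, str], sensitive: set[str], key: bytes) -> dict[str, str]:
    """Return a copy of the record in which every sensitive field is
    replaced by a keyed digest. Other fields are copied unchanged."""
    stored = {}
    for field, value in record.items():
        if field in sensitive:
            # A keyed hash (HMAC) makes short inputs harder to guess by
            # brute force than a plain unsalted hash would.
            digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
            stored[field] = f"hmac-sha256:{digest}"
        else:
            stored[field] = value
    return stored


# Hypothetical run record; "input" and "output" are marked sensitive.
record = {"case": "refund-policy-03", "input": "customer email text", "output": "model reply"}
stored = redact_sensitive(record, {"input", "output"}, key=b"example-secret-key")
```

Because the digest is deterministic for a given key, two runs with the same input still compare equal in the stored results, while the original text is not kept.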
Who is building this?
Yelle Software, a small custom-software studio. PromptGate is our first own-IP SaaS.