CI for AI features
Fail the build when an AI feature regresses.
PromptGate is a GitHub App that turns prompt and model changes into a CI status check, the same way a failing unit test fails the build. Catch silent regressions before your users do.
Built by Yelle Software
Who it is for
Teams shipping AI to production
-
Engineering managers
You own a repo with AI-touching code paths and merge prompt or model changes weekly. Evals live in a notebook. You need an automated regression bar between merge and prod.
-
Platform and staff engineers
You want CI to fail loudly when a colleague swaps a model or tweaks a system prompt and quality drops, without building an in-house eval platform.
-
Teams of 20–200 engineers
AI features are in production and customers depend on them. "We tested it locally" is no longer a release strategy.
How it works
Three steps to a required check
-
Install the GitHub App
Grant access to the repos with AI code paths.
-
Drop in promptgate.yaml
Point at one or more eval suites. Deterministic judges (string, regex, JSON schema, function-call shape) and LLM judges (BYO key) both work out of the box.
-
Every PR gets a required status check
Regressions fail the build. The dashboard shows the diff between the PR run and main so reviewers can decide whether the regression is intentional.
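To make the steps above concrete, here is a minimal Python sketch of what deterministic judges and a PR-versus-main comparison could look like. It is an illustration only, not PromptGate's implementation: the `Judge` type, the helper names (`contains`, `matches`, `json_has_keys`, `run_suite`, `regressions`), and the sample cases are all hypothetical, and the JSON check is a simplified stand-in for real JSON schema validation.

```python
import json
import re
from typing import Callable

# Hypothetical judge signature: takes the model output, returns pass/fail.
Judge = Callable[[str], bool]


def contains(expected: str) -> Judge:
    """String judge: passes when the output contains the expected text."""
    return lambda output: expected in output


def matches(pattern: str) -> Judge:
    """Regex judge: passes when the pattern is found anywhere in the output."""
    compiled = re.compile(pattern)
    return lambda output: compiled.search(output) is not None


def json_has_keys(required: dict[str, type]) -> Judge:
    """Simplified stand-in for a JSON schema judge: required keys and types."""
    def judge(output: str) -> bool:
        try:
            data = json.loads(output)
        except json.JSONDecodeError:
            return False
        if not isinstance(data, dict):
            return False
        return all(isinstance(data.get(k), t) for k, t in required.items())
    return judge


def run_suite(outputs: dict[str, str], judges: dict[str, Judge]) -> dict[str, bool]:
    """Apply each case's judge to that case's output."""
    return {case: judges[case](text) for case, text in outputs.items()}


def regressions(main: dict[str, bool], pr: dict[str, bool]) -> list[str]:
    """Cases that passed on main but fail (or are missing) on the PR."""
    return sorted(c for c, ok in main.items() if ok and not pr.get(c, False))


# Hypothetical suite: one judge per named case.
judges = {
    "greeting": contains("Hello"),
    "order_id": matches(r"ORD-\d{6}"),
    "tool_call": json_has_keys({"name": str, "arguments": dict}),
}

# Made-up outputs standing in for a run on main and a run on the PR branch.
main_outputs = {
    "greeting": "Hello, how can I help?",
    "order_id": "Your order ORD-123456 shipped.",
    "tool_call": '{"name": "lookup_order", "arguments": {"id": "ORD-123456"}}',
}
pr_outputs = {
    "greeting": "Hello there!",
    "order_id": "Your order ORD-123456 shipped.",
    "tool_call": '{"name": "lookup_order", "arguments": "ORD-123456"}',
}

main_results = run_suite(main_outputs, judges)
pr_results = run_suite(pr_outputs, judges)
failed = regressions(main_results, pr_results)

# A CI wrapper would typically map this to the check result.
exit_code = 1 if failed else 0
```

In this sketch the PR changes the function-call shape (`arguments` becomes a string instead of an object), so `tool_call` is the only case reported as a regression and the exit code is non-zero. Comparing against main rather than against a fixed threshold is what lets a reviewer tell a new failure from one that already existed.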
Early access
Join the waitlist
We are validating intent before we ship. Tell us which AI feature in your product is hardest to test today and we will email you when PromptGate is open for installs.
FAQ
Common questions
Does PromptGate run my LLM calls for me?
No. You bring your own LLM provider key (OpenAI today, more providers on the roadmap). PromptGate orchestrates the eval runs and reports pass/fail. Your prompts, your data, your provider, your bill.
What does the free tier cover?
One repo, one eval suite, 100 LLM-judge runs per month, and 30 days of run history. Enough to prove the workflow on a side project or a single production path.
What about GitLab or Bitbucket?
GitHub at launch. GitLab support is on the roadmap for later this year. Bitbucket is demand-driven.
How do you store my evals?
Eval configs live in your repo as promptgate.yaml. We store run results (pass/fail, judge output, durations) in our managed Postgres with an S3-compatible artifact store behind it. Inputs and outputs you mark as sensitive can be hashed before storage.
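As a rough illustration of hashing sensitive fields before storage, the Python sketch below replaces marked fields with a keyed SHA-256 digest. The source does not say which hashing scheme PromptGate uses, so the choice of HMAC-SHA256, the function names (`hash_sensitive`, `redact_record`), the record layout, and the key are all assumptions made for this example.

```python
import hashlib
import hmac


def hash_sensitive(value: str, key: bytes) -> str:
    """Return a keyed SHA-256 digest (hex) in place of the raw value."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def redact_record(
    record: dict[str, str], sensitive_fields: set[str], key: bytes
) -> dict[str, str]:
    """Hash only the fields marked sensitive; keep the rest as-is."""
    return {
        name: hash_sensitive(value, key) if name in sensitive_fields else value
        for name, value in record.items()
    }


# Hypothetical run record; field names are illustrative only.
record = {
    "case": "refund_request",
    "input": "My card number is 4111 1111 1111 1111",
    "output": "I have opened a refund for you.",
    "verdict": "pass",
}

# Placeholder key for the example; a real key would come from a secret store.
key = b"example-key-not-for-production"

stored = redact_record(record, {"input", "output"}, key)
```

Because the digest is deterministic for a given key, two runs with the same input still produce the same stored value, so changes between runs remain detectable without the raw text being kept.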
Who is building this?
Yelle Software, a small custom-software studio. PromptGate is our first own-IP SaaS.