CI for AI features

Fail the build when an AI feature regresses.

PromptGate is a GitHub App that turns prompt and model changes into a required CI status check, so a quality regression fails the build the same way a failing unit test does. Catch silent regressions before your users do.

Join the waitlist | See how it works

GitHub App | Free tier · 1 repo | BYO LLM key

Built by Yelle Software

promptgate / regression-check

  ✗ rag-answer-quality            judge: gpt-4o   p50  -7.2%
  ✗ tool-call-shape (search_docs) regex          1 fail / 12
  ✓ refusal-on-pii                deterministic  ok
  ✓ json-schema-valid             deterministic  ok
Required check failed. Merge is blocked until the 2 failing checks (of 4) are resolved.

Who it is for

Teams shipping AI to production

Engineering managers

You own a repo with AI-touching code paths and merge prompt or model changes weekly. Evals live in a notebook. You need an automated regression bar between merge and prod.

Platform and staff engineers

You want CI to fail loudly when a colleague swaps a model or tweaks a system prompt and quality drops, without building an in-house eval platform.

Teams at 20–200 engineers

AI features are in production and customers depend on them. "We tested it locally" is no longer a release strategy.

How it works

Three steps to a required check

Install the GitHub App

Grant access to the repos with AI code paths.

Drop in promptgate.yaml

Point at one or more eval suites. Deterministic judges (string, regex, JSON schema, function-call shape) and LLM judges (BYO key) both work out of the box.
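
To make the deterministic judge types concrete, here is a minimal Python sketch of what each kind of check does. It is an illustration only, not PromptGate's implementation: every function name and parameter is hypothetical, the JSON check is a simplified stand-in for real JSON Schema validation (required keys and types only), and the tool call is assumed to be a dict with `name` and `arguments` keys.

```python
import json
import re


def string_judge(output: str, expected_substring: str) -> bool:
    """Pass if the model output contains the expected text."""
    return expected_substring in output


def regex_judge(output: str, pattern: str) -> bool:
    """Pass if the pattern matches anywhere in the model output."""
    return re.search(pattern, output) is not None


def json_fields_judge(output: str, required: dict) -> bool:
    """Simplified stand-in for JSON Schema validation.

    Pass if the output parses as a JSON object and every required key
    is present with a value of the expected Python type.
    """
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    return all(
        key in data and isinstance(data[key], expected_type)
        for key, expected_type in required.items()
    )


def tool_call_judge(call: dict, name: str, required_args: set) -> bool:
    """Pass if the call targets the expected tool and supplies every required argument.

    Assumes `call` looks like {"name": ..., "arguments": {...}} (hypothetical shape).
    """
    arguments = call.get("arguments") or {}
    return call.get("name") == name and required_args <= set(arguments)
```

For example, `tool_call_judge({"name": "search_docs", "arguments": {"query": "refund policy"}}, "search_docs", {"query"})` passes, while the same call with empty arguments fails. Checks like these need no LLM key and give the same verdict on every run.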

Every PR gets a required status check

Regressions fail the build. The dashboard shows the diff between the PR run and main so reviewers can decide whether the regression is intentional.
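
One way to picture the PR-versus-main comparison is as a relative change in the median (p50) judge score. The sketch below rests on assumptions: the function names, the 5% tolerance, and the use of the median are illustrative choices, not PromptGate's documented behavior.

```python
from statistics import median


def p50_change(main_scores: list, pr_scores: list) -> float:
    """Relative change in median score from main to the PR run (-0.10 means 10% worse)."""
    baseline = median(main_scores)
    if baseline == 0:
        raise ValueError("baseline median is zero; relative change is undefined")
    return (median(pr_scores) - baseline) / baseline


def is_regression(main_scores: list, pr_scores: list, tolerance: float = 0.05) -> bool:
    """Flag a regression when the median drops by more than the tolerance.

    The 5% default is a hypothetical threshold chosen for illustration.
    """
    return p50_change(main_scores, pr_scores) < -tolerance


# Judge scores for the same eval cases on main and on the PR branch.
main_run = [0.80, 0.90, 1.00]
pr_run = [0.70, 0.81, 0.95]
print(f"p50 change: {p50_change(main_run, pr_run):+.1%}")
print("regression" if is_regression(main_run, pr_run) else "ok")
```

A tolerance of some kind matters for LLM judges because their scores vary from run to run; deterministic checks can simply compare pass/fail counts.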

Early access

Join the waitlist

We are validating intent before we ship. Tell us which AI feature in your product is hardest to test today and we will email you when PromptGate is open for installs.


FAQ

Common questions

Does PromptGate run my LLM calls for me?

No. You bring your own LLM provider key (OpenAI today, more providers on the roadmap). PromptGate orchestrates the eval runs and reports pass/fail. Your prompts, your data, your provider, your bill.

What does the free tier cover?

One repo, one eval suite, 100 LLM-judge runs per month, and 30 days of run history. Enough to prove the workflow on a side project or a single production path.

What about GitLab or Bitbucket?

GitHub only at launch. GitLab support is on the roadmap for later this year. Bitbucket support will depend on demand.

How do you store my evals?

Eval configs live in your repo as promptgate.yaml. We store run results (pass/fail, judge output, durations) in our managed Postgres with an S3-compatible artifact store behind it. Inputs and outputs you mark as sensitive can be hashed before storage.
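
As a rough illustration of hashing sensitive fields before storage, the Python sketch below replaces marked fields in a run record with a SHA-256 digest. The record layout, field names, and `sha256:` prefix are assumptions made for the example, not PromptGate's actual storage format.

```python
import hashlib


def hash_sensitive_fields(record: dict, sensitive: set) -> dict:
    """Return a copy of the record with sensitive fields replaced by SHA-256 digests.

    Non-sensitive fields (verdicts, durations) pass through unchanged, so
    run history stays comparable without keeping the raw text.
    """
    stored = {}
    for key, value in record.items():
        if key in sensitive:
            digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
            stored[key] = f"sha256:{digest}"
        else:
            stored[key] = value
    return stored


# Hypothetical run record: only the input is marked sensitive.
run = {"input": "customer email body", "verdict": "pass", "duration_ms": 412}
print(hash_sensitive_fields(run, {"input"}))
```

Identical inputs hash to the same digest, so two runs can still be matched on the same eval case. A plain hash of short or guessable text can be reversed by brute force, so a keyed hash such as HMAC is the safer choice for that kind of data.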

Who is building this?

Yelle Software, a small custom-software studio. PromptGate is our first SaaS product of our own.
