QualityPilot

Vercel breach, April 2026 — what we did in the first 60 minutes

A Vercel employee's OAuth handoff to a compromised AI tool leaked NPM tokens, GitHub keys and DB passwords for 500+ staff plus customer credentials. Here's our exact response playbook — four secrets rotated inside an hour via API, zero downtime.

security · vercel · incident-response · devops · oauth

Yesterday DOU covered the Vercel breach in detail. Short version: a Vercel employee connected an AI tool (Context.ai) to their Google Workspace via OAuth. Context.ai got compromised. Attackers inherited the employee's access to Vercel's internal infrastructure. Environment variables marked "non-sensitive" turned out to be stored unencrypted — NPM tokens, GitHub keys, DB passwords, signing keys for 500+ staff plus some customer credentials walked out the door. The hacker wants $2M to keep quiet.

As a team whose production lives on Vercel, we had to ask two questions: how bad is it for us, and what do we do right now? This is the exact playbook we ran, with enough detail that you can copy the shape if you're in the same seat.

The threat model for a customer on Vercel Pro

The breach primarily exposed Vercel's own employee and infrastructure secrets rather than customers' deployed secrets. But that's only the headline. The honest threat model is broader:

  1. An employee at your cloud provider can read environment variables marked a certain way. That architecture flaw exists until fixed; "we never read the sensitive ones" is an honor-system promise.
  2. The specific compromise path (employee OAuth handoff to AI tool) reveals a governance gap that's likely not unique to this employee.
  3. Attackers now have a confirmed working vector into Vercel-class supply chains. Copycats will try it on every cloud vendor with "employee OAuth to LLM" culture.

You can't fix Vercel's security for them. You can assume your secrets were borrowed and act accordingly.

Our response in the first 60 minutes

Minute 0 — triage

Every environment variable currently in our Vercel project, graded by blast radius if leaked:

| Secret | Blast radius | Rotate priority |
|---|---|---|
| STRIPE_SECRET_KEY | Can charge / refund any customer | High (but high-risk to rotate mid-flight) |
| STRIPE_WEBHOOK_SECRET | Can forge subscription events | High |
| OPENAI_API_KEY | Can burn our LLM budget | High |
| SUPABASE_SERVICE_ROLE_KEY | Full read/write on DB bypassing RLS | High |
| GITHUB_API_TOKEN | Can read/write our private repos | High |
| CLOUDFLARE_API_TOKEN | Can modify our DNS | Medium (scoped to DNS edit) |
| RESEND_API_KEY | Can send email from our domain | Medium |
| NEXTAUTH_SECRET | Can forge user JWT sessions | Medium |
| CRON_SECRET | Can trigger our cron endpoints | Medium |

Triage rule we picked: rotate anything we can do via API without downtime; document everything else as an operator checklist for the person who has browser access to each provider.
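The triage rule can be written down as a small lookup so nobody has to decide under pressure. A minimal sketch, assuming the split this post ended up with; the function name `rotation_path` and its three labels are hypothetical, not part of any tool:

```shell
# Hypothetical helper: maps each secret from the triage table to the
# rotation path described in this post.
rotation_path() {
  case "$1" in
    NEXTAUTH_SECRET|CRON_SECRET|RESEND_API_KEY)
      echo "api" ;;        # rotated end to end via REST/CLI
    CLOUDFLARE_API_TOKEN)
      echo "hybrid" ;;     # token created in the browser, pushed via REST
    STRIPE_SECRET_KEY|STRIPE_WEBHOOK_SECRET|OPENAI_API_KEY|SUPABASE_SERVICE_ROLE_KEY|GITHUB_API_TOKEN)
      echo "operator" ;;   # deferred to the operator checklist
    *)
      echo "unknown" ;;    # not triaged yet: look at it first
  esac
}
```

Running `rotation_path CRON_SECRET` prints `api`; anything not in the table prints `unknown`, which is a useful prompt to triage it before touching anything else.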

Minute 5–15 — NEXTAUTH_SECRET via Vercel API

NEXTAUTH_SECRET is a shared HMAC secret. Rotating it takes two steps: generate a new value with openssl rand -base64 48, then push it through the Vercel REST API.

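# Generate a fresh 48-byte secret, base64-encoded (64 characters).
NEW=$(openssl rand -base64 48)

# upsert=true overwrites the existing variable; without it, creating a
# key that already exists on the same target is expected to fail.
curl -sS -X POST \
  "https://api.vercel.com/v10/projects/$PID/env?upsert=true" \
  -H "Authorization: Bearer $VERCEL_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{\"key\":\"NEXTAUTH_SECRET\",\"value\":\"$NEW\",\"type\":\"encrypted\",\"target\":[\"production\"]}"

# Drop the secret from the shell session.
unset NEW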

Side effect: every active NextAuth JWT gets invalidated. In stealth mode with 0 paying users, that's free. If you run this in production with live users, warn them first and pick a low-traffic window.

One quirk: vercel env add in Vercel CLI v51.6.1 is buggy for preview-scoped variables. It keeps prompting for branch selection even with --yes --value. The REST API doesn't have that problem, so if the CLI fights you, drop to REST.
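The hand-escaped JSON in the curl call above is easy to get wrong when typing fast. A minimal sketch of a payload builder, assuming the same body shape the post uses (key, value, type, target); the name `vercel_env_payload` is hypothetical, and the helper does no JSON escaping, so it is only safe for values like the hex and base64 output of openssl rand:

```shell
# Hypothetical helper: prints the JSON body for the Vercel env endpoint.
# Usage: vercel_env_payload KEY VALUE TARGET [TARGET...]
# Assumes KEY, VALUE and the targets contain no quotes or backslashes
# (true for hex and base64 secrets); nothing is escaped here.
vercel_env_payload() {
  key=$1
  value=$2
  shift 2
  targets=""
  for t in "$@"; do
    # Append each target as a quoted, comma-separated JSON string.
    targets="${targets:+$targets,}\"$t\""
  done
  printf '{"key":"%s","value":"%s","type":"encrypted","target":[%s]}\n' \
    "$key" "$value" "$targets"
}
```

It plugs into the existing call as `-d "$(vercel_env_payload NEXTAUTH_SECRET "$NEW" production)"`.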

Minute 15–25 — CRON_SECRET with GitHub Actions sync

Our cron endpoints validate Authorization: Bearer ${CRON_SECRET}. Two sides consume this secret:

  1. Vercel's own Cron Jobs feature (reads from project env automatically)
  2. A GitHub Actions workflow that pings the same endpoints as a failover

If you rotate only on Vercel, the GitHub Actions cron starts failing with 401 until you also update the repo secret. The sequence, run back to back so both sides change together:

NEW=$(openssl rand -hex 32)

# 1. Update Vercel env via REST (all 3 targets).
#    upsert=true overwrites the existing value instead of failing on a duplicate key.
for env in production preview development; do
  curl -sS -X POST "https://api.vercel.com/v10/projects/$PID/env?upsert=true" \
    -H "Authorization: Bearer $VERCEL_TOKEN" \
    -H "Content-Type: application/json" \
    -d "{\"key\":\"CRON_SECRET\",\"value\":\"$NEW\",\"type\":\"encrypted\",\"target\":[\"$env\"]}"
done

# 2. Sync to GitHub Actions repo secret
echo "$NEW" | gh secret set CRON_SECRET --repo YOUR_ORG/YOUR_REPO

unset NEW

Both sides get the same new value. Next scheduled cron run uses the new secret on both paths. No cron is missed.
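A quick way to confirm the rotation landed is to hit a cron endpoint with both values. A minimal sketch, assuming the endpoint answers 200 for a valid bearer secret and 401 for an invalid one (the post only confirms the 401); the function names and the URL in the usage line are placeholders. Calling the endpoint runs the job, so point this at an idempotent one:

```shell
# Hypothetical check: prints the HTTP status a cron endpoint returns
# for a given bearer secret. $1 = endpoint URL, $2 = secret.
cron_auth_status() {
  curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $2" "$1"
}

# Succeeds only if the new secret ($2) is accepted and the old one ($3)
# is rejected by the endpoint at $1.
cron_rotation_ok() {
  [ "$(cron_auth_status "$1" "$2")" = "200" ] &&
  [ "$(cron_auth_status "$1" "$3")" = "401" ]
}
```

Usage would look like `cron_rotation_ok https://YOUR_APP/api/cron/YOUR_JOB "$NEW" "$OLD" && echo rotated`, run before the `unset`.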

Minute 25–40 — RESEND_API_KEY with key revocation

Resend's API lets you create and delete API keys programmatically. Full four-step rotation:

# 1. Create new key (authenticated with the OLD key)
NEW=$(curl -s -X POST \
  -H "Authorization: Bearer $OLD_RESEND_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name":"Production (rotated 2026-04-21)","permission":"full_access"}' \
  https://api.resend.com/api-keys | jq -r .token)

# 2. Push to Vercel envs (upsert=true overwrites the existing value)
for env in production preview development; do
  curl -sS -X POST "https://api.vercel.com/v10/projects/$PID/env?upsert=true" \
    -H "Authorization: Bearer $VERCEL_TOKEN" -H "Content-Type: application/json" \
    -d "{\"key\":\"RESEND_API_KEY\",\"value\":\"$NEW\",\"type\":\"encrypted\",\"target\":[\"$env\"]}"
done

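# 3. Delete the old key (authenticated with the NEW key).
#    OLD_KEY_UUID is the old key's id, copied from the key list response.
curl -s -X DELETE \
  -H "Authorization: Bearer $NEW" \
  "https://api.resend.com/api-keys/$OLD_KEY_UUID"

# 4. Clear the new key from the shell session
unset NEW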

If the attacker exfiltrated the old Resend key, it's dead by step 3, and they can no longer send email from your domain. Note that Resend expects the key UUID with hyphens in the DELETE path; it rejects the un-hyphenated hex string form from NPM packaging with a 422 validation_error. Copy-paste the UUID exactly as it appears in the list response.
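A cheap guard against the 422 is to check the id's shape before sending the DELETE. A minimal sketch; the function name `is_hyphenated_uuid` is hypothetical and the check only covers the 8-4-4-4-12 layout, not whether the key exists:

```shell
# Succeeds only for the hyphenated 8-4-4-4-12 hex form of a UUID.
is_hyphenated_uuid() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}
```

In the rotation flow this becomes `is_hyphenated_uuid "$OLD_KEY_UUID" || { echo "bad key id" >&2; exit 1; }` just before step 3.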

Minute 40–55 — CLOUDFLARE_API_TOKEN with expanded scope

Our old token was scoped narrowly to Zone:DNS:Edit on one zone — low blast radius, but rotate anyway. Since we were creating a new one, we took the chance to add Zone:Analytics:Read so our automated dashboards can finally pull DNS query metrics. Least privilege still, just with one more verb.

Cloudflare's API has a quirk: the User:API Tokens:Write permission is required to create tokens via API, but the free plan doesn't expose that permission easily to fine-grained tokens. Practically, new token creation is a browser-only step. So this one was a hybrid:

  1. Create new token via Cloudflare dashboard (browser, 2 minutes)
  2. Paste the value into our CLI flow, push to Vercel envs via REST
  3. Revoke the old token via dashboard (browser, 30 seconds)
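Between steps 1 and 2 it is worth confirming the pasted token is live before it goes anywhere near production. A minimal sketch using Cloudflare's token verify endpoint for user API tokens; the function name is hypothetical, and the status field is matched with grep so the sketch has no jq dependency:

```shell
# Succeeds if Cloudflare reports the token in $1 as active.
# Looks for "status":"active" in the verify response body.
cloudflare_token_active() {
  curl -s -H "Authorization: Bearer $1" \
    "https://api.cloudflare.com/client/v4/user/tokens/verify" |
    grep -Eq '"status" *: *"active"'
}
```

Running the same check against the old token after step 3 should fail, which confirms the revocation took effect.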

Minute 55–60 — smoke test

Every rotated secret needs a verification step. Ours was three curls:

curl -s -o /dev/null -w "%{http_code}\n" https://www.qlens.dev/                  # 200
curl -s -o /dev/null -w "%{http_code}\n" https://www.qlens.dev/api/status       # 200
curl -s -o /dev/null -w "%{http_code}\n" https://www.qlens.dev/api/stripe-webhook  # 405 (no POST body, signature missing; correct)

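If any of these flipped, we'd know immediately which secret was now out of sync between our services. They were green. Deploy was Ready in 44 seconds via vercel deploy --prod. The redeploy is part of the rotation, not an afterthought: Vercel applies environment variable changes only to new deployments, so running functions keep the old values until it completes.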
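Eyeballing three status codes works once; for a runbook, a version that fails loudly is easier to trust. A minimal sketch that takes URL=STATUS pairs and returns non-zero on any mismatch; the function name `smoke_test` and the pair format are choices made for this example:

```shell
# Prints one line per URL and returns non-zero if any status differs
# from the expected one. Arguments are URL=EXPECTED_STATUS pairs.
smoke_test() {
  failed=0
  for pair in "$@"; do
    url=${pair%=*}      # everything before the last "="
    want=${pair##*=}    # everything after the last "="
    got=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    if [ "$got" = "$want" ]; then
      echo "ok   $got $url"
    else
      echo "FAIL $got (want $want) $url"
      failed=1
    fi
  done
  return "$failed"
}
```

With the three checks from this post: `smoke_test https://www.qlens.dev/=200 https://www.qlens.dev/api/status=200 https://www.qlens.dev/api/stripe-webhook=405`.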

What we deliberately didn't rotate

STRIPE_SECRET_KEY: rotating this mid-flight can break in-progress checkout.session.completed webhooks, Customer Portal sessions that are already open, and subscription management calls the Stripe Dashboard makes on our behalf. The correct sequence requires scheduling a low-traffic window, pausing webhook delivery briefly, and coordinating with any live sessions. We flagged this for an operator-scheduled rotation instead of a panic-rotate.

STRIPE_WEBHOOK_SECRET: same reason, although Stripe provides a "rolling" grace period in which both the old and new secrets verify for up to 24 hours. That makes it slightly safer than the main key, but we still wanted a quiet moment.

GITHUB_API_TOKEN and SUPABASE_SERVICE_ROLE_KEY: the GitHub token is user-scoped and the Supabase key is tied to the project, and both can only be regenerated via the provider's web dashboard. Browser-only steps, flagged for the operator.

OPENAI_API_KEY: same, creation is dashboard-only. An attacker with the old key could rack up an LLM bill, but our usage metering would flag a sudden 100× spend within hours, so the observable-damage window is bounded.

The four things we'll do differently going forward

1. Every new secret gets an API-rotation runbook on day one

For each provider, we now document "how to rotate via API without downtime" alongside the install steps. Resend and Vercel are trivially rotatable. Stripe has the rolling-grace pattern. OpenAI is browser-only and we accept it. Writing the runbook takes 15 minutes per provider; it saves an hour during an actual incident.

2. Mark all secrets as Vercel "Sensitive"

Vercel's --sensitive flag stores the variable in a way that cannot be retrieved via vercel env pull — only the running function reads it at runtime. The breach specifically exposed "non-sensitive" variables that were stored unencrypted. Anything that's actually a secret should use --sensitive. (Vercel doesn't allow it on the development scope, which is fine — dev doesn't need real secrets.)
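For REST-based rotation the same rule can be encoded per target. A minimal sketch, assuming the REST endpoint accepts `sensitive` as a `type` value in the same payload shape used earlier in this post; the function name `vercel_env_type` is hypothetical:

```shell
# Prints the variable type to request for a given Vercel target:
# "sensitive" where Vercel allows it, "encrypted" for development.
vercel_env_type() {
  case "$1" in
    production|preview) echo "sensitive" ;;
    development)        echo "encrypted" ;;
    *) echo "unknown target: $1" >&2; return 1 ;;
  esac
}
```

Inside the per-target loop, `\"type\":\"$(vercel_env_type "$env")\"` would replace the hard-coded `encrypted`.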

3. Quarterly secret-rotation schedule

Even without a breach, long-lived secrets compound risk. We're locking in a quarterly rotation for everything with an API path, and annual for everything browser-only.
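A schedule only holds if something checks it. A minimal sketch of an age check, assuming you record each secret's last rotation as a Unix timestamp somewhere (where that record lives is up to you); the function name `rotation_overdue` is hypothetical:

```shell
# Succeeds if a secret last rotated at $1 (Unix seconds) is older than
# $2 days as of $3 (Unix seconds; defaults to the current time).
rotation_overdue() {
  now=${3:-$(date +%s)}
  age_days=$(( ($now - $1) / 86400 ))
  [ "$age_days" -gt "$2" ]
}
```

For the schedule above that means a 90-day limit for API-rotatable secrets and 365 for browser-only ones, for example `rotation_overdue "$LAST_ROTATED" 90 && echo "CRON_SECRET is overdue"`.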

4. Audit OAuth grants on personal Google accounts

The Vercel employee's compromise started with their personal Google Workspace OAuth grants. Everyone on our team does this check now, monthly:

  • https://myaccount.google.com/permissions — revoke anything you don't recognize
  • If you're a Google Workspace admin, also: https://admin.google.com/ac/owl — for domain-wide grants
  • Specifically watch for AI tools: Context.ai was the vector here; similar tools (note-takers, calendar AIs, email assistants) all request similar scopes and are a growing attack surface

What this doesn't solve

We rotated our keys. We didn't — and can't — audit Vercel's internal incident response. If their architecture still stores "non-sensitive" variables unencrypted after this, the next breach will expose the same class of data. That's their problem to fix, not ours.

What we can control: treat every secret as already known to an adversary, and design for that assumption. That's defense-in-depth: short-lived tokens, scoped permissions, rotation runbooks, and monitoring that catches unusual API usage from your own keys.

Timeline

  • Apr 19 — DOU publishes the Vercel breach article
  • Apr 20, 11:30 UTC — we start triage
  • Apr 20, 12:30 UTC — four of nine priority secrets rotated via API; operator checklist written for the rest
  • Apr 20, 12:35 UTC — production deploy with new secrets, smoke test green
  • Apr 20, 13:00 UTC — this post

No customer impact. No downtime. Inside an hour.

If you're a Vercel customer reading this today and you haven't done your triage yet — the playbook above works. Copy it. The cost of rotating secrets you don't need to rotate is negligible. The cost of skipping one you did need to is the story that ends up on DOU tomorrow.

Our test-health scanner works on any public GitHub repo at qlens.dev/scan — if you want to see the product that owns the infrastructure we just secured. Five free attempts per day per IP, no signup.



About QualityPilot

QualityPilot watches your CI for failed tests and proposes a fix as a GitHub PR. You merge or you don't — no auto-merge, no fluff. See how it works.
