Re-calibrated against the latest models, every quarter

Hire engineers who can actually think.

Everyone has AI now, so resumes mean nothing. Crackd puts candidates in a real codebase for 90 minutes — AI and all — and shows who can really build.

AI allowed and logged · 90-minute screens · Pay per screen, no seats

The signals you used to trust are gone.

Resume, take-home, timed round — one open browser tab clears all three. So you advance people on borrowed signal, and a wrong senior hire still costs six figures.

The question isn't whether someone used AI. It's whether they can tell when it's about to walk them off a cliff.

What AI already does for them

A resume that clears every keyword filter
A clean, well-commented take-home
Most LeetCode-style problems, instantly
Confident, fluent answers in a screen share

What it can't do for them

Own a real problem under pressure, push back on a bad spec, and catch the bug the model swore wasn't there.

Take the screen yourself

Here's a real PR. See what you catch.

Click any line you'd comment on, then submit. We'll line you up against a top-decile candidate and an AI reviewing cold. About thirty seconds.

PR #347 — add favorites endpoint
routes/favorites.py

Your review

Top candidates leave three to five meaningful comments. There are deliberate red herrings too — flag one and it costs you.

One outright security hole, one concurrency bug, one smaller correctness issue. Go find them.

How it runs

Four steps, and almost none of them are yours.

01 · Set up the screen

Pick a role preset, nudge a few sliders, add any hard requirements. Five minutes.

02 · Send one link

Drop in emails or share one invite URL. No account for them to create.

03 · They work, you don’t

A real PR, a live bug, and a judgment call. AI is on and fully logged, and the whole session is recorded.

04 · Read the report

A composite score, a five-axis breakdown, and the moments that mattered. Ten-minute call.

Good engineers were never the ones who typed the fastest.

Interviews rewarded speed, recall, and a tolerance for unpaid take-homes — none of it the actual job, and all of it now free from a model.

What moves your roadmap is judgment: knowing what's worth building, and noticing when a confident answer is quietly wrong. AI doesn't have that. It borrows yours.

So we hand a candidate a real problem with AI switched on, and grade the decisions they make. Everything else was always noise.

Find the cracked ones.

Interactive · live scoring

One answer. Score it against your team, not a generic rubric.

Drag the dials to match how your team works. The same answer, re-scored live — because what fits a ship-fast team is wrong for one that gates every deploy.

Your team's style

Terse ↔ Thorough

Looks good. Two concerns: (1)..., (2)..., Tradeoffs OK if [...].

Blunt ↔ Diplomatic

"I’m not convinced this approach holds at scale. My reasoning:"

Minimal ↔ Extensive

Short summary + tests mentioned.

Ship-fast ↔ Cautious

"Behind a flag with a rollback plan."

Independent ↔ AI-leveraged

Pair-programs with AI; accepts ~50%; drives high-level decisions.

Candidate response

Looks solid overall. Two concerns: (1) `user_id` comes from query params — that's an auth bypass, should come from the session/JWT. (2) The favorites fetch is going to N+1 at scale, a JOIN would fix it. Otherwise good — happy to walk through alternatives if helpful.

Style match

90 / 100

Strong fit

Cultural fit is one axis among five — capped at 10% of the composite, so it never overrules technical signal.
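
As a rough illustration of how a cap like that could work, here is a minimal sketch. The axis names are the five from the report on this page and the 10% cap is the figure stated above. Everything else is an assumption for illustration, not Crackd's actual scoring code: the function names, the raw weights, and the choice to hand any excess weight to the other four axes in proportion.

```python
# Axis names come from the report on this page; all weights are hypothetical.
AXES = (
    "technical_competence",
    "judgment",
    "communication",
    "ai_fluency",
    "cultural_fit",
)
CULTURAL_FIT_CAP = 0.10  # stated on the page: capped at 10% of the composite


def capped_weights(raw: dict[str, float]) -> dict[str, float]:
    """Normalize weights to sum to 1, then cap cultural fit at 10%."""
    total = sum(raw[a] for a in AXES)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    w = {a: raw[a] / total for a in AXES}

    excess = w["cultural_fit"] - CULTURAL_FIT_CAP
    if excess > 0:
        others = [a for a in AXES if a != "cultural_fit"]
        other_total = sum(w[a] for a in others)
        if other_total == 0:
            raise ValueError("at least one non-cultural axis needs weight")
        w["cultural_fit"] = CULTURAL_FIT_CAP
        for a in others:
            # Assumed rule: redistribute the excess in proportion to weight.
            w[a] += excess * w[a] / other_total
    return w


def composite(scores: dict[str, float], raw_weights: dict[str, float]) -> float:
    """Weighted 0-100 composite using the capped weights."""
    w = capped_weights(raw_weights)
    return sum(scores[a] * w[a] for a in AXES)
```

With equal raw weights, a candidate scoring 80 on the four other axes and 0 on cultural fit still lands at about 72, and a perfect cultural-fit score with zeros elsewhere tops out at about 10. Under these assumptions, style match can nudge the composite but cannot carry it.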

What you get back

A report you can decide from in ten minutes.

One composite score with a confidence read, a five-axis breakdown, and the few moments that moved it — each timestamped to the replay. It fits on a page.

Candidate Report · 2026-05-22 14:23

Sarah K. · Senior Backend Engineer

Strong yes · High confidence

Composite · score out of 100

Technical competence
Judgment
Communication
AI fluency
Cultural fit

Standout moments

  • 3:12

    Spotted the auth bypass on line 8 within 3:12 of starting

    Top 5% of historical candidates

  • 18:47

    Asked the right clarifying question before approving

    “Is rate limiting in scope for this PR?” — top 15%

  • 24:30

    Kept an AI suggestion for the race condition without verifying

    AI hallucinated; candidate didn’t catch it. Bottom 20% on this fixture.

Replay timeline · 89 min
PR Review · 34m | Debugging · 29m | Decision · 26m

Pricing

Pay per screen. Not per seat.

One number. No tiers, no per-seat upsell. Run it on every candidate or just one — same price, same signal.

$100 / screen

Volume pricing starts at 25 screens. No subscription and no lock-in.

Every screen includes

  • PR Review + Decision Comm modules
  • Live Debugging + Code Tracing (Wave 2)
  • Configurable cultural dials
  • Full 5-axis scored report with replay
  • Standout moments analysis
  • Personalized candidate feedback
  • AI usage telemetry per session
  • Stripe billing, magic-link candidate flow
  • Unlimited workspace members

FAQ

The skeptical questions.

The ones founders and hiring managers ask before they buy.

Still on the fence? Email us directly.