# Founding Engineer Hiring Evaluation Kit

**Role:** Founding Engineer (Full-Stack: React + Python)
**Company:** AI Writing Assistant Startup | Seed Stage | 4 people | $2M raised
**Evaluation Philosophy:** Bias toward raw ability + speed over pedigree

---

## Part 1: Candidate Scorecard

### Scoring Scale

| Score | Label | Definition |
|-------|-------|------------|
| 1 | Weak | Significant concerns; well below the bar for this role |
| 2 | Mixed | Some positive signals but notable gaps |
| 3 | Meets Bar | Solid performance; would be effective in this area |
| 4 | Strong | Exceeded expectations; clear strength |
| 5 | Exceptional | Top-tier signal; rare talent in this dimension |

### The 6 Criteria

| # | Criterion | Weight | Source(s) | What We're Looking For |
|---|-----------|--------|-----------|----------------------|
| 1 | **Technical Velocity** | 25% | Trial project, Technical deep-dive | How fast does the candidate ship working software? Measured by how much functional scope they completed in the 4-hour trial, code quality under time pressure, and ability to make smart shortcuts without accumulating reckless debt. We care about output per unit time, not perfection. |
| 2 | **Full-Stack Depth** | 20% | Technical deep-dive, Trial project | Genuine proficiency across React and Python — not just surface familiarity. Can they architect a React component tree, manage state sensibly, write clean Python APIs, and reason about data modeling? At seed stage, there is no "that's not my layer." |
| 3 | **Product Instinct** | 20% | Product thinking round, Trial project | Do they default to asking "what does the user need?" before "what's the cleanest abstraction?" Can they make product tradeoff calls independently — scoping down, picking the right MVP cut, anticipating edge cases that matter vs. ones that don't? The trial project is the strongest signal here: did they build something a user would actually want to use, or just a technically correct shell? |
| 4 | **Learning Velocity & Adaptability** | 15% | Technical deep-dive, Founder-fit round | How fast do they ramp on unfamiliar territory? Probe for: how they approached the AI/LLM parts of the trial (if new to them), how they've navigated ambiguous or fast-changing technical domains before, and whether they show genuine curiosity vs. staying in their comfort zone. At a 4-person startup, the stack will change every quarter. |
| 5 | **Ownership & Drive** | 10% | Founder-fit round, References | Do they operate like an owner or an employee? Signals: initiative in past roles, willingness to do unglamorous work, evidence of shipping side projects or going beyond job descriptions, comfort with ambiguity, and intrinsic motivation (not just "I want to join a startup because equity"). |
| 6 | **Communication & Collaboration** | 10% | All rounds, Trial project writeup | Can they explain technical decisions clearly? Do they write coherent commit messages / PR descriptions? In a 4-person team, every person is a bottleneck — unclear communication multiplies coordination cost. Also assess: do they disagree constructively? Are they coachable? |

### Scorecard Template

```
CANDIDATE: ___________________________
DATE: ___________________________
EVALUATOR(S): ___________________________

CRITERION                          WEIGHT    SCORE (1-5)    WEIGHTED
─────────────────────────────────────────────────────────────────────
1. Technical Velocity               25%       ___           ___
2. Full-Stack Depth                 20%       ___           ___
3. Product Instinct                 20%       ___           ___
4. Learning Velocity & Adaptability 15%       ___           ___
5. Ownership & Drive                10%       ___           ___
6. Communication & Collaboration    10%       ___           ___
─────────────────────────────────────────────────────────────────────
WEIGHTED TOTAL                      100%                    ___/5.00

THRESHOLD GUIDE:
  4.0+  → Strong Hire
  3.5–3.9 → Hire (with noted risks)
  3.0–3.4 → Borderline — requires strong reference signals to proceed
  <3.0  → No Hire

HARD DISQUALIFIERS (any one = automatic No Hire):
  [ ] Score of 1 on Technical Velocity
  [ ] Score of 1 on Full-Stack Depth
  [ ] Score of 1 or 2 on Ownership & Drive
  [ ] Trial project was not functional / could not demo
  [ ] Dishonesty detected in any round

NOTES PER CRITERION:

1. Technical Velocity:
   _______________________________________________________________

2. Full-Stack Depth:
   _______________________________________________________________

3. Product Instinct:
   _______________________________________________________________

4. Learning Velocity & Adaptability:
   _______________________________________________________________

5. Ownership & Drive:
   _______________________________________________________________

6. Communication & Collaboration:
   _______________________________________________________________

OVERALL IMPRESSION / GUT CHECK:
_______________________________________________________________
```

---

## Part 2: Back-Channel Reference Check Script

**Purpose:** Validate the signals from interviews and trial project through 2 back-channel references — people who have actually worked with the candidate but were NOT provided by the candidate. These conversations are confidential and should be conducted by a founder.

**How to find back-channel references:** LinkedIn mutual connections, former colleagues at shared companies, investor network introductions, or engineering community contacts. Ideal references: a former manager and a former peer/collaborator.

### Pre-Call Setup

- **Duration:** 20–25 minutes
- **Tone:** Conversational, not interrogative. You're a founder doing diligence, not HR running a checklist.
- **Opening framing:** "We're a small seed-stage team and [Candidate] is in our process for a founding engineer role. I know you worked with them at [Company]. I'd love to get your honest perspective — this is completely confidential and I won't share anything attributable back to them."

### Reference Check Script

#### Section A: Context & Relationship (2–3 min)

1. "How did you work with [Candidate]? What was your working relationship — were you their manager, peer, or on a different team?"
2. "How long did you overlap, and what were you building together?"

*[Purpose: Establish credibility of the reference and context for their observations.]*

#### Section B: Technical Ability & Speed (5–6 min)

3. "If you had to rate [Candidate]'s raw engineering ability compared to other engineers you've worked with, where would they fall? Top 5%? Top 25%? Middle of the pack?"
4. "One thing that matters a lot to us is speed — how fast someone can go from idea to working software. How would you describe [Candidate]'s pace? Can you give me a specific example?"
5. "How did they handle working across the stack? Were they genuinely comfortable in both frontend and backend, or did they gravitate to one side?"

*[Maps to: Technical Velocity, Full-Stack Depth]*

#### Section C: Product Sense & Judgment (4–5 min)

6. "Did [Candidate] ever push back on a product decision or suggest a better approach to a feature? What happened?"
7. "When given an ambiguous problem — something where the spec was unclear or the right approach wasn't obvious — how did they handle it?"

*[Maps to: Product Instinct, Learning Velocity]*

#### Section D: Ownership & Work Ethic (4–5 min)

8. "Tell me about a time [Candidate] went beyond what was asked of them. What did that look like?"
9. "How did they handle things going wrong — a production incident, a missed deadline, a project that wasn't working? Did they step up or step back?"
10. "On a scale of 1–10, how much did [Candidate] operate like an owner versus someone executing tasks they were given?"

*[Maps to: Ownership & Drive]*

#### Section E: Collaboration & Red Flags (3–4 min)

11. "What was it like to collaborate with them day-to-day? Were they easy to work with? Any friction points?"
12. "Is there anything that might surprise us — good or bad — after working with them for 3 months?"
13. "If you were starting a company tomorrow and needed a founding engineer, would you try to recruit [Candidate]?"

*[Maps to: Communication & Collaboration. Question 13 is the single most diagnostic question — hesitation or qualification here is a strong signal.]*

#### Section F: Closing (1–2 min)

14. "Is there anything I should have asked but didn't?"
15. "Anyone else you'd recommend I speak with who worked closely with them?"

### Reference Check Scoring Sheet

```
REFERENCE: ___________________________
RELATIONSHIP TO CANDIDATE: ___________________________
COMPANY/PERIOD: ___________________________
DATE OF CALL: ___________________________
CALLER: ___________________________

SIGNAL AREA                  RATING (Strong+/+/Neutral/−/Strong−)    NOTES
──────────────────────────────────────────────────────────────────────────────
Raw technical ability         ___                                     ___
Speed / output                ___                                     ___
Full-stack comfort            ___                                     ___
Product judgment              ___                                     ___
Handles ambiguity             ___                                     ___
Ownership mentality           ___                                     ___
Collaboration / team fit      ___                                     ___
"Would you recruit them?"     ___                                     ___

RED FLAGS SURFACED:
_______________________________________________________________

OVERALL REFERENCE SIGNAL:  [ ] Strong Positive  [ ] Positive  [ ] Neutral  [ ] Negative
```

---

## Part 3: Hiring Decision Memo

### Template

---

**HIRING DECISION MEMO**

**Candidate:** [Full Name]
**Role:** Founding Engineer (Full-Stack: React + Python)
**Date:** [Date]
**Author:** [Your Name]
**Decision:** [ ] HIRE / [ ] NO HIRE

---

#### 1. Executive Summary

*[2–3 sentences. State the recommendation and the single strongest reason for or against.]*

> We recommend [HIRE / NO HIRE] for [Candidate] as Founding Engineer. [One sentence on the primary driver of this decision.] [One sentence on the key risk or key strength that tipped the balance.]

---

#### 2. Signal Synthesis

**Interview Signals:**

| Round | Key Positive Signals | Key Concerns |
|-------|---------------------|--------------|
| Technical Deep-Dive | | |
| Product Thinking | | |
| Founder-Fit | | |

**Trial Project Assessment:**

- **What they built:** [Brief description of what the candidate shipped in 4 hours]
- **Scope completed:** [What % of the brief did they cover? Did they make smart scope cuts?]
- **Code quality:** [Clean/messy/pragmatic? Would you merge this PR?]
- **Product quality:** [Would a user actually find this useful? Did they think about UX?]
- **Standout moment:** [The single most impressive thing they did]
- **Concern:** [The single most concerning thing observed]

**Reference Check Signals:**

| Reference | Relationship | Overall Signal | Key Takeaway |
|-----------|-------------|----------------|--------------|
| Reference 1 | | | |
| Reference 2 | | | |

---

#### 3. Scorecard Results

| Criterion | Score | Notes |
|-----------|-------|-------|
| Technical Velocity (25%) | /5 | |
| Full-Stack Depth (20%) | /5 | |
| Product Instinct (20%) | /5 | |
| Learning Velocity (15%) | /5 | |
| Ownership & Drive (10%) | /5 | |
| Communication (10%) | /5 | |
| **Weighted Total** | **/5.00** | |

---

#### 4. Risk Assessment

*[Identify the top 2–3 risks of making this hire. Be specific and honest.]*

| Risk | Severity (H/M/L) | Likelihood (H/M/L) | Mitigation |
|------|------------------|--------------------:|------------|
| *Example: Candidate has never worked at a company <10 people; may struggle with ambiguity and context-switching* | M | M | Pair closely with CEO for first 30 days; set explicit expectations about role fluidity in offer conversation |
| | | | |
| | | | |
| | | | |

---

#### 5. What We'd Be Betting On

*[Complete this section only for HIRE decisions. Articulate the affirmative case — what is the specific upside we believe this person brings?]*

- **Primary bet:** [e.g., "Exceptional technical velocity will let us ship v2 of the product 6 weeks faster than our current trajectory."]
- **Secondary bet:** [e.g., "Strong product instinct means they'll need less product management overhead, freeing up founders."]
- **Wildcard upside:** [e.g., "Their AI/NLP background could unlock features we haven't scoped yet."]

---

#### 6. 30/60/90 Day Onboarding Plan

**Purpose:** De-risk the hire by setting clear milestones, creating fast feedback loops, and front-loading the areas where we identified concerns.

##### Week 1: Orientation & First Ship

| Day | Activity | Owner |
|-----|----------|-------|
| Day 1 | Dev environment setup; full codebase walkthrough with CTO/technical founder; access to all tools (GitHub, Slack, Figma, monitoring, CI/CD) | CTO |
| Day 1 | "How we work" conversation: decision-making norms, communication expectations, deployment cadence, on-call expectations | CEO |
| Day 2–3 | Pick up a small, well-scoped ticket (bug fix or minor feature) and ship it to production. Goal: first PR merged within 48 hours. | Candidate + CTO |
| Day 4–5 | Ship a second, slightly larger ticket. Begin attending all product and engineering syncs. | Candidate |
| End of Week 1 | 30-minute check-in: "What's confusing? What's slowing you down? What would you change?" | CEO |

##### Days 8–30: Build Credibility & Context

| Milestone | Target | Success Criteria |
|-----------|--------|-----------------|
| Ship 3–5 meaningful PRs | By Day 20 | PRs are reviewed, merged, and deployed without major rework |
| Own a small feature end-to-end | By Day 30 | Candidate scopes, builds, tests, and deploys a user-facing feature with minimal hand-holding |
| Understand the full architecture | By Day 30 | Can draw the system diagram from memory and explain key design decisions |
| First user interaction | By Day 30 | Sit in on at least 2 user calls or read 10 user feedback threads |
| Written 30-day self-assessment | Day 30 | Candidate writes a 1-page reflection: what they've learned, what they'd improve in the product/codebase, where they want to focus next |

**30-Day Check-in Agenda:**
- Review scorecard criteria vs. observed performance
- Discuss any risks identified in Section 4 — are they materializing?
- Candidate shares their 1-page self-assessment
- Align on 60-day goals
- Explicit conversation: "Is this working for both of us?"

##### Days 31–60: Increase Scope & Autonomy

| Milestone | Target | Success Criteria |
|-----------|--------|-----------------|
| Own a medium-sized feature or initiative | By Day 45 | Candidate drives the project from problem definition through shipping, making product tradeoff calls along the way |
| Contribute to technical architecture decisions | By Day 50 | Candidate proposes at least one meaningful technical improvement (infra, DX, performance, testing) and ships it |
| Begin mentoring or pairing with any new hires/contractors | By Day 60 | If applicable — demonstrates ability to unblock others |
| Demonstrate product judgment | By Day 60 | At least one instance of proactively identifying a product improvement, not just executing specs |

**60-Day Check-in Agenda:**
- Review 60-day goals
- Discuss: "What decisions are you comfortable making alone? Where do you still want a check-in?"
- Feedback in both directions (founder to candidate AND candidate to founders)
- Set 90-day goals collaboratively — candidate should be driving this conversation

##### Days 61–90: Full Ownership

| Milestone | Target | Success Criteria |
|-----------|--------|-----------------|
| Own a major product area or workstream | By Day 75 | Candidate is the primary decision-maker for a significant surface area of the product |
| Ship a feature they conceived | By Day 90 | End-to-end: identified the opportunity, scoped it, built it, shipped it, measured it |
| Operate independently | By Day 90 | Founders are not a bottleneck for candidate's output; candidate proactively communicates status and blockers |
| Contribute to hiring | By Day 90 | Participate in evaluating the next engineering hire (technical interview or trial project review) |
| Cultural contribution | By Day 90 | Candidate has visibly shaped a team norm, process, or tool choice |

**90-Day Review:**
- Formal performance conversation
- Compare actual performance against the "bets" made in Section 5 of this memo
- Discuss: compensation review (if promised at offer), role evolution, areas for growth
- Decision point: Is this person on track to be a long-term pillar of the engineering team?

---

#### 7. Compensation & Closing Strategy

*[Notes on offer construction — not included in the evaluation but relevant to the decision.]*

- **Comp range:** [Base + equity range]
- **Key motivators for this candidate:** [What do they care about? Learning? Impact? Financial upside? Title?]
- **Closing risks:** [Counter-offers, competing processes, timeline constraints]
- **Recommended closing approach:** [e.g., "Founder dinner + detailed equity walkthrough + 48-hour exploding offer" or "Give them a week, they're deliberate decision-makers"]

---

#### 8. Final Recommendation

**Decision:** [ ] HIRE / [ ] NO HIRE

**Confidence Level:** [ ] High / [ ] Medium / [ ] Low

**If HIRE — conditions or caveats:**
- [e.g., "Hire, but address the ambiguity-tolerance risk explicitly in the offer conversation and set up weekly 1:1s for the first 60 days."]

**If NO HIRE — what would change our mind:**
- [e.g., "If references come back significantly stronger than expected on ownership signals, we'd reconsider."]

**Signatures:**

___________________________ (Founder 1)
___________________________ (Founder 2)
___________________________ (Date)

---

*This evaluation kit is designed for a seed-stage startup (4 people, $2M raised) hiring a Founding Engineer. The weighting deliberately favors raw ability and shipping speed over credentials, years of experience, or pedigree — because at this stage, the ability to build fast and make good product calls under uncertainty is what determines survival.*