# ProductMetrics — Full Corpus

> The complete ProductMetrics dataset inlined as text: all 44 metrics and 4 measurement frameworks, with full definitions, formulas, benchmarks, guidance, and source citations. This is the entire corpus — a single fetch gives an agent everything. The same data is available as structured JSON at https://productmetrics.org/api (see https://productmetrics.org/api/openapi.json for the schema).

Canonical pages live under https://productmetrics.org/metrics/{slug} and https://productmetrics.org/frameworks/{slug}. Company-specific claims and benchmark figures carry source citations; please carry the citation through rather than presenting a figure as bare fact.

---

# Metrics (44)

## Daily Active Users (DAU)

- Slug: dau
- URL: https://productmetrics.org/metrics/dau
- Categories / tags: acquisition

Definition: Unique users who engage with your product in a single day.

Formula: DAU = Unique users with ≥1 qualifying action per day

What it measures: Count of unique users with at least one qualifying action in a 24-hour period. What counts as "active" is product-specific: it might be logging in, viewing content, or completing a core action like sending a message.
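A minimal sketch of the count, assuming events arrive as `(user_id, timestamp, action)` tuples. The function name `daily_active_users` and the `QUALIFYING_ACTIONS` set are hypothetical; which actions qualify is a product decision, not something this corpus prescribes.

```python
from collections import defaultdict
from datetime import datetime

# Assumption: which actions count as "active" is product-specific.
QUALIFYING_ACTIONS = {"send_message", "view_content"}

def daily_active_users(events):
    """Return {date: number of unique users with >=1 qualifying action}."""
    users_by_day = defaultdict(set)
    for user_id, timestamp, action in events:
        if action in QUALIFYING_ACTIONS:
            users_by_day[timestamp.date()].add(user_id)
    return {day: len(users) for day, users in users_by_day.items()}

events = [
    ("u1", datetime(2024, 5, 1, 9, 0), "send_message"),
    ("u1", datetime(2024, 5, 1, 18, 0), "view_content"),   # same user, same day
    ("u2", datetime(2024, 5, 1, 12, 0), "open_settings"),  # not a qualifying action
    ("u2", datetime(2024, 5, 2, 8, 0), "view_content"),
]
dau = daily_active_users(events)
```

Using a set per day is what makes the count "unique": a user with many qualifying actions on one day still counts once.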

What to watch:
- Rising: Indicates growing engagement, but verify sustainability. A viral spike that fades within 7-14 days signals temporary interest, not real growth. Pair with retention metrics.
- Falling: Could signal product issues, seasonality, or a shift in user behavior. Segment by cohort to identify whether it's new user acquisition or existing user engagement that's declining.

In practice: After launching push notifications, a productivity app saw DAU jump 40% in week one. But sessions per user dropped from 3.2 to 1.8: more people opened the app on a given day, but each of them came back less often. The team shifted to weekly digest notifications, which recovered sessions per user while maintaining the DAU gain.

Vanity risk: Without retention context, DAU is a vanity metric. A viral spike that fades within two weeks signals temporary interest, not real growth. Always pair with retention curves.

Related metrics:
- nday: Retention — viral spikes mean nothing without retention.

---

## Monthly Active Users (MAU)

- Slug: mau
- URL: https://productmetrics.org/metrics/mau
- Categories / tags: acquisition

Definition: Unique users who engage with your product at least once in a month, measured over a calendar month or a rolling 30-day window.

Formula: MAU = Unique users with ≥1 qualifying action in a 30-day (or calendar-month) window

What it measures: The size of your active user base over a month. As with DAU, what counts as "active" is product-specific—a login, a content view, or a core action—and the definition must stay consistent to compare months. MAU is the standard denominator for reach and the base for the DAU/MAU stickiness ratio, but on its own it says nothing about how often or how deeply those users engage.

What to watch:
- Rising: Your reach is growing. But MAU counts a user who showed up once the same as a daily power user, so confirm the added users are actually engaged by checking DAU/MAU stickiness and retention curves alongside it.
- Falling: Either acquisition slowed or existing users are lapsing. Decompose with growth accounting (new vs. resurrected vs. churned) to tell a top-of-funnel problem from a retention problem—the fix is different for each.

When not to use: For high-frequency products where daily habit is the goal, DAU is the more honest headline—MAU can look healthy while daily engagement quietly erodes. For deliberately infrequent products (tax software, travel booking), monthly counts swing with seasonality and a rolling-30-day window reads better than a calendar month.

In practice: A B2B tool celebrated MAU climbing past 50,000, but DAU/MAU stickiness was stuck at 8%—most "monthly active" users logged in once and never returned that month. The team reframed their North Star from MAU to weekly active users who completed a core action, which exposed that real engaged usage was a fraction of the headline number and redirected the roadmap toward activation rather than top-of-funnel acquisition.

Vanity risk: MAU is the textbook vanity metric: it grows with any one-time visit and never falls as long as acquisition outruns lapse, so it can rise for months while engaged usage shrinks. Never report MAU without a stickiness or retention companion.

Related metrics:
- dau: DAU — the daily counterpart; MAU is the monthly denominator, DAU the daily numerator.
- stickiness: DAU/MAU Stickiness — MAU is the denominator; the ratio is what turns a raw reach count into a habit signal.
- growth-accounting: Growth Accounting — decomposes MAU change into new, resurrected, retained, and churned users.

---

## Sessions Per User (SPU)

- Slug: spu
- URL: https://productmetrics.org/metrics/spu
- Categories / tags: acquisition

Definition: How often users return to your product in a given period.

Formula: SPU = Total sessions / Total unique users

What it measures: Average number of separate sessions per user over a given period. A session typically ends after 30 minutes of inactivity, though this varies by platform.
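One way to sketch this, assuming each user's activity is available as a list of event timestamps and using the 30-minute inactivity gap as the session boundary. `count_sessions` and `sessions_per_user` are hypothetical helper names, and the timeout should match whatever your analytics platform uses.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # common default; varies by platform

def count_sessions(timestamps, timeout=SESSION_TIMEOUT):
    """Count sessions in one user's timestamps, splitting on inactivity gaps."""
    sessions = 0
    previous = None
    for ts in sorted(timestamps):
        if previous is None or ts - previous > timeout:
            sessions += 1  # first event, or a gap longer than the timeout
        previous = ts
    return sessions

def sessions_per_user(events_by_user):
    """SPU = total sessions / unique users with at least one event."""
    active = {user: ts for user, ts in events_by_user.items() if ts}
    if not active:
        return 0.0
    total = sum(count_sessions(ts) for ts in active.values())
    return total / len(active)

events_by_user = {
    "u1": [datetime(2024, 5, 1, 9, 0), datetime(2024, 5, 1, 9, 10),
           datetime(2024, 5, 1, 14, 0)],   # two sessions
    "u2": [datetime(2024, 5, 1, 20, 0)],   # one session
}
spu = sessions_per_user(events_by_user)  # 3 sessions / 2 users
```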

What to watch:
- Rising: Users are returning more frequently, a sign of habit formation. High-frequency products (messaging, social) should target 3+ daily sessions; lower-frequency products (finance, travel) may see 2-4 weekly.
- Falling: Users may be consolidating activity into fewer, longer sessions (not necessarily bad) or losing interest. Cross-reference with session duration to distinguish these cases.

In practice: A news app saw SPU rise from 1.4 to 2.1 after launching personalized feeds, but average session duration dropped 30%. Users were snacking on headlines rather than reading articles. The team added a "deep read" mode and saw both metrics improve together.

Related metrics:
- dau: DAU — frequency without depth can be hollow.
- stickiness: DAU/MAU Stickiness — measures habit strength across a month.

---

## Conversion Rate

- Slug: conversion
- URL: https://productmetrics.org/metrics/conversion
- Categories / tags: acquisition

Definition: Percentage of users who complete a goal action.

Formula: Conversion Rate = (Users who completed action / Users who could have) × 100%

What it measures: Percentage of users who complete a specific goal action. "Conversion" is context-dependent: it might mean signing up, subscribing, purchasing, or completing any defined step. Always specify what you're measuring (visitor-to-signup, trial-to-paid).

Benchmarks:
- E-commerce: 2-3% (visitor to purchase)
- SaaS trial-to-paid: 15-25% (opt-in trials)

What to watch:
- Rising: Your funnel is more effective. But check volume: the rate also rises when traffic narrows to fewer, higher-intent visitors, which shrinks the denominator without adding conversions.
- Falling: Something is blocking users. Use funnel analysis to find the drop-off point. Also check for audience mix shifts, as different traffic sources convert at different rates.

In practice: An e-commerce site simplified checkout from 5 pages to 1 and saw conversion jump from 2.1% to 3.4%. But average order value dropped 15%. Users were impulse-buying smaller items. They added a "frequently bought together" prompt, which recovered AOV while keeping most of the conversion gain.

Tools: Session Recording (Hotjar, FullStory), Funnel Analysis.

Vanity risk: Conversion rate without volume context is misleading. Converting 50% of 10 visitors (5 conversions) beats converting 2% of 10,000 (200 conversions) only on paper. Always report conversion rate alongside absolute numbers.
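The comparison above can be sketched as a small report that keeps the rate and the absolute count together. The `conversion_rate` helper and the `funnels` structure are hypothetical names used only for illustration.

```python
def conversion_rate(converted, eligible):
    """Conversion as a percentage of users who could have converted."""
    if eligible == 0:
        return 0.0
    return converted * 100 / eligible

# The two funnels from the text: the smaller one wins on rate only.
funnels = {
    "small": {"converted": 5, "eligible": 10},
    "large": {"converted": 200, "eligible": 10_000},
}
report = {
    name: (conversion_rate(f["converted"], f["eligible"]), f["converted"])
    for name, f in funnels.items()
}
for name, (rate, converted) in report.items():
    print(f"{name}: {rate:.1f}% ({converted} conversions)")
```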

Related metrics:
- trial: Trial-to-Paid Conversion — the critical conversion for SaaS businesses.

---

## Feature Adoption Rate (FAR)

- Slug: far
- URL: https://productmetrics.org/metrics/far
- Categories / tags: acquisition, leading

Definition: Percentage of eligible users who use a feature.

Formula: FAR = (Users who used the feature / Eligible users exposed to it) × 100%

What it measures: Percentage of eligible users who use a specific feature. "Eligible" is key: measure against users who could use the feature (had access, saw it), not your entire user base.

Benchmarks:
- Core features: Target 50%+ adoption
- Secondary features: 10-30% adoption is typical

What to watch:
- Rising: The feature is finding its audience. Low adoption isn't always bad, as some features are for power users only.
- Falling: Initial curiosity may be wearing off. Track whether users who try the feature continue using it (feature retention), not just first use.

In practice: A project management tool launched a time-tracking feature with 45% adoption in week one, dropping to 12% by week four. Users liked the idea but found manual time entry tedious. The team added automatic tracking, and adoption stabilized at 38%: lower than the spike but sustainable.

Vanity risk: First-use adoption spikes are vanity. A feature with 45% adoption in week one that drops to 12% by week four never had real adoption. Track sustained usage, not trial clicks.

Related metrics:
- activation: Activation Rate — feature adoption often correlates with activation success.

---

## Customer Acquisition Cost (CAC)

- Slug: cac
- URL: https://productmetrics.org/metrics/cac
- Categories / tags: acquisition, health

Definition: Total cost to acquire a paying customer.

Formula: CAC = (Sales + Marketing costs) / New customers acquired

What it measures: Total cost to acquire a new paying customer, including marketing spend, sales salaries, tools, and overhead allocated to acquisition. Note: no standardized calculation method exists—companies calculate CAC differently, making benchmarking imprecise.

Benchmarks:
- No single "average" exists — CAC varies widely by industry and buyer segment.
- SMB-tier figures (FirstPageSage, 2025): ~$274 (e-commerce) up to ~$1,450 (fintech); enterprise-tier CAC runs several times higher.
- Treat any published figure as order-of-magnitude only — your own blended CAC is the number that matters.

Calculation methods:
- Paid CAC: Ad spend / New customers via paid channels. Best for measuring channel efficiency and optimizing specific acquisition campaigns.
- Blended CAC: Total acquisition cost / Total new customers. Best for measuring overall business health and unit economics across all channels.
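A minimal sketch of the two calculation methods, using made-up monthly figures. The variable names and the cost split are assumptions; what belongs in "total acquisition cost" differs from company to company.

```python
def cac(acquisition_cost, new_customers):
    """Cost per new customer; raises if there were no new customers."""
    if new_customers <= 0:
        raise ValueError("new_customers must be positive")
    return acquisition_cost / new_customers

# Hypothetical monthly figures, for illustration only.
paid_ad_spend = 18_000
paid_customers = 100
other_acquisition_cost = 12_000   # salaries, tools, content, allocated overhead
organic_customers = 100

# Paid CAC: ad spend over customers won through paid channels.
paid_cac = cac(paid_ad_spend, paid_customers)

# Blended CAC: every acquisition cost over every new customer.
blended_cac = cac(paid_ad_spend + other_acquisition_cost,
                  paid_customers + organic_customers)
```

With these inputs paid CAC is $180 and blended CAC is $150. The two numbers answer different questions, so they should not be compared against the same benchmark.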

What to watch:
- Falling: Acquisition is more efficient, but verify quality. Cheaper customers may churn faster or spend less. Pair with LTV to ensure you're not sacrificing long-term value.
- Rising: Competition is intensifying, or you've saturated easy-to-reach audiences. Segment by channel, as some channels scale poorly. If LTV rises faster than CAC, rising costs can still be profitable.

In practice: A SaaS company saw paid search CAC rise from $120 to $180 over six months as competition increased. Content marketing CAC was $95 but took 6 months to show results. They maintained paid search for immediate pipeline while investing in content for long-term CAC reduction. Blended CAC stabilized at $135.

Related metrics:
- ltvcac: LTV:CAC Ratio — the fundamental unit economics equation.
- payback: CAC Payback — how fast you recover acquisition costs.

Sources:
- Average CAC for SaaS by industry & customer type — FirstPageSage (2025) — https://firstpagesage.com/marketing/average-cac-for-saas-businesses-by-industry-and-customer-type-fc/

---

## Growth Accounting

- Slug: growth-accounting
- URL: https://productmetrics.org/metrics/growth-accounting
- Categories / tags: acquisition, leading

Definition: Breaking down user growth into its components. Reveals whether growth is healthy or hollow.

Formula: MAU Growth = New Users + Resurrected Users − Churned Users

What it measures: A framework that decomposes user growth into New (first-time users), Retained (continued from last period), Resurrected (returned after absence), and Churned (stopped using). The User Quick Ratio ((New + Resurrected) / Churned) measures growth efficiency.
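The decomposition can be sketched with set arithmetic, assuming you can list the users active last period, the users active this period, and everyone active in any earlier period. The function name and dictionary keys are hypothetical.

```python
def growth_accounting(previous_active, current_active, ever_active_before):
    """Classify users for one period.

    previous_active: users active last period
    current_active: users active this period
    ever_active_before: users active in any earlier period (includes last period)
    """
    new = current_active - ever_active_before
    retained = current_active & previous_active
    resurrected = (current_active & ever_active_before) - previous_active
    churned = previous_active - current_active
    gained = len(new) + len(resurrected)
    return {
        "new": new,
        "retained": retained,
        "resurrected": resurrected,
        "churned": churned,
        "net_change": gained - len(churned),
        # Quick ratio is undefined with zero churn; report infinity in that case.
        "quick_ratio": gained / len(churned) if churned else float("inf"),
    }

previous = {"a", "b", "c", "d"}
current = {"a", "b", "e", "f", "g"}
seen_before = {"a", "b", "c", "d", "g"}   # "g" lapsed before last period
result = growth_accounting(previous, current, seen_before)
```

Here two users are new, one is resurrected, and two churned, so the net change of +1 matches the move from 4 to 5 active users, and the User Quick Ratio is 1.5.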

What to watch:
- High resurrection, high churn: Users cycle in and out. You're not building a stable base. Investigate why users leave and what brings them back.
- Low resurrection, low churn: Stable but may lack growth. Your existing users stay, but you're not winning back lapsed users.
- User Quick Ratio above 4: Excellent for SaaS. Above 1.5 is very good for consumer apps. Below 1 means you're shrinking.

In practice: A mobile game showed 15% MAU growth, but growth accounting revealed 60% of "active" users each month were resurrected players who churned again within weeks. True retained users were declining. They shifted focus from re-engagement campaigns to fixing the core gameplay loop that caused churn in the first place.

Vanity risk: Top-line MAU growth is vanity when it masks a leaky bucket. If 60% of your "active" users each month are resurrected churners, you have a revolving door, not real growth.

Related metrics:
- dau: DAU — the daily input to growth accounting.
- churn: Churn Rate — understanding why users leave.

---

## Activation Rate

- Slug: activation
- URL: https://productmetrics.org/metrics/activation
- Categories / tags: activation, leading, health

Definition: Percentage of new users who reach a meaningful first milestone.

Formula: Activation Rate = (Users who activated / Total new users) × 100%

What it measures: Percentage of new users who complete a predefined action that correlates with long-term retention. The "activation event" varies by product. Famous examples: Facebook (adding 7 friends within 10 days), Dropbox (uploading first file), Slack (a team exchanging 2,000 messages), Twitter (around 30 follows as a retention tipping point). Activation is the only part of your product that 100% of users touch—poor activation cascades into poor retention regardless of product quality.

Benchmarks:
- Good: 60-70% activation rate
- Excellent: 80%+ activation rate

What to watch:
- Above 60%: Strong activation. You're converting most new users to engaged users. Exceptional products reach 80%+.
- Below 40%: Significant friction exists. Check onboarding flow, value proposition clarity, and technical issues. Also verify your activation event still correlates with retention—it may need updating as your product evolves.

Finding the metric: Use regression analysis to identify the user action that most strongly correlates with long-term retention. This "magic number" becomes your activation goal—like Facebook's discovery that users who added 7 friends in 10 days retained dramatically better than those who didn't.

In practice: A project management tool defined activation as "create first project." After reducing the setup flow from 8 steps to 3, activation rose from 34% to 52%. But when they analyzed retention, users who also invited a teammate had 3× better retention, so they added teammate invitation to their activation definition.

Tools: User Onboarding Flows, Progress Bars, A/B Testing.

Related metrics:
- ttv: Time-to-Value — how fast users reach activation.
- nday: N-day Retention — activated users retain dramatically better.

Sources:
- Facebook’s "7 friends in 10 days" magic moment — Mode (from Chamath Palihapitiya’s growth talk) — https://mode.com/blog/facebook-aha-moment-simpler-than-you-think/
- Slack’s "2,000 messages" threshold — First Round Review (Stewart Butterfield) — https://review.firstround.com/from-0-to-1b-slacks-founder-shares-their-epic-launch-strategy/
- Dropbox’s activation "aha moment" — Amplitude — https://amplitude.com/blog/aha-moment-dropbox

---

## Time-to-Value (TTV)

- Slug: ttv
- URL: https://productmetrics.org/metrics/ttv
- Categories / tags: activation, leading

Definition: Time from signup to first "aha moment."

Formula: TTV = Median time from signup to first value event

What it measures: Elapsed time from signup to a user's first meaningful value moment. You must define what "value" means for your product: completing a task, achieving a result, or reaching an "aha moment" feature. Use median to avoid outliers skewing results.
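A small sketch of the median calculation, assuming signup times and first-value times are keyed by user ID. `median_time_to_value` is a hypothetical helper, and the sample data is invented to show why the median is preferred over the mean.

```python
from datetime import datetime
from statistics import mean, median

def hours_to_value(signups, first_value_events):
    """Hours from signup to first value event, for users who reached value."""
    return [
        (first_value_events[user] - signed_up).total_seconds() / 3600
        for user, signed_up in signups.items()
        if user in first_value_events
    ]

def median_time_to_value(signups, first_value_events):
    """Median TTV in hours, or None if nobody has reached value yet."""
    durations = hours_to_value(signups, first_value_events)
    return median(durations) if durations else None

signups = {
    "u1": datetime(2024, 5, 1, 9, 0),
    "u2": datetime(2024, 5, 1, 10, 0),
    "u3": datetime(2024, 5, 1, 11, 0),
    "u4": datetime(2024, 5, 1, 12, 0),    # never reached value
}
first_value = {
    "u1": datetime(2024, 5, 1, 10, 0),    # 1 hour
    "u2": datetime(2024, 5, 1, 12, 0),    # 2 hours
    "u3": datetime(2024, 5, 11, 11, 0),   # 240 hours (outlier)
}
ttv_median = median_time_to_value(signups, first_value)
ttv_mean = mean(hours_to_value(signups, first_value))
```

The median is 2 hours while the mean is 81 hours, pulled up by a single slow user. Note that this sketch ignores users who never reach value, so it should be read next to activation rate rather than alone.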

Benchmarks:
- Consumer apps: Within first session (minutes)
- B2B SaaS: Hours to days depending on complexity

What to watch:
- Shorter: Users reach value faster, which strongly correlates with retention. Best-in-class products aim for value within the first session.
- Longer: Friction in onboarding, unclear value prop, or complex setup requirements. Map the user journey step-by-step to find where time is lost.

In practice: A budgeting app had a median TTV of 4 days because users signed up but didn't link accounts until later. They added a "quick demo mode" with sample data so users could explore immediately. TTV for demo users was 8 minutes, and those users linked real accounts at 2× the rate.

Related metrics:
- activation: Activation Rate — TTV measures velocity to activation.
- trial: Trial-to-Paid — shorter TTV improves conversion.

---

## Trial-to-Paid Conversion Rate

- Slug: trial
- URL: https://productmetrics.org/metrics/trial
- Categories / tags: activation, revenue

Definition: Percentage of trial users who become paying customers. The moment of truth for your value proposition.

Formula: Trial-to-Paid = (Users who converted to paid / Total trial users) × 100%

What it measures: How effectively your trial experience demonstrates enough value to justify payment. This metric is highly sensitive to trial design: opt-out trials (requiring credit card upfront) convert 2-3× higher than opt-in trials, but attract different user types.

What to watch:
- Opt-in trials: Target ~25% (Lincoln Murphy’s rule of thumb for no-credit-card trials). Below 15% suggests users aren't reaching value within the trial period; well above it, verify you're not filtering out valuable users with too-complex signup.
- Opt-out trials: Target ~60% (credit-card-required trials). Below 40% indicates poor activation or value mismatch; above 60% is best-in-class. Watch for involuntary churn in Month 2 from users who forgot to cancel.

In practice: A project management tool offered 30-day trials with 22% conversion. When they analyzed user behavior, most converters decided within 7 days; non-converters rarely returned after Day 10. They switched to 14-day trials with more aggressive onboarding emails. Conversion rose to 31%: the shorter timeline created urgency and focused the team on faster activation.

Related metrics:
- activation: Activation Rate — trial conversion depends on activation.
- pql: PQLs — identify high-intent trial users.

Sources:
- Free trial conversion benchmarks (Lincoln Murphy’s ~25% / ~60% targets) — Databox — https://databox.com/converting-trial-users-to-paid-customers

---

## Product-Qualified Leads (PQLs)

- Slug: pql
- URL: https://productmetrics.org/metrics/pql
- Categories / tags: activation, leading

Definition: Users whose product engagement signals conversion likelihood. Quality over quantity.

Formula:
- PQLs = Count of users meeting predefined engagement criteria
- PQL Rate = (PQLs / Total trial or free users) × 100%

What it measures: Users who have demonstrated meaningful product engagement that correlates with conversion likelihood. Unlike marketing-qualified leads (based on content engagement), PQLs are qualified by actual product usage.

What to watch:
- Rising: More users are reaching meaningful engagement, which is good for pipeline. Track PQL-to-paid conversion to validate your criteria.
- Falling: Fewer users are engaging deeply. Check if onboarding is broken, if traffic quality has declined, or if product changes have made the "aha moment" harder to reach.

In practice: A SaaS company defined PQLs as "created 3+ projects and invited 1+ teammates." Sales closed 35% of PQLs vs. 8% of all trial users. When they added "used integration feature" to the criteria, PQL volume dropped 40% but close rate jumped to 52%. Better targeting made their sales team more efficient.

Related metrics:
- trial: Trial-to-Paid — PQLs predict trial conversion.
- activation: Activation Rate — PQL criteria often mirror activation events.

---

## Day 1/7/30 Retention

- Slug: nday
- URL: https://productmetrics.org/metrics/nday
- Categories / tags: retention, leading

Definition: Percentage of users who return on specific days after signup. The clearest early signal of product-market fit.

Formula: Day N Retention = (Users active on Day N / Users who signed up on Day 0) × 100%

What it measures: Whether users come back after their first experience. Day 1 measures immediate onboarding success; Day 7 indicates early habit formation; Day 30 reflects sustained value delivery. These milestones reveal problems weeks before they show in revenue.
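A minimal sketch of the formula, assuming a signup date per user and a set of active dates per user. `day_n_retention` is a hypothetical name, and the sketch counts a user as retained only if they were active exactly N days after signup.

```python
from datetime import date, timedelta

def day_n_retention(signup_dates, activity, n):
    """Percent of signed-up users active exactly n days after signup.

    signup_dates: {user_id: date of signup (Day 0)}
    activity: {user_id: set of dates on which the user was active}
    """
    if not signup_dates:
        return 0.0
    returned = sum(
        1 for user, day0 in signup_dates.items()
        if day0 + timedelta(days=n) in activity.get(user, set())
    )
    return returned * 100 / len(signup_dates)

day0 = date(2024, 5, 1)
signup_dates = {"u1": day0, "u2": day0, "u3": day0, "u4": day0}
activity = {
    "u1": {date(2024, 5, 2), date(2024, 5, 8)},  # back on Day 1 and Day 7
    "u2": {date(2024, 5, 2)},                    # back on Day 1 only
    "u3": {date(2024, 5, 3)},                    # back on Day 2 only
}
d1 = day_n_retention(signup_dates, activity, 1)
d7 = day_n_retention(signup_dates, activity, 7)
```

In real data, only include users who signed up at least N days ago; otherwise recent signups drag the figure down before they have had a chance to return.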

Benchmarks:
- Rough mobile-app medians: Day 1 ~25%, Day 7 ~8–12%, Day 30 ~4–7% — these vary widely by source, year, and category.
- Finance and gaming typically run above these medians; many consumer-app categories fall below.
- Published retention tables drift year to year — your own cohort trend matters more than any external benchmark.

What to watch:
- Strong Day 1, weak Day 7: Users explore but don't form habits. Focus on the first-week experience.
- Weak Day 1: Onboarding is failing. Users aren't finding value quickly enough.

In practice: A social app had strong Day 1 (32%) but crashed to 8% by Day 7. Users explored features once but didn't return. Analysis showed most users never added friends. They added a "Find friends from contacts" prompt on Day 2, and Day 7 retention jumped to 18%. The feature existed before, but users needed a nudge at the right moment.

Related metrics:
- activation: Activation Rate — poor activation leads to poor retention.

Sources:
- Mobile app retention benchmarks — UXCam (2026, aggregating AppsFlyer/Adjust) — https://www.uxcam.com/blog/mobile-app-retention-benchmarks/

---

## Cohort Retention Curves

- Slug: cohort
- URL: https://productmetrics.org/metrics/cohort
- Categories / tags: retention

Definition: How retention changes over time for groups of users who joined together. The gold standard for retention analysis.

Formula: Plot % of cohort still active at Week 1, 2, 3... through Week N

What it measures: Retention tracked by user cohorts (typically grouped by signup week or month) over their entire lifecycle. The curve shape matters more than any single number: healthy products flatten into a horizontal line; struggling products decline continuously toward zero.
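The curve and its shape can be sketched as follows, assuming weekly active counts for a single cohort. The flattening check is a rough heuristic chosen for this example (last four points within one percentage point), not a standard definition, and both function names are hypothetical.

```python
def retention_curve(cohort_size, active_by_week):
    """Percent of the cohort still active in each week after signup."""
    return [active * 100 / cohort_size for active in active_by_week]

def has_flattened(curve, window=4, tolerance=1.0):
    """Heuristic: the last `window` points span at most `tolerance` points."""
    tail = curve[-window:]
    return len(tail) == window and max(tail) - min(tail) <= tolerance

# Hypothetical cohorts of 1,000 users each, weeks 1 through 8.
healthy = retention_curve(1000, [600, 450, 380, 352, 350, 348, 349, 350])
leaking = retention_curve(1000, [600, 450, 340, 260, 200, 150, 110, 80])

healthy_flat = has_flattened(healthy)   # settles near 35%
leaking_flat = has_flattened(leaking)   # still falling in week 8
```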

Benchmarks:
- Consumer Social: Good: 25% | Great: 45% (6-month)
- Consumer Transactional: Good: 30% | Great: 50% (6-month)
- Consumer SaaS: Good: 40% | Great: 70% (6-month)
- SMB/Mid-market SaaS: Good: 60% | Great: 80% (6-month)
- Enterprise SaaS: Good: 70% | Great: 90% (6-month)

What to watch:
- Curve flattens: Healthy sign — you have a stable user base that finds ongoing value.
- Curve never flattens: Every cohort eventually churns to zero. Your product isn't creating lasting value.

In practice: A fitness app saw overall retention of 15% at 6 months but noticed cohort curves never flattened, just declined more slowly. When they segmented by workout type, users who tried strength training in week 1 had curves that flattened at 35%, while cardio-only users declined to 5%. They redesigned onboarding to introduce strength training earlier.

Related metrics:
- nday: N-day Retention — point-in-time retention snapshots.
- churn: Churn Rate — the inverse view of retention.

Sources:
- What is good retention? (benchmark study with Casey Winters) — Lenny Rachitsky — https://www.lennysnewsletter.com/p/what-is-good-retention-issue-29

---

## Customer Churn Rate

- Slug: churn
- URL: https://productmetrics.org/metrics/churn
- Categories / tags: retention, health, leading

Definition: Percentage of customers who leave. Critical insight: 5% monthly churn compounds to 46% annual churn.

Formula:
- Logo Churn = (Customers lost / Starting customers) × 100%
- Revenue Churn = (Lost MRR / Starting MRR) × 100% (the more important of the two)

What it measures: Percentage of customers who cancel or stop using your product over a period. Revenue churn matters more than logo churn: losing one $10K customer differs vastly from losing ten $100 customers. Small monthly numbers compound dangerously: at 5% monthly churn, only 0.95^12 ≈ 54% of customers remain after a year, so about 46% are lost.
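A short sketch of both points, the compounding and the logo-versus-revenue gap. It assumes the dollar figures are monthly and uses a hypothetical base of 1,000 customers paying $200,000 MRR; the function names are illustrative.

```python
def annual_churn(monthly_churn):
    """Compound a monthly churn rate (as a fraction) over 12 months."""
    return 1 - (1 - monthly_churn) ** 12

def churn_rate(lost, starting):
    """Percent lost; works for customer counts (logo) and MRR (revenue)."""
    return lost * 100 / starting

# Hypothetical base: 1,000 customers paying $200,000 MRR in total.
starting_customers, starting_mrr = 1_000, 200_000

# Losing one $10K customer vs. ten $100 customers: (logo %, revenue %).
one_big = (churn_rate(1, starting_customers), churn_rate(10_000, starting_mrr))
ten_small = (churn_rate(10, starting_customers), churn_rate(10 * 100, starting_mrr))

print(f"5% monthly -> {annual_churn(0.05):.0%} annual")
print(f"one $10K customer: logo {one_big[0]}%, revenue {one_big[1]}%")
print(f"ten $100 customers: logo {ten_small[0]}%, revenue {ten_small[1]}%")
```

Under these assumptions the single large loss looks ten times smaller on logo churn but ten times larger on revenue churn.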

What to watch:
- B2C SaaS: Target 3-5% monthly (good), <2% (great)
- B2B SMB/Mid-Market: Target 2.5-5% monthly (good), <1.5% (great)
- B2B Enterprise: Target 1-2% monthly (good), <0.5% (great)

In practice: An online learning platform saw monthly churn spike from 6% to 11% after a price increase. But when they segmented by engagement, high-engagement users actually churned less. The spike came from "zombie" subscribers who rarely used the product. The team let them churn and focused on converting engaged free users instead.

Tools: Cohort Analysis, Exit Surveys, Churn Prediction Models.

Related metrics:
- activation: Activation Rate — poor activation leads to poor retention.

---

## Customer Retention Rate (CRR)

- Slug: crr
- URL: https://productmetrics.org/metrics/crr
- Categories / tags: retention, health

Definition: Percentage of existing customers who stay.

Formula: CRR = ((Customers at end − New customers) / Customers at start) × 100%

What it measures: Percentage of existing customers who remain active over a period. The key is excluding new customers: you're measuring whether people who were already customers stayed.
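As a sketch of why new customers are subtracted, the example below uses invented counts; `customer_retention_rate` is a hypothetical helper.

```python
def customer_retention_rate(start, end, new):
    """CRR = ((customers at end - new customers) / customers at start) x 100."""
    if start <= 0:
        raise ValueError("start must be positive")
    return (end - new) * 100 / start

# Hypothetical period: 200 customers at start, 40 added, 210 at end.
crr = customer_retention_rate(start=200, end=210, new=40)

# Naive end/start ratio, shown only to illustrate the error it introduces.
naive = 210 * 100 / 200
```

The customer count grew from 200 to 210, yet only 170 of the original 200 stayed, so CRR is 85% while the naive ratio reads 105%.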

Benchmarks:
- Subscription businesses: Target 90%+ monthly retention
- Enterprise SaaS: Target 95%+ monthly retention

What to watch:
- Rising: Your existing customers are stickier. Even small retention improvements compound dramatically — Bain & Company’s research found that retaining 5% more customers can lift profits by 25–95%.
- Falling: Something is driving existing customers away. Segment by cohort, tenure, and usage patterns to find who's leaving and when. Early-tenure churn points to onboarding issues; late-tenure churn suggests value erosion.

In practice: A fitness app saw CRR drop from 85% to 78% after adding new workout types. The new content overwhelmed the home screen, and existing users couldn't find their saved workouts. Restoring a "My Workouts" quick-access tab recovered retention to 87%.

Related metrics:
- churn: Churn Rate — the inverse of retention.
- nday: N-day Retention — early warning signals.

Sources:
- E-Loyalty: Your Secret Weapon on the Web (5% retention → 25–95% profit) — Bain & Company / HBR (2000) — https://www.bain.com/insights/e-loyalty-your-secret-weapon-on-the-web/

---

## DAU/MAU Ratio (Stickiness)

- Slug: stickiness
- URL: https://productmetrics.org/metrics/stickiness
- Categories / tags: retention

Definition: The share of monthly active users who engage on a typical day, which translates into how many days per month the average user shows up. The simplest measure of habit strength.

Formula: Stickiness = (Daily Active Users / Monthly Active Users) × 100%

What it measures: The percentage of monthly users who engage on any given day. A 20% ratio means the average user engages about 6 days per month (20% × 30 days). Higher stickiness indicates stronger habits and product-market fit.
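A minimal sketch of the ratio and the days-per-month reading, assuming DAU is averaged over the same month that MAU covers. Both function names are hypothetical.

```python
def stickiness(average_dau, mau):
    """DAU/MAU as a percentage; 0.0 when there are no monthly users."""
    if mau == 0:
        return 0.0
    return average_dau * 100 / mau

def implied_active_days(stickiness_pct, days_in_month=30):
    """Average days per month a monthly user is active, implied by the ratio."""
    return stickiness_pct * days_in_month / 100

ratio = stickiness(average_dau=2_000, mau=10_000)
days = implied_active_days(ratio)
```

With 2,000 average daily users out of 10,000 monthly users, the ratio is 20% and the implied engagement is 6 days per month, matching the example in the text.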

Benchmarks:
- B2B SaaS: Average ~13% | Good 20%+ | Great 40%+
- B2C apps: Average ~20% | Good 25–35% | Great 50%+
- Messaging apps: Good 50%+ | Great 60%+
- Social apps like Facebook have historically exceeded 50% DAU/MAU — among the stickiest consumer products.

What to watch:
- Above 25%: Users are forming habits around your product.
- Below 15%: Most users aren't making your product part of their routine.

When not to use: Products designed for infrequent use—Airbnb, TurboTax, travel booking apps—will naturally show low stickiness without indicating problems. If your product solves a periodic need, low DAU/MAU is expected and healthy.

In practice: A note-taking app had 15% stickiness, solid but not habit-forming. Analysis showed users only opened the app when they had something specific to capture. They added a daily review feature that surfaced old notes, raising stickiness to 28%. Users now had a reason to open the app even without new input.

Related metrics:
- dau: DAU — the numerator in stickiness.
- nday: N-day Retention — cohort-based view of engagement.

Sources:
- DAU/MAU ratio benchmarks (~13% SaaS average, ~40% excellent) — Gainsight — https://www.gainsight.com/essential-guide/product-management-metrics/dau-mau/

---

## Customer Effort Score (CES)

- Slug: ces
- URL: https://productmetrics.org/metrics/ces
- Categories / tags: retention, leading

Definition: How easy it is for customers to get what they need. A better predictor of churn than CSAT.

Formula: CES = Average score on "How easy was it to [complete task]?" (1-7 scale, where 7 is easiest)

What it measures: The effort customers expend to accomplish a goal: getting support, completing a purchase, or using a feature. Measured via survey after specific interactions. Research shows CES predicts repurchase and loyalty better than satisfaction scores.

What to watch:
- Low effort (6-7): Customers find interactions easy. CEB/Gartner research found that 96% of customers with a high-effort interaction become more disloyal, versus just 9% of those with a low-effort experience.
- High effort (1-4): Friction is driving customers away. Identify the specific touchpoints causing friction: complex forms, slow support, confusing navigation, or too many steps.

In practice: A SaaS company tracked CSAT at 85% but saw unexpected churn. When they added CES surveys after support tickets, they discovered resolution required an average of 2.3 contacts per issue. Customers were satisfied with agent interactions but exhausted by the process. They implemented first-contact resolution targets, and CES rose from 4.2 to 6.1 while churn dropped 18%.

Related metrics:
- csat: CSAT — satisfaction with specific interactions.
- churn: Churn Rate — CES predicts churn better than CSAT.

Sources:
- Stop Trying to Delight Your Customers (the CES research) — Dixon, Freeman & Toman, HBR (2010) — https://hbr.org/2010/07/stop-trying-to-delight-your-customers

---

## Customer Satisfaction Rate (CSAT)

- Slug: csat
- URL: https://productmetrics.org/metrics/csat
- Categories / tags: retention

Definition: Satisfaction with a specific interaction.

Formula: CSAT = (Satisfied responses / Total responses) × 100%

What it measures: Satisfaction score for a specific interaction or experience, typically via a survey ("How satisfied were you?" on a 1-5 scale), with ratings of 4 and 5 conventionally counted as satisfied. Unlike NPS, which gauges overall loyalty, CSAT is best for evaluating specific touchpoints.
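A sketch of the calculation per touchpoint, assuming 1-5 ratings and the common convention that a rating of 4 or 5 counts as satisfied. The threshold, the `csat` helper, and the sample ratings are assumptions for illustration.

```python
def csat(ratings, satisfied_threshold=4):
    """Percent of ratings at or above the threshold (4 and 5 by convention)."""
    if not ratings:
        return 0.0
    satisfied = sum(1 for rating in ratings if rating >= satisfied_threshold)
    return satisfied * 100 / len(ratings)

# Hypothetical survey responses grouped by touchpoint.
ratings_by_touchpoint = {
    "support": [5, 5, 4, 5, 4],
    "onboarding": [3, 4, 2, 5, 3],
}
scores = {name: csat(r) for name, r in ratings_by_touchpoint.items()}

# Aggregate across all touchpoints, for comparison only.
all_ratings = [r for ratings in ratings_by_touchpoint.values() for r in ratings]
overall = csat(all_ratings)
```

The aggregate of 70% hides that support scores 100% while onboarding scores 40%, which is why the per-touchpoint view is the one to report.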

Benchmarks:
- B2C: Target 80%+
- B2B enterprise: Target 90%+

What to watch:
- Rising: The specific experience you're measuring is improving. Watch response rates, as low participation can skew results toward extremes.
- Falling: Something changed in that touchpoint. Because CSAT is context-specific, you can often pinpoint the issue. Compare before/after when you make changes.

In practice: An online education platform added video transcripts and saw course CSAT rise from 72% to 86%. But completion rates didn't improve. Learners were more satisfied but using transcripts to skim rather than engage. They redesigned transcripts as a supplement rather than an alternative to video.

Vanity risk: CSAT is context-specific by design, but reporting a single aggregate CSAT score across all touchpoints is vanity. An 85% average can hide that support is at 95% while onboarding is at 60%.

Related metrics:
- ces: CES — effort-based satisfaction.
- nps: NPS — overall loyalty measure.

---

## Monthly Recurring Revenue (MRR)

- Slug: mrr
- URL: https://productmetrics.org/metrics/mrr
- Categories / tags: revenue, health

Definition: Total predictable monthly revenue from subscriptions.

Formula: MRR = Sum of each customer's monthly subscription value

What it measures: Sum of all active subscription revenue, normalized to a monthly value. For annual plans, divide by 12. The annualized form is ARR (Annual Recurring Revenue), which is simply MRR × 12 — the two express the same recurring base on different cadences, so SaaS teams use MRR for operational tracking and ARR for board and investor framing. MRR is the heartbeat metric for subscription businesses.

What to watch:
- Rising: Growth is coming from new customers (New MRR), existing customers upgrading (Expansion MRR), or both. Break down MRR by source to understand what's driving growth.
- Falling: Churn and downgrades are outpacing new business. Calculate Net New MRR (New + Expansion − Churn − Contraction) to see the full picture.

In practice: A B2B SaaS company saw MRR grow 8% month-over-month, but when they decomposed it, 70% came from expansion revenue and only 30% from new sales. They doubled down on upsell features while rebuilding their top-of-funnel acquisition.

Vanity risk: Top-line MRR growth without decomposition is vanity. If 70% of growth comes from expansion revenue and new sales are flat, you have a concentration risk. Always break MRR into New, Expansion, Churn, and Contraction.

Related metrics:
- arpu: ARPU — revenue per user.
- nrr: NRR — revenue retention from existing customers.
- churn: Churn Rate — the leak in MRR.

---

## Average Revenue Per User (ARPU)

- Slug: arpu
- URL: https://productmetrics.org/metrics/arpu
- Categories / tags: revenue, health

Definition: Revenue generated per user over a period.

Formula: ARPU = Total revenue / Total users

What it measures: Revenue per user over a specific period, typically monthly. Clarify whether you're measuring paying users only (ARPPU) or all users including free tier, as these are very different numbers.
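
To show how far the two denominators can diverge, here is a small Python sketch. The revenue and user counts are invented for illustration.

```python
def revenue_per_user(total_revenue, users):
    """Revenue divided by a user count; the choice of count defines ARPU vs ARPPU."""
    if users <= 0:
        raise ValueError("user count must be positive")
    return total_revenue / users


# Hypothetical month: $50,000 revenue, 10,000 total users, 2,000 of them paying.
revenue = 50_000
arpu_all = revenue_per_user(revenue, 10_000)  # all users, free tier included
arppu = revenue_per_user(revenue, 2_000)      # paying users only
print(f"ARPU: ${arpu_all:.2f}  ARPPU: ${arppu:.2f}")
```

The same revenue yields a fivefold difference here, which is why the denominator should be stated next to the figure.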

What to watch:
- Rising: Users are paying more through upgrades, add-ons, or price increases. But watch for declining user count: ARPU can rise while total revenue falls if you're losing lower-value customers.
- Falling: Could indicate successful expansion into a lower-price segment (growth dilution), increased discounting, or downgrades. Segment by customer tier to understand the cause.

In practice: A streaming service launched a lower-priced ad-supported tier, causing ARPU to drop from $14 to $11. But total revenue grew roughly 57% because the subscriber base doubled (2 × $11 / $14 ≈ 1.57). The ARPU decline was intentional, a trade-off for market expansion.

Vanity risk: ARPU can rise while total revenue falls if you are losing lower-value customers. A streaming service dropping from $14 to $11 ARPU while doubling subscribers is winning. ARPU in isolation tells the wrong story.

Related metrics:
- ltv: LTV — lifetime revenue projection.
- mrr: MRR — total recurring revenue.

---

## Lifetime Value (LTV)

- Slug: ltv
- URL: https://productmetrics.org/metrics/ltv
- Categories / tags: revenue, health

Definition: Total projected revenue from a customer over their entire relationship.

Formula: LTV = ARPU × Gross Margin % × (1 / Churn Rate)
E-commerce: Avg. order value × Purchase frequency × Avg. customer lifespan

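What it measures: Projected total revenue from a customer over their entire relationship with your product. The subscription formula also multiplies by gross margin, so its result is closer to projected gross profit than to top-line revenue, and ARPU and churn rate must cover the same period (for example, both monthly). LTV is an estimate based on historical patterns.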
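
Both formulas above can be sketched directly in Python. Function names and input figures are hypothetical, and the subscription version assumes ARPU and churn rate are measured over the same period (monthly here).

```python
def subscription_ltv(arpu, gross_margin, churn_rate):
    """LTV = ARPU x gross margin x (1 / churn rate), all on the same period."""
    if not 0 < churn_rate <= 1:
        raise ValueError("churn_rate must be in (0, 1]")
    return arpu * gross_margin / churn_rate


def ecommerce_ltv(avg_order_value, purchases_per_year, lifespan_years):
    """LTV = average order value x purchase frequency x customer lifespan."""
    return avg_order_value * purchases_per_year * lifespan_years


# Hypothetical inputs: $50 monthly ARPU, 80% gross margin, 4% monthly churn.
saas_ltv = subscription_ltv(arpu=50, gross_margin=0.80, churn_rate=0.04)
# Hypothetical shop: $60 orders, 3 purchases a year, 2-year lifespan.
shop_ltv = ecommerce_ltv(avg_order_value=60, purchases_per_year=3, lifespan_years=2)
print(round(saas_ltv), shop_ltv)
```

The `1 / churn_rate` term is the expected customer lifetime in periods (25 months at 4% monthly churn), so small changes in churn move LTV sharply.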

What to watch:
- Rising: Customers are staying longer, spending more, or both. The critical ratio is LTV:CAC. Most healthy businesses target at least 3:1 (every dollar spent acquiring a customer returns three dollars).
- Falling: Investigate whether it's driven by shorter lifespans (retention problem), lower spending (engagement or pricing problem), or both. Segment by acquisition channel, as some sources may deliver lower-quality customers.

In practice: An e-commerce company launched a loyalty program and saw LTV rise from $180 to $245, a 36% increase. But CAC also rose 25% because the loyalty program's marketing costs weren't attributed. When they calculated LTV:CAC, it only improved from 2.4:1 to 2.6:1, prompting them to optimize the program's costs.

Related metrics:
- ltvcac: LTV:CAC Ratio — the fundamental unit economics equation.
- churn: Churn Rate — directly impacts LTV.

---

## Net Revenue Retention (NRR)

- Slug: nrr
- URL: https://productmetrics.org/metrics/nrr
- Categories / tags: revenue, health

Definition: Revenue retained from existing customers, including expansion and churn. The single best predictor of sustainable growth.

Formula: NRR = ((Starting MRR + Expansion - Contraction - Churn) / Starting MRR) × 100%

What it measures: How much revenue you retain and grow from your existing customer base, independent of new sales. NRR above 100% means existing customers generate more revenue over time, even without acquiring anyone new. NRR is also called NDR (Net Dollar Retention)—the two are interchangeable names for the same figure. Median private SaaS NRR sat near 101% in 2024 (KeyBanc/Sapphire), down from the ~105% range of a few years earlier—the bar is getting harder to clear.

Benchmarks:
- Below 100%: Revenue from existing customers is shrinking; urgent problem
- 100-110%: Solid retention with modest expansion
- 110-120%: Strong expansion, good product-market fit
- 120%+: Exceptional (usage-based or platform companies)

Calculation methods:
- Net (NRR / NDR): Includes expansion, so it can exceed 100%. This is the headline retention number—it answers whether the existing base grows or shrinks on its own.
- Gross (GDR / Gross Dollar Retention): Excludes expansion and counts only contraction and churn, so it caps at 100% and can never exceed it. GDR isolates how much revenue you keep before any upsell masks the leak. A wide NRR–GDR gap means expansion is papering over real churn—track both, because a 115% NRR with 85% GDR is a very different business from 115% NRR with 98% GDR.
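
A short Python sketch of the two calculation methods side by side. The cohort figures are hypothetical, chosen to reproduce the 115% NRR / 85% GDR case mentioned above.

```python
def nrr(starting_mrr, expansion, contraction, churn):
    """Net revenue retention: includes expansion, so it can exceed 100%."""
    return 100 * (starting_mrr + expansion - contraction - churn) / starting_mrr


def gdr(starting_mrr, contraction, churn):
    """Gross dollar retention: ignores expansion, so it caps at 100%."""
    return 100 * (starting_mrr - contraction - churn) / starting_mrr


# Hypothetical cohort: $100k starting MRR, $30k expansion,
# $5k contraction, $10k churn over the period.
net = nrr(100_000, expansion=30_000, contraction=5_000, churn=10_000)
gross = gdr(100_000, contraction=5_000, churn=10_000)
print(f"NRR {net:.0f}%  GDR {gross:.0f}%  gap {net - gross:.0f} pts")
```

The 30-point gap is entirely expansion; without it the base would shrink by 15% over the period.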

What to watch:
- Above 100%: Companies with NRR above 100% grow at twice the rate of those below. Best-in-class SaaS companies hit 120%+ through usage-based pricing or strong upsell motions.
- Below 100%: No amount of acquisition can outrun a leaky bucket at scale—prioritize reducing churn and contraction before investing in growth.

In practice: A B2B SaaS company had 95% NRR, meaning they lost 5% of revenue from existing customers each year. After analyzing cohorts, they found mid-market accounts churned at 2× the rate of enterprise. They rebuilt onboarding for mid-market and added a customer success tier, raising NRR to 108%. The same sales team now generated faster growth because each new customer compounded rather than leaked.

Related metrics:
- churn: Churn Rate — churned revenue is subtracted in the NRR numerator.
- mrr: MRR — the base for NRR calculation.
- quickratio: Quick Ratio — growth efficiency view.

Sources:
- 2024 KeyBanc Capital Markets / Sapphire Ventures SaaS Survey (median NRR ~101%) — https://info.sapphireventures.com/2024-keybanc-capital-markets-and-sapphire-ventures-saas-survey

---

## LTV:CAC Ratio

- Slug: ltvcac
- URL: https://productmetrics.org/metrics/ltvcac
- Categories / tags: revenue, health

Definition: Return on acquisition investment. The fundamental unit economics equation.

Formula: LTV:CAC = Customer Lifetime Value / Customer Acquisition Cost

What it measures: How much value you get back for each dollar spent acquiring a customer. A 3:1 ratio means every $1 in acquisition generates $3 in lifetime value. This is the most fundamental measure of business model health.

What to watch:
- Below 1:1: Unsustainable. You're paying more to acquire customers than they're worth. Either reduce CAC or increase LTV immediately.
- 3:1: The standard target. Enough margin to cover operations and generate profit. Most VCs expect at least this ratio.
- Above 5:1: Potentially underinvesting in growth. You have room to spend more aggressively on acquisition without hurting unit economics.

In practice: A B2C subscription had 2.4:1 LTV:CAC, below the 3:1 target. They couldn't profitably scale paid acquisition. Instead of cutting CAC, they added a premium tier that increased average LTV by 40%, pushing the ratio to roughly 3.4:1. The same acquisition spend now generated profitable returns.

Related metrics:
- payback: CAC Payback Period.
- cac: Customer Acquisition Cost.

---

## CAC Payback Period

- Slug: payback
- URL: https://productmetrics.org/metrics/payback
- Categories / tags: revenue, health

Definition: Months to recover customer acquisition costs. Determines how fast you can reinvest in growth.

Formula: CAC Payback = CAC / (Monthly ARPA × Gross Margin %)

What it measures: How many months of customer revenue are needed to recover the cost of acquiring that customer. Shorter payback means faster reinvestment and more efficient growth. This metric directly impacts cash flow and fundraising needs.
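
As a worked instance of the formula, the Python sketch below divides CAC by monthly gross profit per account. The function name and the input figures are assumptions for illustration.

```python
def cac_payback_months(cac, monthly_arpa, gross_margin):
    """Months of gross-margin-adjusted revenue needed to recover CAC."""
    monthly_gross_profit = monthly_arpa * gross_margin
    if monthly_gross_profit <= 0:
        raise ValueError("monthly gross profit must be positive")
    return cac / monthly_gross_profit


# Hypothetical account: $1,200 CAC, $125 monthly ARPA, 80% gross margin.
months = cac_payback_months(cac=1_200, monthly_arpa=125, gross_margin=0.80)
print(f"Payback: {months:.1f} months")
```

Dropping the gross-margin term would report 9.6 months instead of 12, which is why the margin belongs in the denominator.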

What to watch:
- Under 12 months: Healthy for SMB SaaS. You can reinvest acquisition costs within a year, enabling self-funding growth.
- 12-18 months: Acceptable for mid-market. Requires more capital but still sustainable.
- Over 24 months: Dangerous unless you have strong retention guarantees. Long payback strains cash and increases risk if churn accelerates.

In practice: A SaaS company had 18-month payback, limiting growth to what fundraising allowed. They analyzed segments and found enterprise deals had 24-month payback but SMB was 10 months. They launched a self-serve SMB tier with 8-month payback, using that cash flow to fund slower-burning enterprise sales. Blended payback dropped to 13 months.

Related metrics:
- ltvcac: LTV:CAC Ratio — total return on acquisition.
- cac: CAC — the numerator in payback.

---

## Quick Ratio (SaaS)

- Slug: quickratio
- URL: https://productmetrics.org/metrics/quickratio
- Categories / tags: revenue, health

Definition: Revenue growth efficiency: how much you gain versus lose. The health check for scaling.

Formula: Quick Ratio = (New MRR + Expansion MRR) / (Churned MRR + Contraction MRR)

What it measures: The ratio of revenue added to revenue lost in a period. A Quick Ratio of 4 means you add $4 for every $1 lost. This single number captures growth efficiency better than growth rate alone, which can mask underlying churn problems.
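
A minimal Python sketch of the ratio, using invented MRR movements. It also shows the arithmetic behind the leverage of churn reduction: cutting lost MRR by 20% multiplies the ratio by 1 / 0.8 = 1.25 with no change in revenue added.

```python
def quick_ratio(new_mrr, expansion_mrr, churned_mrr, contraction_mrr):
    """Revenue added divided by revenue lost in the same period."""
    lost = churned_mrr + contraction_mrr
    if lost == 0:
        return float("inf")  # nothing lost; the ratio is unbounded
    return (new_mrr + expansion_mrr) / lost


# Hypothetical month: $30k new, $10k expansion, $7k churned, $3k contraction.
baseline = quick_ratio(30_000, 10_000, 7_000, 3_000)
# Same revenue added, with churned and contracted MRR each cut by 20%.
improved = quick_ratio(30_000, 10_000, 5_600, 2_400)
print(baseline, improved)
```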

What to watch:
- Above 4: Excellent. A quick ratio of 4+ is a well-known VC benchmark for healthy early-stage SaaS (popularized by Mamoon Hamid, Social Capital). You're adding revenue much faster than losing it.
- 2-4: Healthy growth. Sustainable for most stages, though later-stage companies should trend higher.
- Below 2: Barely sustainable. You're working hard just to replace lost revenue. Below 1 means you're shrinking.

In practice: A startup celebrated 40% YoY growth but had a Quick Ratio of 1.8. For every $1.80 they added, $1 churned out. When they reduced churned and contracted MRR by 20%, Quick Ratio rose to about 2.25 (1.8 / 0.8) with the same sales effort. They realized churn reduction was higher-leverage than acquisition.

Related metrics:
- nrr: NRR — related efficiency measure.
- mrr: MRR — the revenue being measured.
- churn: Churn Rate — the denominator in Quick Ratio.

Sources:
- Mamoon Hamid (Social Capital) on the SaaS quick ratio — SaaStr — https://www.saastr.com/mamoon-hamid-social-capital-numbers-actually-matter-founders-video-transcript

---

## Rule of 40

- Slug: rule-of-40
- URL: https://productmetrics.org/metrics/rule-of-40
- Categories / tags: revenue, health

Definition: A SaaS health heuristic: a company’s revenue growth rate plus its profit margin should add up to at least 40%.

Formula: Rule of 40 = Revenue growth rate (%) + Profit margin (%) ≥ 40%

What it measures: The balance between growth and profitability in a single number. The premise is that growth and margin are substitutes a SaaS business can trade against each other: a company growing 60% can afford to burn at a −20% margin, and one growing only 10% needs a 30% margin to compensate—both clear 40. It is a portfolio-level sniff test, not a precise valuation tool. "Profit margin" is not standardized—EBITDA margin and free-cash-flow margin are the two most common choices—so always state which margin you used, because the same company can pass on one definition and miss on another.
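
The check is simple enough to sketch in Python. The function name and the returned dictionary are illustrative assumptions; the `margin_basis` argument exists only to force the caller to record which margin definition went into the sum.

```python
def rule_of_40(growth_pct, margin_pct, margin_basis):
    """Sum growth and margin, and carry along which margin was used."""
    score = growth_pct + margin_pct
    return {"score": score, "passes": score >= 40, "margin_basis": margin_basis}


# The two profiles from the text, plus one that misses the bar.
burner = rule_of_40(60, -20, margin_basis="EBITDA")
steady = rule_of_40(10, 30, margin_basis="free cash flow")
misses = rule_of_40(35, -10, margin_basis="EBITDA")
print(burner, steady, misses, sep="\n")
```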

Benchmarks:
- At or above 40%: the bar for a "healthy" SaaS company under the rule.
- Below 40%: growth and profitability together aren’t keeping pace; the gap points to where to push (faster growth or better margin).
- Originally framed for scaled SaaS (the late-stage investor it came from applied it to companies with at least ~$50M revenue); it is far noisier for very early-stage companies.

What to watch:
- Passes on growth, fails on margin: A high-growth, cash-burning profile. Acceptable while growth is genuinely high and capital is available, but the rule says the burn should narrow as growth decelerates—watch that the two move together.
- Passes on margin, fails on growth: A profitable but slow profile. Durable, but the rule flags that growth has become the constraint. The risk is over-optimizing for margin and starving the growth that makes a SaaS multiple worth paying.

In practice: A late-stage SaaS company growing 35% at a −10% margin summed to 25 and missed the rule. Rather than chase growth with more burn, they trimmed unprofitable acquisition channels and lifted margin to +8% while growth held at 33%—a sum of 41. The same business now cleared the bar not by growing faster but by making its existing growth more efficient, which is exactly the trade-off the rule is designed to surface.

Vanity risk: Because "profit margin" isn’t standardized, the Rule of 40 is easy to pass on paper by picking the most flattering margin definition. A company that clears 40 on adjusted EBITDA but misses badly on free-cash-flow margin hasn’t really passed—always pin down which margin is in the sum before trusting the result.

Related metrics:
- mrr: MRR — the recurring-revenue base whose growth rate feeds the rule (often via its annualized form, ARR).
- nrr: NRR — strong net retention is one of the most efficient ways to raise the growth half of the rule without proportional burn.
- quickratio: Quick Ratio — a complementary efficiency check; the Rule of 40 weighs growth against profit, the Quick Ratio weighs revenue gained against revenue lost.

Sources:
- Rule of 40 (definition; popularized by Brad Feld, "The Rule of 40% For a Healthy SaaS Company," 2015) — Wikipedia — https://en.wikipedia.org/wiki/Rule_of_40

---

## Gross Margin

- Slug: gross-margin
- URL: https://productmetrics.org/metrics/gross-margin
- Categories / tags: revenue, health

Definition: The share of revenue left after the direct cost of delivering your product. The ceiling on how profitable the business can ever be.

Formula: Gross Margin = (Revenue − COGS) / Revenue × 100%

What it measures: How much of each revenue dollar survives the cost of actually serving customers. For SaaS, COGS is the direct cost of running the service—hosting and infrastructure, customer support, and payment-processing fees—not sales, marketing, or R&D. Gross margin sets the ceiling on every downstream margin and unit-economics metric: LTV, CAC payback, and the Rule of 40 all degrade when it is low, because there is simply less of each dollar to work with. It is the structural reason software is a high-multiple business—once built, each additional customer is cheap to serve.
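
The Python sketch below applies the formula to two revenue lines and to their blend. All figures are hypothetical; the point is only that a break-even services line can pull a healthy software margin down in the total.

```python
def gross_margin(revenue, cogs):
    """Gross margin as a percentage of revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return 100 * (revenue - cogs) / revenue


# Hypothetical quarter, in $k: subscription and services reported separately.
subscription = gross_margin(revenue=800, cogs=160)
services = gross_margin(revenue=200, cogs=200)  # break-even services line
blended = gross_margin(revenue=800 + 200, cogs=160 + 200)
print(subscription, services, blended)
```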

Benchmarks:
- Software SaaS commonly runs ~70–80%+; the median total gross margin for private SaaS was ~71% in the 2024 KeyBanc survey, and subscription-only margin sat near 79% (KeyBanc 2024 / ICONIQ 2025).
- Self-serve / PLG models reach roughly 80–85% (lighter COGS: hosting, payment processing, automated support).
- Enterprise / high-touch models run roughly 70–75% (heavier COGS: implementation, professional services, DevOps).
- AI-native products often sit lower (~50–60%) because GPU compute and model-inference costs land in COGS.

What to watch:
- Rising: Usually infrastructure efficiency (better cloud utilization, cheaper egress) or a mix shift toward higher-margin self-serve revenue. Confirm it’s structural and not a one-time credit—reserved-instance discounts and vendor credits flatter a quarter without changing the underlying cost curve.
- Falling: Watch for COGS creeping in where it doesn’t belong—heavy professional-services revenue, generous support staffing, or inference costs on an AI feature can drag blended margin down. Segment subscription margin from services margin; a healthy software margin can be masked by a low-margin services line bundled into the total.

In practice: A SaaS company reported a blended 62% gross margin and assumed its software economics were weak. Splitting the line revealed subscription margin was a healthy 81%, dragged down by a break-even professional-services arm sold to win enterprise logos. The product itself was fine; the services were a customer-acquisition cost masquerading as low-margin revenue. Reporting the two separately changed how the board read the business and stopped a misguided push to cut hosting spend that wasn’t the problem.

Vanity risk: A blended gross margin can hide a structural problem. A flattering total margin built partly on a one-time cloud credit, or a software margin diluted by a low-margin services line reported as one number, both mislead. Separate subscription margin from services margin and strip one-time costs before trusting the figure.

Related metrics:
- ltvcac: LTV:CAC Ratio — gross margin is baked into LTV, so a low margin quietly weakens the entire unit-economics equation.
- mrr: MRR — recurring revenue is the top line of the margin calculation; margin determines how much of that MRR actually converts to profit.
- payback: CAC Payback — payback is computed on gross-margin-adjusted revenue, so a lower margin directly lengthens how long it takes to recover CAC.

Sources:
- SaaS Gross Margin Benchmarks — Self-Serve vs. High-Touch (median ~71% total / ~79% subscription, per 2024 KeyBanc SaaS Survey & 2025 ICONIQ) — HumanR — https://www.humanr.ai/intelligence/saas-gross-margin-benchmarks-self-serve-vs-high-touch

---

## Net Promoter Score (NPS)

- Slug: nps
- URL: https://productmetrics.org/metrics/nps
- Categories / tags: referral, retention, leading

Definition: How likely customers are to recommend you.

Formula: NPS = % Promoters − % Detractors (ranges from −100 to +100)

What it measures: A loyalty metric based on one question: "How likely are you to recommend us?" (0-10 scale). Respondents are grouped as Promoters (9-10), Passives (7-8), or Detractors (0-6).
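
A short Python sketch of the grouping and the subtraction, using made-up survey scores. Passives (7-8) count toward the total but toward neither group.

```python
def nps(scores):
    """Percent promoters (9-10) minus percent detractors (0-6)."""
    if not scores:
        raise ValueError("NPS is undefined with no responses")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


# Hypothetical responses on the 0-10 scale.
scores = [10, 9, 9, 8, 7, 6, 3, 10, 9, 5]
result = nps(scores)  # 5 promoters, 3 detractors, 2 passives
print(f"NPS: {result:+.0f}")
```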

Benchmarks:
- Good: 30+
- Excellent: 50+
- World-class: 70+

What to watch:
- Rising: Improving loyalty. But NPS varies by industry, so compare to competitors rather than absolute benchmarks.
- Falling: Investigate qualitative feedback from Detractors. A small NPS drop may reflect a vocal minority; a sustained decline signals systemic issues.

In practice: A hospitality app redesigned its booking flow and saw NPS jump from 32 to 48. But when they segmented by user type, power users' NPS actually dropped (they missed removed shortcuts). The team added back keyboard shortcuts for frequent bookers while keeping the simplified flow for casual users.

Vanity risk: NPS without segmentation hides critical signals. An overall NPS of 40 can mask power users at -10 and casual users at 60. Always segment by user type, tenure, and usage level.

Related metrics:
- pmf: PMF Score — deeper product-market fit signal.
- csat: CSAT — interaction-specific satisfaction.

---

## Product-Market Fit Score

- Slug: pmf
- URL: https://productmetrics.org/metrics/pmf
- Categories / tags: referral, leading

Definition: How disappointed users would be without your product. The earliest reliable signal of PMF.

Formula: PMF Score = % of users answering "Very disappointed" if they could no longer use the product

What it measures: A survey-based leading indicator of product-market fit. Ask users: "How would you feel if you could no longer use [product]?" Options: Very disappointed, Somewhat disappointed, Not disappointed. The percentage answering "Very disappointed" predicts growth potential.
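
Scoring the survey is a single count. The Python sketch below assumes answers are stored as the exact option strings listed above; the response counts are invented.

```python
from collections import Counter


def pmf_score(answers):
    """Percentage of respondents answering 'Very disappointed'."""
    if not answers:
        raise ValueError("PMF score is undefined with no responses")
    counts = Counter(answers)
    return 100 * counts["Very disappointed"] / len(answers)


# Hypothetical survey of 100 users.
answers = (
    ["Very disappointed"] * 45
    + ["Somewhat disappointed"] * 35
    + ["Not disappointed"] * 20
)
pmf = pmf_score(answers)
print(f"PMF score: {pmf:.0f}%")
```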

What to watch:
- Above 40%: Strong product-market fit indicator — this is Sean Ellis’s 40% threshold. Slack scored 51% on this survey when validating PMF. Companies crossing the threshold typically see organic growth accelerate.
- Below 40%: Keep iterating. Your product solves a problem but isn't yet a must-have. Focus on understanding what "very disappointed" users love and double down on that.

In practice: A productivity tool launched with 28% PMF score. The team analyzed the "very disappointed" segment and found they all used one specific feature: automated time blocking. They rebuilt the entire product around that feature, and PMF score rose to 47%. The pivot was informed by users who already loved them, not average users.

Related metrics:
- nps: NPS — loyalty measure.
- activation: Activation Rate — PMF depends on users experiencing core value.

Sources:
- How Superhuman Built an Engine to Find PMF (Sean Ellis 40% test; Slack scored 51%) — First Round Review — https://review.firstround.com/how-superhuman-built-an-engine-to-find-product-market-fit/

---

## Viral Coefficient (K-factor)

- Slug: k-factor
- URL: https://productmetrics.org/metrics/k-factor
- Categories / tags: referral, leading

Definition: How many new users each existing user generates through invitations. The core measure of organic, self-propelled growth.

Formula: K = Invites sent per user × Invite conversion rate

What it measures: The number of new users a single user brings in through referrals or invitations during one viral cycle. A K of 0.5 means every two users generate one new user; a K of 1 means each user replaces themselves exactly. The threshold that matters is 1: at K ≥ 1 the loop is self-sustaining and growth compounds without paid acquisition, while at K < 1 each cohort produces a smaller next cohort and virality decays to zero on its own. Cycle time—how long one invite-to-signup loop takes—matters as much as K itself: a modest K with a fast cycle can outpace a higher K that takes weeks to turn over.
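
To make the decay at K < 1 concrete, here is a Python sketch that multiplies the two inputs and then rolls a seeded cohort forward through a few viral cycles. The function names and figures are hypothetical, and the model assumes K stays constant across cycles, which real products rarely manage.

```python
def k_factor(invites_per_user, invite_conversion_rate):
    """K = invites sent per user x invite conversion rate."""
    return invites_per_user * invite_conversion_rate


def total_users(seed_users, k, cycles):
    """Seed cohort plus the users each successive viral cycle adds."""
    total = cohort = float(seed_users)
    for _ in range(cycles):
        cohort *= k      # each cohort recruits the next one
        total += cohort
    return total


# Hypothetical loop: 2 invites per user, 25% of invites convert.
k = k_factor(2, 0.25)
after_three = total_users(1_000, k, cycles=3)  # 1000 + 500 + 250 + 125
print(k, after_three)
```

With K below 1 the total approaches seed / (1 − K), here 2,000 users, so each acquired user is eventually worth two.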

What to watch:
- K ≥ 1: Self-sustaining viral growth—rare, and usually only briefly. Sustained K above 1 implies pure exponential expansion, which almost no product holds for long. If you measure K ≥ 1, validate the inputs before celebrating; double-counting invites or attributing organic signups to the loop is the common cause of an inflated reading.
- K < 1: The realistic case for the vast majority of products. A sub-1 K still lowers blended CAC—a K of 0.3 means roughly a third of growth is free—so the goal is usually to maximize K as a CAC reducer, not to chase K ≥ 1. Improve it by lifting either lever: more invites sent per user (better prompts, timing, incentives) or higher invite conversion (clearer value in the invite, lower friction to accept).

When not to use: For products without a genuine invitation or sharing mechanic, K is not meaningful—word-of-mouth that doesn’t flow through a trackable invite is better captured by NPS or organic-channel attribution. Don’t manufacture a viral loop just to have a K to report.

In practice: A collaboration tool measured K at 0.4 and set a goal of crossing 1.0. Months of invite-flow optimization moved it only to 0.55. Reframing the goal exposed the real win: even at 0.55, referrals were cutting blended CAC by more than a third, which was a larger lever than any paid channel they ran. They stopped chasing the 1.0 milestone and instead optimized K as a cost-of-acquisition reducer, which made the unit economics work without the fantasy of pure virality.

Vanity risk: A reported K is only as honest as its inputs. Counting invites that were never delivered, or crediting the loop for signups that would have come organically, inflates K without inflating growth. A K ≥ 1 that doesn’t show up as compounding user counts is a measurement artifact, not viral growth.

Related metrics:
- nps: NPS — captures willingness to recommend; K measures whether that willingness actually converts into new users through a loop.
- cac: Customer Acquisition Cost — every fraction of K below 1 still reduces blended CAC; the two are the paid and organic halves of acquisition.
- conversion: Conversion Rate — invite conversion rate is one of the two multiplicative inputs to K.

---

## Scroll Depth

- Slug: scroll-depth
- URL: https://productmetrics.org/metrics/scroll-depth
- Categories / tags: activation

Definition: How far users scroll down a page, measured as a percentage of total page height.

Formula: Scroll Depth = (Furthest scroll position / Total page height) × 100%

What it measures: The percentage of page content a user actually sees. Measured as the furthest point a user scrolls to during a page visit. Typically tracked at 25%, 50%, 75%, and 100% thresholds, or as a continuous percentage. Aggregated as median or average across sessions.
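
A rough Python sketch of the per-session calculation and the median aggregation. It assumes `furthest_px` is the lowest point of the page that entered the viewport; the threshold tuple mirrors the 25/50/75/100 convention above, and the sample positions are invented.

```python
from statistics import median

THRESHOLDS = (25, 50, 75, 100)


def scroll_depth(furthest_px, page_height_px):
    """Furthest scroll position as a percentage of page height, capped at 100."""
    if page_height_px <= 0:
        raise ValueError("page height must be positive")
    return min(100.0, 100 * furthest_px / page_height_px)


def thresholds_reached(depth_pct):
    """Which of the standard tracking thresholds a session passed."""
    return [t for t in THRESHOLDS if depth_pct >= t]


# Hypothetical sessions on a 4,000 px page.
depths = [scroll_depth(px, 4_000) for px in (500, 1_500, 3_000, 4_000)]
median_depth = median(depths)
print(depths, median_depth, thresholds_reached(depths[2]))
```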

Benchmarks:
- Blog/editorial content: 50–60% average depth
- Landing pages: 60–70% average depth
- Product pages: 40–55% average depth
- On long-form content, higher scroll depth tends to correlate with higher conversion — though the exact lift varies by page and intent.

What to watch:
- Deep scrolling (75%+): Users are engaging with your full content. Validate that conversion elements are placed where users actually reach them, not just above the fold. High depth on key pages often predicts activation.
- Shallow scrolling (<25%): Most users aren’t seeing your content. Check for slow load times, misleading page titles, or weak opening content. If scroll depth is shallow but bounce rate is low, users may be finding what they need in the header—which could be fine.

In practice: A SaaS company discovered that their pricing page had 35% average scroll depth, meaning most visitors never saw the feature comparison table below the fold. Moving the comparison above the pricing tiers increased trial signups by 22%. The insight wasn’t that the content was bad—it was invisible.

Related metrics:
- stickiness: DAU/MAU Ratio — deep scrolling on content pages often correlates with habitual product usage.
- activation: Activation Rate — scroll depth on onboarding pages reveals whether users see enough to activate.
- ces: Customer Effort Score — shallow scrolling on help pages suggests users can’t find answers.

Sources:
- Scrolling and attention (scroll-depth behavior) — Nielsen Norman Group — https://www.nngroup.com/articles/scrolling-and-attention/

---

## Form Abandonment Rate

- Slug: form-abandonment
- URL: https://productmetrics.org/metrics/form-abandonment
- Categories / tags: activation

Definition: The percentage of users who start filling out a form but leave without submitting it.

Formula: Form Abandonment Rate = ((Forms started − Forms submitted) / Forms started) × 100%

What it measures: The proportion of users who interact with at least one form field but never submit. A form is "started" when a user focuses on any field. Tracks the gap between intent (starting the form) and completion (submitting it). High abandonment points to friction in the conversion path.
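
The sketch below computes the rate in Python and adds a simple field-level view: counting the last field each abandoning user touched. The event shape, field names, and counts are assumptions for illustration, not output from any particular analytics tool.

```python
from collections import Counter


def form_abandonment_rate(started, submitted):
    """Share of started forms that were never submitted, as a percentage."""
    if started <= 0:
        raise ValueError("no forms were started")
    return 100 * (started - submitted) / started


def top_drop_off_field(last_fields_touched):
    """Most common last-focused field among abandoned sessions."""
    return Counter(last_fields_touched).most_common(1)[0]


# Hypothetical week: 500 forms started, 140 submitted.
rate = form_abandonment_rate(started=500, submitted=140)
# Hypothetical last-touched fields from a sample of abandoned sessions.
worst = top_drop_off_field(
    ["email", "employer_phone", "employer_phone", "income", "employer_phone"]
)
print(f"{rate:.0f}% abandoned; worst field: {worst[0]} ({worst[1]} sessions)")
```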

Benchmarks:
- Checkout flows average ~70% cart abandonment (Baymard, across 50+ studies; range 55–85%)
- Registration/signup forms: ~40–60% abandonment is typical
- Contact forms: ~20–40% abandonment is typical
- Longer forms abandon more — Baymard finds most checkouts can safely cut 20–60% of their form fields.

What to watch:
- Rising abandonment: Check for newly added fields, broken validation, confusing labels, or mobile layout issues. Segment by device type—mobile abandonment is typically 10–15% higher than desktop. Also check if users are abandoning on a specific field, which reveals exactly where friction lives.
- Low abandonment: Good, but verify your form isn’t so short that it’s under-qualifying leads. A signup form with near-zero abandonment but high post-signup churn may need more qualifying fields, not fewer.

When not to use: For forms that are intentionally exploratory (search boxes, filters). These have high "abandonment" by design because users interact without always submitting.

In practice: A fintech startup saw 72% abandonment on their account application form. Field-level analysis showed users dropping off at the "employer phone number" field. Removing that single field (which wasn’t required for underwriting) dropped abandonment to 51% and increased completed applications by 73%.

Related metrics:
- activation: Activation Rate — form completion often defines the activation event.
- ces: Customer Effort Score — high form abandonment is a direct signal of excessive effort.
- conversion: Conversion Rate — form abandonment is a primary driver of conversion drop-off.

Sources:
- Cart & checkout abandonment rate (~70%, 50+ studies) — Baymard Institute — https://baymard.com/lists/cart-abandonment-rate

---

## Frustration Score

- Slug: frustration-score
- URL: https://productmetrics.org/metrics/frustration-score
- Categories / tags: retention, health
- Kind: composite (emerging, tool-defined score — not an industry standard)

Definition: A composite score measuring user frustration signals during a session, derived from rage clicks, dead clicks, error encounters, and navigation reversals.

Formula: Frustration Score = (rage clicks × 3 + dead clicks × 2 + errors × 2 + U-turns × 1) / session duration in minutes

What it measures: An aggregate behavioral signal that combines multiple indicators of user frustration into a single per-session score. Higher scores indicate more frustrated sessions. Unlike survey-based metrics (CSAT, CES), frustration score is entirely observed—users don’t need to tell you they’re frustrated; their behavior shows it.
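
A minimal Python sketch of the per-session score, using the weights from the formula above. Since this is a tool-defined composite, treat the weights as one possible configuration; the dictionary keys and sample counts are hypothetical.

```python
# Weights taken from the formula above; other tools may weight signals differently.
WEIGHTS = {"rage_clicks": 3, "dead_clicks": 2, "errors": 2, "u_turns": 1}


def frustration_score(signal_counts, session_minutes):
    """Weighted frustration signals per minute of session time."""
    if session_minutes <= 0:
        raise ValueError("session duration must be positive")
    weighted = sum(w * signal_counts.get(name, 0) for name, w in WEIGHTS.items())
    return weighted / session_minutes


# Hypothetical 5-minute session.
session = {"rage_clicks": 2, "dead_clicks": 1, "errors": 1, "u_turns": 3}
score = frustration_score(session, session_minutes=5)  # (6 + 2 + 2 + 3) / 5
print(score)
```

Dividing by duration keeps long sessions from looking frustrated simply because they had more time to accumulate events.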

What to watch:
- High-frustration sessions: Segment by page, device, and user cohort to find patterns. A few high-frustration pages often account for most of the problem. Prioritize pages where frustration is high AND traffic is high—that’s where the most users are suffering.
- Frustration trending up: Often follows a deploy. Compare frustration scores before and after releases to catch UX regressions early. A 20%+ increase in average frustration after a release warrants investigation.

In practice: A B2B platform used frustration scoring to identify their worst UX problem: a dropdown menu that appeared clickable but required hover. It generated 40% of all rage clicks site-wide. The fix was a 2-line CSS change that reduced average frustration score by 35% across the entire product.

Related metrics:
- ces: Customer Effort Score — frustration score is the behavioral counterpart to the survey-based CES.
- churn: Customer Churn Rate — persistently high frustration is a leading indicator of churn.
- csat: CSAT — frustration inversely correlates with satisfaction scores.
- rage-click-rate: Rage Click Rate — rage clicks are the highest-weighted input to frustration score.

---

## Rage Click Rate

- Slug: rage-click-rate
- URL: https://productmetrics.org/metrics/rage-click-rate
- Categories / tags: retention

Definition: The percentage of sessions containing rage clicks—rapid, repeated clicks on the same element, indicating user frustration with unresponsive or misleading UI.

Formula: Rage Click Rate = (Sessions with ≥1 rage click / Total sessions) × 100%

What it measures: The proportion of sessions where a user clicks the same element 3 or more times within 2 seconds. Rage clicks reveal elements that look interactive but aren’t, slow-loading buttons, or broken functionality. Unlike error rates, rage clicks catch UX problems that don’t throw errors but still frustrate users.
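
One way to detect the pattern is a sliding window over each element's click timestamps. The Python sketch below assumes clicks arrive as `(timestamp_seconds, element_id)` tuples, a hypothetical event shape, and as a simplification it does not require the clicks to be consecutive (other clicks may fall in between).

```python
def has_rage_click(clicks, min_clicks=3, window_seconds=2.0):
    """True if any element got min_clicks clicks within window_seconds."""
    by_element = {}
    for ts, element in sorted(clicks):
        by_element.setdefault(element, []).append(ts)
    for times in by_element.values():
        # Compare each click with the one (min_clicks - 1) positions later.
        for i in range(len(times) - min_clicks + 1):
            if times[i + min_clicks - 1] - times[i] <= window_seconds:
                return True
    return False


def rage_click_rate(sessions):
    """Percentage of sessions containing at least one rage click."""
    flagged = sum(1 for s in sessions if has_rage_click(s))
    return 100 * flagged / len(sessions)


# Hypothetical sessions: only the first has 3 clicks on one element within 2 s.
sessions = [
    [(0.0, "coupon"), (0.5, "coupon"), (1.2, "coupon")],
    [(0.0, "coupon"), (3.0, "coupon"), (6.0, "coupon")],
    [(0.0, "nav"), (0.4, "cart"), (0.9, "nav")],
    [(1.0, "buy")],
]
rate = rage_click_rate(sessions)
print(f"Rage click rate: {rate:.0f}%")
```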

Benchmarks:
- Healthy products: <3% of sessions contain rage clicks
- Acceptable range: 3–7% of sessions
- Needs attention: >7% of sessions
- A single broken element can account for 50%+ of all rage clicks site-wide

What to watch:
- Concentrated rage clicks: If most rage clicks target 1–2 elements, you have a specific UI bug. Common culprits: buttons with slow API calls and no loading state, elements styled like links but without click handlers, and disabled buttons with no visual feedback.
- Dispersed rage clicks: If rage clicks are spread across many elements, you may have a systemic responsiveness issue. Check for global performance problems: slow JavaScript execution, render-blocking resources, or aggressive throttling.

In practice: An e-commerce site found 12% of checkout sessions had rage clicks—nearly all on the "Apply Coupon" button. The button triggered a 3-second API call with no loading indicator. Adding a spinner and disabling the button during validation dropped rage click rate to 2% and reduced checkout abandonment by 8%.

Related metrics:
- frustration-score: Frustration Score — rage clicks are weighted highest in the composite frustration score.
- ces: Customer Effort Score — rage clicks indicate UI elements that require excessive effort.
- form-abandonment: Form Abandonment Rate — rage clicks in forms often precede abandonment.

---

## Bounce Rate

- Slug: bounce-rate
- URL: https://productmetrics.org/metrics/bounce-rate
- Categories / tags: acquisition, leading

Definition: Percentage of sessions where the visitor views one page and leaves without further interaction.

Formula: Bounce Rate = (Single-page sessions / Total sessions) × 100%

What it measures: The share of visits that end on the entry page with no second pageview, click, or qualifying interaction. Bounce thresholds depend on the page’s job: a single-purpose landing page with 90% bounce can be healthy if visitors took the action; a blog post with 40% bounce can be struggling if visitors left mid-read. Always interpret bounce alongside scroll depth and time on page.

Benchmarks:
- Marketing landing pages: 70–90%
- Blog and editorial: 60–80%
- E-commerce category pages: 20–45%
- SaaS app dashboards (logged-in): 10–30%

What to watch:
- Sudden spike on a single page: Usually a deploy regression — broken layout, slow load, or a tracking script that fires before paint. Compare bounce rate before and after the last release. If isolated to one page, look at scroll depth on that page; near-zero scroll plus high bounce often means the page didn’t render.
- High bounce on paid traffic: Message-mismatch between the ad and the landing page. The visitor expected something the page didn’t deliver in the first 3 seconds. Compare bounce by traffic source — paid bouncing 2× organic is a creative or targeting problem, not a page problem.
- Low bounce that doesn’t convert: Visitors are clicking around but not completing the goal. Bounce is healthy by itself but says nothing about whether engagement is productive. Pair with conversion rate or session duration before celebrating.
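
The traffic-source comparison above can be sketched as follows, assuming each session is summarized as `(source, pageviews, had_interaction)`. That record shape and the sample counts are hypothetical; real tools differ in what they count as a qualifying interaction.

```python
from collections import Counter


def bounce_rate_by_source(sessions):
    """sessions: iterable of (source, pageviews, had_interaction) tuples.

    A session counts as a bounce when it has exactly one pageview and no
    qualifying interaction. Returns {source: bounce rate in percent}.
    """
    totals, bounces = Counter(), Counter()
    for source, pageviews, had_interaction in sessions:
        totals[source] += 1
        if pageviews == 1 and not had_interaction:
            bounces[source] += 1
    return {source: bounces[source] / totals[source] * 100 for source in totals}


# Hypothetical sample: 4 paid sessions (3 bounces), 8 organic sessions (3 bounces).
sessions = (
    [("paid", 1, False)] * 3
    + [("paid", 3, False)]
    + [("organic", 1, False)] * 3
    + [("organic", 2, False)] * 4
    + [("organic", 1, True)]   # one page, but the visitor interacted: not a bounce
)
rates = bounce_rate_by_source(sessions)
print(rates["paid"], rates["organic"])   # 75.0 37.5
print(rates["paid"] / rates["organic"])  # 2.0
```

A ratio near 2 between paid and organic is the pattern the guidance above attributes to creative or targeting rather than to the page itself.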

In practice: A B2B landing page had a 78% bounce rate, and the team assumed the page was failing. Scroll depth showed visitors were reading 60%+ of the page on average; they just weren’t clicking the CTA, which sat below a long block of social proof. Moving the CTA above the social proof dropped bounce to 71% and lifted demo requests roughly 60%, because many of the visitors who had been bouncing were engaged readers who left before they ever reached the CTA.

Vanity risk: Bounce rate without page-intent context is misleading. A pricing page with 30% bounce isn’t automatically better than a single-purpose contact page with 90% bounce — they have different jobs. Always interpret bounce alongside what the page was supposed to do.

Related metrics:
- scroll-depth: Scroll Depth — bounce + low scroll = visitor never engaged; bounce + high scroll = page did its job in one view.
- frustration-score: Frustration Score — bounces accompanied by rage clicks or errors are a bug signal, not a disinterest signal.
- conversion: Conversion Rate — bounce rate is the negative-space mirror of conversion on entry pages.

---

## Average Session Duration

- Slug: avg-session-duration
- URL: https://productmetrics.org/metrics/avg-session-duration
- Categories / tags: retention, leading

Definition: The average length of a user session, from first interaction to last, across all sessions in a given period.

Formula: Avg Session Duration = Total session time / Total sessions

What it measures: How long visitors stay engaged in a single visit, on average. A session typically begins on the first page or interaction and ends after 30 minutes of inactivity. Duration is a blunt instrument on its own, because long sessions can mean deep engagement or users stuck on a slow page, but it becomes useful when paired with depth (pages per session) and intent (which goal action, if any, the visitor completed).
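
A minimal sketch of the 30-minute inactivity rule and the averaging formula, assuming one visitor's events are available as timestamps in seconds. The function names and sample timestamps are illustrative.

```python
SESSION_TIMEOUT_SECONDS = 30 * 60  # a session ends after 30 minutes of inactivity


def session_durations(event_times):
    """Split one visitor's event timestamps (seconds) into sessions.

    Returns each session's duration, measured from first to last event.
    """
    durations = []
    start = last = None
    for ts in sorted(event_times):
        if start is None:
            start = last = ts
        elif ts - last > SESSION_TIMEOUT_SECONDS:
            durations.append(last - start)  # close the previous session
            start = last = ts
        else:
            last = ts
    if start is not None:
        durations.append(last - start)
    return durations


def avg_session_duration(durations):
    """Avg Session Duration = Total session time / Total sessions."""
    return sum(durations) / len(durations) if durations else 0.0


events = [0, 60, 300, 4000, 4120]       # the gap between 300 and 4000 exceeds 30 minutes
durations = session_durations(events)
print(durations)                        # [300, 120]
print(avg_session_duration(durations))  # 210.0
print(session_durations([500]))         # [0]
```

The last line shows why first-to-last-event timing undercounts: a session with a single recorded event has a measured duration of zero, however long the visitor actually stayed.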

Benchmarks:
- Marketing landing pages: 30–90 seconds
- Blog and editorial: 2–4 minutes
- E-commerce browse: 3–6 minutes
- SaaS product app: 8–20 minutes

What to watch:
- Rising: Visitors are spending more time. Verify with scroll depth and pages per session — long sessions with low depth often mean users are stuck or confused, not engaged. A 5-minute session that ends with a rage click is worse than a 90-second session that converts.
- Falling: Could be efficiency (users finding what they need faster — good) or disinterest (users leaving sooner — bad). Check conversion rate alongside: if it held or rose, faster sessions are healthy; if it dropped, you lost engagement.

When not to use: For single-page applications whose analytics do not mark explicit session ends, duration estimates are unreliable: most analytics tools record the time on the last page as zero, which deflates the average.

In practice: A documentation site celebrated when average session duration jumped from 2.4 to 4.1 minutes after a navigation redesign. Three weeks later, support tickets rose 18%. The new navigation buried key answers two clicks deeper — users were spending more time because they couldn’t find what they needed. Reverting the navigation cut session duration back to 2.5 minutes and reduced tickets by 22%. Longer wasn’t better; faster was.

Vanity risk: Average session duration without a depth or conversion companion is hollow. A site can lift duration by making everything slower or more confusing — a technically successful number that hides a UX failure. Always interpret alongside what users actually accomplished.

Related metrics:
- bounce-rate: Bounce Rate — sessions that end on page one have near-zero duration; bounce rate and average duration are two halves of the same picture.
- scroll-depth: Scroll Depth — pairs with duration to distinguish engaged sessions from stuck ones.
- session-score: Session Score — duration is one of the four inputs to the composite session score.

---

## Session Score

- Slug: session-score
- URL: https://productmetrics.org/metrics/session-score
- Categories / tags: retention, health
- Kind: composite (emerging, tool-defined score — not an industry standard)

Definition: A composite 0–100 score measuring whether sessions are deep and engaged or shallow and bouncy, derived from bounce rate, average duration, pages per session, and session-level frustration.

Formula: Session Score = 100 − (bounce penalty + duration penalty + depth penalty + frustration penalty)

What it measures: A single per-site score that absorbs the four signals most predictive of session health: how often visitors leave on page one (bounce, weighted heaviest), how long they stay (duration), how far they explore (pages per session), and how often they hit friction (frustration sessions). The penalty weights add up to 100 — a perfect score means none of the four failure modes are present at scale; a low score means at least one failure mode is dominant. Designed for at-a-glance health monitoring, not deep diagnosis: when the score drops, look at the breakdown to find which input regressed.
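
A sketch of how the composite could be assembled. The caps for bounce (30), duration (25), and frustration (20) come from the Related metrics list in this entry; the depth cap of 25 is inferred from the statement that the weights add up to 100. How a raw input maps to a penalty is not specified here, so this sketch assumes each input has already been normalized to a severity between 0 (healthy) and 1 (worst) and scales it linearly. That mapping is an assumption, not the tool's actual formula.

```python
# Penalty caps: bounce, duration, and frustration are stated in this entry;
# depth is inferred so that the caps total 100.
PENALTY_CAPS = {"bounce": 30, "duration": 25, "depth": 25, "frustration": 20}


def session_score(severity):
    """severity: dict of input name -> value in [0, 1] (assumed normalization).

    Returns (score, breakdown), where breakdown maps each input to its penalty.
    """
    breakdown = {}
    for name, cap in PENALTY_CAPS.items():
        clamped = min(max(severity.get(name, 0.0), 0.0), 1.0)
        breakdown[name] = cap * clamped
    return 100 - sum(breakdown.values()), breakdown


score, breakdown = session_score(
    {"bounce": 0.5, "duration": 0.4, "depth": 0.2, "frustration": 0.25}
)
dominant = max(breakdown, key=breakdown.get)
print(round(score), dominant)  # 65 bounce
```

Returning the breakdown alongside the score reflects the guidance above: the headline number signals that something moved, and the largest penalty says where to look.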

What to watch:
- Score above 75: Most sessions are healthy by all four inputs. Use the breakdown to find the largest remaining penalty and decide whether the marginal effort is worth it — a 78 with a small bounce penalty is usually healthier than chasing 90.
- Score below 50: At least one input has collapsed. Look at the four penalties: a big bounce penalty points to landing-page or traffic-source problems; a big duration or depth penalty points to engagement or navigation; a big frustration penalty points to specific UX bugs. Fix the dominant penalty first — improvements compound through the composite.
- Score moving without an obvious cause: Composite scores can shift because one input changed dramatically while others held steady. Always pull the breakdown before reacting — a 10-point drop driven entirely by frustration is a different problem than the same drop driven by bounce.

In practice: A SaaS marketing site watched its session score drop from 71 to 58 over two weeks with no obvious traffic shift. The breakdown showed that the bounce and duration penalties had each risen by ~6 points. Tracing back, a recent A/B test on the homepage hero had been promoted to 100%, and the new variant loaded a 1.4MB hero video that pushed first paint past 4 seconds on mobile. Reverting the variant returned session score to 70 within three days. The composite caught the regression before any single page-level metric crossed an alarm threshold.

Vanity risk: Composite scores can mask offsetting changes — a site that improved bounce while frustration crept up may show flat session score even though the underlying experience changed materially. Always pull the breakdown before declaring a score stable.

Related metrics:
- bounce-rate: Bounce Rate — heaviest weighted input (max 30 points of penalty).
- avg-session-duration: Average Session Duration — second input (max 25 points of penalty).
- frustration-score: Frustration Score — session-level frustration is the fourth input (max 20 points of penalty).
- engagement-score: Engagement Score — overlapping composite that emphasizes return visits rather than depth-within-session.

---

## Navigation Score

- Slug: navigation-score
- URL: https://productmetrics.org/metrics/navigation-score
- Categories / tags: retention, health
- Kind: composite (emerging, tool-defined score — not an industry standard)

Definition: A composite 0–100 score measuring whether visitors move through a site cleanly, derived from drop-off rates at key pages, session-loop frequency, and the share of single-page sessions.

Formula: Navigation Score = 100 − (drop-off penalty + loop penalty + single-page penalty)

What it measures: A per-site score that captures three distinct ways navigation can fail. Drop-offs flag pages where visitors disproportionately leave the site, an exit funnel that points to a specific page problem. Loops flag sessions that revisit the same page repeatedly, the behavioral signature of confusion or a missing answer. Single-page sessions flag visits that never explored beyond entry. Each penalty is capped at between 30 and 40 points; a low score means at least one of the three failure modes is widespread. The score is a proxy for whether the site’s information architecture is getting visitors where they’re trying to go.
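
The loop signal can be sketched as below. This entry does not say how many revisits make a loop, so the threshold of three visits to the same page within a session is an assumption chosen for illustration, as are the function names and sample paths.

```python
from collections import Counter

LOOP_VISIT_THRESHOLD = 3  # assumption: 3+ views of one page in a session is a loop


def is_loop_session(pages):
    """pages: ordered list of page paths viewed in one session."""
    counts = Counter(pages)
    return any(n >= LOOP_VISIT_THRESHOLD for n in counts.values())


def loop_rate(sessions):
    """Share of sessions (in percent) flagged as loops."""
    if not sessions:
        return 0.0
    return sum(is_loop_session(pages) for pages in sessions) / len(sessions) * 100


# Hypothetical sessions: the first keeps returning to the homepage to search again.
sessions = [
    ["/", "/search", "/tutorial", "/", "/search", "/tutorial", "/"],
    ["/", "/docs/api"],
    ["/pricing"],
    ["/", "/blog", "/"],
]
print(loop_rate(sessions))  # 25.0
```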

What to watch:
- Score above 75: Visitors are flowing through the site without major dead-ends or disorientation. Use the breakdown to identify whether remaining friction is concentrated (one bad page) or distributed (a structural IA issue).
- Concentrated drop-off penalty: One or two pages account for most of the score loss. These are usually fixable — broken links, confusing CTAs, or content that doesn’t match the page’s entry context. Tools that show top exit pages and their drop-off rates make this a one-evening fix in many cases.
- High loop penalty: Visitors are returning to the same page mid-session, often the homepage or a search results page. They tried something, it didn’t work, and they backed up to try again. This is a navigation-design problem, not a page problem — labels are unclear, the IA forces too many pivots, or search isn’t surfacing the right results.

In practice: A documentation site had a navigation score of 62, with most of the penalty (24 of the 38 points lost) coming from loops centered on the homepage. Session recordings showed users searching for an API method, landing on a tutorial that mentioned the method but didn’t explain it, returning to the homepage, searching again with a slight variation, and looping. Adding direct-link search results for API method names cut the loop rate in half within a week, and navigation score rose to 78. The penalty breakdown told them where to look; the recordings told them what to fix.

Vanity risk: A site with shallow content (few pages worth visiting) can score low on navigation score for structural reasons that aren’t failures — there simply isn’t much to navigate. Interpret the score relative to what visitors are trying to do, not against an absolute target.

Related metrics:
- bounce-rate: Bounce Rate — the single-page-session penalty in this composite is essentially the bounce contribution to navigation score.
- scroll-depth: Scroll Depth — high drop-off plus low scroll depth often means visitors didn’t see the navigation paths the page offered.
- session-score: Session Score — overlapping composite; navigation score emphasizes flow between pages rather than depth within a session.

---

## Engagement Score

- Slug: engagement-score
- URL: https://productmetrics.org/metrics/engagement-score
- Categories / tags: retention, health
- Kind: composite (emerging, tool-defined score — not an industry standard)

Definition: A composite 0–100 score measuring depth and return frequency of visitor engagement, derived from median session duration, bounce rate, return-visit rate, and the share of low-engagement sessions.

Formula: Engagement Score = 100 − (duration penalty + bounce penalty + return penalty + tier penalty)

What it measures: A per-site score that complements session score by looking at engagement as a pattern rather than a single-session property. It uses median session duration (not average — robust to outliers), bounce rate, the share of visitors who return at least once, and the share of sessions that fall in the lowest engagement tiers (bounced or brief). Each input contributes up to 25 points of penalty. The score answers a different question than session score: not whether individual sessions are deep, but whether visitors are developing a habit of coming back.
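
A small illustration of why the duration input uses the median. The session durations are made-up values in seconds, with one tab left open for an hour.

```python
from statistics import mean, median

# Hypothetical session durations in seconds; the last one is an idle tab.
durations = [40, 55, 60, 75, 90, 3600]

print(round(mean(durations), 1))  # 653.3
print(median(durations))          # 67.5
```

The single outlier pulls the mean to nearly eleven minutes, while the median stays close to what a typical visitor experienced.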

What to watch:
- Score above 75: Visitors are spending real time and many are returning. Use the breakdown to identify whether remaining penalty is on the depth side (duration, bounce) or the return side (return rate, low-engagement tiers).
- High return penalty, low duration penalty: Visitors are engaged when they’re here, but they’re not coming back. This is a re-engagement or memorability problem: strong content or product, but no reason to return. Email digests, feed mechanics, or product hooks often address this faster than UX work.
- Low return penalty, high duration penalty: Visitors return often but bounce off quickly each time. Usually means the homepage or entry page isn’t routing returning visitors to where they want to go. Personalization or "continue reading" patterns address this directly.

In practice: A media site’s engagement score sat at 64 for months. The breakdown was unbalanced — duration and bounce penalties were both small, but return penalty was 18 of a possible 25. Visitors loved the articles but never came back. Launching a weekly email digest of the most-shared articles lifted the return rate from 18% to 31% over two months, and engagement score climbed to 72. The composite identified that "make existing visitors return" was a higher-leverage move than "make existing visitors stay longer," which the team had been chasing for a year.

Vanity risk: Composite scores hide offsetting changes. An engagement score that holds steady can mask a worsening return rate offset by improving duration — the underlying engagement pattern shifted even though the headline number didn’t. Always check the breakdown.

Related metrics:
- bounce-rate: Bounce Rate — direct input (max 25 points of penalty); a high score is easier to achieve when bounce is low.
- avg-session-duration: Average Session Duration — the duration input here uses the median rather than the mean for robustness against outliers.
- session-score: Session Score — overlapping composite; engagement score weights return-visit behavior, session score weights within-session depth.
- stickiness: DAU/MAU Stickiness — the authenticated-product analogue; engagement score is the marketing-site or anonymous-visitor view of the same idea.

---

## Largest Contentful Paint (LCP)

- Slug: lcp
- URL: https://productmetrics.org/metrics/lcp
- Categories / tags: retention, leading

Definition: How long until the largest content element in the viewport finishes rendering — the moment a page visually looks loaded to a visitor.

Formula: LCP = Time from navigation start to the render of the largest text block or image in the initial viewport, reported at the 75th percentile of real visits

What it measures: Perceived load speed. LCP marks when the biggest above-the-fold element — a hero image, a headline, a video poster — becomes visible. It is one of the three official Core Web Vitals and the one users feel most directly: a page can have started painting, but if the main content isn’t there yet, it doesn’t feel loaded. Google evaluates it at the 75th percentile of field sessions, so "good" means three of every four real visits were fast — not the average, which hides the slow tail.

Benchmarks:
- Good: ≤ 2.5s at P75
- Needs improvement: 2.5s–4.0s at P75
- Poor: > 4.0s at P75
- Thresholds: Google web.dev Core Web Vitals (field data, 75th percentile)
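
A minimal sketch of evaluating LCP at the 75th percentile against the thresholds above. The nearest-rank percentile method and the sample values are assumptions; field-data tools may interpolate differently.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile (one of several common definitions)."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def rate_lcp(seconds):
    """Classify an LCP value using the thresholds listed above."""
    if seconds <= 2.5:
        return "good"
    if seconds <= 4.0:
        return "needs improvement"
    return "poor"


# Hypothetical field samples in seconds: most visits are fast, the tail is slow.
samples = [0.8, 0.9, 1.0, 1.0, 1.1, 3.2, 3.4, 3.6]
average = sum(samples) / len(samples)
p75 = percentile(samples, 75)

print(rate_lcp(average))   # good
print(p75, rate_lcp(p75))  # 3.2 needs improvement
```

In this sample the mean passes while the P75 does not, which is the slow tail that the percentile is meant to expose.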

What to watch:
- Above 2.5s at P75: Visitors are waiting to see your main content. The usual culprits, in order of frequency: an unoptimized hero image, render-blocking CSS/JS, slow server response (check TTFB first — LCP can never beat it), and client-side rendering that defers the largest element behind a hydration step.
- Good lab score, poor field score: Your test machine is fast and your users’ phones are not. LCP is a field metric; trust the P75 of real visits over a Lighthouse run on a wired connection.
- Regressed after a deploy: A new hero asset, a font swap, or a third-party script frequently moves LCP. Correlate the regression window with the release that preceded it.

In practice: A marketing site had LCP at 4.3s at P75 and a bounce rate climbing on mobile. The largest element was an uncompressed 1.8MB hero PNG served at full resolution to every device. Converting it to a responsive WebP set and preloading the chosen source dropped LCP to 2.1s at P75 within a day of deploy. Mobile bounce rate fell six points over the following two weeks — the page hadn’t changed, only how fast it appeared.

Vanity risk: A "good" lab LCP measured on a fast connection is not a result — it is a hypothesis. Only the 75th-percentile field value across real visits tells you whether users actually experienced a fast page. Optimizing the lab number while the field number stays poor is motion without progress.

Related metrics:
- ttfb: Time to First Byte — LCP is bounded below by TTFB; a slow server caps how fast the largest paint can ever be.
- fcp: First Contentful Paint — FCP is the first pixel, LCP is the meaningful one; a large FCP→LCP gap points at a heavy primary asset.
- bounce-rate: Bounce Rate — slow LCP is one of the most reliable predictors of visitors leaving before they engage.

Sources:
- Largest Contentful Paint (LCP) — Google web.dev — https://web.dev/articles/lcp
- Core Web Vitals (thresholds, field data, P75) — Google web.dev — https://web.dev/articles/vitals

---

## Interaction to Next Paint (INP)

- Slug: inp
- URL: https://productmetrics.org/metrics/inp
- Categories / tags: retention, leading

Definition: How long the page takes to visually respond after a user interacts — the latency between a tap, click, or keypress and the next frame the browser paints.

Formula: INP = The near-worst interaction latency observed during a visit, from input to next render, reported at the 75th percentile of real visits

What it measures: Responsiveness. INP observes every click, tap, and key press in a session and reports a value near the worst one — the single laggy interaction a visitor remembers. It replaced First Input Delay as a Core Web Vital in 2024 because it captures the whole session, not just the first interaction. High INP is the technical signature of a page that feels janky: the user did something and the interface didn’t answer quickly.
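
A sketch of the "near the worst one" selection for a single visit. It follows the rule described in Google's web.dev INP documentation, where one of the highest interactions is ignored for every 50 interactions on the page; treat it as an approximation of that rule rather than the browser's exact implementation. The function name and sample latencies are illustrative.

```python
def visit_inp(latencies_ms):
    """Approximate one visit's INP from its interaction latencies (ms).

    Skips one of the highest latencies per 50 interactions, so a visit with
    few interactions reports its single worst one.
    """
    if not latencies_ms:
        return None  # a visit with no interactions has no INP value
    ordered = sorted(latencies_ms, reverse=True)
    skip = len(ordered) // 50
    return ordered[skip]


print(visit_inp([40, 80, 120, 640]))      # 640
print(visit_inp([50] * 58 + [900, 300]))  # 300
```

The per-visit values produced this way are what then get aggregated at the 75th percentile across real visits.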

Benchmarks:
- Good: ≤ 200ms at P75
- Needs improvement: 200ms–500ms at P75
- Poor: > 500ms at P75
- Thresholds: Google web.dev Core Web Vitals (field data, 75th percentile)

What to watch:
- Above 200ms at P75: The main thread is busy when users try to interact. Long tasks from heavy JavaScript, large hydration payloads, and unoptimized event handlers are the common causes. Profile the interactions with the worst latency, not the average.
- Worse on content-heavy routes: Pages with large DOMs or many third-party widgets serialize interaction handling behind layout and script work. Per-route INP isolates which templates are the problem.
- Correlates with rage clicks: When a tap doesn’t respond, users tap again. A spike in repeated rapid clicks alongside high INP is the same problem seen from the behavioral side.

In practice: An interactive pricing configurator had INP at 540ms at P75. Every slider change triggered a synchronous recalculation across the whole DOM. Debouncing the recompute and moving the heavy math off the main thread brought INP to 170ms. Support tickets describing the tool as "frozen" or "broken" stopped within the week — the logic had always been correct; it just hadn’t felt responsive.

Vanity risk: INP is a near-worst-case metric by design — averaging it away defeats its purpose. A "good" mean interaction time with a poor P75 means a meaningful share of visitors hit a janky interaction; that is the experience that drives them away, not the average they never noticed.

Related metrics:
- lcp: Largest Contentful Paint — LCP is load responsiveness, INP is interaction responsiveness; a page can pass one and fail the other.
- engagement-score: Engagement Score — unresponsive pages depress session depth; INP is often the hidden cause behind a falling engagement composite.
- bounce-rate: Bounce Rate — an interface that doesn’t respond to the first interaction is a leading cause of immediate exits.

Sources:
- Interaction to Next Paint (INP) — Google web.dev — https://web.dev/articles/inp
- Core Web Vitals (thresholds, field data, P75) — Google web.dev — https://web.dev/articles/vitals

---

## Cumulative Layout Shift (CLS)

- Slug: cls
- URL: https://productmetrics.org/metrics/cls
- Categories / tags: retention, leading

Definition: How much the page visually jumps while it loads — a unitless score for unexpected movement of content the user can already see.

Formula: CLS = Sum of layout-shift scores (impact fraction × distance fraction) for unexpected shifts, taking the worst session window, reported at the 75th percentile of real visits

What it measures: Visual stability. CLS quantifies content moving after it has rendered — a button that slides down as an ad loads, text that reflows when a late font arrives, an image with no reserved space pushing everything below it. It is the one Core Web Vital with no time unit: it is a score where lower is better and 0 means nothing shifted. The damage is concrete — users click the wrong thing because the right thing moved.
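
A sketch of the formula for one visit. Each shift scores impact fraction × distance fraction, and the visit's CLS is the largest sum over any session window. The window rules used here (shifts less than 1 second apart, window capped at 5 seconds) follow the web.dev CLS description; the exact boundary handling and the input tuple shape are assumptions.

```python
MAX_GAP_SECONDS = 1.0     # a gap this long between shifts starts a new window
MAX_WINDOW_SECONDS = 5.0  # a window never spans more than this


def visit_cls(shifts):
    """shifts: list of (time_seconds, impact_fraction, distance_fraction),
    containing unexpected layout shifts only. Returns the worst window sum."""
    worst = 0.0
    window_start = prev_time = None
    window_sum = 0.0
    for t, impact, distance in sorted(shifts):
        starts_new_window = (
            prev_time is None
            or t - prev_time >= MAX_GAP_SECONDS
            or t - window_start >= MAX_WINDOW_SECONDS
        )
        if starts_new_window:
            window_start, window_sum = t, 0.0
        window_sum += impact * distance
        prev_time = t
        worst = max(worst, window_sum)
    return worst


# Hypothetical visit: two shifts in quick succession during load, one much later.
shifts = [(0.5, 0.5, 0.2), (0.9, 0.4, 0.1), (6.0, 0.3, 0.1)]
print(round(visit_cls(shifts), 2))  # 0.14
```

The late shift of 0.03 does not add to the reported value because it falls in its own window; the two load-time shifts together are what put this visit above the 0.1 threshold.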

Benchmarks:
- Good: ≤ 0.1 at P75
- Needs improvement: 0.1–0.25 at P75
- Poor: > 0.25 at P75
- Thresholds: Google web.dev Core Web Vitals (field data, 75th percentile; CLS is unitless)

What to watch:
- Above 0.1 at P75: Something is rendering without reserved space. The repeat offenders: images and iframes without width/height attributes, ad and embed slots that size themselves on load, and web fonts that reflow text when they swap in. Each is individually fixable once identified.
- Concentrated on one route: A single template with a late-loading element usually dominates the score. Per-route CLS points straight at the page to fix rather than a site-wide hunt.
- Mobile far worse than desktop: Narrow viewports amplify shift distance and stack more content vertically. A CLS that looks fine on a desktop test can be poor in the field where most visits are mobile.

In practice: A publisher’s article pages had CLS at 0.31 at P75. A leaderboard ad slot with no reserved height pushed the entire article down by roughly 90px the instant it filled. Reserving the slot’s dimensions in CSS so the space was held before the ad arrived brought CLS to 0.04. Accidental ad clicks — and the refund requests that came with them — dropped sharply, because the content stopped moving out from under readers’ taps.

Vanity risk: CLS is windowed to the worst burst of shifting in a session for a reason: a page that is stable on average but lurches once during load has still caused a misclick. The average hides exactly the event that did the damage.

Related metrics:
- lcp: Largest Contentful Paint — a late-loading large element often causes both a slow LCP and a layout shift when it finally arrives.
- inp: Interaction to Next Paint — distinct failure modes (movement vs. latency); a page can be stable but unresponsive, or fast but jumpy.
- bounce-rate: Bounce Rate — a page that physically moves under a visitor’s tap is a frustration signal that correlates with early exits.

Sources:
- Cumulative Layout Shift (CLS) — Google web.dev — https://web.dev/articles/cls
- Core Web Vitals (thresholds, field data, P75) — Google web.dev — https://web.dev/articles/vitals

---

## First Contentful Paint (FCP)

- Slug: fcp
- URL: https://productmetrics.org/metrics/fcp
- Categories / tags: retention, leading

Definition: How long until the browser renders the first piece of content — any text, image, or non-blank pixel — after a visitor requests the page.

Formula: FCP = Time from navigation start to the first render of any DOM content, reported at the 75th percentile of real visits

What it measures: The end of the blank screen. FCP is the first sign of life — the moment a visitor knows the page is working at all. It is a supporting diagnostic metric rather than one of the three official Core Web Vitals, but it is a leading indicator for LCP: a slow FCP guarantees a slow LCP, so it is where load-performance debugging usually starts. A fast FCP followed by a slow LCP isolates the problem to the primary content asset rather than the critical rendering path.

Benchmarks:
- Good: ≤ 1.8s at P75
- Needs improvement: 1.8s–3.0s at P75
- Poor: > 3.0s at P75
- Thresholds: Google web.dev (field data, 75th percentile; FCP is a diagnostic metric, not an official Core Web Vital)

What to watch:
- Above 1.8s at P75: The critical rendering path is blocked. Render-blocking stylesheets and synchronous scripts in the document head are the classic causes, followed by a slow TTFB — the browser cannot paint anything until the first bytes arrive.
- FCP good but LCP poor: The shell paints fast but the main content lags. The bottleneck is the largest element (usually a hero image or a client-rendered block), not the page framework. Focus optimization on LCP, not FCP.
- FCP and TTFB both poor: The problem is upstream of the browser entirely. Fix server response time first; FCP cannot be faster than the bytes it is waiting on.
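
The triage order in the list above can be sketched as a simple decision function. It uses the "good" thresholds from this corpus (TTFB 0.8s, FCP 1.8s, LCP 2.5s, all at P75); the function name and return strings are illustrative.

```python
def triage_load(ttfb_p75, fcp_p75, lcp_p75):
    """Suggest where to look first, given field P75 values in seconds."""
    if ttfb_p75 > 0.8:
        # Nothing can paint before the first bytes arrive.
        return "server: fix TTFB first"
    if fcp_p75 > 1.8:
        return "critical rendering path: check render-blocking CSS and scripts"
    if lcp_p75 > 2.5:
        return "largest element: optimize the primary content asset"
    return "ok"


# Hypothetical readings.
print(triage_load(1.9, 3.1, 4.6))  # server: fix TTFB first
print(triage_load(0.4, 1.2, 3.9))  # largest element: optimize the primary content asset
```

Checking TTFB before FCP, and FCP before LCP, mirrors the dependency between them: each metric is bounded below by the one before it.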

In practice: A single-page app showed FCP at 3.4s at P75 with users reporting a long white screen. The entire UI was deferred behind a JavaScript bundle that had to download, parse, and execute before anything rendered. Adding a server-rendered shell with a meaningful loading state cut FCP to 1.2s. Total time to interactive barely moved — but the perceived wait collapsed, because visitors saw something immediately instead of nothing.

Vanity risk: A fast FCP can mask a slow experience: painting a spinner or an empty layout quickly improves FCP without improving when the visitor can actually use the page. FCP is only meaningful read alongside LCP — first paint without meaningful paint is a vanity improvement.

Related metrics:
- lcp: Largest Contentful Paint — FCP is the first pixel, LCP the meaningful one; debug FCP first, then close the gap to LCP.
- ttfb: Time to First Byte — FCP is bounded below by TTFB; rule out the server before blaming the front end.
- bounce-rate: Bounce Rate — a long blank screen is one of the earliest points at which a visitor decides to leave.

Sources:
- First Contentful Paint (FCP) — Google web.dev — https://web.dev/articles/fcp

---

## Time to First Byte (TTFB)

- Slug: ttfb
- URL: https://productmetrics.org/metrics/ttfb
- Categories / tags: retention, leading

Definition: How long from the start of the request until the browser receives the first byte of the response — the server-and-network portion of load time, before any rendering begins.

Formula: TTFB = Time from navigation start to receipt of the first response byte (redirects + DNS + connection + request + server processing), reported at the 75th percentile of real visits

What it measures: Backend and delivery speed. TTFB is everything that happens before the browser can do anything: DNS lookup, connection setup, redirects, and the server actually generating the response. It is a diagnostic metric, not a Core Web Vital, but it is the floor under every paint metric — LCP and FCP can never be faster than TTFB. A poor TTFB means front-end optimization is rearranging deck chairs; the fix is upstream.

Benchmarks:
- Good: ≤ 0.8s at P75
- Needs improvement: 0.8s–1.8s at P75
- Poor: > 1.8s at P75
- Thresholds: Google web.dev (field data, 75th percentile; TTFB is a diagnostic metric, not an official Core Web Vital)

What to watch:
- Above 0.8s at P75: The server or delivery path is slow. Common causes: uncached dynamic rendering, slow database queries in the request path, no CDN or edge caching, or chains of redirects. Each adds latency before a single pixel can paint.
- Geographic spread in the field: TTFB that is fast near your origin and slow elsewhere is a delivery problem, not an application one. A CDN or edge rendering usually moves it more than backend tuning.
- Spikes correlated with traffic: TTFB that degrades under load points at a capacity or caching ceiling rather than a code path. Look at the response-time distribution during peak windows, not the daily average.

In practice: A content site had TTFB at 1.9s at P75 and had spent a sprint optimizing images with no effect on load metrics. Every page was rendered dynamically on each request with no caching. Adding edge caching with a short TTL for the mostly-static pages brought TTFB to 0.2s, and LCP improved by nearly the same 1.7s — because the paint metrics had been waiting on the server the entire time. The image work only paid off once the floor was removed.

Vanity risk: TTFB measured from a location next to your origin server is a number that flatters the infrastructure and misleads the team. Only the field P75 across real visitor geographies reflects the latency users actually pay before the page even begins to load.

Related metrics:
- fcp: First Contentful Paint — the first thing TTFB gates; if both are poor, fix TTFB first.
- lcp: Largest Contentful Paint — TTFB is the hard floor under LCP; front-end work cannot beat a slow server.
- conversion: Conversion Rate — server latency compounds across multi-step funnels, where every step pays the TTFB cost again.

Sources:
- Time to First Byte (TTFB) — Google web.dev — https://web.dev/articles/ttfb

---

# Frameworks (4)

## AARRR (Pirate Metrics)

- Slug: aarrr
- URL: https://productmetrics.org/frameworks/aarrr
- Origin: Dave McClure, 500 Startups (2007)

Summary: Maps the full user journey from first touch to advocacy. The most widely adopted framework for growth-stage startups because it forces you to measure every stage of the funnel, not just the top.

How it works: AARRR treats the user journey as a funnel with five sequential stages. You measure each stage independently, then look for the biggest drop-off—that’s where to focus. The framework works because it forces you to confront the full picture: a product with great acquisition but terrible activation is wasting money, and AARRR makes that visible. Most teams discover their biggest leverage point is NOT where they expected.

Components:
- Acquisition: How do users find you? [metrics: dau, spu, conversion, far, cac, growth-accounting]
- Activation: Do users have a great first experience? [metrics: activation, ttv, trial, pql]
- Retention: Do users come back? [metrics: nday, cohort, churn, crr, stickiness, ces, csat]
- Revenue: Can you monetize? [metrics: mrr, arpu, ltv, nrr, ltvcac, payback, quickratio]
- Referral: Do users tell others? [metrics: nps, pmf]

Best for: Growth-stage startups, product teams building their first metrics stack, and anyone who needs a comprehensive view of the user journey from acquisition to advocacy.

Limitations: Linear funnel assumption breaks down for products with non-linear journeys (marketplaces, platforms). Doesn’t explicitly address engagement depth or user happiness beyond NPS.

Implementation:
- Define what each stage means for your product. "Acquisition" for a marketplace means something different than for a SaaS tool.
- Pick 1–2 metrics per stage. Resist the urge to track everything. A single metric per stage is better than five.
- Identify your leakiest stage. Run a cohort analysis to see where the biggest drop-off occurs.
- Focus improvement efforts on that stage for 4–6 weeks before moving on.
- Review monthly. The bottleneck shifts as you improve each stage.
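
The "identify your leakiest stage" step can be sketched as follows. This is a minimal illustration, assuming you already have one cohort's user counts at each stage; the counts and the helper names `stage_conversions` and `leakiest_step` are hypothetical.

```python
# Hypothetical counts of one signup cohort reaching each AARRR stage.
stage_counts = [
    ("Acquisition", 10_000),
    ("Activation", 3_200),
    ("Retention", 1_400),
    ("Revenue", 350),
    ("Referral", 90),
]

def stage_conversions(counts):
    """Return (from_stage, to_stage, conversion_rate) for each adjacent pair."""
    return [
        (prev_name, name, n / prev_n)
        for (prev_name, prev_n), (name, n) in zip(counts, counts[1:])
    ]

def leakiest_step(counts):
    """The step with the lowest stage-to-stage conversion rate."""
    return min(stage_conversions(counts), key=lambda step: step[2])

src, dst, rate = leakiest_step(stage_counts)
print(f"Biggest drop-off: {src} -> {dst} ({rate:.0%} continue)")
```

Here "biggest drop-off" is taken to mean the lowest relative conversion between adjacent stages. Ranking by absolute users lost would point at Acquisition -> Activation instead, so it is worth deciding which definition fits your product before acting on the result.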

Examples:
- Dropbox: Optimizing the Referral stage was Dropbox’s biggest growth lever: the "invite a friend, get free space" program reportedly drove ~3,900% growth in 15 months at minimal marketing spend. (source: Dropbox referral program breakdown — GrowSurf — https://growsurf.com/blog/dropbox-referral-program/)
- Airbnb: Airbnb found that professional photography of listings dramatically improved Activation — guests were far likelier to book a listing with quality photos, reportedly doubling bookings in early test markets.
- HubSpot: HubSpot is often cited for using free tools (Acquisition) to feed paid conversion (Revenue), an example of working the full funnel rather than only the top.

When to avoid: Two-sided marketplaces (the funnel is different for each side), platform businesses where growth is non-linear (network effects don’t fit neatly into stages), and very early-stage products where you haven’t found product-market fit yet—at that stage, only retention matters.

Pairs with:
- okrs: Use OKRs to set quarterly targets for your weakest AARRR stage.
- north-star: Pick a North Star from the stage that matters most right now, with AARRR providing the supporting context.

Sources:
- Startup Metrics for Pirates (the original AARRR deck) — Dave McClure (2007) — https://www.slideshare.net/slideshow/startup-metrics-for-pirates-long-version/89026

---

## HEART

- Slug: heart
- URL: https://productmetrics.org/frameworks/heart
- Origin: Kerry Rodden, Google (2010)

Summary: Measures user experience quality across five dimensions. Designed for UX teams who need to quantify subjective experience, not just business outcomes.

How it works: HEART uses a Goals–Signals–Metrics (GSM) process for each dimension. First, define your goal for that dimension. Then identify user signals that indicate progress. Finally, pick a metric that quantifies the signal. This three-step process prevents teams from tracking metrics that don’t connect to real user outcomes. The five dimensions are intentionally broad—you’re not expected to track all five at once. Pick the 2–3 most relevant to your current priorities.

Components:
- Happiness: User attitudes and satisfaction [metrics: nps, csat]
- Engagement: Depth and frequency of interaction [metrics: dau, spu, stickiness]
- Adoption: New users and feature uptake [metrics: activation, far]
- Retention: Users who keep coming back [metrics: nday, cohort, churn]
- Task Success: Efficiency and completion of user goals [metrics: ces, ttv]

Best for: UX-driven teams, design organizations, and products where user experience quality is the primary competitive advantage. Works well alongside business metrics frameworks.

Limitations: Less prescriptive about business outcomes (revenue, unit economics). Requires careful signal/metric definition per dimension—the framework provides categories, not specific metrics.

Implementation:
- Choose 2–3 HEART dimensions most relevant to your product goals right now. Don’t try to cover all five.
- For each dimension, run the GSM exercise: Goal → Signal → Metric. Write it down.
- Define data collection. Some dimensions (Happiness) require surveys; others (Engagement) use product analytics.
- Establish baselines before making changes. You need a "before" to measure improvement.
- Review quarterly. Swap dimensions as product priorities shift.
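
One way to "write it down" is a small record per dimension. The sketch below is an assumption about how a team might structure the GSM exercise, not part of the HEART paper; `GSMEntry`, `validate_plan`, and the example goals and signals are hypothetical.

```python
from dataclasses import dataclass

HEART_DIMENSIONS = ("Happiness", "Engagement", "Adoption", "Retention", "Task Success")

@dataclass(frozen=True)
class GSMEntry:
    """One Goal -> Signal -> Metric row for a single HEART dimension."""
    dimension: str
    goal: str
    signal: str
    metric: str
    source: str  # "survey" or "analytics"

    def __post_init__(self):
        if self.dimension not in HEART_DIMENSIONS:
            raise ValueError(f"Unknown HEART dimension: {self.dimension!r}")
        if self.source not in ("survey", "analytics"):
            raise ValueError(f"Unknown data source: {self.source!r}")

def validate_plan(entries, min_dims=2, max_dims=3):
    """Enforce the 2-3 dimension guidance, with one entry per dimension."""
    dims = [e.dimension for e in entries]
    if len(set(dims)) != len(dims):
        raise ValueError("Each dimension should have exactly one GSM entry")
    if not min_dims <= len(dims) <= max_dims:
        raise ValueError(f"Pick {min_dims}-{max_dims} dimensions, got {len(dims)}")
    return list(entries)

# Hypothetical plan for a page-editor product.
plan = validate_plan([
    GSMEntry("Happiness", "Users find the editor pleasant to use",
             "Post-task satisfaction rating", "csat", "survey"),
    GSMEntry("Task Success", "New users reach a first result quickly",
             "Time from signup to first published page", "ttv", "analytics"),
])

survey_dimensions = [e.dimension for e in plan if e.source == "survey"]
print(survey_dimensions)
```

Recording the data source next to each metric makes the collection step explicit: any dimension tagged "survey" needs a survey instrument in place before a baseline can be taken.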

Examples:
- Google: Google’s research team created HEART (2010) to measure UX quality at scale across web products like Gmail, YouTube, and Search. Its Goals–Signals–Metrics method — define the goal, find the user signal, then pick the metric — became an industry standard. (source: Measuring the User Experience on a Large Scale — Rodden, Hutchinson & Fu, Google Research (CHI 2010) — https://research.google/pubs/measuring-the-user-experience-on-a-large-scale-user-centered-metrics-for-web-applications/)

When to avoid: Early-stage startups where business model validation matters more than UX polish. Also poor for products where the primary challenge is distribution, not experience—if users never find you, measuring their happiness is premature.

Pairs with:
- aarrr: AARRR covers the business funnel; HEART covers the experience quality. Together they answer both "is it working?" and "is it good?"
- okrs: Each HEART dimension maps naturally to an Objective, with the GSM-derived metric as a Key Result.

Sources:
- Measuring the User Experience on a Large Scale (the original HEART paper) — Google Research, CHI 2010 — https://research.google/pubs/measuring-the-user-experience-on-a-large-scale-user-centered-metrics-for-web-applications/

---

## North Star Metric

- Slug: north-star
- URL: https://productmetrics.org/frameworks/north-star
- Origin: Growth-hacking movement (Sean Ellis coined "growth hacking," 2010); codified by Amplitude

Summary: One metric that captures core customer value. Forces alignment across the entire organization on what matters most, with 3–5 input metrics that directly influence it.

How it works: You identify the single metric that best represents the value your product delivers to customers—not revenue, not growth, but actual value. Then you decompose it into 3–5 input metrics that your teams can directly influence. Every team’s work should trace back to moving one of these inputs. The power is in simplicity: when everyone knows the one number that matters, alignment happens naturally. The "game type" (attention, transaction, productivity) helps you find your North Star by clarifying what kind of value you create.

Components:
- Attention Game: Time spent in product (Netflix: median view hours, Spotify: time listening) [metrics: dau, spu, stickiness]
- Transaction Game: Number of transactions (Airbnb: nights booked, Amazon: purchases) [metrics: conversion, mrr, arpu]
- Productivity Game: Efficiency of work (Slack: messages sent, Asana: tasks completed) [metrics: activation, far, ces]

Best for: Growth-stage and mature companies that need cross-functional alignment. Especially powerful when teams are pulling in different directions or optimizing local metrics at the expense of the whole.

Limitations: Choosing the wrong North Star can misalign the entire company. Many growth-stage companies default to revenue, but companies like Netflix and Airbnb deliberately avoid it as their primary North Star, on the argument that treating revenue as the North Star Metric leads to suboptimal product decisions.

Implementation:
- Determine your game type: attention (time-based value), transaction (exchange-based value), or productivity (efficiency-based value).
- Identify 3–5 candidate North Star Metrics that capture customer value, not business value.
- Test each candidate: Does it correlate with long-term retention? Can every team influence it? Does it measure value delivered, not value extracted?
- Decompose the winner into 3–5 input metrics. Each input should be owned by a specific team.
- Avoid revenue as your North Star. It measures value captured, not value delivered—optimizing for it leads to extraction, not creation.
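
The first candidate test (does it correlate with long-term retention?) can be roughed out with a plain Pearson correlation across cohorts. This is a sketch under stated assumptions: the cohort values and candidate metric names are invented, five cohorts is far too few for a real decision, and correlation alone does not show that moving the metric causes retention.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-cohort values: 90-day retention and two candidate metrics.
retention_90d = [0.20, 0.25, 0.31, 0.38, 0.44]
candidates = {
    "weekly_active_projects": [1.1, 1.4, 1.9, 2.3, 2.8],
    "page_views_per_user": [40, 22, 35, 28, 31],
}

scores = {name: pearson(values, retention_90d) for name, values in candidates.items()}
best = max(scores, key=scores.get)

for name, r in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: r = {r:+.2f}")
print("Strongest retention correlate:", best)
```

A candidate that passes this screen still has to clear the other two questions, which are judgment calls rather than calculations: whether every team can influence it, and whether it measures value delivered rather than value extracted.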

Examples:
- Spotify: North Star is "time spent listening." Not songs played, not playlists created—time listening. This keeps the entire organization focused on delivering music experiences people want to stay in. (source: The North Star Playbook (Spotify & Airbnb examples) — Amplitude — https://amplitude.com/north-star)
- Airbnb: Uses "nights booked" rather than revenue as its primary North Star. This keeps focus on connecting travelers with great stays, not extracting higher fees. Revenue follows when the core experience is right. (source: The North Star Playbook (Netflix/Airbnb avoid revenue) — Amplitude — https://amplitude.com/north-star)
- Slack: Widely cited as using "messages sent within teams" as an early North Star — a metric that pushed product decisions toward reducing friction in team communication rather than piling on features.

When to avoid: Pre-product-market fit (you don’t yet know what value you deliver), companies with multiple distinct products (one North Star can’t cover unrelated value propositions), and situations where the metric becomes a target that gets gamed—Goodhart’s Law applies.

Pairs with:
- aarrr: AARRR helps identify which funnel stage your North Star lives in and what supports it.
- okrs: The North Star becomes the company-level Objective; input metrics become team-level Key Results.

Sources:
- The North Star Playbook — Amplitude — https://amplitude.com/north-star

---

## OKRs (Objectives & Key Results)

- Slug: okrs
- URL: https://productmetrics.org/frameworks/okrs
- Origin: Andy Grove, Intel (1970s); popularized by John Doerr at Google

Summary: Goal-setting framework that connects aspirational objectives to measurable results. Not a metrics framework per se, but the most common system for operationalizing metrics into team goals.

How it works: OKRs separate aspiration from measurement. Objectives are qualitative and ambitious—they describe the outcome you want. Key Results are quantitative and specific—they tell you whether you got there. The magic is in the constraint: 3–5 Objectives per cycle, each with 2–4 Key Results. This forces prioritization. OKRs cascade from company to team to individual, creating a visible thread from strategy to daily work. They’re not a metrics framework—they’re the system that turns metrics into goals.

Components:
- Objectives: Qualitative, inspirational goals that set direction
- Key Results (Acquisition): Measurable outcomes for user growth [metrics: cac, conversion, growth-accounting]
- Key Results (Engagement): Measurable outcomes for product usage [metrics: activation, nday, stickiness]
- Key Results (Revenue): Measurable outcomes for monetization [metrics: nrr, ltvcac, quickratio]

Best for: Organizations of any size that need to translate strategy into measurable execution. Works as a wrapper around any metrics framework—pair with AARRR or HEART to populate Key Results.

Limitations: OKRs are a goal system, not a measurement system. Without a metrics framework underneath, teams often pick arbitrary Key Results. Quarterly cadence can also create perverse incentives to hit short-term targets.

Implementation:
- Start with 3–5 company-level Objectives for the quarter. These should be ambitious but achievable.
- For each Objective, define 2–4 Key Results that are measurable and time-bound. Use existing metrics from AARRR, HEART, or your North Star inputs.
- Have teams create their own OKRs that align upward. Teams should propose, not be assigned.
- Score each Key Result from 0.0 to 1.0 at the end of the quarter. Target an average of 0.6–0.7. Consistently hitting 1.0 means you’re not being ambitious enough.
- Separate OKRs from compensation. The moment OKRs affect pay, people sandbag their targets.
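
The end-of-quarter scoring step can be sketched as below. Linear progress from a starting value to a target is an assumption made for illustration (teams also grade by judgment or by milestones); the function names and the example Key Results are hypothetical.

```python
def score_key_result(start, target, actual):
    """Linear progress from start to target, clamped to the 0.0-1.0 range."""
    if target == start:
        raise ValueError("target must differ from start")
    progress = (actual - start) / (target - start)
    return max(0.0, min(1.0, progress))

def score_objective(key_results):
    """Average of the Key Result scores for one Objective."""
    scores = [score_key_result(*kr) for kr in key_results]
    return sum(scores) / len(scores)

# Hypothetical Key Results as (start, target, actual).
objective = [
    (20.0, 40.0, 34.0),    # activation rate, percent: scores 0.7
    (8.0, 4.0, 6.0),       # time to value, days (lower is better): scores 0.5
    (1000, 2000, 1800),    # weekly active teams: scores 0.8
]

objective_score = score_objective(objective)
in_sweet_spot = 0.6 <= objective_score <= 0.7
print(f"Objective score: {objective_score:.2f} (in 0.6-0.7 range: {in_sweet_spot})")
```

Because progress is measured relative to the start-to-target distance, the same function handles metrics where lower is better, such as the time-to-value example.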

Examples:
- Google: Has used OKRs since its early days. Larry Page credits the system with helping Google reach "10× growth" by setting goals that seemed unreasonable and treating ~60–70% attainment (a 0.6–0.7 score) as success, not failure. (source: How to grade OKRs (the 0.6–0.7 sweet spot; Page on "10x growth") — What Matters / John Doerr — https://www.whatmatters.com/faqs/how-to-grade-okrs)
- Intel: Andy Grove invented OKRs at Intel in the 1970s to keep the fast-moving chip business aligned. The system helped Intel navigate the transition from memory chips to microprocessors by making strategic shifts visible at every level.
- Spotify: A cautionary example: Spotify used OKRs early on but evolved to its own "DIBB / Bets" model (the Spotify Rhythm) around 2016, stepping back from company-wide OKRs after deciding the tracking overhead outweighed the benefit. The system is a means, not an end. (source: Spotify Rhythm — the move beyond company-wide OKRs — Henrik Kniberg / Crisp — https://blog.crisp.se/2016/06/08/henrikkniberg/spotify-rhythm)

When to avoid: Very early-stage startups where goals change weekly (quarterly cadence is too slow). Also problematic when leadership treats OKRs as top-down mandates rather than collaborative goal-setting—that kills the ownership that makes OKRs work.

Pairs with:
- aarrr: AARRR provides the metrics vocabulary; OKRs turn those metrics into quarterly targets.
- north-star: The North Star becomes the company Objective; its input metrics become Key Results distributed across teams.

Sources:
- Measure What Matters / What Matters (OKRs from Intel & Grove to Google) — John Doerr — https://www.whatmatters.com/faqs/how-to-grade-okrs

---