How to Read Mental Health Test Results: What Common Scores Mean

Learn how to read mental health test results, including common score ranges for depression, anxiety, PTSD, bipolar, and substance screens, what cutoff scores mean, and when results need follow-up.

Mental health test results can be useful, but they are easy to overread. A number on a depression, anxiety, trauma, alcohol, or mood questionnaire is not the same thing as a diagnosis. It is a structured signal: it shows how strongly certain symptoms were reported during a specific time window, using a specific scoring system.

The most helpful way to read a score is to ask three questions: what did the test measure, what range does the score fall into, and what should happen next? A high score usually means symptoms deserve closer attention. A low score can be reassuring, but it does not rule out every concern. The real meaning depends on the test, the person’s age and situation, symptom duration, impairment, safety risks, and whether the result fits the broader clinical picture.

What Mental Health Test Scores Can Tell You

A mental health score tells you how much symptom burden was captured by that particular tool, not the full explanation for why those symptoms are happening. Most common tests are screeners or rating scales, which means they help identify patterns that may need a fuller evaluation.

A depression score, for example, may reflect major depression, grief, burnout, chronic pain, sleep deprivation, medication effects, thyroid disease, substance use, or several of these at once. An anxiety score may reflect generalized anxiety, panic, trauma symptoms, obsessive-compulsive symptoms, medical illness, stimulant use, or a stressful life situation. The score is useful because it organizes the symptoms; it is limited because it cannot interpret the whole person.

This is the key difference between screening and diagnosis. Screening asks, “Is there enough here to look more closely?” Diagnosis asks, “Do the symptoms meet a specific clinical condition, and have other explanations been considered?” A diagnosis usually requires a clinical interview, history, functional assessment, risk assessment, and sometimes medical testing or collateral information from family, teachers, or other clinicians.

Mental health scores are most helpful for four practical uses:

  • Starting a conversation. A score can make vague distress easier to describe.
  • Estimating severity. Many tools group results into minimal, mild, moderate, or severe ranges.
  • Tracking change. Repeating the same tool over time can show whether symptoms are improving, worsening, or staying the same.
  • Flagging safety concerns. Some tests include items about self-harm, suicidal thoughts, mania, substance use, or eating disorder risk that need prompt follow-up even if the total score is not extremely high.

The setting also matters. A score from a primary care visit, school screening, therapy intake, emergency evaluation, workplace wellness program, or online self-test may carry different next steps. A clinic using mental health screening should have a plan for follow-up, because a test result is only useful when someone knows what to do with it.

A helpful rule is to read the score as a signal with a direction, not a label. “Moderate depression symptoms” is more accurate than “I have depression” if the result came from a screener alone. “Elevated anxiety symptoms” is more accurate than “I have generalized anxiety disorder” unless a qualified clinician has confirmed the diagnosis.

How Common Mental Health Scoring Systems Work

Most mental health test scores are built from repeated answers to symptom questions. The scoring method may look simple, but different tools use different logic, so the same number can mean different things on different tests.

Raw total scores

Many common screeners use a raw total score. Each item has a point value, and the points are added together. On the PHQ-9, each of 9 depression items is scored from 0 to 3, giving a total range of 0 to 27. On the GAD-7, 7 anxiety items are scored from 0 to 3, giving a total range of 0 to 21.

A higher raw score usually means more frequent, intense, or impairing symptoms in the area being measured. But raw scores only make sense within the scale’s own range. A 15 on the PHQ-9 and a 15 on the GAD-7 are both clinically meaningful, but they are not interchangeable. One measures depression symptoms on a 0 to 27 scale; the other measures anxiety symptoms on a 0 to 21 scale, so the same number sits at a different point in each range.
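The raw-total arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name `raw_total` and the sample answers are hypothetical, and the sketch assumes every item uses the same 0 to 3 scale described above.

```python
def raw_total(item_scores, n_items, max_per_item=3):
    """Add up item scores after checking the item count and per-item range."""
    if len(item_scores) != n_items:
        raise ValueError(f"expected {n_items} items, got {len(item_scores)}")
    for score in item_scores:
        if not 0 <= score <= max_per_item:
            raise ValueError(f"item score {score} is outside 0-{max_per_item}")
    return sum(item_scores)


# Hypothetical answers to a 9-item questionnaire scored 0-3 per item.
sample_answers = [2, 1, 3, 2, 1, 0, 2, 1, 0]
print(raw_total(sample_answers, n_items=9))  # 12, out of a possible 27
```

Checking the item count first matters because a skipped question silently lowers the total, which is one way a score can look better than the symptoms really are.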

Cutoff scores

A cutoff is the score at which a result is considered “positive,” “elevated,” or likely to warrant further evaluation. Cutoffs are chosen to balance two goals: catching as many true cases as possible and avoiding too many false alarms.

Some cutoffs are widely used. A PHQ-9 score of 10 or higher is commonly treated as a threshold for clinically significant depression symptoms. A GAD-7 score of 10 or higher is commonly used as a threshold for clinically significant anxiety symptoms. But cutoffs are not perfect. They can vary by population, age group, setting, language, medical condition, and the purpose of screening.
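The trade-off behind a cutoff can be shown with a small sketch. The scores and true-status labels below are invented purely for illustration and do not come from any study; `sens_spec` is a hypothetical helper name.

```python
# Hypothetical (score, truly_has_condition) pairs, made up for this example.
SAMPLE = [
    (3, False), (5, False), (6, False), (8, False), (11, False),
    (9, True), (10, True), (12, True), (14, True), (17, True),
]


def sens_spec(pairs, cutoff):
    """Return (sensitivity, specificity) when 'score >= cutoff' counts as positive."""
    tp = sum(1 for score, case in pairs if case and score >= cutoff)
    fn = sum(1 for score, case in pairs if case and score < cutoff)
    tn = sum(1 for score, case in pairs if not case and score < cutoff)
    fp = sum(1 for score, case in pairs if not case and score >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


for cutoff in (8, 10, 12):
    sensitivity, specificity = sens_spec(SAMPLE, cutoff)
    print(cutoff, sensitivity, specificity)
```

In this made-up sample, lowering the cutoff catches every true case but raises more false alarms, while raising it does the reverse. That is the balance a published cutoff tries to strike for a particular population and purpose.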

Severity bands

Severity bands divide scores into ranges such as minimal, mild, moderate, moderately severe, or severe. These bands help describe symptom burden, but they should not be treated as rigid categories. A person scoring near the top of “mild” may function very differently from someone near the bottom of “moderate,” and a single point can move someone across a boundary without changing the real clinical picture much.

Severity bands are best read with impairment. Someone with a moderate score who is missing work, withdrawing socially, or unable to sleep may need more support than someone with a similar score who is still functioning well and improving.
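As a rough sketch, the commonly cited PHQ-9 bands (0–4 minimal, 5–9 mild, 10–14 moderate, 15–19 moderately severe, 20–27 severe) and GAD-7 bands (0–4 minimal, 5–9 mild, 10–14 moderate, 15–21 severe) can be written as a simple lookup. The names `PHQ9_BANDS`, `GAD7_BANDS`, and `severity_band` are hypothetical, and the output is a description of symptom burden, not a diagnosis.

```python
# Each entry is (highest score in the band, label).
PHQ9_BANDS = [(4, "minimal"), (9, "mild"), (14, "moderate"),
              (19, "moderately severe"), (27, "severe")]
GAD7_BANDS = [(4, "minimal"), (9, "mild"), (14, "moderate"), (21, "severe")]


def severity_band(total, bands):
    """Map a total score to its band label for the given scale."""
    if total < 0 or total > bands[-1][0]:
        raise ValueError(f"total {total} is outside the scale's range")
    for upper, label in bands:
        if total <= upper:
            return label


print(severity_band(15, PHQ9_BANDS))  # moderately severe
print(severity_band(15, GAD7_BANDS))  # severe
print(severity_band(9, PHQ9_BANDS))   # mild
print(severity_band(10, PHQ9_BANDS))  # moderate
```

The last two lines show the boundary effect: one point separates "mild" from "moderate" even though the underlying picture may barely differ.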

Yes-or-no symptom counts

Some tools count endorsed symptoms rather than rating frequency. PTSD and bipolar screening tools may ask whether certain symptoms happened and then apply a rule about number, timing, and impact. A bipolar screen, for example, is not just about having energetic or irritable days. It also asks whether symptoms clustered together and caused noticeable problems.
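A count-plus-rule screen can be sketched as follows. The function name and parameters are hypothetical, and the thresholds in the second example are placeholders rather than the official scoring rule for any bipolar screen; a real tool's own instructions should be used instead.

```python
def pattern_screen_positive(yes_answers, min_yes,
                            co_occurred=True, caused_problems=True):
    """Positive only if enough items are endorsed AND the extra conditions hold."""
    enough_symptoms = sum(1 for answer in yes_answers if answer) >= min_yes
    return bool(enough_symptoms and co_occurred and caused_problems)


# A simple count-only screen: 2 or more "yes" answers out of 5.
print(pattern_screen_positive([True, False, True, False, False], min_yes=2))  # True

# A pattern-based screen (illustrative threshold): many symptoms endorsed,
# but they did not occur together, so the screen is not positive.
many_yes = [True] * 8 + [False] * 5
print(pattern_screen_positive(many_yes, min_yes=7, co_occurred=False))  # False
```

The second example is the point of the design: a long list of "yes" answers is not enough on its own when the tool also requires that symptoms clustered together and caused problems.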

T-scores and percentiles

Some newer or broader measures use standardized scores. PROMIS measures, for example, often use T-scores, where 50 represents the average in a reference population and 10 points represents one standard deviation. A T-score of 60 on an anxiety scale therefore sits one standard deviation above the reference average, which usually means more symptoms than most of that population reports. Direction matters, though: on a physical function scale, where higher means better functioning, a T-score of 40 sits one standard deviation below average and may mean lower-than-average functioning.
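A T-score can be converted to standard deviations and to an approximate percentile with a short sketch. The percentile step assumes scores in the reference population follow a normal distribution, which is an assumption made here for illustration and may not hold exactly for a given measure. The function names are hypothetical.

```python
import math


def t_to_z(t_score, mean=50.0, sd=10.0):
    """Number of standard deviations a T-score sits from the reference mean."""
    return (t_score - mean) / sd


def t_to_percentile(t_score):
    """Approximate percentile, ASSUMING a normal reference distribution."""
    z = t_to_z(t_score)
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


print(t_to_z(60))                     # 1.0
print(round(t_to_percentile(60), 1))  # 84.1
print(round(t_to_percentile(40), 1))  # 15.9
```

Under that assumption, a T-score of 60 is higher than roughly 84 percent of the reference group. Whether "higher" is good or bad still depends on what the scale measures.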

Common Score Ranges and Cutoffs

Common mental health scores are easiest to read when you know the tool’s purpose, total range, and usual threshold for follow-up. The table below gives practical starting points, not diagnostic rules.

| Tool | What it measures | Score range | Common interpretation | Important caution |
| --- | --- | --- | --- | --- |
| PHQ-9 | Depression symptoms over the past 2 weeks | 0 to 27 | 0–4 minimal, 5–9 mild, 10–14 moderate, 15–19 moderately severe, 20–27 severe | Any self-harm item above 0 needs prompt clinical attention, regardless of total score. |
| PHQ-2 | Brief depression screen | 0 to 6 | 3 or higher often triggers PHQ-9 or further assessment | It is a first-pass screen, not a severity measure for treatment planning. |
| GAD-7 | Anxiety symptoms over the past 2 weeks | 0 to 21 | 0–4 minimal, 5–9 mild, 10–14 moderate, 15–21 severe | It is strongest for generalized anxiety but can also flag broader anxiety distress. |
| GAD-2 | Brief anxiety screen | 0 to 6 | 3 or higher often triggers GAD-7 or further assessment | A low score does not rule out panic disorder, PTSD, OCD, or specific phobias. |
| PCL-5 | PTSD symptoms related to trauma | 0 to 80 | A cutoff around 31–33 is often used for probable PTSD | PTSD assessment also requires trauma history, symptom clusters, duration, and impairment. |
| PC-PTSD-5 | Brief PTSD screen | 0 to 5 | Several settings use 4 or more as a positive screen | Cutoffs can vary depending on whether the goal is to avoid missed cases or reduce false positives. |
| MDQ | Possible bipolar spectrum symptoms | Pattern-based | Often considered positive when multiple manic symptoms occur together and cause problems | A positive result does not prove bipolar disorder; antidepressant history, sleep, substance use, and psychosis history matter. |
| AUDIT | Alcohol use risk | 0 to 40 | 8 or higher often suggests hazardous or harmful alcohol use | Risk depends on drinking pattern, health status, pregnancy, medications, and safety concerns. |
| AUDIT-C | Brief alcohol use screen | 0 to 12 | Common positive thresholds are 4 or more for men and 3 or more for women | Lower thresholds may apply in older adults, pregnancy, liver disease, or medication interactions. |
| SCOFF | Eating disorder risk | 0 to 5 | 2 or more “yes” answers often suggests need for further evaluation | Medical instability can occur even when a short screen seems only mildly positive. |
| EPDS | Depression symptoms during pregnancy or postpartum | 0 to 30 | 10 or higher often suggests possible depression; higher cutoffs may be used for major depression | Any self-harm response needs prompt follow-up, and postpartum anxiety or OCD may require additional screening. |

The PHQ-9 and GAD-7 are among the most familiar examples because they are short, widely used, and easy to repeat over time. More detailed discussions of PHQ-9 depression scores and GAD-7 anxiety scores can help when those specific tests are the ones in front of you.

For all tools, the total score is only one part of interpretation. Individual items may matter as much as the total. Severe sleep loss, panic attacks, suicidal thoughts, trauma re-experiencing, alcohol blackouts, binge-purge behaviors, or periods of decreased need for sleep with risky behavior can change the urgency of follow-up even when the total score is not the highest possible.
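The idea that a single item can matter independently of the total can be sketched like this. The sketch assumes the self-harm question is the ninth item, as on the standard PHQ-9 form; the function name, the sample answers, and the returned field names are hypothetical.

```python
def review_items(item_scores, safety_item_index, total_cutoff):
    """Check the total against a cutoff AND check one safety item on its own."""
    total = sum(item_scores)
    total_elevated = total >= total_cutoff
    safety_flag = item_scores[safety_item_index] > 0
    return {
        "total": total,
        "total_elevated": total_elevated,
        "safety_flag": safety_flag,
        # Follow-up is warranted if EITHER condition is true.
        "needs_follow_up": total_elevated or safety_flag,
    }


# Hypothetical answers: low total, but a nonzero answer on the ninth item.
answers = [1, 0, 1, 0, 1, 0, 0, 0, 1]
print(review_items(answers, safety_item_index=8, total_cutoff=10))
```

Here the total is only 4, well under the usual cutoff of 10, yet the result is still flagged because the safety item was checked separately rather than being averaged into the total.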

Why Results Can Look Worse or Better Than You Feel

A mental health result can feel surprising because tests measure a narrow slice of experience in a structured way. The score may look worse than expected, better than expected, or simply mismatched with how the person describes their life.

A score may look worse than expected when symptoms have become normal to the person living with them. Chronic anxiety is a common example. Someone may say, “I’m just a worrier,” but then endorse near-daily restlessness, irritability, muscle tension, sleep trouble, and difficulty relaxing. The score captures the burden that the person has learned to tolerate.

A score may also look worse during a temporary but intense period: a breakup, bereavement, job loss, exam period, medical scare, caregiving crisis, or poor sleep week. The score is still real, but it may represent acute strain rather than a long-standing disorder. Duration matters. Many diagnostic conditions require symptoms to persist for a certain length of time and cause impairment.

A score may look better than expected when the tool does not measure the main problem. The GAD-7 may be low in someone with panic attacks that occur in sudden bursts rather than constant worry. The PHQ-9 may miss emotional numbness or irritability that a person experiences as depression but does not endorse as sadness. A short screen may miss trauma symptoms, obsessive thoughts, dissociation, eating disorder behaviors, substance use, or hypomanic episodes.

Other factors can distort results:

  • Reading the questions differently than intended. Some people count a symptom only if it is extreme; others count mild symptoms.
  • Medical symptoms overlapping with mental health items. Fatigue, sleep changes, appetite changes, and concentration problems can come from medical illness, medications, pain, pregnancy, menopause, anemia, thyroid disease, sleep apnea, or substance use.
  • Cultural and language differences. Some people express distress more physically, while others underreport emotional symptoms because of stigma or privacy concerns.
  • Age and developmental stage. Children and teens may show irritability, school refusal, stomachaches, behavior changes, or withdrawal rather than adult-style descriptions of mood.
  • Response bias. People may minimize symptoms out of fear, shame, or practical consequences, or overendorse symptoms when they are distressed and desperate to be understood.

This is why false positives and false negatives are possible with any screening tool. A false positive means the score is elevated even though the person does not ultimately meet criteria for that condition. A false negative means the score is low even though a clinically important problem is present.

The solution is not to ignore the test. It is to place the score in context. A clinician should compare the result with the person’s history, current stressors, functioning, risk level, medical background, medication and substance use, and the pattern of symptoms over time.

What a Positive or Borderline Result Means

A positive or borderline mental health result usually means further assessment is appropriate, not that treatment decisions should be made from the number alone. The next step depends on severity, safety, impairment, and whether the result fits the person’s lived experience.

A clearly positive result, such as a PHQ-9 of 18 or a GAD-7 of 16, deserves a timely conversation with a qualified clinician. The clinician may ask about symptom onset, duration, triggers, sleep, appetite, concentration, physical symptoms, substance use, trauma, mood elevation, family history, medical conditions, medications, and daily functioning. If symptoms are impairing work, school, relationships, parenting, or self-care, the result carries more weight.

A borderline result can still matter. A PHQ-9 of 9, for example, is technically in the mild range, but it may be important if the person has never felt this way before, has a history of severe depression, is postpartum, is grieving, has new substance use, or is slipping at work or school. A GAD-7 of 9 may be mild on paper but meaningful if anxiety is causing avoidance, panic-like episodes, poor sleep, or repeated reassurance seeking.

A low result can be useful too. If symptoms were high a month ago and are now low, that can show improvement. But a low score should not be used to dismiss concerns when there are red flags, such as suicidal thoughts, psychosis, mania, major functional decline, eating disorder behaviors, dangerous substance use, or cognitive changes.

When results are being used to track treatment, trends matter more than one number. A drop from a PHQ-9 of 21 to 13 may still leave someone in the moderate range, but it suggests meaningful improvement. A GAD-7 that stays at 14 across several visits may suggest the current plan is not doing enough. A score that improves while functioning worsens deserves a closer look, because the person may be avoiding life demands rather than recovering.
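The arithmetic behind a trend is simple enough to sketch. The helper name `score_change` is hypothetical, and the sketch deliberately does not say how large a change counts as clinically meaningful, since that depends on the tool and the clinician's judgment.

```python
def score_change(first, latest):
    """Return (point change, percent change) between two scores on the same tool."""
    points = latest - first
    percent = 100.0 * points / first if first else None
    return points, percent


points, percent = score_change(21, 13)
print(points, round(percent, 1))  # -8 -38.1

# A score that stays flat across visits shows no change.
print(score_change(14, 14))  # (0, 0.0)
```

The PHQ-9 example from the paragraph above is an 8-point drop, or about 38 percent, even though both scores fall in clinically significant ranges. Comparisons like this only make sense when the same tool and the same time window are used each time.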

After a positive screen, the follow-up may include a diagnostic interview, repeat screening, a more specific questionnaire, medical evaluation, therapy referral, medication discussion, safety planning, or monitoring over time. Whatever the pathway, the result opens the door to a more complete assessment rather than closing the question.

Self-tests can be a reasonable first step, especially when someone is trying to decide whether to seek help. But online results should be treated carefully. A self-test cannot confirm a diagnosis, check medical causes, assess risk fully, or tailor treatment. If a result is high, repeated, confusing, or upsetting, it is worth bringing it to a professional rather than trying to interpret it alone.

When Test Results Need Urgent Follow-Up

Some mental health test results require prompt or urgent follow-up because they may signal immediate risk, medical instability, or a condition that can worsen quickly. The total score is not the only concern; certain answers matter on their own.

Seek same-day professional help, urgent care, emergency care, or local crisis support if a test result includes thoughts of suicide, self-harm, harming someone else, or feeling unable to stay safe. In the United States, calling or texting 988 connects people to the Suicide and Crisis Lifeline. In other countries, use the local emergency number or crisis service. If there is imminent danger, call emergency services or go to the nearest emergency department.

A nonzero response to a self-harm item on the PHQ-9 or EPDS should be taken seriously even if the total score is not severe. It does not always mean the person intends to act, but it does mean someone should ask direct safety questions: Are the thoughts passive or active? Is there a plan? Is there intent? Are there means available? Is the person using alcohol or drugs? Are they alone? Have they attempted before? Dedicated suicide screening tools are designed to help clinicians ask these questions in a structured way.

Urgent follow-up is also important when results or symptoms suggest mania or psychosis. Red flags include several days of little or no sleep without feeling tired, unusually high energy, reckless spending, risky sexual behavior, grandiose beliefs, pressured speech, racing thoughts, paranoia, hallucinations, or beliefs that are clearly disconnected from reality. A depression score alone may not capture these symptoms, so clinicians often ask about them separately before prescribing antidepressants or making treatment decisions.

Eating disorder screens need special caution when there is fainting, chest pain, severe restriction, rapid weight loss, purging, laxative misuse, electrolyte concerns, pregnancy, diabetes, or very low heart rate. Alcohol and drug screens need urgent attention when there is withdrawal risk, blackouts, mixing substances, overdose risk, pregnancy, severe liver disease, or driving while impaired.

Sudden mental status changes are not simply “mental health test results.” New confusion, disorientation, severe agitation, sudden personality change, neurological symptoms, head injury, seizure, fever with confusion, or a rapid decline in memory or attention may need immediate medical evaluation. In those situations, a questionnaire can be far less important than ruling out delirium, infection, medication toxicity, neurological illness, intoxication, withdrawal, or other medical causes.

A practical safety rule is this: if the result points to possible danger, loss of reality testing, inability to care for basic needs, or sudden medical change, do not wait for a routine appointment.

How to Discuss Results With a Clinician

The best way to use mental health test results is to bring the score, the date, the tool name, and a few real-life examples of how symptoms are affecting daily life. A clinician can do much more with a score when it is paired with context.

Before the appointment, write down:

  1. The name of the test and your score. Include the date and whether it was taken online, at a clinic, at school, or during therapy.
  2. The time window. Many tools ask about the past 2 weeks, past month, or lifetime symptoms. Mixing timeframes can confuse interpretation.
  3. The symptoms that drove the score. Note which items were highest, not just the total.
  4. Functional impact. Describe what has changed in sleep, work, school, relationships, self-care, parenting, appetite, motivation, concentration, or social life.
  5. Safety concerns. Mention any self-harm thoughts, suicidal thoughts, risky behavior, substance use, or feeling out of control.
  6. Medical and medication context. Include new medications, dose changes, supplements, alcohol or cannabis use, sleep loss, pain, hormonal changes, recent illness, or major life events.
  7. Prior history. Note past episodes, diagnoses, hospitalizations, therapy, medication responses, family history, or previous test scores.

A useful appointment question is: “Does this score match what you are hearing from my history, or do we need to look for another explanation?” That keeps the score in its proper role. It is evidence, not the whole case.

For repeat testing, ask how often the same tool should be used. Weekly scoring may be helpful in some therapy settings, but too much checking can increase anxiety for some people. Many clinicians repeat brief scales every few weeks during treatment, after medication changes, or at key follow-up visits. The goal is to guide care, not to turn every mood shift into a report card.

It is also reasonable to ask what amount of change would be meaningful. Some tools have research-based estimates for clinically meaningful improvement, while others are best interpreted more generally. A small change may reflect normal fluctuation. A larger sustained change, especially when daily functioning improves too, is more convincing.

If the clinician orders additional testing, it does not mean the first result was wrong. A positive depression screen may lead to anxiety screening, trauma assessment, bipolar screening, sleep evaluation, substance use assessment, or blood tests for medical contributors. A broader mental health evaluation often uses several pieces of information because symptoms overlap.

The most balanced interpretation is simple: take the score seriously, but do not let the score define you. It is a tool for understanding distress, deciding what needs attention, and tracking whether support is helping.

Disclaimer

This content is for general educational purposes only and is not a substitute for professional medical advice, diagnosis, or treatment. Mental health test scores should be interpreted with a qualified clinician, especially when results involve self-harm, suicide risk, mania, psychosis, substance use, eating disorder symptoms, or major changes in functioning.
