Psychological testing Study Guide
📖 Core Concepts
Psychological testing – Administration of psychological tests by trained evaluators; differences in scores are taken to reflect individual differences in the construct being measured.
Psychometrics – Scientific study of test construction, reliability, and validity.
Latent variable – An unobserved construct (e.g., intelligence, anxiety) inferred from test items.
Validity – Evidence that a test measures what it claims to measure.
Reliability – Consistency of test scores across forms, administrations, or items (e.g., test‑retest, internal consistency).
Standardization – Uniform procedures for test delivery, scoring, and interpretation across all examinees.
Norms – Representative sample data that define high, low, and average scores for a specific population.
Sample of behavior – The limited set of tasks/items used to represent a larger domain of behavior.
📌 Must Remember
A test must be both valid and reliable to be useful.
Standardization minimizes examiner subjectivity; objectivity reduces scoring bias.
Discrimination: good tests separate extreme groups (e.g., patients vs. healthy controls).
Norm‑referenced scores compare to a population; criterion‑referenced scores assess mastery of a defined content set.
A reliability coefficient of about \(r = .80\) or higher is commonly treated as acceptable consistency; the exact threshold depends on how the scores will be used.
Cohen’s κ is used to assess inter‑rater reliability for direct observation.
Projective tests have limited validity despite being used to explore unconscious material.
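The reliability points above can be made concrete with a short sketch. It computes a test‑retest correlation and Cronbach's alpha (one common index of internal consistency) using only the Python standard library. The score lists and the function names `pearson_r` and `cronbach_alpha` are hypothetical and chosen for illustration; they do not come from any real instrument.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


def cronbach_alpha(responses):
    """Cronbach's alpha; `responses` holds one list of item scores per examinee."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # regroup scores by item
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var / total_var)


# Hypothetical total scores for five examinees tested twice (test-retest).
time1 = [10, 12, 15, 18, 20]
time2 = [11, 13, 14, 19, 21]
print("test-retest r:", round(pearson_r(time1, time2), 2))

# Hypothetical item-level responses (five examinees, three items).
responses = [[3, 4, 3], [2, 2, 3], [5, 4, 5], [1, 2, 1], [4, 4, 4]]
print("Cronbach's alpha:", round(cronbach_alpha(responses), 2))
```

A high value from either function speaks only to consistency; it says nothing about whether the items measure the intended construct.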
🔄 Key Processes
Test Development
Define construct → generate items → pilot → assess reliability & validity → standardize administration → norm on representative sample.
Psychological Assessment Workflow
Gather data (tests, inventories, collateral info) → integrate across sources → interpret using norms & validity evidence → formulate conclusions/recommendations.
Scoring & Interpretation
Follow objective scoring rules → compute raw scores → convert to standardized scores (e.g., T‑scores, percentiles) → compare to relevant norms (age, grade, etc.).
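The scoring chain above (raw score → standardized score → percentile) can be sketched as follows. The norm mean and SD are made‑up values, the function name `standardize` is hypothetical, and the percentile step assumes scores are approximately normally distributed in the norm group, which real test manuals may not assume.

```python
from statistics import NormalDist


def standardize(raw, norm_mean, norm_sd):
    """Convert a raw score to a z-score, T-score, and normal-theory percentile."""
    z = (raw - norm_mean) / norm_sd        # distance from the norm mean in SD units
    t = 50 + 10 * z                        # T-score scale: mean 50, SD 10
    pct = NormalDist().cdf(z) * 100        # percent of norm group expected below z
    return z, t, pct


# Hypothetical norm group: mean 40, SD 8; examinee's raw score is 52.
z, t, pct = standardize(raw=52, norm_mean=40, norm_sd=8)
print(f"z = {z:.2f}, T = {t:.0f}, percentile = {pct:.1f}")
```

Swapping in a different norm group (for example, a different age band) changes `norm_mean` and `norm_sd`, and therefore the standardized score for the same raw score.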
🔍 Key Comparisons
Achievement vs. Aptitude Tests
Achievement: measures learned knowledge; often criterion‑referenced.
Aptitude: measures potential to learn or perform; can be specific (clerical) or general (intelligence).
Norm‑referenced vs. Criterion‑referenced
Norm‑referenced: “How does this person compare to peers?” (percentiles).
Criterion‑referenced: “Has the person mastered the target content?” (pass/fail, mastery).
Questionnaire/Interview Scales vs. Psychoeducational Tests
Questionnaire/Interview: assess typical behavior or attitudes.
Psychoeducational: assess maximum performance (e.g., IQ, achievement).
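The norm‑referenced versus criterion‑referenced distinction can be illustrated with a minimal sketch. The norm sample, the 80% mastery cut‑off, and both function names are assumptions made for illustration; percentile rank is defined here as the percentage scoring at or below the examinee, which is only one of several conventions.

```python
def percentile_rank(score, norm_sample):
    """Norm-referenced: percent of the norm sample scoring at or below `score`."""
    at_or_below = sum(s <= score for s in norm_sample)
    return 100 * at_or_below / len(norm_sample)


def has_mastered(items_correct, items_total, cutoff=0.80):
    """Criterion-referenced: compare proportion correct to a fixed mastery cut-off."""
    return items_correct / items_total >= cutoff


# Hypothetical norm sample of ten peers' scores.
norm_sample = [12, 15, 17, 18, 20, 21, 23, 25, 27, 30]

print(percentile_rank(24, norm_sample))   # depends on how peers scored
print(has_mastered(27, 30))               # depends only on the content standard
```

The first result changes whenever the comparison group changes; the second stays the same no matter how anyone else performs.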
⚠️ Common Misunderstandings
Reliability ≠ Validity – A test can be highly consistent yet measure the wrong construct.
High scores always mean pathology – Clinical tests are normed; a score 1 SD above the mean may be typical, not diagnostic.
Projective tests are definitive – Evidence shows limited validity; they should not be sole diagnostic tools.
Norms are universal – Norms are population‑specific; using adult norms for children yields invalid conclusions.
🧠 Mental Models / Intuition
“The Test as a Mirror” – Think of a test as a mirror that reflects the underlying construct only if the glass (validity) is clear and the frame (reliability) is sturdy.
“Sampling the Ocean” – A small, well‑chosen sample of behavior can reliably estimate a huge domain, provided the sample is representative.
🚩 Exceptions & Edge Cases
Direct observation often requires Cohen’s κ ≥ 0.70 for acceptable inter‑rater reliability; lower values signal agreement that may be too weak to rely on, although the exact threshold is a convention rather than a fixed rule.
Projective tests: despite low validity, may be used for exploratory purposes in research settings where no better instrument exists.
Test security breaches: If items become public, norms and reliability degrade; tests must be withdrawn or revised.
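For the inter‑rater case above, Cohen's κ corrects the observed agreement between two raters for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The sketch below is a minimal standard‑library version; the observation codes and the function name `cohens_kappa` are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same observations."""
    n = len(rater_a)
    # Observed agreement: proportion of observations coded identically.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions per category.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    # Undefined (division by zero) if both raters always use a single category.
    return (observed - expected) / (1 - expected)


# Hypothetical codes for ten observation intervals ("on" = on-task).
rater_a = ["on", "on", "off", "on", "off", "off", "on", "on", "off", "on"]
rater_b = ["on", "on", "off", "off", "off", "off", "on", "on", "on", "on"]

kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 2))
```

Here the raters agree on 8 of 10 intervals, yet κ comes out well below 0.80 because a good share of that agreement would be expected by chance.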
📍 When to Use Which
Diagnosing psychopathology → Clinical symptom scales (e.g., PHQ‑9, PCL‑5) with established cut‑offs.
Career counseling → Interest inventories (e.g., Holland codes); add an occupational depression inventory when work‑related distress is also in question.
Assessing learning disabilities → Achievement tests (norm‑referenced) combined with ability tests.
Evaluating treatment change → Short screening scales (K6/K10) or weekly worry questionnaires for repeated measures.
Measuring attitudes → Likert‑type attitude scales when a unidimensional favorability construct is needed.
👀 Patterns to Recognize
“Norm‑referenced + Standardized score + Percentile” → Typical for large‑scale personality or clinical tests.
“Criterion‑referenced + Pass/Fail” → Common in certification or mastery exams.
“Multiple‑item self‑report + Likert 1–5” → Characteristic of attitude, affect (PANAS), and stress scales.
“Observer‑report + Cohen’s κ” → Signals direct observation methodology.
🗂️ Exam Traps
Choosing “projective test” for a high‑stakes diagnosis – tempting because it sounds “deep”; wrong because validity is limited.
Confusing reliability coefficient with validity evidence – a high \(r\) does not guarantee the test measures the intended construct.
Applying adult norms to a child’s score – yields misleading percentile; always verify age‑matched norms.
Assuming a “high” raw score automatically indicates pathology – compare the score against the instrument’s norm‑based cut‑off (typically a set number of SDs above the normative mean) before interpreting it.
Over‑relying on a single source of information – assessment requires integration of multiple instruments and collateral data.