
Study Guide

📖 Core Concepts
- Educational assessment – systematic collection & use of data (knowledge, skills, attitudes, etc.) to improve learning and programs.
- Placement (pre‑assessment) – given before instruction; establishes a baseline; usually ungraded.
- Formative assessment – occurs throughout a course; provides feedback for learning; not necessarily graded.
- Summative assessment – administered at the end of a unit/course; assigns a grade and evaluates overall achievement.
- Diagnostic assessment – identifies specific difficulties to guide remediation.
- Objective assessment – single‑best‑answer items (e.g., multiple‑choice, true‑false).
- Subjective assessment – open‑ended responses (essays, projects) with multiple possible correct answers.
- Criterion‑referenced – performance judged against a set standard (e.g., a “pass/fail” rubric).
- Norm‑referenced – performance judged relative to peers (e.g., IQ, percentile rank).
- Formal assessment – written, scored, contributes to a numerical grade.
- Informal assessment – observations, checklists, portfolios; no grade attached.
- Internal vs. External – Internal: designed and graded by the school; External: designed by a governing body, graded by independent raters.
- Reliability – consistency of scores across time, forms, and items (temporal stability, form equivalence, internal consistency).
- Validity – extent to which an assessment measures what it claims to measure (content, criterion, construct).
- Practicality – considerations of time, cost, and ease of construction, administration, and scoring.
- Authenticity – tasks mirror real‑world contexts and language.
- Washback – impact of the test on teaching & learning (positive → good practices; negative → teaching to the test).
- Grade inflation – systematic rise in grades for the same level of work, devaluing grades.
- High‑stakes testing – results drive major decisions (graduation, admission, teacher evaluation).
- Large‑scale learning assessment – system‑wide snapshots of achievement, often used for policy.
- Assessing English‑Language Learners (ELLs) – standard tests may be culturally biased; non‑verbal or adapted tests help reduce bias.
- Universal screening – tests all students (e.g., for giftedness) to avoid selection bias.
- Theoretical frameworks – behaviorism, cognitivism, constructivism, and sociocultural theory shape assessment design.

---

📌 Must Remember
- Formative ≠ graded; its primary goal is feedback, not a score.
- Reliability ≠ validity – a test can be consistent but measure the wrong thing.
- Criterion‑referenced → “Did the student meet the standard?”; Norm‑referenced → “How does the student compare to others?”
- Objective items are easier to score reliably; subjective items provide richer evidence of higher‑order thinking.
- Washback can be positive (encourages deep learning) or negative (teaching to the test); high‑stakes tests amplify washback effects.
- Authentic tasks increase construct validity but may lower practicality.
- Grade inflation erodes the signal grades carry for selection & motivation.

---

🔄 Key Processes

Designing a Reliable Test
- Draft items → pilot → calculate internal consistency (e.g., Cronbach’s α; see the sketch below).
- Ensure temporal stability by administering the same test to a stable group at two points in time and correlating the scores.
- Create parallel forms for different administrations; compare scores for form equivalence.

Establishing Validity
- Content validity: map each item to the learning objectives.
- Criterion validity: correlate test scores with an external standard (e.g., a later course grade).
- Construct validity: gather evidence that scores reflect the intended theoretical construct (e.g., via factor analysis).
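The reliability and validity statistics named above are simple to compute once item‑level scores are available. The sketch below is illustrative only – the data, variable names, and helper functions are hypothetical, not from this guide – and shows Cronbach’s α for internal consistency plus the Pearson correlation of the kind used for test–retest stability, parallel‑form equivalence, and criterion validity.

```python
# Minimal sketch of the reliability/validity statistics mentioned above.
# All data and names here are hypothetical, for illustration only.
import numpy as np

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """Internal consistency. Rows = students, columns = test items."""
    item_scores = np.asarray(item_scores, dtype=float)
    k = item_scores.shape[1]                          # number of items
    item_vars = item_scores.var(axis=0, ddof=1)       # variance of each item
    total_var = item_scores.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def pearson_r(x, y) -> float:
    """Correlation used for test-retest stability, form equivalence,
    and criterion validity (test score vs. an external standard)."""
    return float(np.corrcoef(x, y)[0, 1])

# Hypothetical pilot data: 5 students x 4 items (1 = correct, 0 = incorrect).
scores = np.array([
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
])
print(f"Cronbach's alpha: {cronbach_alpha(scores):.2f}")

# Criterion validity: correlate total test scores with a later external
# criterion, e.g. final course grades (hypothetical numbers).
course_grades = [78, 85, 60, 92, 55]
print(f"Criterion validity r: {pearson_r(scores.sum(axis=1), course_grades):.2f}")
```

As a rough convention, α values around 0.7 or higher are often read as acceptable internal consistency, though the appropriate threshold rises with the stakes of the decision the test informs.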
Choosing Assessment Type
- Identify the purpose (feedback, grading, placement, diagnosis).
- Match purpose to type: feedback → formative; final grade → summative; baseline → placement; remediation → diagnostic.

Implementing Washback Awareness
- Review test content → ensure alignment with desired instructional practices.
- Adjust stakes & feedback loops to encourage positive washback.

---

🔍 Key Comparisons
- Objective vs. Subjective – Objective: single correct answer, high reliability, limited depth. Subjective: multiple valid responses, lower reliability, richer insight.
- Criterion‑referenced vs. Norm‑referenced – Criterion: measures against a standard; every student can pass or fail. Norm: ranks learners; performance is relative.
- Formal vs. Informal – Formal: written, scored, contributes to a grade. Informal: observational, developmental, no grade.
- Internal vs. External – Internal: school‑controlled, immediate feedback. External: standardized, broader comparability, limited feedback.
- Reliability vs. Validity – Reliability: consistency of scores. Validity: accuracy of what is being measured.
- High‑stakes vs. Low‑stakes – High‑stakes: outcomes affect graduation, admission, or employment. Low‑stakes: primarily informs learning, minimal consequences.

---

⚠️ Common Misunderstandings
- “Formative assessments are always ungraded.” – They can be graded; the key is that their primary purpose is feedback.
- “A reliable test must be valid.” – Reliability is about consistency; a test can be consistently wrong.
- “Norm‑referenced tests are unfair.” – They are appropriate when the goal is ranking, not when a fixed standard is required.
- “Authentic tasks are always impractical.” – Balance authenticity with practicality; small‑scale authentic tasks can be feasible.
- “Grade inflation means students are smarter.” – Inflation reflects grading practices, not student ability.

---

🧠 Mental Models / Intuition
- Assessment Loop – Set Objectives → Design Tasks → Collect Evidence → Analyze → Feedback → Revise Instruction.
- Reliability‑Validity Trade‑off – Imagine a scale: tightening the scale (reliability) doesn’t guarantee it points to the right weight (validity).
- Washback as a Mirror – The test reflects teaching; a distorted mirror (negative washback) changes what teachers focus on.

---

🚩 Exceptions & Edge Cases
- Authenticity vs. Practicality – Highly authentic tasks (e.g., real‑world projects) may be costly; use mini‑authentic activities to retain validity.
- Norm‑referenced in Mastery Contexts – Norm data are sometimes used to benchmark mastery standards, but interpret them with caution.
- Diagnostic Without Follow‑up – Identifying a difficulty is useless unless intervention follows.
- High‑stakes Tests in Low‑resource Settings – May increase negative washback due to limited instructional flexibility.

---

📍 When to Use Which
- Formative → when you need ongoing feedback to adjust teaching.
- Summative → at the end of a unit to assign grades or certify competence.
- Objective items → for large‑scale, high‑reliability needs (e.g., state tests).
- Subjective items → to assess critical thinking, writing, and problem solving.
- Criterion‑referenced → when a fixed standard defines success (e.g., competency exams).
- Norm‑referenced → when the goal is selection or ranking (e.g., competitive scholarships).
- Internal assessment → for quick, formative feedback.
- External assessment → for policy decisions and accountability.

---

👀 Patterns to Recognize
- Timing clue – “before instruction” → placement; “throughout the course” → formative; “at the end” → summative.
- Scoring clue – “single correct answer” → objective; “requires an essay” → subjective.
- Reference clue – “against a rubric” → criterion‑referenced; “compared to peers” → norm‑referenced.
- Stake clue – “determines graduation” → high‑stakes; “provides practice” → low‑stakes.

---

🗂️ Exam Traps
- Confusing formative with summative – Look for the purpose (feedback vs. grade) rather than the presence of a score.
- Assuming reliability guarantees validity – Test items may be consistent but still miss the intended construct.
- Mixing up criterion‑ vs. norm‑referenced – Criterion = fixed standard; Norm = relative standing.
- Choosing an objective format for complex skills – Essays may be needed to capture higher‑order thinking; multiple‑choice items can be a trap if they oversimplify.
- Over‑emphasizing grade inflation as improvement – Higher grades don’t equal higher learning; watch for inflation signals (a grade distribution shift without a corresponding performance change).