
Program evaluation - Designing and Conducting Evaluations

Understand the stages of program evaluation, key frameworks such as the CDC six-step framework and the CIPP model, and practical strategies for conducting evaluations under budget, time, and data constraints.

Summary

Program Evaluation: Comprehensive Overview

What is Program Evaluation and Why It Matters

Program evaluation is a systematic process for determining whether a program is achieving its intended goals and having the desired impact on the population it serves. Rather than assuming a program works, evaluators ask tough questions: Does the problem actually exist? Is the program being delivered correctly? Is it actually making a difference?

Evaluation serves multiple purposes. It helps organizations improve their programs while they're running (formative evaluation), and it helps determine whether programs should continue or be modified based on results (summative evaluation). In essence, evaluation is about accountability and learning: proving that a program works while using evidence to make it better.

The Five Core Evaluation Stages

Programs can be evaluated at five distinct stages. Each stage answers a different fundamental question and relies on different types of evidence.

Stage 1: Needs Assessment

Before launching any program, evaluators must ask: Does this problem actually exist, and is it significant enough to warrant intervention? A needs assessment examines whether the target population truly has the problem the program intends to address. This may sound obvious, but organizations sometimes create programs based on assumptions rather than evidence. A needs assessment grounds evaluation in reality.

What a needs assessment accomplishes: A comprehensive needs assessment identifies four critical pieces of information: (1) the precise definition of the problem, (2) who or what is affected, (3) how widespread the problem is, and (4) the measurable effects caused by the problem. For example, a program addressing childhood obesity would need to document not just that obesity exists in a community, but by how much, in which age groups, and what health consequences result.

Three population categories: Evaluators distinguish between three overlapping groups:

- Population at risk: people who could develop the problem but haven't yet (e.g., sedentary children at risk of obesity)
- Population in need: people who currently have the problem (e.g., children already classified as obese)
- Population in demand: people actively seeking help for the problem (e.g., families actively enrolling in weight-loss programs)

Programs typically serve the population in demand, but a needs assessment helps identify whether resources should be targeted at the population at risk for prevention.

The four steps of conducting a needs assessment:

1. Perform a gap analysis: compare the current state (where things are) with the desired state (where things should be). The gap is the problem that needs solving (see the sketch after this list).
2. Identify priorities and importance: not all gaps are equally important. The needs assessment determines which problems are most critical and most feasible to address.
3. Identify causes of performance problems: understanding why a problem exists helps design better solutions. For example, is childhood obesity caused by lack of physical activity, poor nutrition, or both?
4. Identify possible solutions and growth opportunities: a needs assessment doesn't just identify problems; it also explores what interventions might work, based on existing research and community input.
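To make the gap-analysis step concrete, here is a minimal Python sketch (not from the source material): it compares hypothetical current and desired values for a few invented community health indicators and ranks the gaps by relative size.

```python
# Minimal gap-analysis sketch: compare the current state with the desired state
# for a few hypothetical community health indicators, then rank the gaps.

current = {"obesity_rate_pct": 24.0, "daily_activity_min": 22.0, "fruit_veg_servings": 2.1}
desired = {"obesity_rate_pct": 15.0, "daily_activity_min": 60.0, "fruit_veg_servings": 5.0}

# Gap = desired state minus current state (the sign shows the direction of needed change).
gaps = {name: desired[name] - current[name] for name in current}

# Prioritize by relative size of the gap (largest relative shortfall first).
priorities = sorted(gaps, key=lambda name: abs(gaps[name]) / abs(desired[name]), reverse=True)

for name in priorities:
    print(f"{name}: current={current[name]}, desired={desired[name]}, gap={gaps[name]:+.1f}")
```

In practice the "current state" figures would come from the needs-assessment data and stakeholder input rather than being hard-coded.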
Why community input matters: Conducting a needs assessment requires more than data analysis; it demands regular consultation with community stakeholders and potential beneficiaries. People experiencing the problem often have insights that data alone cannot reveal. This collaborative approach increases the likelihood that the program will actually address real community concerns.

Stage 2: Program Theory (Logic Model)

Once a need is established, the next question becomes: How exactly will this program create change? Program theory, also called a logic model, knowledge map, or impact pathway, is the explicit articulation of the causal chain: if we do X, then Y will happen, which will lead to Z. It's the program's theory about how its actions will produce the intended outcomes.

Why program theory matters: Many programs operate with implicit, untested assumptions about how they work. An evaluator's job includes making these assumptions explicit and testing whether they're plausible. A program might assume that providing financial literacy training automatically leads to better financial decisions, but that's not guaranteed; people might need ongoing support or different incentives.

The five components of a logic model (a structured sketch of these components appears at the end of this stage):

- Resources or inputs: the money, staff, facilities, and materials the program has available
- Activities: what the program actually does (e.g., conducts workshops, provides counseling, delivers medication)
- Outputs: the direct products of activities (e.g., number of workshops held, number of people served)
- Short-term outcomes: the immediate changes in knowledge, skills, or attitudes (e.g., participants gain financial knowledge)
- Long-term outcomes: the sustained changes in behavior or social conditions (e.g., participants have better credit scores and lower debt)

A critical point: outputs are not outcomes. Holding 10 workshops (output) is meaningless if participants don't gain knowledge or change behavior (outcome). Evaluators must focus on what actually changes, not just what the program produces.

Assessing whether the logic model is sound: An evaluator tests program theory by asking four key questions:

1. Does it relate to social needs? Is the outcome the program aims to achieve actually important for addressing the identified need?
2. Does the logic make sense? Would expert reviewers agree that the causal chain is plausible? (e.g., does financial literacy training logically lead to better financial decisions?)
3. Does research support it? Have similar programs produced similar results in the research literature, or are you relying on unproven assumptions?
4. Does preliminary observation confirm it? Early implementation can reveal whether the theoretical chain actually occurs in practice.
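As a quick illustration of the five components, here is a small Python sketch of a logic model for a hypothetical financial-literacy program. The class name and all example values are invented for illustration (and the sketch assumes Python 3.9+ for the built-in list annotations); it is not a required format for logic models.

```python
# Illustrative logic-model sketch for a hypothetical financial-literacy program.
# The fields mirror the five components described above.
from dataclasses import dataclass


@dataclass
class LogicModel:
    inputs: list[str]               # resources: money, staff, facilities, materials
    activities: list[str]           # what the program actually does
    outputs: list[str]              # direct products of the activities
    short_term_outcomes: list[str]  # immediate changes in knowledge, skills, attitudes
    long_term_outcomes: list[str]   # sustained changes in behavior or conditions


financial_literacy = LogicModel(
    inputs=["2 trainers", "$50k budget", "community-center classrooms"],
    activities=["run 10 budgeting workshops", "offer one-on-one coaching"],
    outputs=["10 workshops held", "150 participants served"],
    short_term_outcomes=["participants gain budgeting knowledge"],
    long_term_outcomes=["participants improve credit scores and reduce debt"],
)

print(financial_literacy.short_term_outcomes)
```

Writing the model out this explicitly makes it easy to spot when a program has outputs but no plausible path to outcomes.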
Stage 3: Implementation Assessment (Process Evaluation)

Assuming the program theory is sound, the next question is: Is the program actually being delivered as designed? Implementation assessment, often called process evaluation, determines whether:

- The program reaches the populations it's supposed to reach
- Participants actually receive the intended services
- Staff members are adequately qualified and trained
- The program maintains fidelity to its design while adapting to local conditions

Why implementation matters: Even a well-designed program with sound theory will fail if it's poorly implemented. Staff might skip critical activities, serve the wrong population, or deliver services inconsistently. Process evaluation catches these problems in real time.

How process evaluation works: Process evaluation is ongoing and uses repeated measures to monitor fidelity to the program design. Rather than waiting until the program ends to discover problems, evaluators continuously check: Are we doing what we said we'd do? Where are we falling short? What barriers are we encountering? For example, a violence prevention program might track whether facilitators are conducting sessions with the planned frequency, covering all curriculum content, and maintaining engagement with participants. If adherence is low, the program can adjust training or resources before implementation problems undermine the entire effort.

Stage 4: Impact (Effectiveness) Assessment

Finally, after the program has been implemented, the crucial question is: Did the program actually cause the outcomes we intended? Impact evaluation (also called effectiveness evaluation) determines the causal effects of the program on its intended outcomes. This is the most complex form of evaluation because establishing causation requires sophisticated thinking about what would have happened without the program.

Understanding outcomes and outcome change: An outcome is a characteristic of the target population or social condition that the program is expected to change. Examples include academic achievement, employment status, health outcomes, or civic participation. Two related concepts are essential:

- Outcome level: the status of an outcome at a single point in time (e.g., a student's test score in May)
- Outcome change: the difference in outcome level between two points in time (e.g., the difference between a student's test score in May versus September)

However, not all outcome change is caused by the program. People's circumstances change for many reasons. A student's test scores might improve because the program helped them, or because they inherited a stronger work ethic from their parents, or because the testing was easier the second time, or simply due to natural variation.

Program effect: The crucial distinction is that program effect is the portion of outcome change that can be uniquely attributed to the program. If students served by a program improve by 10 percentage points on average, but students in a similar non-program school improved by 7 percentage points, the program effect might be approximately 3 percentage points (the difference between the two groups). A small numeric sketch of this comparison appears at the end of this stage.

Measuring program effect requires comparison. The gold standard is a randomized controlled trial, where some participants are randomly assigned to receive the program while others form a control group. But evaluators also use quasi-experimental methods, statistical controls, and regression analysis to estimate causal effects when true experiments aren't possible.

Measuring outcomes with indicators: Outcomes are abstract concepts (health, learning, economic stability). To measure them, evaluators select observable indicators that vary systematically with changes in the underlying condition. For example:

- To measure "health," you might use indicators like blood pressure, cholesterol levels, or reported physical activity
- To measure "economic stability," you might use income, savings, or employment status
- To measure "academic achievement," you might use test scores, grades, or graduation rates

The indicator must actually reflect the outcome it's supposed to measure. Not every health-related indicator measures health equally well.
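The following Python sketch works through the 10-point versus 7-point example above. The baseline levels of 60 are invented solely to make the arithmetic concrete; a real impact evaluation would use comparison-group data from an actual study design.

```python
# Numeric sketch of outcome change and program effect, using the hypothetical
# figures from the text (all values are illustrative, not real data).

# Outcome levels (e.g., percent of students passing) at two points in time.
program_group = {"before": 60.0, "after": 70.0}      # students served by the program
comparison_group = {"before": 60.0, "after": 67.0}   # similar non-program school

# Outcome change = difference in outcome level between the two time points.
program_change = program_group["after"] - program_group["before"]          # 10 points
comparison_change = comparison_group["after"] - comparison_group["before"]  # 7 points

# Program effect = the portion of outcome change attributable to the program,
# estimated here as the difference between the two groups' changes.
program_effect = program_change - comparison_change  # about 3 percentage points

print(f"Outcome change (program group):    {program_change:+.1f} points")
print(f"Outcome change (comparison group): {comparison_change:+.1f} points")
print(f"Estimated program effect:          {program_effect:+.1f} points")
```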
Stage 5: Efficiency (Cost-Benefit) Assessment

Beyond asking whether a program works, organizations increasingly ask: Is the program worth its cost? Cost-benefit analysis (also called cost-efficiency analysis) compares the benefits of a program with its costs. An efficient program achieves its outcomes at a reasonable cost compared to alternatives.

Two dimensions of efficiency:

- Static efficiency: achieving current objectives with the least cost. For example, which reading intervention produces the best reading gains per dollar spent?
- Dynamic efficiency: continuous improvement of the program over time. Rather than just comparing cost to benefit once, evaluators ask whether the program gets more efficient as it matures and learns from experience.

An important point: a program can be effective but inefficient if a similar program produces the same results at lower cost. Evaluation must consider both impact and resources. A short cost-effectiveness sketch follows below.
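As a rough illustration of static efficiency, the sketch below compares the cost per unit of outcome for two hypothetical reading interventions. All figures are invented, and this is a cost-effectiveness comparison rather than a full cost-benefit analysis.

```python
# Illustrative cost-effectiveness comparison of two hypothetical reading
# interventions (figures invented; a sketch of the static-efficiency idea above).

interventions = {
    "Intervention A": {"cost": 120_000.0, "reading_gain_points": 6.0},
    "Intervention B": {"cost": 80_000.0,  "reading_gain_points": 5.0},
}

for name, data in interventions.items():
    # Cost per unit of outcome: dollars spent per point of reading gain.
    cost_per_point = data["cost"] / data["reading_gain_points"]
    print(f"{name}: ${cost_per_point:,.0f} per point of reading gain")

# Intervention A produces larger gains, yet Intervention B costs less per point:
# a program can be effective but still inefficient relative to an alternative.
```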
Major Evaluation Frameworks

Different frameworks provide systematic approaches to organizing and conducting evaluations. Two major frameworks appear frequently in evaluation practice.

The CDC Six-Step Framework

The Centers for Disease Control and Prevention outlines a comprehensive six-step process for planning and conducting evaluations:

1. Engage stakeholders: Identify and involve the people who have a stake in the program: program staff, participants, community members, funders, and policymakers. Stakeholders help define evaluation questions and ensure that evaluation findings are relevant and usable.
2. Describe the program: Create a clear description of what the program is, how it operates, the needs it addresses, and the outcomes it expects. This often involves developing a logic model (as discussed earlier).
3. Focus the evaluation design: Based on program stage and available resources, decide what aspects to evaluate and what evaluation questions to prioritize. A new program might focus on implementation; a mature program might focus on impact.
4. Gather credible evidence: Collect data using methods appropriate to your evaluation questions. This might include surveys, interviews, focus groups, administrative records, or observation. The data must be of high quality and collected systematically.
5. Justify conclusions: Analyze the evidence and draw conclusions that directly answer your evaluation questions. Justify why you've reached these conclusions based on the evidence collected.
6. Ensure use and share lessons learned: Disseminate findings to intended users in formats they can understand and use. Create feedback loops so that findings actually improve the program and inform future decisions.

This framework emphasizes that evaluation is not just about research; it's about decision-making and improvement. Each step connects directly to stakeholder needs and organizational learning.

The CIPP Model of Evaluation

The CIPP model, developed by Daniel Stufflebeam in the 1960s, provides another systematic framework. CIPP stands for Context, Input, Process, and Product, the four main aspects of evaluation. This model directly links evaluation to program decision-making.

The four components and key questions: Each CIPP component answers a fundamental question:

| Component | Key Question | What It Evaluates |
|-----------|--------------|-------------------|
| Context | "What should we do?" | Needs, goals, and stakeholder concerns; justifies why the program exists |
| Input | "How should we do it?" | Resources, strategies, and how the program design compares to external models and best practices |
| Process | "Are we doing it as planned?" | Implementation fidelity, operational issues, and whether the program is being delivered as designed |
| Product | "Did the program work?" | Outcomes and whether the program achieved its intended results |

Timing of CIPP evaluations: Different CIPP components are most useful at different program stages:

- Before program start: Context and Input evaluations assess needs and design. Does a need exist? Is the program design sound? Are resources adequate?
- During implementation: Process evaluation monitors execution. Is the program being delivered as planned? What barriers are emerging?
- After program completion: Product evaluation determines effectiveness. Did the program achieve its intended outcomes?

Formative versus summative questions within CIPP: The CIPP framework incorporates both formative and summative evaluation:

- Formative questions (e.g., "What needs to be done?" and "Are we doing it as planned?") guide program planning and improvement while the program is running. Answers help staff make mid-course corrections.
- Summative questions (e.g., "Did the effort succeed?") assess overall success and inform decisions about whether to continue, modify, or terminate the program.

Planning a Program Evaluation

Before conducting any evaluation, careful planning is essential. Planning includes four interconnected parts:

1. Focusing the evaluation: determine what specific aspects of the program will be evaluated and what questions will drive the evaluation
2. Collecting the information: plan data collection methods, timing, and sources
3. Using the information: identify who will use findings and how they'll apply them
4. Managing the evaluation: organize resources, timeline, personnel, and budget

Critical planning questions that guide the process:

- What is being evaluated (the whole program, specific components, specific populations)?
- What is the primary purpose of this evaluation (improvement, accountability, demonstration)?
- Who are the intended users of the evaluation (program staff, funders, policymakers)?
- What are the key evaluation questions that stakeholders want answered?
- What information is required to answer those questions?
- When is the information needed?
- What resources (budget, staff, expertise) are available?
- What data collection methods are most feasible and cost-effective?
- How will data be analyzed?
- What is the timeline for the entire evaluation?

Without clear answers to these questions, evaluations can become unfocused, expensive, and ultimately not useful.

Practical Approaches to Resource Constraints

The Shoestring Approach

Real-world evaluations often operate under severe constraints: limited budgets, limited access to data, or tight timelines. The shoestring approach provides practical guidance for conducting methodologically rigorous evaluations despite these constraints. It emphasizes:

- Simplifying design: use straightforward designs rather than complex experimental approaches
- Reducing sample size: focus on smaller but more carefully selected samples rather than large representative samples
- Using economical data-collection methods: leverage volunteers, short surveys, focus groups, and existing records rather than expensive primary data collection
- Leveraging secondary data: use administrative records, census data, or data that other organizations have already collected

All three constraints (budget, time, and data) can be mitigated through these strategies, careful planning, and triangulating multiple data sources to increase confidence in findings.
The Five-Tiered Approach

The five-tiered approach refines the shoestring strategy by matching evaluation intensity to program characteristics and available resources. Rather than assuming all programs require the same level of evaluation rigor, this framework suggests that simple programs with ample resources might use less intensive evaluation, while complex programs with limited resources might need creative approaches. The framework provides a conceptual structure for selecting the most appropriate evaluation design given specific constraints and program complexity.

Collective Impact Evaluation

In some cases, multiple organizations from different sectors work together toward a common agenda to solve a specific social problem; this is called collective impact. Evaluating these complex, multi-organizational initiatives requires different approaches at different stages:

- Early phase (uncertain strategy): uses developmental evaluation that provides real-time feedback on emerging system dynamics, helping the initiative adapt as it learns
- Middle phase (implementation): uses formative evaluation to monitor processes and improve progress, while still employing developmental evaluation for new emerging elements that weren't anticipated
- Later phase (stability): uses summative evaluation, combining quantitative and qualitative methods to understand what was ultimately achieved and why
Flashcards
What is the primary purpose of a needs assessment?
To examine whether a target population actually has the problem a program intends to address.
What four specific elements does a needs assessment identify regarding a problem?
The precise definition of the problem; who or what is affected; how widespread the problem is; and the measurable effects caused by the problem.
What is the first step an evaluator takes when conducting a needs assessment?
Constructing a precise definition of the problem.
How does an evaluator determine the extent of a problem in a needs assessment?
By answering "where" and "how big" the problem is.
In a needs assessment, how is the "how" question answered?
By analyzing whether the proposed plan can eliminate the need.
What are the three categories used to classify populations in a needs assessment?
Population at risk; population in need; population in demand.
What is the definition of a program theory (logic model)?
The explicit articulation of how a program's actions are expected to produce its intended outcomes (the causal chain from activities to outcomes).
What are the five major components of a logic model?
Resources or inputs; activities; outputs; short-term outcomes; long-term outcomes.
What four methods can be used to assess program theory?
Relating the theory to social needs; expert review of logic and plausibility; comparing the theory with existing research and practice; preliminary observation of program implementation.
What does an implementation assessment evaluate?
Whether the program is being delivered as intended and if critical components are in place.
What three factors does a process evaluation determine?
If target populations are reached; if intended services are received; if staff are adequately qualified.
What is the goal of an impact evaluation?
To determine the causal effects of the program on its intended outcomes.
In evaluation, how is an "outcome" defined?
A characteristic of the target population or social condition that the program is expected to change.
What is the difference between outcome level and outcome change?
Outcome level is the status at one point in time, while outcome change is the difference between two points in time.
What is a "program effect"?
The portion of outcome change that can be uniquely attributed to the program.
What characterizes an efficient program in a cost-benefit analysis?
Achieving its intended outcomes at a reasonable cost compared to alternatives (a favorable ratio of benefits to costs).
What is the focus of static efficiency?
Achieving objectives with the least cost.
What is the focus of dynamic efficiency?
Continuous improvement of the program over time.
What are the six steps in the CDC evaluation framework?
Engage stakeholders; describe the program; focus the evaluation design; gather credible evidence; justify conclusions; ensure use and share lessons learned.
What type of evaluation is used during the early phase of collective impact?
Developmental evaluation (for real-time feedback on emerging dynamics).
What type of evaluation is primarily used during the middle phase of collective impact?
Formative evaluation (to monitor processes and improve progress).
What type of evaluation is used during the later phase (stability) of collective impact?
Summative evaluation.
Under what conditions is a shoestring approach used for evaluation?
Limited budget, limited data access, or tight timelines.
What strategies does the shoestring approach emphasize to reduce costs?
Simplifying design; reducing sample size; using economical data-collection methods; leveraging reliable secondary data.
What does the acronym CIPP stand for?
Context, Input, Process, and Product.
In the CIPP model, what core question does Context evaluation answer?
"What should we do?"
In the CIPP model, what core question does Input evaluation answer?
"How should we do it?"
In the CIPP model, what core question does Process evaluation answer?
"Are we doing it as planned?"
In the CIPP model, what core question does Product evaluation answer?
"Did the program work?"
What is the purpose of formative questions in the CIPP model?
To guide program planning and improvement.
What is the purpose of summative questions in the CIPP model?
To assess overall success and inform continuation decisions.

Quiz

According to the CDC’s six‑step evaluation framework, what is the first step?
Key Concepts
Evaluation Frameworks
CDC Six‑Step Framework
CIPP Model
Five‑Tiered Approach
Collective Impact Evaluation
Assessment Methods
Needs Assessment
Implementation Assessment
Impact Evaluation
Cost‑Benefit Analysis
Shoestring Approach
Logic Model (Program Theory)