Strengthening Evidence Use in Professional Learning: Rethinking the Testing Phase

Testing is a critical step in the curriculum-based professional learning (CBPL) purchase journey; demos and pilots are a district’s best chance to assess how well a PL vendor can address their specific needs. It’s a key opportunity to determine curriculum alignment, evaluate potential instructional impact, collect teacher feedback, and examine delivery quality. When testing wraps up, district leaders use these different types of evidence to make a final purchase decision.

However, several barriers limit how thoroughly this evidence can be gathered and evaluated. Time constraints, resource limitations, delivery quality concerns, and the need to secure teacher buy-in can shift priorities during the testing phase. As a result, leaders may focus on evidence of delivery quality, such as presenter-teacher rapport or participant satisfaction. While these elements are important for overall impact, they provide only a partial picture of PL effectiveness. Other critical markers of quality, such as alignment with instructional goals or evidence of student outcomes, can be easy to overlook, especially when they are difficult to measure or interpret.

Research from the EdSignals Studio helps explain how these challenges originate, uncovering many of the practical constraints and behavioral factors that influence how districts engage with evidence during PL pilots. In this article, we share key research insights to help vendors and quality arbiters better support districts during the testing phase of PL adoption.

Limited Time and Resources Prevent Meaningful Testing

Time and cost are two of the most significant barriers to robust PL testing.¹ To accommodate these limitations, districts may skip or abbreviate testing, limiting the amount of context-specific information they can gather about program efficacy. In these cases, testing can become more of a procedural step; pilots are sometimes viewed as a commitment to a single vendor rather than an opportunity to compare options.¹ Decisions may therefore rely less on evidence of instructional impact or local teacher feedback and more on surface-level signals of program quality, such as existing relationships with vendors or recommendations from peer districts.

Addressing this challenge requires rethinking how districts approach testing within real cost and time constraints. Simplifying and diversifying pilot program offerings can help make evidence of PL quality more accessible to vendors and districts.¹ Offering CBPL experiences in various formats, including shorter and cheaper testing options for districts working with narrower time and cost constraints, supports districts with lower bandwidth while increasing the pool of buyers for PL providers.

While compressed, these testing processes still need to generate usable feedback and data. The goal should be to help districts gather context-specific information about CBPL offerings without requiring full-scale pilots. One way vendors can support this is by building evaluation tools into test experiences. For example, vendors could hand out learning assessments during demos so district leaders can determine if teachers are gaining new knowledge from the training. Similarly, vendors could provide simple impact rubrics to help leaders assess whether teachers are applying strategies introduced in the PL in their classrooms. These solutions help ensure evidence is both accessible and easy to use within the constraints of a real decision context.²

Testing Prioritizes Buy-In Over Program Evaluation

Purchasing CBPL is a high-stakes commitment. Implementing full-scale professional learning requires significant investments of district resources and teacher time, and poorly received programs can undermine teacher trust in district leadership.¹ When teachers do not buy into a new PL approach, they are less likely to meaningfully implement new strategies, limiting the program’s impact on classroom instruction. These potential consequences present a significant risk that district leaders want to avoid.

Given the long-term implications for teacher trust and engagement, many district leaders are motivated to secure teacher buy-in during testing.¹ While building support from educators is important, problems emerge when pilots are deployed more as a social validation step than a chance to rigorously test programs, assess curriculum alignment, or objectively evaluate vendor quality. Fueling this tendency is a form of zero-risk bias, where decision-makers prefer choices with certain, safer outcomes over those with uncertain but potentially better outcomes.³ In CBPL purchasing, districts with limited bandwidth to evaluate pilots can be driven to prioritize minimizing the risk of teacher dissatisfaction over the potential gains of rigorously evaluating program effectiveness. The result is that some districts may use pilots primarily to generate support, overlooking a critical opportunity to evaluate program quality.

One way to address this problem is to build trust early so that leaders feel safe evaluating programs more deeply during pilots. This starts at the beginning of the purchase journey, when districts first recognize a need for new PL. By involving teachers in the process of identifying PL goals and scoping available options, leaders can demonstrate their commitment to teacher needs and signal that decisions are being made with teachers’ perspectives in mind.

During testing, vendors can further support this trust-building process by offering free demo sessions or video samples of presentations, allowing district leaders to observe delivery quality before committing to a full pilot.¹ These sample formats should highlight critical CBPL workshop features that are known to support teachers through curriculum changes, such as collaborative teacher reflection opportunities or pedagogical sensemaking.⁴ Quality arbiters can also play a role in these solutions by listing vetted vendors that are willing to provide free samples or demos on resource pages, allowing district leaders to easily identify high-quality, low-risk options.¹

Overall, PL samples can serve as valuable signals that build bidirectional trust between district leaders and teachers. Teachers have the chance to provide their opinions and perspectives at a crucial moment in the decision-making process, and district leaders feel more confident in their ability to secure buy-in when testing begins. When concerns about teacher buy-in are addressed early, pilots are more likely to serve as evaluation opportunities, allowing district leaders to more effectively collect evidence on other quality signals, such as curriculum alignment or instructional fit.

Delivery Experience Outweighs Evidence of Program Effectiveness

Even when districts run pilots with the intent of gathering robust information, decision-makers may place greater emphasis on presenter quality than on signals of program quality.¹ Leaders rightfully want to assess whether presenters have charisma, how well they connect with teachers, and whether vendors can maintain high-quality delivery over time.¹ While these facets of PL quality are important, they are not sufficient indicators of program fit and effectiveness on their own.

While pilots and demos are key opportunities for districts to determine delivery quality, focusing on delivery can make it easier to overlook details about the actual PL content delivered.¹ Decision-makers may move forward with the strongest presenters, even if other CBPL offerings are a better fit for their district’s needs. This pattern can be explained by the halo effect, where a positive impression of a presenter shapes perceptions of the material they’re presenting and the vendor they represent.⁵ In other words, strong presenter traits are generalized to broader assumptions about program effectiveness. Compounding this issue is the availability heuristic, a mental shortcut that causes us to give more weight to information that is immediately available. Since signs of delivery quality are more immediately visible than abstract measures of content quality or instructional impact, they can end up carrying more weight in decision-making.

To encourage more objective decision-making during CBPL testing, districts need accessible ways to evaluate programs beyond delivery quality alone. For example, pre-established evaluation frameworks can serve as guides for decision-makers, encouraging them to assess multiple dimensions of PL quality. When creating these frameworks in-house, districts need to start with a clear understanding of the purpose and intended outcomes of the PL, then identify measures that will help them track changes during testing.⁶ For example, districts could analyze the differences in student math achievement between classes taught by coached teachers and those taught by non-coached teachers. Even simple pre- and post-learning assessments for educators can reveal whether teachers are gaining critical knowledge and skills during PL sessions, helping districts identify strengths and blind spots beyond presenter quality alone.
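As a rough illustration of the two measures mentioned above, the Python sketch below computes an average pre/post gain for teachers and a raw difference in mean scores between coached and non-coached classes. The function names and all numbers are hypothetical and chosen only for illustration; they do not come from the research cited here. A raw difference from a small pilot is a descriptive signal, not proof of impact, since it does not account for how classes differed before the PL began.

```python
from statistics import mean


def mean_gain(pre_scores, post_scores):
    """Average per-teacher change from the pre- to the post-assessment."""
    if len(pre_scores) != len(post_scores):
        raise ValueError("pre and post scores must be paired per teacher")
    return mean(post - pre for pre, post in zip(pre_scores, post_scores))


def group_difference(coached, non_coached):
    """Raw difference in mean scores: coached classes minus non-coached."""
    return mean(coached) - mean(non_coached)


# Hypothetical data, for illustration only.
pre = [4, 5, 6, 5]    # teacher scores before the PL session
post = [6, 7, 7, 8]   # the same teachers' scores afterward
gain = mean_gain(pre, post)

coached_classes = [72, 75, 78]      # class-average math scores
non_coached_classes = [70, 71, 72]
diff = group_difference(coached_classes, non_coached_classes)

print(f"Average teacher gain: {gain}")
print(f"Coached minus non-coached: {diff}")
```

Keeping the two measures as separate functions mirrors the point of an evaluation framework: teacher learning and student outcomes are tracked as distinct dimensions rather than folded into a single impression of the pilot.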

Extracting More Value from PL Testing

The testing phase in CBPL purchasing is an important opportunity for district leaders to assess whether vendors can deliver on their promises and drive real improvements in the classroom. However, not all districts have the time and resources for rigorous testing, limiting the amount of meaningful information that can be gathered during this process. At the same time, because teacher trust is top of mind during testing, district leaders may overlook important but less visible signals of program quality in favor of securing buy-in and assessing the delivery experience. As a result, the testing phase can sometimes struggle to produce reliable evidence about program effectiveness, weakening a district’s ability to make evidence-based purchasing decisions.

To strengthen evidence use during the testing phase, districts and PL providers should work together to design pilots that generate actionable insights. This means providing time-pressed districts with more accessible testing formats, offering early signals of teacher satisfaction so leaders feel comfortable evaluating other aspects of program quality, and establishing clear testing criteria so more abstract forms of evidence are easier to measure and interpret. When testing is designed to generate usable evidence rather than simply validate decisions, districts are better equipped to make strong purchasing decisions that lead to real classroom impact.

Sources

1. EdSignals Studio, Cohort 1, 2025
2. EdSignals Studio, Smarter Demand: Dimensions of Quality in Purchasing Decisions, 2023
3. EdSignals Studio, Pilot Cohort, 2024
4. McNeill, K. L., Affolter, R., Lowell, B. R., Cherbow, K., Gonzalez, C., & Lee, S. (2025). Supporting teachers through curriculum-based professional learning: Shifting teachers’ instructional vision of science to empower student voice. Journal of the Learning Sciences, 34(5), 787–839. https://doi.org/10.1080/10508406.2025.2496362
5. EdSignals Studio, Market Analysis: K-12 Teacher Prep Decision Maps, 2020
6. Foster, E. (June 2025). Evaluating professional learning doesn’t have to be complicated. Learning Forward. https://learningforward.org/journal/measuring-learning/evaluating-professional-learning-doesnt-have-to-be-complicated/
