Open access | Published: 06 October 2023

A whole learning process-oriented formative assessment framework to cultivate complex skills

  • Xianlong Xu 1,
  • Wangqi Shen 2,
  • A.Y.M. Atiquil Islam (ORCID: orcid.org/0000-0002-5430-8057) 1,3 &
  • Yang Zhou 1

Humanities and Social Sciences Communications, volume 10, Article number: 653 (2023)

In the 21st century, cultivating complex skills has become an urgent educational need, especially in vocational training and learning. The widespread lack of formative assessment during complex problem-solving and skill-learning processes limits students’ self-perception of their weaknesses and teachers’ effective monitoring of students’ mastery of complex skills in class. To investigate how to design and carry out formative assessment for the learning of complex skills, a whole learning process-oriented formative assessment framework for complex skills was established. To verify the feasibility and effects of the formative assessment, a controlled experiment involving 35 students majoring in industrial robotics from one of Shanghai’s Technical Institutes was designed. The results indicate that the formative assessment can effectively promote students’ learning of conceptual knowledge and the construction and automation of cognitive schemas, as well as improve students’ competency in the implementation and transfer of complex skills. In addition, the formative assessment does not place an additional cognitive load on students; rather, it optimizes the allocation of mental effort by increasing the proportion of germane cognitive load within the overall cognitive load. The framework can provide methodological support for promoting students’ learning of complex skills.

Introduction

The global, organizational, and technical evolution of the 21st century has created the urgent need for learners to acquire complex skills across all levels of education and training (Maddens et al., 2020 ; OECD, 2018 ). Complex 21st century skills (21CS), such as creativity, complex problem solving, and collaborative learning, are usually characterized and understood as skills that integrate conceptual knowledge, operative technique, and social affective attitude, which can help people solve complex and unstructured problems in real life (Alahmad et al., 2021 ; Jian et al., 2021 ). The cultivation of students’ complex 21CS is an increasing challenge for people learning and working in current and future societies. Assessment of the learning outcomes of complex 21CS has long been highlighted as one of the most challenging issues that educators and researchers currently face (Eseryel et al., 2013 ; Iñiguez-Berrozpe and Boeren, 2020 ; Kerrigan et al., 2021 ).

In recent decades, the role of educational assessment has expanded from the summative display of students’ acquisition of teaching content to formative evaluation during the teaching process. The concept of formative assessment emphasizes the integration of instruction and evaluation. It requires a dynamic, continuous framework for evaluating performance that promotes students’ progress and the mastery of increasingly sophisticated skills, which is important for the cultivation of complex 21CS (Ackermans et al., 2021; Care et al., 2019; Nadolski et al., 2021). However, since complex 21CS involve complex learning of different but interconnected knowledge and competencies, formative evaluation practices in school curricula are relatively scarce (Ackermans et al., 2021; Webb et al., 2018). As a result, key issues remain to be explored in how to evaluate and monitor students’ learning progress in complex contexts, such as how and when formative assessment should be offered (Bhagat and Spector, 2017; Kim, 2021).

Complex skill learning

Learning skills to solve ill-structured complex problems has always been an important issue in the field of educational technology (Thima and Chaijaroen, 2021 ). Complex problem solving is often considered to be the key transferable skill for learners in the 21st century to be competent in multiple working scenarios (Koehler and Vilarinho-Pereira, 2021 ). Mayer ( 1992 ) believes that learners’ experiences with complex problem solving contribute to the formation of their cognitive skills. Merrill ( 2002 a) recommends the Pebble-In-The-Pond design model to provide complex skill training.

Van-Merriënboer and Kester (2014) propose a detailed theoretical explanation for the complex learning process and the composition of complex skills in authentic problem contexts through the Four-Component Instructional Design (4C/ID) model, which was introduced in 2002. The process of learning complex skills is considered the integrative realization of a series of learning objectives, including an integrative mastery of knowledge, skills, and attitudes (Frerejean et al., 2019). The implementation of learning activities is based on decomposed complex skills (called constituent skills) under the premise of understanding the process of problem solving. During the process of cultivating complex skills, learners must invest sufficient effort in the construction of cognitive schemas and achieve rule automation, that is, the automation of cognitive schemas (Melo, 2018). In addition, to avoid placing an excessive cognitive load on learners during learning tasks, the 4C/ID model, grounded in Cognitive Load Theory, scaffolds students’ learning by designing components of supportive information, procedural information, and part-task practice within different tasks (Marcellis et al., 2018).

The 4C/ID model reveals the learning process during complex problem solving, which provides researchers and practitioners with direct and clear teaching approaches (Costa et al., 2021 ; Merrill, 2002 b). This model is widely used in the field of complex learning because it can help learners master complex skills and improve skill transfer (Frerejean et al., 2019 ; Maddens et al., 2020 ; Zhou et al., 2020 ). The instructional effect of 4C/ID on complex skills has been confirmed by multiple studies (Melo and Miranda, 2015 ; Van-Rosmalen et al., 2013 ). Similarly, the authors’ team found in their research focused on Chinese vocational education that applying the 4C/ID model in instructional design can better enhance students’ academic performance (Xu et al., 2019 ). Therefore, our study uses the 4C/ID model to explain the process of learning complex skills.

Formative assessment of complex skill learning

Formative assessment has been widely recognized as a preferred means of facilitating learning and providing diagnostic information for students and teachers (McCallum and Milner, 2021). Formative assessment can evaluate each student comprehensively (Li et al., 2020), which is conducive to stimulating students’ internal drive and improving their engagement (Liu, 2022). Meanwhile, formative assessment provides students with feedback about learning as an external reference for self-reflection, which reduces metacognitive load, helps them adjust and optimize their original knowledge structures, and supports the continued development of high-level thinking skills such as analysis, evaluation, and creation (Le and Cai, 2019). Although it has been proposed that assessment design should be developed together with learning tasks geared toward acquiring complex skills, few studies have practically explored this domain (Frerejean et al., 2019).

Most of the existing research on complex skill evaluation focuses on supporting students’ complex skill learning by applying the 4C/ID model to curriculum design. Summative assessment has been conducted in these studies from the cognitive perspective of skill mastery and cognitive load and the emotional perspective of learning motivation and attitude (Melo and Miranda, 2015; Larmuseau et al., 2019). However, previous studies have gradually revealed the limitations of instructional design based on summative evaluation in promoting the learning of complex skills. Through a study of 171 psychology and educational science students, Larmuseau et al. (2018) found that the setting of learning tasks and procedural information had a minimal effect on students’ learning. As such, subsequent research should examine how to help learners choose their own learning task paths through the collection and evaluation of information in the learning process. Zhou et al.’s (2020) research showed that teachers’ demand-oriented intervention could not effectively narrow the gap in academic performance among students in the class. Ndiaye et al. (2021) call for more in-depth measurement and intervention in the teaching process. In short, these studies illustrate the urgency of formative assessment design and practical research in the field of complex skill learning.

However, current research on the formative assessment of complex skills faces a series of challenges, including the need to construct an assessment framework for specific complex skills, the need to determine how and when to provide formative assessment, and the need to determine the impact of formative assessment on complex skill learning (Bhagat and Spector, 2017; Kim, 2021; Min, 2012). Recently, some researchers have started to focus on the design and implementation of formative assessment and the development of assessment systems. A common method is to provide corrective or cognitive feedback when designing support information for learning tasks. For example, Marcellis et al. (2018) provided real-time corrective feedback for learners through the Lint code analyzer offered by Android Studio; they also provided cognitive feedback for learners by offering lectures on the problem-solving process. Frerejean et al. (2019) described a medical practice course based on the 4C/ID model in a blended learning environment in which peers and educators provided corrective and cognitive feedback upon the completion of tasks to promote students’ self-reflection on rules and problem-solving mental models. Leif et al. (2020) hold that digital learning environments for complex skill learning based on the 4C/ID model should incorporate procedural information that combines operational instruction with corrective feedback so that students can learn rules according to their needs.

Existing studies have put forward some operational modes for designing courses and providing formative assessment in ICT-integrated learning environments. However, the main problem lies in applying formative assessment methods to teaching processes in real classes. On the one hand, learners’ understanding of evaluation indicators is the key factor in determining whether formative evaluation can effectively help them identify and overcome their own weaknesses. The teaching process and assessment should be consistent to ensure that students can successfully solve learning tasks; however, existing instructional and assessment frameworks fail to do this (Larmuseau, 2020; Pardo et al., 2019; Van-Gog et al., 2010). On the other hand, as subject experts conducting evaluation in classroom settings, teachers play an important role in monitoring learners’ complex skill learning processes and identifying students’ problems (Crossouard, 2011; Sulaiman et al., 2020). However, existing studies rarely discuss how to provide teachers with feedback channels for formative assessment results to improve the effectiveness of assessment.

Few studies have explored how to implement formative evaluation and obtain helpful feedback on complex skill learning in class. Frerejean et al. (2021) offer guidance on how to provide formative assessment to cultivate teachers’ differentiated instruction skills. Although that study describes the formative assessment design process for complex skill learning relatively completely in theoretical terms, it does not verify the effectiveness of the formative assessment on students’ cognitive and emotional development.

Research questions

Previous research has mainly focused on how ICT can support the timely provision of feedback. Only a few studies have focused on the whole process of formative assessment intervention during learning processes from the perspective of formative assessment framework construction. Meanwhile, there is a lack of analysis on the effect of formative assessment intervention from multiple points of view. Therefore, to monitor and intervene in complex skill learning processes effectively, we aim to construct a formative assessment framework that focuses on the whole learning process of complex skills. Three research questions have been raised to address the current lack of information in this field:

How could the formative assessment model be established to assess the learning progress dynamically, longitudinally, and comprehensively?

How should formative assessment be designed and administered during the complex skill learning processes?

How could the effect of formative assessment on the complex skill learning process be evaluated and understood?

To address these questions, a controlled experimental study was carried out on the complex skill of “Industrial Robot Programming.” Thirty-five students from one of Shanghai’s Technical Institutes participated in this research.

Whole learning process–oriented formative assessment framework for complex skill learning

The first step in accomplishing a dynamic, longitudinal, and comprehensive assessment of students’ learning is understanding the entire scope of complex skill learning processes. The effective cultivation of students’ complex skills requires a systematic approach that reflects a contemporary understanding of the educational relationships between learning inputs, learning processes, and learning outputs.

Astin (1993) proposed the I-E-O model, which claims that students’ input and environment potentially influence their output. The System Theory developed by Bertalanffy also holds that input influences output (Bertalanffy, 1968; Yanto et al., 2011). Salam (2015) suggested that, in the educational domain, to implement teaching systematically, teachers and researchers should consider input, process, and output and determine the relationships between learning objectives, content, methods, and evaluation. Therefore, combining the 4C/ID theory and the above elements, the key concepts for the whole complex skill learning process are extracted: learning objectives in specialized courses; the learning environment; the construction, automation, and transfer of cognitive schemas; formative evaluation; and the mastery of complex skills and problem solving. These correspond to the input, environment, content, method, process, and output of the learning process, respectively.

Moreover, to reveal students’ concrete learning process, we must ascertain the pathways for complex skill learning. The instructional design method of the spiral curriculum proposed by Bruner (1977) and the Spiral Human Performance Improvement (HPI) Framework proposed by Marker et al. (2014) provided us with inspiration. Bruner indicated that students’ mastery of knowledge often climbs from simple to complex. Marker et al. used spiral channels in their spiral HPI framework to represent the cyclical thinking process that professionals experience when facing complex problems. Based on these ideas, we defined the learning process as the process of developing cognitive schemas supported by a series of learning tasks of increasing difficulty and repetitive practice tasks, during which formative assessment ensures student learning and promotes the cyclical learning process for complex skills. This learning process needs to be carried out smoothly with the support of specific technical environments. Finally, we proposed the Spiral Complex Skill Learning Framework (SCSLF).

As shown in Fig. 1 , the whole process of learning complex skills is composed of learning input, learning environments and processes, and learning output. Learning input refers to learning objectives that have been designed with regard to learners’ characteristics and the learning content involved in the given complex skills. The learning environment and process refer to the cultivation of students’ cognition and attitude regarding the learning situation, learning task, assessment, and technical environment. The learning environment includes learning tasks designed in real and complex problem situations and formative assessment based on the human-computer collaborative assessment environment. In addition, as the process of learning complex skills often involves repeated exercises of constituent skills and the increase of task difficulty, which is consistent with the instructional design of Bruner’s ( 1977 ) spiral curriculum, the spiral channel in the SCSLF is used to represent students’ complete learning path. During the learning process, students acquire complex skills through steps ranging from simple to complex: students repeatedly construct proficient cognitive schemas through learning tasks with increasing difficulty and formative feedback, and they acquire the ability to use and transfer schemas comprehensively. After completing relatively lower-level complex skills by following the above process, students attempt to learn more difficult complex skills until they eventually master them all. Learning output refers to the students’ ability to solve problems in similar situations using the complex skill.

Figure 1: The whole process of learning complex skills.

On this basis, we have determined the development method of formative evaluation tools, the implementation process and approaches of formative assessment, and the effect verification method based on the understanding that the SCSLF provides. The formative assessment design for the whole learning process is shown in Fig. 2 .

Figure 2: The whole learning process–oriented formative assessment process for complex skill learning.

First, since the learning process is an upward spiral that is integrated by schema construction, schema automation, and schema transfer, it is possible to establish learning objectives of constituent skills within each level based on skill hierarchy relationships. Thus, we broke down corresponding task objectives to form a comprehensive and measurable assessment indicator system.

Second, the formative assessment tools were developed. Based on the 4C/ID model, the formative assessment process was integrated into the tasks of schema construction, schema proficiency, and comprehensive application and transfer. According to the assessment indicators system above, the corresponding conceptual knowledge tests, process tests, rule tests, and comprehensive project tests were designed to monitor and promote students’ learning process.

Third, we made recommendations for the environment in which the course would take place. The effective implementation of evaluation tasks, as well as the collection, processing, and feedback of learning process data could be conducted through a human-machine collaborative assessment environment.

Finally, to verify the effectiveness of the formative assessment for the whole complex skill learning process, we carried out a summative evaluation of learners’ academic performance through a comprehensive transfer task of complex skills. Because learners’ cognitive load, learning attitude, and emotional factors also play important roles in the learning process, relevant tests were added to the summative evaluation so that the influence of the formative assessment could be further verified through a comprehensive summative evaluation.

As mentioned above, we explored how to integrate formative assessment into the overall process of students’ complex skill learning in aspects of theory and methodology. We explored the feasibility and means of applying the whole learning process-oriented formative assessment framework for learning real complex skills in the classroom and ascertained its effectiveness in promoting students’ complex skill cultivation.

Industrial Robot Trajectory Programming is a professional course for Mechatronics students at the Shanghai Technical Institute of Electronics & Information (STIEI). In this course, students are required to configure a workstation that can make an industrial robot move according to a specific trajectory in a simulated environment. The environment is created by a computer that mimics a real-world situation.

Freshmen majoring in Mechatronics generally have difficulty understanding and mastering the knowledge and skills related to industrial robot programming and lack practical ability. To improve students’ learning of complex skills, our team spent a semester working with teachers to design a full-task course based on the 4C/ID model. In the previous study, students who received full-task instruction demonstrated better academic performance after the course than those who received traditional lecture-and-exercise-based learning. However, through communication with teachers, we noticed that there were still large learning gaps between application skills and transferable skills. These problems further raised our expectations that formative assessment would improve students’ learning of complex skills. Therefore, we applied the whole learning process-oriented formative assessment framework to the Industrial Robot Trajectory Programming course to improve the learning effect of the whole task-based course.

The experiment and data collection procedure for the study was approved by the Research Ethics Panel of East China Normal University in China. The authors’ team successfully passed the relevant ethics tests. Before conducting the experiment, we conducted verbal recruitment of participants and obtained informed consent from collaborating schools, teachers, and students. Sensitive data collected during the experiment is stored securely in the researchers’ personal database and kept strictly confidential. All data will be completely destroyed within three years after the research is concluded.

Participants

Thirty-five students from two classes in the Mechatronics Division at STIEI participated in the experiment. They attended the Industrial Robot Trajectory Programming course in their second year and had at least two lessons a week of one and a half hours each during the study. Neither class had any experience in programming industrial robots. Before the experiment, we conducted a pre-test on the knowledge concepts related to the course for the two classes; both classes received similarly low scores, with no significant difference between them. We therefore randomly assigned the two classes as the experimental group (16 students) and the control group (19 students). Teachers at STIEI volunteered to participate in the study and were responsible for teaching under the two experimental conditions.

Designing the formative assessment of industrial robot programming skill learning

Constructing a formative assessment indicator system

The first step in achieving comprehensive, dynamic evaluation and monitoring of complex skill learning processes was to construct a formative assessment indicator system for complex skills. In order to extract well-organized, measurable, and systematic evaluation indicators from the learning objectives of the course, it was necessary to divide the learning objectives into different phases.

After consulting researchers and professional teachers who were experts in the field, we defined the learning objective of “Industrial Robot Trajectory Programming” as the ability to skillfully configure a robot trajectory motion workstation in an industrial robot simulation program based on the analysis of customer needs. This included the construction of a robot program framework, the compilation of a main program and other functional programs, and the optimization and debugging of the program to successfully and smoothly solve trajectory and motion problems. The skill hierarchy of Industrial Robot Trajectory Programming is shown in Fig. 3 .

Figure 3: The skill hierarchy of industrial robot trajectory programming.

After dividing recurrent and non-recurrent skills, we designed the formative assessment indicator system based on the requirements of the learning objectives of each constituent skill (see Fig. 4). As can be seen in Table 1, we proposed eighteen specific assessment points. These cover concepts and attitudes related to problem solving, the process of problem solving, the rules of problem solving, and the ability to comprehensively implement and transfer the complex skill. These assessment points indicate the corresponding relationships between formative assessment objectives and key points of assessment.

Figure 4: The formative assessment objectives hierarchy of the industrial robot programming skill.
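To make the structure of such an indicator system concrete, the following is a minimal sketch of how constituent skills and their assessment points could be represented as data. The skill names, point identifiers (written in the style of the KI labels used in the figures), descriptions, and weights are illustrative assumptions, not the actual eighteen points of Table 1.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AssessmentPoint:
    """One measurable key point derived from a constituent-skill objective."""
    point_id: str      # e.g. "KI6" -- identifier style borrowed from the paper's figures
    description: str
    weight: float      # relative weight agreed on by researchers and course experts

@dataclass
class ConstituentSkill:
    """A decomposed (recurrent or non-recurrent) part of the complex skill."""
    name: str
    recurrent: bool
    points: List[AssessmentPoint] = field(default_factory=list)

# Illustrative fragment only -- the real system contains 18 points covering
# concepts/attitudes, problem-solving process, rules, and implementation/transfer.
indicator_system = [
    ConstituentSkill(
        name="Construct robot program framework",
        recurrent=True,
        points=[
            AssessmentPoint("KI6", "Names and orders the main program modules", 0.05),
            AssessmentPoint("KI7", "Declares trajectory targets correctly", 0.05),
        ],
    ),
    ConstituentSkill(
        name="Analyze customer trajectory requirements",
        recurrent=False,
        points=[
            AssessmentPoint("KI1", "Identifies the required motion type from the brief", 0.04),
        ],
    ),
]
```

Organizing the indicators this way keeps the mapping from learning objectives to measurable points explicit, which is the property the assessment point system is meant to guarantee.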

Designing formative assessment tasks

To discern when the assessment should be given and to implement real-time monitoring of learning, we designed different types of learning tasks into which assessment indicators were integrated.

According to the key indicators of formative assessment, four kinds of assessment tests were used to investigate learners’ academic performance: the construction, automation, comprehensive implementation, and transfer competencies of cognitive schema during the learning process. Firstly, the knowledge test was designed to evaluate learners’ concept mastery through objective questions and answers. Secondly, the process test was designed to examine the completeness and accuracy of the learner’s grasp of the problem-solving process in the form of a flow chart. Thirdly, the rule test was used to examine the learner’s ability to use operational skills and related rules to solve practical problems in specific situations. Finally, the test for implementation and transferring examined the learners’ application and transfer ability through completing comprehensive tasks in similar situations.

According to the above assessment test types, four formative assessment tasks and one summative task for the industrial robot programming skill learning were designed, as shown in Table 2 . Task 1 mainly assessed whether students had mastered the process of programming robot trajectory motion according to the specific needs of different customers. Tasks 2 to 4 mainly assessed whether the students had mastered the skills required to create a qualified program framework and to write and optimize the program according to the specific needs of different customers with fixed rules and operating steps. Task 5 mainly assessed whether the students had mastered the analysis and creation of industrial robot trajectory motion programming in similar situations according to the different needs of customers.

After implementing the aforementioned process-oriented assessment, the evaluation results were provided to students individually, and the aggregated statistics were provided to the teacher, in both the experimental and control groups. Teachers offered guidance to individual students or the entire class based on the concepts with higher error rates, incorrect problem-solving procedures, and mistakes made during program execution. This guidance was delivered through direct lectures, the breakdown of problem-solving processes and identification of underlying reasons, and one-on-one operational support, among other methods.

Human-computer collaborative assessment environment

The next problem lay in determining how the assessment should be carried out, specifically, how to collect and process the data acquired from students’ task results and to offer instantaneous feedback to teachers and students. The Complex Skills Automatic Assessment System (CSAAS) is an automatic human-computer collaborative assessment system developed by the authors’ research group to address this problem.

The CSAAS is a computer-based learning environment designed for technical and vocational education and training students in industrial robotics; it presents hypermedia materials, collects and analyzes data, and visualizes the learning process and outcomes. The CSAAS encompasses a behavioral recording/analyzing module, an assessment module, and a feedback module to assess students’ learning trajectories and outcomes. The effect of the CSAAS on students’ complex skill learning behavior has been verified through an 8-week experiment in which students showed high academic performance, perceived usefulness, and satisfaction with the system. The relevant research results have been compiled into an academic article that is currently under review.

As shown in Fig. 5, the CSAAS offers an interface through which students can submit compressed packages, question answers, flow charts, and other content according to the requirements of different tasks. The data acquisition methods of the different measurements are defined according to different aspects of the course content. For example, during the knowledge test and the collection of flow chart data, the results were collected automatically by the computer, while in the rule test, students uploaded their products to the system in a classroom equipped with computers. Accordingly, different data processing methods were adopted: the computer automatically processed the knowledge test data, while the data from the flow chart and rule tests were adjusted and scored manually. After the automated collection and manual processing of the test results, the CSAAS was used to perform statistical analysis on the collected data and to provide specific feedback items for each test based on the formative assessment indicator framework to support students’ schema modification.

Figure 5: The CSAAS interface through which students submit compressed packages, question answers, and flow charts.
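The following is a minimal sketch, under stated assumptions, of the kind of aggregation and feedback step described above: automatically and manually scored results are merged, a score rate is computed per assessment point, and weak points are flagged for teacher feedback. The record format, student and point identifiers, and the 60% threshold (borrowed from the score-rate criterion used later in the results) are illustrative; this is not the actual CSAAS implementation.

```python
import statistics
from collections import defaultdict

# Hypothetical records: (student_id, point_id, score, max_score, source)
# "auto" scores come from computer-marked tests, "manual" scores from
# teacher-adjusted flow chart and rule-test products.
records = [
    ("S01", "KI6", 3, 5, "auto"),
    ("S01", "KI13", 8, 10, "manual"),
    ("S02", "KI6", 5, 5, "auto"),
]

def class_score_rates(records):
    """Aggregate the class-level score rate for each assessment point."""
    by_point = defaultdict(list)
    for _student, point, score, max_score, _source in records:
        by_point[point].append(score / max_score)
    return {point: statistics.mean(rates) for point, rates in by_point.items()}

def feedback_items(rates, threshold=0.6):
    """Flag indicators whose class-level score rate falls below the threshold,
    mirroring the 60% score rate used in the paper as a sign of weak mastery."""
    return sorted(point for point, rate in rates.items() if rate < threshold)

rates = class_score_rates(records)
print(rates)                  # e.g. {'KI6': 0.8, 'KI13': 0.8}
print(feedback_items(rates))  # indicators to prioritize in teacher feedback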

Verifying the effectiveness of whole learning process–oriented formative assessment

Lastly, to verify the effectiveness of the whole learning process–oriented formative evaluation of the robot programming skill, we carried out a summative assessment of complex skills after students had completed the learning processes of all constituent skills. Assigning a comprehensive task involving transfer competencies is usually the most direct and effective way to determine the influence of formative assessment on complex learning. Based on assessment Task 5, students’ acquisition of knowledge concepts, the completeness of their schema construction, and the standardization and proficiency of their schema application were tested in a similar, real robot programming problem setting. Another method was to conduct a longitudinal comparison of students’ personal competencies to verify the effect of formative assessment through the implementation of timely feedback and intervention based on assessment results. The current study analyzes learners’ procedural learning status and final mastery of each aspect of industrial robot programming skills based on the formative assessment indicator system.

In addition, complex learning tasks and process assessment may aggravate students’ cognitive load, resulting in a negative impact on learning (Larmuseau et al., 2019; Marcellis et al., 2018; Melo, 2018). Motivational characteristics such as self-efficacy are also important for understanding students’ learning output and problem solving (Larmuseau et al., 2018; Peranginangin et al., 2019). We therefore administered validated questionnaires to verify the impact of formative assessment on learners’ cognitive load and self-efficacy. The purpose was to explain the influence of the complex skill learning process, under the guidance of the whole learning process–oriented assessment, on learners’ internal emotions and psychological efforts.

Experimental procedures

A total of eight lessons were designed, and two lessons were taught every week (see Fig. 6 ). Since it is impossible to control the complexity and order of complex tasks in class, we referred to Frerejean et al. ( 2021 )’s method which recommends focusing learners’ attention on different aspects of complex skills in a specific order and setting up real and complex tasks with the same scenarios to help learners identify and connect the relationships between different constituent skills. In the experimental group, case tasks were offered during the first and second weeks to help students construct schemas and learn knowledge related to industrial robot programming. Students in the experimental group gradually constructed and automated schemas in lessons 1–8. Tasks 1–4 were completed in the students’ academic context. One week after the end of the course, we administered the summative assessment Task 5 to students in the experimental group. The control group was taught according to the above learning task sequence, but the obtained data were not formatively evaluated, and learning feedback was not provided.

Figure 6: The experimental schedule. A total of eight lessons were designed, with two lessons taught every week.

Instruments

As shown in Table 3, we established an academic performance scale that captured students’ knowledge mastery, schema construction, and rule automation during the learning process, as well as students’ complex skill transfer performance from a comprehensive perspective. The scale consists of the 18 key points of measurement in the assessment point system, further subdivided into 58 specific items. Researchers and course experts jointly determined the content and weight of the evaluation points. Formative assessment Tasks 1–4 each contain their corresponding assessment points (see Table 1 and Table 2). Summative Task 5 involved all 58 assessment points in the students’ learning process and was designed to characterize students’ comprehensive implementation and transfer competencies regarding complex skills. The questions and answers of the formative and summative evaluations were coded by two coders independently under the same evaluation guide. In the early stages, the difficulties of and differences between the tests were discussed, and an agreement was reached to ensure a unified evaluation.

Cognitive load and self-efficacy were measured in the summative assessment. As shown in Table 4, items q1–q14 measured cognitive load on a 5-point Likert scale designed to reflect students’ psychological load and effort during learning in terms of psychological condition, emotional experience, effective teaching, curriculum arrangement, and autonomous learning (Huang et al., 2013; Ji, 2012). The 14 items on the scale were intended to analyze the impact of the whole learning process–oriented assessment framework on students’ complex skill learning through the investigation of students’ intrinsic, extraneous, and germane cognitive load during learning (Chen, 2007; Hadie and Yusoff, 2016). The validity and reliability of the questionnaire were verified with 107 students majoring in robotics from a Shanghai vocational and technical school. According to the SPSS results, the Cronbach’s α coefficients of the intrinsic, extraneous, and germane aspects of cognitive load were 0.976, 0.990, and 0.981, respectively, and the KMO measure of sampling adequacy was 0.893, which is higher than Kaiser and Rice’s (1974) proposed minimally acceptable value of 0.5.

The self-efficacy questionnaire for students using complex skills to program industrial robots was developed based on the research of Peranginangin et al. (2019) and Semilarski et al. (2021). Items q15–q18 measured self-efficacy on a 5-point Likert scale with four items ranging from “totally disagree” to “totally agree.” According to the statistical results from the same 107 students, the Cronbach’s α coefficient of the self-efficacy scale was 0.931, and the KMO measure of sampling adequacy was 0.844.
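As a minimal sketch of how the reported reliability coefficients can be reproduced, the following computes Cronbach’s α from a respondent-by-item score matrix. The simulated 5-point Likert responses are an assumption standing in for the 107 pilot students; the actual reliability analysis in this study was run in SPSS.

```python
import numpy as np

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x n_items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = item_scores.shape[1]
    item_vars = item_scores.var(axis=0, ddof=1)
    total_var = item_scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Simulated Likert responses (1-5) for 107 respondents on a 5-item subscale;
# correlated items are generated by perturbing a shared base rating.
rng = np.random.default_rng(0)
base = rng.integers(1, 6, size=(107, 1))
items = np.clip(base + rng.integers(-1, 2, size=(107, 5)), 1, 5)
print(round(cronbach_alpha(items), 3))
```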

Data analyses

First, to verify the feasibility and effectiveness of the formative assessment framework, the effect of applying formative assessment to complex skill learning was processed and analyzed using descriptive statistics, and line charts were drawn. Second, to understand the specific effects of applying the formative assessment framework, descriptive statistics and t-tests were carried out on academic performance, and area maps were drawn. Finally, descriptive statistics and t-tests were carried out on the questionnaire data regarding cognitive load and self-efficacy to understand students’ cognitive load and self-efficacy in this process.
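The group comparisons reported below are independent-samples t-tests with Cohen’s d as the effect size. The following is a minimal sketch of that analysis in Python with SciPy; the simulated scores for groups of 16 and 19 students are assumptions, since the actual analysis used the students’ task and questionnaire results.

```python
import numpy as np
from scipy import stats

def cohens_d(a: np.ndarray, b: np.ndarray) -> float:
    """Cohen's d using the pooled standard deviation of two independent groups."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

# Simulated transfer-task scores for groups of 16 and 19 students.
rng = np.random.default_rng(1)
experimental = rng.normal(80, 8, size=16)
control = rng.normal(72, 10, size=19)

res = stats.ttest_ind(experimental, control)  # independent-samples t-test
print(f"t = {res.statistic:.2f}, p = {res.pvalue:.3f}, "
      f"d = {cohens_d(experimental, control):.2f}")
```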

Effectiveness of formative assessment for complex skill learning

Figure 7 compares learners’ mastery of knowledge and concepts in the test tasks in both the formative assessment and the summative assessment. During the learning process, the formative assessment and immediate feedback based on teachers’ explanations effectively promoted students’ mastery of knowledge, especially for learners who had poor mastery of robot programming and application concepts and robot command concepts. However, for measurement items q7–q8, which concern the relationship between turning motion parameters and actual robot movement, the formative assessment failed to cultivate learners’ knowledge. This might be due to the study’s time constraints: the summative assessment was carried out immediately after students received feedback on the Test 4 results, which did not give them much time to review the related knowledge.

Figure 7: Comparison of objective formative and summative scores concerning knowledge and concepts in the experimental group.

The promotion effect of the formative assessment framework for learning processes on learners’ complex skill schema construction and automation can be verified by comparing the scores of assessment points in Tasks 1–4 with the corresponding assessment points in Task 5. Data analysis results are shown in Fig. 8 . The results show that in schema construction, the score rates of students for assessment points KI6-10 were less than 60%, which indicates that learners did not establish a complete cognitive schema for robot programming. After receiving cognitive feedback from teachers based on formative assessment, learners’ construction and mastery of overall schema increased dramatically.

Figure 8: Comparison between formative and summative scores concerning schema construction and automation in the experimental group.

The results also show that, in terms of schema automation, although learners had mastered most of the operating skills and working rules of industrial robot programming reasonably well, the application of the complex skill process evaluation framework further improved the learning effect on some relatively weak assessment points. For the assessment points that students had not yet mastered (such as KI13 and KI15), formative assessment offered a better intervention effect; however, it offered a limited effect on constituent skills that learners had already mastered during the learning process. Through the evaluation of learners’ ability to transfer complex skills, we found that learners’ mastery of each assessment indicator generally reached 80%, indicating that the application of the complex skill process evaluation framework can, to a certain extent, promote learners’ schema construction in weak areas of complex skill learning.

Academic success

Task 5 was used to verify students’ implementation and transfer competency of complex skills. According to the results of the independent-samples t-test, there were significant differences in the application of complex skill transfer between the experimental group and the control group (t = 2.56, p = 0.015, Cohen’s d = 0.86). The academic performance of students in the experimental group in schema construction (t = 2.38, p = 0.024, Cohen’s d = 0.87) and schema proficiency (t = 2.083, p = 0.045, Cohen’s d = 0.72) was significantly higher than that of the control group. However, in terms of memorizing knowledge concepts, the students in the experimental group were only slightly better than those in the control group (t = 0.038, p = 0.970).

Figure 9 shows the overall state of students’ academic performance in each group. The results show that within a learning environment that incorporates procedural evaluation, students performed better overall in the learning of complex problem-solving processes and operating rules and could better transfer schemas to solve new problems. Although it was not very beneficial for students’ knowledge mastery, formative assessment was indeed effective in reducing the academic gap between students and promoting students’ complex skill learning.

Figure 9: Comparison of academic performance between the experimental and control groups.

Cognitive load and self-efficacy

We conducted a summative evaluation of learners’ cognitive load to explore the impact of the whole learning process–oriented formative assessment and intervention on students’ internal cognitive processing. The summative evaluation also included the self-efficacy test, which reflected the influence of formative assessment on students’ learning emotions. Table 5 presents the t-test results for cognitive load and self-efficacy between the experimental group and the control group. The germane cognitive load of students in the experimental group was significantly higher than that in the control group (t = 2.19, p = 0.035). For intrinsic load (t = 0.85, p = 0.402), extraneous load (t = 0.95, p = 0.352), and cognitive load as a whole (t = 1.66, p = 0.107), students in the experimental group showed higher levels than those in the control group, but the differences were not significant. Meanwhile, there was no significant difference in self-efficacy (t = 0.727, p = 0.472) between the experimental group and the control group.

The formative assessment of students’ complex skills should be a dynamic and spiral process following the learning tests

To construct a formative evaluation framework for complex skills, the learning process must first be analyzed. Frerejean et al. (2019) indicated that the development of an evaluation framework should be synchronized with the development of learning tasks and used to promote students’ learning process, which has also been shown to be effective in this study. Focusing on the key components proposed by the 4C/ID model, the I-E-O model, spiral curriculum theory, and the HPI framework, the SCSLF was constructed. This framework consists of a spiral learning pathway along which students cultivate complex problem-solving skills. An overview of the instruction and formative assessment process was further proposed to suggest guidelines for instructors. This overview comprehensively covers the source, tools, environment, and verification of the formative assessment.

Design of the formative assessment indicators and tasks

When it comes to the concrete steps of designing and implementing the formative assessment, the practical dimensions and tools were developed on the basis of the 4C/ID model. Our research introduces a process similar to that of Wolterinck et al. (2022), which is based on the 4C/ID model of designing assessments for learning and includes the definition of learning outcomes, instructional methods, and data collection techniques. However, what makes this study unique is its emphasis on creating a learning environment that enables more effective ways of tracking students’ performance in different aspects. The objectives for learning the complex skill were deconstructed based on the learning processes. Focusing on four aspects (objective knowledge mastery, schema construction, schema automation, and schema transfer), the learning objectives were converted into a measurable key point framework for formative assessment. Moreover, a full-task learning model for complex skill learning was proposed, and a series of specific formative assessment tasks were designed according to the above-mentioned assessment point framework. Finally, the acquisition and processing methods for assessment data were analyzed, and assessment tasks were integrated into the whole learning process within the actual learning environment.

Formative assessment could promote students’ mastery of complex problem-solving skills without increasing their cognitive load significantly

To verify the effectiveness of the formative assessment on students’ complex skill learning, formative and summative evaluation data were collected and analyzed. The results show that the formative assessment–assisted learning process effectively improves students’ comprehensive transfer of and ability to implement complex skills. This conclusion echoes Frerejean et al.’s (2019) view that the development of an evaluation framework should be synchronized with the development of learning tasks and used to promote students’ learning process. The formative assessment framework contributes to the monitoring and guiding of students’ performance in the learning process. This study supports the findings of Costa et al. (2021), whose meta-analysis on the effect of the 4C/ID model indicated that continuous assessment and intervention can better consolidate students’ learning process and enhance their mastery of procedural and declarative knowledge. The cognitive load of the students in the experimental group was not significantly different from that of the control group, which shows that the application of formative assessment does not increase students’ learning burden. In particular, the students in the experimental group had a significantly higher germane cognitive load than those in the control group, which is consistent with Larmuseau et al.’s (2019) research results and shows that formative assessment effectively contributes to students’ cultivation of complex skills by promoting their cognitive efforts. In addition, integrating both manual and electronic assessment methods helped teachers integrate formative assessment into their teaching process, which is consistent with Ackermans et al.’s (2021) findings. However, contrary to the findings of Argelagós et al. (2022), our study showed that students’ self-efficacy did not significantly improve as a result of receiving the treatment, even though self-efficacy is believed to be related to students’ actual abilities. This could be attributed to the feedback provided, which not only assisted students in their learning but also exposed them to potential areas of improvement.

To further explore the different aspects of the impact of the formative assessment framework on learners’ complex skill learning, we compared the process and summative scores of students in the experimental group. The results show that the whole formative assessment framework can reveal the overall mastery of complex skills in the learning process as well as promote students’ learning and the construction of mature mental models concerning complex problem-solving processes and operation rules. This finding is in line with Kim’s (2021) research results. In addition, we find that the formative assessment framework can effectively reflect the current state of students’ knowledge mastery, but its effect on promoting knowledge mastery is relatively limited. This is consistent with Van-Rosmalen et al.’s (2013) findings. In terms of professional knowledge of industrial robot programming, there was no significant improvement in students’ accuracy on the final test. This might be because the feedback provided to students after the formative assessments did not match their actual needs: the cognitive feedback teachers gave could not provide effective support for students’ recognition and understanding of weak objective knowledge concepts. Therefore, it is necessary to improve the forms of evaluation and feedback given to students in the process of solving complex problems within the current evaluation system. One possibility would be to offer specific exercises and tasks that assess implicit knowledge concepts within a particular working context so as to improve students’ cognition and mastery of objective knowledge more effectively in related complex fields.

Limitations and suggestions for future research

Although the current research has discussed the possibility of carrying out formative assessment of learners’ complex problem-solving skills, there are still some limitations. First, the study only involved a limited number of students majoring in industrial robotics at one vocational college. The differences in the impact of whole learning process-oriented formative assessment on cognitive skill learning in other complex problem scenarios were not explored. Second, this study only explored the impact of formative assessment on the learning of complex skills from the cognitive perspective. Non-cognitive factors such as emotion and motivation are also important parts of, and challenges in, the assessment process, but they were not included in the scope of our assessment. Lastly, due to time constraints during the experiment, we did not provide students with sufficient time for review after they had learned all the course materials. This might have resulted in inconsistent assessment and feedback effectiveness in the results section.

In future studies, it will be necessary to design and apply the formative assessment framework in a wider range of complex skill scenarios. First, richer measurement and intervention methods must be introduced to make up for the lack of cultivation of learners’ objective knowledge in current instruction based on the 4C/ID model. This may involve using video clips of classroom sessions to facilitate reflective questioning (Wolterinck-Broekhuis et al., 2022), designing effective discussions, and fostering teacher-student interactions to explore conceptual understanding of objective knowledge (Helms-Lorenz and Visscher, 2022). Additionally, emerging technologies such as Virtual Reality (VR) and computer simulation software could be employed to enhance students’ comprehension of abstract foundational knowledge (Solmaz et al., 2023). Students also need to be given sufficient time to digest the results of formative assessment, as this is a crucial step for its effectiveness. Second, during the design of formative assessment systems, researchers must comprehensively consider the cultivation of students’ cognitive schema construction, cognitive load, learning attitude, and professional emotion associated with complex problem-solving situations in order to achieve a more comprehensive evaluation and improvement of students’ knowledge, skills, and attitudes in the learning process (Pardo et al., 2019). Moreover, although we have delved deeply into the complex process of students’ skill acquisition and the specific design of formative assessments, many studies focus on other aspects of assessment for learning. These aspects include teachers’ metacognitive abilities and knowledge base, effective classroom management and diverse teaching methods, and interactions and collaboration between teachers and students (Helms-Lorenz and Visscher, 2022; Wolterinck-Broekhuis et al., 2022). These concepts could also play significant roles in the design and implementation of formative assessments in classroom practices.

Conclusions

In the 21st century, complex skills are becoming increasingly indispensable competencies for students and workers. The instruction of complex skills in schools plays a very important role in cultivating the talent required by the market. In the current complex skill learning process, the lack of systematic formative evaluation limits students’ learning. Therefore, in order to design a feasible and effective whole learning process–oriented formative evaluation system, this study offered a formative evaluation method for complex skill learning based on the 4C/ID model. An intervention study was then carried out among vocational students to verify the effectiveness of the framework. Our study shows that, compared with a complex skill learning process based only on summative evaluation, a learning process integrated with formative assessment significantly improves learners’ performance in the problem-solving process, in implementing problem-solving competencies, and in transferring complex skills to similar situations. Moreover, without adding an additional cognitive burden to students, the formative assessment method effectively increases students’ germane cognitive load, which reflects students’ active efforts during the learning process. This research will play a role in verifying the effectiveness of formative assessment in the learning of complex problems and in guiding the design and development of formative assessment in real and complex learning situations.

Data availability

The datasets generated during and/or analyzed during the current study are available in the Harvard Dataverse repository, https://doi.org/10.7910/DVN/DNYBKB .

Ackermans K, Rusman E, Nadolski R, Specht M, Brand‐Gruwel S (2021) Video‐enhanced or textual rubrics: does the viewbrics’ formative assessment methodology support the mastery of complex (21st century) skills? J Comput Assist Learn 37(3):810–824

Alahmad A, Stamenkovska T, Gyori J (2021) Preparing pre-service teachers for 21st century skills education: a teacher education model. GiLE J Skills Dev 1(1):67–86

Argelagós E, Garcia C, Privado J, Wopereis I (2022) Fostering information problem solving skills through online task-centred instruction in higher education. Comput Educ 180:104433

Astin AW (1993) Assessment for excellence: The philosophy and practice of assessment and evaluation in higher education. Orix Press, New York, USA

Bertalanffy L (1968) General system theory: foundations, development, applications. George Braziller, New York, p 289. http://hdl.handle.net/10822/763002

Bhagat KK, Spector JM (2017) Formative assessment in complex problem-solving domains: the emerging role of assessment technologies. Educ Techno Soc 20(4):312–317

Bruner J (1977) The process of education. Harvard University Press, MA

Care E, Kim H, Vista A, Anderson K (2019) Education system alignment for 21st century skills: focus on assessment. Center for Universal Education at the Brookings Institution, Washington. https://eric.ed.gov/?id=ED592779

Chen Q (2007) Cognitive load theory and its development. Technol Educ 9:15–19

Costa JM, Miranda GL, Melo M (2021) Four-component instructional design (4C/ID) model: a meta-analysis on use and effect. Learn Environ Res 25:445–463. https://doi.org/10.1007/s10984-021-09373-y

Crossouard B (2011) Using formative assessment to support complex learning in conditions of social adversity. Assess Educ 18(1):59–72

Eseryel D, Ifenthaler D, Ge X (2013) Validation study of a method for assessing complex ill-structured problem solving by using causal representations. Educ Technol Res Dev 61(3):443–463

Frerejean J, Van-Merriënboer JJ, Kirschner PA, Roex A, Aertgeerts B, Marcellis M (2019) Designing instruction for complex learning: 4C/ID in higher education. Eur J Educ 54(4):513–524

Frerejean J, Van-Geel M, Keuning T, Dolmans D, Van-Merriënboer JJ, Visscher AJ (2021) Ten steps to 4C/ID: training differentiation skills in a professional development program for teachers. Instr Sci 49:395–418. https://doi.org/10.1007/s11251-021-09540-x

Hadie SN, Yusoff MS (2016) Assessing the validity of the cognitive load scale in a problem-based learning setting. J Taibah Univ Med Sci 11(3):194–202

Helms-Lorenz M, Visscher AJ (2022) Unravelling the challenges of the data-based approach to teaching improvement. Sch Eff Sch Improv 33(1):125–147

Hwang GJ, Yang LH, Wang SY (2013) A concept map-embedded educational computer game for improving students’ learning performance in natural science courses. Comput Educ 69:121–130

Iñiguez-Berrozpe T, Boeren E (2020) 21st century skills for all: Adults and problem solving in technology rich environments. Tech Know Learn 25:929–951

Ji C (2012) Development of cognitive load scale for senior high school students and related research. Nanjing Normal University, Nanjing. https://t.cnki.net/kcms/detail?v=yBz58I57kKfYNhmNOQoKzW4DinFJ48hmyuSIvzsPaRDcfPgtZ2iq98gxOO_a58R3SXGlipnHq7LfhWq-CmJDgfBM9tfgKTxJ3l4OsdtKGrwGlaTICBukUw==&uniplatform=NZKPT

Jian J, Ma P, Zhang X (2021) Active choice: a positive understanding of “High Dropout Rate” of online courses based on the perspective of learner investment theory. E-Educ Res 4:45–52

Kaiser HF, Rice J (1974) Little jiffy, mark IV. Educ Psychol Meas 34(1):111–117

Kerrigan S, Feng S, Vuthaluru R, Ifenthaler D, Gibson D (2021) Network analytics of collaborative problem-solving. In: Ifenthaler D and Sampson DG (eds) Balancing the tension between digital technologies and learning sciences: Cognition and exploratory learning in the digital age. Springer, Cham. https://doi.org/10.1007/978-3-030-65657-7_4

Kim MK (2021) A design experiment on technology‐based learning progress feedback in a graduate‐level online course. Hum Behav Emerg Technol 3(5):1–19. https://doi.org/10.1002/hbe2.308

Koehler AA, Vilarinho-Pereira DR (2021) Using social media affordances to support ill-structured problem-solving skills: Considering possibilities and challenges. Educ Technol Res Dev 71:199–235. https://doi.org/10.1007/s11423-021-10060-1

Larmuseau C, Elen J, Depaepe F (2018) The influence of students’ cognitive and motivational characteristics on students’ use of a 4C/ID-based online learning environment and their learning gain. In: Proceedings of the 8th International Conference on Learning Analytics and Knowledge, pp 171–180. https://doi.org/10.1145/3170358.3170363

Larmuseau C, Coucke H, Kerkhove P, Desmet P, Depaepe F (2019) Cognitive load during online complex problem-solving in a teacher training context. In: EDEN Conference Proceedings, pp 466–474. https://doi.org/10.38069/edenconf-2019-ac-0052

Larmuseau C, Desmet P, Lancieri L, Depaepe F (2019) Investigating the effectiveness of online learning environments for complex learning. https://dl.acm.org/citation.cfm?id=3303772

Larmuseau C (2020) Learning analytics for the understanding of learning processes in online learning environments. Université de Lille, Lille; Katholieke universiteit te Leuven, Leuven. https://hal.archives-ouvertes.fr/tel-03086072/

Le H, Cai L (2019) Flipped classroom research to promote deep learning: from the perspective of cognitive load theory. J Teach Manag 12:92–95

Leif S, Head D, McCreery M, Fiorentini J, Cole LQ (2020) Acclimation by Design: Using 4C/ID to Scaffold Digital Learning Environments. In: Langran E (ed). Proceedings of SITE Interactive 2020 Online Conference. Online: Association for the Advancement of Computing in Education (AACE), pp 513-517. https://www.learntechlib.org/primary/p/218195/

Li Y, Hu Y, Li Y, Ma H, Su K (2020) Application and practice of process evaluation in general zoology curriculum—Take Shanxi Agricultural University as an example. Heilongjiang Anim Sci Vet Med 23:143–146

Liu Y (2022) Application research of process evaluation in nursing skill training teaching of undergraduate nursing students. Chin Nurs Res 36(2):353–355


Maddens L, Depaepe F, Raes A, Elen J (2020) The instructional design of a 4C/ID-inspired learning environment for upper secondary school students’ research skills. Int J Des Learn 11(3):126–147

Marcellis M, Barendsen E, Van-Merriënboer J (2018) Designing a blended course in Android app development using 4C/ID. In: Proceedings of the 18th Koli Calling International Conference on Computing Education Research, pp 1–5. https://doi.org/10.1145/3279720.3279739

Marker A, Villachica SW, Stepich D, Allen DA, Stanton L (2014) An updated framework for human performance improvement in the workplace: the spiral HPI framework. Perform Improv 53(1):10–23

Mayer RE (1992) Thinking, problem solving, cognition (2nd ed). Freeman WH/Times Books/Henry Holt & Co. https://psycnet.apa.org/record/1992-97696-000

McCallum S, Milner MM (2021) The effectiveness of formative assessment: student views and staff reflections. Assess Eval High Educ 46(1):1–16

Melo M, Miranda GL (2015) Learning electrical circuits: the effects of the 4C-ID instructional approach in the acquisition and transfer of knowledge. J Inf Technol Educ 14:313–337

Melo M (2018) The 4C/ID-model in physics education: instructional design of a digital learning environment to teach electrical circuits. Int J Instr 11(1):103–122

Merrill MD (2002a) A pebble-in-the-pond model for instructional design. Perform Improv 41(7):41–46

Merrill MD (2002b) First principles of instruction. Educ Technol Res Dev 50(3):43–59

Min KK (2012) Theoretically grounded guidelines for assessing learning progress: cognitive changes in ill-structured complex problem-solving contexts. Educ Technol Res Dev 60(4):601–622

Nadolski RJ, Hummel HG, Rusman E, Ackermans K (2021) Rubric formats for the formative assessment of oral presentation skills acquisition in secondary education. Educ Technol Res Dev 69(5):2663–2682

Ndiaye Y, Hérold JF, Chatoney M (2021) Applying the 4C/ID-model to help students structure their knowledge system when learning the concept of force in technology. Techne Serien 28(2):260–268. https://journals.oslomet.no/index.php/techneA/article/view/4319

OECD. (2018) PISA 2015 Results in Focus. https://www.oecd.org/pisa/pisa-2015-results-infocus.pdf

Pardo A, Jovanovic J, Dawson S, Gašević D, Mirriahi N (2019) Using learning analytics to scale the provision of personalised feedback. Br J Educ Technol 50(1):128–138

Peranginangin SA, Saragih S, Siagian P (2019) Development of learning materials through PBL with Karo culture context to improve students’ problem-solving ability and self-efficacy. Int Electron J Math Educ 14(2):265–274

Salam A (2015) Input, process and output: System approach in education to assure the quality and excellence in performance. Bangladesh J Med Sci 14(1):1–2


Semilarski H, Soobard R, Rannikmäe M (2021) Promoting students’ perceived self-efficacy towards 21st century skills through everyday life-related scenarios. Educ Sci 11(10):570. https://doi.org/10.3390/educsci11100570

Solmaz S, Kester L, Van-Gerven T (2023) An immersive virtual reality learning environment with CFD simulations: Unveiling the Virtual Garage concept. Educ Inf Technol. https://doi.org/10.1007/s10639-023-11747-z

Sulaiman T, Kotamjani SS, Rahim SSA, Hakim MN (2020) Malaysian public university lecturers’ perceptions and practices of formative and alternative assessments. Int J Learn Teach Educ Res 19(5):379–394

Thima S & Chaijaroen S (2021) The framework for development of the constructivist learning environment model to enhance Ill-structured problem solving in industrial automation system supporting by metacognition. In: International Conference on Innovative Technologies and Learning. Springer, Cham, pp 511–520

Van-Gog T, Sluijsmans DMA, Joosten-Ten BD, Prins JF (2010) Formative assessment in an online learning environment to support flexible on-the-job learning in complex professional domains. Educ Technol Res Dev 58:311–324

Van-Merriënboer JJG, Kester L (2014) The four-component instructional design model: Multimedia principles in environments for complex learning. In: Mayer RE (ed) The Cambridge handbook of multimedia learning. Cambridge University Press, Cambridge, UK, pp 104–148. https://doi.org/10.1017/CBO9781139547369.007

Van-Rosmalen P, Boyle EA, Nadolski R, Van-Der-Baaren J, Fernández-Manjón B, MacArthur E & Star K (2013) Acquiring 21st Century Skills: Gaining insight into the design and applicability of a serious game with 4C-ID. In: International Conference on Games and Learning Alliance. Springer, Cham, pp 327–334. https://doi.org/10.1007/978-3-319-12157-4_26

Webb ME, Prasse D, Phillips M, Kadijevich DM, Angeli C, Strijker A, Carvalho AA, Andresen BB, Dobozy E, Laugesen A (2018) Challenges for IT-enabled formative assessment of complex 21st century skills. Tech Know Learn 23:441–456

Wolterinck C, Poortman C, Schildkamp K & Visscher A (2022) Assessment for Learning: developing the required teacher competencies. Eur J Teach Educ. https://doi.org/10.1080/02619768.2022.2124912

Wolterinck-Broekhuis CHD, Poortman CL, Schildkamp K & Visscher AJ (2022) Teacher professional development in Assessment for Learning. University of Twente. https://doi.org/10.3990/1.9789036554664

Xu X, Zhou Z, Yue Y, Wang M, Gu X (2019) Design and effectiveness of comprehensive learning for complex skills based on 4C/ID model. Chin Educ Technol 10:124–131

Yanto H, Mula JM, Kavanagh MH (2011) Developing students’ accounting competencies using Astin’s IEO model: An identification of key educational inputs based on Indonesian student perspectives. In: Proceedings of the RMIT Accounting Educators’ Conference, 2011: Accounting Education or Educating Accountants?. University of Southern Queensland, Brisbane

Zhou S, Zhang Y, Liu X, Wang Y, Shen X (2020) Empirical research of oral English teaching in primary school based on 4C/ID model. J High Educ Res 1(4):123–130


Acknowledgements

This work was supported by Key Project of Science and Technology Commission of Shanghai Municipality (Project No.: 17DZ2281800) and Fundamental Research Funds for the Central Universities (Project No.: 2020ECNU-HLYT035). XX and WS have equally contributed to this article, and they should be considered as first authors.

Author information

Authors and Affiliations

Department of Education Information Technology, Faculty of Education, East China Normal University, Shanghai, China

Xianlong Xu, A.Y.M. Atiquil Islam & Yang Zhou

Institute of Education, Tsinghua University, Beijing, China

Wangqi Shen

School of Teacher Education, Jiangsu University, Zhenjiang, Jiangsu, China

A.Y.M. Atiquil Islam


Corresponding author

Correspondence to A.Y.M. Atiquil Islam .

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

Informed consent

Informed consent was obtained from all the participants and their teachers with the permission of the institute involved in the study.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article

Xu, X., Shen, W., Islam, A.A. et al. A whole learning process-oriented formative assessment framework to cultivate complex skills. Humanit Soc Sci Commun 10 , 653 (2023). https://doi.org/10.1057/s41599-023-02200-0

Download citation

Received : 27 March 2023

Accepted : 25 September 2023

Published : 06 October 2023

DOI : https://doi.org/10.1057/s41599-023-02200-0



Formative Assessment in Educational Research Published at the Beginning of the New Millennium: Bibliometric Analysis

  • Published: 06 November 2023
  • Volume 7, pages 106–125 (2023)


  • Ataman Karaçöp   ORCID: orcid.org/0000-0001-8939-3725 1 &
  • Tufan İnaltekin   ORCID: orcid.org/0000-0002-3843-7393 1  


Today, many educational reforms emphasize the importance of formative assessment (FA) for effective teaching. Given the importance of FA in education and its perceived scarcity in practice, much more research is clearly needed on this subject. It is equally important that researchers formulate new studies rather than repeat existing ones, and that the visibility of existing effective studies be increased so that in-service teachers can understand the importance of FA. In this context, the ability to review qualified academic research is very valuable. We therefore conducted a bibliometric analysis using VOSviewer to determine the focus of research on formative assessment in education (FAE). The analysis included 447 studies on FA published in the Web of Science (WoS) from 2000 to 2021. We performed a citation analysis, created a co-authorship network map, and analysed author keywords in FA publications. The results showed that Erin M. Furtak has been the most prolific author in FAE in terms of the number of publications, whereas David J. Nicol and Debra Macfarlane-Dick, who co-authored a publication, were the most influential authors in terms of citations. Univ Colorado/USA, with 13 publications, was the most productive institution, and the USA and England were the most productive countries in terms of both publications and citations. Of the 19 documents with over a hundred citations, Nicol and Macfarlane-Dick (2006) and Black and Wiliam (2009) were the most influential. By number of publications and citations, “Assessment & Evaluation in Higher Education” and “Computers & Education” came to the fore among the top five most productive sources. The co-occurrence analysis showed that the terms “assessment,” “mathematics,” and “professional development” co-occurred most often. Moreover, FA, the focus of this bibliometric analysis, had a high degree of co-occurrence with feedback, followed by summative evaluation, self-regulation, and professional development. These results will contribute significantly to the scientific community’s efforts in FA research.
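The co-occurrence maps described above are built from simple pairwise keyword counts across the bibliographic records. The following is a minimal sketch of that counting step; the records are invented examples, not the 447 WoS records analysed in the study, and VOSviewer performs this step (and the subsequent clustering and layout) internally.

```python
# Minimal sketch of keyword co-occurrence counting, the raw input to maps such
# as those produced by VOSviewer. The records below are hypothetical.
from collections import Counter
from itertools import combinations

records = [
    ["formative assessment", "feedback", "self-regulation"],
    ["formative assessment", "professional development", "mathematics"],
    ["formative assessment", "feedback", "higher education"],
]

pair_counts = Counter()
for keywords in records:
    # count each unordered keyword pair once per record
    for a, b in combinations(sorted(set(keywords)), 2):
        pair_counts[(a, b)] += 1

for (a, b), n in pair_counts.most_common(3):
    print(f"{a} -- {b}: {n}")
```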


Data Availability

The data that support the findings of this study are available from https://www.webofscience.com/wos/woscc/summary/506d2eeb-9361-4460-8bf3-2a2bd56ed309-720760d4/date-descending/1 .

Antoniou, P., & James, M. (2014). Exploring formative assessment in primary school classrooms: Developing a framework of actions and strategies. Educational Assessment, Evaluation and Accountability, 26 (2), 153–176. https://doi.org/10.1007/s11092-013-9188-4


Badaluddin, N. A., Lion, M., Razali, S. M., & Khalit, S. I. (2021). Bibliometric analysis of global trends on soil moisture assessment using the remote sensing research study from 2000 to 2020. Water, Air, & Soil Pollution, 232 (7), 1–10. https://doi.org/10.1007/s11270-021-05218-9

Beatty, I. D., & Gerace, W. J. (2009). Technology-enhanced formative assessment: A research-based pedagogy for teaching science with classroom response technology. Journal of Science Education and Technology, 18 (2), 146–162. https://doi.org/10.1007/s10956-008-9140-4

Bell, B., & Cowie, B. (2001). The characteristics of formative assessment in science education. Science Education, 85 (5), 536–553. https://doi.org/10.15663/wje.v7i1.430

Bennett, R. E. (2011). Formative assessment: A critical review. Assessment in Education: Principles, Policy & Practice, 18 (1), 5–25. https://doi.org/10.1080/0969594X.2010.513678

Birenbaum, M., DeLuca, C., Earl, L., Heritage, M., Klenowski, V., Looney, A., & Wyatt-Smith, C. (2015). International trends in the implementation of assessment for learning: Implications for policy and practice. Policy Futures in Education, 13 (1), 117–140. https://doi.org/10.1177/147821031456673

Bjork, S., Offer, A., & Söderberg, G. (2014). Time series citation data: The Nobel Prize in economics. Scientometrics, 98 (1), 185–196. https://doi.org/10.1007/s11192-013-0989-5

Black, P. (2015). Formative assessment–an optimistic but incomplete vision. Assessment in Education: Principles, Policy & Practice, 22 (1), 161–177. https://doi.org/10.1080/0969594X.2014.999643

Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles Policy and Practice, 5 (1), 7–73. https://doi.org/10.1080/0969595980050102

Black, P., & Wiliam, D. (2003). ‘In praise of educational research’: Formative assessment. British Educational Research Journal, 29 (5), 623–637. https://doi.org/10.1080/0141192032000133721

Black, P., & Wiliam, D. (2009). Developing the theory of formative assessment. Educational Assessment, Evaluation and Accountability, 21 (1), 5–31. https://doi.org/10.1007/s11092-008-9068-5

Bozkurt, N. O. (2021). Academics’ opinions regarding the quality of scientific publications and their quality problems. Journal of Higher Education and Science, 11 (1), 128–137. https://doi.org/10.5961/jhes.2021.435

Buchanan, T. (2000). The efficacy of a World-Wide Web mediated formative assessment. Journal of Computer Assisted Learning, 16 (3), 193–200. https://doi.org/10.1046/j.1365-2729.2000.00132.x

Cagasan, L., Care, E., Robertson, P., & Luo, R. (2020). Developing a formative assessment protocol to examine formative assessment practices in the Philippines. Educational Assessment, 25 (4), 259–275. https://doi.org/10.1080/10627197.2020.1766960

Cancino, C., Merigó, J. M., Coronado, F., Dessouky, Y., & Dessouky, M. (2017). Forty years of Computers & Industrial Engineering: A bibliometric analysis. Computers & Industrial Engineering, 113 , 614–629. https://doi.org/10.1016/j.cie.2017.08.033

Cao, Y., Qi, F., Cui, H., & Yuan, M. (2022). Knowledge domain and emerging trends of carbon footprint in the field of climate change and energy use: A bibliometric analysis. Environmental Science and Pollution Research . https://doi.org/10.1007/s11356-022-24756-1

Cisterna, D., & Gotwals, A. W. (2018). Enactment of ongoing formative assessment: Challenges and opportunities for professional development and practice. Journal of Science Teacher Education, 29 (3), 200–222. https://doi.org/10.1080/1046560X.2018.1432227

Cizek, G. J., Andrade, H. L., & Bennett, R. E. (2019). Formative assessment: History, definition, and progress. In  Handbook of formative assessment in the disciplines  (pp. 3–19). Routledge.

Coffey, J. E., Hammer, D., Levin, D. M., & Grant, T. (2011). The missing disciplinary substance of formative assessment. Journal of Research in Science Teaching, 48 (10), 1109–1136. https://doi.org/10.1002/tea.20440

Correia, C. F., & Harrison, C. (2020). Teachers’ beliefs about inquiry-based learning and its impact on formative assessment practice. Research in Science & Technological Education, 38 (3), 355–376. https://doi.org/10.1080/02635143.2019.1634040

Cowie, B., & Bell, B. (1999). A model of formative assessment in science education. Assessment in Education: Principles, Policy & Practice, 6 (1), 101–116. https://doi.org/10.1080/09695949993026

De Backer, F., Van Avermaet, P., & Slembrouck, S. (2017). Schools as laboratories for exploring multilingual assessment policies and practices. Language and Education, 31 (3), 217–230. https://doi.org/10.1080/09500782.2016.1261896

DeLuca, C. (2012). Preparing teachers for the age of accountability: Toward a framework for assessment education. Action in Teacher Education, 34 (5–6), 576–591. https://doi.org/10.1080/01626620.2012.730347

DeLuca, C., Valiquette, A., Coombs, A., LaPointe-McEwan, D., & Luhanga, U. (2018). Teachers’ approaches to classroom assessment: A large-scale survey. Assessment in Education: Principles, Policy & Practice, 25 (4), 355–375. https://doi.org/10.1080/0969594X.2016.1244514

Dini, V., Sevian, H., Caushi, K., & Orduña Picón, R. (2020). Characterizing the formative assessment enactment of experienced science teachers. Science Education, 104 (2), 290–325. https://doi.org/10.1002/sce.21559

Đorić, B., Lambić, D., & Jovanović, Ž. (2021). The use of different simulations and different types of feedback and students’ academic performance in physics. Research in Science Education, 51 (5), 1437–1457. https://doi.org/10.1007/s11165-019-9858-4

Double, K. S., McGrane, J. A., & Hopfenbeck, T. N. (2020). The impact of peer assessment on academic performance: A meta-analysis of control group studies. Educational Psychology Review, 32 (2), 481–509. https://doi.org/10.1007/s10648-019-09510-3

Earl, L. M. (2003). Assessment as learning: Using classroom assessment to maximize student learning . Corwin Press.


Franceschini, F., Maisano, D., & Mastrogiacomo, L. (2015). Influence of omitted citations on the bibliometric statistics of the major Manufacturing journals. Scientometrics, 103 (3), 1083–1122. https://doi.org/10.1007/s11192-015-1583-9

Furtak, E. M., Circi, R., & Heredia, S. C. (2018). Exploring alignment among learning progressions, teacher-designed formative assessment tasks, and student growth: Results of a four-year study. Applied Measurement in Education, 31 (2), 143–156. https://doi.org/10.1080/08957347.2017.1408624

Furtak, E. M., & Heredia, S. C. (2014). Exploring the influence of learning progressions in two teacher communities. Journal of Research in Science Teaching, 51 (8), 982–1020. https://doi.org/10.1002/tea.21156

Furtak, E. M., Kiemer, K., Circi, R. K., Swanson, R., de León, V., Morrison, D., & Heredia, S. C. (2016). Teachers’ formative assessment abilities and their relationship to student learning: Findings from a four-year intervention study. Instructional Science, 44 (3), 267–291. https://doi.org/10.1007/s11251-016-9371-3

Furtak, E. M., Morrison, D. L., & Kroog, H. (2014). Investigating the link between learning progressions and classroom assessment. Science Education , 98 (4), 640–673.

Gijbels, D., & Dochy, F. (2006). Students’ assessment preferences and approaches to learning: Can formative assessment make a difference? Educational Studies, 32 (4), 399–409. https://doi.org/10.1080/03055690600850354

Giménez-Espert, M. D. C., & Prado-Gascó, V. J. (2019). Bibliometric analysis of six nursing journals from the Web of Science, 2012–2017. Journal of Advanced Nursing, 75 (3), 543–554. https://doi.org/10.1111/jan.13868

Gomez, C. J., Herman, A. C., & Parigi, P. (2022). Leading countries in global science increasingly receive more citations than other countries doing similar research. Nature Human Behaviour, 6 , 919–929. https://doi.org/10.1038/s41562-022-01351-5

Gotwals, A. W. (2018). Where are we now? Learning progressions and formative assessment. Applied Measurement in Education, 31 (2), 157–164. https://doi.org/10.1080/08957347.2017.1408626

Grob, R., Holmeier, M., & Labudde, P. (2021). Analysing formal formative assessment activities in the context of inquiry at primary and upper secondary school in Switzerland. International Journal of Science Education, 43 (3), 407–427. https://doi.org/10.1080/09500693.2019.1663453

Guo, W. Y., & Yan, Z. (2019). Formative and summative assessment in Hong Kong primary schools: Students’ attitudes matter. Assessment in Education: Principles, Policy & Practice, 26 (6), 675–699. https://doi.org/10.1080/0969594X.2019.1571993

Hartmeyer, R., Stevenson, M. P., & Bentsen, P. (2018). A systematic review of concept mapping-based formative assessment processes in primary and secondary science education. Assessment in Education: Principles, Policy & Practice, 25 (6), 598–619. https://doi.org/10.1080/0969594X.2017.1377685

Havnes, A., Smith, K., Dysthe, O., & Ludvigsen, K. (2012). Formative assessment and feedback: Making learning visible. Studies in Educational Evaluation, 38 (1), 21–27. https://doi.org/10.1016/j.stueduc.2012.04.001

Heredia, S. C., Furtak, E. M., Morrison, D., & Renga, I. P. (2016). Science teachers’ representations of classroom practice in the process of formative assessment design. Journal of Science Teacher Education, 27 (7), 697–716. https://doi.org/10.1007/s10972-016-9482-3

Heritage, M. (2007). Formative assessment: What do teachers need to know and do? Phi Delta Kappan, 89 (2), 140–145.

Hernández-Torrano, D., & Ibrayeva, L. (2020). Creativity and education: A bibliometric mapping of the research literature (1975–2019). Thinking Skills and Creativity, 35 , 100625. https://doi.org/10.1016/j.tsc.2019.100625

Hondrich, A. L., Hertel, S., Adl-Amini, K., & Klieme, E. (2016). Implementing curriculum-embedded formative assessment in primary school science classrooms. Assessment in Education: Principles, Policy & Practice, 23 (3), 353–376. https://doi.org/10.1080/0969594X.2015.1049113

Hopster-den Otter, D., Wools, S., Eggen, T. J., & Veldkamp, B. P. (2017). Formative use of test results: A user’s perspective. Studies in Educational Evaluation, 52 , 12–23. https://doi.org/10.1016/j.stueduc.2016.11.002

Hou, J., Yang, X., & Chen, C. (2018). Emerging trends and new developments in information science: A document co-citation analysis (2009–2016). Scientometrics, 115 (2), 869–892. https://doi.org/10.1007/s11192-018-2695-9

Hwang, G. J., & Chang, H. F. (2011). A formative assessment-based mobile learning approach to improving the learning attitudes and achievements of students. Computers & Education, 56 (4), 1023–1031. https://doi.org/10.1016/j.compedu.2010.12.002

Irons, A., & Elkington, S. (2021). Enhancing learning through formative assessment and feedback. Routledge . https://doi.org/10.4324/9781138610514

James, M. (2006). Assessment, teaching and theories of learning. In J. Gardner (Ed.), Assessment and learning (pp. 47–60). Sage.

Kasemodel, M. G. C., Makishi, F., Souza, R. C., & Silva, V. L. (2016). Following the trail of crumbs: A bibliometric study on consumer behavior in the Food Science and Technology field. International Journal of Food Studies, 5 (1), 73–83. https://doi.org/10.7455/ijfs/5.1.2016.a7

Kelley, K., Clark, B., Brown, V., & Sitzia, J. (2003). Good practice in the conduct and reporting of survey research. International Journal for Quality in Health Care , 15 (3), 261–266. https://doi.org/10.1093/intqhc/mzg031

Klenowski, V. (2009). Assessment for learning revisited: An Asia-Pacific perspective. Assessment in Education: Principles, Policy & Practice, 16 (3), 263–268. https://doi.org/10.1080/09695940903319646

Kwon, S. K., Lee, M., & Shin, D. (2017). Educational assessment in the Republic of Korea: Lights and shadows of high-stake exam-based education system. Assessment in Education: Principles, Policy & Practice, 24 (1), 60–77. https://doi.org/10.1080/0969594X.2015.1074540

Lee, H., Chung, H. Q., Zhang, Y., Abedi, J., & Warschauer, M. (2020). The effectiveness and features of formative assessment in US K-12 education: A systematic review. Applied Measurement in Education, 33 (2), 124–140. https://doi.org/10.1080/08957347.2020.1732383

Leydesdorff, L., & Wagner, C. S. (2008). International collaboration in science and the formation of a core group. Journal of Informetrics, 2 (4), 317–325.

Mapplebeck, A., & Dunlop, L. (2021). Oral interactions in secondary science classrooms: A grounded approach to identifying oral feedback types and practices. Research in Science Education, 51 (2), 957–982. https://doi.org/10.1007/s11165-019-9843-y

Marshall, B., & Jane Drummond, M. (2006). How teachers engage with assessment for learning: Lessons from the classroom. Research Papers in Education, 21 (02), 133–149. https://doi.org/10.1080/02671520600615638

McMillan, J. H., Venable, J. C., & Varier, D. (2013). Studies of the Effect of Formative Assessment on Student Achievement: So Much More Is Needed. Practical Assessment, Research & Evaluation, 18 (2), 1–15. https://doi.org/10.7275/tmwm-7792

Milfont, T. L., & Page, E. (2013). A bibliometric review of the first thirty years of the Journal of Environmental Psychology. Psyecology, 4 (2), 195–216. https://doi.org/10.1080/21711976.2013.10773866

National Research Council. (2014). Developing assessments for the next generation science standards. Washington D.C.: National Academies Press.

Nicol, D. J., & Macfarlane-Dick, D. (2006). Formative assessment and self-regulated learning: A model and seven principles of good feedback practice. Studies in Higher Education, 31 (2), 199–218. https://doi.org/10.1080/03075070600572090

Pan, W., Jian, L., & Liu, T. (2019). Grey system theory trends from 1991 to 2018: A bibliometric analysis and visualization. Scientometrics, 121 (3), 1407–1434. https://doi.org/10.1007/s11192-019-03256-z

Petrović, J., Pale, P., & Jeren, B. (2017). Online formative assessments in a digital signal processing course: Effects of feedback type and content difficulty on students learning achievements. Education and Information Technologies, 22 (6), 3047–3061. https://doi.org/10.1007/s10639-016-9571-0

Pryor, J., & Crossouard, B. (2008). A socio-cultural theorisation of formative assessment. Oxford Review of Education, 34 (1), 1–20. https://doi.org/10.1080/03054980701476386

Radhakrishnan, S., Erbis, S., Isaacs, J. A., & Kamarthi, S. (2017). Novel keyword co-occurrence network-based methods to foster systematic reviews of scientific literature.  PloS one ,  12 (3). https://doi.org/10.1371/journal.pone.0172778

Ruiz-Primo, M. A., & Furtak, E. M. (2007). Exploring teachers’ informal formative assessment practices and students’ understanding in the context of scientific inquiry. Journal of Research in Science Teaching, 44 (1), 57–84. https://doi.org/10.1002/tea.20163

Rushton, A. (2005). Formative assessment: A key to deep learning? Medical Teacher, 27 (6), 509–513. https://doi.org/10.1080/01421590500129159

Sadler, D. R. (1998). Formative assessment: Revisiting the territory. Assessment in Education: Principles, Policy & Practice, 5 (1), 77–84. https://doi.org/10.1080/0969595980050104

Stiggins, R. (2005). From formative assessment to assessment for learning: A path to success in standards-based schools. Phi Delta Kappan, 87 (4), 324–328.

Sudakova, N. E., Savina, T. N., Masalimova, A. R., Mikhaylovsky, M. N., Karandeeva, L. G., & Zhdanov, S. P. (2022). Online formative assessment in higher education: Bibliometric analysis. Education Sciences, 12 (3), 209. https://doi.org/10.3390/educsci12030209

Tanner, W., Akbas, E., & Hasan, M. (2019, December). Paper recommendation based on citation relation. In  2019 IEEE international conference on big data (big data)  (pp. 3053–3059). IEEE.

Taras, M. (2005). Assessment–summative and formative–some theoretical reflections. British Journal of Educational Studies, 53 (4), 466–478. https://doi.org/10.1111/j.1467-8527.2005.00307.x

Tierney, R. D. (2014). Fairness as a multifaceted quality in classroom assessment. Studies in Educational Evaluation, 43 , 55–69. https://doi.org/10.1016/j.stueduc.2013.12.003

Torrance, H., & Pryor, J. (2001). Developing formative assessment in the classroom: Using action research to explore and modify theory. British Educational Research Journal, 27 (5), 615–631. https://doi.org/10.1080/01411920120095780

Triantafillou, E., Pomportsis, A., & Demetriadis, S. (2003). The design and the formative evaluation of an adaptive educational system based on cognitive styles. Computers & Education, 41 (1), 87–103. https://doi.org/10.1016/S0360-1315(03)00031-9

Van der Kleij, F. M. (2019). Comparison of teacher and student perceptions of formative assessment feedback practices and association with individual student characteristics. Teaching and Teacher Education, 85 , 175–189. https://doi.org/10.1016/j.tate.2019.06.010

Van Eck, N., & Waltman, L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics, 84 (2), 523–538. https://doi.org/10.1007/s11192-009-0146-3

Van Eck, N. J., & Waltman, L. (2012). VOSviewer manual: Manual for VOSviewer version 1.5.2. Available at: https://www.vosviewer.com/documentation/Manual_VOSviewer_1.5.2.pdf (accessed 25 January 2023).

Van Eck, N. J., & Waltman, L. (2014). Visualizing bibliometric networks. In  Measuring scholarly impact  (pp. 285–320). Springer, Cham. https://doi.org/10.1007/978-3-319-10377-8_13

Van Eck, N. J., & Waltman, L. (2017). Citation-based clustering of publications using CitNetExplorer and VOSviewer. Scientometrics, 111 (2), 1053–1070. https://doi.org/10.1007/s11192-017-2300-7

Van Leeuwen, T. N., Visser, M. S., Moed, H. F., Nederhof, T. J., & Van Raan, A. F. J. (2003). The Holy Grail of science policy: Exploring and combining bibliometric tools in search of scientific excellence. Scientometrics, 57 (2), 257–280.

Vickerman, P. (2009). Student perspectives on formative peer assessment: An attempt to deepen learning? Assessment & Evaluation in Higher Education, 34 (2), 221–230. https://doi.org/10.1080/02602930801955986

Webb, M. (2014). Beginning teacher education and collaborative formative e-assessment. In Approaches to assessment that enhance learning in higher education (pp. 117–138). Routledge.

Weurlander, M., Söderberg, M., Scheja, M., Hult, H., & Wernerson, A. (2012). Exploring formative assessment as a tool for learning: Students’ experiences of different methods of formative assessment. Assessment & Evaluation in Higher Education, 37 (6), 747–760. https://doi.org/10.1080/02602938.2011.572153

Wiliam, D. (2013). Assessment: The bridge between teaching and learning. Voices from the Middle, 21 (2), 15–20.

Wilkie, B., & Liefeith, A. (2022). Student experiences of live synchronised video feedback in formative assessment. Teaching in Higher Education, 27 (3), 403–416. https://doi.org/10.1080/13562517.2020.1725879

Wylie, E. C., & Lyon, C. J. (2020). Developing a formative assessment protocol to support professional growth. Educational Assessment, 25 (4), 314–330. https://doi.org/10.1080/10627197.2020.1766956

Xiao, Y., & Yang, M. (2019). Formative assessment and self-regulated learning: How formative assessment supports students’ self-regulation in English language learning. System, 81 , 39–49. https://doi.org/10.1016/j.system.2019.01.004

Yan, Z., & Cheng, E. C. K. (2015). Primary teachers’ attitudes, intentions and practices regarding formative assessment. Teaching and Teacher Education, 45 , 128–136. https://doi.org/10.1016/j.tate.2014.10.002

Yorke, M. (2003). Formative assessment in higher education: Moves towards theory and the enhancement of pedagogic practice. Higher Education, 45 (4), 477–501. https://doi.org/10.1023/A:1023967026413

Ysenbaert, J., Van Houtte, M., & Van Avermaet, P. (2020). Assessment policies and practices in contexts of diversity: Unravelling the tensions. Educational Assessment, Evaluation and Accountability, 32 (2), 107–126. https://doi.org/10.1007/s11092-020-09319-7

Zeng, R., & Chini, A. (2017). A review of research on embodied energy of buildings using bibliometric analysis. Energy and Buildings, 155 , 172–184. https://doi.org/10.1016/j.enbuild.2017.09.025

Zhang, Y., & Wang, P. (2022). Twenty years’ development of teacher identity research: A bibliometric analysis.  Frontiers in Psychology ,  12 . https://doi.org/10.3389/fpsyg.2021.783913


Author information

Authors and Affiliations

Department of Mathematics and Science Education, Kafkas University, Kars, 36100, Turkey

Ataman Karaçöp & Tufan İnaltekin


Corresponding author

Correspondence to Ataman Karaçöp .

Ethics declarations

Conflict of interest

The author declares no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Karaçöp, A., İnaltekin, T. Formative Assessment in Educational Research Published at the Beginning of the New Millennium: Bibliometric Analysis. J Form Des Learn 7 , 106–125 (2023). https://doi.org/10.1007/s41686-023-00081-9

Download citation

Accepted : 18 October 2023

Published : 06 November 2023

Issue Date : December 2023

DOI : https://doi.org/10.1007/s41686-023-00081-9


Keywords
  • Formative assessment
  • Bibliometric analysis
  • Web of Science

Original research article

The effect of a formative assessment practice on student achievement in mathematics


  • 1 Department of Science and Mathematics Education, Umeå University, Umeå, Sweden
  • 2 Umeå Mathematics Education Research Centre (UMERC), Umeå University, Umeå, Sweden

Research has shown that formative assessment can enhance student learning. However, it is conceptualised and implemented in different ways, and its effects on student achievement vary. A need has been identified for experimental studies that carefully describe both the characteristics of implemented formative assessment practices and their impact on student achievement. We examined the effects on student achievement of changes in the formative assessment practices of a random sample of 14 secondary school mathematics teachers after a professional development programme. This study describes the practices implemented and students’ achievement as measured by pre-tests and post-tests. We found no significant differences in achievement on the post-test, after controlling for pre-test scores, between the intervention group and the control group, and no significant correlation between the number of formative assessment activities implemented and the post-test scores (controlled for the pre-test scores). We discuss characteristics of formative assessment implementations that may be critical for enhancing student achievement.
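As an illustration of the second analysis mentioned above — relating the number of implemented formative assessment activities to post-test scores while controlling for pre-test scores — the following is a minimal sketch of a partial correlation computed by residualising both variables on the pre-test. All numbers are invented, the sketch only assumes NumPy is available, and it is not the authors' analysis code.

```python
# Minimal sketch: partial correlation between activity counts and post-test
# scores, controlling for pre-test scores, via residualisation. Data invented.
import numpy as np

pre        = np.array([42.0, 38.0, 45.0, 50.0, 36.0, 41.0, 47.0, 39.0])
post       = np.array([55.0, 48.0, 58.0, 62.0, 46.0, 53.0, 59.0, 50.0])
activities = np.array([3, 5, 2, 6, 1, 4, 7, 2])  # hypothetical counts per teacher

def residualise(y, x):
    """Residuals of y after a simple linear regression on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

r = np.corrcoef(residualise(post, pre), residualise(activities, pre))[0, 1]
print(f"partial correlation (post, activities | pre): {r:.2f}")
```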

1. Introduction

In their seminal review of the effects of formative assessment Black and Wiliam (1998) concluded that it can significantly improve student achievement. Since then there has been a large increase in the number of research publications on formative assessment 1 ( Baird et al., 2014 ; Hirsh and Lindberg, 2015 ). However, this literature shows a large variation in the size of the effects of formative assessment, and some interventions with formative assessment produced no effects at all on student achievement ( Bennett, 2011 ; Kingston and Nash, 2011 ; Briggs et al., 2012 ).

1.1. Formative assessment

Differences in the effects of formative assessment on student achievement may depend on how formative assessment is conceptualised. There has been considerable variation in how formative assessment is conceptualised and implemented, and some of the common conceptualisations of formative assessment were described by Baird et al. (2014).

The following widely cited definition by Black and Wiliam (2009) encompasses several of these conceptualisations of formative assessment:

Practice in a classroom is formative to the extent that evidence about student achievement is elicited, interpreted, and used by teachers, learners, or their peers, to make decisions about the next steps in instruction that are likely to be better, or better founded, than the decisions they would have taken in the absence of the evidence that was elicited. (p. 9)

The heart of this conceptualisation is the use of assessment information to adapt teaching and learning to the learning needs identified through the assessment. Thus, the ‘big idea’ should permeate all work with formative assessment. This idea clarifies that evidence about student learning needs to be collected, interpreted and used by teachers and learners to decide on the next steps in instruction. The term ‘instruction’ is used in the sense described by Black and Wiliam (2009) to include both teaching and learning, including “any activity that is intended to create learning” ( Black and Wiliam, 2009 , p. 10). Thus, formative assessment “is concerned with the creation of, and capitalization upon, ‘moments of contingency’ in instruction for the purpose of the regulation of learning processes” ( Black and Wiliam, 2009 , p. 10). Such moments may be created by teachers or students through planned assessment activities, and can result from using any kind of assessment procedure and artefact (e.g., tests, informal observations or dialogue) to reveal student knowledge and skills. These moments may also be noticed and acknowledged during learning activities in a lesson.

This definition affords several different approaches to formative assessment. In one approach, formative assessment is viewed as a process in which teachers assess students’ learning to provide feedback to students or modify instructional activities to better meet students’ learning needs. Two other approaches emphasise the importance of the students’ proactive participation in the formative assessment processes. In one of these approaches the students act as support for each other’s learning through peer assessment and peer-feedback, and suggest to their peers ways to reach their learning goals. In the other approach focus is on the students as self-regulated learners who use self-assessment and actions based on that self-assessment to reach their own learning goals. All three of these approaches may also include a focus on helping students to understand the learning goals. It is also possible to combine these three approaches and integrate them into a unity, which may be seen as a fourth approach to formative assessment. The emphasis on planned assessment processes, or on information gathered about student learning from informal day-to-day activities such as observations or dialogue may also vary between and within approaches.

The definition by Black and Wiliam (2009) was operationalised by Wiliam and Thompson (2008) in a form that facilitates the learning, and practical use, of formative assessment in the classroom, and this framework was used in the analysis of teacher practices in the present study. Wiliam and Thompson described formative assessment as a practice based on adherence to the ‘big idea’ of using evidence about student learning to adjust instruction to better meet the students’ learning needs, and the use of the following five key strategies (KS) ( Wiliam and Thompson, 2008 ):

KS 1 . Clarifying, sharing and understanding learning intentions and criteria for success.
KS 2 . Engineering effective classroom discussions, questions and tasks that elicit evidence of learning.
KS 3 . Providing feedback that moves learners forward.
KS 4 . Activating students as instructional resources for one another.
KS 5 . Activating students as the owners of their own learning.

The first key strategy emphasises the importance of teachers and students sharing an understanding of the learning goals. The second stresses the teacher’s role in collecting evidence about student learning that can both form the basis of feedback to help meet students’ learning needs and also to make better-informed decisions about how to continue with and adapt instruction. The third key strategy is providing such feedback and instructional activities that improve students’ learning. The fourth and fifth strategies recognise and emphasise the roles of both teachers and students as active agents in carrying out the processes involved in the ‘big idea’. For example, students may assess their peers’ work and provide feedback (KS 4) and also, as self-regulated learners (KS 5) assess their own performances and decide how to take the next steps in their learning. The role of the teacher is to support the students’ development of these skills and to motivate the use of these skills in practice.

The framework does not posit that any particular activities are required to carry out the key strategies, but some classroom activities may contribute more than others to the purposes of the big idea and to each key strategy. When all the strategies are used together as an inherent part of a unified classroom practice, they can support each other in facilitating student engagement and learning. For example, the students’ involvement as proactive agents in the formative assessment processes as peer-assessors and self-regulated learners is facilitated by both teacher’s and students’ engagement in attaining a common interpretation of learning goals and success criteria. Frequent assessment of students’ knowledge and skills, and valid interpretations of their responses, would facilitate the possibilities of providing appropriate feedback and instruction that often meet the students’ learning needs.

Differences in formative assessment practices may also be due not only to differences in conceptualisation, but also to differences in the implementation of each approach, and these differences may be quantitative or qualitative. Quantitatively, teachers may, for example, gather information about student learning and adapt their instruction or provide individual feedback to students several times each lesson, each month or each term. Qualitatively, some questions are better than others for capturing relevant student knowledge; some interpretations and inferences based on student responses may be more valid than others; some feedback and instructional modifications may be better adapted than others to the learning needs identified in the assessment. Formative assessment practices may also reflect different foundational principles. For example, some scholars, like Marshall and Drummond (2006), have described the foundational principle of formative assessment (or assessment for learning) as the promotion of student autonomy. These scholars made a distinction between practices that capture the essence of this principle (i.e., practices that follow the ‘spirit’ of formative assessment) and procedures that do not embody this principle (procedures that adhere to the ‘letter’ of formative assessment). Several studies have found that many teachers focus on teacher-centred practices in which the teacher is the proactive agent in the formative assessment processes, at the expense of promoting student autonomy, even though such a focus was not in accordance with the conceptualisation of the formative assessment meant to be implemented (Jönsson et al., 2015; Wylie and Lyon, 2015).

1.2. Effects of formative assessment on student achievement

Several research reviews that include many different studies of each of the first three approaches have shown that all of these approaches to formative assessment can improve student achievement, and that the sizes of the effects vary substantially between individual studies taking a given approach. Starting with the first approach, several research reviews have looked at studies investigating the effects of teachers’ feedback to students. Hattie and Timperley (2007) reported a mean effect size of d = 0.8, with effect sizes in individual studies varying between 0 and 1.2. Wisniewski et al. (2020) found an average effect size of 0.5, and they too reported notable variability of the effects between the studies included in the review. Koenka et al. (2021) found an average effect size of 0.25 when comparing feedback given as grades with no feedback, and an average effect size of 0.32 when comparing feedback provided as comments with feedback given as grades. Another review, by Shute (2008), analysed earlier reviews on the effects of feedback on student achievement and concluded that the effects varied from negative to very large and positive. Yeh (2009) reviewed four studies on the effectiveness of frequent use of small computer-based tests to provide feedback and give differentiated instruction to students; the effects on student achievement in the analysed studies varied between 0.3 and 0.4.
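The effect sizes cited in this section are standardized mean differences. As a point of reference, the following is a minimal sketch of how Cohen's d is computed from the scores of a treatment and a control group; the scores are invented for illustration and are not data from any of the cited studies.

```python
# Minimal sketch of the standardized mean difference (Cohen's d) underlying the
# effect sizes discussed above; both groups' scores are invented.
import statistics as st

treatment = [72, 68, 75, 80, 66, 74, 71]
control   = [65, 70, 62, 68, 64, 67, 69]

def cohens_d(x, y):
    """Difference in means divided by the pooled (sample) standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * st.variance(x) + (ny - 1) * st.variance(y)) / (nx + ny - 2)
    return (st.mean(x) - st.mean(y)) / pooled_var ** 0.5

print(f"d = {cohens_d(treatment, control):.2f}")
```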

The second approach to formative assessment comprises processes including peer-assessment and peer-feedback. In the meta-analyses by both Double et al. (2020) and Sanchez et al. (2017), the average effect size of the included studies was 0.3. The Double et al. (2020) study reported very large variation, between −1 and 1.5, whilst the effects in Sanchez et al. (2017) varied between 0.2 and 0.6. As part of their meta-analysis, Sanchez et al. (2017) also investigated the effects of the third approach to formative assessment, which emphasises self-assessment. They found effects of self-assessment similar to those of peer-assessment (the average effect size was 0.3 and the effect sizes varied between −0.8 and 1.8). Earlier, Ross (2006) had reported some studies showing positive effects and others showing negative effects of self-assessment on student achievement, but the most common effect size was approximately 0.5. Graham et al. (2015) found an average effect size of 0.6, and although Andrade (2019) did not report an average effect size in her review, she concluded that all studies in her review showed a positive association between self-assessment and learning.

Studies on the effects on student achievement of the fourth approach to formative assessment, which comprises classroom practices including all of the first three approaches, are much rarer than studies on any one of the three approaches individually. At this time, we have not found any reviews of the effects of the fourth approach on student achievement, only a few individual studies. A study by Wiliam et al. (2004) found a positive effect on student achievement with an effect size of 0.3, and a parallel study to the one presented here found a positive effect of 0.7 (measured at the teacher level) (Andersson and Palm, 2017). A quasi-experimental study by Wafubwa and Csíkos (2022) reported a positive effect of 0.38. Two other experimental studies analysing the effects of implementing formative assessment in line with this conceptualisation, but with a focus on peer-assessment and self-assessment, have also found statistically significant effects on student achievement (Chen et al., 2017; Chen and Andrade, 2018). The effect size in Chen et al. was 0.25 and the three effect sizes in Chen and Andrade varied between 0.15 and 0.25, with the latter being statistically significant.

Very few scholars question the potential of formative assessment, but several argue that more research is needed to establish both the size of its effect on achievement in different student populations and contexts and the mechanisms by which it improves learning (Dunn and Mulvenon, 2009; Bennett, 2011; Kingston and Nash, 2011; Briggs et al., 2012; McMillan et al., 2013). Baird et al. (2014) observed that most available studies are case studies involving only a few students, and Flórez and Sammons (2013) and Wafubwa (2020) concluded that most studies of formative assessment have not measured change directly, but only through participants’ perceptions. Consequently, calls have been made to complement such investigations with more experimental studies with representative samples, control groups, and pre- and post-tests to measure students’ learning gains (Bennett, 2011; Flórez and Sammons, 2013; Baird et al., 2014; Wafubwa, 2020). In the time since those calls were made, a number of such experimental studies have been published on the effects of peer-assessment and self-assessment, which made meta-analyses of these approaches possible (Sanchez et al., 2017; Double et al., 2020). However, as described above, experimental studies on the effects of the fourth approach, in which all of the first three approaches are included, are still rare.

1.3. Formative assessment in mathematics

Studies on the effects of formative assessment on student achievement in mathematics indicate that the conclusions drawn more generally also hold for this specific subject. The effect sizes vary substantially between different implementations within each approach, and there is a lack of experimental studies investigating the effects on student achievement of the fourth approach to formative assessment. The National Mathematics Advisory Panel (2008) conducted a meta-analysis of experimental studies within the first approach to formative assessment. They included studies investigating the effects of regular use of brief tests for formative purposes on student achievement in mathematics, and they found an average effect size of 0.3 (varying between 0 and 0.6). These brief tests may be paper-and-pencil tests, but there has also been a growing number of studies investigating the effects of interventions using computer programmes to aid the formative assessment processes. These interventions have often shown positive effects of different sizes on student achievement in mathematics (e.g., Yeh, 2009; Burns et al., 2010; Koedinger et al., 2010; Faber et al., 2017; Murphy et al., 2020). The computer programmes generate tests and information about students’ learning needs, and, depending on the intervention, either the programme provides students with feedback or the intervention includes training sessions for the teachers on how to use the information from the test results to adapt their teaching to the students’ identified learning needs. In a review by Rohrbeck et al. (2003), positive effects on student achievement in mathematics were found for peer-assisted learning including peer-assessment and subsequent feedback (the second approach to formative assessment). The mean effect size was 0.3, which was not statistically different from the sizes of the effects found for other subjects such as reading. The review by Palm et al. (2017) reported on studies from the first approach (teacher assessment of student learning and adapted feedback and instructional activities) and the third approach to formative assessment (self-assessment with subsequent actions based on that assessment). Although the included studies revealed a large variation in the effects on student achievement in mathematics, positive effects were found for both approaches. Since many studies included in the review did not report effect sizes, no average effect size was calculated. However, the review indicated a need for studies investigating the effects of the fourth approach to formative assessment, encompassing teacher assessment of student learning followed by adapted feedback and instructional activities, as well as peer-assessment and self-assessment.

1.4. The need for experimental studies that include careful descriptions of the implemented practice

In addition to the research needs outlined in the previous sections, it is also important to conduct studies that carefully describe both the particulars of the formative assessment practices that the teachers implement and the impact of that kind of implementation on student achievement. Such studies could deepen our understanding of how formative assessment works to improve learning, which would improve our ability to predict the conditions and population groups in which formative assessment is likely to work in certain ways (Bennett, 2011; Kingston and Nash, 2011; McMillan et al., 2013). Unfortunately, such studies are scarce (Schneider and Randel, 2010). Several researchers argue that the vague description of the formative assessment practices actually implemented by the teachers in many studies (not only what they were supposed to implement) makes it difficult to connect the outcomes in terms of student achievement to the particular characteristics of those practices (Kingston and Nash, 2011; McMillan et al., 2013). Empirical studies that include careful descriptions of the particular formative assessment practice implemented, representative samples, and pre- and post-measurements of actual student achievement are particularly rare for the fourth approach to formative assessment. In their review, Flórez and Sammons (2013) identified only one quantitative study examining the effects of such a conceptualisation on student achievement, namely the aforementioned study by Wiliam et al. (2004). A parallel study to the one presented here found that, after controlling for pre-test scores, students of a random sample of teachers who implemented a more formative classroom practice in school year 4 significantly outperformed students in the control group on a post-intervention test that year (Andersson and Palm, 2017).

Studies of the effects of formative assessment with large and randomised samples and control groups require using professional development programmes (PDPs) that support teachers in their development of formative assessment practices. However, high-quality formative assessment, and in particular the kind that belongs to the fourth approach, constitutes a complex and advanced practice. Whilst some PDPs have succeeded in developing formative assessment practices substantially enough to increase student achievement (e.g., Andersson and Palm, 2017), providing sufficient support for large samples of teachers to develop classroom practices in accordance with the fourth approach has proven difficult (e.g., Bell et al., 2008; Randel et al., 2016). There is a general consensus that certain characteristics of PDPs are important for attaining desired teacher and student outcomes (Timperley et al., 2007; Desimone, 2009); these include (1) a focus on teaching and learning subject matter, (2) active learning including hands-on practice, (3) teacher collaboration and discussions about the impact of activities tested in the teachers' classes, (4) coherence between what is being taught in the programme and wider policy trends and research, (5) time spent on the programme, and (6) engagement of school leaders and external expertise. Heitink et al. (2016) also identified similar characteristics as important for developing formative assessment practices, whilst DeLuca et al. (2019) additionally suggested that identifying teachers' learning continua would be an important support for their continued development, both in terms of conceptual understanding and enacted formative assessment practices.

2. Research questions

In the present study, a random sample of year-7 mathematics teachers implemented formative assessment practices after having participated in a professional development programme (PDP) focused on the fourth approach to formative assessment as described by Wiliam and Thompson (2008). The main goal of the study was to see whether these formative assessment practices would have an effect on students' achievement. A complementary question is whether implementing more (rather than fewer) formative assessment activities would have a positive effect on the students' achievement gains (for a definition of formative assessment activities, see the next section). This complementary research question is intended to provide some empirical evidence about the effects of quantity and quality of formative assessment. For example, it might be beneficial for students' learning if teachers gathered information about their students' learning needs in many different ways and used this information to adapt their teaching, or if teachers provided students with several different opportunities to self-assess. Indeed, these activities are central parts of many conceptualisations of formative assessment. However, another possibility is that the use of a diversity of assessment activities and adaptations of teaching and learning does not affect student achievement, or that the quality of these practices needs to be high to achieve an effect. We have not found any study investigating this specific issue empirically using randomised samples with control groups and actual measurements of student achievement. Such a study could be useful for understanding what constitutes both sufficient quality and sufficient quantity for formative assessment practices to have a positive effect on student achievement.

The present study investigates the following two research questions:

1. To what extent do the formative classroom practices implemented by the year-7 teachers who participated in the professional development programme affect student achievement in mathematics?

2. To what extent is there a correlation between the number of formative assessment activities implemented and students’ achievement gains in mathematics?

3. Method

3.1. Procedure

A randomised selection of year-7 teachers was invited to participate in a professional development programme in formative assessment (described below), held in the spring of 2011, and in the research study. The next school year (autumn 2011 – spring 2012), the teachers returned to full-time teaching at their schools. The rest of the year-7 teachers in the municipality formed the control group. To study the effect of the implemented formative assessment practices on students' achievement in mathematics, a pre-test was administered to all year-7 students in the municipality in August 2011, and a post-test was administered to the same students near the end of May 2012 (see Figure 1 for a timeline of the data collection and teacher activities). Results for students of teachers who participated in the PDP were compared with those for students of teachers in the control group. The tests were administered by the municipality as one of their evaluations of their schools. The researchers were then allowed access to anonymized spreadsheets with the results from the tests. Since the tests were carried out under municipality regulations, and the researchers only used anonymized municipality data, consent from teachers and students was not required. However, all teachers were informed that the researchers would be allowed to use anonymized data from the tests.


Figure 1 . Timeline for data collection and teachers’ activities.

There were only two types of contact between the researchers and the teachers during the school year 2011–2012. One was the organisation of three 2-hour sessions so that the teachers who participated in the PDP could get together and discuss their formative classroom practice if they wished. Only a few did. Teachers mentioned a lack of time as a reason for not participating in these sessions. They also mentioned that they thought they knew how to implement the formative assessment practices they wanted to carry out that school year, so they did not feel a need for these sessions at the time. The other type of contact was the data collection for analysing changes to the teachers' formative classroom practice after the PDP. These data collection occasions included unannounced classroom observations during the school year and both a questionnaire and an interview at the end of the school year. For these parts, consent from the teachers was obtained.

3.2. Participants

The 14 teachers (6 females, 8 males) who participated in the PDP were selected by a stratified random sampling procedure from the population of all 35 teachers who were scheduled to teach mathematics in a year-7 class (students approximately 13 years old) in the upcoming school year in a mid-sized Swedish municipality. In the selection procedure, secondary education schools were first stratified based on the number of classes in school year 7, and then one to three teachers were randomly selected from each school depending on the number of classes in that school. There were 14 schools with year-7 classes in the municipality, and each school had between one and six year-7 classes. For schools with an even number of year-7 classes, half of the teachers were randomly selected to participate in the PDP. For schools with three year-7 classes, half were randomly selected to contribute two teachers and the others one teacher. A similar procedure was used for schools with one or five year-7 classes, which could contribute one or two, or two or three, teachers, respectively. Two of the 20 selected teachers declined to participate in the PDP. Another four had to withdraw from the study for reasons such as moving to another city or not being assigned a year-7 class after the PDP. These six teachers were not included in either the intervention group or the control group. The 14 teachers who participated in the PDP taught a total of 291 students (the intervention group), 148 females and 143 males. The remaining 15 teachers (6 females, 9 males) who taught year-7 mathematics that school year taught a total of 275 students (the control group), 141 females and 134 males. The teachers in the control group did not receive in-service training, and had no prior training, in formative assessment. To avoid influencing the teachers who had not participated in the professional development programme, the teachers who had participated were asked not to discuss what they had learnt in the PDP with their colleagues. Both teacher groups included teachers with varying lengths of teaching experience, ranging from teachers with only a year of experience to those close to retirement. The students in both groups were diverse in their socio-economic and cultural backgrounds.
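
As a rough illustration of the selection logic described above (this is not the authors' actual procedure, and it assumes one mathematics teacher per year-7 class), the following Python sketch draws roughly half of each school's year-7 teachers at random, rounding odd-sized schools up or down at random:

import random

def select_about_half(schools):
    # schools: dict mapping a school name to the list of its year-7 mathematics teachers
    selected = []
    for school, teachers in schools.items():
        n = len(teachers)
        # even-sized schools contribute exactly half of their teachers;
        # odd-sized schools are rounded up or down at random
        k = n // 2 + (random.choice([0, 1]) if n % 2 else 0)
        selected.extend(random.sample(teachers, k))
    return selected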

In Sweden, children are obliged to attend school for 9 years, and during these compulsory school years the students take 16 subjects such as mathematics, history and music. There is a national curriculum that includes the learning goals in all subjects. National tests are administered in school years 3, 6, and 9, and the results should be given special consideration when setting grades. The students are given grades in each subject in school years 6–9. The final grades at the end of school year 9 are used for admission to upper-secondary school programmes (school years 10–12). Almost all students go on to upper-secondary school, but not all are admitted to their first-choice programme. The teachers teaching mathematics usually follow their class through school years 7 to 9, and commonly teach 3–4 subjects (for example mathematics, physics, chemistry and biology).

3.3. Teachers’ formative assessment practices

3.3.1. Design of the professional development programme

The research questions in the present study concern the effects on student achievement of the formative assessment practices the teachers implemented after their participation in the professional development programme (PDP). Thus, the object of study is not the PDP itself or the PDP's effects on teacher practices or student achievement. However, in order to provide context for the study, we briefly describe the PDP here.

The PDP the teachers participated in was designed to have many of the characteristics identified by researchers as important for accomplishing teacher development (Timperley et al., 2007; Desimone, 2009). The programme included 4-hour meetings, once a week over one term, for a total of 96 h. The teachers had another five hours per week, or 120 h in total, available for reading literature and for planning and reflecting upon formative assessment activities that were new to them ('new' in the sense that they had not used them, or had used them to a lesser extent, prior to the PDP). A regular meeting comprised lectures presenting the theory of formative assessment, research supporting the value of formative assessment, suggestions for concrete formative assessment activities to try out in the classroom before the next meeting, group discussions about the content and how to implement it, and discussions about the previous week's classroom try-outs. The programme was process-oriented, had a formative character, and was organised and led by the second author. The framework by Wiliam and Thompson (2008) formed the content of the PDP, and it included the general principles of formative assessment. Meetings during the PDP focussed on how these principles could be applied to the school subject mathematics, and in particular to the mathematics curriculum the teachers and their students were engaged in during the PDP. For example, when Key Strategy 1 was explored during a meeting, the specific learning goals of the mathematics curriculum being taught during the PDP were used in the discussions of how to obtain a shared understanding of the learning goals between the teacher and the students.

3.3.2. Data collection and data analysis

The formative assessment practices implemented by the year-7 teachers after the PDP were identified and described in detail in Boström and Palm (2019), and are summarised in the next section. The analysis and identification of these formative assessment practices were possible because the teachers' practices before the PDP had been described in an earlier study (Andersson et al., 2017). Thus, the teacher practices before and after the PDP could be compared. The analyses of the teachers' practices in these previous studies, which are described in detail in Andersson et al. (2017) and Boström and Palm (2019), were based on data collected through classroom observations and teacher interviews. At least four classroom observations were made for each of the 14 teachers (two before and two after the PDP). The classroom visits were unannounced in order to increase the probability of observing regular lessons. The interviews were conducted before the PDP and after the school year following the PDP. The classroom practices of the teachers in the control group were not analysed. Since the teachers in the intervention group were randomly selected, and there were no other professional development initiatives going on at the participating schools at the time, the assumption is that both groups of teachers had similar practices before the PDP, and that only the teachers in the intervention group changed their practices after the PDP.

Changes in teachers’ formative assessment practices were described in terms of the formative assessment activities they used before and after the PDP, and the framework comprising the ‘big idea’ and five key strategies ( Wiliam and Thompson, 2008 ) was used in the analysis of the data. Formative assessment activities were defined as activities that contribute to the attainment of the goals of each key strategy and the big idea. The description of the formative assessment practices in terms of formative assessment activities does not imply that these activities were carried out in isolation. For an activity to be categorised as a formative assessment activity, it needed to be carried out in conjunction with other activities to form a formative assessment practice. For example, for an activity to be categorised as belonging to Key Strategy 2 (gathering information about students’ learning needs), the information about student learning that was identified in that activity had to be used for adjusting feedback or instruction to meet those needs.

For a formative assessment activity to be considered part of a change in a teacher's formative assessment practice, it needed to be an activity that was new in the sense that the teacher had not used it (or had used it to a lesser extent) before the PDP. That activity also needed to be used regularly as a part of the new practice. For the researchers to conclude that an activity constituted a regular part of a new practice, it was required that the teacher in the interview either provided details or examples of the uses and outcomes of the activity, or that the interview data was supplemented with classroom observation data from which it could be concluded that the activity was used regularly (in other words, it was not sufficient that the teachers merely said that they used the activity). Such indications from the classroom observations included students who seemed used to, or who asked for, an activity. Examples of such classroom observations are students providing answers to teacher questions on mini-whiteboards (a technique new to many of the teachers) without asking about the procedure, and students not having to ask how to use suggestions written by the teacher on the big whiteboard intended to help them monitor and develop their own learning. The teachers were also asked to relate their descriptions of their new practices to the lessons the researchers had observed. Thus, the interviews were the leading source of information about the teachers' practice, and the classroom observations served to validate or reject conclusions drawn from the interviews.

3.3.3. Description of formative assessment practices implemented after the PDP

All teachers implemented new formative assessment activities in their classrooms and began to use them regularly. These activities strengthened classroom practice in line with the big idea of collecting evidence of student learning in order to adjust instruction to better meet students’ learning needs. Each teacher implemented from 3 to 19 new activities, with a median of 11.5. About half of the teachers complemented previous instruction with 3 to 7 new activities, whilst others made substantial changes in their classroom practice to include many new activities connected to each key strategy and to the big idea. The largest number of newly implemented activities were connected to Key Strategy 2, gathering information about student learning, and only a few were related to Key Strategy 4, activating students as instructional resources for one another.

The most common change was to more frequently elicit evidence of student learning with the purpose of adjusting instruction (activities pertaining to KS 2). The teachers implemented new small and quick assessment activities that were used regularly. Almost all teachers started to use ‘exit-passes’ ( Wiliam, 2011 ), question(s) that all students were required to answer in writing at the end of the lesson. About half also started to let their students answer teacher questions during the lessons on mini-whiteboards. These ‘all-response’ systems provided teachers with more frequent information about all of their students’ learning. Consequently, the teachers could make more frequent and well-founded adjustments to their instruction to better fit their students’ learning needs. The teachers realised they now had information about all of their students’ learning needs earlier than when they had gathered such information mainly from student questions during seat-work and from students who raised their hands to answer teacher questions during whole-class sessions. Using those previous assessment techniques, instruction was often adjusted based on information from only a few students. Using the new assessment techniques, with information from all students, teachers perceived they could now help more students more effectively. The teachers also perceived that the use of mini-whiteboards contributed to increased engagement and thinking among all students during whole-class sessions. Half of the teachers started to use a system of randomly distributing questions to different students, which they also felt improved the students’ engagement in learning activities.

With improved assessment of their students’ learning, most teachers also began to more frequently provide adjusted instruction for individual students, and about half of the teachers began to more often adjust their instructional activities for the whole class. Thus, not only did the frequency of the adjustment cycle increase, but the adjustments themselves were also better founded since teachers had more information about every student. However, some teachers mostly posed questions targeting ‘basic knowledge’ to examine whether students had sufficient understanding to be able to solve standard textbook tasks during seat-work, or to follow the fundamentals of an upcoming lecture. In those cases, assessments and subsequent adjustments targeted the learning needs of only a few students, and not those of students who understood the basics but were struggling for higher understanding.

The most common change in relation to Key Strategy 1, which was made by about half of the teachers, was to break the learning goals down into subgoals and present these lesson goals in the beginning of each lesson. Teachers generally also started to talk more about these learning goals with their students, and thus to focus on goals rather than on the number of tasks to be solved. However, they did not involve students in other activities that might help them to understand these goals. For example, teachers did not invite the students to actively discuss and negotiate the meaning of the goals, nor did they provide detailed examples of the goals at different levels and the criteria for attaining them, nor did they give students feedback on their interpretation of the goals.

In relation to Key Strategy 3, most teachers became more conscious of the characteristics of their feedback to students, which resulted in their feedback being more consciously thought out. About half tried to include in their feedback two things each student had done well and one suggestion for improvement ('two stars and a wish', Wiliam, 2011). Such 'stars' are likely to be perceived as motivators, and detailed suggestions for improvement are useful for learning. However, other than during seat-work, the teachers did not set aside specific time for students to work with the feedback.

Only minor changes were made concerning Key Strategy 4, and half of the teachers did not change their practice with respect to this key strategy at all. A few teachers encouraged their students to collaborate more by giving them group tasks and describing how to seek and provide help. In relation to Key Strategy 5, the most common change was that half of the teachers discussed and decided with their students how they, with the aid of self-assessment, could think and act when they got stuck on a task, in order to learn and to solve that (and other) task(s). A few teachers also began to support students' self-regulated learning by giving them responsibility for correcting their diagnostic tests (something half of the teachers had already done before the PDP). However, teachers rarely used activities that specifically helped their students to take an active role in the formative assessment processes of peer assessment, peer feedback, and self-regulated learning. For example, they rarely described how students could assess themselves and their peers, what the important characteristics of peer feedback are, and how students could adjust their learning. Neither did they set up activities in which students could practise and get feedback on these skills. In addition to having implemented a practice including the regular use of activities pertaining to the big idea and Key Strategies 2 and 3, most of the teachers also began to regularly use activities pertaining to either (or both) Key Strategies 4 or 5. However, much of the responsibility for the formative classroom practice remained with the teachers.

The big idea and all key strategies were focussed on in the PDP. However, the activities the teachers chose to implement after the PDP were most often those they expected they would be able to carry out successfully. Expectations of successful implementation were mainly based on whether they felt they had mastered an activity they had tried out in their classrooms during the PDP. Another factor determining activity choice was the anticipated value and cost of the implementation. The teachers most often chose activities that they anticipated would not be too time-consuming (including both teacher preparation and teacher and student implementation time) and not too difficult to carry out (for both teacher and students), but would still provide increased student engagement and learning. A teacher's beliefs about what a good teacher is and what constitutes good teaching were also considered when choosing activities to implement in their classroom practices (Boström and Palm, 2020).

3.4. Changes in student achievement

3.4.1. Data collection

A pre-test and a post-test were used to investigate possible effects on student achievement from the formative assessment practices the teachers implemented after the PDP. At the end of spring 2011, the teachers were informed that all year-7 classes in the municipality would take the pre-test at the beginning of the next school year (middle of August). At the beginning of August, before the students arrived, the teachers were provided with written information about the test, which consisted of two parts to be administered on different days, one part for which calculators were allowed and one for which they were not. Students were given 40 min to work on each part. The total maximum score on the pre-test was 60. The first part consisted of 33 tasks, including subtasks, with a total maximum score of 33, and the second consisted of 14 tasks, including subtasks, with a total maximum score of 27.

At the end of the school year, on specified days in May, the post-test was administered to all year-7 students in the municipality. At the beginning of the term, teachers were informed about the dates, although the tests and accompanying instructions were distributed to the teachers only a few days before the test was conducted. The post-test also consisted of two parts, one with and one without calculators. Students were given 40 min to work on each part. The total maximum score on the post-test was 58. The first part consisted of 31 tasks, including subtasks, with a total maximum score of 31, and the second part consisted of 17 tasks, including subtasks, with a total maximum score of 27.

The tests were developed for the municipality's evaluation purposes, and also to fit this study, by a group consisting of the authors, experienced secondary school teachers, and national test developers. The tests were designed to assess the mathematics specified in the national curriculum documents, which would provide the teachers in the municipality with assistance in interpreting the new national curriculum. The new national curriculum was mostly an update of the former curriculum, although some vague formulations in the previous version were clarified. The pre-test covered content that students were expected to have learnt by school year 6, and the post-test covered content that students were expected to learn in year 7. The general content areas were the same in the two tests, but at a more advanced level in the post-test. In both tests, this content included understanding and use of numbers, algebra, geometry, probability and statistics, and relationships and change. The tasks also required the same process standards in both tests: handling of procedures, use of mathematical concepts, reasoning, problem solving, and mathematical communication. The tests included tasks with multiple-choice, fill-in-the-blanks, and short-answer constructed-response formats, and for many tasks, students were required to show how they arrived at their answers. For the research study, the scores on the tests were used to evaluate differences in student achievement on the post-test between the intervention group and the control group, controlling for initial differences in mathematical proficiency between the groups on the pre-test.

The pre-test was piloted in two year-6 classes (at the end of year 6), and the post-test was piloted in two year-7 classes in other municipalities before they were used in this study. These pilot studies were done to gather information about the students' understanding of the tasks, the time needed to solve the tasks, and indications of the distributions of scores and possible ceiling effects. A few tasks were removed from the tests as a result of the pilot studies. The piloting teachers and the test development group agreed that the tests were consistent with the national curriculum documents for the relevant school years.

The teachers in the main study received detailed instructions covering, for example, how to introduce the tests to students, how much support could be given to students, how to handle students who were absent, and how to deliver the tests back to the municipality office. The authors and a group of experienced retired mathematics teachers were hired by the municipality to mark the tests. The markers did not know whether students belonged to the intervention group or to the control group. The municipality then compiled the results in a spreadsheet. An anonymized version of this file was made available to the researchers, who transferred the data to a statistical software programme. The teachers received a file with the results of their own students so that they could, if they wished, use this data to inform their subsequent instruction. The teachers were also informed that the results of their students would not be reported to anyone else.

3.4.2. Validity and reliability of test score interpretation

An interpretation of test scores may be seen as valid to the extent that a (validity) argument is supported by appropriate (theoretical and/or empirical) evidence (Kane, 2016). In the current case, the interpretation being made is that the tests provide a measure of the students' knowledge and skills in the mathematics curriculum for school year 6 (pre-test) and school year 7 (post-test). In general, validation of such an interpretation of scores from school tests can be done by evaluating how well the content of the tests reflects the curriculum and actual classroom teaching. A common way of making such evaluations is to gather a panel of teachers teaching the content in the relevant school years (and, if available, other experts on the relevant curriculum) to review the tests. This procedure was followed in this study. A panel consisting of mathematics national test developers and mathematics teachers in the relevant school years compared the tests with the national curriculum documents as well as with the textbooks used in the municipality (textbook writers also interpret the national curriculum documents, and Swedish teachers rely heavily on the textbooks in their teaching; Skolverket, 2012). This panel also judged the difficulty of the tasks in relation to the particular group of year-7 students. Empirical evidence of both the difficulty level and the students' understanding of the tasks was available from the pilot tests. Based on that evidence and their judgements of the contents of the tests in relation to the national curriculum documents and the textbooks used, the panel unanimously concluded that the tests would provide an appropriate measure of the students' knowledge and skills in the mathematics curriculum of school years 6 and 7, respectively.

Reliability considerations may also be seen as part of the validation of test score interpretation. The Cronbach's alpha coefficient for the pre-test with this sample was 0.88, which suggests high internal consistency reliability (Cohen et al., 2011). For the post-test, the Cronbach's alpha coefficient was 0.92, which suggests that the post-test also had high reliability. A detailed scoring procedure was put in place to secure a high level of agreement between the raters in the scoring of the tests (no measure of inter-rater reliability was calculated). The maximum score on a subtask varied between one and two points. The raters were provided with a marking scheme that had been tested in the pilot studies and included detailed directions about the student answers required for awarding each scoring point. In addition, examples of student work were provided to aid the scoring of some of the tasks. All five raters sat in the same room. They began by all scoring the same students' solutions to all tasks, and then discussed their scorings whenever these differed. They repeated this procedure until scoring the solutions of new students no longer resulted in much difference in the scoring. They then divided the scoring of the students' solutions among themselves, but as soon as they were uncertain about a particular scoring, they discussed these uncertainties with the group and reached consensus.
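
For readers who want to reproduce this kind of reliability check on their own data, the sketch below computes Cronbach's alpha from a students-by-tasks score matrix. It is a generic illustration rather than the software actually used in the study, and the function and variable names are ours:

import numpy as np

def cronbach_alpha(scores):
    # scores: 2-D array with one row per student and one column per task score
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                              # number of tasks
    item_variances = scores.var(axis=0, ddof=1)      # variance of each task score
    total_variance = scores.sum(axis=1).var(ddof=1)  # variance of the total test score
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)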

3.4.3. Data analysis

The ultimate goal of the study was to investigate whether the formative assessment practices implemented by the teachers had any effect on the students' mathematics achievement. Since the students are nested in classes, a multilevel modelling (MLM) technique could have been suitable for investigating these effects, but for such techniques to be appropriate, it is conventional for MLM to use a minimum of 20 cases per Level 2 unit (20 students in each class) and 20 Level 2 units (20 classes) in the data. There were enough students at Level 1 to ensure there were 20 cases per class. However, there were fewer than 20 classes in both the intervention group and the control group. In addition, the intraclass correlation coefficients (ICC(2)) were 0.56 for both groups and for the whole sample on the pre-test, and varied between 0.67 and 0.70 on the post-test. These values, particularly the ICC(2) values for the pre-test, indicate an insufficient degree of reliability with which class-mean ratings differ between classes for MLM to be used (acceptable values are 0.7 or higher; e.g., Marsh et al., 2012). Thus, any values at Level 2 (classroom level) would be potentially misleading if MLM were used. However, the ICC(1) values for both the whole sample and the subgroups on both tests were between 0.06 and 0.11. Such low ICC(1) values mean that the between-class variation is very small and does not contribute much to the total variation of scores (Lam et al., 2015). Thus, the possible effects of not including Level 2 in the analysis would be small (because of the small ICC(1) values). In accordance with common practice when ICC(1) values are this low, we proceeded with a Level 1 analysis only. However, when interpreting the results and answering the research questions, the possible small effects of not including Level 2 in the analysis are accounted for. It may be noted that such effects would cause negatively biased standard errors estimated for Level 2 variables (which is the concern of interest when the intervention is delivered at the classroom level, as it is in our case), which may increase the risk of type 1 error but not type 2 error (Maas and Hox, 2004; Huang, 2016). That is, not including a Level 2 analysis would only increase the risk of concluding that a treatment is effective when it is not, and not increase the risk of concluding that a treatment is ineffective when in fact it is effective (Gorard, 2007). Since the Level 1 analysis in the present study did not detect a statistically significant effect from the intervention (see the Results section), an inclusion of Level 2 in the analysis would not produce different results.
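
The intraclass correlation coefficients referred to above can be obtained from a one-way analysis of variance of the scores by class. The following sketch shows one common way of computing ICC(1) and ICC(2); it is an illustration only (the study's exact estimator and software may differ), and it approximates the group-size term with the average class size:

import numpy as np

def icc1_icc2(class_scores):
    # class_scores: list of 1-D arrays, one array of student scores per class
    groups = [np.asarray(g, dtype=float) for g in class_scores]
    k = len(groups)                                    # number of classes
    n_total = sum(len(g) for g in groups)
    n_bar = n_total / k                                # average class size (approximation)
    grand_mean = np.concatenate(groups).mean()
    ms_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups) / (k - 1)
    ms_within = sum(((g - g.mean()) ** 2).sum() for g in groups) / (n_total - k)
    icc1 = (ms_between - ms_within) / (ms_between + (n_bar - 1) * ms_within)
    icc2 = (ms_between - ms_within) / ms_between
    return icc1, icc2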

To assess whether the formative assessment practices implemented by the teachers had any effect on the students’ mathematics achievement, a one-way between group analysis of covariance (ANCOVA) was conducted using SPSS Version 27. An ANCOVA tests for significant differences in post-test scores between the students in the intervention group and the students in the control group, whilst controlling for differences in the pre-test scores by calculating an ‘adjusted’ mean for the post-test scores. SPSS uses regression procedures to remove the variation in the dependent variable (the post-test scores) that is due to the covariate (the pre-test scores), and then performs a normal analysis of variance on the post-test scores. To examine the ANCOVA assumptions of normality and homogeneity of variances, we used the Shapiro–Wilk test and Levene’s test, respectively.
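
A minimal sketch of this analysis in Python rather than SPSS is given below. The data file, column names, and group labels are hypothetical, but the model corresponds to the ANCOVA and assumption checks described above:

import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from scipy import stats

# Hypothetical file with one row per student and columns
# 'group' (intervention/control), 'pre' and 'post' (test scores)
df = pd.read_csv("year7_test_scores.csv")

post_intervention = df.loc[df["group"] == "intervention", "post"]
post_control = df.loc[df["group"] == "control", "post"]

# Assumption checks: normality within each group and homogeneity of variances
print(stats.shapiro(post_intervention))
print(stats.shapiro(post_control))
print(stats.levene(post_intervention, post_control))

# Homogeneity of regression slopes: the group-by-covariate interaction
# should be non-significant for a standard ANCOVA to be appropriate
slopes = smf.ols("post ~ pre * C(group)", data=df).fit()
print(sm.stats.anova_lm(slopes, typ=2))

# ANCOVA: post-test scores by group, controlling for pre-test scores
ancova = smf.ols("post ~ pre + C(group)", data=df).fit()
print(sm.stats.anova_lm(ancova, typ=2))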

We also conducted a partial correlation analysis to study the relationship between the number of new formative assessment activities implemented by the teachers who had participated in the professional development program and their students’ achievement on the post-test when controlled for the scores on the pre-test. Partial correlation is similar to Pearson product–moment correlation, except that it allows control for an additional variable (in this case, the pre-test).
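
Computationally, a partial correlation of this kind amounts to correlating the residuals of the two variables after each has been regressed on the control variable. The short sketch below is a generic illustration, not the study's actual code:

import numpy as np

def partial_corr(x, y, control):
    # correlation between x and y after removing the linear influence of the
    # control variable (here, the pre-test scores) from both
    x, y, control = (np.asarray(a, dtype=float) for a in (x, y, control))
    res_x = x - np.polyval(np.polyfit(control, x, 1), control)
    res_y = y - np.polyval(np.polyfit(control, y, 1), control)
    return np.corrcoef(res_x, res_y)[0, 1]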

4. Results

In the ANCOVA, the group variable was the professional development of the students' teachers; that is, the students taught by teachers who had participated in the formative assessment programme were assigned to one group and the other students to the other group. The dependent variable was the students' scores on the post-test, and the students' scores on the pre-test were used as the covariate in the analysis. Analyses of normality, linearity, and homogeneity of variances showed that all prerequisites for ANCOVA were met. The Shapiro–Wilk test yielded significance values above 0.8 (p > 0.05), which indicates that the post-test scores of both student groups were approximately normally distributed. This was also supported by visual inspection of the histograms, normal Q–Q plots, and box plots of both groups' post-test scores. The distribution of scores indicates a linear relationship between the dependent variable and the covariate for both the intervention group and the control group, so the assumption of linearity was not violated. Levene's test yielded a significance value of 0.84, which verified the equality of variances in the samples (homogeneity of variance, p > 0.05). In addition to the prerequisites for ANCOVA being met, a significance value of 0.89 showed that there was no statistically significant interaction between the group variable and the covariate. This supports the assumption of homogeneity of regression slopes, which enables a direct comparison of adjusted means between groups.

The results of the ANCOVA show that, after adjusting for the pre-test scores, there was no significant difference in post-test scores between the intervention group and the control group (F(1,563) = 0.037, p = 0.85, ηp² = 0.000). Indeed, the effect size measured by partial eta squared was 0.000. Thus, when controlling for achievement on the pre-test, the formative classroom practices implemented by the teachers in the intervention group did not have an effect on student achievement on the post-test in comparison with the classroom practices of the teachers in the control group. As described in the Data analysis section above, the decision not to include Level 2 in the analysis would increase the risk of type 1 error but not type 2 error (Maas and Hox, 2004; Huang, 2016), and would therefore not increase the risk of the results failing to show an effect of the intervention if there actually was one. Thus, the decision not to include Level 2 in the analysis did not affect the results of the study.
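
For transparency, the reported effect size follows directly from the F ratio and its degrees of freedom through the standard identity for a single-degree-of-freedom effect:

\eta_p^2 = \frac{SS_\text{effect}}{SS_\text{effect} + SS_\text{error}} = \frac{F \cdot df_\text{effect}}{F \cdot df_\text{effect} + df_\text{error}} = \frac{0.037 \cdot 1}{0.037 \cdot 1 + 563} \approx 0.00007,

which rounds to the reported value of 0.000.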

To find indications of whether students at different achievement levels were affected by the formative assessment practices, we conducted the same analysis on three student subsamples: the third of the students with the lowest scores on the pre-test, the middle third, and the third with the highest scores on the pre-test. The ANCOVAs showed that, after adjusting for the pre-test scores, there were no significant differences in post-test scores between the intervention group and the control group for any of these student subsamples (F(1,184) = 0.59, p = 0.44, ηp² = 0.003; F(1,193) = 0.60, p = 0.44, ηp² = 0.003; F(1,180) = 1.38, p = 0.24, ηp² = 0.008). The mean scores on the tests are presented in Table 1.


Table 1 . Mean scores on the tests.

A partial correlation analysis was conducted to investigate a possible relationship between the number of new formative assessment activities implemented by the teachers in the intervention group and their students' achievement gains. Preliminary analyses showed no violation of the required assumptions of normality, linearity, and homoscedasticity. The analysis showed a weak, statistically non-significant, positive partial correlation between the number of new formative assessment activities implemented by the teachers and the students' achievement on the post-test when controlling for the pre-test (r = 0.15, n = 14, p = 0.62). An inspection of the zero-order correlation (r = 0.26) suggests that controlling for the pre-test had some effect on the strength of the (non-significant) relationship between the two variables.

5. Discussion

5.1. Conclusion

The results of the study show that formative assessment, as implemented by the year-7 teachers who participated in the professional development programme, did not have a significant effect on students’ achievement. The results showed no effects for either the whole intervention group or for any of the three student subsamples that differed in their achievement on the pre-test. There was also no significant correlation between the number of formative assessment activities implemented by the teachers and student achievement on the post-test when controlled for the pre-test scores.

The results of the study do not mean that formative assessment does not improve student achievement. Earlier studies have shown that practices of all four approaches to formative assessment can enhance achievement. Strongly teacher-led practices in which the teacher gathers information about student learning and provides feedback and instructional activities adapted to the identified learning needs (Hattie and Timperley, 2007; National Mathematics Advisory Panel, 2008; Yeh, 2009; Burns et al., 2010; Koedinger et al., 2010; Faber et al., 2017; Palm et al., 2017; Murphy et al., 2020), practices that focus on the student's proactive engagement in formative assessment processes such as peer-assessment and peer-feedback (Sanchez et al., 2017; Double et al., 2020) and self-assessment (Ross, 2006; Graham et al., 2015; Sanchez et al., 2017; Andrade, 2019), and practices that include all three of these approaches (Wiliam et al., 2004; Andersson and Palm, 2017) have all been shown to improve student achievement. The results of this study show that it is not the general approach to formative assessment in itself that is the decisive factor in whether the practice will affect student achievement (although the approach taken will provide different affordances for, and constraints upon, possible effects). Instead, they indicate that the way the approach is implemented is more essential to learning. This conclusion is consistent with reviews of the effects on student achievement from each approach to formative assessment, which all report positive average effects from the different approaches and large differences in effect sizes between the studies within each approach (Ross, 2006; Hattie and Timperley, 2007; National Mathematics Advisory Panel, 2008; Graham et al., 2015; Palm et al., 2017; Sanchez et al., 2017; Double et al., 2020). The results of the present study show that the particular ways in which the formative classroom practices were performed by the teachers were insufficient for improving student achievement. If the characteristics of the implemented activities had been different (for example, if the activities had included more specific instructions and practice for students in how to assess and give feedback to their peers, or if the activities had more explicitly supported students in self-regulating their learning), there might have been a correlation between the number of implemented formative assessment activities and student achievement gains.

This study examined the effect of formative assessment practices on student achievement using measurements of actual student performance before and after the formative assessment intervention, and we have described the characteristics of these formative assessment practices. Such studies have been scarce ( Schneider and Randel, 2010 ), and calls have been made to complement other types of research with this sort of study ( Bennett, 2011 ; Kingston and Nash, 2011 ; McMillan et al., 2013 ). Our study addresses this need to empirically connect certain formative assessment practices with student achievement gains in different populations and contexts. In contrast to the present study, an earlier experimental study by Andersson and Palm (2017) did find positive effects when teachers developed formative assessment practices based on the big idea and five key strategies in the framework by Wiliam and Thompson (2008) . In that study, formative assessment practices implemented by the year-4 teachers were based on the same framework as the formative assessment practices implemented by the year-7 teachers in the present study, and the specific practices implemented by both groups of teachers were also very similar. In the following, we will discuss possible reasons that the year-7 teachers’ practices were inadequate for improving student achievement, and the characteristics of the implemented formative assessment practices that might explain the differences in the studies’ outcomes. Further studies may explore these differences in more detail and empirically examine their possible effects.

5.2. Possible explanations for the non-effects on student achievement

The formative assessment practices of the year-7 teachers were in many ways similar to those of the year-4 teachers (students approximately 10 years old) in a parallel study, in which those practices did affect student achievement in mathematics (Andersson and Palm, 2017). The year-7 teachers' formative assessment practices included assessing all students' learning more often than they had done before the PDP (mostly by using exit passes at the end of lessons), which enabled the teachers to adapt their instruction to the learning needs of all their students more frequently (and they did use this information to adjust feedback and instructional activities). Such practice is central to formative assessment and should be beneficial for student achievement.

However, there were some differences between the practices of these two teacher groups that may have affected student achievement differently. For example, all year-4 teachers began to regularly let all students respond to daily whole-class questions on their mini-whiteboards, and those responses were followed by immediate modifications to instructional activities and feedback (Andersson and Palm, 2017); in contrast, only half of the year-7 teachers did so. Consequently, the year-4 teachers were better able to provide a practice that continuously adapted to their students' learning needs. In addition, the feedback of the year-4 teachers more often included detailed comments about what the students had done well and suggestions for improvement, which would enhance students' feelings of competence and therefore their motivation (Ryan and Deci, 2020), thereby providing extended learning opportunities. Another difference between the practices of the two teacher groups is that the year-7 teachers seem to have targeted their questions more often towards 'basic knowledge' than towards conceptual understanding at various levels. The present study did not find any effects on student achievement for the students with the lowest pre-test scores, so the focus on basic skills does not seem to have helped these students learn more. However, more high-achieving students may have been able to benefit from a classroom practice that had been adapted to fit their learning needs as well. To achieve significant overall effects on student achievement, it would be important for both questions and adjusted instruction to be targeted to different levels of knowledge and skills, so that the learning needs of all students in the class could be detected in the first place and then met accordingly.

However, assessing different levels of knowledge and skills, and adapting instruction to meet these different learning needs, would be much more difficult than only ensuring that all students obtain basic knowledge and skills in the curriculum. It is possible that these difficulties could be overcome using professionally developed tests as a means of gathering information about the students' progress and understanding. Then the teachers would not have to develop these items themselves, and items on such tests may be more thought-out and aimed towards capturing the true diversity of student understanding. Such items could be used during lessons to complement other ways of collecting information about students' learning needs. Indeed, several studies have shown that the use of such tests can improve student achievement (Yeh, 2009; Burns et al., 2010; Koedinger et al., 2010; Faber et al., 2017; Murphy et al., 2020). A disadvantage of relying on these tests may be that they limit the teachers' flexibility regarding when and how to assess their students, and when and how to take action based on this assessment information (Palm et al., 2017). Using these kinds of tests as a main source for gathering information about students' learning could therefore hinder the development of practices in which teachers could continuously create and capitalise upon what Black and Wiliam (2009, p. 10) call 'moments of contingency', that is, opportunities to make timely adjustments to their teaching in order to meet their students' learning needs.

In their communication with students, the teachers also changed their emphasis away from the number of tasks to be solved in favour of focusing on the intended learning goals, and they started to present learning goals for each lesson. This sort of change may indeed be a first step towards teachers and students reaching a common understanding of the learning goals. However, the teachers did not go into detail with examples or more thorough descriptions of the learning goals, nor the criteria for attaining those goals at various levels, nor did they involve the students in active discussions and negotiations about the meaning of the goals. This change in emphasis towards the learning goals may be too superficial for the students to understand them well enough to use them as motivators, guidance in their own learning and guidance in their support of their peers’ learning ( Wiliam, 2007 ). Similarly, although some of the teachers started to hand over more responsibility for correcting diagnostic tests to the students, most of them did not teach the students how to regulate their own learning or how to assess and give feedback to their peers, which are approaches to formative assessment that can improve learning ( Ross, 2006 ; Graham et al., 2015 ; Palm et al., 2017 ; Sanchez et al., 2017 ; Andrade, 2019 ; Double et al., 2020 ). Thus, the group of teachers in the present study did implement practices that included all five key strategies, but most implemented activities were classified as pertaining to Key Strategies 1–3, so the focus was on the teacher as the responsible agent for the formative assessment process. This focus is similar to that identified in other studies aiming for the fourth approach to formative assessment ( Jönsson et al., 2015 ; Wylie and Lyon, 2015 ). Not providing sufficient support for students to act as proactive agents in the formative assessment processes, either as self-regulated learners or peer assessors, misses one possible factor that could improve student achievement. Indeed, an appropriate use of Key strategies 4 and 5 is sometimes seen as signifying the “spirit” of formative assessment ( Marshall and Drummond, 2006 ; DeLuca et al., 2019 ).

5.3. Limitations of the study and future research

It cannot be ruled out that the study would have produced different results if, for example, the time between the pre-test and post-test had been even longer, or if the student sample had included students with a longer history of participating in formative assessment practices. Such students might have been able to make more effective uses of the learning opportunities available to them via improved feedback based on more frequent assessments. In general, teachers’ formative assessment practices might also improve during an implementation. Many teachers may need to start by implementing some parts of a formative assessment practice, and when feeling confident with these improvements, extend their practices further. With time, teachers may also improve the quality of the practice that they implemented. Both of these patterns are consistent with the learning continuum for teachers’ implementation of formative assessment suggested by DeLuca et al. (2019) , and effects on student achievement may sometimes manifest after a longer period of time than the time span in the present study. This suggests that longitudinal studies on the effects of formative assessment on student achievement would be valuable contributions to the research community.

A limitation of the study is that we did not specifically study the practices of the control group. Since the teachers in the intervention group were randomly selected, the assumption is that both groups of teachers had similar practices before the PDP, and that only the teachers in the intervention group changed their practices after the PDP. If that assumption were not correct (if, for example, the teachers in the intervention group had shared their knowledge about formative assessment with their colleagues in the control group, or if the control-group teachers had developed formative assessment practices for other reasons), this could have affected the results of the study. However, this scenario seems unlikely, because developing formative assessment practices most often requires strong long-term professional development support with ample time for learning and implementation (Heitink et al., 2016), and there were no other professional development initiatives going on at the participating schools at the time. In addition, the teachers said they had not shared any information with colleagues who did not teach students in the intervention group (in fact, they said it was difficult to even find time to collaborate with their colleagues who did teach these students). Another possibility is that the quality of mathematics teaching in Swedish schools is particularly high, and that differences in achievement from teaching interventions would be easier to detect in countries where teachers do not engage in students' thinking as much. It is certainly possible that changes in teacher practices similar to those made in the current study might produce other results in other school contexts, but based on the results of international comparative studies such as PISA, Sweden does not stand out as an exceptional country when it comes to mathematics achievement (OECD, 2019).

Another limitation of this study is that it did not provide detailed specifications of some of the important characteristics of the teachers’ formative classroom practices. Further specifications that would have been useful include more details about the teachers’ feedback (e.g., how often each student received feedback and details about the teachers’ suggestions for how to improve), the quality of the teachers’ questions and tasks and the kinds of knowledge and skills they elicited, and details about the sort of adaptations to instructional activities teachers made in light of the assessment information they collected. Details about the interactions between the teachers and their students (for example, in feedback and support for peer-assessment and self-assessment) would also have been useful. Experimental studies connecting such specified characteristics to student outcomes would be valuable contributions to our understanding of the mechanisms underlying the impact of formative assessment and to the further development of a theory of formative assessment that is based on both theoretical and empirical evidence.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors upon request, without undue reservation.

Ethics statement

Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent from the participants’ legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements.

Author contributions

All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.

Acknowledgments

This article is based on parts of a doctoral dissertation by Boström (2017) .

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

1. ^ We use the terms formative assessment and assessment for learning synonymously. This use of terminology is in accordance with some scholars (e.g., Black and Wiliam, 2009 ; Bennett, 2011 ; Baird et al., 2014 ), whilst others use the terms with somewhat different connotations (e.g., Swaffield, 2011 ).

Andersson, C., Boström, E., and Palm, T. (2017). Formative assessment in Swedish mathematics classroom practice. Nord. Stud. Math. Educ. 22, 5–20.

Andersson, C., and Palm, T. (2017). The impact of formative assessment on student achievement: a study of the effects of changes to classroom practice after a comprehensive professional development programme. Learn. Instr. 49, 92–102. doi: 10.1016/j.learninstruc.2016.12.006

Andrade, H. (2019). A critical review of research on student self-assessment. Front. Educ. 4:87. doi: 10.3389/feduc.2019.00087

Baird, J., Hopfenbeck, T., Newton, P., Stobart, G., and Steen-Utheim, A. (2014). State of the Field Review: Assessment and Learning. Oslo: Report for the Norwegian Knowledge Centre for Education, case number 13/4697. Available at: http://forskningsradet.no

Bell, C., Steinberg, J., Wiliam, D., and Wylie, C. (2008). “Formative assessment and student achievement: two years of implementation of the keeping learning on track program” in Paper Presented at the Annual Meeting of the National Council on Measurement in Education (New York, NY).

Bennett, R. E. (2011). Formative assessment: a critical review. Assess. Educ. Princ. Policy Pract. 18, 5–25. doi: 10.1080/0969594x.2010.513678

Black, P., and Wiliam, D. (1998). Assessment and classroom learning. Assess. Educ. Princ. Policy Pract. 5, 7–74. doi: 10.1080/0969595980050102

Black, P., and Wiliam, D. (2009). Developing the theory of formative assessment. Educ. Assess. Eval. Account. 21, 5–31. doi: 10.1007/s11092-008-9068-5

Boström, E. (2017). Formativ bedömning: en enkel match eller en svår utmaning. Effekter av en kompetensutvecklingssatsning på lärarnas praktik och på elevernas prestationer i matematik [Formative assessment: a simple match or a difficult challenge. Effects of a professional development initiative on teachers’ practice and on students’ achievement in mathematics] (Doctoral dissertation, Umeå University, Sweden). Available at: http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-135038

Boström, E., and Palm, T. (2019). Teachers’ formative assessment practices: changes after a professional development programme and important conditions for change. Assess. Matters 13, 44–70. doi: 10.18296/am.0038

Boström, E., and Palm, T. (2020). Expectancy-value theory as an explanatory theory for the effect of professional development programmes in formative assessment on teacher practice. Teach. Dev. 24, 539–558. doi: 10.1080/13664530.2020.1782975

Briggs, D. C., Ruiz-Primo, M. A., Furtak, E., Shepard, L., and Yin, Y. (2012). Meta-analytic methodology and inferences about the efficacy of formative assessment. Educ. Meas. Issues Pract. 31, 13–17. doi: 10.1111/j.1745-3992.2012.00251.x

Burns, M., Klingbeil, D., and Ysseldyke, J. (2010). The effects of technology-enhanced formative evaluation on student performance on state accountability math tests. Psychol. Sch. 47, 582–591. doi: 10.1002/pits.20492

Chen, F., and Andrade, H. (2018). The impact of criteria-referenced formative assessment on fifth-grade students’ theater arts achievement. J. Educ. Res. 111, 310–319. doi: 10.1080/00220671.2016.1255870

Chen, F., Lui, A. M., Andrade, H., Valle, C., and Mir, H. (2017). Criteria-referenced formative assessment in the arts. Educ. Assess. Eval. Account. 29, 297–314. doi: 10.1007/s11092-017-9259-z

Cohen, L., Manion, L., and Morrison, K. (2011). Research Methods in Education (7th). New York, NY: Routledge.

DeLuca, C., Chapman-Chin, A., and Klinger, D. (2019). Toward a teacher professional learning continuum in assessment for learning. Educ. Assess. 24, 267–285. doi: 10.1080/10627197.2019.1670056

Desimone, L. (2009). Improving impact studies of teachers’ professional development: toward better conceptualizations and measures. Educ. Res. 38, 181–199. doi: 10.3102/0013189X08331140

Double, K. S., McGrane, J. A., and Hopfenbeck, T. N. (2020). The impact of peer assessment on academic performance: a meta-analysis of control group studies. Educ. Psychol. Rev. 32, 481–509. doi: 10.1007/s10648-019-09510-3

Dunn, K. E., and Mulvenon, S. W. (2009). A critical review of research on formative assessment: the limited scientific evidence of the impact of formative assessment in education. Pract. Assess. Res. Eval. 14, 1–11. doi: 10.7275/JG4H-RB87

Faber, J. M., Luyten, H., and Visscher, A. J. (2017). The effects of a digital formative assessment tool on mathematics achievement and student motivation: results of a randomized experiment. Comput. Educ. 106, 83–96. doi: 10.1016/j.compedu.2016.12.001

Flórez, M. T., and Sammons, P. (2013). Assessment for Learning: Effects and Impact . Reading: CfBT Education Trust.

Gorard, S. (2007). The dubious benefits of multi-level modeling. Int. J. Res. Method Educ. 30, 221–236. doi: 10.1080/17437270701383560

Graham, S., Hebert, M., and Harris, K. (2015). Formative assessment and writing. Elem. Sch. J. 115, 523–547. doi: 10.1086/681947

Hattie, J., and Timperley, H. (2007). The power of feedback. Rev. Educ. Res. 77, 81–112. doi: 10.3102/003465430298487

Heitink, M. C., Van der Kleij, F. M., Veldkamp, B. P., Schildkamp, K., and Kippers, W. B. (2016). A systematic review of prerequisites for implementing assessment for learning in classroom practice. Educ. Res. Rev. 17, 50–62. doi: 10.1016/j.edurev.2015.12.002

Hirsh, Å., and Lindberg, V. (2015). Formativ Bedömning på 2000-Talet: En översikt av Svensk Och Internationell Forskning, Delrapport Från Skolforsk-Projektet [Formative Assessment in the 21st Century: An Overview of Swedish and International Research] . Stockholm: Swedish Research Council.

Huang, F. (2016). Alternatives to multilevel modeling for the analysis of clustered data. J. Exp. Educ. 84, 175–196. doi: 10.1080/00220973.2014.952397

Jönsson, A., Lundahl, C., and Holmgren, A. (2015). Evaluating a large-scale implementation of assessment for learning in Sweden. Assess. Educ. Princ. Policy Pract. 22, 104–121. doi: 10.1080/0969594X.2014.970612

Kane, M. (2016). Explicating validity. Assess. Educ. Princ. Policy Pract. 23, 198–211. doi: 10.1080/0969594X.2015.1060192

Kingston, N., and Nash, B. (2011). Formative assessment: a meta-analysis and a call for research. Educ. Meas. Issues Pract. 30, 28–37. doi: 10.1111/j.1745-3992.2011.00220.x

Koedinger, K., McLaughlin, E., and Heffernan, N. (2010). A quasi-experimental evaluation of an on-line formative assessment and tutoring system. J. Educ. Comput. Res. 43, 489–510. doi: 10.2190/EC.43.4.d

Koenka, A. C., Linnenbrink-Garcia, L., Moshontz, H., Atkinson, K. M., Sanchez, C. E., and Cooper, H. (2021). A meta-analysis on the impact of grades and comments on academic motivation and achievement: a case for written feedback. Educ. Psychol. 41, 922–947. doi: 10.1080/01443410.2019.1659939

Lam, A., Ruzek, E., Schenke, K., Conley, A., and Karabenick, S. (2015). Student perceptions of classroom achievement goal structure: is it appropriate to aggregate? J. Educ. Psychol. 107, 1102–1115. doi: 10.1037/edu0000028

Maas, C., and Hox, J. (2004). Robustness issues in multilevel regression analysis. Stat. Neerl. 58, 127–137. doi: 10.1046/j.0039-0402.2003.00252.x

Marsh, H. W., Lüdtke, O., Nagengast, B., Trautwein, U., Morin, A. J. S., Abduljabbar, A. S., et al. (2012). Classroom climate and contextual effects: conceptual and methodological issues in the evaluation of group-level effects. Educ. Psychol. 47, 106–124. doi: 10.1080/00461520.2012.670488

Marshall, B., and Drummond, M. J. (2006). How teachers engage with assessment for learning: lessons from the classroom. Res. Pap. Educ. 21, 133–149. doi: 10.1080/02671520600615638

McMillan, J. H., Venable, J. C., and Varier, D. (2013). Studies of the effect of formative assessment on student achievement: so much more is needed. Pract. Assess. Res. Eval. 18, 1–15. doi: 10.7275/tmwm-7792

Murphy, R., Roschelle, J., Feng, M., and Mason, C. A. (2020). Investigating efficacy, moderators and mediators for an online mathematics homework intervention. J. Res. Educ. Effect. 13, 235–270. doi: 10.1080/19345747.2019.1710885

National Mathematics Advisory Panel. (2008). Chapter 6: Report of the Task Group on Instructional Practices. Available at: http://www.ed.gov/about/bdscomm/list/mathpanel/report/instructional-practices.pdf (Accessed September 12, 2014).

OECD (2019). PISA 2018 Results (Volume I): What Students Know and Can Do. Paris, France: PISA, OECD Publishing.

Palm, T., Andersson, C., Boström, E., and Vingsle, L. (2017). A review of the impact of formative assessment on student achievement in mathematics. Nord. Stud. Math. Educ. 22, 25–50.

Randel, B., Apthorp, H., Beesley, A., Clark, T., and Wang, X. (2016). Impacts of professional development in classroom assessment on teacher and student outcomes. J. Educ. Res. 109, 491–502. doi: 10.1080/00220671.2014.992581

Rohrbeck, C. A., Ginsburg-Block, M. D., Fantuzzo, J. W., and Miller, T. R. (2003). Peer-assisted learning interventions with elementary school students: a meta-analytic review. J. Educ. Psychol. 95, 240–257. doi: 10.1037/0022-0663.95.2.240

Ross, J. (2006). The reliability, validity, and utility of self-assessment. Pract. Assess. Res. Eval. 11, 1–13. doi: 10.7275/9wph-vv65

Ryan, R., and Deci, E. (2020). Intrinsic and extrinsic motivation from a self-determination theory perspective: definitions, theory, practices, and future directions. Contemp. Educ. Psychol. 61:101860. doi: 10.1016/j.cedpsych.2020.101860

Sanchez, C. E., Atkinson, K. M., Koenka, A. C., Moshontz, H., and Cooper, H. (2017). Self-grading and peer-grading for formative and summative assessments in 3rd through 12th grade classrooms: a meta-analysis. J. Educ. Psychol. 109, 1049–1066. doi: 10.1037/edu0000190

Schneider, C., and Randel, B. (2010). “Research on characteristics of effective professional development programs for enhancing educators’ skills in formative assessment” in Handbook of Formative Assessment . eds. H. Andrade and G. Cizek (New York: Routledge), 251–276.

Shute, V. J. (2008). Focus on formative feedback. Rev. Educ. Res. 78, 153–189. doi: 10.3102/0034654307313795

Skolverket. (2012). TIMSS 2011: Svenska Grundskoleelevers Kunskaper i Matematik Och Naturvetenskap i ett Internationellt Perspektiv (Rapport 380) [TIMSS 2011: Swedish compulsory school students’ knowledge in mathematics and science in an international perspective (Report 380)]. Stockholm: Skolverket.

Swaffield, S. (2011). Getting to the heart of authentic assessment for learning. Assess. Educ. Princ. Policy Pract. 18, 433–449. doi: 10.1080/0969594X.2011.582838

Timperley, H., Wilson, A., Barrar, H., and Fung, I. (2007). Teacher Professional Learning and Development: Best Evidence Synthesis Iteration . Wellington, New Zealand: Ministry of Education.

Wafubwa, R. N. (2020). Role of formative assessment in improving students’ motivation, engagement, and achievement: a systematic review of literature. Int. J. Assess. Eval. 28, 17–31. doi: 10.18848/2327-7920/CGP/v28i01/17-31

Wafubwa, R. N., and Csíkos, C. (2022). Impact of formative assessment instructional approach on students’ mathematics achievement and their metacognitive awareness. Int. J. Instr. 15, 119–138. doi: 10.29333/iji.2022.1527a

Wiliam, D. (2007). “Keeping learning on track: classroom assessment and the regulation of learning” in Second Handbook of Mathematics Teaching and Learning . ed. F. K. Lester Jr. (Greenwich, CT: Information Age Publishing), 1053–1098.

Wiliam, D. (2011). Embedded Formative Assessment . Bloomington, Indiana: Solution Tree Press.

Wiliam, D., Lee, C., Harrison, C., and Black, P. (2004). Teachers developing assessment for learning: impact on student achievement. Assess. Educ. Princ. Policy Pract. 11, 49–65. doi: 10.1080/0969594042000208994

Wiliam, D., and Thompson, M. (2008). “Integrating assessment with learning: what will it take to make it work?” in The Future of Assessment: Shaping Teaching and Learning . ed. C. A. Dwyer (Mahwah, NJ: Lawrence Erlbaum Associates), 53–82.

Wisniewski, B., Zierer, K., and Hattie, J. (2020). The power of feedback revisited: a meta-analysis of educational feedback research. Front. Psychol. 10:3087. doi: 10.3389/fpsyg.2019.03087

Wylie, E. C., and Lyon, C. J. (2015). The fidelity of formative assessment implementation: issues of breadth and quality. Assess. Educ. Princ. Policy Pract. 22, 140–160. doi: 10.1080/0969594X.2014.990416

Yeh, S. S. (2009). Class size reduction or rapid formative assessment? A comparison of cost-effectiveness. Educ. Res. Rev. 4, 7–15. doi: 10.1016/j.edurev.2008.09.001

Keywords: formative assessment, assessment for learning, student achievement, effect, mathematics

Citation: Boström E and Palm T (2023) The effect of a formative assessment practice on student achievement in mathematics. Front. Educ . 8:1101192. doi: 10.3389/feduc.2023.1101192

Received: 17 November 2022; Accepted: 23 February 2023; Published: 16 March 2023.

Copyright © 2023 Boström and Palm. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Erika Boström, [email protected] ; Torulf Palm, [email protected]

Original research

Formative peer assessment in higher healthcare education programmes: a scoping review

Marie Stenberg

Department of Care Science, Faculty of Health and Society, Malmö University, Malmö, Sweden

Elisabeth Mangrio

Mariette Bengtsson, Elisabeth Carlson

Associated data

bmjopen-2020-045345supp001.pdf

bmjopen-2020-045345supp002.pdf

All data relevant to the study are included in the article or uploaded as online supplemental information. No additional data available.

Abstract

Formative peer assessment focuses on learning and development of the student learning process. This implies that students take responsibility for assessing the work of their peers by giving and receiving feedback on each other’s work. The aim was to compile research about formative peer assessment presented in higher healthcare education, focusing on the rationale, the interventions, the experiences of students and teachers and the outcomes of formative assessment interventions.

Design

A scoping review.

Data sources

Searches were conducted until May 2019 in PubMed, Cumulative Index to Nursing and Allied Health Literature, Education Research Complete and Education Research Centre. Grey literature was searched in Library Search, Google Scholar and Science Direct.

Eligibility criteria

Studies addressing formative peer assessment in higher education, focusing on medicine, nursing, midwifery, dentistry, physical or occupational therapy and radiology published in peer-reviewed articles or in grey literature.

Data extraction and synthesis

Out of 1452 studies, 37 met the inclusion criteria and were critically appraised using the relevant Critical Appraisal Skills Programme, Joanna Briggs Institute and Mixed Methods Appraisal Tool instruments. The pertinent data were analysed using thematic analysis.

Results

The critical appraisal resulted in 18 included studies of high and moderate quality. The rationale for using formative peer assessment relates to giving and receiving constructive feedback as a means to promote learning. The experience and outcome of formative peer assessment interventions from the perspective of students and teachers are presented within three themes: (1) organisation and structure of the formative peer assessment activities, (2) personal attributes and consequences for oneself and relationships and (3) experience and outcome of feedback and learning.

Conclusions

Healthcare education must consider preparing and introducing students to collaborative learning, and thus develop well-designed learning activities aligned with the learning outcomes. Since peer collaboration seems to affect students’ and teachers’ experiences of formative peer assessment, empirical investigations exploring collaboration between students are of utmost importance.

Strengths and limitations of this study

  • The current scoping review was previously presented in a published study protocol.
  • Four databases were systematically searched to identify research on formative peer assessment.
  • Critical appraisal tools were used to assess the quality of studies with quantitative, qualitative and mixed-methods designs.
  • Articles appraised as high or moderate quality were included.
  • Since only English studies were included, studies may have been missed that would otherwise have met the inclusion criteria.

Introduction

Peer assessment is an educational approach where feedback, communication, reflection and collaboration between peers are key characteristics. In a peer assessment activity, students take responsibility for assessing the work of their peers by giving (and receiving) feedback on a specific subject. 1 It allows students to consider the learning outcomes for peers of similar status and to reflect on their own learning mirrored in a peer. 2 Peer assessment has been shown to support students’ development of judgement skills, critiquing abilities and self-awareness, as well as their understanding of the assessment criteria used in a course. 1

In higher education, peer assessment has been a way to move from an individualistic and teacher-led approach to a more collaborative, student-centred approach to assessment 1 aligned with social constructivism principles. 3 In this social context of interaction and collaboration, students can expand their knowledge, identify their strengths and weaknesses, and develop personal and professional skills 4 by evaluating the professional competence of a peer. 5 Peer assessment can be used in academic and professional settings as a strategy to enhance students’ engagement in their own learning. 6–8 The collaborative aspect of peer assessment relates to professional teamwork, as well as to broader goals of lifelong learning. As argued by Boud et al, 1 peer assessment addresses course-specific goals not readily developed otherwise. For healthcare professions, it enhances the ability to work in a team in a supportive and respectful atmosphere, 9 which is highly relevant for patient outcomes and the reduction of errors compromising patient safety. 10 However, recent research has shown that peer collaboration is challenging 11 and that healthcare professionals are not prepared to deliver and receive feedback effectively. 12 This emphasises the importance for healthcare educators of supporting students with activities fostering these competences.

Feedback is highly associated with enhancing student learning 13 and modifying learning during the learning process 14 as a means for students to close the gap between their present state of learning and their desired goal(s). Peer feedback can be written or oral and conducted as peer observations in small or large groups. 8 Further, it is driven by set assessment criteria, 1 and it can be either summative or formative, formal or informal. Summative assessment evaluates students’ success or failure after the learning process, 15 whereas formative assessment aims for improvement during the learning process. 4 16 According to Black and Wiliam, 15 formative peer assessment activities involve feedback to modify the teaching and learning of the students. The intention of feedback is to help students help each other when planning their learning. 4 17 An informal formative peer assessment activity involves a continuous process throughout a course or education, whereas a formal one is designated to a single point in a course.

Earlier research on peer assessment in healthcare education has provided an overview of specific areas within the peer assessment process. For example, Speyer et al presented psychometric characteristics of peer assessment instruments and questionnaires in medical education, 18 concluding that quite a few instruments exist; however, these instruments mainly focus on professional behaviour and lack sufficient psychometric data. Tornwall 12 focused on how nursing students were prepared by academics to participate in peer assessment activities and highlighted the importance of creating a supportive learning environment. Lerchenfeldt et al 19 concluded that peer assessment supports medical students in developing professional behaviour and that peer feedback is a way to assess professionalism. Khan et al 20 reviewed the role of peer assessment in objective structured clinical examinations (OSCE), showing that peer assessment promotes learning but that students need training in how to provide feedback. In short, the existing literature contributes valuable knowledge about formative peer assessment in healthcare education targeting specific areas. However, there seems to be a lack of compiled research considering formative peer assessment in its entirety, including the context, rationale, experience and outcome of the formative peer assessment process. Therefore, this scoping review attempts to present an overview of formative peer assessment in healthcare education rather than specific areas within that process.

Methods

This scoping review was conducted using the York methodology by Arksey and O’Malley 21 and the recommendations presented by Levac et al. 22 We constructed a scoping protocol, using the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols (PRISMA-P), to present the planned methodology for the scoping review. 23

Aim and research questions

We aimed to compile research about formative peer assessment presented in higher healthcare education. The research questions were as follows: What are the rationales for using formative peer assessment in healthcare education? How are formative peer assessment interventions delivered in healthcare education and in what context? What experiences of formative peer assessment do students and teachers in healthcare education have? What are the outcomes of formative peer assessment interventions? We used the ‘Population Concept and Context’ elements recommended for scoping reviews to establish effective search criteria ( table 1 ). 24

Table 1. The Population Concept and Context mnemonic as recommended by the Joanna Briggs Institute

Relevant studies identified

The literature search was conducted in the databases PubMed, Cumulative Index to Nursing and Allied Health Literature, Education Research Complete and Education Research Centre. Search tools such as Medical Subject Headings, subject headings, thesaurus terms and Boolean operators (AND/OR) helped expand and narrow the search. Initially, the search terms were broad (eg, peer assessment or higher education) in order to capture the range of published literature. However, the extensiveness of the material made it necessary to narrow the search terms and organise them in three major blocks. The following inclusion criteria were applied in the search: (1) articles addressing formative peer assessment in higher education; (2) students and teachers in medicine, nursing, midwifery, dentistry, physical or occupational therapy and radiology; and (3) peer-reviewed articles and grey literature (books, discussion papers, posters, etc). Studies of summative peer assessment, instrument development and systematic reviews were excluded. We incorporated several similar terms related to peer assessment in the search to ensure that no studies were missed (online supplemental appendix 1). Furthermore, we consulted a well-versed librarian with experience of systematic searching 25 to assist us in systematically identifying relevant databases and search terms for each database, control the relevance of the constructed search blocks and manage the data in a reference management system. No limitation was set for year; all studies indexed in the four databases up to the last search on 28 May 2019 were included.
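
To make the block structure concrete, the short sketch below assembles a Boolean query in the way described above: synonyms within a block are joined with OR, and the blocks are then combined with AND. The terms and the resulting query string are illustrative placeholders only; the actual search strategy is the one documented in online supplemental appendix 1.

```python
# Illustrative sketch only: hypothetical terms, not the review's actual strategy
# (see online supplemental appendix 1 for the real search blocks).
peer_assessment_block = ["peer assessment", "peer feedback", "peer evaluation"]
education_block = ["higher education", "undergraduate", "university"]
profession_block = ["medicine", "nursing", "midwifery", "dentistry",
                    "physical therapy", "occupational therapy", "radiology"]

def or_block(terms):
    """Join the synonyms within one block with OR, quoting each phrase."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"

# Combining the blocks with AND narrows the search, as described above.
query = " AND ".join(or_block(block) for block in
                     (peer_assessment_block, education_block, profession_block))
print(query)
```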

Supplementary data

Study selection

The process of the study selection and the reasons for exclusion are presented in a flow diagram 26 (figure 1). First, the first author (MS) screened all 1452 titles. Second, MS read all the abstracts, gave those responding to the research questions a unique code, and organised them in a reference management system. The reasons for inclusion and exclusion at title and abstract level were charted by the first author and critically discussed within the team (MS, EM, MB and EC). An additional hand search of reference lists was conducted. To cover a subject in full, a scoping review should include a search of grey literature. 21 22 Therefore, the grey literature was scoped to find unpublished results by searching Google Scholar, LibSearch and Science Direct. The grey literature mostly contained research posters, conference abstracts, discussion papers and books, but the hand search revealed original research articles that were added for further screening and appraisal. Finally, the first author (MS) arrived at 81 studies, read them in full text, and discussed them with the other three authors (EM, MB and EC).

Figure 1. PRISMA flow chart. ERC, Education Research Centre; ERIC, Education Research Complete; PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses.

Charting the data

We constructed a charting form to facilitate the screening of the full-text studies (online supplemental appendix 2). Out of the 81 studies, 37 met the inclusion criteria and were appraised for quality using the Critical Appraisal Skills Programme (CASP). 27 The reason for conducting a critical appraisal of the studies was to enhance the use of the findings for policy-making and practice in higher healthcare education. 28 To investigate the interpretation of the quality instruments, three members of the research team (MS, EM and EC) conducted an initial test assessment of two randomly selected studies and graded them as high, moderate or low quality. Additional screening tools were used for studies with a mixed-methods design 29 and for cross-sectional studies, 30 which are not covered by CASP. When a discrepancy arose, a fourth researcher (MB) assessed the articles independently, without prior knowledge of what the others had concluded. This was followed by a discussion among all four researchers to secure internal agreement on how to further interpret the checklist items and the quality assessments. Consequently, to ensure high quality, the studies had to have a ‘yes’ answer for a majority of the questions. If ‘no’ dominated, the study was excluded. Since earlier reports 31 have raised and discussed the importance of ethical issues in systematic reviews, all screening protocols in this review included ethical considerations as an individual criterion. The first author critically appraised all 37 articles, and 15 articles were divided between the team members (EM, MB and EC) and independently appraised. Nevertheless, during the screening process all 37 articles were critically discussed using the Rayyan system for systematic reviews 32 before the final decision on inclusion. The critical appraisal resulted in 18 studies of high and moderate quality (table 2).
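
As a minimal illustration of the appraisal decision rule stated above (a study was retained only if ‘yes’ answers formed a majority of the checklist items, and excluded if ‘no’ dominated), the sketch below encodes that rule. The cut-off separating ‘high’ from ‘moderate’ quality is not specified in the text, so the threshold used here is an assumption made purely for illustration.

```python
# Minimal sketch of the stated appraisal rule; the high/moderate cut-off is an
# assumption for illustration and is not specified in the review.
from collections import Counter

def appraise(answers, high_share=0.8):
    """answers: 'yes'/'no' responses to the items of a CASP-style checklist."""
    counts = Counter(answer.lower() for answer in answers)
    yes, no = counts.get("yes", 0), counts.get("no", 0)
    if no >= yes:                       # 'no' dominates (or ties): exclude
        return "excluded (low quality)"
    return "high quality" if yes / len(answers) >= high_share else "moderate quality"

print(appraise(["yes"] * 9 + ["no"]))        # -> high quality
print(appraise(["yes"] * 6 + ["no"] * 4))    # -> moderate quality
print(appraise(["yes"] * 4 + ["no"] * 6))    # -> excluded (low quality)
```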

Table 2. Overview of included studies

*High equals majority of items in the critical appraisal tools.

†Twenty-four students included in the intervention, and 19 attended the focus group session.

‡Twelve students received faculty feedback, and 12 students received peer feedback.

OSCE, objective structured clinical examination.

Collating, summarising and reporting results

The analysis process followed the five phases of thematic analysis described by Braun and Clarke, 33 with the support of a practical guide provided by Maguire and Delahunt. 34 The first phase included familiarisation with the data. Therefore, prior to the coding process, we read all the articles to gain a first impression of the results presented within the included studies. We then conducted a theoretical thematic analysis, meaning that the results were deductively coded, 33 guided by the research questions. We read the results a second time before starting the initial coding. The codes consisted of short descriptions close to the original text. The codes were then combined into themes and subthemes. The themes were identified with a semantic approach, meaning that they were explicit: we did not look for anything beyond what was written. 33 Finally, we constructed a thematic map to present an overview of the results and how the themes related to each other. The results from the studies are presented narratively.

Consultation

Consultation is an optional stage in scoping reviews. 21 However, since it adds methodological rigour, 22 we presented and discussed the preliminary results and the thematic map with nine academic teachers who are experts within the field of healthcare education and pedagogy. The purpose of the consultation was to enhance the validity of the results of the scoping review and to facilitate appropriate dissemination of outputs. 33 The expert group responded to four questions: Do the themes make sense? Is too much data included in one single theme? Are the themes distinct or do they overlap? Are there themes within themes? 34 The consultation resulted in a revision of a few themes and the way they related to each other.

Patient and public involvement

No patients or members of the public were involved.

Results

The 18 included studies were published between 2002 and 2017 in the USA (6), the UK (6), Australia (3), Canada (2) and the United Arab Emirates (1) (table 3). The studies were conducted in medical (12), dental (2), nursing (2), occupational therapy (1) and radiography (1) education programmes. Six studies were presented in the framework of an existing collaborative educational model. 35–40 Our review revealed that the most frequent setting for formative peer assessment activities is within clinical skill-training courses, 35 39–47 involving intraprofessional peers. The common rationale for using formative peer assessment is to support students’ learning, usually explained by the inherent learning of the feedback process, 35 39 40 43–45 47–51 and to prepare students for professional behaviour and provide them with the skills required in the healthcare professions. 36–38 46–49 52 Table 3 presents the results of the analysis related to the research questions of context, rationale and interventions of formative peer assessment.

Table 3. Overview and summary of the context, rationale and interventions of formative peer assessment presented in the included studies.

*Appears in how many of the included 18 studies.

The results related to the research questions about the experience of students and teachers and the outcome of formative peer assessment interventions fall within three themes: (1) the organisation and structure of peer assessment activities, (2) personal attributes and consequences for oneself and one’s peer relationships and (3) the experience and outcome of feedback and learning.

The organisation and structure of formative peer assessment activities

In the reviewed studies, students express that the responsibility of faculty is a key component in formative peer assessment, meaning that faculty must clearly state the aim of the peer assessment activity. Students highlight the need to be prepared and trained in how to give and receive constructive feedback. 36 47 50–52 The learning activities need to be well designed and supported by guidelines on how to use them. 35 36 50 52 Otherwise, students may be discouraged from participating in the peer activities. 52 Novice students find it difficult to be objective and to offer constructive criticism in a group. 36 46 This emphasises the importance of responsibility from faculty, especially when students are to give feedback on professional behaviour. 52 Some students prefer direct communication with peers when feedback is negative, whereas others think it is the responsibility of faculty. 52 There is some ambiguity regarding whether feedback should be given anonymously or not, 47 52 whether it should bear consequences from faculty or not, 52 whether it should be informal or formal, and whether the peer should be at the same academic level or at a more experienced, higher level. 50 52 Moreover, some students express how they favour small groups, 41 49 as students in small groups are more active than those in large groups. 41 Students and teachers agree that peer assessment should be strictly formative rather than summative. 42 46 52 Teachers see themselves as key facilitators and express that students value feedback from teachers rather than from peers (in terms of credibility). 51 Students express similar sentiments, even if they appreciate the peer feedback. 40 42 44 46 However, teachers confirm the need for training and preparing students early in the education, as well as the need for their own professional development to guide students effectively. 51

Personal attributes and the impact and consequences for oneself and one’s peer relationships

Students generally focus on how peer assessment activities may affect their personal relationships in a negative way. 35 37 42 50 52 They express worry over consequences for themselves and their social relationships 37 40 52 as well as anxiety that negative feedback given to a peer may affect the grading from faculty. 52 Moreover, students emphasise the importance of enthusiasm and engagement in listening to peers’ opinions during their collaboration. 36 47 They mention positive personal attributes and behaviours, such as being organised, polite and helpful, as supportive of peer collaboration. 36 47 Further, they mention the importance of both a positive and close relationship between students and faculty 52 and a positive culture in the learning environment. 40 While students highlight the impact on and consequences for personal relationships, teachers speak of the importance of respect in formative peer assessment, 36 including respect for each other, for the learning activity, and for the collaboration and interaction. 36 Further, teachers emphasise the importance of students being self-aware, being well prepared and taking their own responsibility for the peer assessment activity. 36

The experience and outcome of feedback and learning

According to the students in the reviewed studies, formative peer assessment contributes to developing the skills needed in practice and in their future profession. 35 36 40 41 48 52 They appreciate the opportunity to give and receive feedback from a peer, 35 36 40 42 47 48 50 and they agree that the feedback they received made them change how they worked 42 48 or how they taught their peers. 47 48 They consider activities such as observation of others’ performance beneficial for learning, because such activities make them reflect on their own performance 35 36 40 41 46 49 50 and help them identify knowledge gaps. 35 40 49 Students with prior experience of peer learning are more likely to provide specific, guiding feedback than those without such experience. 39 Moreover, two studies showed significantly improved test results for students who took part in a peer feedback activity compared with those who did not. 43 49 Further, students thought they could be honest in their feedback and would learn better if the feedback was more in-depth. 35 46 Students at entry level tend to give more positive feedback than senior students; they also focus on practical and clinical knowledge, whereas more senior students focus on communication, management and leadership in their feedback comments. 45 A study exploring what students remember of received feedback points to memories of positive growth, negative self-image and negative attitudes towards classmates. Received feedback sometimes confirmed personal traits the students already knew about. 37 In addition, negative feedback was more likely to result in a change in their work habits and interpersonal attributes. 37 Students expressed some anxiety regarding the usefulness of feedback from low-performing students 40 50 and non-motivated students, which contributes to ineffective interaction and learning. 36 47 Low-performing students show a lack of initiative, preparation and respect, but also improvement in their grades after the peer assessment experience. 47 Furthermore, feedback from peers can be a predictor of a student’s unprofessional behaviour; hence, it could be used as a tool for early remediation. 38 In an evaluation of faculty examiners’ experience of students’ feedback, the faculty express that they consider student feedback to be given in a professional and appropriate way and that they would have given similar feedback themselves. 42 In an OSCE examination where a checklist was used, the results showed a statistically significant difference between faculty examiners’ and student examiners’ assessments. 42

Discussion

We found that formative peer assessment is a process with two consecutive phases. The first phase concerns the understanding of the rationale and foundation of the peer assessment process for students and faculty members. The results indicate that the rationale is to support student learning and prepare students for the healthcare professions. The formative peer assessment activities support students’ reflection on their own knowledge and development when mirrored in a peer by alternating the roles of observer and observed. 53 54 Formative peer assessment further contributes to skills such as communication, transfer of understandable knowledge and collaboration, all significant core competences when caring for patients and their relatives. 54 For faculty, organising formative peer assessment can be cost-beneficial. This was recently emphasised in the context of high-volume classes, where costs were reduced when students, rather than teachers, gave feedback to a peer. 55 Nevertheless, students express the importance of clarifying the aim of the peer assessment activity and the responsibility of the faculty. We recommend that faculty clearly define the activity and explain how it supports student learning and professionalism, especially when students are to provide feedback to each other on sensitive matters, such as unprofessional behaviour. A collaborative activity between students requires trust, and the real intention must be made transparent. 4 56–58 Moreover, to enable student development in line with the learning outcomes, the learning activity needs to be well designed and understood by students. 59–61 However, Casey et al 62 recommended further investigations of how to prepare students for the peer assessment activities.

The second phase concerns the organisation and structure of the formative peer assessment activity, for example, how to give and receive feedback and the complexity of peer collaboration, as it affects students’ emotions concerning both themselves and their relationships with their peers. This coincides with earlier research emphasising the social factors of peer assessment and the importance for teachers to consider them. 4 Nevertheless, surprisingly few studies highlight the collaborative part of peer assessment. 4 11 One reason might be that formative peer assessment is often presented as a ‘stand-alone’ activity rather than embedded in a collaborative learning environment. 8 63 We agree with earlier research 64 65 arguing that peer assessment needs to be affiliated with practices of collaborative learning. Similar implications are presented by Tornwall, 12 who concluded that integrating peer collaboration as a natural approach throughout education is important for supporting student development.

Limitations

Previous methodological concerns and discussions have related to the systematic approach to handling grey literature. 66 67 We argue that the grey literature may contribute to a wider understanding of the research area. Nevertheless, when we conducted a critical appraisal of the included studies, the grey literature was excluded due to a lack of methodological rigour. We therefore recommend taking this time-consuming phase of the methodology into account when planning scoping reviews. We further acknowledge that the last search was conducted in May 2019; additional studies might have been included if the search had been repeated after this date or extended to other databases than the ones presented. Further, the current scoping review has not fully elucidated the perspective of teachers and faculty. Few of the included studies highlighted the teachers’ perspective, which is why further research is required.

Conclusions and implications for further research

Some have argued that research on peer assessment is deficient in referring to exactly what peer assessment aims to achieve. 68 We conclude that within healthcare education the aim of formative peer assessment is to prepare students for the collaborative aspects crucial within the healthcare professions. However, healthcare education must consider preparing and introducing students to collaborative learning; therefore, well-designed learning activities aligned with the learning outcomes need to be developed. Based on this scoping review, formative peer assessment needs to be implemented in a collaborative learning environment throughout the education to be effective. However, since peer collaboration seems to affect students’ and teachers’ experience of formative peer assessment, empirical investigations exploring the collaboration between students are of utmost importance.

Supplementary Material

Acknowledgments

Special thanks to the members in the expert group for their valuable contribution in the consultation.

Twitter: @Have none

Contributors: MS led the design, search strategy, and conceptualisation of this work and drafted the manuscript. EM, MB and EC were involved in the conceptualisation of the review design, inclusion and exclusion criteria, and critical appraisal and provided feedback on the methodology and the manuscript. All authors give their approval to the publishing of this scoping review manuscript.

Funding: The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

Competing interests: None declared.

Provenance and peer review: Not commissioned; externally peer reviewed.

Supplemental material: This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.

Ethics statements

Patient consent for publication

Not required.

Resources for Assessment in Project-Based Learning

Project-based learning (PBL) demands excellent assessment practices to ensure that all learners are supported in the learning process. With good assessment practices, PBL can create a culture of excellence for all students and ensure deeper learning for all. We’ve compiled some of the best resources from Edutopia and the web to support your use of assessment in PBL, including information about strategies, advice on how to address the demands of standardized tests, and summaries of the research.

PBL Assessment Foundations

Watch this video to discover how assessment can be integrated seamlessly into project-based learning to measure student understanding from the beginning to the end of a project:

Angela Haydel DeBarger describes research-based strategies for implementing PBL projects that are rigorous and engaging: the importance of having students create products that address the driving question, providing ongoing opportunities for feedback and reflection, and presenting work to an authentic audience.

Explore responses to questions directed toward teachers in the field, in this post by Andrew Larson. You'll find strategies for reporting on content and success skills; educators also describe the use of traditional assessments like quizzes and tests.

High school teacher Katie Piper shares honest feedback about the challenges associated with assessing students fairly during the PBL process, where collaboration is key. Some strategies include conducting individual assessment of team products, as well as "weighted scoring" and "role-based" assessment practices.

In this blog post, classroom teacher Matt Weyers explains how he shifted the conversation in his classroom from getting a grade to student learning. He shares his step-by-step plan and also the great results.

Read about how PBL school model Expeditionary Learning  approaches assessment within project-based learning, in this interview with Ron Berger. Berger emphasizes student ownership of the assessment process and references several videos of sample PBL project assessments. 

In this post from Michael Hernandez, find ideas for conducting multidimensional evaluation to encourage students, provide meaningful feedback, and set students up for success within project-based learning.

PBL and Formative Assessment Practices

In another blog post from Matt Weyers, find great tips on using formative assessment within the PBL process to drive student learning. Weyers explains how to use the driving question to prompt reflection and the "Need to Know" to check for understanding.

John Larmer, editor in chief for the Buck Institute for Education, shares practical strategies to ensure students submit their best work, including reflective questions for teachers to use: questions around rubrics, formative assessment, authenticity, and time for revision and reflection. These assessment practices help students improve and share exemplary work.

Writer Suzie Boss explains how formative assessment within project cycles can empower students to learn more and experience more success. Along the way, she underscores the value of framing mistakes as learning opportunities, the excitement of risk-taking, and the importance of describing clear learning goals throughout a project.

PBL and Standardized Tests

In this article for District Administration, regular Edutopia blogger Suzie Boss tells the story of how schools are meeting the challenge of standardized tests and moving past the “bubble” exam; she also highlights how educators are overcoming fear and anxiety around assessing critical thinking and content.

This Knowledge in Action research project from the University of Washington explores how well-designed PBL can meet, and in many ways, surpass what the AP exam assesses, including both content learning objectives and goals around 21st-century skills. 

Edutopia blogger Andrew Miller provides specific and practical strategies to address the demands of standardized tests while doing great PBL projects. In addition to embedding standardized tests prompts within the project, Miller suggests implementing PBL projects where they fit, targeting power standards, and examining standardized tests to see what students will need to be successful. Because these projects are powerful learning tools, there's no need to wait for testing season to get started.

PBL Assessment Research

Read about PBL assessment that supports student success in this page from Edutopia's comprehensive research review on project-based learning.

The Buck Institute for Education has compiled, as well as conducted, comprehensive research on PBL. Many of the studies shared on this page specifically address assessment practices and results.

We hope these resources on PBL Assessment help ensure that students learn not only content but also the skills they need to be "future ready." Use these ideas and tools to alleviate concerns you have around assessment and PBL and to support the design of effective PBL projects.

Educational and Psychological Instruments

  • Finding tests and test info
  • Citing Tests
  • More on Tests in Books
  • Selected Journals and Organizations
  • Reliability, Validity, and Ethical Use
  • More on Educational Assessment
  • Assistance with Instrument Design

Books on Education Assessment

  • Library Catalog Search for resources on Educational tests and measures
  • Library Catalog Search for resources on Educational tests and measurements -- Standards


COMMENTS

  1. Formative assessment: A systematic review of critical teacher prerequisites for classroom practice

    1. Introduction. Using assessment for a formative purpose is intended to guide students' learning processes and improve students' learning outcomes (Van der Kleij, Vermeulen, Schildkamp, & Eggen, 2015; Bennett, 2011; Black & Wiliam, 1998).Based on its promising potential for enhancing student learning (Black & Wiliam, 1998), formative assessment has become a "policy pillar of educational ...

  2. PDF The Effects of Formative Assessment on Academic Achievement ...

    Formative assessment is viewed as the most critical assessment strategy in many cities of Canada as well. Countries such as Finland, Germany, Sweden and Spain also emphasize ... According to the research results, formative assessment was the third most influential factor among 138 factors for students' achievement. In the same order, feedback ...

  3. (PDF) Formative assessment: A critical review

    Officers (CCSSO) and contained in McManus (2008, 3): 'Formative assessment is a process used by teachers and students during instruction that provides feedback to adjust ongoing teaching and ...

  4. Formative assessment and feedback for learning in higher education: A

    This section summarises high-quality studies focusing on the content and delivery of feedback and formative assessment. This includes research that examines a range of issues such as whether students receive feedback (or not), as well as the level of detail, amount and content of formative assessment tasks and feedback.

  5. The effectiveness of formative assessment for enhancing reading

    Introduction. In an era of reconfiguring the relationship between learning and assessment, spurred by quantitative and qualitative evidence, formative assessment is proffered to meet the goals of lifelong learning and promote high-performance and high equity for all students ().It has gained momentum among researchers and practitioners in various culture contexts.

  6. PDF The Effect of Formative Assessment Practices on Student Learning ...

    Examining the effectiveness of formative assessment and its moderators (i.e. types of formative assessment interventions, education level) would contribute to the literature. In this sense, the following research questions were asked in this study: 1) What effect do formative assessment interventions have on student learning according to

  7. A whole learning process-oriented formative assessment ...

    However, current research on the formative assessment of complex skills has faced a series of challenges, including the need to construct an assessment framework concerning specific complex skills ...

  8. PDF The Research Status of Formative Assessment

    bibliometric research on formative assessment is important for researchers because it enables them to gain insights into the field, evaluate research impact, identify research gaps, and contribute to evidence-based educational decision-making. This study focuses on articles published between 2008 and 2022 on formative assessment in

  9. Formative Assessment and Feedback Strategies

    In their synthesis of feedback research in view of self-regulated learning, Butler and Winne adopted a cyclical conceptualization of formative assessment and feedback from a learner's perspective. In the context of peer assessment and peer feedback, this cyclical conceptualization of the formative assessment process has been, for example, picked up by Reinholz ().

  10. Formative Assessment in Educational Research Published at ...

    In this context, being able to review qualified academic research is very valuable. Therefore, we aimed to conduct a bibliometric analysis using the VOSviewer to determine the focus of research on formative assessment in education (FAE). This bibliometric analysis included 447 studies on FA published in the Web of Science (WoS) from 2000 to 2021.

  11. Full article: A systematic review on factors influencing teachers

    The benefits of formative assessment manifested in research studies have made it an important agenda in educational reform all round the world (Birenbaum et al., Citation 2015) although the effects vary widely among different implementations and student populations (Bennett, Citation 2011).

  12. Full article: Formative assessment, growth mindset, and achievement

    While research on formative assessment focuses on external teaching practices, work on growth mindset emphasises internal psychological processes. This study examined the interplay between three formative assessment strategies (i.e. sharing learning progressions, providing feedback, and instructional adjustments) and growth mindset in ...

  13. Frontiers

    Research has shown that formative assessment can enhance student learning. However, it is conceptualised and implemented in different ways, and its effects on student achievement vary. A need has been identified for experimental studies to carefully describe both the characteristics of implemented formative assessment practices and their impact on student achievement. We examined the effects ...

  14. Formative Assessment: Balancing Educational Effectiveness and Resource Efficiency

    The research findings highlight the crucial importance of assessment generally, and formative assessment in particular, for student learning in higher education. Research pressures, larger classes and more distance learning are all challenges that make the balancing act between resource efficiency and educational effectiveness increasingly ...

  15. The Impact of Formative Assessment and Learning ...

    One of the most frequently cited works on formative assessment is the research review conducted by Black and Wiliam in 1998. The analysis compiled over 250 publications, both quantitative and qualitative, which were found to ...

  16. The effectiveness of formative assessment

    Formative assessment fosters self-learning and provides productive feedback on students' learning outcomes, thereby significantly influencing students' motivation and achievement (Dix, 2017) ...

  17. Assessment for Learning

    Formative assessment methods have been important in raising overall levels of student achievement. Quantitative and qualitative research on formative assessment has shown that it is perhaps one of the most important interventions for promoting high performance ever studied. In their influential 1998 review of ...

  18. Practical Assessment, Research, and Evaluation

    Dunn and Mulvenon: Review of the Literature. Over the past several years, a growing emphasis on the use of formative assessment has emerged, yet formative assessment has remained an enigma in the literature (Black & Wiliam, 1998; Leung & Mohan, 2004). When ...

  19. Formative vs. summative assessment: impacts on academic motivation

    As assessment plays an important role in the process of teaching and learning, this research explored the impacts of formative and summative assessments on academic motivation, attitude toward learning, test anxiety, and self-regulation skill of EFL students in Iran.

  20. A Critical Review of Research on Formative Assessment: The Limited Scientific Evidence of the Impact of Formative Assessment in Education

    A Critical Review of Research on Formative Assessment: The Limited Scientific Evidence of the Impact of Formative Assessment in Education. January 2009. DOI: 10.4324/9780203462041_chapter_1

  21. Using Formative Assessment and Feedback from Student ...

    2.2 Student Response Systems (SRS). As consensus grows on the effectiveness of formative assessment in teaching mathematics and statistics, research has also turned toward examining new ways of delivering feedback to students through technology (Suurtam, 2012). Beyond only clickers, educators are increasingly using interactive SRS to provide students with instant feedback, to generate ...

  22. Formative peer assessment in higher healthcare ...

    Some have argued that research on peer assessment is deficient in referring to exactly what peer assessment aims to achieve. We conclude that within healthcare education, the aim of formative peer assessment is to prepare students for the collaborative aspects that are crucial within the healthcare professions. However, healthcare education must ...

  23. 7 Smart, Fast Formative Assessment Strategies

    3. Dipsticks: So-called alternative formative assessments are meant to be as easy and quick as checking the oil in your car, so they're sometimes referred to as dipsticks. These can be things like asking students to write a letter explaining a key idea to a friend, draw a sketch to visually represent new knowledge, or ...

  24. Resources for Assessment in Project-Based Learning

    In another blog post from Matt Weyers, find great tips on using formative assessment within the PBL process to drive student learning. Weyers explains how to use the driving question to prompt reflection and the "Need to Know" to check for understanding.

  25. A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students

    This paper examines the use of LLMs to support the grading and explanation of short-answer formative assessments in K12 science topics. While significant work has been done on programmatically scoring well-structured student assessments in math and computer science, many of these approaches produce a numerical score and stop short of providing teachers and students with explanations for the ...
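
    The snippet above does not reveal the paper's actual prompts or models, so the sketch below only illustrates the general chain-of-thought grading idea: ask the model to reason against a rubric before emitting a score and a student-facing explanation. The question, the rubric, and the call_llm stub are all hypothetical; a real system would replace the stub with a genuine LLM API call and validate the returned JSON.

```python
# Minimal sketch (not the paper's method): a chain-of-thought grading prompt for one
# short-answer science item, with a stubbed LLM call so the example runs as-is.
import json

RUBRIC = {  # hypothetical 0-2 rubric for a single K12 science question
    2: "Names evaporation AND links it to heat from the sun.",
    1: "Mentions evaporation or heating, but not the causal link.",
    0: "No relevant mechanism mentioned.",
}

def build_cot_prompt(question: str, answer: str) -> str:
    # Ask the model to reason step by step about the rubric before scoring.
    rubric_text = "\n".join(f"{score}: {desc}" for score, desc in RUBRIC.items())
    return (
        "You are grading a student's short answer.\n"
        f"Question: {question}\n"
        f"Student answer: {answer}\n"
        f"Rubric:\n{rubric_text}\n"
        "First, think step by step: restate what the rubric requires, then check which "
        "requirements the answer meets. Finally, output JSON with keys 'score' (0-2) and "
        "'explanation' (one sentence a teacher could give the student)."
    )

def call_llm(prompt: str) -> str:
    # Hypothetical stub so the sketch runs without credentials; a real system would
    # send `prompt` to an LLM and return its text response here.
    return json.dumps({"score": 1, "explanation": "You mention the water heating up, "
                       "but not that it evaporates into the air."})

def grade(question: str, answer: str) -> dict:
    raw = call_llm(build_cot_prompt(question, answer))
    return json.loads(raw)  # {'score': ..., 'explanation': ...}

if __name__ == "__main__":
    print(grade("Why do puddles disappear on a sunny day?",
                "The sun makes the water hot."))
```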

  26. More on Educational Assessment

    Formative Assessment, by Margaret Heritage: develop the knowledge and skills needed for successful formative assessment. Formative assessment is a process used by teachers and students to keep learning moving forward. In the 10 years since the first edition of Formative Assessment was published, the practice has become a mainstay in classrooms, but that does not mean that it ...

  27. Linguistically responsive formative assessment for emergent bilinguals

    This study contributes to a comprehensive understanding of what teachers should know and how they should act when performing responsive formative assessments for emergent bilinguals. It has implications for further research and for teachers who work with emergent bilingual students in core content areas.
