Learning and assessment deal with related, yet distinct concepts

Learning can be defined as the acquisition of knowledge, skills, values, beliefs, and habits through experience, study, or instruction. Assessments are instruments designed to observe a learner's behavior and produce data from which to draw inferences about the knowledge, skills, values, beliefs, and habits that the learner has.

Although learning and assessment are both key to education, the statistical models used to describe learning data and assessment data have significantly diverged and grown to leverage the salient features and distinct assumptions that are embodied in their respective data sets.

The fields of educational data mining and learning analytics harness the dynamic, temporal, and large-scale nature of learning data to construct models that can be used to predict learner performance, personalize and adapt instructional content, recommend interventions and curriculum changes, and visualize information to track progress.

On the other hand, the same objectives are targeted by the field of psychometrics, using cross-sectional assessment data rather than longitudinal data.

Specifically, we look at Bayesian Knowledge Tracing (BKT) and Item Response Theory (IRT). BKT, a statistical model from educational data mining, is the most widely used model for data obtained from intelligent tutoring systems, which are systems constructed to provide immediate and customized instruction to learners. IRT, a modeling framework developed in the field of psychometrics, was designed for constructing and analyzing assessments.
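To make the contrast concrete, here is a minimal sketch of the two models in their textbook forms. It assumes the classic BKT formulation (a binary "known" state with learn, guess, and slip parameters and no forgetting) and the one-parameter (Rasch) IRT model. The function names and parameter values are illustrative and are not taken from the paper.

```python
import math

def bkt_update(p_known, correct, learn, guess, slip):
    """One BKT step: condition on the observed response, then apply learning."""
    if correct:
        num = p_known * (1.0 - slip)
        den = num + (1.0 - p_known) * guess
    else:
        num = p_known * slip
        den = num + (1.0 - p_known) * (1.0 - guess)
    posterior = num / den
    # Classic BKT assumes no forgetting: an unknown skill becomes known
    # with probability `learn` after each practice opportunity.
    return posterior + (1.0 - posterior) * learn

def irt_1pl(theta, b):
    """Rasch (1PL) probability of a correct response for ability theta and difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# BKT tracks one learner over a sequence of responses (longitudinal data).
p = 0.2  # assumed prior probability that the skill is already known
for observed_correct in [True, True, False, True]:
    p = bkt_update(p, observed_correct, learn=0.15, guess=0.2, slip=0.1)

# IRT scores a single person-item encounter (cross-sectional data).
p_item = irt_1pl(theta=0.0, b=0.0)
```

The BKT function is called repeatedly as responses arrive, while the IRT function has no notion of time at all, which is the surface-level difference the post refers to.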

Historically, the research in BKT and IRT models has had little overlap, as on the surface these models seem to be completely different and incompatible.

However, in a recently submitted paper, Assessment meets Learning: On the Relation between Item Response Theory and Bayesian Knowledge Tracing, we show that there is an intimate connection between these two models, one that places BKT and IRT under an umbrella of general models of learning and assessment data.

Specifically, we show that the latent dichotomous variable in BKT is related (in the long run, as time goes to infinity) to a type of IRT model. In fact, by using person-specific “learning” parameters and item-specific “forgetting” parameters in BKT, we are able to recover a (4-1)PL IRT model (four minus one; a 4PL model with discrimination set to 1). See the paper for details.
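One way to see how such a result can arise is sketched below. It assumes a BKT variant with both a learning and a forgetting transition, and it assumes (purely for illustration) that the learning probability is proportional to exp(theta) for a person and the forgetting probability is proportional to exp(b) for an item. Under those assumptions the long-run probability of being in the "known" state is logistic(theta - b), and the response probability takes the form of a 4PL curve with discrimination 1, lower asymptote equal to the guess probability, and upper asymptote equal to one minus the slip probability. The parametrization in the paper may differ.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def long_run_p_known(learn, forget, steps=2000, p0=0.0):
    """Iterate the two-state Markov chain for the latent 'known' state."""
    p = p0
    for _ in range(steps):
        p = p * (1.0 - forget) + (1.0 - p) * learn
    return p

def four_pl(theta, b, lower, upper, a=1.0):
    """4PL item response function; a=1 gives the '(4-1)PL' case."""
    return lower + (upper - lower) * logistic(a * (theta - b))

# Illustrative values; `scale` only keeps both transition probabilities in (0, 1).
theta, b = 0.5, -0.3
scale = 0.1
learn = scale * math.exp(theta)   # person-specific learning (assumed form)
forget = scale * math.exp(b)      # item-specific forgetting (assumed form)
guess, slip = 0.2, 0.1

pi = long_run_p_known(learn, forget)
p_correct_bkt = guess * (1.0 - pi) + (1.0 - slip) * pi
p_correct_irt = four_pl(theta, b, lower=guess, upper=1.0 - slip)
```

The two probabilities agree because the stationary distribution of the chain is learn / (learn + forget), and the common scale factor cancels in that ratio.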

Implications

The relationship between BKT and IRT described in the paper highlights the vision that our CEO Maarten Roorda has laid out for ACT.

“This development will help better measure student success for students, teachers and administrators, districts, and states,” said Roorda. “It is groundbreaking research that we believe will help change the field of education as a whole for the better.”

Connecting a learning model to an assessment model may lead to new ways to assess student learning through the work students are already doing in the classroom. By tracking a learner’s achievement continuously through a learning system, it is possible to get an estimate of summative learning that would otherwise be obtained only through an end-of-course exam. This paper shows that, at least theoretically, this approach is valid.

“Students already do assignments, homework, quizzes, and tests all year long in their classes, and these findings suggest we can use those data to assess student learning,” said Alina von Davier, ACT senior vice president of ACTNext. “This can provide educators with benchmarks to use throughout the year, as well as at the end, that will help them measure what students have achieved academically.”

Issues with both models for learning and assessment

The core issue with both BKT and IRT is their lack of a placeholder for education in the model. Although the BKT model can estimate the rate at which learning occurs, and the IRT model can estimate the learning that has occurred, neither model has a component that represents the teaching or education the learners receive, nor one that captures how differences in teaching lead to differences in learning outcomes (IRT) or in the learning process (BKT).

The BKT model is basically a ballistic model: the learning process is closer to firing a gun, with the path almost entirely determined by the initial conditions (i.e., parameters), than it is to flying a plane, with a pilot (i.e., education) steering and changing course as needed. IRT models are inherently cross-sectional, and aim to explain observed differences in what has been learned. Such models, however, have little to offer in explaining how these observed differences came into existence, or what measures could reduce or alter them.
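The ballistic picture can be illustrated with a small sketch. It assumes classic BKT without forgetting and looks at the model's expected probability of mastery before any responses are observed. The parameter values are arbitrary. Once the initial probability and the learning rate are fixed, the whole curve follows, and nothing in the model can steer it along the way.

```python
def p_known_trajectory(p_init, learn, n_steps):
    """Expected P(known) over time in BKT without forgetting, before any data."""
    traj = [p_init]
    for _ in range(n_steps):
        traj.append(traj[-1] + (1.0 - traj[-1]) * learn)
    return traj

p_init, learn, n_steps = 0.1, 0.2, 10
traj = p_known_trajectory(p_init, learn, n_steps)

# The same curve in closed form: it depends only on the two starting parameters.
closed_form = [1.0 - (1.0 - p_init) * (1.0 - learn) ** t for t in range(n_steps + 1)]
```

Observed responses do shift the estimate for an individual learner, but there is still no term that a teacher or curriculum could act on.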

As a subset of Network Psychometrics

The connection between BKT and IRT that is described in the paper does not alleviate the issue raised in the previous section. We have bridged assessment to learning, but neither has been bridged to education yet. One could argue that no progress has been made. After all, integrating two models, neither one suitable for thinking about education, does not by itself lead to a model suitable for thinking about education.

We argue that the outlook is not so bleak. The integration of BKT and IRT is a point of departure for a new research agenda for the learning sciences and psychometrics together; an agenda aimed at factoring the role of education into the learning equation.

Both the outcome of learning and the process of learning crucially depend on education. Let’s start off with a question: Why is it that no educational system starts off in primary education by teaching long division, and then slowly works towards counting?

Every system starts off with counting, followed by addition, and slowly works its way to long division. Counting is a clear prerequisite for addition. That is, even though both learning processes may be adequately described by a BKT-IRT model, the learning processes are not independent. Leveraging such inherent dependencies is what education is all about.

Network psychometrics is a new conceptual model that formalizes such (mutual) dependencies. The primary innovation in this new conception is the construction of a graphical network in which the nodes are observable features, and the edges are causal relations through which connected nodes reinforce one another.

In the paper, we show how the BKT-IRT framework fits into the network psychometric approach. The benefit is clear: the network psychometrics approach has a specific placeholder for education, namely, the graphical structure of the network, that is, the specification of which nodes are connected to each other by edges. This structure can be specified and allowed to change over time to take into account the impact of education.
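As a toy illustration of a network structure that changes over time, the sketch below stores one edge set per time point. It is deliberately simplified: the nodes are hypothetical skill labels rather than the observable features used in network psychometrics, the edges are unweighted, and adding an edge merely stands in for the effect of instruction. None of these names or values come from the paper.

```python
# Hypothetical nodes; real network models would use observed item responses.
nodes = ["counting", "addition", "long_division"]

# Assumed edge sets at two time points. Each edge is an unordered pair of nodes.
edges_by_time = {
    0: {frozenset({"counting", "addition"})},
    1: {
        frozenset({"counting", "addition"}),
        frozenset({"addition", "long_division"}),  # edge added after instruction
    },
}

def neighbors(node, edges):
    """Return the nodes directly connected to `node` in the given edge set."""
    return sorted(other for edge in edges if node in edge
                  for other in edge if other != node)

before = neighbors("addition", edges_by_time[0])
after = neighbors("addition", edges_by_time[1])
```

The point of the design is that education enters through the edge sets, which can be specified and revised, rather than through the person or item parameters.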

“ACT is on the cutting edge of learning and measurement, as we work to transform the field of educational assessment and find new ways to help students learn the skills they need to succeed in college and career,” said Roorda. “This research is a stepping stone to broader and more important connections between learning, assessment, and education. We’re not there yet, but we are making incredible strides in that direction.”