For a full overview of 2019 ETCPS, please visit the event site.

Oct. 9, 2019 Poster Presenters and Tech Demonstrations

Karen Barton, Edmentum
When Ed Tech Focuses on Educators First: Impacts on Design and Development
Validity is the trustworthiness and usefulness of the information. Building a validity argument is about telling an evidentiary story. Attending to evidence intentionally throughout the design, development, and delivery of products, including assessments, is both an opportunity and a duty for educational technology developers. This presentation lays out a comprehensive development process based on Edmentum’s Educators First approach. The process starts with the development of a theory of action and a research foundation around the purpose, outcomes, and intended uses of the technology. Educators are involved very early in the development process. By building internal and educator literacy and a shared understanding of the key decisions educators are trying to make, as well as the information required for those decisions, design and development take on a validity focus rather than a feature focus. The presentation provides specific examples of building assessment literacy, connecting research-based evidence of implementation and assessment to classroom practice, and directly involving educators in the early stages of development.

Timo Bechger, ACT
Dexter: open source software for serious psychometrics
Dexter is an R package intended as a robust and fairly comprehensive system for managing and analyzing test data organized in booklets. It includes facilities for importing and managing test data, assessing and improving data quality through basic test-and-item analysis, fitting an IRT model, and computing various estimates of ability. Dexter comes with two companion packages: a Shiny application for users not familiar with R, and a package dedicated to the calibration of multistage testing data.
Dexter differs from other psychometric software both in the tools it includes and in those it omits. The data management component is well developed to promote data integrity. Many psychometric methods not found elsewhere are provided, such as Haberman’s (2007) interaction model generalized for polytomous items, new methods for exploratory and confirmatory DIF analysis, support for the 3DC method of standard setting, and more. On the other hand, there is no support for multivariate IRT models, the 2PL and 3PL models, or other methods already well represented in other packages. The central IRT model is a polytomous generalization of the extended marginal Rasch model, which we call the Extended Nominal Response Model (ENORM).

Danielle Benesh, Michala D Cox, Yeajin Ham, Alfonso J Martinez and Alexis Oakley, University of Iowa
Building blocks for collaborative problem solving: Minecraft as a learning tool

Brad Bolender, ACT
Finding Stimulus Passages in Large Digital Corpora: The POE Passage-Finding Application
Developing stimulus passages for English language tests requires a considerable investment of resources. To assist with this process, the Passage Organizer/Extractor (POE) application prototype was developed. Test developers can upload a full book-length text, and POE will segment the text into potential excerpted passages that meet criteria such as length, reading level, and cohesion. Experiments with the POE prototype helped content specialists find five stimulus passages for the ACT English test that were further developed into full test units. Next steps include expanding these efforts to ACT Aspire.

Alex Casillas, ACT
Implementing Career and College Clubs Curriculum in a Cohort of Underserved Students
During the 2018-19 academic year, Alliance College-Ready Public Schools began implementing NCCEP’s Career & College Clubs (CCC) curriculum across seven schools focused on a cohort of 9th graders (n=977) who are participating in GEAR UP. Students’ SEL and other skills were measured before and after the implementation of curriculum lessons. Results show that students’ skills developed substantially across several important SEL areas, such as academic discipline, collaboration, and managing feelings.

Pravin Chopade, David Edwards, Saad M. Khan, Alejandro Andrade, Scott Pu and Bryan Maddox, ACT
Crisis in Space – A Multiplayer Game to Measure Collaborative Problem Solving Skills
Do you like to play a fun game with your friends? Do you want to travel in space, or support others who are doing so? The journey is difficult; few succeed, many fail, but everyone learns something. Can you work with your partners at Ground Control to navigate the challenge? Can you support the astronauts and help them avoid catastrophe? The only way to know is to try! In this project, we use a non-invasive, in vivo approach to measure collaborative problem solving (CPS) and social-emotional skills using AI and machine learning (ML). To elicit CPS, we use a space exploration video game called “Crisis in Space” (CIS) from LRNG (previously Glass Lab Inc.), which involves a two-player jigsaw task composed of a series of puzzles. Our research involves the development of an ML framework for the analysis of multimodal audio, video, and eye-tracking data obtained from a pilot study, and we use this computational framework for evidence extraction and accumulation in analyzing those data.

Geoffrey Converse, University of Iowa
Autoencoders for Educational Assessment
In educational assessment research, a common goal is to determine students’ knowledge of some construct. This knowledge is latent and can be represented by continuous or discrete variables that influence the individual’s performance on a test. Item response theory (IRT) models and cognitive diagnostic models (CDMs) structure this relation, defining specific functions between the knowledge of the individual and the probability of answering an item correctly. Previous research suggests that neural networks can emulate these models and, with a modification to their architecture, overcome some of the limitations associated with “big data” analysis. In this work, we compare two types of neural networks for this application: autoencoders (AEs) and variational autoencoders (VAEs). Not only can these neural networks be used as comparable predictive models, but they can also recover and interpret parameters in the same way as the IRT and CDM approaches. Our results show that neural networks are a valid approach to this problem and explore the advantages that a VAE brings over a regular AE in the educational context. In both cases, the interpretability of the neural network parameters is presented. This work helps overcome problems in both fields: lack of flexibility in IRT and lack of transparency in neural networks.
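As a rough sketch of this idea (not the authors' implementation), the example below builds a one-dimensional autoencoder in PyTorch whose decoder computes sigmoid(a·θ + b) for each item, so the decoder weights and biases can be read as 2PL-style discrimination and easiness parameters. The layer sizes, training settings, and simulated data are all assumptions made for illustration.

```python
# Minimal sketch (not the authors' code): an autoencoder whose decoder is an
# IRT-like layer, so its weights/biases can be read as item parameters.
import torch
import torch.nn as nn

n_items, n_latent = 20, 1          # assumed sizes for illustration

class IRTAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(n_items, n_latent)   # responses -> latent ability estimate
        self.decoder = nn.Linear(n_latent, n_items)   # ability -> per-item logits (a*theta + b)

    def forward(self, x):
        theta = self.encoder(x)
        return torch.sigmoid(self.decoder(theta))     # predicted P(correct) per item

# Simulated binary response data, used only to make the example runnable.
torch.manual_seed(0)
true_theta = torch.randn(500, 1)
a, b = torch.rand(n_items) + 0.5, torch.randn(n_items)
responses = torch.bernoulli(torch.sigmoid(true_theta * a + b))

model = IRTAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.BCELoss()

for _ in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(responses), responses)
    loss.backward()
    optimizer.step()

# Decoder weights ~ discriminations, biases ~ easiness (up to scale/sign indeterminacy).
print(model.decoder.weight.squeeze()[:5], model.decoder.bias[:5])
```

The key design choice in such a sketch is keeping the decoder a single IRT-like layer: extra hidden layers would improve fit but would give up the parameter interpretability the abstract emphasizes.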

Kyle Diederich and Juan Pablo Hourcade, University of Iowa
Play-Based Design: Face-to-Face Interaction for Young Children’s Voices in the Design Process
There has been a dramatic growth in interactive technology use by children under the age of 5 during the past decade. Despite this growth, children under the age of 5 typically participate only as users or testers in the design process in the overwhelming majority of projects targeting this population presented in key child-computer interaction venues. In this paper we introduce play-based design, an age-appropriate design method to give 3-4-year-old children a voice in the design process. More specifically, we contribute a thorough analysis of the use of existing methods to design technologies for children under the age of 5, a summary of the process that resulted in the development of play-based design, a detailed description of play-based design, a qualitative analysis of our experience implementing play-based design with two groups of children, and a discussion of play-based design’s place among other methods, its advantages, and limitations.

Aaron Horn, NewBoCo, Delta V Coding
DeltaV Code School: A New Approach To Iowa’s Tech Talent Skills Gap
DeltaV Code School helps adult career changers learn coding and modern web development, moving from novice to entry level fullstack developer in 20 weeks.

Takao Ichiko
Cooperative Aspects of Learning with an Assessment Concept Scheme in Distance Learning
Placing people in the same room, seating them together, and telling them they are a group does not mean they will cooperate effectively, even in an up-to-date networked environment. In distance learning, one of the most important issues is learning quality assurance.
In this research, learners are provided with cooperative aspects of learning through intentional communication based on human primary cognition, as well as the integration of knowledge and intelligence, building on features and functionalities that have been verified for real-time teaching and learning. Cooperative learning of this kind enables high-quality communication that is consistent with conventional educational environments. Unlike previous approaches, it also extends individual abilities toward more advanced comprehension: learners share thoughts with one another, and deepening both algorithmic and non-algorithmic aspects of learning can lead to more comprehensive, even creative, learning, with bidirectional reinforcement between learners and teaching staff, or among learners, without disturbing the class context. It is important for learners to make the best use of an assessment concept scheme through intentional communication with dynamically conducted reinforcement, as a learner-based driving force toward more advanced comprehension, both between learners and within each learner.
At the same time, it is not easy to assess qualitative and quantitative views in distance learning. Therefore, this research proposes rubrics that integrate critical thinking and creative thinking for concept mapping-based assessment of advanced comprehension in distance learning, with a mobile focus. Subjects that may help readers visualize learners’ advanced comprehension in distance learning, and extensions leading to learning quality, are introduced. The aim is to suggest how to successfully integrate vivid human knowledge and intelligence with minimal confusion or disturbance.
Increasingly, forms of communication that can capture both an educational core leading scheme and an integrated rubric scheme are being deliberated in distance learning for more advanced comprehension, with a scope ranging from regional to interdisciplinary worth, e.g., from STEM (science, technology, engineering, and mathematics) to STEAM by integrating the arts. It is thus feasible to introduce cooperative aspects of learning into concept mapping-based assessment for a more objective assessment of learning quality through intentional, real-time communication. Cooperative aspects of learning should not be overlooked in comparison with algorithmic aspects of learning.
It is expected that the forms and roles of distance education and learning will rapidly move beyond current conventional methods toward more innovative approaches that enable more extensive options in educational and learning processes, including life-long educational models, which are needed to widely empower individual learners.

Feng Ji, Benjamin Deonovic, Gunter Maris, ACT
When is Deep Learning Overkill?
Deep learning models are often viewed as “black boxes” and, although they are excellent tools for prediction, are criticized for not providing researchers with interpretable parameters. This study elucidates what a deep learning model such as a Boltzmann machine has learned. An approximation to the marginal distribution of a neural network is provided and used to determine which kinds of data sets are well suited to investigation by deep learning and which are not.

Feng Ji, Benjamin Deonovic, Jimmy de la Torre and Gunter Maris, ACT
Variational Inference for Cognitive Diagnosis Models
This study proposes the use of variational inference, a fast Bayesian inference alternative to Markov Chain Monte Carlo, for cognitive diagnosis models. Results show that the proposed method is comparable to Expectation-Maximization when the number of attributes ($K$) is moderate, and remains computationally feasible when $K$ is large.
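For readers unfamiliar with the technique, variational inference replaces MCMC sampling with optimization of a lower bound on the marginal likelihood (the ELBO). Written generically for a latent attribute profile $\alpha$ (the notation here is illustrative rather than the authors' own):

$$\log p(X) \;\ge\; \mathbb{E}_{q(\alpha)}\!\left[\log p(X \mid \alpha)\right] \;-\; \mathrm{KL}\!\left(q(\alpha)\,\|\,p(\alpha)\right),$$

where $q(\alpha)$ is a tractable approximation to the posterior over attribute profiles. Maximizing this bound with respect to $q$ and the item parameters is typically far faster than sampling, which is what keeps estimation feasible when $K$ is large.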

Geoff LaFlair, Duolingo
Duolingo English Test
The Duolingo English Test is an innovative English proficiency assessment for today’s international students and institutions. Available online and on demand, the test lets students certify their English from their own computers. In addition to their proficiency score, students can showcase themselves through a video interview and a writing sample. Results are available within 48 hours and can be shared with an unlimited number of institutions.

Dongmei Li, ACT
Constant CSEM Achieved through Scale Transformation and Adaptive Testing
Conditional standard error of measurement (CSEM) provides important information for score interpretation. Having a constant CSEM across all score levels not only simplifies score reporting and interpretation but also contributes to fairness. There are two fundamentally different approaches to achieving constant CSEM in operational testing programs. One is scale transformation, using methodologies described by Kolen (1988), Li, Woodruff, Thompson, and Wang (2014), or Moses and Tim (2017); the other is computer adaptive testing (CAT) with a fixed-precision stopping rule (Wainer, 2000). However, there is little discussion in the literature of the similarities and differences between these two approaches, or of which approach is more appropriate. The purpose of this study is to illustrate the differences and similarities of the two approaches and to explore the best solutions for transitioning from linear tests with constant CSEM to CAT. Specifically, the study investigates the following research questions: 1. What are the differences and similarities between tests whose CSEM is made constant through scale transformation and those made constant through fixed-precision CAT? 2. Is it possible to scale a CAT to have constant CSEM through scale transformation? 3. When transitioning from linear testing to CAT, if the linear forms have been scaled to have constant CSEM, how can the CAT produce scores that are interchangeable with the linear forms while maintaining the same constant-CSEM property? Using empirical examples, this study demonstrates that although the two approaches are fundamentally different, they are equally defensible and not mutually exclusive; it is thus possible to apply both in order to obtain a score scale with constant CSEM. The study also shows that when a linear test is already scaled to have constant CSEM, a fixed-precision CAT may not be the best choice when transitioning from linear testing to CAT if the goal is to maintain the equal-CSEM property of the score scale. Results from this study can inform scaling decisions in both linear testing and CAT.
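As a reminder of the idea behind the scale-transformation route (a standard textbook fact, not a description of this study's specific method): for a number-correct score $X \sim \mathrm{Binomial}(n, p)$, the variance-stabilizing arcsine transformation

$$g(X) \;=\; 2\arcsin\!\sqrt{X/n}$$

has approximately constant variance $1/n$ regardless of $p$, so a score scale built on such a transformation has roughly equal CSEM across the score range. A fixed-precision CAT reaches the same goal differently, by continuing to administer items until the standard error of each examinee's ability estimate falls below a preset target.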

Haiying Li, ACT
Group Language Use and Performance in Simulated Synchronous Collaboration
The present study examined whether the language used by groups is correlated with group performance and which language patterns enable better group performance during collaborative problem solving (CPS). A total of 956 participants were randomly paired into 478 dyadic groups. The two participants in each group collaborated synchronously, interacting through text chat with two virtual agents to complete a set of science inquiry practices. These interactions were used to measure CPS competency, including sharing ideas, negotiating ideas, regulating problem-solving activities, and maintaining communication. The language used during group interaction was measured with 18 language features at multiple textual levels. Results indicated significant correlations between group performance and language features during interaction. These findings confirmed that the use of more formal, academic language in inquiry correlated with higher performance on inquiry practices and demonstrated the unique characteristics of group interaction in CPS. Implications for the design and assessment of CPS are discussed.

Ian MacMillan, ACT
AIGL – Automatic Item Generation for WorkKeys Graphic Literacy
AIGL is a proof-of-concept program that automatically generates items and graphics for WorkKeys Graphic Literacy. Graphic Literacy tests an examinee’s ability to identify, compare, and draw inferences from data represented as a line or bar graph, pie chart, table, or flowchart. Graphic Literacy items are unique in that they (1) are created in multiple-item sets and (2) are based on a graphic stimulus. AIGL automatically generates both the items and the graphics from the same CSV file. Coded in Python, AIGL uses a combinatorial approach to populate item stem templates with content and uses the Matplotlib library to generate the associated graphic. For this proof of concept, items and graphics were generated for the “find information” skill, wherein the examinee must locate a specified data point or region of a simple graphic. Future work would include modeling more difficult item types and generating more complex graphics.
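The sketch below illustrates the general template-plus-Matplotlib approach described above; it is not the AIGL code itself, and the CSV column names, stem templates, and file names are assumptions made for the example.

```python
# Illustrative sketch only -- not the actual AIGL code. Assumes a hypothetical
# CSV with columns: category,value (e.g., "Quarter 1,340").
import csv
import itertools
import matplotlib.pyplot as plt

STEM_TEMPLATES = [
    "According to the graph, what is the value for {category}?",
    "Which category has a value of {value}?",
]

def load_data(path):
    with open(path, newline="") as f:
        return [(row["category"], float(row["value"])) for row in csv.DictReader(f)]

def generate_items(data):
    # Combinatorial pairing of every template with every data point ("find information" items).
    return [template.format(category=category, value=value)
            for template, (category, value) in itertools.product(STEM_TEMPLATES, data)]

def generate_graphic(data, path="stimulus.png"):
    categories, values = zip(*data)
    plt.bar(categories, values)          # bar-graph stimulus built from the same CSV
    plt.ylabel("Value")
    plt.savefig(path)
    plt.close()

if __name__ == "__main__":
    data = load_data("graphic_data.csv")
    generate_graphic(data)
    for item in generate_items(data):
        print(item)
```

Running the script prints the generated item stems and writes the associated bar-graph stimulus to stimulus.png.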

David Menendez, Lu Ou, Michael Yudelson and Vanessa Simmering, ACT
Inter-relatedness of pre-algebraic knowledge among middle school children
Recent accounts of the development of mathematical knowledge argue that conceptual and procedural knowledge influence one another in an iterative fashion (Rittle-Johnson, 2017). There is also recent evidence that, within conceptual knowledge, mastery of one topic (such as mathematical equivalence or fractions) is related to performance on other topics (such as algebra; Booth & Newton, 2012; McNeil, Hornburg, Devlin, Carrazza, & McKeever, 2019). We are currently examining data from 1,146 middle school children using Bridge to Algebra (an intelligent tutoring system; Stamper et al., 2010). We explore whether students’ accuracy on one concept is related to their accuracy on a different concept. In a preliminary analysis, we found that students’ accuracy on fraction operation problems was correlated with their accuracy on solving one-step equations (r = .42) and linear inequalities (r = .44). Accuracy on one-step equations was also related to accuracy on linear inequalities (r = .50). We will explore whether this relationship is present when students start learning about these concepts or whether it emerges over the course of learning. We hope that our results will enrich our understanding of the interconnectedness and co-development of mathematical knowledge.

Mohi Mittal, HCL
HCL Technologies
HCL Technologies (HCL) empowers global enterprises with technology for the next decade today. HCL’s Mode 1-2-3 strategy, through its deep-domain industry expertise, customer-centricity, and entrepreneurial culture of ideapreneurship™, enables businesses to transform into next-gen enterprises.

Julia Panter and Elana Broch, Princeton Univ.
Notes from the trenches: A mother-daughter team explores what might be done to ensure that students with learning differences are not further disadvantaged by e-learning
Defying traditional wisdom about Mother’s Level of Education and (lack of) Adversity Scores, my daughter’s GRE Quant score was (how do I say this tactfully?) greatly over-predicted by my years of education.
Some mothers burst with pride when their daughter rushes their sorority. I am so proud when my daughter writes phony journal submissions (see Figure 1).
All kidding aside, having a mother with a PhD in Psychometrics has always been a challenge for my daughter (and co-author), who was diagnosed with dyscalculia before high school. This summer, after her acceptance into a Master’s degree program, she ventured into e-learning for her first “Intro to Statistics” course, knowing that mom could help her. Having taught many a reluctant learner (psychology majors would rather discuss their problems than learn about research methods and statistics), I thought I had what it takes to teach her to tackle area-under-the-curve problems, regression, and hypothesis testing. Not only was the course completely online, but the summer session also compressed 12 weeks into six.
Meanwhile, my son (F in Intermediate Algebra his first semester in college, D- in College Algebra at his next school) was taking College Algebra at a local community college. He was also taking his class online. Fortunately, my grad school path to Psychometrics (in that Gopherland to the north of ACT) involved a lot of teaching of math and statistics to mathematically underprepared college students. Why am I spending my summer vacation writing this abstract? I think my (adult) children mastered a lot of quantitative material this summer. Because I supervised and tutored many hours of this e-learning experience, my co-author/daughter and I hope to provide some feedback to testing companies and administrators on the pluses and minuses of e-learning. My co-author/daughter also hopes to share her perspective on the experience. We would like to address the following in a poster or brief session: With the goal of Equity in Learning in mind, what are the implications of e-learning for students who are high functioning enough to attend college and graduate school yet struggle with their learning differences? What testing accommodations make sense in the e-learning context? Under what circumstances is it better to use an off-the-shelf e-learning package rather than an instructor-led one?
Figure 1. A student is pursuing a degree lower in status than the highest degree held by one parent but aspires to one day rival the parent in level of power/education. An example found in coursework completed toward that degree makes reference to the relationship between parent/child academic achievement. The ISES measures the degree to which the child’s aspirations are damaging. ISES = M − (J + C) × I / Y, where M = mother’s age at time of highest degree conferral, J = age of student, C = degrees already conferred, I = student’s interest in current course, and Y = years of graduate-level work completed by the parent in a field of study directly related to the course in question.

Seyedahmad Rahimi and Valerie Shute, Florida State Univ.
The Architecture of Physics Playground—A Learning Game with Stealth Assessment & Adaptive Content
Learning and engagement theories—such as the zone of proximal development (Vygotsky, 1978) and flow (Csikszentmihalyi, 1990)—suggest that challenges in a learning environment should match students’ ability. With advances in technology, as well as in the learning and assessment sciences (Shute, Leighton, Jang, & Chu, 2016), we can develop learning environments that can accurately measure and support students’ knowledge, skills, and other attributes. Such learning environments can use real-time estimates of students’ competency levels to adapt their challenges to students’ ability levels. The purpose of this proposal is to describe the various components of an adaptive environment in the context of a game called Physics Playground (PP; Shute & Ventura, 2013).

Emily Starr, StarrMatica Learning Systems
Dump the Slump – Customizable Reading Instruction with StarrMatica Texts
Do you know about the fourth-grade slump? In fourth grade, students’ reading comprehension scores dip, and researchers attribute the dip to difficulty comprehending informational texts. Come learn how StarrMatica is helping teachers dump the slump with innovative software that revolutionizes reading instruction. With StarrMatica Texts, teachers can customize informational texts so all students can read the same content at their independent reading levels, and teachers can focus their instruction on the comprehension skills their students need most.

Yigal Rosen, Kristin Stoeffler, and Laurel Ozersky, ACT
EDU2050: Design and Development of Innovative Learning and Assessment Solutions
Higher-order skills such as creativity, critical thinking, scientific inquiry, and computational thinking transform lives and drive economies. However, learning and assessing these skills with traditional methods is a challenging task. Recent advancements in technology, learning science, cognitive psychology, and educational assessment enable the development of innovative learning and assessment solutions for these higher-order skills. This presentation will highlight current constructs, concepts, and techniques that facilitate the effective design and development of technology-enhanced assessments for higher-order skills at scale. We will also share prototypes being used to explore this space as part of our EDU2050 initiative.

Binu Thayamkery and Sabari Raja, Nepris
Nepris – Connecting Industry to Classrooms!
Nepris connects educators and learners with a network of industry professionals, virtually, bringing real-world relevance and career exposure to all students. Nepris also provides a skills-based volunteering platform for organizations to extend education outreach, and build their brand among the future workforce.

Jay Thomas, ACT
Using Machine Learning to Assess 3-Dimensional Science Learning at Scale in Classrooms
To adequately assess three-dimensional science learning, as defined by standards based on the NRC Framework and NGSS, constructed-response items or composite items that combine forced-choice with constructed-response portions are necessary to measure some of the key science and engineering practices, such as constructing scientific arguments and communicating scientific ideas. While forced-choice items do provide evidence of some of these skills and practices, they cannot adequately assess all of the knowledge, skills, and abilities (KSAs) within the targeted practices (Pellegrino et al., 2014).
However, human scoring of constructed-response or composite items can be time consuming and costly (Williamson, Bejar, & Mislevy, 2006; Mao et al., 2018). “In short, from a pragmatic point of view a key goal in any assessment is to maximize construct representation at the least cost. Human scoring might be the way to achieve that goal under some circumstances, but not in others” (Bejar, Williamson, & Mislevy, 2006, p. 53). Given the desire of stakeholders, such as states, to keep costs down for assessments of NGSS (Gorin & Mislevy, 2013; Toch, 2006), a more cost-effective solution is necessary. Pellegrino et al. (2014) suggest that technology may supply answers to some of these problems through the use of simulations, technology-enhanced items, and other emerging technologies. The most likely emerging technology for scoring constructed-response questions is using machine learning to classify students’ responses along a learning progression (Gambrell, Thomas, Meisner, & Bolender, 2016; Thomas, 2017; Thomas, Kim, & Draney, 2018).
This poster presents work using machine learning to holistically score three-dimensional classroom assessment items based on a learning progression. Over the last few years, we have been able to score 1,800,000 student responses.
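As a generic illustration of the kind of pipeline involved (not ACT's scoring engine), the sketch below trains a simple text classifier to place constructed responses at learning-progression levels; the toy responses and level labels are invented for the example.

```python
# Generic illustration of scoring constructed responses along a learning
# progression with supervised text classification; not the system described
# in the poster. Labels 0-3 stand for progression levels (toy data).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

responses = [
    "The ball falls because gravity pulls it toward Earth.",
    "It just falls down.",
    "Earth's gravity exerts a downward force, so the ball accelerates toward the ground.",
    "Because it is heavy.",
]
levels = [2, 0, 3, 1]  # human-assigned learning-progression levels for training

# Vectorize responses, then fit a multiclass classifier over progression levels.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(responses, levels)

print(model.predict(["Gravity is a force that pulls the ball down."]))
```

In practice such a model would be trained on large sets of human-scored responses and evaluated for agreement with human raters before operational use.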

Josine Verhagen and Dylan Arena, Kidadaptive
Psychometric challenges in designing game-based digital assessments for preschoolers to complement an observational assessment system.
We will present the first of a series of game-based digital assessments for early literacy and math designed to complement an observational, curriculum-based assessment system widely used in preschools. The assessment targets phonemic awareness by asking children to choose ingredients with the right sounds to help a bear cook. We will describe several challenges we faced while developing this digital assessment and how we addressed them: (1) converting guidelines for observational ratings by teachers into items to present in a game-based environment; (2) constructing the right tasks and the right number of items so that every learner has a decent adaptive experience while keeping the game realistic and engaging; (3) choosing the reporting scale and item parameters to align closely with the experience of teachers in preschools in the absence of a calibration process ahead of launch; and (4) incorporating information from teacher ratings and the age/grade level of the learner as prior information for the game-based assessment. Once we have multiple digital assessments built, we plan to use Bayesian multidimensional models to combine information from assessments of related skills to inform the prior for an assessment of a new skill. We also plan to investigate how combining process data across game-based assessments in literacy and math can tell us something about cognitive skills such as approaches to learning.

Yibo Wang and Mingjia Ma, University of Iowa
A Shiny app for estimating equating sample size
Equating error expressions help to compare the precision of equating designs and equating methods; they also help to estimate the sample size required to achieve a certain level of precision (Kolen & Brennan, 2014). This Shiny app is dedicated to visualizing and synthesizing such comparisons under different conditions, making it more convenient to evaluate equating results and make decisions.

Emily Whittiers, HLT
Empowering Students Through Lifelong Learning
Imagine a digital experience that transforms the way students in healthcare think about lifelong learning. Picture a supportive, captivating service that strengthens the quality of their work — from instant answers to common questions to engaging exercises that offer a safe place to learn new care methods. At Higher Learning Technologies, we’ve developed an experience that will help healthcare professionals adopt a lifestyle of learning — a way of practice that leads to improved patient outcomes. Our digital learning experience demands the highest caliber of educational content, reliable reference materials, and the support of associations, schools, hospitals, publishers, and institutions. So, if you’re interested in empowering students through lifelong learning, check out this technology demo of Higher Learning Technologies’ platform, which has drawn in over 10 million learners.

Julia Winter, Alchem.ie
Augmented Reality apps for Spatial Reasoning
Through the use of mobile and immersive augmented reality, we are exploring methods to build three-dimensional reasoning skills. This demo will use software built for iPads and for the Magic Leap One headset.

Julia Winter, Sarah Wegwerth and Gianna Manchester, Alchem.ie
Measuring student competency and identifying misconceptions with the Mechanisms app and platform
In organic chemistry, mechanisms show how electrons move during a reaction to transform starting materials into products. On paper, this is shown using curved arrows. Because of the vast number of possible answers, grading mechanism-type questions on paper is tedious. Alchemie’s Mechanisms app is changing this. In the app, users drag electrons with a finger to complete mechanisms. The app records each move, allowing answers to be assessed quickly within the app. Data can be analyzed for each individual user or across multiple users. Through case studies, we will present how the data collected by the app can be used to measure competency as well as to identify misconceptions. We also propose ways that the data can be used to personalize the learning experience.

Guanlan Xu and Walter Vispoel, University of Iowa
Using Data Augmentation to Improve IRT-3PL Calibrations with Small Samples
We investigated the effectiveness of DupER (Duplicate, Erase, and Replace) data augmentation procedures (Foley, 2010) in calibrating the 3PL Item Response Theory (IRT) model using small datasets in an operational setting. Both Markov chain Monte Carlo (MCMC) and Expectation–Maximization (EM) algorithms were used in the imputation step. We compared results of 1PL and 3PL IRT calibrations on original small datasets and 3PL calibrations on DupER augmented datasets and found that DupER procedures were most effective with medium-sized samples (n = 600) in which imputed datasets adequately reflected the score distribution within the target population.
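A schematic sketch of the duplicate-erase-replace idea is shown below. It is not the procedure studied here (which imputes via MCMC and EM); the stand-in imputation, erase rate, and data sizes are assumptions made purely for illustration.

```python
# Schematic sketch of the general "Duplicate, Erase, and Replace" idea for
# 0/1 response data. The imputation step here is a simple stand-in (Bernoulli
# draws from item proportion-correct values), not the MCMC/EM imputation used
# in the study.
import numpy as np

rng = np.random.default_rng(0)

def duper_augment(X, erase_rate=0.3, n_copies=1):
    """X: persons-by-items matrix of 0/1 responses; returns X stacked with augmented copies."""
    p_values = X.mean(axis=0)                      # item proportion-correct
    copies = []
    for _ in range(n_copies):
        dup = X.copy()                             # Duplicate
        mask = rng.random(dup.shape) < erase_rate  # Erase a random subset of responses
        dup[mask] = rng.binomial(1, np.broadcast_to(p_values, dup.shape))[mask]  # Replace (impute)
        copies.append(dup)
    return np.vstack([X] + copies)

X_small = rng.binomial(1, 0.6, size=(300, 40))     # toy small-sample dataset
X_aug = duper_augment(X_small, erase_rate=0.3, n_copies=1)
print(X_aug.shape)   # (600, 40): original plus one augmented copy
```

The augmented matrix would then be passed to the 3PL calibration in place of the original small dataset.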

Erin Yao, ACT
Automated Scoring Engine Model Training & Operation at Scale
The Constructed Response Automated Scoring Engine (CRASE) is an application that provides automated scoring for essays and constructed-response items. In this demo, we will give attendees a better understanding of how automated scoring systems, an emerging technology, (a) are trained and evaluated for scoring accuracy; (b) analyze and distinguish valid from invalid student responses; and (c) are implemented to provide scoring at high scale and low cost. Attendees will be able to use this information to consider uses of automated scoring in their testing services and in discussions with other stakeholders.
The objectives of this Technology Demo include:
1. Attendees will learn how a scoring model is created;
2. Attendees will see a scoring model trained, evaluated and deployed in real time;
3. Attendees will learn how automated scoring systems ensure integrity and validity of scores;
4. Attendees will have an opportunity to submit responses to the CRASE API to witness their responses being scored in real time.
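As a small, generic illustration of the evaluation step in objective (a), and not of the CRASE system itself, the sketch below compares machine scores with human scores using quadratic weighted kappa, a common agreement metric for automated scoring; the score vectors are invented toy data.

```python
# Generic evaluation sketch (not CRASE): agreement between human and engine
# scores measured with quadratic weighted kappa. Scores are toy data.
from sklearn.metrics import cohen_kappa_score

human_scores  = [0, 1, 2, 3, 2, 1, 0, 3, 2, 1]
engine_scores = [0, 1, 2, 2, 2, 1, 1, 3, 2, 1]

qwk = cohen_kappa_score(human_scores, engine_scores, weights="quadratic")
print(f"Quadratic weighted kappa: {qwk:.3f}")
```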

Megan Zalzala and Jens Zalzala, Shaking Earth Digital
Immersive Development Reality
In today’s work environment, the US has a surplus of open computer and tech jobs, yet not enough qualified people to fill them. According to a study from the National Center for Education Statistics, in 2015 there were 527,169 open computer jobs in the US. During that same year, there were only 59,581 computer science graduates. While many reasons exist for this shortage, one major reason is the limited amount of computer science education and engagement in middle and high schools.
Virtual Reality (VR) is a way to reach and engage students. Immersive Development Reality (IDR) allows students to be active creators in VR, instead of only observing the virtual world. While in VR, students are immersed in a friendly environment, free from distractions.
IDR is a fun and engaging way for students to learn the concepts of coding. While in VR, they place objects representing coding operations and functions into the virtual space around them and make connections to ‘write’ their code. The user pushes play, and can instantly see their app or game in action.
IDR is educational software that teaches students the concepts of coding while in VR. They grab, move, and place objects in the virtual environment. Then, they make connections via wires from the objects to different functions. Throughout the process, the student can press play at any time to see the code in action. Not only are students learning how to code, they also learn math and science concepts with IDR.


Thank you to all the presenters!