Data and Learning: Penn GSE’s Dr. Ryan Baker Leads the Penn Center for Learning Analytics

October 13, 2017

interview by Juliana Rosati

When middle school students struggle with math homework, will this affect their college prospects years later? How can massive open online courses (MOOCs) improve their completion rates? Penn GSE Associate Professor Ryan Baker aims to answer questions like these as director of the Penn Center for Learning Analytics (PCLA), working with a team to find factors in online and classroom learning that predict and promote success. Having joined Penn GSE in 2016 as a transplant from Teachers College, Columbia University, Dr. Baker established PCLA with generous support from the Office of the Provost at Penn, as well as federal funding. We sat down with him to learn more about how PCLA is working to improve learning and how he believes data can transform the education landscape.

How would you describe PCLA’s mission?

We use data to study how learning unfolds. We are trying to identify aspects of a student’s experience that predict long-term outcomes, so that these can be addressed early. For example, one of our projects has shown that if students exhibit boredom, confusion, careless behaviors, or other forms of disengagement during middle school mathematics, this affects not only their performance on standardized examinations at the end of the year, but also their college prospects several years later—whether they will go to college, whether it will be a selective college, and what they will major in when they get there. Findings like these can be used to develop more effective curricula and teaching approaches.

What kinds of learning environments do you study, and how?

We look at a variety of formats, including state-of-the-art online learning like MOOCs, traditional classroom instruction, and blended learning, where students use software in a classroom. It happens to be easier to collect data about online and blended contexts, but we are interested in learning across the board. We use a range of methods, from large-scale approaches like data mining to small-scale observational techniques.

Dr. Ryan Baker. Photos by Ginger Fox Photography

Data mining means developing computer algorithms to find hidden patterns in large quantities of data. What does this make possible?

With traditional research methods, typically you can only ask one question at a time. With data mining, if you have enough data, you can ask ten thousand questions at a time. We have a large portfolio of data-mining projects. For one of them, Ph.D. student Juan Miguel Andres is analyzing over a hundred data sets about the MOOCs Penn offers for free to learners around the world through the Coursera and edX platforms. He is using the data to test fifteen previously published findings about MOOCs. Too often, studies about MOOCs take a narrow approach, drawing conclusions from the data on just one course. We want to see if the existing findings about things like the completion rates and benefits of MOOCs can really be generalized across multiple courses in different subjects for different populations, and data mining allows us to do that.

What are some other highlights of your data-mining projects?

Dr. Jaclyn Ocumpaugh, PCLA’s associate director, is working to identify in real time which students in a classroom are engaging in behaviors that impact their learning. This could save field researchers time and money by giving them a better sense of where to focus when they observe a classroom. Ph.D. student Stefan Slater is working on a project that analyzes 160,000 math problems to determine which ones are most effective for students and why. For another key project, the Center has received a grant from the Office of Naval Research to determine which aspects of U.S. Navy training lead to better outcomes, such as a lower risk of accidents.

In addition to the U.S. Navy project, have you done other research on workplace learning?

Yes, when I was at Columbia we did a project studying the emotional reactions of military cadets training to be combat medics. The training content that the cadets go through is, frankly, very intense. Our findings identified ways to make the training more supportive, such as messages highlighting the trainees’ ability to succeed.

How can PCLA’s findings be translated into better learning experiences for students?

Largely, findings like ours can be taken up by various educational vendors—such as developers of online or print curricula. So, for example, a vendor that provides an algebra curriculum to a thousand schools could make use of our findings to improve their content so that students will learn more effectively. Or these companies can provide the information in a digestible fashion for faculty and administrators.

Last summer, you taught your MOOC “Big Data and Education” for the first time at Penn, having brought it from Columbia. What are some of the highlights of the course?

It’s designed for people in graduate school and the workforce who want to learn the key methods of educational data mining and learning analytics. As far as I know, fewer than ten universities in the world teach these subjects, and many of them use my MOOC as their textbook. I’m excited about the ways we have continued to push the envelope for MOOC instruction and content. For this version of the course, we moved toward incorporating adaptive learning, using software that provides the content in a way that is tailored to each student’s individual needs.

Adaptive learning is often pointed to as a way to improve education. How well do you think it is currently applied, and how do you view its potential?

No currently existing system really reaches the full vision of adaptive learning: education that is sensitive to the full range of differences that students bring to bear, in which students receive individualized learning experiences that will most help them to grow. Some of the technology that is currently used to customize students’ learning in K–12 is pretty good, though it hasn’t cracked the problem entirely. K–12 education is ahead of some other areas in implementing adaptive learning, but higher education is moving a lot faster and tends to be better equipped to evaluate evidence and adopt good products. Right now, some sustainable efforts for higher education are coming out of large publishers and learning technology companies.

From left: Doctoral student Stefan Slater, Associate Professor Ryan Baker, doctoral student Juan Miguel Andres, and PCLA Associate Director Dr. Jaclyn Ocumpaugh use data to study how learning unfolds.

Through support from the Office of the Provost, PCLA has a central location in the Van Pelt-Dietrich Library to foster its collaborative role at Penn. How does the Center collaborate?

We have an important collaboration with the University’s Online Learning Initiative, which is located next to us in the library, to study the data from Penn’s MOOCs. Since joining Penn GSE, I’ve been talking with many others on campus about potential collaborations in areas such as computer science and writing, and I feel very positive about the collaborative possibilities at Penn. The Center also has research collaborations with the University of Michigan, the University of Illinois, Georgia State University, and Beijing Normal University in China, among many other places. Beyond my role at GSE and Penn, I see my job as one of building things that can do outreach from the University, so that people around the world can utilize Penn’s great research instruments.

This article originally appeared in the Fall 2017 issue of The Penn GSE Magazine.