What’s your [Data] point?

Audrey Watters describes “data” as one of the top trends of 2011 and predicts that it will be even more important in 2012. Big data has been a hot topic in the corporate world for about a year and a half. Corporations realized that, with the explosion of social media and the public’s willingness to volunteer private information online, they had collected a massive amount of data. Not just any data, but specific data about potential customers and their preferences. As the line between education and business blurs, our politicians (oftentimes former business leaders) are asking: if there is a formula to predict consumer spending habits, why can’t we predict student progress and/or success? Or the even scarier question, “why can’t we correlate student success with teacher performance?”

The problem with “big data” as an education data warehouse is that it cannot easily be used as a prediction engine. There are too many dynamic variables in the life of a student. For example, a model student whom a big data analytics formula would normally predict to succeed could be going through a divorce at home, which would affect his study habits. In essence, the problem is not in mining the data that is available. The problem is knowing what information is needed to determine student success. The following quote sums up this issue…

“It would be nice if all of the data which sociologists require could be enumerated because then we could run them through IBM machines and draw charts as the economists do. However, not everything that can be counted counts, and not everything that counts can be counted.” – William Bruce Cameron (1963)

Watters points to an article on her blog about a study that was conducted on the virtual classroom. The study concluded that virtual classroom students did not perform as well as their traditional school counterparts. Perhaps this can be attributed to the presence of a traditional school environment and instructor? In my opinion this is also up for debate, because the study does not disclose the subjects’ prior academic performance. We don’t know whether they had failed out of traditional schools and enrolled in virtual classrooms by the time this study was conducted. This supports the point that big data has its place in education for some analysis, but it is certainly not ready to be used as a tool for predicting student success or evaluating teachers.

Information Overload: Education Data and Learning Analytics

The What?

For a while now, our society has been keeping track of just about everything that it can: temperatures, wind speeds, the number of traffic accidents at the intersection of Broome & Sycamore Streets, social security numbers, and on and on. If we were to enumerate the names of each dataset and what kind of information it keeps, that list, too, just might be another big data dataset. We have all of this information, but the problem is that it’s so disjointed and so big that drawing connections takes massive amounts of computing power.

Information Overload

Even with an infographic like this, it’s hard to imagine just how much data we are all creating each day. Some people are trying to bring it all together to see what they can find.

But when people are able to analyze big data and draw new meaning from once-unrelated datasets, it can offer highly valuable insights, whether it’s the relationship between human activity and global warming, whether an asteroid is likely to hit the earth, or whether a student is having trouble keeping up in his or her distance learning course.

That’s why there are a couple of new buzzwords in the education industry: Education Data and Learning Analytics. There are many types of education data, such as the somewhat controversial metrics on teacher performance, and there are many issues that arise from the data, like privacy (for example, students and teachers deserve their privacy, but in order to protect it, researchers might not be able to access the data). But here I’ll only discuss the ideas in a general sense.

Education Data

Education Data is the collection of information on a student. It includes key indicators such as grades, but now also smaller bits of information, like how long a student spent reading the assigned chapters in his or her eBook.
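
To make that a little more concrete, a single student’s record might look something like the sketch below. The class and field names (StudentRecord, grades, reading_minutes) are made up for illustration; this is not a description of any real system.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StudentRecord:
    """Hypothetical record mixing traditional and fine-grained education data."""
    student_id: str
    # Traditional indicator: assignment name -> score out of 100.
    grades: Dict[str, float] = field(default_factory=dict)
    # Fine-grained indicator: eBook chapter -> minutes spent reading.
    reading_minutes: Dict[str, float] = field(default_factory=dict)

    def average_grade(self) -> float:
        # Return 0.0 when no grades have been recorded yet.
        return sum(self.grades.values()) / len(self.grades) if self.grades else 0.0

    def total_reading_minutes(self) -> float:
        return sum(self.reading_minutes.values())


record = StudentRecord(
    student_id="s-001",
    grades={"quiz-1": 80.0, "quiz-2": 90.0},
    reading_minutes={"chapter-1": 25.0, "chapter-2": 5.0},
)
print(record.average_grade())          # 85.0
print(record.total_reading_minutes())  # 30.0
```

The point is only that the newer, smaller measurements sit right alongside the grades, so they can be analyzed together.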

Learning Analytics

Learning Analytics takes the various algorithms created by educational data miners and brings them all together. So while Education Data might be used to search for an answer to a question like “what are the most successful methods for teaching the alphabet?” or “what are the main indicators of dyslexia in new readers?”, Learning Analytics will group these models into larger applications, such as one whose goal is to teach literacy but which can also, on its own, identify a dyslexic student and adjust how it presents the material to him or her.

The Why

Why Mine Data?

The hope is that by mining this sort of granular information, researchers can do things like make predictions about a student’s future performance, or find patterns that show up in successful or struggling students. They do this by inventing new computer algorithms and creating new data models that analyze the data to help us understand what it says about the issues we are interested in. The Department of Education wrote a brief about Education Data and Learning Analytics in which they explain their belief that through these types of processes they would be able to answer questions that are highly specific to an individual. I’ll list a couple of interesting ones found in the report:

  • What sequence of topics is most effective for a specific student?
  • What will predict student success?

The DoE and many researchers believe that the answers to such questions can be uncovered in these seemingly meaningless datasets.

Why Practice Learning Analytics?

Again, here are some questions that the DoE in particular hopes to be able to answer with learning analytics:

  • When are students ready to move on to the next topic?
  • When are students falling behind in a course?
  • When is a student at risk for not completing a course?
  • What grade is a student likely to get without intervention?
  • What is the best next course for a given student?
  • Should a student be referred to a counselor for help?
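
As a rough illustration of what answering the “falling behind” and “at risk” questions could look like in code, here is a minimal rule-based sketch. The function name, the inputs, and the thresholds are all invented for this example and do not come from the DoE brief; a real learning analytics system would presumably learn such rules from data rather than hard-code them.

```python
def flag_student(average_grade, assignments_submitted, assignments_due,
                 passing_grade=60.0, min_submission_rate=0.7):
    """Return a list of warning flags for one student.

    All thresholds are hypothetical defaults chosen for illustration.
    """
    flags = []
    if average_grade < passing_grade:
        flags.append("falling behind")
    # Skip the submission check when nothing is due yet (avoids dividing by zero).
    if assignments_due > 0:
        submission_rate = assignments_submitted / assignments_due
        if submission_rate < min_submission_rate:
            flags.append("at risk of not completing")
    return flags


print(flag_student(average_grade=55.0, assignments_submitted=3, assignments_due=6))
# ['falling behind', 'at risk of not completing']
print(flag_student(average_grade=82.0, assignments_submitted=6, assignments_due=6))
# []
```

A rule like this only sees what was counted. It has no way of knowing why a student stopped submitting work, which is exactly the kind of question a counselor, not an algorithm, would have to answer.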

With a mature set of data tools and learning analytics, I’m sure that Massive Open Online Courses will be able to offer eLearning to students without requiring the same level of attention (or perhaps any at all) from a professor or instructor.