I just received my certificate from Stanford’s Statistical Learning course, taught by the legendary Trevor Hastie and Rob Tibshirani. This was the first MOOC I’ve completed since making the jump from education to the corporate world, and I found it challenging to keep up with the material, even though this class required quite a bit less work per week than most of the courses in the Johns Hopkins Data Science Specialization on Coursera. As with many of the MOOCs I’ve taken, I wanted to share my thoughts in a review of the class for those who might be interested in taking it.
Course Lectures and Materials
Without a doubt, the video lectures are the high point of Statistical Learning. Hastie and Tibshirani are engaging speakers, they use real-world examples, and while they get into mathematical theory, they don’t get too deep “into the weeds.” I’ve completed at least part of about 16 MOOCs at this point (14 start to finish), and these were the easiest-to-understand videos I’ve ever watched as a student, with good production quality.
Hastie and Tibshirani use An Introduction to Statistical Learning with Applications in R (ISLR) as the course text. This book is available as a free PDF download or as a hard copy on Amazon. My biggest regret in the course is that I did not take greater advantage of the book (more on that later). I completed all of the assignments for the first couple of chapters, and they were really beneficial. After that, I basically skimmed it. Almost all of Hastie and Tibshirani’s examples are directly aligned to examples in the book, although they often do not go into as much depth as ISLR does.
ISLR spends a lot of time on linear regression and related topics like generalized linear models. Eventually, more advanced models such as random forests, support vector machines, and clustering are covered. I should add that for one set of lectures, the University of Washington’s Daniela Witten, one of the ISLR authors, joins the cast and does a fine job.
Assessments and Pacing
As much as I loved the lectures with Hastie and Tibshirani, the assessments were where I really thought the course needed improvement. The course as a whole is hosted on Stanford’s own site and powered by Open edX, so if you are familiar with edX courses, this feels a lot like one of those. Grading is based strictly on multiple choice questions, and you get one chance for each question. Many of the questions are a bit… quirky? Especially in the early chapters, it’s not always easy to determine what’s being asked and what assumptions one should make. Only getting one chance makes it more frustrating when you realize you’ve misinterpreted a question.
I could tell from the forums that this frustrated a great many students. I found it frustrating too, but I was more disappointed by the lack of open-ended assignments. While the lectures in this course were superior to many of those in the JHU Data Science sequence on Coursera, the JHU sequence offers open-ended programming assignments. I always learned far more from those problems than I did from the multiple choice problems. Somehow, even the multiple choice problems in the JHU sequence often got me to write a 5-10 line script to answer them, and I can’t say the same here.
I also found that without weekly deadlines, I ended up cramming several weeks’ worth of material at the end to meet the final deadline. In Statistical Learning, nothing is due until the end of the course. I know from speaking with other MOOC students about this that I’m probably an exception, but I find that regular deadlines motivate me to stay on track with course materials.
Bottom Line Statistical Learning Review
To some extent, your learning style will dictate how much you get out of Statistical Learning. Due to my learning style, I can’t honestly say that I mastered the material that I wasn’t already comfortable with, but I did get exposure to some new ideas that I can build on in the future. Additionally, Hastie, Tibshirani, and Witten all deliver masterful lectures throughout this course, and the ISLR textbook is terrific. Given the price (free), it’s hard to say that this is a bad choice if you are looking for an introduction to statistical learning methods.