I’ve recently completed the second course in the University of Washington Machine Learning Specialization on Coursera, “Machine Learning: Regression.” This comes on the heels of completing course 1, Machine Learning Foundations: A Case Study Approach. This course debuted right at the end of November and wrapped up 6 weeks later (my impression is that these courses are slipping a bit behind the timeline that was originally announced). I’d encourage you to read my review of the first course above, as I was left satisfied with the learning experience I received in the first class, but wondering if some of the concerns that students raised would be addressed.
From course one to course two, the most visible change was the primary face of the class. Emily Fox took the lead in front of the camera throughout Machine Learning: Regression, while Carlos Guestrin took a step back. Fox is very patient in her teaching style, never rushing to make a point, and giving you time to absorb information before moving on. I appreciate how she both uses concrete examples and works out the linear algebra where appropriate.
A variety of regression techniques are covered throughout the course, starting with simple linear regression and progressing through multiple regression, ridge regression, lasso, k-nearest neighbors regression, and kernel regression. Along the way, tools for assessing performance, such as residual sum of squares (RSS) and cross validation, are explained as well.
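For readers who haven't seen the starting point of that list, here is a minimal sketch of simple linear regression and RSS in NumPy. It is my own illustration rather than anything from the course notebooks; the function names and the toy data are assumptions made up for the example.

```python
import numpy as np


def simple_linear_regression(x, y):
    """Closed-form least-squares fit of y = intercept + slope * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope is the covariance of x and y divided by the variance of x.
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    intercept = y_mean - slope * x_mean
    return intercept, slope


def rss(x, y, intercept, slope):
    """Residual sum of squares of the fitted line on (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    residuals = y - (intercept + slope * x)
    return float(np.sum(residuals ** 2))


# Toy data (hypothetical): points that lie exactly on y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x
intercept, slope = simple_linear_regression(x, y)
print(intercept, slope, rss(x, y, intercept, slope))
```

Because the toy points sit exactly on a line, the fit recovers that line and the RSS is essentially zero; on real data the RSS is the quantity the fit is minimizing.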
Unlike the survey course (course one), in Machine Learning: Regression, students have the chance to implement these techniques as well as use pre-implemented versions. Having to write these implementations myself, I came away understanding each of these algorithms much better than I did going into the course. There were still one or two (lasso in particular) where I was pretty reliant on the hints and starter code provided for building the actual implementation of the algorithm.
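To give a sense of what an implementation like lasso involves, here is a rough sketch of cyclic coordinate descent with soft thresholding in NumPy. This is not the course's starter code: the function names, the fixed number of sweeps, the lack of an intercept term, and the choice of objective (RSS plus `l1_penalty` times the sum of absolute weights) are all assumptions I've made for illustration.

```python
import numpy as np


def soft_threshold(rho, threshold):
    """Shrink rho toward zero by threshold, clipping to exactly zero."""
    return np.sign(rho) * max(abs(rho) - threshold, 0.0)


def lasso_coordinate_descent(X, y, l1_penalty, n_sweeps=100):
    """Minimize RSS(w) + l1_penalty * sum(|w_j|) by cyclic coordinate descent.

    Assumes no intercept term and runs a fixed number of sweeps
    instead of checking for convergence.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    weights = np.zeros(n_features)
    # Squared norm of each feature column; constant across sweeps.
    z = np.sum(X ** 2, axis=0)
    for _ in range(n_sweeps):
        for j in range(n_features):
            if z[j] == 0.0:
                continue  # an all-zero column carries no information
            # Residual with feature j's current contribution added back in.
            residual_without_j = y - X @ weights + weights[j] * X[:, j]
            rho = X[:, j] @ residual_without_j
            # For this objective the threshold works out to l1_penalty / 2.
            weights[j] = soft_threshold(rho, l1_penalty / 2.0) / z[j]
    return weights


# Hypothetical orthonormal design, so each weight is just a shrunken target.
X_demo = np.eye(3)
y_demo = np.array([3.0, 0.1, -2.0])
print(lasso_coordinate_descent(X_demo, y_demo, l1_penalty=1.0))
```

The part worth noticing is the soft-thresholding step: any feature whose correlation with the residual falls below the threshold gets a weight of exactly zero, which is why lasso performs feature selection while ridge regression only shrinks weights.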
Once again, this entire specialization is conducted in Python, and Jupyter (IPython) notebooks are used throughout. I find that having a lot of structure with starter code is a blessing and a curse. It does keep you from getting too far off track, but there is also the reality that with less ability to wander, you may not learn certain things as well. For the most part, I think Guestrin and Fox walk this fine line pretty well.
As in course one, GraphLab from Guestrin’s company Dato was used throughout. To their credit, this time the instructors made it easier to do the work with other modules such as pandas and scikit-learn by making the data available in non-GraphLab formats from day one. I saw posts on the class forums that indicated there were students doing the work with other tools, but from my perspective, since the notebooks were already written to leverage GraphLab, I really didn’t want to reinvent the wheel and stuck with that. This was less of an issue in the assignments that required students to implement a machine learning algorithm, since those assignments were light on GraphLab and heavy on NumPy anyway.
Bottom Line Machine Learning: Regression Review
I said in my review of the first course in this specialization that I was definitely going to stick around for another course to see what I got out of it. At this point, I’m in for the duration. These courses are a nice mix of techniques and tools that I already have familiarity with and those that I’m less experienced with. The teaching style is great, and I’m finding that even when techniques are covered that I’m already familiar with, it’s far from a waste of my time. I come away with a better understanding both of the theoretical underpinnings and applications of the machine learning techniques being addressed.