After completing the Data Science Specialization from Johns Hopkins in 2014, my MOOC studies in 2015 have been fairly sporadic, partly as a result of starting a new job, and partly as a result of not seeing something that seemed like the right fit. That’s no longer the case, as I’ve recently jumped into a new specialization, the Machine Learning Specialization from the University of Washington.
As great an experience as I had with the JHU specialization, this new specialization checks a couple of continuing education boxes for me that I felt the JHU specialization left lacking.
- The UW Machine Learning Specialization is taught entirely in Python, whereas the entire JHU specialization was done with R.
- The UW Specialization is entirely focused on advancing knowledge and skills in machine learning and is designed for those with intermediate data science skills, whereas the JHU specialization only touches on machine learning since it is designed for beginners.
I’ve just recently completed the first course in this specialization, “Machine Learning Foundations: A Case Study Approach,” and wanted to write a review. There’s an awful lot to like about this class, and potentially the rest of the sequence depending on how it plays out, but there are one or two caveats to be aware of going in.
Probably the biggest thing that the class has going for it is the instructors, Carlos Guestrin and Emily Fox. You can review their academic credentials, which are impeccable, but in my experience, Coursera students don’t really care about that in their instructors, only their ability to communicate complex ideas. Guestrin and Fox excel at this.
While they don’t ignore the theoretical underpinnings of the machine learning methods taught in this course, they don’t focus time there. Instead, they spend instruction time helping the student understand the application of each machine learning algorithm in the context of a different problem (or case study, as the name of the course implies). Having taken a lot of mathematics courses at the undergraduate and graduate level, I appreciate Guestrin and Fox incorporating these case studies, as they make it much clearer how the different methods could be used in a variety of contexts.
Machine Learning Techniques Covered
As for the actual machine learning methods covered by this case study approach, they are regression, classification, clustering, recommenders, and deep learning. This is a nice variety of methods to begin with, and I was pleased to see recommenders and deep learning included since they are not typically covered in most introductory machine learning lessons I’ve seen. In theory, very little knowledge of machine learning is assumed. However, I think that a fair amount of the coding for selecting rows and columns would be somewhat mystifying (it was covered very quickly) if someone had truly no knowledge of Pandas or a data-friendly language like R or Matlab. Hopefully, most students taking this class already have some experience in Python or a similar language.
New Coursera Platform
This course leveraged the new style of Coursera class as well. This was actually a pretty significant change from what I got accustomed to in 2014. Differences I noted, most (but probably not all) of which are related to the new platform:
- Unlimited quiz retakes were offered.
- I could not see the answers I had previously tried, making it more difficult to learn from wrong answers without writing down all of my answers in advance of submission.
- It was a requirement to pass all quizzes to pass the course (i.e. total points didn’t matter).
- I could see the number of students enrolled in the class, displayed on a world map by region. The class started with a little less than 8,000 students and finished with a little less than 7,000, pretty small by Coursera standards.
- Videos didn’t work on my phone via the Coursera Android app, but they did work via the mobile site. I checked, and the other course I was enrolled in (2014 Coursera style) still allowed me to access videos via the Coursera app.
Scikit Learn vs Graphlab Create
There’s no question that the biggest source of consternation on the class forums was the use of Graphlab Create. Guestrin is the CEO of Dato, which makes Graphlab. Although every student was issued a 1-year academic license for Graphlab, there was certainly disappointment that the open source, and ubiquitous, Scikit Learn was not used for model building instead of Graphlab.
I will admit that I am among those who are disappointed that Scikit Learn was not used for the course, because I like learning things that are widely used. That said, having used both Graphlab Create and Scikit Learn, I definitely understand Guestrin’s explanation that the course is about learning concepts, not tools. Graphlab is easier to get started with building models, so more time could be devoted to understanding concepts. Some comments made on the course forums by the TAs indicated that Graphlab might not be used quite so exclusively in future classes in the Machine Learning Specialization, so it will be interesting to see how that plays out.
Bottom Line Machine Learning Foundations Coursera Review
I had initial concerns with the choice of Graphlab over Scikit Learn, but this turned out to be an excellent course. The strength of the course is the instructors’ ability to relate applicable data problems to the machine learning algorithms taught. I will be sticking with the University of Washington’s Machine Learning Specialization and look forward to deeper dives in future courses. The next course, focusing on regression, has been announced to start November 30, 2015.