Category Archives: Data

Back2School with Vectors, Cosine Similarity, and Word2Vec

Tomorrow, I’ll be making a return visit to the high school where I spent a decade in the mathematics department as a teacher. I’ve got the chance to speak to ten classes over the course of six class periods and tell them a little bit about what I do as a data scientist.

Since many of the students will be familiar with concepts like vectors and trigonometry, I’ve decided to do an activity involving the Python gensim package and Word2Vec. Specifically, each student was asked to submit a “Tweet” about the most interesting thing they’ve done in the last couple of weeks. I was given those Tweets last week and have prepared a short talk and code walkthrough showing how we can use Word2Vec to identify similar Tweets: we transform the unstructured text into vectors with word embeddings, then compare those vectors by their cosine similarity.
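As a rough illustration of the idea, here is a minimal sketch in plain Python. It is not the code from the talk: the three-dimensional vectors in `TOY_VECTORS` are made up for illustration (real Word2Vec embeddings would come from a trained model, for example via gensim, and typically have hundreds of dimensions), and the helper names `tweet_vector` and `cosine_similarity` are hypothetical. Averaging word vectors to represent a whole Tweet is one simple, common choice and is assumed here.

```python
import math

# Hypothetical 3-dimensional "embeddings", invented for this sketch.
# A trained Word2Vec model would supply the real vectors.
TOY_VECTORS = {
    "soccer": [0.9, 0.1, 0.0],
    "football": [0.8, 0.2, 0.1],
    "game": [0.7, 0.3, 0.1],
    "concert": [0.1, 0.9, 0.2],
    "band": [0.0, 0.8, 0.3],
    "played": [0.5, 0.5, 0.1],
    "watched": [0.4, 0.4, 0.2],
}


def tweet_vector(text, vectors=TOY_VECTORS):
    """Average the vectors of the words in `text` that have an embedding."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None  # no word in the tweet has an embedding
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


tweets = [
    "Played a soccer game",
    "Watched a football game",
    "Watched a band concert",
]
vecs = [tweet_vector(t) for t in tweets]

# Compare the first tweet against each of the other two.
sim_sports = cosine_similarity(vecs[0], vecs[1])
sim_mixed = cosine_similarity(vecs[0], vecs[2])
print(f"soccer vs. football tweet: {sim_sports:.3f}")
print(f"soccer vs. concert tweet:  {sim_mixed:.3f}")
```

With these toy vectors, the two sports Tweets point in nearly the same direction, so their cosine similarity comes out higher than that of the soccer and concert Tweets.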

I’ve decided to go ahead and share the code in a GitHub repo. If you’re interested in word embeddings, I hope you’ll find it helpful. I’m also posting the presentation I’m giving tomorrow below. Some formatting (indents, margins, etc.) got lost in the process of wrapping it in an iframe, so if you want to see it in the best possible form, check it out here.

Machine Learning Specialization Cut Short by Coursera

After an extremely long wait, today was the day that the fifth course in Coursera’s Machine Learning Specialization was set to begin. I’ve been with this specialization since it launched in the fall of 2015. Students were initially promised an ambitious slate of six courses, including a capstone that would wrap up by early summer of… Continue Reading

Minivan Price Comparison With R

With my family growing once again and my 13-year-old Mazda Protégé on the fritz, I recently decided it was time to go minivan shopping. A frugal shopper, some might say cheap, I quickly set my focus on the used, domestic market and found that there are only two competitors here, the Dodge Grand Caravan and the… Continue Reading

University of Washington Machine Learning Classification Review

I’ve spent the last couple of months working through course three in the University of Washington’s Machine Learning Specialization on Coursera. Course two was regression (review); the topic of the third course is classification. As has been the case with previous courses, this specialization continues to be taught by Carlos Guestrin and Emily Fox. For… Continue Reading

Graphing Calculator Price Dashboard

These interactive plots show the prices on Amazon for popular Texas Instruments calculators such as the TI-Nspire CX (review) and TI-84 Plus CE (review) as well as non-TI models like the Casio Prizm (review) and HP Prime (review). The graphs show the last 7 days, and they update every hour, day and night, so check… Continue Reading

Coursera Review–Machine Learning: Regression

I’ve recently completed the second course in the University of Washington Machine Learning Specialization on Coursera, “Machine Learning: Regression.” This comes on the heels of completing course 1, Machine Learning Foundations: A Case Study Approach. This course debuted right at the end of November and wrapped up 6 weeks later (my impression is that these… Continue Reading

Constructing a Social Graph With Twitter and Plotly

In a couple of earlier posts, I showed an example of a social graph created from Twitter data and Plotly, a graph of relationships between educational technology enthusiasts on Twitter. Those posts were more for the educator audience that I write for, but increasingly, I’m getting feedback on my posts from other data scientists, so… Continue Reading

Teaching Graph Theory With Twitter

In a recent post, I displayed the social network graph that I created using the Twitter API and Plotly. There are a number of interesting applications here. Given my history with education, one that I think shouldn’t be overlooked is its use by an innovative teacher and school as an engaging way to teach graph theory.… Continue Reading