Author Archives: Lucas Allen

Retro Game Retrieval Engine Design

I’ve got a new Shiny web app that I’ve embedded on another site where I’m doing some experimental things, and I wanted to talk generally about how I created it. The web app lets the user run interactive searches for similar classic home console games, from what is generally known as the third generation (NES, Sega Master System) through the sixth generation (Wii, PS2, Xbox). Similarity between games was determined from the text used to describe them in their Wikipedia articles. It was not a small project to put together, and I got distracted numerous times along the way, so it actually came together over several months.

Here’s my workflow at a high level:

Gather Game Lists/Links

Wikipedia contributors helpfully offer lists for just about every classic game system (example: Nintendo Entertainment System). Once I decided on a dozen or so systems, I easily had thousands of potential games to include. Unfortunately, there’s very little consistency in how the game tables are constructed. This meant that gathering the links to individual game pages required a custom Beautiful Soup script for each list page.
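For readers who want to try something similar, here is a minimal sketch of the link-gathering idea: walk a list page's table and keep the first article link in each row. It is not the script used for this project. The post used Beautiful Soup, while this sketch swaps in the standard library's html.parser to stay dependency-free, and the sample markup, the class name TableLinkParser, and the helper extract_game_links are all invented for illustration. Real list pages vary enough that each one would need its own tweaks.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "https://en.wikipedia.org"


class TableLinkParser(HTMLParser):
    """Collect the first article link found in each table row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_table = False
        self.row_has_link = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.in_table = True
        elif tag == "tr":
            # A new row starts, so we are looking for its first link again.
            self.row_has_link = False
        elif tag == "a" and self.in_table and not self.row_has_link:
            href = dict(attrs).get("href") or ""
            # Keep article links only; skip footnote anchors and
            # namespaced pages such as File: or Category:.
            if href.startswith("/wiki/") and ":" not in href:
                self.links.append(urljoin(BASE_URL, href))
                self.row_has_link = True

    def handle_endtag(self, tag):
        if tag == "table":
            self.in_table = False


def extract_game_links(html):
    """Return absolute URLs for the first article link in each table row."""
    parser = TableLinkParser()
    parser.feed(html)
    return parser.links


# Invented sample markup standing in for one list page.
SAMPLE_HTML = """
<p><a href="/wiki/Nintendo">Nintendo</a></p>
<table>
  <tr><th>Title</th><th>Publisher</th></tr>
  <tr><td><a href="/wiki/Castlevania_(1986_video_game)">Castlevania</a></td>
      <td><a href="/wiki/Konami">Konami</a></td></tr>
  <tr><td><a href="/wiki/Metroid_(video_game)">Metroid</a><a href="#cite_note-1">[1]</a></td>
      <td><a href="/wiki/Nintendo">Nintendo</a></td></tr>
</table>
"""

game_links = extract_game_links(SAMPLE_HTML)
```

The "first link per row" rule is only one possible heuristic; a page that puts the release date or publisher in the first column would need a different rule.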

Retrieve Game Data

Once the individual game pages had been determined, I needed the individual game page data. Luckily, this turned out to be a bit easier. Rather than using Beautiful Soup, I found the Wikipedia API to be a simpler alternative. While using on-page text wasn’t a perfect solution (some games in a series share the same Wikipedia page, while other games point to the movie they are derived from), the vast majority of games do have a standalone page that tells the story of the game, as well as something about its development and history.
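As a rough illustration of this step, the sketch below asks the MediaWiki action API for the plain-text extract of a single page. It assumes the query/extracts route (the TextExtracts module available on Wikipedia); the post does not say which API route or client library was actually used, and the function names and User-Agent string here are made up for the example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_extract_url(title):
    """Build a query URL asking for the plain-text extract of one page."""
    params = {
        "action": "query",
        "prop": "extracts",
        "explaintext": 1,   # plain text instead of HTML
        "redirects": 1,     # follow redirects to the canonical page
        "format": "json",
        "titles": title,
    }
    return API_ENDPOINT + "?" + urlencode(params)


def fetch_page_text(title):
    """Download the extract for a title; returns '' if the page has none."""
    request = Request(
        build_extract_url(title),
        headers={"User-Agent": "retro-game-demo/0.1 (example script)"},
    )
    with urlopen(request, timeout=30) as response:
        data = json.load(response)
    # Results are keyed by page id; one title was requested, so take the only entry.
    page = next(iter(data["query"]["pages"].values()))
    return page.get("extract", "")


example_url = build_extract_url("Castlevania (1986 video game)")
```

Calling fetch_page_text on each title gathered from the list pages would give one block of text per game, which is the input the embedding step needs.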

Create Word Embeddings

I used the Python module Gensim to create the word embeddings that I’d later use to rank the game similarity. I’ve been using Gensim for about a year now, but this still required some experimentation. I considered the following possibilities:

  1. A pretrained Word2Vec model with average word vectors
  2. A newly trained Word2Vec model with average word vectors
  3. A newly trained Doc2Vec model

Ultimately, I went with option 3. I have found Doc2Vec models difficult to train; my searches online suggest I’m not alone in this. However, during the course of this project, I found a paper on Doc2Vec that changed my approach and results. This topic could be an entire blog post, but the TLDR version is that the authors explain that it is critical to crank up the number of training epochs with Doc2Vec, typically into the hundreds. To understand this better, I suggest giving their paper a read or perusing their repo on this topic. In particular, this little snippet from their train.py file is gold as a starting point for looking for those golden Doc2Vec hyperparameters:

With only a few tweaks to these parameters to parallelize on more cores and take the number of epochs a bit higher (300), and about 12 hours of waiting, I had a model that was giving intuitive results.
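To make the training step concrete, here is a hedged sketch of training a Doc2Vec model with Gensim using a high epoch count. Only the 300 epochs and the idea of using more worker cores come from this post; every other value is a placeholder chosen for illustration and is not the snippet from the paper's repo. The helper names doc2vec_params and train_doc2vec are hypothetical, and the whitespace tokenizer is a stand-in for real preprocessing.

```python
def doc2vec_params(epochs=300, workers=4):
    """Keyword arguments for gensim's Doc2Vec.

    Apart from epochs, these values are illustrative placeholders.
    """
    return {
        "vector_size": 100,  # dimensionality of each document vector
        "window": 10,        # context window size
        "min_count": 2,      # ignore words seen fewer than this many times
        "dm": 0,             # 0 selects the DBOW training algorithm
        "epochs": epochs,    # many passes over the corpus
        "workers": workers,  # threads used for training
    }


def train_doc2vec(texts_by_title, **overrides):
    """Train a Doc2Vec model on a {title: article text} mapping."""
    # Imported here so doc2vec_params can be used without gensim installed.
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    corpus = [
        # Naive whitespace tokenization; a real pipeline would do more cleanup.
        TaggedDocument(words=text.lower().split(), tags=[title])
        for title, text in texts_by_title.items()
    ]
    model = Doc2Vec(**doc2vec_params(**overrides))
    model.build_vocab(corpus)
    model.train(corpus, total_examples=model.corpus_count, epochs=model.epochs)
    return model


params = doc2vec_params()
```

With Gensim 4.x, the trained vector for a game would then be available as model.dv[title], and those vectors are what get exported for the similarity search.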

Move to R/Shiny

Up to this point, everything I’d done had been in Python. However, I wanted a Shiny app to share my results. To make this work, I saved all of the document vectors I’d created to a CSV. In addition to creating the Shiny app itself, I needed an efficient way to do the cosine similarity calculation. Gensim offers that capability, but I was giving that up by moving to R. I needed to vectorize that calculation, and while I’ve gotten pretty comfortable with those sorts of broadcast calculations in NumPy in recent months, I hadn’t had a need to do vectorized array operations in R. One solution turns out to be R’s “sweep” function.
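The calculation itself is small: scale every document vector to unit length, then take dot products against the query vector and sort. The sketch below shows that logic in plain Python with toy three-dimensional vectors; it is not the R code from the app, and the function names and game labels are invented. In R, the row-scaling step is the part a call such as sweep(m, 1, norms, "/") would handle, and in NumPy it would be a broadcast division.

```python
import math


def normalize(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def most_similar(query_title, vectors_by_title, top_n=3):
    """Rank all other titles by cosine similarity to the query title."""
    # After normalization, cosine similarity is just a dot product.
    unit = {title: normalize(vec) for title, vec in vectors_by_title.items()}
    query = unit[query_title]
    scores = [
        (title, sum(a * b for a, b in zip(query, vec)))
        for title, vec in unit.items()
        if title != query_title
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# Toy vectors standing in for the exported Doc2Vec document vectors.
toy_vectors = {
    "Game A": [1.0, 0.0, 0.0],
    "Game B": [0.9, 0.1, 0.0],
    "Game C": [0.0, 1.0, 0.0],
    "Game D": [-1.0, 0.0, 0.0],
}

ranking = most_similar("Game A", toy_vectors)
```

Normalizing once up front is the design choice that matters here: every later query is then a single pass of dot products rather than a fresh norm computation per pair.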

Ultimately, the Shiny app itself is fairly simple. I haven’t built one in a while now, and it gave me a chance to use a few features that are a little more recent in Shiny’s release history. I built it with the shinydashboard library, which offers a great way to responsively lay out the widgets within your Shiny app so that it feels more “dashboard-like.” I also added a few responsive info boxes, and I was pleased to see that RStudio has integrated a ton of icons to choose from, including a gaming icon that was appropriate for this app.

So if you’re curious, check out the results in the Shiny app. Not every result is completely intuitive, but most top results are, and some are very cool, like when Doc2Vec picks up on the fact that Castlevania and Van Helsing are both vampire hunter games.

TI-Innovator Ranger Demo and Code

A few weeks back, I added my review of the TI-Innovator. I had a couple of demos in that review, and I’ve been adding an explanation and code so interested teachers and students can try them out in their own classrooms. In this post, I’m taking the TI-Innovator Ranger for a spin. The Ranger technology… Continue Reading

TI-Innovator Hub Demo

I recently added a review of the TI-Innovator. I had a couple of simple demos in there, and I promised I’d show how I did them later. Here’s the first of two blog posts explaining one of those demos. This one is really simple, just playing a few notes from the original Super Mario Bros… Continue Reading

TI-Innovator Review

A few months back, Texas Instruments announced a new STEM education product they had developed that would encourage kids to develop coding skills right on their graphing calculators, the TI-Innovator. The Innovator would work with either the TI-Nspire CX family or the TI-84 Plus CE, the latest generation of Texas Instruments graphing calculators. I… Continue Reading

Machine Learning Specialization Cut Short by Coursera

After an extremely long wait, today was the day that the fifth course in Coursera’s Machine Learning Specialization was set to begin. I’ve been with this specialization since it launched in the fall of 2015. Students were initially promised an ambitious slate of six courses, including a capstone that would wrap up by early summer of… Continue Reading

How to Draw Mario on the TI-Nspire

A few months back I had some time on my hands and did a post on how to graph Mickey Mouse with the TI-Nspire. Today I found myself in the same situation and decided to try my hand with the classic Nintendo character Mario on the Nspire. I imposed the same rules on myself as… Continue Reading