Category Archives: Data

Teaching Graph Theory With Twitter

In a recent post, I displayed the social network graph that I created using the Twitter API and Plotly. There are a number of interesting applications here. Given my history with education, one that I think shouldn’t be overlooked is its use as an engaging way for an innovative teacher and school to teach graph theory.… Continue Reading

#EdTechChat Social Network Graph

Using the Twitter API and Plotly with Python, I created a visualization of a recent #EdTechChat on Twitter, held on December 14. If you aren’t familiar with graph theory, the dots in this visualization are referred to as nodes or vertices. They represent the Twitter users who participated in the chat. The line segments connecting… Continue Reading

Databricks Review

Not too long ago, I did my first post on Apache Spark, a Spark dataframes tutorial. I’ve continued to experiment with Spark since taking my first tentative steps with it just a few months ago. One of the challenges with Spark is that it has a reputation for being difficult to deploy at scale.… Continue Reading

My First Month With Ubuntu

My journey into data science is taking me all sorts of interesting places that I didn’t originally expect. That’s what I love about it. While I can feel myself accelerating into the learning curve, there’s no shortage of new things to learn and won’t be for years to come. One of the latest has… Continue Reading

Spark Dataframes and MLlib

NOTE: I have created a new, updated, and easier version of this tutorial, based on the Spark 2.1 DataFrames API and MLlib, and I encourage readers to read it instead of this older post. A couple of months ago, I got my first experience with Apache Spark. While I am just starting to… Continue Reading

Favorite Podcasts for Data Scientists

One of my favorite learning methods is via podcasts. They allow me to multitask–exercising, driving, or doing chores–while listening to experts on a particular topic. Some of the podcasts I listen to are purely for entertainment (think Serial or StartUp) but many others are for educational purposes. As I’ve been trying to build up my… Continue Reading

Johns Hopkins Data Science Specialization Review

It’s been a couple of weeks since Johns Hopkins issued final certificates for their Data Science Specialization on Coursera. I’m glad to say that I am now among the first crop of “alums” of the program. According to the last email we students received from our Johns Hopkins professors, about 2.3 million students have attempted… Continue Reading