Stoltzmaniac Fans – It’s time for a #100DaysOfCode update.
I have completed 11 days of the challenge. Let me tell you, it has been a blast and I have already learned a lot. In this post I’ll walk you through what I’ve done thus far. Here is a link to the code on my GitHub repository.
As you may recall from my previous post, I set out to create a Flask application to host data science projects for the Meetup group that I organize (Fort Collins Data Science Meetup). My goal is to give people an outlet to run code online, where they get the benefits of a server and a dynamic UI. This should improve the group’s collaboration and Git skills and let people showcase their work without having to build infrastructure. In case you’re wondering, I built this using Docker Compose, Flask, NGINX, PostgreSQL, and MongoDB.
To keep from boring myself to sleep while writing this, I’m going to keep it short and to the point. You might be asking, “What does this application look like?” That’s a great question. It’s a normal website where people contribute Python scripts that do some sort of data processing or analysis. For example, here’s a word cloud generator: the user enters a Twitter handle and a link to a logo of some sort, and a word cloud is created from that account’s most recent tweets! Here is @realdonaldtrump as the Republican elephant and @barackobama as the Democratic donkey.
Recently, I started looking into data sets to compete in Go Code Colorado (check it out if you live in CO). The problem with such diversity in data sets is finding a way to quickly visualize the data and do exploratory analysis. While tools like Tableau make data visualization extremely easy, the data isn’t always formatted in a way that is easy to consume. Here are a few tips to help speed up your exploratory data analysis!
We’ll use data from two sources to aid with this example:
Is George Washington better looking on the dollar bill or represented by a word cloud built with the text of The Constitution of the USA?
A colleague recently asked me that exact question. If you want to be taken seriously in the data science world, you’d better be able to answer something like this!
I decided that it would be fun to show off a Python package by Andreas Mueller called word_cloud (here) to make a fun image with the text of the Constitution and an image of one of the Founding Fathers.
I must warn you, word clouds are like pie charts: people like the way they look, but they don’t provide much information. That said, this package is really neat because it allows you to easily turn text into images utilizing masks, colors, and numpy!
I’ll keep this post short. What you want to do is simple:
Select an image which you would like to mimic in both color and shape
Read your image into Python using numpy
Read your text into Python using open() and read()
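For what it’s worth, here is a minimal sketch of how the steps above might fit together with the word_cloud package. The helper names (load_text, build_word_cloud), the file paths, and the parameter choices are placeholders of mine rather than anything from the original post, and it assumes numpy, Pillow, and wordcloud are installed.

```python
def load_text(text_path):
    """Read the whole source text with open() and read()."""
    with open(text_path, encoding="utf-8") as handle:
        return handle.read()


def build_word_cloud(image_path, text_path, out_path):
    """Shape and color a word cloud after an image (illustrative sketch)."""
    # Third-party imports live here so load_text() works without them.
    import numpy as np
    from PIL import Image
    from wordcloud import WordCloud, ImageColorGenerator

    # Read the image into a numpy array; it serves as mask and color source.
    mask = np.array(Image.open(image_path).convert("RGB"))
    text = load_text(text_path)

    # Pure white pixels in the mask are treated as "do not draw here".
    cloud = WordCloud(background_color="white", mask=mask).generate(text)

    # Recolor each word using the image colors underneath it.
    cloud.recolor(color_func=ImageColorGenerator(mask))
    cloud.to_file(str(out_path))
    return cloud
```

Calling build_word_cloud("washington.png", "constitution.txt", "cloud.png") would write the result to cloud.png; all three file names are hypothetical.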
Anyone old enough to remember the Monty Hall problem from the old TV Show Let’s Make a Deal? It’s a classic probability problem – but despite its simplicity, it can be hard to understand what choices to make to maximize your odds of winning.
This is the problem:
You are a contestant on a game show. The host displays three doors. One has a brand new car behind it, while the other two have goats behind them. Here’s a beautiful image of all possible options you would have:
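If the choices are hard to reason about, a quick simulation can help. The sketch below assumes the standard rules of the puzzle (the host always opens a goat door that you did not pick, then offers you the chance to switch); the function names and the trial count are my own choices for illustration.

```python
import random


def play_round(switch, rng):
    """Play one game and return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)

    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])

    if switch:
        # Move to the one door that is neither the first pick nor the opened one.
        pick = next(d for d in doors if d != pick and d != opened)

    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Estimate the share of games won when always switching or always staying."""
    rng = random.Random(seed)
    wins = sum(play_round(switch, rng) for _ in range(trials))
    return wins / trials


print("always switch:", win_rate(switch=True))
print("always stay:  ", win_rate(switch=False))
```

With enough trials, switching should win roughly two thirds of the time and staying roughly one third, which matches the usual analysis of the problem.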
OpenCV is an incredibly powerful tool to have in your toolbox. I have had a lot of success using it in Python but very little success in R. I haven’t done much beyond searching Google, but it seems that “imager” and “videoplayR” provide a lot of the functionality, though not all of it.
I have never actually called Python functions from R before. Initially, I tried the “rPython” library, which has a lot of advantages but was completely unnecessary for me, so system() worked absolutely fine. While this example is extremely simple, it should help to illustrate how easy it is to utilize the power of Python from within R. I need to give credit to Harrison Kinsley for all of his efforts and work at PythonProgramming.net; I used a lot of his code and ideas for this post (especially the Python portion).