ggplot2: Toward a More Coherent Visual Language

Recently I’ve been making an effort to improve the quality of my visualizations and get them out of the standard THIS-WAS-MADE-IN-GGPLOT theme. Not to say that theme is really all that bad. On the contrary, I’ve always thought it works pretty well for everyday use and certainly looks much better than anything you get standard out of, say, Excel, but it’s not as sleek as one would like. More personally, I think it’s good for someone who is trying to get their work seen by more people to display a higher level of design competence.

Clients might be put at ease knowing that I can adjust graphics to fit their desired design parameters. For myself, the goal is a unique style, where ideally someone could take one look at a graph with no attribution and say, “Grant Oliveira did that one”. The standard FiveThirtyEight theme comes to mind as an example of this. Of course, Nate Silver is famous for reasons other than how clean his visualizations are, but Silver and crew have created a definite visual style that unifies everything they create.

To me, the biggest offenders in the stock ggplot theme are the stock font (some kind of thicker Helvetica variant?) and the dark, chunky grey background. In the past I’ve typically resolved these issues in post in Photoshop. That’s fine, and I think it’ll always be an option, but I think it’d be worthwhile to work directly in ggplot’s theme function to create a baseline look that I’m happy with and can save locally and have on call whenever. After playing around a little bit, I think this is the theme that I’ll be working with most of the time.


[Figure: scatterplot of College Graduation Rate vs. Expenditure Per Student with a fitted regression line]

What we’ve got here is some data pulled from the College dataset in the ISLR package. (Side note: ISLR, or An Introduction to Statistical Learning, is an awesome textbook for anyone looking to brush up on everything from basic regression principles to advanced machine learning without getting too bogged down with mathematical notation.) The data are a bunch of different statistics about a broad selection of American universities pulled from the 1995 issue of US News and World Report, ranging from what percentage of admitted students were in the top 10% of their high school class to what percent of alumni donate. This is a basic linear regression of Graduation Rate on Expenditure per Student. As we can see, it’s not a great fit (R^2: 0.1524, p: < 2e-16), but it was a dataset I was playing with at the time and a question I thought was worth exploring.
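For anyone who wants to check the fit statistics, here is a minimal sketch of how they could be pulled out in R. It assumes the quoted numbers came from a plain lm() call on the full College data frame (the post doesn't say exactly how they were computed), and the variable names fit, r_squared and slope_p are just placeholders.

```r
# Assumes the ISLR package is installed; it ships the College data frame.
library(ISLR)

# Regress graduation rate on per-student expenditure.
fit <- lm(Grad.Rate ~ Expend, data = College)
fit_summary <- summary(fit)

# Proportion of variance in Grad.Rate explained by Expend.
r_squared <- fit_summary$r.squared

# p-value for the Expend slope, taken from the coefficient table.
slope_p <- coef(fit_summary)["Expend", "Pr(>|t|)"]

print(round(r_squared, 4))
print(slope_p)
```

If the assumption holds, the printed R^2 should land close to the value quoted above. Note that a plot that clips the y-axis with ylim() fits its smoothing line only to the rows that survive the clipping, so the line on the chart can differ slightly from this model.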

As far as the design goes, I've made the following obvious and not so obvious changes:

  • Lightened the back panel to ‘grey95’
  • Thinned major gridlines to .1 and turned them black; removed minor gridlines
  • Changed font to Calibri Light, size = 12 for title and 10 for labels
  • Set geom_point size to 2 (I’ve always thought the default is too darn small)

I’ve always really liked the stock ‘steelblue’ color, so I think for any plot that can conceivably be monochromatic I’ll use it as my color. The code looks like this (messy for now, but when I have it fully iterated for every option in the theme function I’ll post something more legible):

library(ggplot2)
library(ISLR)  # provides the College dataset

# "Calibri Light" must be installed and visible to your graphics device.
ggplot(College, aes(x = Expend, y = Grad.Rate)) +
  geom_point(size = 2, color = "steelblue") +
  # linewidth needs ggplot2 >= 3.4.0; older versions use size instead
  geom_smooth(method = "lm", color = "black", linewidth = 1.5) +
  labs(x = "Expenditure Per Student (in USD)", y = "Graduation Rate") +
  ylim(0, 110) +
  ggtitle("College Graduation Rate vs. Expenditure Per Student") +
  scale_x_continuous(labels = function(x) paste("$", x)) +
  theme(
    panel.background = element_rect(fill = "grey95"),
    plot.title = element_text(size = 12, lineheight = .8, vjust = 1, family = "Calibri Light"),
    axis.title = element_text(size = 10, lineheight = .8, vjust = 1, family = "Calibri Light"),
    panel.grid.major = element_line(linewidth = .1, color = "black"),
    panel.grid.minor = element_blank()  # removes minor gridlines entirely
  )
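One small detail in the code above: paste("$", x) puts a space between the dollar sign and the number and leaves out thousands separators. A rough sketch of a tidier label function, using only base R, might look like this. The name dollar_label is hypothetical, not something from ggplot2.

```r
# Hypothetical helper: format numbers as dollar labels for axis ticks.
dollar_label <- function(x) {
  # paste0 avoids the space that paste() inserts by default;
  # big.mark adds thousands separators, trim drops padding spaces.
  paste0("$", format(x, big.mark = ",", scientific = FALSE, trim = TRUE))
}

dollar_label(c(10000, 20000, 40000))
# [1] "$10,000" "$20,000" "$40,000"
```

It could then be passed in as scale_x_continuous(labels = dollar_label). The scales package also provides ready-made dollar formatters, which may be the better choice if that package is already loaded.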

I’m sure I’ll be playing with this post by post, but I like this as a start.
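Since the plan is to keep this look on call, one way to do that is to wrap the theme settings in a function. This is only a sketch under a few assumptions: the name theme_grant and its font_family argument are made up for illustration, the linewidth argument assumes ggplot2 3.4.0 or later, and the font has to be installed and registered with your graphics device before it will render.

```r
library(ggplot2)

# Hypothetical name; bundles the theme tweaks described in this post.
theme_grant <- function(font_family = "Calibri Light") {
  theme_grey() +
    theme(
      panel.background = element_rect(fill = "grey95"),
      plot.title = element_text(size = 12, family = font_family),
      axis.title = element_text(size = 10, family = font_family),
      # linewidth assumes ggplot2 >= 3.4.0; older versions use size
      panel.grid.major = element_line(linewidth = 0.1, color = "black"),
      panel.grid.minor = element_blank()
    )
}

# Option 1: add it to a single plot, where p is any existing ggplot object.
# p + theme_grant()

# Option 2: make it the default for every plot in the session.
# theme_set(theme_grant())
```

Saving the function in a script and calling source() on it at the top of each project would be one simple way to reuse it. Tweaks from post to post then only need to happen in one place.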

