Data
Info Viz
Design

Occasionally, Parenting

The purpose of this blog is to chronicle my journey through data viz and web development. It's not fancy, but my plan is that as these skills grow I'll improve the look and feel of this blog as well. I'm the type of person who likes to learn from scratch, maybe doing it the hard way just to know what that means. For that reason, it's built all from "scratch." No WordPress or CSS frameworks here. Just the good stuff.

Where's da' Powder?

**The graphics in this post work much better on a laptop or bigger screen.**

I moved to Colorado in November of 1997, following an old snowboarding friend. I had come out to visit this friend the two years before, probably around February as that's when our winter vacations are back east. I was already solidly addicted to snowboarding but up until that first trip out here, having ridden New England ice my entire life, I was, alas, a park rat. I had never experienced the wonders, and sometimes extreme frustrations, of real powder. My memories of those two trips out to visit him in Vail, which would have been the winters of '95/'96 & '96/'97, are epic. Waist deep snow every day. Dropping cliffs. Goggles and hats flung off me as I gleefully cartwheeled through the powder (this was before helmets).

I was hooked. I declared to my parents that I would be only applying to colleges in Colorado and that I would be taking a year off (at least one) to live in Vail where it snowed a foot every day and everyone was happy.

Move out here I did and I have snowboarded (and now skied) a ton. But a conversation with another powder addict friend this winter sparked a question for me. Since those years visiting, it feels like we've never really gotten that kind of snow again. Was it just my imagination of how much snow Colorado got those winters, especially in comparison to New England, or were those truly epic years and I just got unlucky and moved here a year too late?

So that's the impetus for this data adventure. Of course, one question tends to lead to another. What has the snow been like at other mountains? Should I really have moved to Tahoe instead (as another friend and I had contemplated way back then)? Is there a best time of year to go visit various mountains?

It turns out tracking snowpack is a little complicated. So here's the nitty gritty disclaimer. Snow comes in many forms. It comes in Colorado Champagne, and Sierra Cement, and everything in between. What this means is some snow has significantly more water content than other snow. There are as many snow preferences as there are powder addicts so I'll leave that debate to another time. The snow data I collected for this project is called Snow Water Equivalent. It means, if you took the snow that fell and melted it, the SWE is the amount of water you would end up with. Clearly denser snow will have a higher SWE. The way to deal with this from a "powder" perspective is to then look at the density of the snow and calculate the snow depth. Unfortunately, historically, SWE is a much more commonly collected metric than density. For that reason all the data in this post is the raw SWE. I will note, here and there, where consideration should be taken for the varying density of snow.
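
For the curious, here is a rough sketch of that SWE-to-depth calculation in Python. It is not what I used for the charts (those are raw SWE). The function name and the two density values are made up for illustration; density here means the snow's density as a fraction of water's density.

```python
def swe_to_depth(swe_inches: float, snow_density: float) -> float:
    """Estimate snow depth from Snow Water Equivalent.

    snow_density is a fraction of water's density
    (0.10 means the snow is 10% as dense as water).
    """
    if not 0 < snow_density <= 1:
        raise ValueError("snow_density must be a fraction in (0, 1]")
    return swe_inches / snow_density


# Illustrative densities only; these are assumptions, not measurements.
light_powder = swe_to_depth(1.0, 0.05)  # very dry snow
heavy_cement = swe_to_depth(1.0, 0.20)  # wet, dense snow
print(light_powder, heavy_cement)
```

The same inch of SWE turns into a much deeper pile when the snow is light, which is why raw SWE flatters the mountains with dense snow.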

First, I think I need to answer the very first question that sparked this research: Were the winters of '95/'96 & '96/'97 in Vail really as epic as I remember?

Snow Water Equivalent for the Years 1987 - 2018

It wasn't just a dream. In fact February of '96 & '97 were some of the best Februarys on record since this data started being collected in 1987! I'm not crazy, just unlucky. Or maybe I should consider myself lucky for having been able to come out those years, even just to visit.

Let's look at the rest of the mountains now. I've collected data from 21 of the largest ski mountains in the west. I'll admit, it's very Colorado heavy. I do live here. The data for all the US mountains was collected from SNOTEL sites near each mountain. A SNOTEL site is a remote snow monitoring station set up by the USDA Natural Resources Conservation Service. The data from the Canadian mountains was collected from a British Columbia government site.

When looking at this chart, remember that some mountains have denser snow than others. The massively high snow totals for Squaw are likely more indicative of their "Sierra Cement". They do report some amazing snow though, like in 2017 when there was so much snow they had to dig the lifts out.


Now let's look at how the snowfall changes throughout the winter. I was curious to see if one could choose where to ski based on the month. Are some mountains better early season and others better late season?

Mean Snow Water Equivalent per Month for all Years Reporting

For the most part the mountains look about the same. Clearly some shed snow earlier than others (Eldora, Purgatory). Others take a bit longer to get going in the fall (Mt Baker, Jackson, Snowbird).
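
If you want to see what "mean SWE per month for all years reporting" boils down to, here is a minimal Python sketch. The actual cleaning was done in Excel and R, so treat this as an illustration only. The readings and the `monthly_means` name are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (year, month, swe_inches) readings for one mountain.
readings = [
    (1996, 1, 10.0), (1996, 2, 16.0),
    (1997, 1, 12.0), (1997, 2, 18.0),
    (1998, 1, 8.0), (1998, 2, 11.0),
]


def monthly_means(rows):
    """Average SWE for each calendar month across every year reporting."""
    by_month = defaultdict(list)
    for _year, month, swe in rows:
        by_month[month].append(swe)
    # Sort by month number so the result reads in calendar order.
    return {month: mean(values) for month, values in sorted(by_month.items())}


print(monthly_means(readings))
```

Each small multiple in the chart is one mountain's version of this dictionary, one value per month.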

Up next for this data is getting down to the nitty gritty of powder days. Can we quantify a powder day and, if so, are there patterns we can see in their frequency? Do some mountains get more powder days in certain months than others? I've already started looking at the data and it's messy but I think there's something there.



Technical Notes
The data for this project was cleaned and arranged using a mixture of Excel and R. With the exception of the static first chart, the two other charts are D3. I tried a number of different chart types for this project. Most notably, I tried a small multiples radar chart for the monthly mean data. While it looked cool, it was not as easily understandable as the small multiples area chart I ended up with. As visualizers this is a constant battle: cool looking vs. coherent. Sometimes we get lucky and get both. It was hard to give up on the radar chart as I had spent a lot of time getting it to work and cleaning the data to pacify the D3 beast. But when my husband looked at it and said "I don't get it"-- and then said it again after I explained it to him-- I knew it was a stretch.

All the D3 code for this project (and all my other projects) can be found at my bl.ocks page. I try hard to comment them as thoroughly as possible. I believe that not commenting your code and then posting it for others to learn from is a form of newbie hazing.

Color Theory

After a spirited discussion with a client about color theory and the colors we should use for his project it occurred to me that maybe I needed a more research-based approach to color selection. This, then, sent me down a rabbit hole of color research. It's deep, that hole. Color theories abound. There are theories about mood and feeling, issues with visual disabilities related to color (color blindness), and then, specific to data viz, issues surrounding discernibility and color connotations. Some color palettes work well for charts with big elements or marks while those same palettes are indistinguishable on charts with smaller marks, like scatter plots.

Luckily (or for the more obsessive of us, unluckily), there are tons of online tools to help choose colors for visualizations. I'm not going to go through every one as there are too many and some other very nice bloggers have already listed many of the best. (This is a great example.) Instead, I'm going to list a smaller selection that I, personally, find useful.

I would break the tools into two categories: Color Helpers & Color Checkers.

Color Helpers can help you choose a palette but really don't tell you whether it's a good palette for data viz from a research perspective.

Color Checkers are based in color and perception research and can check if your palette passes muster for good data viz.


Color Helpers
Kuler
Kuler is my main go-to. It's super powerful in that I can find a color using RGB, CMYK or hex code and it can generate monochromatic, complementary, and shades palettes for me. This is a great place to start if you already have a single color that you want to work off of. Kuler also has saved palettes from other users. You can search by keyword and find some great palettes ready made which you can then tweak to your needs. Finally, I love Kuler because I can save a palette directly to my Adobe library to be used in the various other Adobe software I use (no, I'm not sponsored by them, but if you're interested, Adobe... :)

Color Hunt
This is a simple site with some really gorgeous palettes on it. Users save their palettes to the site so there are seemingly infinite choices. It doesn't have the sentiment or keyword search capability that Kuler has but its simplicity can be refreshing. My kids used this site to choose the colors to paint their playhouse.

Paletton
The main benefit of this color picking site is that you can see how colors might look on a standard web page layout as the main or accent colors. It's a nice option but I don't use it much as I tend to get what I need from the two above.

Color Checkers
Color Brewer
This site is the go-to if you're choosing colors for choropleth maps. It's also really great for treemaps or other chart types where you have a few color-differentiated areas and you want the user to be able to easily distinguish between them. The color palettes are based on research Cynthia Brewer did into how many colors can realistically be distinguished and the best color choices from a user standpoint.
I will say, they're not the sexiest of color palettes. If you have your heart set on a certain dark peach, prepare to be disappointed that there isn't a choropleth palette you can make from it on this site.

Viz Palette
This site was created by two giants in the field as a catch-all for the current research-based color picking theory. Honestly, I'm still trying to figure out what it does. It has the ability to amend your color palette for four different color deficiencies. It also links to some other interesting color palette generators. (Repeat, there are A LOT of these!) One thing I do like about this is the ability to export the color palette as an object.

Choosing colors can be a complicated task and opinions abound. Happy Hunting!

Bl.ock Party!

At our last D3 Meetup here in Boulder we experimented with a Bl.ock Party and it went great! Brian and I chose a few D3 Bl.ocks (free standing chunks of D3 code that are used as examples of how to create various graphics in D3).
I then served as the moderator and scribe. I had the bl.ock up on my computer (and the screen) and, as a group, we went through it and commented the hell out of it. In the process, so many interesting discussions were sparked and I learned a ton. I'm looking forward to doing it again!

In the meantime, the bl.ock we commented, plus two more I was inspired to sift through are here.

Experimenting with Medium

So, I'm experimenting with publishing on Medium. I'm still trying to figure out the whole idea of "Publications" and such but I've posted my first story, which is really just my Breast Cancer Costs website reworked for Medium's formatting. Check it out and maybe give me a few claps! Find it Here

Amazing NYT Map Special Section

For anyone who got the physical Sunday New York Times this weekend, there was a special insert dedicated to a fantastic mapping project. The digital version is pretty cool but I'm a huge fan of the physical. I was raised with a deep love for maps and with the opening lines of this article I was hooked. "There was a time when every car's glove compartment was crammed with tattered fold-out road maps, trim rectangles that became table-size monsters that challenged you to refold them neatly." Challenge accepted.

Diving Back into D3.js

I'm diving back into D3 after quite a long hiatus. While I haven't been doing a lot of data viz in general this summer, the time I have spent on it has been more focused on learning HTML and CSS. Those are super necessary skills but now it's time to get back to the hard stuff.

When figuring out what I want to work on I find that I get quickly drowned in datasets and mired trying to figure out which is the most interesting/meaningful/uncomplicated/clean etc. So, to avoid this rabbit hole, I've chosen to just remake last week's charts in D3. Gotta start somewhere.

I started trying to make the bars with the dots on top but got stuck pretty quickly. So, as D3 tends to go, I searched and searched for the perfect bl.ock. The one below seemed like a good start. It may even be a better way of representing the information. I've also switched the data around so the party affiliation is coded by color and the advertising metric is on the x-axis. There's still some formatting that could be done to make it look nicer (titles, spacing, etc.) but I'm going to move on for now and try to get those dots in there as it serves as an interesting data wrangling problem.

Conservatives get a bigger share of the reach, for a smaller share of the cost.

Political Ads

To get back into the swing of data viz, I'm trying to create one chart a week. They're shallow dives, admittedly, but they do the necessary work of stretching my data viz muscles and helping me re-learn all the good stuff I learned during my grad program, but maybe have let lapse in the 9 months since I graduated.

For this week's chart I looked into the new database of political advertisements that . I pulled the top 10 ads for the dates May 31 - Sept 3, 2018 by total money spent and categorized each as Conservative, Liberal, or neither. These are highly subjective categorizations. To assign them, I looked up the name of each organisation that paid for the ad and looked for words like:

  • Conservative
    • Republican
    • Center Right
    • America First
    • Limited Government
  • Liberal
    • Democrat
    • Progressive
    • Pro-Choice
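
I did this lookup by hand, but the same rule could be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `categorize` function and the sample descriptions are hypothetical, the category names themselves are treated as keywords, and anything that matches both lists (or neither) falls into "Neither".

```python
# Keyword lists taken from the bullets above, lowercased for matching.
KEYWORDS = {
    "Conservative": ["conservative", "republican", "center right",
                     "america first", "limited government"],
    "Liberal": ["liberal", "democrat", "progressive", "pro-choice"],
}


def categorize(description: str) -> str:
    """Label an organisation description by simple keyword matching."""
    text = description.lower()
    matches = [label for label, words in KEYWORDS.items()
               if any(word in text for word in words)]
    # Exactly one matching category wins; ambiguous or empty is "Neither".
    return matches[0] if len(matches) == 1 else "Neither"


# Made-up descriptions, not real advertisers.
print(categorize("A Republican super PAC"))
print(categorize("Progressive advocacy group"))
print(categorize("Local trade association"))
```

Substring matching like this is crude, which is one more reason to treat the categories as subjective.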

I then calculated three percentages for each category:

  • Percentage of the total number of ads placed
  • Percentage of the total dollars spent
  • Percentage of the total "creatives"-- a measure of the total reach of the ad by views.

The Republicans have the highest percentages in all cases but, interestingly, their lead in views is greater than their lead in dollars spent, meaning they got a pretty good bang for their bucks.
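
To make the arithmetic concrete, here is a small Python sketch of the percentage calculation. The totals below are hypothetical placeholders, NOT the numbers behind my charts, and the `shares` helper is just a name I picked for the example.

```python
# Hypothetical totals per category; these are NOT the real figures.
totals = {
    "Conservative": {"ads": 5, "dollars": 500_000, "creatives": 60},
    "Liberal": {"ads": 3, "dollars": 400_000, "creatives": 30},
    "Neither": {"ads": 2, "dollars": 100_000, "creatives": 10},
}


def shares(data, metric):
    """Each category's percentage of the overall total for one metric."""
    overall = sum(row[metric] for row in data.values())
    return {label: 100 * row[metric] / overall for label, row in data.items()}


dollar_share = shares(totals, "dollars")
creative_share = shares(totals, "creatives")
for label in totals:
    print(label, dollar_share[label], creative_share[label])
```

When a category's share of creatives is bigger than its share of dollars, it is getting more reach per dollar than the field as a whole.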
I tried charting this data in two ways. First with a combination Bar and Dot plot.

Political Advertisement Bar Chart

A note about making this chart:
First, I thought I'd make it in Tableau as it's super simple, but the first time Tableau refused to layer the three data series I said NO. I've been down that road before.
Then, I thought I'd dive back into R to play with the data and make this chart. In googling "layered chart" I stumbled upon Plotly for R. It seemed so simple and cool! (red flag! red flag!) So I went about making the chart and learning how to color and size everything. BUT then I tried to export it. As far as I've been able to deduce from hours now of googling, you can export it in R Markdown but that doesn't help to get it into a blog like this. Exporting it as a PNG involved loading all sorts of other libraries. So, then I thought, ok, this is a bummer but I've already got a plan so I'll rebuild it in the free version of the browser-based Plotly GUI. Well, the colors aren't as customizable but I got it built and tried to export it either as an HTML chunk or an iframe. I could get neither to show up as anything other than a tiny window. Argg!
So, in the end I exported it as a PNG (no fun tooltips) and here we have it.

I'm taking the moral of this story to be...well, I'm not sure what to make of it. That I just need to buckle down and learn D3 for real? Probably.

When looking at the last chart, my husband didn't catch the implications that the Conservatives were getting more views per dollar spent than the Liberals. So, I thought I'd try it a different way. I'm not sure this is much better, but it's an interesting exercise.

Political Pie Chart

Makeover Monday

Last week, I came upon the data graphic “A Subjective Look at this Issue’s Fashion Stories” on p.84 of The New York Times Style Magazine. As an information visualizer I was intrigued by the piece. It was visually beautiful but contained some fundamental data graphic issues.

Here's the original piece. The piece is intended to chart the frequency of the various categories in four photo shoots throughout the magazine. The categories (listed in the legend) are somewhat arbitrary, as far as I can tell, and maybe meant to be a bit silly.


This piece is visually beautiful but I found it had some problems. First, some categories are repeated in the legend (although they do at least retain the same color mapping). I also found it hard to visually track the frequency of the various categories because the colors were so similar and I had to keep looking up and down to the legend, which was extensive.

For this week's Makeover Monday I chose to re-imagine this graphic to fix some of these issues and just to play with the data. I started by going through the four photo shoots and cataloging for myself the number of instances of each of the categories they had chosen. I'm, admittedly, not a fashion expert so I may have gotten some of the numbers wrong (e.g., what, exactly, is Tweed?). I made my best guesses.

I then calculated the percent of occurrence in each photo-shoot because they didn't have the same number of photos. For example, the first shoot, Born Free, had 4 images whereas the Shape of Things shoot had 12.
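
Here is the normalization as a tiny Python sketch. The photo counts (4 and 12) are the ones mentioned above; the instance counts are made up, and I'm assuming "percent of occurrence" means a category's instances divided by the number of photos in that shoot.

```python
def percent_of_occurrence(instances: int, photos: int) -> float:
    """Instances of a category as a percentage of the photos in a shoot."""
    if photos <= 0:
        raise ValueError("photos must be positive")
    return 100 * instances / photos


# Photo counts come from the post; the instance counts are hypothetical.
born_free = percent_of_occurrence(2, 4)
shape_of_things = percent_of_occurrence(3, 12)
print(born_free, shape_of_things)
```

In this made-up case the bigger shoot has more raw instances but the smaller shoot has the higher rate, which is exactly why I didn't compare raw counts across shoots.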

I opted for a stacked area chart for this one. It's pretty and I think it illustrates a bit more clearly which categories are represented the most and in which shoots. The color palette was chosen from the cover image of the magazine using colorpicker.com. The cover photo can be found here.

I was so excited about this version that I wanted to try another. In this case I abandoned the percentages and went for raw number of instances. I love radar charts and this seemed like a good use of it as, I find, they are beautiful but also readable.

This has been my first foray into Makeover Monday and it's been fun! Hopefully more to come.

The Playhouse

While this blog is intended to be mostly about data viz, it is summer and that means the kids are out and we're building. Our goal this summer is to build a playhouse. Some might call it a shed, we call it a playhouse. (And in two years when they get bored with it, it will be a shed.) Click here for the playhouse saga.

Inaugural Post

It's up! I've successfully broken free from WordPress and started a blog, built from scratch the old-fashioned way! I'm not exactly sure where this blog will go. My thoughts are that it will be a chronicle of my journey in Data Viz mixed with some funny pictures of my kids and ramblings about parenting. I'll try to keep those short.