Data Is Beautiful – November Submission – Rubik’s Cube Data

//

Jon Elordi

I entered my submission to r/DataIsBeautiful’s monthly data visualization competition. This post explains how I went through the Rubik’s Cube data. How I came up with the idea. How I cleaned the data. And ultimately how I visualized it in Tableau.

The final result can be found here. Also if you want to see my other Data viz submissions take a look at this post.

How I found My Topic – Putting Your Best Foot Forward.

The nice thing about the data visualization competition is that you have free reign to attack the data any way you want. I like to find an interesting or odd angle. It’s about telling a story, and I like weird stories. this month’s data set was about the World Cube Association. That’s right it’s data about Rubik’s Cube competitions. Surely I’d find something odd that would suit my fancy.

To start, I downloaded the data files, and just looked through them to get a feel for what they contained. The files were TSV(tab-separated values) as opposed to the more common CSV(comma-separated values). So I had to view them using notepad++ as it can handle just about every file type ever created. I started just poking around the files to see if anything stuck out. Nothing did. The size of the file surprised me. Who knew there was so much Rubik’s cube data out there?

After glancing at the data, I decided to go to the WCA website to learn more. Specifically, I went to the results page. I was digging around when I came across an interesting symbol.

WCA-symbols-foot
Is that a foot?

Is that a foot? Do people solve Rubik’s Cubes with their feet? Yes. Yes, they do. I found my angle. Here are the rules for foot-solving. And here’s a video of somebody solving a Rubik’s Cube with their foot.

Cleaning the Data

For the dashboard, I made a giant master table by merging several tables together, and dropping columns I did not need along the way.

The libraries I used:

dplyr
stringr

The code to get only the foot Rubik’s Cube Results:

WCA_Footers <- WCA_export_Results %>% 
   mutate(year = str_sub(competitionId, -4, -1 )) %>%
   filter(eventId == "333ft" & year > 2000) %>%
   select(competitionId, best, average, personName, personId, year)
 
Footer_personal<- merge(x = WCA_export_RanksAverage, y = WCA_export_RanksSingle,                          by.x = c("personId","eventId"), by.y = c("personId","eventId")) %>% 
                   filter(eventId == "333ft")
 
Footer_personal_withnames <- merge(x= Footer_personal, y = WCA_export_Persons, by.x = "personId", by.y = "id", all.x = TRUE, all.y = FALSE)
 
Master_Table <- merge(x = WCA_Footers, y = Footer_personal_withnames, by.x = "personId", by.y = "personId", all.x = TRUE, all.y = FALSE)

I took the WCA_export_Results file and took what I need from it. I used mutate from the dplyr library to extract the year from the competitionId variable. The str_sub function from the stringr library takes a specific substring from a string. The way I have it configured, the function takes the last four characters. This is where the year is in the competitionId naming convention.

Then I chained the mutate using %>% as is common with dplyr. Using filter I selected only eventId‘s with the “333ft,” these are the footers as I’ve been calling them. And to simplify the data I only took data after the year 2000.

After that, I chained select to only have the columns I needed. In this case were: competitionId, best, average, personName, personId, year.

After this, there cam a lot of merging. These were left-hand joins. I combined WCA_export_RanksAverage and WCA_export_RanksSingle to make Footer_Personal which had the player’s average and best footer times. Then I took Footer_Personal and merged it with WCA_export_Persons to get the player’s names, gender, and country. Before that, it was by personid. Therefore, I combined WCA_Footers with Footer_personal_withnames to get my master table.

Making the Viz

After cleaning the data, and getting my Master Table. I loaded the data into Tableau. Wanting to practice my skills using tableau actions, I decided to make a simple dashboard that highlighted players using a table and a time series. The goal was to have the different graphs linked together so that when you clicked on one, the others would highlight.

The visualization lives here. Feel free to play around with the highlighting function.