Part Two - Associations Between Users

Introduction

In part two, I examine the users that musical subreddits have in common. As in part one, I used Python! I wrote a script to fetch the comments from the top 50 posts of all time in each subreddit (list can be found here). I then found the users common to each pair of subreddits and set the weight of the graph edge between them to the number of common users. Fifty posts may seem like a small sample, but some of the posts have over a thousand comments and, in my experience, grabbing comments takes more time than grabbing post titles with this API.

Methodology

My methodology here is very similar to the first part. I fetched the top 50 submissions from each subreddit, grabbed every comment from them, and saved the 'author' attribute. I then filtered out errors (posts that had been deleted, for example) and condensed my data into a dictionary where the users are the keys and the subreddits they commented in are the values. I imported this data into Gephi to visualize it, and I've posted my representation below.
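As a rough sketch of the condensing and counting steps, here is how the user dictionary and the edge weights could be built from a list of (subreddit, author) pairs. The sample data, the function names, and the way deleted authors show up (as None or "[deleted]") are all assumptions for illustration, not my actual script. The Source/Target/Weight header is the column naming Gephi recognizes when importing an edge list from a spreadsheet.

```python
import csv
import io
from collections import defaultdict
from itertools import combinations

# Hypothetical sample of (subreddit, comment author) pairs.
comments = [
    ("metalcore", "alice"),
    ("posthardcore", "alice"),
    ("metalcore", "bob"),
    ("posthardcore", "bob"),
    ("deathcore", "bob"),
    ("metalcore", None),         # assumed shape of a deleted author
    ("deathcore", "[deleted]"),  # another assumed shape
    ("metalcore", "alice"),      # repeat commenters only count once
]

def build_user_map(pairs):
    """Map each user to the set of subreddits they commented in."""
    user_subs = defaultdict(set)
    for subreddit, author in pairs:
        if not author or author == "[deleted]":
            continue  # skip deleted or missing authors
        user_subs[author].add(subreddit)
    return dict(user_subs)

def count_common_users(user_subs):
    """Edge weight = number of users seen in both subreddits."""
    weights = defaultdict(int)
    for subs in user_subs.values():
        # Sorting keeps each pair in one canonical order.
        for a, b in combinations(sorted(subs), 2):
            weights[(a, b)] += 1
    return dict(weights)

def to_gephi_csv(weights):
    """Render the edges as CSV text with Source, Target, Weight columns."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (a, b), w in sorted(weights.items()):
        writer.writerow([a, b, w])
    return buf.getvalue()

user_subs = build_user_map(comments)
edge_weights = count_common_users(user_subs)
print(to_gephi_csv(edge_weights))
```

Storing subreddits in a set per user means a user who comments many times in one subreddit still contributes only one to each edge.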


The Data

ForceAtlas 2 Layout

ForceAtlas 2 is one of the graph layouts available in Gephi. I chose it because it kept the distances between nodes manageable while still showing the strength of the relationships between subreddits spatially.

On the right side of the graph, we see our hardcore genres and their derivatives. /r/metalcore, /r/posthardcore, /r/melodichardcore, and /r/deathcore are all very strongly related. /r/poppunkers also tends to share a lot of users with the 'core' genres, even though it is pretty different in terms of sound and style.

As we move up the graph counterclockwise, we transition more into metal and its derivatives. Right between the two we see /r/progmetal and /r/djent, which take styles from both sides of the heavy music coin and add their own twist. If you haven't heard of djent, not a problem - you can read about it here. Closer to the middle, we can see /r/metal serves as a central hub for a lot of genres - hardcore, punk, metal, dubstep, and hiphop.

Closer to the bottom, we see the cluster of electronic music subreddits. I couldn't tell you what the differences are between them myself, because I spend most of my time over on the right side of the graph in the hardcore music variants. /r/electronicmusic seems to be the center of the 'electronic music' hub, but there are 7 very closely related subreddits in that area.

/r/hiphopheads, /r/rap, and /r/90shiphop are all spread out through the middle. This implies that not only do they tend not to share many users with each other, but they also tend to share users with a wide range of genres unlike themselves (hence their central position). Even though they tend to be lumped together, they are different enough to have their own distinct sets of users.

A major issue with my method of analysis is that it doesn't account for subreddit size by normalizing the weights of the graph connections. I plan to do this in the near future and report the differences.
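One possible way to normalize, sketched here purely as an assumption since I haven't settled on a method, is the Jaccard index: divide the number of common users by the number of distinct users seen in either subreddit. This uses the sampled commenters as a stand-in for subreddit size, not subscriber counts. The sample data and the function name are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical user -> subreddits dictionary, shaped like the one
# described in the methodology.
user_subs = {
    "alice": {"metalcore", "posthardcore"},
    "bob": {"metalcore", "posthardcore", "deathcore"},
    "carol": {"metalcore"},
    "dave": {"metalcore", "deathcore"},
}

def jaccard_weights(user_subs):
    """Edge weight = |users in both| / |users in either| (Jaccard index)."""
    # Invert the dictionary: subreddit -> set of sampled users.
    sub_users = defaultdict(set)
    for user, subs in user_subs.items():
        for sub in subs:
            sub_users[sub].add(user)

    weights = {}
    for a, b in combinations(sorted(sub_users), 2):
        common = len(sub_users[a] & sub_users[b])
        if common:  # only keep pairs that share at least one user
            union = len(sub_users[a] | sub_users[b])
            weights[(a, b)] = common / union
    return weights

normalized = jaccard_weights(user_subs)
for pair, weight in sorted(normalized.items()):
    print(pair, round(weight, 3))
```

With raw counts, a large subreddit looks strongly connected to everything simply because it has more commenters. A ratio like this keeps every weight between 0 and 1, so small but tightly overlapping communities are not drowned out.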