In her 2001 book Anime from Akira to Princess Mononoke, Professor Napier showed that many fans of anime work in computer science and its related fields. The survey also happened to show that “over 70 percent had a grade point average of 3.0 or higher, which is especially impressive when one considers the academic rigor of scientific fields.”
Anime has a pretty well-known reputation for creating men of culture. That’s a clear indication that anime fans can be profoundly affected by the medium. In addition, many prolific open source contributors have anime characters as their profile picture. That got me thinking: does being a fan of anime also make you a more intelligent person?
Of course, a question like that is nearly impossible to answer directly. After all, there are countless ways to measure intelligence, and anime fandom is so broad that no one definition can fit all cases. For example, should we consider someone who has only watched Spirited Away, and liked it very much, but has no exposure to other forms of anime, to be an anime fan? What about people who only read manga? Or those who exclusively watch whatever this is supposed to be?
A smaller question that’s easily answerable is whether having an anime profile picture correlates with being a better programmer. After all, if someone takes the effort to set their profile picture to a waifu, they clearly have some fondness for anime. As for being a “better programmer,” we’ll just equate being better with having more activity on GitHub. Being good at programming does require some amount of critical reasoning and logic skill, which should equate to a higher intelligence. Of course, this metric could easily be abused by having a cron job make a ton of commits, but it’s a measure of programming activity that should be Good Enough™.
Luckily, Google provides their image labelling API for very cheap (or free, if you have GCP credit). As an example, putting in an image of best girl Mai Sakurajima from Rascal Does Not Dream of Bunny Girl Senpai into the demo provided, I’ll get this list of labels back from it:
Notice how one of the labels is “Anime”? That’s a surprise tool that will help us later :) Google also provides a Python API, which makes it even easier to check images, since all you have to do now is check if “Anime” is one of the tags:
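    from google.cloud import vision

    client = vision.ImageAnnotatorClient()

    def anime_or_not(image):
        # Ask the Vision API for labels and look for an "Anime" tag
        response = client.label_detection(image=image)
        labels = response.label_annotations
        for item in labels:
            if item.description == "Anime":
                return True
        return False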
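To try the label check without Vision API credentials, the matching logic can be pulled out into its own function. This is only a minimal sketch, not the code from my script: `has_anime_label` and its `min_score` cutoff are hypothetical, and the stand-in label objects merely mimic the `description` and `score` fields that Vision label annotations carry.

```python
from types import SimpleNamespace


def has_anime_label(labels, target="Anime", min_score=0.0):
    """Return True if any label matches `target` with at least `min_score`.

    `labels` is any iterable of objects with `description` and `score`
    attributes, such as `response.label_annotations`.
    """
    return any(
        label.description == target and label.score >= min_score
        for label in labels
    )


# Stand-in labels (made-up values) shaped like Vision API annotations.
fake_labels = [
    SimpleNamespace(description="Cartoon", score=0.95),
    SimpleNamespace(description="Anime", score=0.91),
    SimpleNamespace(description="Hair", score=0.88),
]

print(has_anime_label(fake_labels))                  # True
print(has_anime_label(fake_labels, min_score=0.99))  # False
```

A score cutoff is optional here; the check described above only looks at whether the “Anime” label is present at all.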
As for GitHub activity, we can use the events API, which is roughly analogous to a user’s contribution history graph. We’ll measure activity simply by the number of events for each user, so every event (opening a PR, creating a repo, etc.) is given equal weight. In other words, the count approximates how green a user’s contribution heatmap is.
PyGithub wraps the GitHub API in an easy-to-use library, so getting the number of events for a user, as well as their profile picture’s URL, is pretty simple:
    # g is an authenticated github.Github instance
    users = g.get_users()
    for user in users:
        # Count every public event, whatever its type
        event_count = 0
        for event in user.get_events():
            event_count += 1
        is_anime_image = check_if_weeb(user.avatar_url)
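The equal-weight counting can also be tried on its own, without touching the API. The sketch below is an illustration rather than my actual script: `count_events` is a hypothetical helper, and the fake events only mimic the `type` attribute that PyGithub event objects expose.

```python
from collections import Counter
from types import SimpleNamespace


def count_events(events):
    """Count events with equal weight, and tally them by type for reference."""
    by_type = Counter(event.type for event in events)
    return sum(by_type.values()), by_type


# Stand-in events (made-up) shaped like PyGithub Event objects.
fake_events = [
    SimpleNamespace(type="PushEvent"),
    SimpleNamespace(type="PushEvent"),
    SimpleNamespace(type="PullRequestEvent"),
    SimpleNamespace(type="CreateEvent"),
]

total, by_type = count_events(fake_events)
print(total)                 # 4
print(by_type["PushEvent"])  # 2
```

Only the total is used as the activity score; the per-type tally is there to show what gets flattened when every event counts the same. Keep in mind that the events feed only covers a limited window of recent activity, so the count is a snapshot rather than a lifetime total.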
GitHub does rate limit the API to 5000 requests per hour for authenticated users. Since every profile costs more than one request (the user lookup plus its pages of events), that’s enough for only about 2000 profiles per hour. To work around that, we can take advantage of how GitHub profile IDs are numbered sequentially and process profiles in batches of 1000:
    for github_id in range(1200000, 1201000):
        try:
            user = g.get_user(github_id)
        except Exception:
            # Skip IDs that fail to load
            continue
        # do user stuff here
I’ve modified the get_user function here to use the undocumented /user/:id endpoint. This hasn’t been implemented in PyGithub yet, but this issue seems to be tracking it.
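For anyone who would rather not patch the library, the same idea can be sketched with the standard library alone. This is not my modified get_user: `id_batches`, `build_user_by_id_request` and `fetch_user_by_id` are hypothetical names, the ID range is just for illustration, and the token handling assumes a personal access token sent in the `Authorization` header.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def id_batches(first_id, last_id, batch_size=1000):
    """Yield ranges of consecutive IDs, each at most `batch_size` long."""
    for start in range(first_id, last_id, batch_size):
        yield range(start, min(start + batch_size, last_id))


def build_user_by_id_request(github_id, token=None):
    """Build (but don't send) a request for the /user/:id endpoint."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"token {token}"
    url = f"{API_ROOT}/user/{int(github_id)}"
    return urllib.request.Request(url, headers=headers)


def fetch_user_by_id(github_id, token=None):
    """Send the request and decode the JSON body (needs network access)."""
    request = build_user_by_id_request(github_id, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


batches = list(id_batches(1200000, 1203500))
print(len(batches))      # 4
print(batches[0])        # range(1200000, 1201000)

request = build_user_by_id_request(batches[0][0])
print(request.full_url)  # https://api.github.com/user/1200000
```

Working through one batch at a time makes it easy to stop when the hourly quota runs out and pick up again from the next range.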
All that’s left is to link these APIs up and save the data. It’s simple enough to loop through users with the /users GitHub API endpoint, send each profile picture over to the Google Vision API, note down whether it was an anime profile picture along with the number of events for that user, and finally log everything into a CSV for analysis later. That’s exactly what I did, and you can see my code here. It’s very research quality, so don’t expect much.
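The logging step is the simplest part to sketch. The example below is an assumption about the layout rather than a copy of my script: the `is_anime_face` and `contribs` columns match the names used in the analysis later on, while `github_id`, `write_rows` and the rows themselves are made up for illustration.

```python
import csv
import io

FIELDS = ["github_id", "is_anime_face", "contribs"]


def write_rows(rows, out):
    """Write one CSV row per profile to the file-like object `out`."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Made-up rows, written to memory instead of a file.
buffer = io.StringIO()
write_rows(
    [
        {"github_id": 1200000, "is_anime_face": False, "contribs": 12},
        {"github_id": 1200001, "is_anime_face": True, "contribs": 30},
    ],
    buffer,
)
print(buffer.getvalue())
```

Swapping the `StringIO` buffer for `open("profiles.csv", "w", newline="")` would write the same rows to disk; the file name is, again, just a placeholder.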
So now I’ve got a table of 3497 GitHub profiles, of which only 23 have anime profile pictures. Here’s a box plot that displays the distribution of user activity by profile picture type:
Hmm, the users with an anime profile picture do seem to have a higher average number of events. But we can’t stop here. Keep in mind that there are far more samples of users without anime profile pictures than with, and that both groups have a comparatively high number of outliers. To be sure that the difference here is statistically significant, we’ll need to do a t-test:
    from scipy.stats import ttest_ind

    # Split the activity counts by profile picture type
    cat1 = df[df['is_anime_face'] == True]
    cat2 = df[df['is_anime_face'] == False]
    ttest_ind(cat1['contribs'], cat2['contribs'])
That provides a p-value of 0.2371. We have to conclude that the higher average we got isn’t statistically significant, since our p-value of 23.7% doesn’t meet the traditional 5% cutoff. Therefore, we must once again acquiesce to Betteridge’s law and fail to reject our null hypothesis: having an anime profile picture does not necessarily correlate with your abilities as a programmer.
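For the curious, the statistic behind that test can be worked out by hand. The sketch below uses only the standard library and made-up numbers, not the data from this experiment; `t_statistic` is a hypothetical helper. With `equal_var=True` it pools the two variances, which is what scipy’s `ttest_ind` does by default, and with `equal_var=False` it computes Welch’s variant, which may be the safer choice when the groups differ in size as much as 23 versus a few thousand.

```python
from math import sqrt
from statistics import mean, variance


def t_statistic(a, b, equal_var=True):
    """Two-sample t statistic and degrees of freedom.

    equal_var=True pools the variances (Student's t-test);
    equal_var=False uses Welch's version.
    """
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)  # sample variances (n - 1 denominator)
    if equal_var:
        dof = na + nb - 2
        pooled = ((na - 1) * va + (nb - 1) * vb) / dof
        se = sqrt(pooled * (1 / na + 1 / nb))
    else:
        se = sqrt(va / na + vb / nb)
        dof = (va / na + vb / nb) ** 2 / (
            (va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1)
        )
    return (mean(a) - mean(b)) / se, dof


# Made-up activity counts, not the data from this experiment.
anime = [12, 30, 7, 45, 22]
other = [5, 9, 14, 3, 8, 11, 6]

t, dof = t_statistic(anime, other)
print(round(t, 3), dof)
```

Turning the statistic into a p-value still needs the t distribution, which is the part scipy handles for us.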
Further work into this topic can be done, however. Since this project only looked at a small number of users, who were among the first to register, it is not a representative slice of the GitHub user population. In addition, it may also be enlightening to include the inactive users skipped in this experiment.