Scraping IMDB Data: Are DC Films Getting Better? No, They Aren’t.

It seems like every year we are exposed to more and more superhero films, to the point that it made me wonder whether they are getting better over time. I decided to narrow my focus to DC Comics films and saw that, sure enough, the frequency of films began to increase in the late '80s.

I scraped the data with the rvest package from a list on IMDB, which shows the first DC movie as Superman and the Mole Men, made in 1951. I also downloaded datasets from IMDB’s data libraries, which contain every title in the database along with its rating and release year, and joined them to the scraped list by their unique IDs.
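As a rough illustration of the scrape-and-join step, here is a minimal sketch. It parses an inline HTML snippet instead of the live list page, so the markup, the IDs, and the rating values are all made up for the example. The column names tconst, averageRating, and numVotes assume the ratings file from IMDB's public datasets was one of the downloads, which the post does not confirm.

```r
library(rvest)

# Toy stand-in for the list page. The real page's markup may differ,
# and these IDs are placeholders, not the films from the post.
page <- minimal_html('
  <div><a href="/title/tt0000001/">Film A</a></div>
  <div><a href="/title/tt0000002/">Film B</a></div>
')

# Pull every link and keep the title ID embedded in each href.
links <- html_attr(html_elements(page, "a"), "href")
ids <- unique(regmatches(links, regexpr("tt[0-9]+", links)))

# Placeholder rows shaped like a ratings table (values are invented).
ratings <- data.frame(
  tconst        = c("tt0000001", "tt0000002", "tt0000003"),
  averageRating = c(5.0, 6.0, 7.0),
  numVotes      = c(100, 200, 300)
)

# Inner join on the unique ID keeps only the titles found on the list.
dc <- merge(data.frame(tconst = ids), ratings, by = "tconst")
```

With the real data, read_html() on the list URL would replace minimal_html(), and the ratings table would come from the downloaded file rather than a hand-built data frame.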

A scatterplot that maps the number of movies onto point size can give the impression that the quality of movies is making great progress, but adding a linear regression smoothing layer shows that the average ratings are increasing only slightly.
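A plot along these lines could be built with ggplot2, though the post does not say which plotting package or layers were used. This sketch assumes geom_count() for sizing points by the number of films at each position and geom_smooth(method = "lm") for the regression layer. The data frame is invented purely to make the example run.

```r
library(ggplot2)

# Made-up years and ratings; repeated rows stand in for several films
# sharing the same year and rating.
films <- data.frame(
  year   = c(1951, 1978, 1989, 1989, 2008, 2008),
  rating = c(5.5, 7.0, 6.5, 6.5, 8.0, 8.0)
)

p <- ggplot(films, aes(x = year, y = rating)) +
  # geom_count() sizes each point by how many rows fall on it.
  geom_count(alpha = 0.6) +
  # Straight-line fit of rating on year, with a confidence band.
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE)
```

Calling print(p) renders the figure. A few large points late in the timeline can draw the eye upward even when the fitted line is nearly flat.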

The summary of the linear regression shows that we fail to reject the null hypothesis for the independent variable, time: its effect on ratings is not statistically significant. So overall, while the frequency of DC films is increasing, the quality is not.
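For readers who want to run the same kind of check, here is a minimal sketch using base R's lm(). The years and ratings are invented placeholders, so the numbers it prints say nothing about DC films. The 0.05 cutoff is a conventional assumption, not a threshold stated in the post.

```r
# Invented data, for illustration only.
films <- data.frame(
  year   = c(1951, 1978, 1989, 1992, 2005, 2008, 2013, 2017),
  rating = c(5.5, 7.0, 6.5, 6.0, 7.5, 8.0, 6.0, 6.5)
)

# Regress rating on release year.
fit <- lm(rating ~ year, data = films)

# The coefficient table holds the estimate, standard error,
# t value, and p-value for each term.
coefs <- coef(summary(fit))
slope   <- coefs["year", "Estimate"]
p_value <- coefs["year", "Pr(>|t|)"]

# Null hypothesis: the slope on year is zero.
# A p-value above the cutoff means we fail to reject it.
significant <- p_value < 0.05
```

Printing summary(fit) shows the full table. The row for year is the one that answers whether ratings change with time.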
