Genre Combinations of Best Picture Winners, 1927–2019
I love good films. Good films often win many awards, from the Golden Globes to the Academy Awards. You could argue that plenty of great movies never won a prestigious award, but these awards are still a good place to build a movie watchlist from. I’m curious whether any genre combination has won Best Picture more often than the others.
The data needed for this project was scraped from IMDb pages. It’s pretty common for a movie to have more than two genres, and this creates a problem. Imagine movie A has two genres, Action and Fantasy. Now imagine movie B has three genres: Action, Fantasy, and Thriller. Do these two movies (A and B) have the same genre combination?
The solution is to combine each movie’s set of genres with itself, taking the Cartesian product of the genres and keeping each unordered pair of distinct genres once. Movie B now has three genre combinations: (Action, Fantasy), (Action, Thriller), and (Fantasy, Thriller). This way the movie still counts toward the Action and Fantasy combination, and the Thriller genre is still included.
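Here is a minimal sketch of that pairing step. It is not the original project code: the function name `genre_pairs` is made up for illustration, and it uses `itertools.combinations` as a shortcut that yields the same unordered pairs directly.

```python
from itertools import combinations


def genre_pairs(genres):
    """Return every unordered pair of distinct genres, in alphabetical order."""
    # set() drops accidental duplicates; sorted() gives each pair one fixed
    # order so (Fantasy, Action) and (Action, Fantasy) count as the same pair.
    return list(combinations(sorted(set(genres)), 2))


movie_a = ["Action", "Fantasy"]
movie_b = ["Action", "Fantasy", "Thriller"]

print(genre_pairs(movie_a))
# [('Action', 'Fantasy')]
print(genre_pairs(movie_b))
# [('Action', 'Fantasy'), ('Action', 'Thriller'), ('Fantasy', 'Thriller')]
```

Both movies now contribute to the (Action, Fantasy) pair, and movie B also contributes to the two pairs that involve Thriller.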
It’s pretty easy to scrape IMDb pages using Selenium. After the data collection was done, I used the magic of Pandas and NumPy to create a matrix counting how many times each genre combination won Best Picture. I used the chord package for Python to create this chord graph, and I also added a few extras to the graph to beautify it. Check it out yourself here.
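For a rough idea of what the matrix step could look like, here is a small sketch with Pandas and NumPy. The three movies are made-up sample rows, not the scraped data, and the column names `title` and `genres` are assumptions for illustration only.

```python
from itertools import combinations

import numpy as np
import pandas as pd

# Made-up sample rows standing in for the scraped Best Picture winners.
movies = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C"],
    "genres": [
        ["Action", "Fantasy"],
        ["Action", "Fantasy", "Thriller"],
        ["Drama", "Thriller"],
    ],
})

# One row and one column per genre, in alphabetical order.
names = sorted({g for genres in movies["genres"] for g in genres})
index = {name: i for i, name in enumerate(names)}

# Count each unordered genre pair once per movie, mirrored across the
# diagonal so the matrix is symmetric.
matrix = np.zeros((len(names), len(names)), dtype=int)
for genres in movies["genres"]:
    for a, b in combinations(sorted(set(genres)), 2):
        matrix[index[a], index[b]] += 1
        matrix[index[b], index[a]] += 1

counts = pd.DataFrame(matrix, index=names, columns=names)
print(counts)
```

A symmetric square matrix like this, together with the list of genre names, is the kind of input a chord diagram is usually drawn from. `matrix.tolist()` turns it into a plain nested list if the plotting library needs one.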