How I used pi Business Intelligence software to predict films I’d love 

Sam Baldwin
Alex Paramore of Panintelligence demonstrates how he used pi's predictive analytics to allow him to identify films he would love. 

One of the key benefits of business analytics software like Panintelligence is that you can build a model from your data set and apply it to incoming or existing data to predict the future behaviour of a customer, a product or a device.

In this illustration, pi developer Alex Paramore feeds in publicly available data about films to demonstrate how pi's BI reporting tools can produce interesting visualizations, surface unexpected correlations and, ultimately, predict whether a movie will be any good.

Using BI software to predict quality of films


The Challenge

Take a dataset and see if you can use it to predict whether you would like a film or not.


The Method

Alex started by collecting openly available data from IMDb. (He got the data from Kaggle, which provides an open forum for datasets and has all sorts of interesting data for download.)

IMDb - source of data on films

The available information included each movie's run length, release date, age certification, genre, director, country of origin, language and streaming platforms, as well as rating scores from both IMDb and Rotten Tomatoes.

Alex imported this data into his pi database directly from an Excel sheet, using pi's data connection. (The same could easily be done from a database such as Oracle or SQL Server.)

Table displaying the characteristics of films

The Rules

To define whether he would like a film or not, Alex chose to treat any film with an IMDb viewer rating of 7 or more as a ‘good’ film, and one he would like.

Using the analytics chart, he set an objective to find films with scores of 7 or more. Running it through pi, Alex could see straight away that of the approximately 16,000 films in his data set, 80% had a rating of less than 7.

In other words, based on the rules he had determined, 80% of these films were not movies he would like. Next, he wanted to find what characteristics the 20% of films he would like (those with scores of 7 or more) shared.
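To make the rule concrete, here is a minimal sketch in plain Python of labelling films and measuring the share of 'good' ones. It is not how pi works internally; the film rows and the field name `imdb` are made up for illustration, and only the threshold of 7 comes from Alex's rule.

```python
# Hypothetical toy rows; the real data set held roughly 16,000 films.
films = [
    {"title": "A", "imdb": 7.4},
    {"title": "B", "imdb": 6.1},
    {"title": "C", "imdb": 5.8},
    {"title": "D", "imdb": 7.0},
    {"title": "E", "imdb": 6.9},
]

GOOD_THRESHOLD = 7.0  # Alex's rule: a rating of 7 or more counts as 'good'


def is_good(film):
    """Return True when the film meets the 'good film' rule."""
    return film["imdb"] >= GOOD_THRESHOLD


def good_share(rows):
    """Fraction of rows labelled good; 0.0 for an empty list."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if is_good(r)) / len(rows)


print(f"{good_share(films):.0%} of the toy films count as good")
```

The share of good films across the whole data set is the baseline that every later segment is compared against.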

Example of a film rating score on IMDb (Star Wars)

Alex chose to use a rating of 7 or above as a film he would like.


The Results: what makes a good film?

Interestingly, pi revealed that a film’s run time was the most significant factor in whether it would be rated highly. Of the films with a run time greater than 111 minutes, 33% were good films. Digging deeper into the data, Alex was able to see that for films over 129 minutes, the chance of a high rating rose to 40%.
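The run time finding boils down to a conditional rate: keep only the films above a cut-off, then measure what share of them are good. The sketch below shows that calculation on invented rows. The field names `runtime_min` and `imdb` are assumptions, and how pi itself chooses cut-offs such as 111 or 129 minutes is not described in the webinar.

```python
def good_rate_above(rows, field, cutoff, threshold=7.0):
    """Share of rows with rows[field] > cutoff that are rated >= threshold.

    Returns None when no row passes the cut-off.
    """
    segment = [r for r in rows if r[field] > cutoff]
    if not segment:
        return None
    good = sum(1 for r in segment if r["imdb"] >= threshold)
    return good / len(segment)


# Hypothetical toy rows, not the real data set.
films = [
    {"runtime_min": 95, "imdb": 6.0},
    {"runtime_min": 100, "imdb": 7.2},
    {"runtime_min": 115, "imdb": 6.5},
    {"runtime_min": 120, "imdb": 7.1},
    {"runtime_min": 125, "imdb": 6.8},
    {"runtime_min": 135, "imdb": 7.5},
    {"runtime_min": 150, "imdb": 6.9},
    {"runtime_min": 160, "imdb": 8.0},
]

for cutoff in (111, 129):
    rate = good_rate_above(films, "runtime_min", cutoff)
    print(f"run time over {cutoff} min: {rate:.0%} good")
```

Raising the cut-off shrinks the segment, so a higher rate usually comes at the cost of covering fewer films.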

PI's BI software visualisation allows you to make predictions

Next, Alex wanted to see what other characteristics of a film were statistically important in its rating. He discovered that the streaming platform on which the film was available was another highly significant factor. If the film was available to stream on Disney Plus, the chance of it being rated 7 or more rose to 74%.

So, we now start to see how powerful insights can be gained from even fairly simple data: if a film is over two hours long and on Disney Plus, Alex will most probably like it.

By increasing the granularity and adding the director as a further criterion, Alex was able to raise the chance of liking the film to 83%.
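Adding criteria one at a time, as Alex did with run time, platform and director, can be sketched as a drill-down that narrows the segment at each step and reports its size and share of good films. This is only an illustration: the rows, the field names and the placeholder values "Other" and "Director X" are hypothetical, and the 120-minute cut-off stands in for "over two hours".

```python
def drill_down(rows, steps, threshold=7.0):
    """Apply each (label, predicate) filter in turn.

    Returns a list of (label, segment_size, good_share) tuples, where
    good_share is None if the segment is empty.
    """
    report = []
    current = list(rows)
    for label, keep in steps:
        current = [r for r in current if keep(r)]
        good = sum(1 for r in current if r["imdb"] >= threshold)
        share = good / len(current) if current else None
        report.append((label, len(current), share))
    return report


# Hypothetical toy rows, not the real data set.
films = [
    {"runtime_min": 135, "platform": "Disney Plus", "director": "Director X", "imdb": 7.8},
    {"runtime_min": 140, "platform": "Disney Plus", "director": "Director X", "imdb": 7.2},
    {"runtime_min": 125, "platform": "Disney Plus", "director": "Director Y", "imdb": 6.5},
    {"runtime_min": 130, "platform": "Other", "director": "Director X", "imdb": 6.0},
    {"runtime_min": 90, "platform": "Disney Plus", "director": "Director Y", "imdb": 7.1},
    {"runtime_min": 150, "platform": "Other", "director": "Director Y", "imdb": 6.8},
    {"runtime_min": 122, "platform": "Disney Plus", "director": "Director X", "imdb": 7.4},
    {"runtime_min": 121, "platform": "Other", "director": "Director Y", "imdb": 5.9},
]

steps = [
    ("over two hours", lambda r: r["runtime_min"] > 120),
    ("on Disney Plus", lambda r: r["platform"] == "Disney Plus"),
    ("by Director X", lambda r: r["director"] == "Director X"),
]

report = drill_down(films, steps)
for label, size, share in report:
    print(f"{label}: {size} films, {share:.0%} good")
```

Reporting the segment size next to each rate matters, because a very high rate on a handful of films is much weaker evidence than the same rate on thousands.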


Going Even Deeper into The Data

As a self-proclaimed data-geek, Alex wanted to split the data out further and create a second model. So next, he split out the data into genres, languages and countries.

Immediately he was able to see from the data that documentaries tended to be higher rated films. A documentary had a 50% chance of scoring 7 or more, and a documentary about music had a 70% chance.

Going back to his first model, if a film was a documentary on the topic of music and was over 1.5 hours long, there was a very good probability, 81%, that it would be a good film.

He also showed how you can exclude data: if you’re not interested in documentaries, you can easily filter them out and create a model that excludes all documentaries.
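Splitting by genre amounts to computing the same share of good films once per genre. Here is a rough sketch on invented rows, assuming each film carries a list of genres under a hypothetical `genres` field, so a music documentary counts towards both "Documentary" and "Music".

```python
from collections import defaultdict


def good_share_by_genre(rows, threshold=7.0):
    """Map each genre to the share of its films rated >= threshold."""
    totals = defaultdict(int)
    good = defaultdict(int)
    for r in rows:
        for genre in r["genres"]:
            totals[genre] += 1
            if r["imdb"] >= threshold:
                good[genre] += 1
    return {genre: good[genre] / totals[genre] for genre in totals}


# Hypothetical toy rows, not the real data set.
films = [
    {"genres": ["Documentary", "Music"], "imdb": 7.6},
    {"genres": ["Documentary"], "imdb": 6.4},
    {"genres": ["Documentary", "Music"], "imdb": 7.1},
    {"genres": ["Comedy"], "imdb": 5.5},
    {"genres": ["Comedy", "Music"], "imdb": 6.9},
    {"genres": ["Drama"], "imdb": 7.3},
]

shares = good_share_by_genre(films)
for genre, share in sorted(shares.items()):
    print(f"{genre}: {share:.0%} good")
```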


How can you use data to predict whether a film will be good or not?

You could now take the data for newly released films that are yet to be rated on IMDb and, using Alex’s model, predict whether each film is likely to be good or not based on its characteristics, with a surprisingly high level of accuracy.

Alex’s movie model is of course a fun example of how we can use data to make predictions. But it illustrates how even a small number of fairly simple data points, when run through powerful business analytics software like pi, can produce very valuable insights for your business, allowing you to make data-driven decisions and increase the likelihood of being able to forecast the future.
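As a rough sketch of what scoring unrated films could look like, the example below turns the percentages quoted in this post into a simple rule list and returns the first rule a film matches. This is an assumption made for illustration, not pi's actual scoring method: the rule ordering (highest quoted rate first), the field names and the new films are all hypothetical, and the figures come from two separate models in the webinar.

```python
# (label, predicate, quoted chance of a rating of 7 or more), highest rate first.
RULES = [
    ("music documentary over 90 min",
     lambda f: {"Documentary", "Music"} <= f["genres"] and f["runtime_min"] > 90, 0.81),
    ("on Disney Plus", lambda f: "Disney Plus" in f["platforms"], 0.74),
    ("music documentary", lambda f: {"Documentary", "Music"} <= f["genres"], 0.70),
    ("documentary", lambda f: "Documentary" in f["genres"], 0.50),
    ("run time over 129 min", lambda f: f["runtime_min"] > 129, 0.40),
    ("run time over 111 min", lambda f: f["runtime_min"] > 111, 0.33),
]
BASE_RATE = 0.20  # share of all films in the data set rated 7 or more


def predict(film):
    """Return (matched rule label, estimated chance the film is 'good')."""
    for label, matches, rate in RULES:
        if matches(film):
            return label, rate
    return "base rate", BASE_RATE


# Hypothetical unrated films.
new_films = [
    {"title": "Hypothetical Concert Film", "genres": {"Documentary", "Music"},
     "runtime_min": 104, "platforms": set()},
    {"title": "Hypothetical Family Adventure", "genres": {"Adventure"},
     "runtime_min": 98, "platforms": {"Disney Plus"}},
    {"title": "Hypothetical Short Comedy", "genres": {"Comedy"},
     "runtime_min": 88, "platforms": set()},
]

for film in new_films:
    label, rate = predict(film)
    print(f"{film['title']}: {rate:.0%} chance of being good ({label})")
```

A first-match rule list is easy to read and explain, but it ignores how the criteria interact; a real model would estimate the rate for each combination from the data.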

This post is based on a webinar by Alex Paramore of Panintelligence – you can watch the whole Webinar here. And if you want to see how pi could help you make valuable business predictions - you can try pi for free here.

Alex Paramore from Panintelligence BI Software

"Even just a small number of simple data points can produce very valuable insights for your business when run through powerful business analytics software like pi."
