Predicting Artist Success at Next Big Sound

Published: April 17, 2018

Next Big Sound has been in the business of spotlighting artists on the rise for a while now. The Predictions Chart — now the Pandora Predictions Chart — launched in 2010 to predict the 20 artists we think are most likely to break, based on a prediction model that ranks artists on their likelihood of debuting on the Billboard 200 Chart in the next year. But the relative importance of metrics changes over time as some networks become more popular, others become less popular, and some approach market saturation. As the music industry continues to evolve, we occasionally update our prediction model to keep in line with those changes.

That’s why today we are introducing a few significant updates to the prediction model that generates Next Big Sound’s Billboard 200 Score, which powers the Pandora Predictions Chart. Head over to today’s chart, and you’ll notice a new and improved list.

So, what’s changed?

New Features And A More Focused Data Set

In its original design, the Billboard 200 prediction model was trained on a data set that included features from all Next Big Sound artists. Features were derived from each artist’s time-series social metrics over a 90-day window, and the model attempted to learn whether each of those artists would appear on the Billboard 200 chart within a year of that window. The new model still predicts the likelihood that an artist will appear on the Billboard 200 chart, but it incorporates new features and has been trained on a more focused data set in order to improve prediction accuracy for artists earlier in their careers.
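As a rough illustration of the setup described above, the sketch below summarizes a 90-day window of one daily metric and labels an artist by whether a chart appearance falls within a year of the end of that window. The function names, the three example features, and the reading of "within a year" as 365 days are assumptions made for illustration; the actual features used by the model are not described here.

```python
from datetime import date, timedelta

WINDOW_DAYS = 90      # feature window length, as stated in the post
HORIZON_DAYS = 365    # assumption: "within a year" read as 365 days


def window_features(daily_values):
    """Summarize one metric's daily values over the feature window.

    `daily_values` is a list of cumulative counts (e.g. followers),
    oldest first. The features below are hypothetical examples.
    """
    if len(daily_values) != WINDOW_DAYS:
        raise ValueError(f"expected {WINDOW_DAYS} daily values")
    first, last = daily_values[0], daily_values[-1]
    return {
        "size": last,                      # audience size at window end
        "abs_growth": last - first,        # net change over the window
        "rel_growth": (last - first) / first if first else 0.0,
    }


def label(window_end, chart_dates):
    """Return 1 if any chart appearance falls within a year after the window."""
    horizon = window_end + timedelta(days=HORIZON_DAYS)
    return int(any(window_end < d <= horizon for d in chart_dates))
```

Each training example would then pair the feature dictionary for an artist's window with the 0/1 label for the year that follows it.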

Focus On Artists In Cycle

There are two noteworthy differences between the training data used in the new model and the data used in previous versions. The first is that the training set only includes artists who have recently released an album, and the features are derived from the 90 days prior to the album’s release date.

To illustrate how limiting the training data set to artists who have recently released an album improves prediction accuracy, imagine that I asked you to predict whether Drake would appear on the Billboard 200 chart in the next year. Let’s also suppose that you don’t know whether Drake will release an album in that time, but you think there’s a 50/50 chance. You know that if Drake did release an album, it would almost certainly appear on the Billboard 200 chart. So if there’s a 50% chance that Drake will release an album, and a 100% chance that any album released by Drake will appear on the Billboard 200 chart, you could predict that there’s a 50% chance that Drake will appear on the Billboard 200 chart in the next year. In the absence of any information about whether Drake will release an album, your prediction probability must include that uncertainty. The same is true for the Billboard 200 prediction model, and, as with the Drake example, that additional uncertainty results in lower overall scores. By removing artists who haven’t recently released an album from the training set, we’re also removing that element of uncertainty, which leads to higher scores on average.
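The arithmetic in the Drake example is a product of two probabilities. A minimal sketch, using a hypothetical function name and the illustrative numbers from the example:

```python
def chart_probability(p_album, p_chart_given_album):
    """P(chart) = P(album) * P(chart | album).

    Assumes, as the example does, that an artist cannot chart without
    releasing an album, so the no-album case contributes nothing.
    """
    return p_album * p_chart_given_album


# Release status unknown: the 50/50 guess pulls the prediction down.
print(chart_probability(0.5, 1.0))  # 0.5

# Release known, as in the new training set: that uncertainty is gone.
print(chart_probability(1.0, 1.0))  # 1.0
```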

Prioritize Growth Over Audience Size

The second update we’ve made to the training data set is to exclude any artists who have previously appeared on the Billboard 200 chart. This means excluding artists with the largest audiences. The advantage of excluding these larger artists is that our model spends more effort learning what to do with artists with smaller audiences. The resulting model prioritizes features that measure audience growth over features that measure audience size, producing better predictions for artists who haven’t quite hit the mainstream.
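Taken together, the two restrictions amount to a filter on candidate training examples. The sketch below shows one way to express that filter. The `Artist` record, its field names, and the `max_days_since_release` cutoff standing in for "recently released" are all hypothetical, since the exact criteria are not given here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Artist:
    name: str
    last_album_release: Optional[date]   # None if no album on record
    has_charted_before: bool             # any prior Billboard 200 appearance


def in_training_set(artist, as_of, max_days_since_release):
    """Keep artists in cycle who have never appeared on the chart.

    `max_days_since_release` is an assumed stand-in for "recently released".
    """
    if artist.has_charted_before:
        return False                      # second restriction: no prior charters
    if artist.last_album_release is None:
        return False                      # first restriction: must have an album
    days_since = (as_of - artist.last_album_release).days
    return 0 <= days_since <= max_days_since_release
```

Dropping artists who have already charted removes the examples where audience size alone settles the outcome, which is consistent with the model leaning on growth features instead.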

While the training set is restricted to artists who have recently released an album, we generate scores for all artists, even those who don’t have a new album. Without an album, an artist has a 0% chance of appearing on the Billboard 200 chart, but we can still interpret their Billboard 200 score as the likelihood of them appearing on the chart if they had just released an album. The Billboard 200 score continues to be a valuable tool for aggregating signals across dozens of metrics for all artists. This change to the model simply prioritizes prediction accuracy for artists experiencing high growth over artists who have already amassed large audiences. With this new model, we’re not only defining what success means (appearing on the Billboard 200 chart in this case), we’re also defining whom we most care about predicting success for.

Incorporating New Metrics, Removing Others

In addition to updating the conditions for artists included in our training set, we’ve also removed a few deprecated metrics and added a few new model features. Due to recent changes at YouTube and Instagram, Next Big Sound no longer stores or collects historical data from those networks. Unrelated to removing these two sources, we’re now including new features derived from Pandora streaming metrics and a new feature related to the geographic breakdown of artists’ fans. Data sources come and go. Occasionally we’ll lose a source used to generate features, and our models might perform less well as a result. Other times we’ll add new features which improve model performance. In this case, the addition of the new Pandora features and the new audience geographic feature gave a huge boost to the model’s performance, more than making up for the decrease in performance from removing YouTube and Instagram metrics.

We’ll be releasing a new blog post soon describing some infrastructure changes we’ve made to make the process of training, evaluating, and deploying updated prediction models much easier. These changes will allow us to deploy updated prediction models more frequently, ensuring that we continue to provide the best possible prediction about who will be the next big sound.


Predicting Artist Success at Next Big Sound was originally published in Next Big Sound on Medium.
