Module 37 Music Dataset

Who among us does not like music? To be sure, tastes vary across cultures and eras, but making (and enjoying) music is a hallmark of our species; it seems to be a fundamental part of what makes us Homo sapiens (or Pan narrans, the story-telling ape, as Pratchett, Cohen, and Stewart put it [170]).

Music and language enjoy a correlation/causation relationship, in the sense presented when we discussed Association Rules Mining: some researchers think that music began as a proto-language; others, that it arose as a consequence of an evolutionary adaptation that first led to language; and others still, that the evolution of music and language was driven by common factors.

In the modules that follow, we will illustrate a variety of data science tasks and pipelines through the analysis of music datasets.

37.1 Spotify Music Datasets

Spotify is a Swedish music streaming service founded in 2006. As of 2021, it had 172 million premium subscribers and 380 million monthly active users, making it the most commonly used music streaming app [171].

We will discuss how to obtain the datasets in Web Scraping and Automated Data Collection; for now, we will only note that they contain information about a dozen acoustic features for each song in the Spotify database (such as loudness and tempo).

One of those features is danceability.

danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity [172].

In the Spotify database, this is a decimal number associated with each song, ranging from 0 to 1, with 0 being the least danceable, and 1 the most.

For instance, consider the following songs and their danceability scores.

Table 37.1: Songs and their danceability scores
| track | artist | danceability |
|---|---|---|
| O Sanctissima - Remastered 1997 | Traditional | 0.08 |
| Pärt: Für Alina | Arvo Pärt | 0.37 |
| In The Pool | Pino Donaggio | 0.38 |
| Sweetest One | The Metros | 0.42 |
| I’ll Be Home for Christmas | Oscar Peterson | 0.48 |
| Love No Limit | Mary J. Blige | 0.56 |
| Skyscraper | Demi Lovato | 0.58 |
| Für immer und dich | Jan Delay | 0.59 |
| 1982 | Randy Travis | 0.61 |
| Mad Them | General Levy | 0.83 |
| Wanna Make Love (Come Flick My BIC) | Sun | 0.84 |
| SexyBack | Justin Timberlake | 0.97 |
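
As a small illustration, the rows of Table 37.1 can be entered by hand and inspected with plain Python. This is only a sketch: the variable names, the `label` helper, and its 0.5 cutoff are assumptions made for the example, not part of the Spotify data or of the analysis in the modules that follow.

```python
# Rows copied by hand from Table 37.1: (track, artist, danceability).
songs = [
    ("O Sanctissima - Remastered 1997", "Traditional", 0.08),
    ("Pärt: Für Alina", "Arvo Pärt", 0.37),
    ("In The Pool", "Pino Donaggio", 0.38),
    ("Sweetest One", "The Metros", 0.42),
    ("I’ll Be Home for Christmas", "Oscar Peterson", 0.48),
    ("Love No Limit", "Mary J. Blige", 0.56),
    ("Skyscraper", "Demi Lovato", 0.58),
    ("Für immer und dich", "Jan Delay", 0.59),
    ("1982", "Randy Travis", 0.61),
    ("Mad Them", "General Levy", 0.83),
    ("Wanna Make Love (Come Flick My BIC)", "Sun", 0.84),
    ("SexyBack", "Justin Timberlake", 0.97),
]

# Danceability is described as a decimal number between 0 and 1.
assert all(0.0 <= score <= 1.0 for _, _, score in songs)

# The tracks nearest each end of the danceability spectrum in this table.
least_danceable = min(songs, key=lambda song: song[2])
most_danceable = max(songs, key=lambda song: song[2])


def label(score, threshold=0.5):
    """Hypothetical helper: the 0.5 cutoff is an illustrative assumption."""
    return "more danceable" if score >= threshold else "less danceable"


# Count how many tracks fall on each side of the assumed cutoff.
counts = {}
for _, _, score in songs:
    counts[label(score)] = counts.get(label(score), 0) + 1

print("Least danceable:", least_danceable)
print("Most danceable:", most_danceable)
print(counts)
```

Sorting on the third field of each tuple is enough here because the table is tiny; for the full datasets a data-frame library would be the more natural choice.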

To make the concept a bit clearer, let’s take a listen to two songs, one from near each end of the danceability spectrum:

References

[170] I. Stewart, J. Cohen, and T. Pratchett, The Science of Discworld II: The Globe. Ebury Publishing, 2011. Available: https://books.google.ca/books?id=MyozhndBMZkC

[171] M. Iqbal, “Spotify Revenue and Usage Statistics.” Business of Apps, 2021. Available: https://www.businessofapps.com/data/spotify-statistics/

[172] B. Plantinga, “What do Spotify’s audio features tell us about this year’s Eurovision Song Contest?” medium.com, 2018. Available: https://medium.com/@boplantinga/what-do-spotifys-audio-features-tell-us-about-this-year-s-eurovision-song-contest-66ad188e112a