As part of the coursework for “Analysing large digital text collections” at the University of Tartu, I compiled an online learning guide that teaches the basics of R and simple text mining, using Estonian pop songs from 1994-2018 as the example. The dataset is posted on GitHub.
The materials themselves are published as a GitBook.
Rank-frequency distributions of the pop song corpus and a fiction corpus. The Estonian newer fiction ngrams dataset was used for comparison.
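A rank-frequency distribution like the one in this figure can be sketched in a few lines. The guide itself works in R; the snippet below is only an illustrative Python sketch, and the function name `rank_frequency`, the tokenisation rule, and the toy input are assumptions rather than the code used in the guide.

```python
from collections import Counter
import re

def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    # Lowercase and keep runs of letters only (Unicode-aware, so Estonian
    # letters such as õ, ä, ö, ü stay inside words). This tokenisation
    # rule is an assumption made for the sketch.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    # Sort by descending count; ties are broken alphabetically.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ordered, start=1)]

# Hypothetical toy input, not taken from the dataset.
sample = "la la la na na mina ka mina"
table = rank_frequency(sample)
for rank, word, count in table:
    print(rank, word, count)
```

Plotting rank against count on logarithmic axes for two corpora would give a comparison of the kind shown in the figure.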
The locations of ‘la’ and ‘na’ in Estonian popular songs, 1994-2018.
The song ‘Mina ka’ by Reket. The x-axis shows location in time; the y-axis shows the number of repetitions.