Finding a great dataset is all very well, but the next step is working out how to get the data onto your computer so that you can start playing with it. Datasets come in many forms, and there are different ways of collecting the data. In this article I will use some examples from the list of datasets in this previous article on women composers.
There are three main approaches to collecting data: read it and type it in, download it, or ‘scrape’ it.
There is a lot of interest at the moment in women composers. Until recently, women were a small minority of the composing population, but in working with large datasets, I encounter a surprisingly large number of female names (although it is often frustratingly difficult to find out any details about them). In the nineteenth century, for example, perhaps 1-2% of published music was written by women.1 Whilst that is an embarrassingly small proportion, it still equates to a substantial body of music by many hundreds of women composers – most of whom have since sunk into obscurity. There are of course many more from the twentieth and twenty-first centuries.2
There is a shocking absence of statistics in books on music history. Generations of music historians have shown little interest in using statistical analysis to quantify their subject.
But why should it be considered outrageous that music historians have not embraced the tools and techniques that would enable them to quantify music history? After all, there are plenty of excellent accounts of the history of music, all based on thorough and rigorous scholarship and a deep knowledge of the subject. Is this not enough?