Often in statistical analysis we need to select things at random. For example, if it is impractical to work with a complete dataset, the only option might be to use a random sample. The science of statistics tells us how to analyse a sample in order to reach conclusions about the entire dataset, and gives us ways to calculate margins of error based on the size of the sample. But I digress.
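As a rough illustration of the point about sample size and margins of error, here is a minimal Python sketch. It assumes a made-up dataset of 10,000 works, 30% of which have some property of interest, and estimates that proportion from a simple random sample. The function name `sample_proportion`, the dataset, and the use of the normal approximation with a finite population correction are all assumptions for the example, not a description of any particular analysis on this site.

```python
import math
import random


def sample_proportion(population, predicate, n, seed=0):
    """Estimate a proportion from a simple random sample of size n.

    Returns (estimate, margin), where margin is an approximate 95%
    margin of error based on the normal approximation, with a finite
    population correction because we sample without replacement.
    """
    rng = random.Random(seed)          # seeded so the example is repeatable
    sample = rng.sample(population, n)  # sampling without replacement
    p_hat = sum(1 for item in sample if predicate(item)) / n

    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    big_n = len(population)
    correction = math.sqrt((big_n - n) / (big_n - 1))
    return p_hat, 1.96 * standard_error * correction


# Hypothetical dataset: 10,000 works, 30% of which are flagged True.
works = [i % 10 < 3 for i in range(10_000)]

estimate_100, margin_100 = sample_proportion(works, bool, n=100)
estimate_400, margin_400 = sample_proportion(works, bool, n=400)

print(f"n=100: {estimate_100:.3f} +/- {margin_100:.3f}")
print(f"n=400: {estimate_400:.3f} +/- {margin_400:.3f}")
```

Quadrupling the sample size roughly halves the margin of error, which is why modest samples are often good enough for exploratory work.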
The graph below illustrates the size of orchestra required to perform symphonies composed between 1750 and 1920. Each symphony is represented by two dots: the red dots and line represent woodwind instruments; blue relates to brass instruments. Continue reading →
The gentleman pictured to the right is Welsh composer Henry Brinley Richards.1 Although he is little known today, his piano nocturne ‘Marie’, Op. 60, was the most published British musical work in Germany in the nineteenth century. German music lovers could purchase ‘Marie’ in its original form or in various arrangements in an impressive 34 separate publications from 27 different publishers between 1861 and 1877.2
That conclusion comes from an analysis of Hofmeister’s Monatsberichte – a monthly listing of music publications appearing in the German market, compiled by Leipzig music publisher Friedrich Hofmeister from 1829 onwards. The Monatsberichte up to the end of the nineteenth century are available as an online database, listing about a third of a million publications from over 36,000 composers. This article is about the British composers and their works that appear in Hofmeister’s listings. Continue reading →
If you go to the British Library online catalogue, search for music scores published in each year from 1650 to 1920, and plot the number of ‘hits’ by year, the result looks like this. Continue reading →
I have recently been trying to collect data from the Listening Experience Database (LED) in order to put together a proposal for a conference paper. The LED is a nicely constructed database using linked open data and a structure based on something called the ‘Semantic Web’. Unlike traditional databases, which have a hierarchical ‘tree’ structure, the Semantic Web is a true ‘network’, where anything can be linked to anything else. The LED, for example, includes links to data in a number of other databases. Have a look at the LED and follow a few links and you will see what this means – a very rich and flexible means of linking data together. Continue reading →
Finding a great dataset is all very well, but the next step is working out how to get the data onto your computer so that you can start playing with it. Datasets come in many forms, and there are different ways of collecting the data. In this article I will use some examples from the list of datasets in this previous article on women composers.
There are three main approaches to collecting data: read it and type it in, download it, or ‘scrape’ it. Continue reading →
In what ways can statistical techniques be used to investigate topics in historical musicology? I think there are four main approaches – hypothesis testing, quantification, modelling and exploration. Their use depends on the topic, the data, and the type of question you are trying to answer.
These four types often overlap. It is hard to do modelling without some exploration and quantification, for example. Also, after you have spent so long collecting the data, cleaning it, and getting it into a form for statistical analysis, why not squeeze the most out of it and do some general exploration after testing your hypotheses? Continue reading →
There is a lot of interest at the moment in women composers. Until recently, women were a small minority of the composing population, but in working with large datasets, I encounter a surprisingly large number of female names (although it is often frustratingly difficult to find out any details about them). In the nineteenth century, for example, perhaps 1–2% of published music was written by women.1 Whilst that is an embarrassingly small proportion, it still equates to a substantial body of music by many hundreds of women composers – most of whom have since sunk into obscurity. There are of course many more from the twentieth and twenty-first centuries.2 Continue reading →
There is a shocking absence of statistics in books on music history. Generations of music historians have shown little interest in using statistical analysis to quantify their subject.
But why should it be considered outrageous that music historians have not embraced the tools and techniques that would enable them to quantify music history? After all, there are plenty of excellent accounts of the history of music, all based on thorough and rigorous scholarship and a deep knowledge of the subject. Is this not enough? Continue reading →
This website is about how to use statistical techniques to study music history. It is based on my PhD thesis, and on more recent work developing the techniques, investigating various topics in music history, and discovering new datasets and ways of understanding them.
It is perhaps more common to develop a PhD thesis into a book, and I have considered this option. But a website seems a more sensible way to go, for three main reasons… Continue reading →