Historical music datasets vary enormously in the topics they cover (composers, recordings, sheet music, concerts, etc.); in their format (online databases, scanned books, physical books, etc.); and in whether they were compiled during the period they describe or are more recent collections of historical data. They also have many different organisational structures (searchable database, alphabetical list, chronological list, etc.) and vary in the nature and format of the information they contain. They are all subject to different types of bias and error, and present different challenges of analysis and interpretation.

It is therefore hard to generalise about how to use these datasets – each should be handled individually. Extracting the data and getting it into a usable form is often a difficult and time-consuming process, but it is usually worth the effort!

The pages in this section will point you to a number of historical music datasets under different headings – composers, printed music, recorded music, concerts, repertoire and miscellaneous. Bear in mind that datasets often provide information across several of these headings – composer information can often be found in printed music collections, for example. Some of the smaller, more obscure or hard-to-use datasets listed in my thesis have been omitted here, and a number of new discoveries have been included. Let me know if you have found others worth mentioning.

Cite this article as: Gustar, A.J. 'Datasets' in Statistics in Historical Musicology, 12th July 2017,