British Academic Spoken English Corpus
The British Academic Spoken English (BASE) corpus is a record of the speech of university lecturers and students at the turn of the 21st century. The corpus consists of 160 lectures and 39 seminars recorded in a variety of university departments. It contains 1,644,942 tokens in total (lectures and seminars). Holdings are distributed across four broad disciplinary groups, each represented by 40 lectures and roughly 10 seminars. These groups are:
- Arts and Humanities
- Life Sciences
- Physical Sciences
- Social Sciences
The BASE corpus was developed by Hilary Nesi, with Paul Thompson. Natalie Snodgrass and Sarah Creer were employed as research assistants and Tim Kelly was video director for the project. Lou Burnard (Oxford University) and Adam Kilgarriff (Lexicography MasterClass Ltd) acted as consultants. The corpus facilitates, amongst other things, investigation of:
- The frequency and range of academic lexis
- The meaning and use of individual words and multi-word units
- The structure of academic lectures
- The pace, density and delivery styles of academic lectures
- The discourse function of intonation
- Patterns of interaction, including turn-taking and topic selection
- The interplay of visual and aural stimuli
- The representation of ideas and the expression of attitudes
The lectures and seminars have been transcribed and tagged using a system devised in accordance with the TEI Guidelines. The corpus has been deposited in the Oxford Text Archive and is catalogued by the Arts and Humanities Data Service.
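As a rough illustration of how TEI-style transcripts support the kinds of investigation listed above, the following Python sketch counts tokens per speaker and overall word frequencies. It assumes utterances are marked with the TEI `<u>` element and a `who` attribute; the sample fragment, the speaker identifiers and the function names are hypothetical, and the actual BASE markup may differ (for instance, real TEI files normally declare an XML namespace, which this sample omits).

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical TEI-style fragment; not taken from the BASE corpus.
SAMPLE = """<text>
  <body>
    <u who="nm0001">today we will look at the structure of the cell</u>
    <u who="sf0002">is the cell wall part of that</u>
    <u who="nm0001">yes the cell wall is part of the structure</u>
  </body>
</text>"""


def utterance_words(utterance):
    """Split an utterance into whitespace-separated, lower-cased tokens."""
    # itertext() also collects text nested inside child elements.
    return [w.lower() for w in " ".join(utterance.itertext()).split()]


def tokens_by_speaker(xml_text):
    """Return a Counter mapping each speaker id to a token count."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for utterance in root.iter("u"):
        speaker = utterance.get("who", "unknown")
        counts[speaker] += len(utterance_words(utterance))
    return counts


def word_frequencies(xml_text):
    """Return a Counter of word frequencies across all utterances."""
    root = ET.fromstring(xml_text)
    freq = Counter()
    for utterance in root.iter("u"):
        freq.update(utterance_words(utterance))
    return freq


if __name__ == "__main__":
    print(tokens_by_speaker(SAMPLE))
    print(word_frequencies(SAMPLE).most_common(3))
```

Splitting on whitespace is a deliberate simplification; the corpus's own tokenisation, on which the published token count is based, may follow different rules.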
Funding
The early stages of corpus development were assisted by funding from the Universities of Warwick and Reading, BALEAP, EURALEX, and The British Academy (2000–2001, Grant reference: SG 30284).
Major funding was provided by the Arts and Humanities Research Council as part of their Resource Enhancement Scheme (2001–2005, Award Number: RE/AN6806/APN13545).

