The Story of the Learner Corpus LINDSEI_CZ
- Issue 2
Publisher: Univerzita Karlova, Filozofická fakulta
Source: Studie z aplikované lingvistiky - Studies in Applied Linguistics, 2017, 8, 2, p. 22-35
Keywords: corpus methodology, learner corpora, learner corpus linguistics, LINDSEI, spoken corpora
The article presents the recently completed Czech subcorpus of the multinational learner corpus of advanced spoken English LINDSEI and aims to draw attention to some of the methodological concerns the field of learner corpus linguistics faces. First, it describes the Louvain family of learner corpora, where this project originated, and provides a detailed description of LINDSEI, its history, design, structure, transcription system and metadata. It then outlines the nature of the Czech subcorpus LINDSEI_CZ, telling the story of its compilation and providing a quantitative description of the corpus size, task sizes and learner variables, as well as a description of the transcription process. The core part of this text discusses methodological concerns affecting learner corpus design and construction and deals with such issues as task design, recording instructions, the matter of learner-participant proficiency, and the transcription system employed. It concludes with a consideration of various methodological suggestions and offers the possible view that, despite certain weaknesses, LINDSEI is an invaluable source of highly authentic learner data. The last section provides a thematic categorisation of existing studies on LINDSEI and concludes with descriptions of some future projects. The article calls for a thorough reconsideration of learner corpus design and practice and for the formulation of compilation and research standards which would lead to an increase in the reliability and exploitation potential of learner corpora.