In present-day English linguistics, corpora are used for dictionary-making, and concordances and collocations play an important role in this process. This essay explains the terms collocation and concordance, provides examples, and discusses problems that may occur during linguistic research.
Table of Contents
1. Introduction
2. Collocation
3. Concordance
4. Problems Concerning the Research of Collocations
5. Advanced Methods in Corpus Linguistics
6. Example for Dictionary-Making: The Lexeme Sweet
6.1. Using Statistical Data Exemplified on BNCWeb and LDOCE
6.2. Including Statistics in Dictionary-Making
7. Conclusion
Objectives and Core Topics
This essay explores the foundational concepts of corpus linguistics, specifically focusing on how concordances and collocations contribute to modern dictionary-making and semantic research.
- The theoretical definitions and historical development of collocation and concordance.
- Methodological challenges inherent in researching collocations.
- The practical application of statistical measures in linguistic analysis.
- A comparative analysis of dictionary entries using the lexeme "sweet" to demonstrate corpus-based improvements.
- The evolution of learner’s dictionaries through the integration of corpus data.
Excerpt from the Book
6. Example for Dictionary-Making: The Lexeme Sweet
For the lexeme sweet, the online version of the BNC as well as the first and fourth editions of the LDOCE (LDOCE1 and LDOCE4) were used. The BNCWeb provided statistical information, and the two editions of the LDOCE show how helpful such information can be for dictionary-making. Before exploring the BNCWeb data, certain facts have to be taken into consideration.
First, corpora contain all kinds of spoken and written texts, including grammatically incorrect sentences, which can nevertheless provide typical and central evidence of a language. Second, corpora do not contain explanations for the statements they record. Third, there are limits to how corpora can be constructed and to how they portray information. Finally, a corpus can be representative yet still invite misleading generalizations (McEnery et al. 2006: 121).
The corpus I will introduce in my example is the BNCWeb, the online version of the BNC. In 2007 the BNC comprised 100,106,008 words in 4,124 written texts and transcripts of speech in modern British English. Ninety per cent of the corpus consists of written texts and only ten per cent of spoken texts. The written portion includes regional and national newspapers, a variety of periodicals and journals, academic books, letters, memoranda and essays; the spoken portion consists of transcripts of informal and formal speech in different registers from representatives of different demographic groups (McEnery et al. 2007: 59-60).
Summary of Chapters
1. Introduction: The introduction outlines the role of corpora in contemporary dictionary-making and identifies the scope of the essay regarding collocation and concordance.
2. Collocation: This chapter provides a historical overview of the term collocation, citing Firth and Halliday, and discusses the importance of registers and statistical analysis.
3. Concordance: This chapter explains how concordances identify formal language patterns and predictable lexical units using computer-based corpora.
4. Problems Concerning the Research of Collocations: This chapter addresses limitations such as the definition of colligation and the lack of consensus among native speakers regarding acceptable collocations.
5. Advanced Methods in Corpus Linguistics: This chapter highlights the evolution from informant-based research to computational methods like lemmatization and part-of-speech tagging.
6. Example for Dictionary-Making: The Lexeme Sweet: This chapter presents a case study comparing older and newer dictionary editions to show how statistical data improves entry quality.
7. Conclusion: The conclusion summarizes the advancement of corpus linguistics and predicts the further integration of technical progress into dictionaries tailored to specific user groups.
Keywords
Corpus Linguistics, Collocation, Concordance, Lexicography, BNCWeb, LDOCE, Semantics, Statistical Measures, Frequency, Mutual Information, Z-score, Log-likelihood, Learner’s Dictionaries, Syntax, Word-forms.
Frequently Asked Questions
What is the primary focus of this work?
The work examines the application of corpus linguistics, specifically concordances and collocations, in the field of lexicography and semantics.
What are the core thematic areas discussed?
The essay covers the definition of linguistic terms, the history of corpus research, statistical methods, and the practical application of these tools in creating better dictionaries.
What is the central research goal?
The goal is to demonstrate how modern corpus data can be utilized to improve dictionary entries, particularly for language learners.
Which scientific methods are primarily utilized?
The work relies on corpus-based analysis, using statistical indicators like z-scores, mutual information (MI), and log-likelihood to evaluate lexical relationships.
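The association measures named above can be sketched in a few lines of Python. This is a minimal illustration, not code from the essay: the function name and the example frequencies are invented, and it assumes the standard textbook formulations of mutual information, the z-score, and Dunning's log-likelihood over a 2x2 contingency table.

```python
import math

def collocation_scores(f_node, f_coll, f_pair, corpus_size):
    """Score the association between a node word and a candidate collocate.

    f_node      -- corpus frequency of the node word (e.g. "sweet")
    f_coll      -- corpus frequency of the candidate collocate
    f_pair      -- observed co-occurrence frequency of the pair
    corpus_size -- total number of tokens in the corpus
    """
    # Expected co-occurrence frequency under independence
    expected = f_node * f_coll / corpus_size

    # Mutual information: log2 ratio of observed to expected frequency
    mi = math.log2(f_pair / expected)

    # z-score: how many standard deviations the observed count
    # lies above the expected count
    z = (f_pair - expected) / math.sqrt(expected)

    # Dunning's log-likelihood (G2) over the 2x2 contingency table
    total = corpus_size
    cells = (
        (f_pair,                                f_node,         f_coll),
        (f_node - f_pair,                       f_node,         total - f_coll),
        (f_coll - f_pair,                       total - f_node, f_coll),
        (total - f_node - f_coll + f_pair,      total - f_node, total - f_coll),
    )
    ll = 0.0
    for observed, row_total, col_total in cells:
        e = row_total * col_total / total
        if observed > 0:
            ll += observed * math.log(observed / e)
    return mi, z, 2 * ll

# Hypothetical frequencies for an imaginary one-million-word corpus
mi, z, ll = collocation_scores(1000, 500, 50, 1_000_000)
```

Higher values on all three measures indicate a stronger association between the node word and the collocate; corpus interfaces such as BNCWeb let the user choose which of these statistics to rank collocates by.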
What is covered in the main body of the text?
The main body details the evolution of corpus research, identifies problems in collocations, explains statistical measures, and provides a comparative case study of the word "sweet".
How is this paper characterized by keywords?
The paper is defined by terms such as Corpus Linguistics, Collocation, Concordance, and Lexicography, highlighting its technical linguistic focus.
Why is the lexeme "sweet" used as a case study?
It serves as a practical example to compare how earlier and modern dictionaries handle frequency data and contextual examples.
What does the author conclude about dictionary-making?
The author concludes that dictionary-making has become more objective and user-focused due to the availability of large corpora and advanced statistical tools.
- Quote paper
- Marie Wieslert (Author), 2009, Corpus Linguistics: Lexicography and Semantics: Introduction to Concordance and Collocations, Munich, GRIN Verlag, https://www.hausarbeiten.de/document/171915