Construction and Application of English-Chinese Interpretation Corpus Based on Big Data
Author Names:
Xiaomeng Hu, Fengru Zhang
Author Affiliation:
Public Course Teaching Department, Hainan Vocational College of Politics and Law, Haikou, China
Author Email:
zhang_fengru@outlook.com
Publication Date:
April 24, 2026
Page numbers:
DOI Number:
https://doi.org/10.1177/14727978251367161
Abstract:
Interpreting teaching and research require large amounts of authentic, high-quality interpreting data, but existing interpreting corpora have many shortcomings, such as small scale, limited variety of types, and uneven quality. In this paper, we use big data technology to build a powerful, easy-to-use, and openly shared English-Chinese interpreting corpus that provides rich, diverse, high-quality interpreting examples for interpreting teaching and research. We collect English-Chinese interpreting data of various types, scenarios, topics, and levels from the Internet, TV broadcasts, and other channels; clean, standardize, segment, align, and annotate the data; store the metadata in XML format; and design and implement the structure, functions, and interfaces of the corpus. This paper mainly introduces the data methods, model construction, and application effects of the corpus, covering its collection, organization, annotation, storage, management, retrieval, analysis, display, and application.
Keywords:
big data, English-Chinese interpreting, corpus, construction and application
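Illustrative sketch:
The abstract mentions storing metadata in XML alongside aligned, annotated segments. The following is a minimal sketch of what one such record and a simple retrieval step might look like, using Python's standard `xml.etree.ElementTree`. The element names (`corpus`, `segment`, `metadata`, `source`, `target`), the metadata fields, the sample sentences, and the helper functions are all hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET


def build_segment_record(seg_id, source_en, target_zh, meta):
    """Build one aligned English-Chinese segment as an XML element.

    The schema here is an assumption made for illustration only.
    """
    seg = ET.Element("segment", attrib={"id": seg_id})
    meta_el = ET.SubElement(seg, "metadata")
    # Sort keys so the serialized output is deterministic.
    for key, value in sorted(meta.items()):
        ET.SubElement(meta_el, key).text = str(value)
    ET.SubElement(seg, "source", attrib={"lang": "en"}).text = source_en
    ET.SubElement(seg, "target", attrib={"lang": "zh"}).text = target_zh
    return seg


def find_by_topic(corpus, topic):
    """Return all segments whose metadata topic matches exactly."""
    return [
        seg for seg in corpus.findall("segment")
        if seg.findtext("metadata/topic") == topic
    ]


# Assemble a tiny corpus with one made-up segment.
corpus = ET.Element("corpus")
corpus.append(
    build_segment_record(
        "seg-0001",
        "Trade between the two countries has grown steadily.",
        "两国之间的贸易稳步增长。",
        {"topic": "economy", "scenario": "press conference", "mode": "consecutive"},
    )
)

# Serialize to a Unicode string, e.g. for writing to a file.
xml_text = ET.tostring(corpus, encoding="unicode")
```

Keeping the metadata in child elements rather than in free text lets retrieval filter on fields such as topic or scenario without parsing the transcript itself.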