Research and development based on semantic understanding with Chinese characteristics

Published: July 01, 2023

Background:

Large models are widely used on social media, where incorrect remarks and opinions may have a negative social impact.
Large language models developed in the West carry ideological risks when applied.
Domestic research and development on large language models mostly focuses on expanding general capabilities and pays less attention to ideological aspects.

Contributions:

Building on a large language model independently developed in China and focusing on the vertical field of semantics with Chinese characteristics (represented by the core socialist values), we build the first understanding and alignment dataset and large language model of this kind, at home or abroad.
We optimize the large language model across four core stages: pre-training, supervised fine-tuning, reward model evaluation, and reinforcement learning from human feedback.
The model not only supports question answering, summarization, and expansion, but also provides service and evaluation capabilities for semantic understanding and alignment with Chinese characteristics.