Model Overview
Description
ChatGLM3 is a generation of pre-trained dialogue models jointly released by Zhipu AI and Tsinghua KEG. ChatGLM3-6B is the open-source model in the ChatGLM3 series, retaining many excellent features of the first two generations, such as smooth dialogue and a low deployment threshold. It adopts a more diverse training dataset, a newly designed prompt format, and a more comprehensive open-source series. The weights are fully open for academic research, and free commercial use is also allowed after registration via a questionnaire.
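As context for the description above, here is a minimal inference sketch using the Hugging Face transformers library, following the usage pattern published in THUDM's repository; the prompt and device placement are illustrative, and `trust_remote_code=True` is needed because the checkpoint ships its own modeling code:

```python
from transformers import AutoModel, AutoTokenizer

# Load tokenizer and model; trust_remote_code is required because
# ChatGLM3 bundles custom modeling code with the checkpoint.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True).half().cuda()
model = model.eval()

# Multi-turn chat: `history` carries prior turns in the newly designed prompt format.
response, history = model.chat(tokenizer, "Hello, what can you do?", history=[])
print(response)
```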
Terms and Conditions
We hereby declare that our team has not developed any applications based on ChatGLM3 models, whether on iOS, Android, the web, or any other platform. We strongly urge all users not to use ChatGLM3 models for any activities that harm national or social security or violate the law. We also ask users not to use ChatGLM3 models for Internet services that have not undergone appropriate security reviews and filings. We hope that all users will abide by these principles so that the development of technology proceeds in a regulated and lawful environment.
If any problems arise from the use of the ChatGLM3 open-source models, including but not limited to data security issues, public opinion risks, or any risks and problems caused by the model being misled, abused, disseminated, or improperly exploited, we will not assume any responsibility.
Third-Party Community Consideration
This model is not owned or developed by NVIDIA. It has been developed and built to a third party's requirements for this application and use case; see THUDM's Model Card (https://huggingface.co/THUDM/chatglm3-6b).
License, Acceptable Use, and Research Privacy Policy
By using this model, you agree to the terms and conditions of the Apache 2.0 license and the ChatGLM3-6B Model License.
The model is provided for internal study and testing purposes only. Please follow the Model License and Terms of Use, and contact ZhipuAI (https://open.bigmodel.cn/mla/form) for any commercial use.
The ChatGLM3-6B open-source model aims to promote the development of large-model technology. Developers and users are earnestly requested to comply with the following rules:
- Comply with applicable laws, regulations, and policies.
- Do not perform or facilitate any activities that may harm the safety, wellbeing, or rights of others, and do not use the model for any services that have not been evaluated and registered.
- Do not distribute output from the ChatGLM3-6B open-source model and service to misinform, misrepresent, or mislead others, or use it for any improper purpose.
Although every effort has been made to ensure the compliance and accuracy of the data at various stages of model training, the accuracy of output content cannot be guaranteed due to the smaller scale of the ChatGLM3-6B model and the influence of probabilistic randomness. The model output is also easily misled by user input. Developers and users should build safeguards and assume the risks and liabilities involved in using the open-source model and code.
References
GitHub: https://github.com/THUDM/ChatGLM3
HuggingFace: https://huggingface.co/THUDM/chatglm3-6b
Technical Report: https://arxiv.org/abs/2103.10360
Model Developer: Knowledge Engineering Group (KEG) & Data Mining (THUDM) at Tsinghua University
Model Release Date: October 22, 2023
Model Architecture:
Architecture Type: Transformer
Fine-tuned from model: ChatGLM3-6B base model
Input:
Input Type: Text
Input Format: String
Input Parameters: Temperature, Top P, Max Output Tokens (see the sampling sketch after the Output block)
Output:
Output Type: Text
Output Format: String
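As referenced above, a short sketch of how the input parameters map onto generation, again using the `chat` API from THUDM's repository; the specific values are illustrative defaults, not recommendations:

```python
# Temperature, Top P, and Max Output Tokens map onto the sampling
# arguments of the chat API; in this interface, max_length bounds the
# total token count (prompt plus generated output).
response, history = model.chat(
    tokenizer,
    "Summarize the ChatGLM3 series in one sentence.",
    history=[],
    do_sample=True,   # enable sampling so temperature/top_p take effect
    temperature=0.8,  # Temperature: higher values give more diverse output
    top_p=0.8,        # Top P: nucleus-sampling probability mass
    max_length=2048,  # caps total tokens, prompt included
)
print(response)
```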
Model Version: ChatGLM3-6B v1.1.0
Evaluation
ChatGLM3-6B was evaluated on 8 typical Chinese and English datasets covering general language understanding, mathematics, code, and multilingual translation. For detailed evaluation results of the original models, please refer to the GitHub repository.
Inference:
Engine: Triton TensorRT-LLM
Test Hardware: NVIDIA L40
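For serving, a minimal Triton client sketch is given below. It assumes a deployment behind the TensorRT-LLM backend's usual `ensemble` model with `text_input` / `max_tokens` / `text_output` tensors; those names, shapes, and the endpoint URL are deployment-specific assumptions, not guarantees of this model's configuration:

```python
import numpy as np
import tritonclient.http as httpclient

# Assumed deployment: Triton Inference Server with the TensorRT-LLM
# backend exposing an "ensemble" model (tensor names are assumptions).
client = httpclient.InferenceServerClient(url="localhost:8000")

text = httpclient.InferInput("text_input", [1, 1], "BYTES")
text.set_data_from_numpy(np.array([["What is ChatGLM3?"]], dtype=object))

max_tokens = httpclient.InferInput("max_tokens", [1, 1], "INT32")
max_tokens.set_data_from_numpy(np.array([[256]], dtype=np.int32))

result = client.infer(model_name="ensemble", inputs=[text, max_tokens])
print(result.as_numpy("text_output"))
```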