Model Overview
Description:
Baichuan 2 is the new generation of large-scale open-source language models launched by Baichuan Intelligence Inc. It is trained on a high-quality corpus of 2.6 trillion tokens and achieves the best performance among models of the same size on authoritative Chinese and English benchmarks. This 13B chat version is fully open to academic research. Developers can also use it for free in commercial applications after obtaining an official commercial license.
Evaluation
Baichuan2-13B-Chat is tested on authoritative Chinese and English datasets across six domains: General, Legal, Medical, Mathematics, Code, and Multilingual Translation. For more detailed evaluation results of the original models, please refer to the GitHub repository.
Terms and Conditions
We hereby declare that our team has not developed any applications based on Baichuan 2 models, whether on iOS, Android, the web, or any other platform. We strongly call on all users not to use Baichuan 2 models for any activities that harm national or social security or violate the law. We also ask users not to use Baichuan 2 models for Internet services that have not undergone appropriate security reviews and filings. We hope that all users will abide by these principles and ensure that the development of technology proceeds in a regulated and legal environment.
If any problems arise due to the use of Baichuan 2 open-source models, including but not limited to data security issues, public opinion risks, or any risks and problems brought about by the model being misled, abused, spread or improperly exploited, we will not assume any responsibility.
Third-Party Community Consideration
This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party’s requirements for this application and use case; see Baichuan-Inc's Model Card.
Reference(s):
GitHub
HuggingFace
Technical Report
License, Acceptable Use, and Research Privacy Policy
By using this model, you are agreeing to the terms and conditions of the Apache 2.0 license and the Community License for Baichuan2 Model.
Model Architecture:
Architecture Type: Transformer
Fine-tuned from model: Baichuan2
Input:
Input Type: Text
Input Format: String
Input Parameters: Temperature, Top P, Max Output Tokens
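To make the three input parameters concrete, the following is a minimal sketch of how Temperature, Top P, and Max Output Tokens are commonly applied during text generation. It is an illustration under stated assumptions, not the serving engine's actual implementation: the function names `sample_next_token` and `generate`, the `next_logits` callable standing in for the model, and the `seed` argument are all hypothetical.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_p=1.0, rng=None):
    """Pick one token index from raw logits using temperature and top-p sampling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    rng = rng or random.Random()

    # Temperature: divide logits before the softmax. Values below 1 sharpen
    # the distribution, values above 1 flatten it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top P: keep the smallest set of most probable tokens whose cumulative
    # probability reaches top_p, then sample only among those.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


def generate(next_logits, prompt_ids, max_output_tokens,
             temperature=1.0, top_p=1.0, seed=None):
    """Generate at most max_output_tokens token ids after the prompt.

    next_logits is a hypothetical stand-in for the model: it takes the
    token ids so far and returns one logit per vocabulary entry.
    """
    rng = random.Random(seed)
    ids = list(prompt_ids)
    for _ in range(max_output_tokens):  # Max Output Tokens caps the loop
        ids.append(sample_next_token(next_logits(ids), temperature, top_p, rng))
    return ids[len(prompt_ids):]
```

In this sketch, a low Top P combined with a peaked distribution leaves only the most probable token in the candidate set, so the output becomes deterministic; raising Temperature or Top P widens the set and increases variety. A real deployment would also stop early on an end-of-sequence token, which is omitted here.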
Output:
Output Type: Text
Output Format: String
Inference:
Engine: Triton TensorRT-LLM
Test Hardware: L40