baichuan-inc / baichuan2-13b-chat

Model Overview

Description:

Baichuan 2 is the new generation of large-scale open-source language models launched by Baichuan Intelligence Inc. It is trained on a high-quality corpus of 2.6 trillion tokens and achieves the best performance among models of the same size on authoritative Chinese and English benchmarks. This 13B chat version is fully open to academic research. Developers can also use it for free in commercial applications after obtaining an official commercial license.

Evaluation

Baichuan2-13B-Chat is tested on authoritative Chinese and English datasets across six domains: General, Legal, Medical, Mathematics, Code, and Multilingual Translation. For more detailed evaluation results of the original models, please refer to GitHub.

Terms and Conditions

We hereby declare that our team has not developed any applications based on Baichuan 2 models, whether on iOS, Android, the web, or any other platform. We strongly call on all users not to use Baichuan 2 models for any activities that harm national or social security or violate the law. We also ask users not to use Baichuan 2 models for Internet services that have not undergone appropriate security reviews and filings. We hope that all users will abide by this principle and ensure that the development of technology proceeds in a regulated and legal environment.

If any problems arise from the use of Baichuan 2 open-source models, including but not limited to data security issues, public opinion risks, or any risks and problems caused by the model being misled, abused, disseminated, or improperly exploited, we will not assume any responsibility.

Third-Party Community Consideration

This model is not owned or developed by NVIDIA. This model has been developed and built to a third party's requirements for this application and use case; see the link to Baichuan-Inc's Model Card.

Reference(s):

GitHub

HuggingFace

Technical Report

License, Acceptable Use, and Research Privacy Policy

By using this model, you agree to the terms and conditions of the Apache 2.0 license and the Community License for Baichuan2 Model.

Model Architecture:

Architecture Type: Transformer

Fine-tuned from model: Baichuan2

Input:

Input Type: Text

Input Format: String

Input Parameters: Temperature, Top P, Max Output Tokens
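
The input fields above can be sketched as a small request builder. This is a minimal illustration only: the function name `build_request`, the payload keys (`prompt`, `temperature`, `top_p`, `max_output_tokens`), the default values, and the accepted ranges are all assumptions made for the example, not the documented interface of any particular serving endpoint.

```python
import json


def build_request(prompt, temperature=0.7, top_p=0.9, max_output_tokens=256):
    """Assemble a text-generation request as a plain dict.

    Hypothetical helper: key names, defaults, and ranges are assumptions
    chosen for illustration, not a documented API.
    """
    # Input Type is Text and Input Format is String.
    if not isinstance(prompt, str) or not prompt:
        raise ValueError("prompt must be a non-empty string")
    # Temperature scales sampling randomness; negative values make no sense.
    if temperature < 0:
        raise ValueError("temperature must be >= 0")
    # Top P is a cumulative probability mass, assumed here to lie in (0, 1].
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    # Max Output Tokens caps the length of the generated string.
    if not isinstance(max_output_tokens, int) or max_output_tokens < 1:
        raise ValueError("max_output_tokens must be a positive integer")
    return {
        "prompt": prompt,
        "temperature": temperature,
        "top_p": top_p,
        "max_output_tokens": max_output_tokens,
    }


payload = build_request(
    "Summarize the Baichuan 2 license terms in one sentence.",
    temperature=0.2,
    top_p=0.7,
    max_output_tokens=128,
)
# Serialize to JSON, the form in which such a payload would typically be sent.
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

Validating the parameters before sending keeps malformed values from reaching the inference engine; the exact limits a real deployment enforces may differ from the ones assumed here.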

Output:

Output Type: Text

Output Format: String

Inference:

Engine: Triton TensorRT-LLM

Test Hardware: L40