mediatek / breeze-7b-instruct

Model Overview

Description

Breeze-7B-Instruct is an instruction-tuned model derived from the base model Breeze-7B-Base, making it suitable for use as-is on commonly seen tasks.
The current release version of Breeze-7B is v1.0, which has undergone a more refined training process compared to Breeze-7B-v0_1, resulting in significantly improved performance in both English and Traditional Chinese.

Third-Party Community Consideration

This model is not owned or developed by NVIDIA. It has been developed and built to a third party's requirements for this application and use case; see the Breeze model card for details.

License and Terms of use

GOVERNING TERMS: Your use of this API is governed by the NVIDIA API Trial Service Terms of Use; and the use of this model is governed by the NVIDIA AI Foundation Models Community License.

Model Developer: MediaTek Research

Model Release Date: March 5, 2024.

Features

  • Expanded vocabulary from 32k to 62k tokens to better support Traditional Chinese
  • 8k-token context length
  • Multi-turn dialogue (without special handling for harmfulness)
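Because the network architecture is Mistral-7B, multi-turn prompts are commonly assembled with a Mistral-style `[INST] … [/INST]` template. The sketch below illustrates that convention only; the exact template is an assumption, and the model's tokenizer chat template is the authoritative source for the real format.

```python
def build_prompt(turns, system_prompt=""):
    """Assemble a Mistral-style multi-turn prompt string.

    `turns` is a list of (user, assistant) pairs; the final pair may use
    assistant=None for the turn awaiting a reply. The template here is an
    assumption, not the confirmed Breeze format.
    """
    prompt = "<s>" + system_prompt
    for user, assistant in turns:
        prompt += f" [INST] {user} [/INST]"
        if assistant is not None:
            prompt += f" {assistant}"
    return prompt

# Two-turn Traditional Chinese conversation; the second turn awaits a reply.
prompt = build_prompt(
    [("台北有什麼好玩的？", "可以去故宮博物院和陽明山。"),
     ("還有其他推薦嗎？", None)],
)
```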

Benchmark Performance

Breeze-7B-Instruct-v1_0 is compared below with open-source instruction-tuned language models of similar parameter size that are known for strong performance in Chinese.

All results are 0-shot; higher is better. TC = Traditional Chinese, EN = English.

| Model | # Parameters | MT-Bench-tw (Score, TC Chat) | TMMLU+ (ACC, TC Knowledge) | Table (ACC, TC Reasoning) | MT-Bench (Score, EN Chat) | MMLU (ACC, EN Knowledge) |
| --- | --- | --- | --- | --- | --- | --- |
| GPT-3.5-Turbo | – | 7.1 | 43.56 | 45.14 | 7.9 | 67.09 |
| Qwen1.5-7B-Chat | 7B | 6.4 | 45.65 | 34.72 | 7.6 | 61.85 |
| Breeze-7B-Instruct-v1_0 | 7B | 6.0 | 42.67 | 39.58 | 7.4 | 61.73 |
| Mistral-7B-v0.2-Instruct | 7B | 5.6 | 34.95 | 33.33 | 7.6 | 59.97 |
| Yi-6B-Chat | 6B | 5.0 | 44.79 | 25.69 | 6.0 | 59.45 |
| Taiwan-LLM-13B-v2.0-chat | 13B | 5.0 | 29.47 | 23.61 | N/A* | 50.50 |
| Taiwan-LLM-7B-v2.1-chat | 7B | 4.2 | 28.08 | 31.25 | N/A* | 42.72 |

* Taiwan-LLM models respond to multi-turn English questions in Traditional Chinese, so their English MT-Bench scores are not reported.

Details on MT-Bench-tw (0 shot):

| Model | STEM | Extraction | Reasoning | Math | Coding | Roleplay | Writing | Humanities | AVG |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-3.5-Turbo | 7.8 | 6.1 | 5.1 | 6.4 | 6.2 | 8.7 | 7.4 | 9.3 | 7.1 |
| Qwen1.5-7B-Chat | 9.0 | 5.6 | 4.7 | 2.8 | 3.7 | 8.0 | 8.0 | 9.4 | 6.4 |
| Breeze-7B-Instruct-v1_0 | 7.8 | 5.2 | 4.2 | 4.2 | 4.1 | 7.6 | 5.9 | 9.1 | 6.0 |
| Mistral-7B-v0.2-Instruct | 6.9 | 4.6 | 4.3 | 3.3 | 4.4 | 7.2 | 6.2 | 7.8 | 5.6 |
| Yi-6B-Chat | 7.3 | 2.7 | 3.1 | 3.3 | 2.3 | 7.2 | 5.2 | 8.8 | 5.0 |
| Taiwan-LLM-13B-v2.0-chat | 6.1 | 3.4 | 4.1 | 2.3 | 3.1 | 7.4 | 6.6 | 6.8 | 5.0 |
| Taiwan-LLM-7B-v2.1-chat | 5.2 | 2.6 | 2.3 | 1.2 | 3.4 | 6.6 | 5.7 | 6.8 | 4.2 |

Details on TMMLU+ (0 shot):

| Model | STEM | Social Science | Humanities | Other | AVG |
| --- | --- | --- | --- | --- | --- |
| GPT-3.5-Turbo | 41.58 | 48.52 | 40.96 | 43.18 | 43.56 |
| Qwen1.5-7B-Chat | 41.48 | 51.66 | 44.05 | 45.40 | 45.65 |
| Breeze-7B-Instruct-v1_0 | 36.46 | 48.38 | 45.11 | 40.75 | 42.67 |
| Mistral-7B-v0.2-Instruct | 32.79 | 38.05 | 34.89 | 34.04 | 34.94 |
| Yi-6B-Chat | 37.80 | 51.74 | 45.36 | 44.25 | 44.79 |
| Taiwan-LLM-13B-v2.0-chat | 27.74 | 33.69 | 27.03 | 29.43 | 29.47 |
| Taiwan-LLM-7B-v2.1-chat | 25.58 | 31.76 | 27.36 | 27.61 | 28.08 |

Model Architecture

  • Architecture Type: Causal decoder-only transformer language model
  • Network Architecture: Mistral-7B

Input

  • Input Type: Text
  • Input Format: String
  • Input Parameters: max_tokens, temperature, top_p, stop, frequency_penalty, presence_penalty, seed

Output

  • Output Type: Text
  • Output Format: String
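
The input parameters listed above map onto a standard chat-completions-style request body. The sketch below shows how such a payload could be assembled; the model identifier string and all parameter values are illustrative assumptions, not confirmed endpoint details.

```python
import json

# Hypothetical chat-completions request body using the documented input
# parameters; values are illustrative, not recommended settings.
payload = {
    "model": "mediatek/breeze-7b-instruct",
    "messages": [
        {"role": "user", "content": "請用繁體中文介紹台北的三個景點。"}
    ],
    "max_tokens": 512,         # cap on generated tokens
    "temperature": 0.7,        # sampling randomness
    "top_p": 0.9,              # nucleus-sampling threshold
    "stop": None,              # optional stop sequence(s)
    "frequency_penalty": 0.0,  # penalize frequent repetition
    "presence_penalty": 0.0,   # penalize reuse of earlier tokens
    "seed": 42,                # best-effort reproducibility
}

# Serialize for an HTTP POST; ensure_ascii=False keeps Traditional
# Chinese characters readable in the request body.
body = json.dumps(payload, ensure_ascii=False)
```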

Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. Please report security vulnerabilities or NVIDIA AI Concerns here.

Software Integration

  • Supported Hardware Platform(s): Lovelace

Supported Operating System(s)

  • Linux

Model Version

Breeze-7B-Instruct-v1_0

Inference

Engine: Triton + TensorRT-LLM

Test Hardware: L40