
Mistral AI opens up yet another closed enterprise model: Mistral-Small-Instruct-2409

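Not long after open-sourcing the Pixtral 12B vision multimodal model, Mistral has open-sourced its enterprise small model, Mistral-Small-Instruct-2409 (22B). It is Mistral AI's latest enterprise-grade small model and an upgrade of Mistral Small v24.02. The model is available under the Mistral Research License and gives customers a cost-efficient, fast, and reliable option for translation, summarization, sentiment analysis, and other tasks that do not require a full general-purpose model.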

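The original Mistral Small was based on Mixtral-8X7B-v0.1 (46.7B), a sparse mixture-of-experts model with about 12B active parameters. It has stronger reasoning and more capabilities, can generate and reason about code, and is multilingual, supporting English, French, German, Italian, and Spanish.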
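To make the sparse-versus-dense comparison concrete, the sketch below works out what share of the stored weights is used per token, using the figures quoted above (46.7B total, about 12B active). It assumes the new 22B model is dense, meaning every parameter is active for every token; the function name is illustrative only.

```python
def active_fraction(active_params_b: float, total_params_b: float) -> float:
    """Share of a model's stored parameters that is used for each token."""
    return active_params_b / total_params_b


# Figures quoted in the text for Mixtral-8X7B-v0.1 (sparse mixture of experts).
moe_fraction = active_fraction(12.0, 46.7)

# Assumption: Mistral-Small-Instruct-2409 is a dense 22B model,
# so all stored parameters are active for every token.
dense_fraction = active_fraction(22.0, 22.0)

print(f"Mixtral-8X7B: {moe_fraction:.1%} of weights active per token")
print(f"Mistral Small 22B: {dense_fraction:.1%} of weights active per token")
```

Under these assumptions, the older model stores roughly twice as many weights as the 22B model but touches fewer of them per token.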

This is exciting; Mistral models always perform well above their weight. We now have strong coverage across many size tiers:

  • 8b- Llama 3.1 8b

  • 12b- Nemo 12b

  • 22b- Mistral Small

  • 27b- Gemma-2 27b

  • 35b- Command-R 35b 08-2024

  • 40-60b- GAP (I believe there are two new MoE models in this range, but last I checked llama.cpp did not support them)

  • 70b- Llama 3.1 70b

  • 103b- Command-R+ 103b

  • 123b- Mistral Large 2

  • 141b- WizardLM-2 8x22b

  • 230b- Deepseek V2/2.5

  • 405b- Llama 3.1 405b


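Mistral Small v24.09 has 22 billion parameters and gives customers a convenient midpoint between Mistral NeMo 12B and Mistral Large 2, a cost-effective solution that can be deployed across a range of platforms and environments. According to the comparison charts published with the release, the new small model improves significantly over the previous model in human alignment, reasoning, and code.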

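Mistral-Small-Instruct-2409 is an instruction-fine-tuned version with the following characteristics:

  • 22B parameters
  • Vocabulary size of 32768
  • Supports function calling
  • 128k sequence length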

Usage

vLLM (recommended)

Install vLLM >= v0.6.1.post1

pip install --upgrade vllm

Install mistral_common >= 1.4.1

pip install --upgrade mistral_common

Offline (local)

from vllm import LLM
from vllm.sampling_params import SamplingParams

model_name = "mistralai/Mistral-Small-Instruct-2409"

sampling_params = SamplingParams(max_tokens=8192)

# Note that running Mistral-Small on a single GPU requires at least 44 GB of GPU RAM.
# To divide the GPU requirement over multiple devices, add e.g. `tensor_parallel_size=2`.
llm = LLM(model=model_name, tokenizer_mode="mistral", config_format="mistral", load_format="mistral")

prompt = "How often does the letter r occur in Mistral?"

messages = [
    {
        "role": "user",
        "content": prompt,
    },
]

outputs = llm.chat(messages, sampling_params=sampling_params)

print(outputs[0].outputs[0].text)

Server

vllm serve mistralai/Mistral-Small-Instruct-2409 --tokenizer_mode mistral --config_format mistral --load_format mistral

Note: running Mistral-Small on a single GPU requires at least 44 GB of GPU memory.

To split the GPU requirement across multiple devices, add e.g. --tensor-parallel-size 2
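The 44 GB figure is consistent with a simple back-of-the-envelope estimate: 22 billion parameters stored in bfloat16 at 2 bytes each. The sketch below reproduces that arithmetic and shows the per-GPU share when the weights are split evenly across devices. It covers weights only; the KV cache and activations need additional memory, and the even split is an assumption.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2, num_gpus: int = 1) -> float:
    """Approximate memory for model weights per GPU, in decimal gigabytes."""
    total_bytes = num_params * bytes_per_param
    return total_bytes / num_gpus / 1e9


single_gpu = weight_memory_gb(22e9)            # bfloat16 weights on one GPU
two_gpus = weight_memory_gb(22e9, num_gpus=2)  # weights split across two GPUs

print(f"1 GPU:  {single_gpu:.0f} GB for weights")
print(f"2 GPUs: {two_gpus:.0f} GB for weights per GPU")
```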

Client

curl --location 'http://<your-node-url>:8000/v1/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer token' \
--data '{
    "model": "mistralai/Mistral-Small-Instruct-2409",
    "messages": [
        {
            "role": "user",
            "content": "How often does the letter r occur in Mistral?"
        }
    ]
}'
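The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl call. The base URL is a stand-in for `<your-node-url>`, and the response is assumed to follow the OpenAI-style chat completion shape (`choices[0].message.content`) that the `/v1/chat/completions` route suggests. The helper names are illustrative.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str, token: str = "token") -> urllib.request.Request:
    """Build the same POST request as the curl example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request) -> str:
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-style response body.
    """
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


request = build_chat_request(
    base_url="http://localhost:8000",  # assumption: server reachable locally on port 8000
    model="mistralai/Mistral-Small-Instruct-2409",
    prompt="How often does the letter r occur in Mistral?",
)
# reply = send_chat_request(request)  # uncomment once the server is running
# print(reply)
```

Building the request separately from sending it makes the payload easy to inspect before a server is available.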

Mistral-inference

Install mistral_inference >= 1.4.1

pip install mistral_inference --upgrade

Download

from huggingface_hub import snapshot_download
from pathlib import Path

# Local folder that will hold the model weights and tokenizer
mistral_models_path = Path.home().joinpath('mistral_models', '22B-Instruct-Small')
mistral_models_path.mkdir(parents=True, exist_ok=True)

snapshot_download(repo_id="mistralai/Mistral-Small-Instruct-2409", allow_patterns=["params.json", "consolidated.safetensors", "tokenizer.model.v3"], local_dir=mistral_models_path)
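Before loading the model, it can help to confirm that the three files requested in `allow_patterns` actually landed in the target folder. A minimal sketch using only `pathlib`; the helper name `missing_model_files` is made up for this example.

```python
from pathlib import Path

REQUIRED_FILES = ["params.json", "consolidated.safetensors", "tokenizer.model.v3"]


def missing_model_files(model_dir: Path, required=REQUIRED_FILES) -> list:
    """Return the required file names that are not present in model_dir."""
    return [name for name in required if not (model_dir / name).is_file()]


model_dir = Path.home().joinpath("mistral_models", "22B-Instruct-Small")
missing = missing_model_files(model_dir)
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All model files are present in", model_dir)
```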

Chat

mistral-chat $HOME/mistral_models/22B-Instruct-Small --instruct --max_tokens 256

Instruct following

from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate

from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

# mistral_models_path is the folder created in the download step
tokenizer = MistralTokenizer.from_file(f"{mistral_models_path}/tokenizer.model.v3")
model = Transformer.from_folder(mistral_models_path)

completion_request = ChatCompletionRequest(messages=[UserMessage(content="How often does the letter r occur in Mistral?")])

tokens = tokenizer.encode_chat_completion(completion_request).tokens

# Greedy decoding (temperature=0.0), stopping at the end-of-sequence token
out_tokens, _ = generate([tokens], model, max_tokens=64, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
result = tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0])

print(result)

Function calling

from mistral_common.protocol.instruct.tool_calls import Function, Tool
from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate

from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

# mistral_models_path is the folder created in the download step
tokenizer = MistralTokenizer.from_file(f"{mistral_models_path}/tokenizer.model.v3")
model = Transformer.from_folder(mistral_models_path)

# Describe the tool with a JSON Schema so the model knows its arguments
completion_request = ChatCompletionRequest(
    tools=[
        Tool(
            function=Function(
                name="get_current_weather",
                description="Get the current weather",
                parameters={
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "format": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"],
                            "description": "The temperature unit to use. Infer this from the users location.",
                        },
                    },
                    "required": ["location", "format"],
                },
            )
        )
    ],
    messages=[
        UserMessage(content="What's the weather like today in Paris?"),
    ],
)

tokens = tokenizer.encode_chat_completion(completion_request).tokens

out_tokens, _ = generate([tokens], model, max_tokens=64, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
result = tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0])

print(result)
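When the model decides to use the tool, the decoded result is expected to contain a structured list of calls rather than prose. The sketch below shows one way to parse such a string and dispatch it to a local function. The output format is an assumption (a JSON array of objects with `name` and `arguments`, possibly preceded by a `[TOOL_CALLS]` marker), the sample string is hand-written rather than real model output, and `get_current_weather` is a stub.

```python
import json


def parse_tool_calls(text: str) -> list:
    """Parse a decoded model output into a list of tool-call dicts.

    Assumed format: an optional "[TOOL_CALLS]" marker followed by a JSON
    array such as [{"name": ..., "arguments": {...}}].
    """
    cleaned = text.strip()
    marker = "[TOOL_CALLS]"
    if cleaned.startswith(marker):
        cleaned = cleaned[len(marker):].strip()
    calls = json.loads(cleaned)
    if not isinstance(calls, list):
        raise ValueError("expected a JSON array of tool calls")
    return calls


def get_current_weather(location: str, format: str) -> str:
    """Stub implementation; a real version would query a weather service."""
    return f"Weather for {location} requested in {format}"


# Map tool names (as declared in the request) to local Python callables.
TOOL_REGISTRY = {"get_current_weather": get_current_weather}

# Hand-written sample, not actual model output.
sample_output = '[{"name": "get_current_weather", "arguments": {"location": "Paris, France", "format": "celsius"}}]'

for call in parse_tool_calls(sample_output):
    handler = TOOL_REGISTRY[call["name"]]
    print(handler(**call["arguments"]))
```

Keeping a registry keyed by tool name means the model can only trigger functions that were explicitly exposed to it.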

Hugging Face Transformers

from transformers import LlamaTokenizerFast, MistralForCausalLM
import torch

device = "cuda"
tokenizer = LlamaTokenizerFast.from_pretrained('mistralai/Mistral-Small-Instruct-2409')
tokenizer.pad_token = tokenizer.eos_token

# Load the weights in bfloat16 and move the model to the GPU
model = MistralForCausalLM.from_pretrained('mistralai/Mistral-Small-Instruct-2409', torch_dtype=torch.bfloat16)
model = model.to(device)

prompt = "How often does the letter r occur in Mistral?"

messages = [
    {"role": "user", "content": prompt},
]

# Apply the chat template and tokenize in one step
model_input = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(device)
gen = model.generate(model_input, max_new_tokens=150)
dec = tokenizer.batch_decode(gen)
print(dec)

Output

<s>[INST]How often does the letter r occur in Mistral?[/INST]To determine how often the letter "r" occurs in the word "Mistral," we can simply count the instances of "r" in the word.
The word "Mistral" is broken down as follows:
- M
- i
- s
- t
- r
- a
- l
Counting the "r"s, we find that there is only one "r" in "Mistral."
Therefore, the letter "r" occurs once in the word "Mistral."
</s>
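Because `batch_decode` is called without skipping special tokens, the printed string still carries the chat template markers. A small helper can isolate the reply. This sketch assumes the `<s>[INST] ... [/INST] ... </s>` layout shown in the output above, uses a shortened copy of that output as input, and is not part of the Transformers API.

```python
def extract_reply(decoded: str) -> str:
    """Return the text between the last [/INST] marker and the closing </s>."""
    _, _, after_prompt = decoded.rpartition("[/INST]")
    reply, _, _ = after_prompt.partition("</s>")
    return reply.strip()


# Shortened version of the decoded output, for illustration only.
decoded = '<s>[INST]How often does the letter r occur in Mistral?[/INST]The letter "r" occurs once in the word "Mistral."</s>'
print(extract_reply(decoded))
```

Passing `skip_special_tokens=True` to `batch_decode` is another option for dropping special tokens such as `<s>` and `</s>`.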

It looks like Mistral is trying to fix the "strawberry problem" with CoT 🙂
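The model's answer is easy to verify, since counting characters is trivial for ordinary code. A quick check, including the word that gives the "strawberry problem" its name:

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


print(count_letter("Mistral", "r"))     # → 1
print(count_letter("strawberry", "r"))  # → 3
```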

References

https://mistral.ai/news/september-24-release/

https://artificialanalysis.ai/models/mistral-small

https://huggingface.co/mistralai/Mistral-Small-Instruct-2409
