Running Qwen2-VL Locally
- 1. Clone the code
- 2. Create a virtual environment
- 3. Install dependencies
- 4. Launch the server
- 5. Send a request
1. Clone the code

```shell
git clone https://github.com/QwenLM/Qwen2-VL.git
cd Qwen2-VL
```
2. Create a virtual environment

```shell
conda create -n qwen2-vl python=3.11 -y
conda activate qwen2-vl
```
3. Install dependencies

```shell
pip install git+https://github.com/huggingface/transformers accelerate
pip install qwen-vl-utils
pip install deepspeed
pip install flash-attn --no-build-isolation
pip install einops==0.8.0
pip install git+https://github.com/fyabc/vllm.git@add_qwen2_vl_new
```
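After installing, a quick sanity check can confirm that the key packages resolve before launching the server. This is a small sketch using only the standard library; the importable module names are assumptions where they differ from the pip package names (e.g. `qwen_vl_utils` for `qwen-vl-utils`, `flash_attn` for `flash-attn`).

```python
import importlib.util


def missing_modules(names):
    """Return the subset of module names that cannot be found by the import system."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    # Module names corresponding to the pip packages installed above.
    required = ["transformers", "accelerate", "qwen_vl_utils",
                "deepspeed", "flash_attn", "einops", "vllm"]
    absent = missing_modules(required)
    if absent:
        print("Missing:", ", ".join(absent))
    else:
        print("All dependencies found.")
```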
4. Launch the server

```shell
python -m vllm.entrypoints.openai.api_server \
  --served-model-name Qwen2-VL-7B-Instruct \
  --model Qwen/Qwen2-VL-7B-Instruct
```
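The server takes a while to download and load the weights. One way to tell when it is ready is to poll the OpenAI-compatible `/v1/models` endpoint until it answers; the sketch below uses only the standard library and assumes the default port 8000.

```python
import json
import time
import urllib.error
import urllib.request


def served_model_ids(payload: dict) -> list:
    """Extract model ids from an OpenAI-style /v1/models response."""
    return [m["id"] for m in payload.get("data", [])]


def wait_until_ready(base="http://localhost:8000", timeout=600):
    """Poll /v1/models until the server answers, or raise after `timeout` seconds."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(base + "/v1/models", timeout=5) as resp:
                return served_model_ids(json.load(resp))
        except (urllib.error.URLError, OSError):
            time.sleep(5)  # weights still loading; try again
    raise TimeoutError("server did not come up in time")


# Usage (with the server from step 4 running):
# print("Serving:", wait_until_ready())
```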
5. Send a request

```shell
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen2-VL-7B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": [
        {"type": "image_url", "image_url": {"url": "https://modelscope.oss-cn-beijing.aliyuncs.com/resource/qwen.png"}},
        {"type": "text", "text": "What is the text in the illustration?"}
      ]}
    ]
  }'
```
Or call the server with the OpenAI Python client (note that `model` must match the `--served-model-name` passed at launch):

```python
from openai import OpenAI

# Set OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

chat_response = client.chat.completions.create(
    model="Qwen2-VL-7B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {"url": "https://modelscope.oss-cn-beijing.aliyuncs.com/resource/qwen.png"},
                },
                {"type": "text", "text": "What is the text in the illustration?"},
            ],
        },
    ],
)
print("Chat response:", chat_response)
```
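The example above points the model at a remote URL; a local image can also be sent by inlining it as a base64 `data:` URL in the same `image_url` field, a form vLLM's OpenAI-compatible server generally accepts. A minimal sketch with the standard library (the file path in the commented usage is a hypothetical placeholder):

```python
import base64


def to_data_url(image_bytes: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data: URL suitable for an image_url field."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"


# Usage with the client from the previous example:
# with open("demo.png", "rb") as f:          # hypothetical local image
#     url = to_data_url(f.read())
# ...then pass {"type": "image_url", "image_url": {"url": url}}
# in the message content instead of the remote URL.
```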
Done!
References:
- https://github.com/QwenLM/Qwen2-VL
- https://help.aliyun.com/zh/model-studio/developer-reference/qwen-vl-api#2166c1d8b3i5r