Running Qwen GGUF models with vLLM: usage examples, called through the OpenAI client and through requests
References:
https://docs.vllm.ai/en/latest/getting_started/examples/gguf_inference.html
https://docs.vllm.ai/en/latest/models/engine_args.html
Installation: upgrading to vLLM 0.5.5 is required
pip install -U vllm -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi<