YOLOv8-Based Ship Detection and Segmentation (ONNX Model)

news, Source: original, 2024/9/20 16:28:45

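Project Background

Requirements analysis: In ocean surveillance, port management, maritime safety, and similar fields, automated ship detection and segmentation is important for improving efficiency and safety.
Technology selection: YOLOv8 is a relatively recent version in the YOLO series, known for its speed and accuracy. Exporting to the ONNX (Open Neural Network Exchange) format allows the model to be deployed across platforms, and running it with an optimized runtime often improves inference performance.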

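Tech Stack

Python: the main programming language.
PyTorch: used to train and load the YOLOv8 model.
ONNX: used for model conversion and deployment.
OpenCV: used for image processing and displaying results.
Pillow: used for reading and saving image files.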

Project Structure

Data preparation:
- Collect an annotated ship image dataset.
- Split the dataset into a training set and a test set.

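Model training:
- Train the model with the YOLOv8 framework.
- Tune hyperparameters to optimize detection and segmentation performance.
Model conversion:
- Export the trained PyTorch model to ONNX format.
- Verify the correctness of the ONNX model.
Inference deployment:
- Write inference code that detects and segments ships in images or video streams.
- Use ONNX Runtime for efficient inference.
Results:
- Visualize the detection results, including bounding boxes and segmentation masks.
- Compute and report performance metrics such as precision and recall.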
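
The last step of the outline mentions reporting precision and recall. As a rough illustration, these two numbers could be computed for a single image as sketched here. It assumes boxes in (x1, y1, x2, y2) format and an IoU threshold of 0.5 for deciding whether a prediction matches a ground-truth ship. The names `iou` and `precision_recall` are made up for this example and are not part of YOLOv8 or ONNX Runtime.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """
    predictions: list of (score, box); ground_truth: list of boxes.
    Predictions are matched greedily in descending score order, and each
    ground-truth box can be matched at most once.
    """
    matched = set()
    true_positives = 0
    for score, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Two ground-truth ships, three predictions (one of them a false alarm)
gt_boxes = [(10, 10, 50, 50), (100, 100, 160, 140)]
preds = [
    (0.9, (12, 11, 49, 52)),
    (0.8, (102, 98, 158, 141)),
    (0.6, (200, 200, 240, 240)),
]
precision, recall = precision_recall(preds, gt_boxes)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A full evaluation would aggregate these counts over the whole test set and usually sweep the confidence threshold; this sketch only covers the per-image matching step.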

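Example Code

A simplified snippet showing how to load an ONNX model and run ship detection and segmentation on a single image:

import cv2
import numpy as np
import onnxruntime as ort

# Load the ONNX model
ort_session = ort.InferenceSession("yolov8.onnx")
input_name = ort_session.get_inputs()[0].name  # 'images' in the original snippet

# Load the image and keep a uint8 BGR copy to draw on
image = cv2.imread('input.jpg')
if image is None:
    raise FileNotFoundError('input.jpg')
canvas = cv2.resize(image, (640, 640))  # assumes a 640x640 model input

# Build the input tensor: BGR -> RGB, scale to [0, 1], HWC -> CHW, add batch axis
img = cv2.cvtColor(canvas, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
img = np.expand_dims(img.transpose(2, 0, 1), axis=0)

# Run inference
outputs = ort_session.run(None, {input_name: img})

# Parse the outputs. Assumes the exported model returns four tensors
# in this order; check your own model before relying on it.
boxes, scores, labels, masks = outputs

# Visualize the results
for box, score, label, mask in zip(boxes[0], scores[0], labels[0], masks[0]):
    if score > 0.5:
        x1, y1, x2, y2 = (int(v) for v in box)
        # Draw the bounding box
        cv2.rectangle(canvas, (x1, y1), (x2, y2), (0, 255, 0), 2)
        # Overlay the segmentation mask (assumes one 2-D mask per detection)
        mask = (mask > 0.5).astype(np.uint8) * 255
        mask = cv2.resize(mask, (canvas.shape[1], canvas.shape[0]), interpolation=cv2.INTER_NEAREST)
        masked = cv2.bitwise_and(canvas, canvas, mask=mask)
        canvas = cv2.addWeighted(canvas, 1, masked, 0.5, 0)

cv2.imshow('Detection and Segmentation', canvas)
cv2.waitKey(0)
cv2.destroyAllWindows()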

First, make sure the required libraries are installed:

pip install onnxruntime opencv-python Pillow numpy

Next is the Python code example:

import cv2
import numpy as np
import onnxruntime as ort

def letterbox_image(image, new_size):
    """
    Resize the image with unchanged aspect ratio using padding.
    new_size is (height, width). Returns the padded image, the scale
    ratio, and the (top, left) padding.
    """
    old_h, old_w = image.shape[:2]
    ratio = min(new_size[0] / old_h, new_size[1] / old_w)
    resized_h, resized_w = int(old_h * ratio), int(old_w * ratio)
    image = cv2.resize(image, (resized_w, resized_h))

    # Pad up to the target size (subtracting the resized size from itself,
    # as the original did, always gives zero padding)
    delta_w = new_size[1] - resized_w
    delta_h = new_size[0] - resized_h
    top, bottom = delta_h // 2, delta_h - (delta_h // 2)
    left, right = delta_w // 2, delta_w - (delta_w // 2)

    color = [0, 0, 0]
    new_img = cv2.copyMakeBorder(image, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)
    return new_img, ratio, (top, left)

def preprocess_image(image, input_size=(640, 640)):
    """
    Preprocess the image to match the input size of the model.
    Also returns the ratio and padding needed to map results back.
    """
    img, ratio, padding = letterbox_image(image, input_size)
    img = img[:, :, ::-1].transpose((2, 0, 1))  # BGR to RGB, HWC to CHW
    img = np.ascontiguousarray(img, dtype=np.float32) / 255.0
    return img, ratio, padding

def postprocess_output(output, confidence_threshold=0.5, iou_threshold=0.5):
    """
    Postprocess the output from the model.
    Assumes four output tensors (boxes as x1, y1, x2, y2; scores; labels;
    masks), each with a leading batch axis.
    """
    boxes, scores, labels, masks = (np.asarray(o)[0] for o in output)

    # Apply non-max suppression; cv2.dnn.NMSBoxes expects [x, y, width, height]
    xywh = [[float(x1), float(y1), float(x2 - x1), float(y2 - y1)] for x1, y1, x2, y2 in boxes]
    indices = cv2.dnn.NMSBoxes(xywh, scores.astype(float).tolist(), confidence_threshold, iou_threshold)
    # Older OpenCV versions return an Nx1 array, newer ones a flat array
    indices = np.array(indices, dtype=int).reshape(-1)

    return boxes[indices], scores[indices], labels[indices], masks[indices]

def visualize(image, boxes, scores, labels, masks, orig_image_shape, ratio, padding, input_size=(640, 640)):
    """
    Visualize the detection results.
    """
    top, left = padding
    orig_h, orig_w = orig_image_shape
    for box, score, label, mask in zip(boxes, scores, labels, masks):
        # Undo the letterbox: remove the padding first, then the scaling
        box = np.array(box, dtype=np.float32)
        box[[0, 2]] -= left
        box[[1, 3]] -= top
        box /= ratio
        x1, y1, x2, y2 = (int(v) for v in box.clip(min=0))

        # Draw bounding box
        cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)

        # Draw segmentation mask (assumes a 2-D mask covering the letterboxed input)
        mask = (mask > 0.5).astype(np.uint8) * 255
        mask = cv2.resize(mask, (input_size[1], input_size[0]), interpolation=cv2.INTER_NEAREST)
        mask = mask[top:top + int(orig_h * ratio), left:left + int(orig_w * ratio)]
        mask = cv2.resize(mask, (orig_w, orig_h), interpolation=cv2.INTER_NEAREST)
        image_masked = cv2.bitwise_and(image, image, mask=mask)
        image = cv2.addWeighted(image, 1, image_masked, 0.5, 0)

    return image

# Load the ONNX model
ort_session = ort.InferenceSession("yolov8.onnx")

# Load an example image
image_path = 'input.jpg'
image = cv2.imread(image_path)
orig_image_shape = image.shape[:2]

# Preprocess the image
input_image, ratio, padding = preprocess_image(image)
input_image = np.expand_dims(input_image, axis=0)

# Perform inference
outputs = ort_session.run(None, {'images': input_image})

# Postprocess the output
filtered_boxes, filtered_scores, filtered_labels, filtered_masks = postprocess_output(outputs)

# Visualize the results
visualized_image = visualize(image, filtered_boxes, filtered_scores, filtered_labels, filtered_masks, orig_image_shape, ratio, padding)

# Display the result
cv2.imshow('Detection and Segmentation', visualized_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
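
Letterboxing changes the coordinate system: predictions come back in the padded 640x640 frame and have to be mapped to the original image by removing the padding first and then dividing by the scale ratio. A dependency-free sketch of that arithmetic, using the hypothetical helper names `letterbox_params` and `unletterbox_box` and assuming centered padding:

```python
def letterbox_params(orig_h, orig_w, target_h=640, target_w=640):
    """Return (ratio, top, left) for an aspect-preserving resize with centered padding."""
    ratio = min(target_h / orig_h, target_w / orig_w)
    resized_h, resized_w = int(orig_h * ratio), int(orig_w * ratio)
    top = (target_h - resized_h) // 2
    left = (target_w - resized_w) // 2
    return ratio, top, left


def unletterbox_box(box, ratio, top, left, orig_h, orig_w):
    """Map an (x1, y1, x2, y2) box from the padded frame back to the original image."""

    def clamp(value, upper):
        # Keep the coordinate inside the original image
        return max(0.0, min(float(upper), value))

    x1, y1, x2, y2 = box
    x1, x2 = (x1 - left) / ratio, (x2 - left) / ratio
    y1, y2 = (y1 - top) / ratio, (y2 - top) / ratio
    return (clamp(x1, orig_w), clamp(y1, orig_h), clamp(x2, orig_w), clamp(y2, orig_h))


# A 1280x720 (width x height) frame letterboxed to 640x640
ratio, top, left = letterbox_params(orig_h=720, orig_w=1280)
print(ratio, top, left)
mapped = unletterbox_box((320, 320, 480, 400), ratio, top, left, 720, 1280)
print(mapped)
```

In this example the frame is scaled by 0.5 and padded by 140 rows at the top, so a box at y = 320 in the padded frame lands at y = 360 in the original image.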

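Code Notes

letterbox_image: resizes the image while keeping its original aspect ratio, padding the remainder.
preprocess_image: preprocessing function that resizes the image to the size the model expects and converts it to the appropriate format.
postprocess_output: post-processing function that applies non-maximum suppression (NMS) to the model output, filtering out low-confidence and overlapping predictions.
visualize: visualization function that draws bounding boxes and segmentation masks.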
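
The non-maximum suppression performed in postprocess_output can be illustrated without OpenCV. The sketch below is a plain greedy NMS over (x1, y1, x2, y2) boxes. It is meant to show the idea rather than reproduce `cv2.dnn.NMSBoxes` exactly, and the names `box_iou` and `greedy_nms` are made up for this example.

```python
def box_iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, confidence_threshold=0.5, iou_threshold=0.5):
    """Return the indices of the boxes to keep, highest score first."""
    # Drop low-confidence boxes, then visit the rest in descending score order
    order = sorted(
        (i for i, s in enumerate(scores) if s >= confidence_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box
        if all(box_iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [
    (10, 10, 110, 60),     # ship A
    (12, 12, 112, 62),     # near-duplicate of ship A
    (200, 80, 300, 130),   # ship B
    (400, 400, 420, 410),  # low-confidence clutter
]
scores = [0.92, 0.85, 0.78, 0.30]
kept = greedy_nms(boxes, scores)
print(kept)
```

The near-duplicate of ship A is suppressed because its IoU with the higher-scoring box exceeds the threshold, and the clutter box never enters the loop because its score is below the confidence threshold.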

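Notes

Before running this code, make sure you have trained a YOLOv8 model and exported it to ONNX format. You can get the corresponding code from the official YOLOv8 repository or use a pretrained model.
This example assumes the model output consists of four parts: bounding-box coordinates, confidence scores, class labels, and segmentation masks.
For real applications you may need to consider further factors, such as the model's input and output layout and the details of post-processing. The code is only an example and may need adjusting to specific requirements. Also make sure all required libraries are installed and the environment is configured correctly.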
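
Because the examples assume four output tensors, it can help to check what a given ONNX file actually returns before running them. With ONNX Runtime, output names and shapes can be read from `session.get_outputs()`. The sketch below only classifies such a list of shapes. The helper `describe_layout` and its labels are hypothetical. The "raw" case reflects what a default segmentation export is commonly reported to look like (one prediction tensor plus one mask-prototype tensor) and should be verified against the actual model.

```python
def describe_layout(output_shapes):
    """
    Classify a model's output layout from its tensor shapes.
    output_shapes: list of shape tuples, for example collected with
    [tuple(o.shape) for o in session.get_outputs()] in ONNX Runtime.
    """
    if len(output_shapes) == 4:
        # Layout assumed by the examples: boxes, scores, labels, masks
        return "postprocessed"
    if len(output_shapes) == 2:
        preds, protos = output_shapes
        # Assumed raw segmentation head: (batch, channels, anchors) predictions
        # plus (batch, n_protos, h, w) mask prototypes
        if len(preds) == 3 and len(protos) == 4:
            return "raw"
    return "unknown"


# Shapes below are illustrative, not read from a real model
four_outputs = describe_layout([(1, 100, 4), (1, 100), (1, 100), (1, 100, 160, 160)])
raw_outputs = describe_layout([(1, 37, 8400), (1, 32, 160, 160)])
detect_only = describe_layout([(1, 5, 8400)])
print(four_outputs, raw_outputs, detect_only)
```

If the result is "raw", the boxes have to be decoded and the masks assembled from the prototypes before the visualization code in this article applies.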
