Converting the VisDrone Dataset to YOLO Format: From UAV Small-Object Detection Data to YOLOv8 Training Data

张建站

2026/5/14 23:24:47

10 min read

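1. Project background

VisDrone is one of the most widely used datasets for small-object detection in UAV (drone) scenes. It contains a large number of aerial images with targets such as pedestrians, cars, bicycles, tricycles, and buses, which makes it well suited to studying dense small-object detection, occluded-object detection, and lightweight detection for edge deployment on drones.

However, the original VisDrone annotation format cannot be used directly to train YOLOv8. YOLOv8 expects a fixed directory layout and label format, so the raw VisDrone annotations have to be converted to YOLO format before training. This article documents the full approach and code for converting the VisDrone2019-DET dataset to YOLO format.

2. VisDrone original annotation format

Each VisDrone image has a corresponding .txt annotation file. Each line in the file describes one bounding box, in this format:

    x, y, w, h, score, category, truncation, occlusion

The fields have the following meanings:

| Field | Meaning |
|---|---|
| x | x coordinate of the box's top-left corner (pixels) |
| y | y coordinate of the box's top-left corner (pixels) |
| w | Box width (pixels) |
| h | Box height (pixels) |
| score | Score; in ground-truth files this is a 0/1 flag (1: the box is considered in evaluation, 0: it is ignored) |
| category | Class ID |
| truncation | Degree of truncation |
| occlusion | Degree of occlusion |

The fields that matter for conversion are x, y, w, h, and category, which determine each object's position and class.

3. YOLO label format

YOLO labels use the following format, one object per line:

    class_id x_center y_center width height

In addition:

- class IDs must start at 0;
- all coordinates are normalized to the range 0~1;
- each image has a .txt label file with the same base name;
- label files and image files follow a fixed directory layout.

For example:

    3 0.521233 0.618293 0.032111 0.041225
    0 0.217833 0.348901 0.012655 0.028411

In the first line, 3 is the class ID, 0.521233 is the x coordinate of the box center, 0.618293 is the y coordinate of the box center, 0.032111 is the box width, and 0.041225 is the box height. The x values are fractions of the image width and the y values are fractions of the image height.

4. Class ID mapping

The original VisDrone classes are:

    0: ignored regions
    1: pedestrian
    2: people
    3: bicycle
    4: car
    5: van
    6: truck
    7: tricycle
    8: awning-tricycle
    9: bus
    10: motor
    11: others

For object detection we usually keep only the 10 valid classes 1~10 and drop 0 (ignored regions) and 11 (others). Because YOLO class IDs must start at 0, the classes are remapped as follows:

| VisDrone class | YOLO class | Class name |
|---|---|---|
| 1 | 0 | pedestrian |
| 2 | 1 | people |
| 3 | 2 | bicycle |
| 4 | 3 | car |
| 5 | 4 | van |
| 6 | 5 | truck |
| 7 | 6 | tricycle |
| 8 | 7 | awning-tricycle |
| 9 | 8 | bus |
| 10 | 9 | motor |

In the code, the filtering and remapping are done with these statements (the check and the subtraction run inside the per-line loop):

    VALID_CLASSES = set(range(1, 11))

    if category not in VALID_CLASSES:
        continue

    cls = category - 1

5. Coordinate format conversion

VisDrone uses a top-left-corner format:

    x, y, w, h

where x, y are the coordinates of the box's top-left corner and w, h are its width and height, all in pixels.

YOLO uses a center-point format:

    x_center, y_center, width, height

and the values must be normalized by the image size. The conversion formulas are:

    x_center = (x1 + new_w / 2) / img_w
    y_center = (y1 + new_h / 2) / img_h
    box_w = new_w / img_w
    box_h = new_h / img_h

Here img_w and img_h are the image width and height, x1 and y1 are the top-left corner after clipping, and new_w and new_h are the box width and height after clipping (see the next section).

6. Why clip boxes to the image boundary

In real datasets, some boxes extend past the image: the box may start beyond the left or top edge, or its bottom-right corner may lie outside the image. Left unhandled, such boxes can cause errors during YOLO training or hurt training stability.

The code therefore clips each box to the image:

    x1 = max(0, x)
    y1 = max(0, y)
    x2 = min(img_w, x + w)
    y2 = min(img_h, y + h)

This guarantees that the box coordinates always stay inside the image. The width and height are then recomputed from the clipped corners:

    new_w = x2 - x1
    new_h = y2 - y1

Boxes that are too small after clipping are skipped:

    if new_w <= 1 or new_h <= 1:
        continue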
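The clipping and conversion steps from sections 5 and 6 can be tried on a single box before running the full script. The sketch below is only an illustration: `visdrone_box_to_yolo` is a hypothetical helper name, and the `<= 1` pixel threshold is an assumption about how "too small" is defined.

```python
from typing import Optional, Tuple


def visdrone_box_to_yolo(
    x: float, y: float, w: float, h: float, img_w: int, img_h: int
) -> Optional[Tuple[float, float, float, float]]:
    """Clip a top-left (x, y, w, h) pixel box to the image and return the
    normalized (x_center, y_center, width, height), or None if it is skipped."""
    if w <= 0 or h <= 0:
        return None
    # Clip both corners to the image bounds.
    x1, y1 = max(0.0, x), max(0.0, y)
    x2, y2 = min(float(img_w), x + w), min(float(img_h), y + h)
    new_w, new_h = x2 - x1, y2 - y1
    # Assumed threshold: skip boxes at most 1 pixel wide or tall after clipping.
    if new_w <= 1 or new_h <= 1:
        return None
    return (
        (x1 + new_w / 2) / img_w,
        (y1 + new_h / 2) / img_h,
        new_w / img_w,
        new_h / img_h,
    )


# A box that starts 10 px to the left of a 1000x500 image.
box = visdrone_box_to_yolo(-10, 20, 50, 40, img_w=1000, img_h=500)
print(" ".join(f"{v:.6f}" for v in box))  # 0.020000 0.080000 0.040000 0.080000

# A box lying entirely outside the image is dropped.
print(visdrone_box_to_yolo(1200, 20, 50, 40, img_w=1000, img_h=500))  # None
```

In the first call the 50 px wide box loses the 10 px that fall outside the image, so the clipped width is 40 px and the center moves to x = 20 px, which normalizes to 0.02.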
7. Project directory structure

Before conversion, the raw data is laid out as follows:

    datasets/VisDrone/raw/
    ├── VisDrone2019-DET-train/
    │   ├── images/
    │   └── annotations/
    ├── VisDrone2019-DET-val/
    │   ├── images/
    │   └── annotations/
    └── VisDrone2019-DET-test-dev/
        ├── images/
        └── annotations/

The conversion produces the YOLO data directories:

    datasets/VisDrone/
    ├── images/
    │   ├── train/
    │   ├── val/
    │   └── test/
    └── labels/
        ├── train/
        ├── val/
        └── test/

During training, YOLO locates each label file from the image path by swapping the images directory for labels and the image extension for .txt. For example:

    datasets/VisDrone/images/train/000001.jpg
    datasets/VisDrone/labels/train/000001.txt

8. Complete conversion code

    from pathlib import Path
    import shutil

    from PIL import Image
    from tqdm import tqdm

    # VisDrone original classes
    # 0: ignored regions
    # 1: pedestrian
    # 2: people
    # 3: bicycle
    # 4: car
    # 5: van
    # 6: truck
    # 7: tricycle
    # 8: awning-tricycle
    # 9: bus
    # 10: motor
    # 11: others
    #
    # YOLO class IDs must start at 0
    # VisDrone 1-10 -> YOLO 0-9
    VALID_CLASSES = set(range(1, 11))


    def convert_annotation(txt_path: Path, img_path: Path, save_path: Path):
        """Convert one VisDrone annotation file into one YOLO label file."""
        # Only the image size is needed, so close the file right away.
        with Image.open(img_path) as img:
            img_w, img_h = img.size

        yolo_lines = []

        with open(txt_path, "r", encoding="utf-8") as f:
            lines = f.readlines()

        for line in lines:
            parts = line.strip().split(",")
            if len(parts) < 8:
                continue

            x, y, w, h = map(float, parts[:4])
            category = int(parts[5])
            # parts[4] (score), parts[6] (truncation) and parts[7] (occlusion)
            # are not needed for the YOLO label.

            # Drop ignored regions and others; keep only classes 1-10
            if category not in VALID_CLASSES:
                continue

            # VisDrone classes 1-10 -> YOLO classes 0-9
            cls = category - 1

            # Filter invalid boxes
            if w <= 0 or h <= 0:
                continue

            # Keep the box inside the image
            x1 = max(0, x)
            y1 = max(0, y)
            x2 = min(img_w, x + w)
            y2 = min(img_h, y + h)

            new_w = x2 - x1
            new_h = y2 - y1

            if new_w <= 1 or new_h <= 1:
                continue

            # Convert to YOLO format: class x_center y_center width height
            x_center = (x1 + new_w / 2) / img_w
            y_center = (y1 + new_h / 2) / img_h
            box_w = new_w / img_w
            box_h = new_h / img_h

            yolo_lines.append(
                f"{cls} {x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}\n"
            )

        with open(save_path, "w", encoding="utf-8") as f:
            f.writelines(yolo_lines)


    def process_split(root: Path, split_name: str, raw_folder: str):
        """Copy the images of one split and write a label file for each image."""
        raw_dir = root / "raw" / raw_folder
        raw_img_dir = raw_dir / "images"
        raw_ann_dir = raw_dir / "annotations"

        out_img_dir = root / "images" / split_name
        out_label_dir = root / "labels" / split_name

        out_img_dir.mkdir(parents=True, exist_ok=True)
        out_label_dir.mkdir(parents=True, exist_ok=True)

        img_paths = sorted(raw_img_dir.glob("*.jpg"))

        print(f"\nProcessing {split_name}: {len(img_paths)} images")
        print(f"Image dir: {raw_img_dir}")
        print(f"Annotation dir: {raw_ann_dir}")

        for img_path in tqdm(img_paths):
            ann_path = raw_ann_dir / f"{img_path.stem}.txt"
            out_img_path = out_img_dir / img_path.name
            out_label_path = out_label_dir / f"{img_path.stem}.txt"

            shutil.copy2(img_path, out_img_path)

            if ann_path.exists():
                convert_annotation(ann_path, img_path, out_label_path)
            else:
                # No annotation file: write an empty label file
                out_label_path.write_text("", encoding="utf-8")


    def main():
        root = Path("datasets/VisDrone")

        process_split(root, "train", "VisDrone2019-DET-train")
        process_split(root, "val", "VisDrone2019-DET-val")
        process_split(root, "test", "VisDrone2019-DET-test-dev")

        print("\nConversion finished.")
        print("YOLO images saved to: datasets/VisDrone/images")
        print("YOLO labels saved to: datasets/VisDrone/labels")


    if __name__ == "__main__":
        main()

9. Running the conversion script

Run the script from the project root, because the dataset path in main() is relative:

    python scripts/convert_visdrone_to_yolo.py

If everything works, the output looks similar to this:

    Processing train: 6471 images
    Processing val: 548 images
    Processing test: 1610 images
    Conversion finished.

After the conversion, check the image and label counts (PowerShell):

    (Get-ChildItem datasets\VisDrone\images\train -Filter *.jpg).Count
    (Get-ChildItem datasets\VisDrone\labels\train -Filter *.txt).Count
    (Get-ChildItem datasets\VisDrone\images\val -Filter *.jpg).Count
    (Get-ChildItem datasets\VisDrone\labels\val -Filter *.txt).Count

If the image count and the label count match for each split, the conversion has most likely succeeded.

10. YOLO data configuration file

After the conversion, a YOLO data configuration file is also needed:

    path: ./datasets/VisDrone
    train: images/train
    val: images/val
    test: images/test

    names:
      0: pedestrian
      1: people
      2: bicycle
      3: car
      4: van
      5: truck
      6: tricycle
      7: awning-tricycle
      8: bus
      9: motor

Save it as configs/visdrone.yaml. YOLOv8 can then be trained with:

    yolo detect train model=yolov8n.pt data=configs/visdrone.yaml imgsz=640 epochs=50 batch=8 device=0
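The names block in the data configuration has to line up with the category - 1 mapping from section 4. One way to keep the two in sync, sketched here, is to generate the YAML text from a single class table. `VISDRONE_CLASSES` and `build_data_yaml` are illustrative names, not part of the conversion script or of the Ultralytics API.

```python
# VisDrone category IDs kept for training (1-10), as listed in section 4.
VISDRONE_CLASSES = {
    1: "pedestrian",
    2: "people",
    3: "bicycle",
    4: "car",
    5: "van",
    6: "truck",
    7: "tricycle",
    8: "awning-tricycle",
    9: "bus",
    10: "motor",
}


def build_data_yaml(dataset_root: str = "./datasets/VisDrone") -> str:
    """Return the text of a YOLO data config whose class IDs are category - 1."""
    lines = [
        f"path: {dataset_root}",
        "train: images/train",
        "val: images/val",
        "test: images/test",
        "",
        "names:",
    ]
    for visdrone_id in sorted(VISDRONE_CLASSES):
        # Same shift as in the converter: VisDrone 1-10 -> YOLO 0-9.
        lines.append(f"  {visdrone_id - 1}: {VISDRONE_CLASSES[visdrone_id]}")
    return "\n".join(lines) + "\n"


yaml_text = build_data_yaml()
print(yaml_text)
```

Writing `yaml_text` to configs/visdrone.yaml gives the same content as the hand-written file, so a change to the kept classes only has to be made in one place.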
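11. Summary

This article converted the VisDrone2019-DET dataset to YOLO format. The core steps are:

- read the original VisDrone annotation files;
- filter out the invalid classes;
- remap class IDs from 1~10 to 0~9;
- convert the top-left-corner box format to the YOLO center-point format;
- normalize the coordinates;
- clip boxes that extend past the image;
- save images and labels in the standard YOLO directory layout.

Once this is done, YOLOv8 can read and train on the VisDrone dataset. It is also the foundation for later work on dense small-object detection in UAV scenes, lightweight detection network design, and edge deployment experiments.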
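Matching file counts do not show that the label contents are valid. As a further sanity check, the sketch below scans a label file for the properties the YOLO format requires: five fields per line, a class ID in 0~9, and coordinates in 0~1. `validate_label_file` is a hypothetical helper, and the demo writes a throwaway file rather than touching the real dataset.

```python
import tempfile
from pathlib import Path
from typing import List


def validate_label_file(label_path: Path, num_classes: int = 10) -> List[str]:
    """Return a list of human-readable problems found in one YOLO label file."""
    problems = []
    text = label_path.read_text(encoding="utf-8")
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are harmless
        where = f"{label_path.name}:{line_no}"
        parts = line.split()
        if len(parts) != 5:
            problems.append(f"{where}: expected 5 fields, got {len(parts)}")
            continue
        try:
            cls = int(parts[0])
            coords = [float(v) for v in parts[1:]]
        except ValueError:
            problems.append(f"{where}: non-numeric value")
            continue
        if not 0 <= cls < num_classes:
            problems.append(f"{where}: class {cls} outside 0..{num_classes - 1}")
        if any(not 0.0 <= v <= 1.0 for v in coords):
            problems.append(f"{where}: coordinate outside 0..1")
    return problems


# Demo on a temporary file: one valid line and one with a bad class ID.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.txt"
    demo.write_text(
        "3 0.521233 0.618293 0.032111 0.041225\n"
        "10 0.5 0.5 0.1 0.1\n",
        encoding="utf-8",
    )
    for problem in validate_label_file(demo):
        print(problem)
```

To check a whole split, the same function could be called for every file returned by `Path("datasets/VisDrone/labels/train").glob("*.txt")`. An empty label file, which the converter writes for images without usable annotations, yields no problems.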
