Self-Study Notes on HarmonyOS API 13: Implementing Multi-Object Detection (Object Detection)

news2026/2/10 1:58:25

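Getting Started: What Is Multi-Object Detection?

Multi-object detection locates and classifies several objects in a single image. It shows up everywhere, from animal recognition in everyday life and scene classification in smart photo albums to inspection tasks in industry. This time I decided to study HarmonyOS's latest Object Detection API (API 13), work out step by step how to build a multi-object detection app, and verify what I learned by completing a full project myself.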

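Thinking First

Before going deeper, I thought through where this technology could be applied:

Smart image classification: automatically sort the photos a user takes, for example into scenery, buildings, and people.
Industrial inspection: identify quality problems such as defects or anomalies in products on a production line.
Unmanned retail: analyze how products are distributed in a shopping scene to improve recommendation accuracy.
Traffic monitoring: detect vehicles and pedestrians to analyze traffic conditions.
AR interaction: combine multi-object detection with AR to interact with surrounding objects in real time.

Working through this list showed me how broad the potential of multi-object detection is, and it pushed me to understand the implementation logic behind it more systematically.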

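Stage 1: Understanding What the Object Detection API Does

HarmonyOS's Object Detection API provides the following capabilities:

Category recognition: identifies the category of each object in the image, such as scenery, animals, or plants.
Bounding box generation: produces a bounding box for each detected object, which makes later processing easier.
Confidence scores: attaches a confidence score to each object as a measure of how reliable the result is.
Multi-object support: detects multiple objects in a single image at the same time.

These capabilities are the focus of my study and practice this time.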
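
To make these four capabilities concrete, here is a minimal sketch of how a detection result might be modelled and filtered by confidence. The `BoundingBox` and `DetectedObject` shapes and the `filterByConfidence` helper are hypothetical names I made up for illustration; they are not the types exported by the Core Vision Kit, so check the official API reference for the real field names.

```typescript
// Hypothetical shapes for illustration only; not the Core Vision Kit types.
interface BoundingBox {
  left: number;
  top: number;
  width: number;
  height: number;
}

interface DetectedObject {
  label: string;    // category name, e.g. "animal"
  score: number;    // confidence, assumed here to lie in [0, 1]
  box: BoundingBox; // location in image pixel coordinates
}

// Keep only detections whose confidence meets the threshold,
// sorted from most to least confident. The input array is not modified.
function filterByConfidence(objects: DetectedObject[], minScore: number): DetectedObject[] {
  return objects
    .filter((o) => o.score >= minScore)
    .sort((a, b) => b.score - a.score);
}

// One image, several objects: the "multi-object" part of the API.
const sample: DetectedObject[] = [
  { label: "plant", score: 0.42, box: { left: 10, top: 10, width: 50, height: 80 } },
  { label: "animal", score: 0.91, box: { left: 120, top: 40, width: 200, height: 160 } },
  { label: "scenery", score: 0.77, box: { left: 0, top: 0, width: 640, height: 480 } },
];

const confident = filterByConfidence(sample, 0.5);
console.log(confident.map((o) => `${o.label}: ${o.score}`).join(", "));
// prints "animal: 0.91, scenery: 0.77"
```

The 0.5 threshold is arbitrary; a sensible value depends on how costly false positives are in the target scenario.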

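Stage 2: Project Initialization and Permission Configuration

To make sure the multi-object detection service runs properly, I first configured the project's permissions. The required permission configuration is as follows: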

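{
  "module": {
    "abilities": [
      {
        "name": "ObjectDetectionAbility"
      }
    ],
    "requestPermissions": [
      { "name": "ohos.permission.INTERNET" },
      { "name": "ohos.permission.READ_MEDIA" },
      { "name": "ohos.permission.WRITE_MEDIA" }
    ]
  }
}

In the Stage model these entries belong in module.json5 under the module-level requestPermissions list, with each permission given as a name field, rather than inside an ability entry. With this configuration, the project can read the user's image files and interact with HarmonyOS's AI service interfaces. Permissions that require user authorization also need reason and usedScene fields, and READ_MEDIA and WRITE_MEDIA may be deprecated in recent SDK versions, so check the current permission list before relying on them.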

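Stage 3: Implementing the Core Detection Logic

Initializing and Destroying the Detector

The multi-object detection service needs a detector instance to be initialized, and that instance should be destroyed when it is no longer needed so its resources are released. Here is the code: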

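// visionBase is needed by detectObjects() for the request type and scene mode.
import { objectDetection, visionBase } from '@kit.CoreVisionKit';

// Shared detector instance; stays undefined until initializeDetector() succeeds.
let detector: objectDetection.ObjectDetector | undefined = undefined;

async function initializeDetector(): Promise<void> {
    detector = await objectDetection.ObjectDetector.create();
    console.info('Object detector initialized');
}

async function destroyDetector(): Promise<void> {
    if (detector) {
        await detector.destroy();
        // Clear the reference so later calls see the detector as uninitialized.
        detector = undefined;
        console.info('Object detector destroyed');
    }
}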
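
The create-then-destroy pairing above is easy to get wrong when an error is thrown between the two calls. As a sketch of one way to guard against that, the helper below wraps any resource that has an async `destroy()` method in `try`/`finally`. The `Destroyable` interface, `withResource`, and `FakeDetector` are all hypothetical names for illustration; `FakeDetector` stands in for the real detector so the example runs outside HarmonyOS.

```typescript
// Hypothetical minimal interface: anything with an async destroy() method.
interface Destroyable {
  destroy(): Promise<void>;
}

// Create a resource, run the task, and always destroy the resource afterwards,
// even if the task throws.
async function withResource<R extends Destroyable, T>(
  create: () => Promise<R>,
  task: (resource: R) => Promise<T>,
): Promise<T> {
  const resource = await create();
  try {
    return await task(resource);
  } finally {
    await resource.destroy();
  }
}

// Stand-in for a real detector so the sketch runs anywhere.
class FakeDetector implements Destroyable {
  destroyed = false;

  async process(input: string): Promise<string[]> {
    return [`object-in-${input}`];
  }

  async destroy(): Promise<void> {
    this.destroyed = true;
  }
}

async function demo(): Promise<void> {
  const fake = new FakeDetector();
  const labels = await withResource(async () => fake, (d) => d.process("photo"));
  // By the time we get here, destroy() has already run.
  console.log(labels, fake.destroyed);
}

demo();
```

A module-level detector like the one in my code is still reasonable when the same instance serves many detections; the wrapper fits one-off jobs better.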

Loading the Image and Running Detection

The core of multi-object detection is loading the image and calling the process method on it:

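async function detectObjects(imageUri: string): Promise<void> {
    if (!detector) {
        console.error('Detector is not initialized');
        return;
    }

    const pixelMap = await loadPixelMap(imageUri);
    try {
        // visionBase must be imported from '@kit.CoreVisionKit'.
        const request: visionBase.Request = {
            inputData: { pixelMap: pixelMap },
            scene: visionBase.SceneMode.FOREGROUND,
        };

        const response = await detector.process(request);

        if (response.objects.length === 0) {
            console.info('No objects detected');
        } else {
            // Log the first label and the confidence score of each detected object.
            response.objects.forEach((object, index) => {
                console.info(`Object ${index + 1}: label - ${object.labels[0]}, score - ${object.score}`);
            });
        }
    } finally {
        // Release the PixelMap even if detection throws.
        await pixelMap.release();
    }
}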
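
Logging each object is fine for a first test, but results are usually aggregated before they are shown. The sketch below counts detections per category and keeps the best score for each. The `Detection` and `LabelSummary` types and the `summarizeByLabel` function are hypothetical and simplified (one string label per object); the objects returned by `process` may be shaped differently, so they would need mapping into this form first.

```typescript
// Hypothetical simplified detection record; the real response objects may differ.
interface Detection {
  label: string;
  score: number;
}

interface LabelSummary {
  count: number;     // how many objects of this category were found
  bestScore: number; // highest confidence seen for this category
}

// Group detections by label, counting them and tracking the best score.
function summarizeByLabel(detections: Detection[]): Map<string, LabelSummary> {
  const summary = new Map<string, LabelSummary>();
  for (const d of detections) {
    const entry = summary.get(d.label) ?? { count: 0, bestScore: 0 };
    entry.count += 1;
    entry.bestScore = Math.max(entry.bestScore, d.score);
    summary.set(d.label, entry);
  }
  return summary;
}

const detections: Detection[] = [
  { label: "car", score: 0.88 },
  { label: "person", score: 0.73 },
  { label: "car", score: 0.95 },
];

summarizeByLabel(detections).forEach((info, label) => {
  console.log(`${label}: ${info.count} object(s), best score ${info.bestScore}`);
});
```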

Helper: Loading the Image

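import { fileIo } from '@kit.CoreFileKit';
import { image } from '@kit.ImageKit';

async function loadPixelMap(imageUri: string): Promise<image.PixelMap> {
    console.info(`Loading image: ${imageUri}`);
    let file: fileIo.File | undefined = undefined;
    let imageSource: image.ImageSource | undefined = undefined;
    try {
        // Open the image file read-only
        file = await fileIo.open(imageUri, fileIo.OpenMode.READ_ONLY);
        imageSource = image.createImageSource(file.fd);

        // Decode the image into a PixelMap
        const pixelMap = await imageSource.createPixelMap();
        console.info('PixelMap loaded');
        return pixelMap;
    } catch (error) {
        console.error(`Failed to load image: ${JSON.stringify(error)}`);
        throw new Error('Failed to load PixelMap');
    } finally {
        // Release the decoder and close the file on both success and failure
        if (imageSource) {
            await imageSource.release();
        }
        if (file) {
            await fileIo.close(file);
        }
    }
}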

Stage 4: Designing the User Interface

So that users can easily pick an image and see the detection results, I built a simple user interface with ArkUI:

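// ArkUI built-in components (Column, Text, Button) need no import in ArkTS.
@Entry
@Component
struct ObjectDetectionPage {
    @State imageUri: string = '';

    // Create the detector when the page appears and free it when the page goes away.
    aboutToAppear(): void {
        initializeDetector();
    }

    aboutToDisappear(): void {
        destroyDetector();
    }

    build() {
        Column() {
            Text('Multi-Object Detection App')
                .fontSize(20)
                .textAlign(TextAlign.Center)
                .margin({ top: 20 })

            Button('Select Image')
                .height(50)
                .margin({ top: 10 })
                .onClick(() => this.onSelectImage())

            Button('Detect Objects')
                .height(50)
                .margin({ top: 10 })
                .onClick(() => this.onDetectObjects())
        }
        .width('100%')
    }

    onSelectImage(): void {
        // Placeholder path; a real app would let the user pick an image.
        this.imageUri = '/data/media/sample_image.jpg';
        console.info(`Image selected: ${this.imageUri}`);
    }

    async onDetectObjects(): Promise<void> {
        if (!this.imageUri) {
            console.error('No image selected');
            return;
        }
        await detectObjects(this.imageUri);
    }
}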
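
The page above only logs results. To draw bounding boxes over a displayed image, the box coordinates have to be converted from image pixels to view coordinates. The sketch below shows that arithmetic under two assumptions of mine: boxes are reported in the source image's pixel coordinates, and the image is shown scaled uniformly to fit inside the view and centred (a "contain" style fit). The `Box` and `Size` types and `scaleBoxToView` are hypothetical names, not part of any HarmonyOS API.

```typescript
// Hypothetical helper types for illustration.
interface Box {
  left: number;
  top: number;
  width: number;
  height: number;
}

interface Size {
  width: number;
  height: number;
}

// Map a box from image pixel coordinates into a view that shows the image
// scaled uniformly to fit (letterboxed and centred).
function scaleBoxToView(box: Box, imageSize: Size, viewSize: Size): Box {
  // Uniform scale factor that makes the whole image fit inside the view.
  const scale = Math.min(
    viewSize.width / imageSize.width,
    viewSize.height / imageSize.height,
  );
  // Empty margins left around the scaled image, split evenly on both sides.
  const offsetX = (viewSize.width - imageSize.width * scale) / 2;
  const offsetY = (viewSize.height - imageSize.height * scale) / 2;
  return {
    left: box.left * scale + offsetX,
    top: box.top * scale + offsetY,
    width: box.width * scale,
    height: box.height * scale,
  };
}

// A 1000x500 image shown in a 500x500 view: scale is 0.5, with 125 units
// of vertical padding above and below the image.
const viewBox = scaleBoxToView(
  { left: 100, top: 50, width: 200, height: 100 },
  { width: 1000, height: 500 },
  { width: 500, height: 500 },
);
console.log(viewBox);
// prints { left: 50, top: 150, width: 100, height: 50 }
```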