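Image Segmentation

Feature description

Image segmentation precisely separates the different objects or regions in an image. Unlike conventional image processing, it determines which object each pixel belongs to, which enables fine-grained object separation, and it returns the outline or mask of each object.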

Supported models

SAM3

How to call the API

Prerequisites

An API token (create one in the 模力方舟 console)
An image to process (a local file or a remote image URL)
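
The token from the prerequisites can be kept out of source code. The following is a minimal sketch, assuming the token is stored in an environment variable; the variable name `GITEE_AI_API_TOKEN` and both helper functions are hypothetical and are not defined by the platform.

```python
import os


def load_api_token(env_var: str = "GITEE_AI_API_TOKEN") -> str:
    """Read the API token from an environment variable (name is an assumption)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your API token")
    return token


def build_headers(token: str) -> dict:
    """Build the Bearer authorization header used by the request example."""
    return {"Authorization": f"Bearer {token}"}
```

Failing early on a missing token tends to give a clearer error than an authentication failure returned by the server.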

Calling from Python (example)

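import mimetypes
import os

import requests

API_URL = "https://ai.gitee.com/v1/images/segmentation"
API_TOKEN = "<your_api_token>"  # Replace with your real token

headers = {
    "Authorization": f"Bearer {API_TOKEN}"
}

def cut_image(image_path, prompt=""):
    """Segment a local file or a remote image URL and return the parsed JSON."""
    data = {
        "model": "sam3",
        "prompt": prompt
    }

    if image_path.startswith(("http://", "https://")):
        # Remote image: pass the URL directly
        data["image"] = image_path
        response = requests.post(API_URL, headers=headers, data=data, timeout=60)
    else:
        # Local image: upload the file; guess the MIME type from the extension
        mime_type = mimetypes.guess_type(image_path)[0] or "application/octet-stream"
        with open(image_path, "rb") as image_file:
            files = {
                "image": (os.path.basename(image_path), image_file, mime_type)
            }
            response = requests.post(API_URL, headers=headers, files=files, data=data, timeout=60)

    response.raise_for_status()  # Raise on HTTP errors instead of parsing an error body
    return response.json()

# Example 1: segment the people in a local image
result = cut_image("path/to/your/photo.jpg", prompt="person")
print(f"Found {result['num_segments']} person segment(s)")

# Example 2: segment every object in a remote image (no prompt)
result2 = cut_image("https://example.com/image.jpg")
print(f"Detected {result2['num_segments']} object(s)")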

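API parameters

Parameter	Type	Required	Description	Example
model	string	Yes	Model name; always `sam3`	`"sam3"`
image	file/string	Yes	Image file, or the URL of a remote image	local file or `"https://..."`
prompt	string	No	Prompt naming the objects to segment	`"person"`, `"car"`, `"dog,cat"`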
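
The parameter table can be turned into a small helper that assembles the form fields. This is a sketch only: `build_form_fields` and `is_remote_image` are hypothetical names, and joining several targets with commas is an assumption based on the `"dog,cat"` example value in the table.

```python
from typing import Iterable, Optional


def build_form_fields(targets: Optional[Iterable[str]] = None,
                      model: str = "sam3") -> dict:
    """Assemble the non-file form fields for a segmentation request."""
    fields = {"model": model}
    # Drop empty entries and surrounding whitespace before joining.
    cleaned = [t.strip() for t in (targets or []) if t.strip()]
    if cleaned:
        # prompt is optional, so it is only sent when at least one target is given.
        fields["prompt"] = ",".join(cleaned)
    return fields


def is_remote_image(image: str) -> bool:
    """Return True when the image argument is an http(s) URL rather than a local path."""
    return image.startswith(("http://", "https://"))
```

For a remote image, the URL would be added to these fields under the `image` key; for a local file, the file would be sent as a multipart upload instead.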

Handling the response

Result data structure

The API returns JSON data with the following fields:

{
  "num_segments": 3,
  "segments": [
    {
      "id": 1,
      "label": "person",
      "confidence": 0.95,
      "bbox": [120, 80, 320, 450],
      "mask": {
        "encoding": "rle",
        "size": [512, 512],
        "counts": "H4sIAAAAAAAA..."
      }
    },
    {
      "id": 2,
      "label": "dog",
      "confidence": 0.87,
      "bbox": [350, 300, 480, 420],
      "mask": {...}
    }
  ]
}
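
A response shaped like the example above can be filtered and summarized with a few lines of Python. This is a sketch under stated assumptions: the `sample` dictionary is a trimmed copy of the example with the mask omitted, the helper names are hypothetical, and `bbox` is assumed to be `[x_min, y_min, x_max, y_max]`, which the source does not state explicitly.

```python
from collections import Counter

# Trimmed copy of the example response (mask fields omitted).
sample = {
    "segments": [
        {"id": 1, "label": "person", "confidence": 0.95, "bbox": [120, 80, 320, 450]},
        {"id": 2, "label": "dog", "confidence": 0.87, "bbox": [350, 300, 480, 420]},
    ]
}


def confident_segments(result: dict, min_confidence: float = 0.9) -> list:
    """Keep only segments whose confidence reaches the threshold."""
    return [s for s in result.get("segments", [])
            if s.get("confidence", 0.0) >= min_confidence]


def count_by_label(result: dict) -> Counter:
    """Count how many segments were returned for each label."""
    return Counter(s["label"] for s in result.get("segments", []))


def bbox_size(bbox: list) -> tuple:
    """Return (width, height), assuming bbox is [x_min, y_min, x_max, y_max]."""
    x_min, y_min, x_max, y_max = bbox
    return (x_max - x_min, y_max - y_min)
```

If the service actually reports boxes as `[x, y, width, height]`, `bbox_size` would need to return the last two values directly.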

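Mask decoding (front end / client)

The mask field returned by the API carries COCO RLE counts. Decode them together with the image height and width to obtain an H × W two-dimensional 0/1 mask matrix.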

RLE decoding example (TypeScript)

/**
 * Decode a mask in COCO RLE format
 * @param rleString - RLE-encoded string
 * @param height - image height
 * @param width - image width
 * @returns 2D array [height][width]; 1 marks foreground, 0 marks background
 */
export function rleDecode(
  rleString: string,
  height: number,
  width: number,
): number[][] {
  rleString = atob(rleString);
  // Decode the RLE counts
  const counts: number[] = [];
  let p = 0;
  while (p < rleString.length) {
    let x = 0;
    let k = 0;
    let more = true;
    while (more) {
      const c = rleString.charCodeAt(p) - 48;
      x |= (c & 0x1f) << (5 * k);
      more = (c & 0x20) !== 0;
      p++;
      k++;
      if (!more && (c & 0x10) !== 0) {
        x |= -1 << (5 * k);
      }
    }
    if (counts.length > 2) {
      x += counts[counts.length - 2];
    }
    counts.push(x);
  }
  // Build a flat array from the RLE counts (column-major order)
  const total = height * width;
  const mask = new Uint8Array(total);
  let pos = 0;
  let val = 0;
  for (const count of counts) {
    for (let i = 0; i < count && pos < total; i++) {
      mask[pos] = val;
      pos++;
    }
    val = 1 - val; // toggle between 0 and 1
  }
  // Convert the flat array to a 2D array (column-major to row-major)
  const result: number[][] = Array.from({ length: height }, () =>
    new Array(width).fill(0),
  );
  // COCO RLE stores pixels in column-major (Fortran) order
  let idx = 0;
  for (let x = 0; x < width; x++) {
    for (let y = 0; y < height; y++) {
      if (idx < total) {
        result[y][x] = mask[idx];
        idx++;
      }
    }
  }
  return result;
}
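
For Python clients, the same decoding steps can be sketched with the standard library only. As in the TypeScript example, this assumes `counts` is Base64 text wrapping a COCO compressed counts string; the function names are hypothetical, and the sketch has not been checked against real service output.

```python
import base64


def decode_counts(encoded: str) -> list:
    """Decode a COCO compressed counts string into a list of run lengths."""
    counts = []
    p = 0
    while p < len(encoded):
        x, k, more = 0, 0, True
        while more:
            c = ord(encoded[p]) - 48
            x |= (c & 0x1F) << (5 * k)      # low 5 bits carry data
            more = bool(c & 0x20)           # bit 5 flags a continuation char
            p += 1
            k += 1
            if not more and (c & 0x10):
                x |= -1 << (5 * k)          # sign-extend negative deltas
        if len(counts) > 2:
            x += counts[-2]                 # later runs are stored as deltas
        counts.append(x)
    return counts


def rle_decode(counts_b64: str, height: int, width: int) -> list:
    """Return a row-major height x width list of 0/1 values."""
    encoded = base64.b64decode(counts_b64).decode("ascii")
    flat, value = [], 0
    for run in decode_counts(encoded):
        flat.extend([value] * run)          # runs alternate, starting with 0
        value = 1 - value
    total = height * width
    flat = (flat + [0] * total)[:total]     # pad or trim to the image size
    # COCO stores pixels column by column, so index as x * height + y.
    return [[flat[x * height + y] for x in range(width)] for y in range(height)]
```

Because COCO RLE is column-major, the final list comprehension transposes the flat data into rows, matching the second loop of the TypeScript version.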

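See the 模力方舟 sample code repository for more example code.

Online demo

You can also try image segmentation directly online.

How to use

Upload a reference image -> enter a prompt -> click Run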


Example results

Reference image	prompt	Result
	kids
	apple

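Use cases

With the image segmentation API you can add visual analysis to an application without training a model or building your own infrastructure, and get pixel-level segmentation results:

Basic use cases

Smart matting: automatically separate subjects such as people or products and remove the background
Object counting: count specific objects in an image
Image annotation: prepare labeled data for training AI models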
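
The matting and annotation use cases both start from a decoded 0/1 mask. The sketch below assumes the mask is already a row-major list of rows and that the image pixels are available as rows of `(r, g, b)` tuples; both helpers are hypothetical and use no imaging library.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]
RGB = Tuple[int, int, int]


def apply_alpha(pixels: List[List[RGB]], mask: Mask) -> list:
    """Matting: turn RGB pixels into RGBA, making background pixels transparent."""
    return [
        [(r, g, b, 255 if m else 0) for (r, g, b), m in zip(pixel_row, mask_row)]
        for pixel_row, mask_row in zip(pixels, mask)
    ]


def mask_to_bbox(mask: Mask) -> Optional[Tuple[int, int, int, int]]:
    """Annotation: inclusive (x_min, y_min, x_max, y_max) of the foreground, or None."""
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))
```

In practice the RGBA rows would be handed to an imaging library to write a transparent PNG, and the bounding box could be stored alongside the label as annotation data.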

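Practical applications

# Example 1: e-commerce product cutouts
product_result = cut_image("product_photo.jpg", prompt="shoes, handbag")
# Returns a precise outline for each product, usable for transparent-background product images

# Example 2: surveillance footage analysis
security_result = cut_image("camera_feed.jpg", prompt="person, vehicle")
# Detects the people and vehicles in the frame for security monitoring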
