4 breaking releases

0.4.0	Feb 1, 2024
0.3.0	Jan 31, 2024
0.2.0	Jan 29, 2024
0.1.0	Jan 25, 2024

#333 in Images

113 downloads per month

Apache-2.0

53KB
1.5K SLoC

surya-rs

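Rust implementation of surya, a multilingual document OCR toolkit. The implementation is based on a modified version of Segformer and on OpenCV.

For the licensing of the weights, please refer to the original project.

Roadmap

This project is still in development; feel free to star it and check back.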

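model structure, segformer (inference only)
weights loading
image input pre-processing
heatmap and affinity map
bounding boxes
image splitting and stitching
text recognition
benchmark
quantization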

How to build and install

Set up the Rust toolchain if you haven't already

# visit https://rustup.rs/ for more detailed information
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Install llvm and opencv (example on Mac)

brew install llvm opencv

Build and install the binary

# run this first on a Mac with an M1 chip
export DYLD_FALLBACK_LIBRARY_PATH="$(xcode-select --print-path)/usr/lib/"
# run this first on other Macs
export DYLD_FALLBACK_LIBRARY_PATH="$(xcode-select --print-path)/Toolchains/XcodeDefault.xctoolchain/"
# optionally you can include features like accelerate, metal, mkl, etc.
cargo install --path . --features=cli

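The built binary does not include the weights files themselves; it downloads them through the HuggingFace Hub API. Once downloaded, the weights files are cached in the HuggingFace cache directory.

Use -h to view the help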
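
To find the cached weights on disk, it can help to know how the HuggingFace cache is usually laid out. The sketch below is not code from surya-rs. It assumes the common HuggingFace Hub convention: the cache root is `$HF_HOME` if set, otherwise `~/.cache/huggingface`, with model files under a `hub` subdirectory in a folder named `models--<owner>--<name>`. The function names are hypothetical.

```rust
use std::env;
use std::path::PathBuf;

/// Resolve the hub cache root from explicit inputs (kept pure so it is easy to test).
/// Assumption: a non-empty `HF_HOME` overrides the default `~/.cache/huggingface`,
/// and model files live under a `hub` subdirectory.
fn hub_cache_root(hf_home: Option<&str>, home_dir: &str) -> PathBuf {
    let mut root = match hf_home {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(home_dir).join(".cache").join("huggingface"),
    };
    root.push("hub");
    root
}

/// Folder name for a model repo: "owner/name" becomes "models--owner--name".
fn repo_folder_name(repo_id: &str) -> String {
    format!("models--{}", repo_id.replace('/', "--"))
}

/// Convenience wrapper that reads the real environment.
/// Falls back to the current directory if `HOME` is not set.
fn model_cache_dir(repo_id: &str) -> PathBuf {
    let hf_home = env::var("HF_HOME").ok();
    let home = env::var("HOME").unwrap_or_else(|_| String::from("."));
    hub_cache_root(hf_home.as_deref(), &home).join(repo_folder_name(repo_id))
}
```

Under these assumptions, calling `model_cache_dir("vikp/line_detector")` gives the directory where the default detection model would be expected to land; deleting that directory should force a fresh download on the next run.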

Surya is a multilingual document OCR toolkit, original implementation in Python and PyTorch

Usage: surya [OPTIONS] <IMAGE>

Arguments:
  <IMAGE>  path to image

Options:
      --batch-size <BATCH_SIZE>
          detection batch size, if not supplied defaults to 2 on CPU and 16 on GPU
      --model-repo <MODEL_REPO>
          detection model's hugging face repo [default: vikp/line_detector]
      --weights-file-name <WEIGHTS_FILE_NAME>
          detection model's weights file name [default: model.safetensors]
      --config-file-name <CONFIG_FILE_NAME>
          detection model's config file name [default: config.json]
      --non-max-suppression-threshold <NON_MAX_SUPPRESSION_THRESHOLD>
          a value between 0.0 and 1.0 to filter low density part of heatmap [default: 0.35]
      --extract-text-threshold <EXTRACT_TEXT_THRESHOLD>
          a value between 0.0 and 1.0 to filter out bbox with low heatmap density [default: 0.6]
      --bbox-area-threshold <BBOX_AREA_THRESHOLD>
          a pixel threshold to filter out small area bbox [default: 10]
      --polygons
          whether to output polygons json file
      --image
          whether to generate bbox image
      --heatmap
          whether to generate heatmap
      --affinity-map
          whether to generate affinity map
      --output-dir <OUTPUT_DIR>
          output directory, under which the input image will be generating a subdirectory [default: ./surya_output]
      --device <DEVICE_TYPE>
          [default: cpu] [possible values: cpu, gpu, metal]
      --verbose
          whether to enable verbose mode
  -h, --help
          Print help
  -V, --version
          Print version
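
The three threshold options are easier to reason about with a concrete picture of what each one filters. The following is a minimal sketch based only on the help text above, not the crate's actual implementation: `Candidate`, `binarize`, `keep_candidate`, and `filter_candidates` are hypothetical names, and whether the real comparisons are strict or inclusive is an assumption.

```rust
/// A hypothetical detected box; not a type from surya-rs.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    width: u32,
    height: u32,
    /// Mean heatmap value inside the box, in 0.0..=1.0.
    mean_density: f32,
}

/// Role of --non-max-suppression-threshold (default 0.35):
/// heatmap cells below the threshold are treated as background.
fn binarize(heatmap: &[f32], threshold: f32) -> Vec<bool> {
    heatmap.iter().map(|&v| v >= threshold).collect()
}

/// Roles of --extract-text-threshold (default 0.6) and
/// --bbox-area-threshold (default 10): drop boxes that are too
/// faint or too small. The comparison directions are assumptions.
fn keep_candidate(c: &Candidate, extract_text_threshold: f32, bbox_area_threshold: u32) -> bool {
    // Widen before multiplying so large boxes cannot overflow.
    let area = u64::from(c.width) * u64::from(c.height);
    area > u64::from(bbox_area_threshold) && c.mean_density >= extract_text_threshold
}

/// Apply both box-level filters to a list of candidates.
fn filter_candidates(
    candidates: &[Candidate],
    extract_text_threshold: f32,
    bbox_area_threshold: u32,
) -> Vec<Candidate> {
    candidates
        .iter()
        .filter(|c| keep_candidate(c, extract_text_threshold, bbox_area_threshold))
        .cloned()
        .collect()
}
```

In this reading, raising `--extract-text-threshold` trades recall for precision on faint text, while `--bbox-area-threshold` mainly removes specks and noise.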

You can also use this environment variable to control the log level

export SURYA_LOG=warn # or debug, warn, etc.
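
For readers unfamiliar with level-based log filtering, here is a small sketch of how a value such as `warn` typically maps to what gets printed. It is not taken from surya-rs: it assumes `SURYA_LOG` accepts the usual Rust level names (error, warn, info, debug, trace), and the `Level` enum, `parse_level`, `enabled`, and the `Info` fallback are all hypothetical.

```rust
/// Log levels ordered from least to most verbose.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

/// Parse a level name case-insensitively, ignoring surrounding whitespace.
/// Unset or unrecognized values fall back to `Info` (an assumed default).
fn parse_level(raw: Option<&str>) -> Level {
    let normalized = raw.map(|s| s.trim().to_ascii_lowercase());
    match normalized.as_deref() {
        Some("error") => Level::Error,
        Some("warn") => Level::Warn,
        Some("info") => Level::Info,
        Some("debug") => Level::Debug,
        Some("trace") => Level::Trace,
        _ => Level::Info,
    }
}

/// A message is emitted when it is no more verbose than the configured level.
fn enabled(configured: Level, message: Level) -> bool {
    message <= configured
}
```

With `SURYA_LOG=warn`, this model lets errors and warnings through and suppresses info, debug, and trace messages.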

Library

This library is also published as a crate for use in other Rust projects.

Dependencies

~10–23MB
~284K SLoC