#unicode #text

bin+lib unic-cli

UNIC Command-Line Tools

3 breaking releases

0.9.0 Mar 3, 2019
0.8.0 Jan 2, 2019
0.7.0 Feb 7, 2018

#400 in Internationalization (i18n)

MIT/Apache

38KB
493

This package provides command-line tools that help developers with common Unicode tasks, such as generating, converting, and inspecting strings.

How to Install

Use Cargo to install the unic-cli package.

$ cargo install unic-cli

How to Use

Echo Command

The unic-echo command generates Unicode strings, similar to the common echo command.

# Normally, the output ends with a newline
$ unic-echo Hello سلام
Hello سلام

# Get the codepoints or UTF-8/UTF-16, instead of plain text
$ unic-echo Hello سلام -o codepoints
U+0048 U+0065 U+006C U+006C U+006F U+0020 U+0633 U+0644 U+0627 U+0645

# Or see how to escape as a Rust string literal
$ unic-echo Hello سلام -o rust-escape
"Hello \u{633}\u{644}\u{627}\u{645}"

# Also the input can be in codepoints, or UTF-8/UTF-16 hex
$ unic-echo -i codepoints U+0 U+20 U+41 U+A0 -o rust-escape
"\u{0} A\u{a0}"

$ unic-echo --help
Write arguments to the standard output

USAGE:
    unic-echo [FLAGS] [OPTIONS] [STRINGS]...

FLAGS:
    -h, --help          Prints help information
    -n, --no-newline    No trailing newline
    -V, --version       Prints version information

OPTIONS:
    -i, --input <FORMAT>     Specify input format (see list below)
    -o, --output <FORMAT>    Specify output format (see list below)

ARGS:
    <STRINGS>...    Input strings (expected valid Unicode)

INPUT FORMATS:
    plain                   Plain Unicode characters (default)
    codepoints              Unicode codepoints (hex)
    utf8-hex                UTF-8 bytes (hex)
    utf16-hex               UTF-16 words (hex)

OUTPUT FORMATS:
    plain                   Plain Unicode characters (default)
    codepoints              Unicode codepoints (hex)
    utf8-hex                UTF-8 bytes (hex)
    utf16-hex               UTF-16 words (hex)

    braces-escape           String literal with \u{...} escapes for
    | js6-escape            control and non-ASCII characters
    | rust-escape

    braces-escape-all       String literal with \u{...} escapes for
    | js6-escape-all        all characters
    | rust-escape-all

    braces-escape-control   String literal with \u{...} escapes for
    | js6-escape-control    control characters
    | rust-escape-control
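To make the conversions above concrete, here is a minimal sketch in Rust that uses only the standard library. It is not the unic-cli implementation: the function names are hypothetical, and the handling of quotes and backslashes in the escaped output is an assumption. It mirrors three behaviors shown in the examples: parsing codepoints input, printing codepoints output, and escaping control and non-ASCII characters as \u{...}.

```rust
/// Parse one token such as `U+41` (the `U+` prefix is optional here)
/// into a Unicode scalar value. Hypothetical helper, not part of unic-cli.
fn parse_codepoint(token: &str) -> Result<char, String> {
    let hex = token
        .strip_prefix("U+")
        .or_else(|| token.strip_prefix("u+"))
        .unwrap_or(token);
    let value = u32::from_str_radix(hex, 16)
        .map_err(|e| format!("invalid hex in {:?}: {}", token, e))?;
    // Surrogates (U+D800..=U+DFFF) and values above U+10FFFF are rejected.
    std::char::from_u32(value)
        .ok_or_else(|| format!("{:?} is not a Unicode scalar value", token))
}

/// Decode a list of code point tokens into a string.
fn decode_codepoints(tokens: &[&str]) -> Result<String, String> {
    tokens.iter().map(|t| parse_codepoint(t)).collect()
}

/// Format each character as `U+XXXX`, zero-padded to at least four
/// hex digits and separated by spaces.
fn to_codepoints(s: &str) -> String {
    s.chars()
        .map(|c| format!("U+{:04X}", c as u32))
        .collect::<Vec<_>>()
        .join(" ")
}

/// Build a quoted string literal in which control and non-ASCII
/// characters become `\u{...}` escapes. Escaping `"` and `\` with a
/// backslash is an assumption made so the result is a valid literal.
fn to_rust_escape(s: &str) -> String {
    let mut out = String::from("\"");
    for c in s.chars() {
        match c {
            '"' | '\\' => {
                out.push('\\');
                out.push(c);
            }
            c if c.is_ascii() && !c.is_ascii_control() => out.push(c),
            c => out.push_str(&format!("\\u{{{:x}}}", c as u32)),
        }
    }
    out.push('"');
    out
}
```

With these helpers, decoding the tokens U+0 U+20 U+41 U+A0 and passing the result to to_rust_escape yields the same literal as the unic-echo example above, "\u{0} A\u{a0}".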

Inspector Command

The unic-inspector command lists the Unicode characters in the input string along with their properties.

$ unic-inspector Hello سلام
 H | U+0048 | LATIN CAPITAL LETTER H | Uppercase_Letter
 e | U+0065 | LATIN SMALL LETTER E   | Lowercase_Letter
 l | U+006C | LATIN SMALL LETTER L   | Lowercase_Letter
 l | U+006C | LATIN SMALL LETTER L   | Lowercase_Letter
 o | U+006F | LATIN SMALL LETTER O   | Lowercase_Letter
   | U+0020 | SPACE                  | Space_Separator
 س | U+0633 | ARABIC LETTER SEEN     | Other_Letter
 ل | U+0644 | ARABIC LETTER LAM      | Other_Letter
 ا | U+0627 | ARABIC LETTER ALEF     | Other_Letter
 م | U+0645 | ARABIC LETTER MEEM     | Other_Letter
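The per-character listing above can be approximated with the Rust standard library alone. The sketch below is not how unic-inspector works: the function names are hypothetical, character names are omitted because the standard library does not carry the Unicode name table, and the last column is a coarse class built from std predicates rather than the real General_Category value.

```rust
/// Coarse classification from std predicates only. This is an
/// approximation, not the Unicode General_Category property.
fn coarse_class(c: char) -> &'static str {
    if c.is_control() {
        "control"
    } else if c.is_whitespace() {
        "whitespace"
    } else if c.is_uppercase() {
        "uppercase"
    } else if c.is_lowercase() {
        "lowercase"
    } else if c.is_alphabetic() {
        "alphabetic"
    } else if c.is_numeric() {
        "numeric"
    } else {
        "other"
    }
}

/// Produce one row per character: the character itself, its code
/// point as `U+XXXX`, and the coarse class.
fn inspect(s: &str) -> Vec<String> {
    s.chars()
        .map(|c| format!(" {} | U+{:04X} | {}", c, c as u32, coarse_class(c)))
        .collect()
}
```

Reproducing the full table, including names such as LATIN CAPITAL LETTER H, would require Unicode Character Database data, which this sketch deliberately leaves out.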

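In the near future, this tool will allow selecting which properties to display, and will support grouping characters by grapheme cluster, word, sentence, and so on.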

Dependencies

~7.5MB
~112K SLoC