37 releases

0.16.0 May 27, 2024
0.15.6 Dec 31, 2023
0.15.5 Nov 28, 2023
0.14.2 Jul 27, 2023
0.0.3 Feb 23, 2022

#76 in Parser implementations


75,257 downloads per month
Used in 184 crates (6 directly)

MIT/Apache

125KB
2.5K SLoC

Fast &[u8] to integer parser for Rust

Crate API

SIMD (fast) parsing is supported on x86_64 (SSE4.1, AVX2) and Arm64 (aarch64, Neon). Even on CPUs without SIMD support, it is still faster than str::parse.

Supports negative values and validates the input.

Supported output types: u8, i8, u16, i16, u32, i32, u64, i64, u128, i128, usize, isize.

Has good test coverage and can be considered safe.
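To make "validates the input" concrete, here is a minimal scalar sketch of checked byte-to-integer parsing. It is not the crate's SIMD implementation; parse_i64_checked is a hypothetical name used only for illustration, and the sketch relies on the standard library alone.

```rust
/// Illustrative scalar parser (NOT the crate's implementation):
/// converts ASCII digits with an optional leading '-' into an `i64`.
/// Returns `None` for empty input, non-digit bytes, or overflow.
fn parse_i64_checked(input: &[u8]) -> Option<i64> {
    // Split off an optional minus sign.
    let (negative, digits) = match input.split_first() {
        Some((&b'-', rest)) => (true, rest),
        Some(_) => (false, input),
        None => return None,
    };
    if digits.is_empty() {
        return None;
    }

    // Accumulate as a negative number so that i64::MIN stays representable.
    let mut acc: i64 = 0;
    for &byte in digits {
        if !byte.is_ascii_digit() {
            return None; // validation: reject anything that is not 0-9
        }
        let digit = i64::from(byte - b'0');
        acc = acc.checked_mul(10)?.checked_sub(digit)?;
    }

    if negative {
        Some(acc)
    } else {
        acc.checked_neg() // fails for values above i64::MAX
    }
}
```

With this sketch, parse_i64_checked(b"-2345") yields Some(-2345), while inputs such as b"12a", b"-", or an out-of-range value yield None.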

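To enable SIMD, the target-feature or target-cpu flags must be set; otherwise it falls back to the non-SIMD functions. You can copy ./.cargo/config.toml into your project, or use one of the following environment variables: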

  • RUSTFLAGS="-C target-feature=+sse2,+sse3,+sse4.1,+ssse3,+avx,+avx2" for x86_64

  • RUSTFLAGS="-C target-feature=+neon" for Arm64

  • RUSTFLAGS="-C target-cpu=native" optimizes for your current CPU

For Windows PowerShell, you can set it with: $Env:RUSTFLAGS='-C target-feature=+sse2,+sse3,+sse4.1,+ssse3,+avx,+avx2'

By default, target-feature is set in ./.cargo/config.toml, but it seems to take effect only inside this project.
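Whether the flags actually reached the compiler can be checked from inside the consuming project. The sketch below uses the standard cfg! macro, which is evaluated at compile time. enabled_simd_features is a hypothetical helper, and the snippet makes no claim about which features the crate itself tests for internally.

```rust
/// Hypothetical helper: lists which of the SIMD target features named
/// above were enabled when this code was compiled.
fn enabled_simd_features() -> Vec<&'static str> {
    let mut enabled = Vec::new();
    // Each cfg! expands to a compile-time `true` or `false`.
    if cfg!(target_feature = "sse4.1") {
        enabled.push("sse4.1");
    }
    if cfg!(target_feature = "avx2") {
        enabled.push("avx2");
    }
    if cfg!(target_feature = "neon") {
        enabled.push("neon");
    }
    enabled
}
```

Printing the result of enabled_simd_features() in a build with and without RUSTFLAGS set is a quick way to see whether the flags were picked up. An empty list on x86_64 suggests they were not.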

If you have a &str, use .as_bytes()
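As a small sketch of that conversion, the adapter below forwards the bytes of a string slice to any byte-slice parser. Both parse_str_with and std_parse_u32 are hypothetical names; the stand-in parser uses only the standard library so that the snippet runs without extra crates, and a byte-slice parser from this crate would take its place in real code.

```rust
/// Hypothetical adapter: hands the bytes of `text` to any byte-slice parser.
/// `as_bytes` only borrows the string's UTF-8 buffer, so nothing is copied.
fn parse_str_with<T, E>(
    text: &str,
    parse_bytes: impl Fn(&[u8]) -> Result<T, E>,
) -> Result<T, E> {
    parse_bytes(text.as_bytes())
}

/// Std-only stand-in for a `&[u8]` parser (an assumption for this sketch).
fn std_parse_u32(bytes: &[u8]) -> Result<u32, String> {
    let text = std::str::from_utf8(bytes).map_err(|e| e.to_string())?;
    text.parse::<u32>().map_err(|e| e.to_string())
}
```

For example, parse_str_with("1234", std_parse_u32) returns Ok(1234), and parse_str_with("12x", std_parse_u32) returns an error.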

Supports no_std with --no-default-features

The idea comes from here (source).

Examples

let val: u64 = atoi_simd::parse(b"1234").unwrap();
assert_eq!(val, 1234_u64);

assert_eq!(atoi_simd::parse::<i64>(b"-2345"), Ok(-2345_i64));

assert_eq!(atoi_simd::parse_any::<u64>(b"123something_else"), Ok((123_u64, 3)));

// a drop-in replacement for `str::parse`
assert_eq!(atoi_simd::parse_skipped::<u64>(b"+000000000000000000001234"), Ok(1234_u64));
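For comparison, the standard library's behavior on the same inputs can be sketched with std only. str::parse accepts a leading + and leading zeros but rejects trailing non-digit bytes, so a prefix parse has to be emulated by hand. parse_prefix_u64 is a hypothetical stand-in written for this comparison and is not part of atoi_simd.

```rust
/// Hypothetical std-only stand-in for prefix parsing: reads the leading
/// ASCII digits and returns the value plus the number of bytes consumed.
fn parse_prefix_u64(input: &[u8]) -> Option<(u64, usize)> {
    // Count how many bytes at the front are ASCII digits.
    let len = input.iter().take_while(|b| b.is_ascii_digit()).count();
    if len == 0 {
        return None;
    }
    // The prefix is pure ASCII, so the UTF-8 conversion cannot fail here.
    let digits = std::str::from_utf8(&input[..len]).ok()?;
    // `parse` returns an error on overflow, which becomes `None`.
    digits.parse::<u64>().ok().map(|value| (value, len))
}
```

Here parse_prefix_u64(b"123something_else") returns Some((123, 3)), "+000000000000000000001234".parse::<u64>() returns Ok(1234), and "123something_else".parse::<u64>() is an error.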

Benchmarks

You can run the benchmarks on your machine with cargo bench from the bench folder (or use cargo bench -- "parse u64" to run a specific one)
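For a quick sanity check outside the cargo bench setup, a rough timing loop can be sketched with the standard library alone. This is not a substitute for the statistical benchmarks whose output appears in the results; time_per_call and baseline_std_parse are hypothetical helpers, and the loop measures only the std baseline.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper: average wall-clock time of `f` over `iters` calls.
fn time_per_call<R>(iters: u32, mut f: impl FnMut() -> R) -> Duration {
    assert!(iters > 0, "iters must be non-zero");
    let start = Instant::now();
    for _ in 0..iters {
        // black_box keeps the optimizer from discarding the result.
        black_box(f());
    }
    start.elapsed() / iters
}

/// Times `str::parse::<u64>` on a 16-digit input as a baseline.
fn baseline_std_parse(iters: u32) -> Duration {
    time_per_call(iters, || black_box("1234567890123456").parse::<u64>())
}
```

Calling baseline_std_parse(1_000_000) in a release build gives a ballpark per-call figure. A single unwarmed loop like this is noisy, so treat the number as indicative only.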

Results

More information here

v0.16.0

Rust 1.78, Windows 10, Intel i7 9700K, "target-feature" set

[Charts: benchmark 64, benchmark 128, parse::<u64>(), str::parse::<u64>(), parse::<i64>(), str::parse::<i64>(), parse::<u128>(), str::parse::<u128>(), parse::<i128>(), str::parse::<i128>()]

v0.15.2

Rust 1.73, Windows 10, Intel i7 9700K, "target-feature" set

[Charts: benchmark 64, benchmark 128, parse::<u64>(), str::parse::<u64>(), parse::<i64>(), str::parse::<i64>(), parse::<u128>(), str::parse::<u128>(), parse::<i128>(), str::parse::<i128>()]

v0.14.5

Rust 1.72, Windows 10, Intel i7 9700K, "target-feature" set

[Charts: benchmark 64, benchmark 128, parse::<u64>(), str::parse::<u64>(), parse::<i64>(), str::parse::<i64>(), parse::<u128>(), str::parse::<u128>(), parse::<i128>(), str::parse::<i128>()]

v0.14.4

Rust 1.72, Windows 10, Intel i7 9700K, "target-feature" set

[Charts: benchmark 64, benchmark 128, parse::<u64>(), str::parse::<u64>(), parse::<i64>(), str::parse::<i64>(), parse::<u128>(), str::parse::<u128>(), parse::<i128>(), str::parse::<i128>()]

v0.10.1

Rust 1.67.1, Windows 10, Intel i7 9700K, "target-feature" set

[Charts: benchmark 64, benchmark 128, parse::<u64>(), str::parse::<u64>(), parse::<i64>(), str::parse::<i64>(), parse::<u128>(), str::parse::<u128>(), parse::<i128>(), str::parse::<i128>()]

v0.4-v0.5

Rust 1.63, Windows 10, Intel i7 9700K, "target-feature" set

[Charts: benchmark 64, benchmark 128, parse_u64(), str::parse::<u64>(), parse_i64(), str::parse::<i64>(), parse_u128(), str::parse::<u128>(), parse_i128(), str::parse::<i128>()]

v0.3.0

Rust 1.63, Windows 10, Intel i7 9700K, "target-feature" set

[Charts: all, parse() u64, str::parse::<u64>(), parse_i64(), str::parse::<i64>(), parse_u128(), str::parse::<u128>(), parse_i128(), str::parse::<i128>()]

v0.2.1

Rust 1.63, Windows 10, Intel i7 9700K, "target-feature" set

[Charts: all, parse() u64, str::parse::<u64>(), parse_i64(), str::parse::<i64>(), parse_u128(), str::parse::<u128>(), parse_i128(), str::parse::<i128>()]

v0.2.0

Rust 1.63, Windows 10, Intel i7 9700K, "target-feature" set

[Charts: all, parse() u64, str::parse::<u64>(), parse_i64(), str::parse::<i64>(), parse_u128(), str::parse::<u128>(), parse_i128(), str::parse::<i128>()]

v0.1.x

Observations

  • It is about 7 times faster than the standard parse (for long strings, Rust 1.60)
  • Performance is constant (the same) across string lengths

Rust 1.63, Windows 10, Intel i7 9700K, "target-feature" set

This one got faster on Rust 1.63

long string std u64                  1234567890123456
                        time:   [9.0293 ns 9.0843 ns 9.1661 ns]
                        change: [-0.6548% +0.8424% +2.3425%] (p = 0.29 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) high mild
  6 (6.00%) high severe

This one got even slower. I reran it several times (with rebuilds) and got the same result

long string negative std i64         -1234567890123456
                        time:   [17.554 ns 17.607 ns 17.667 ns]
                        change: [-1.6112% -0.2132% +1.5620%] (p = 0.80 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe
long string u64                      1234567890123456
                        time:   [1.9273 ns 1.9346 ns 1.9424 ns]
                        change: [-2.3999% -0.4986% +1.2253%] (p = 0.62 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe
long string i64                      1234567890123456
                        time:   [2.3258 ns 2.3357 ns 2.3468 ns]
                        change: [-2.1695% -0.4296% +1.3102%] (p = 0.65 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) high mild
  5 (5.00%) high severe
long string negative i64             -1234567890123456
                        time:   [2.5319 ns 2.5439 ns 2.5607 ns]
                        change: [-2.0344% -0.3167% +1.5650%] (p = 0.75 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) high mild
  7 (7.00%) high severe
short string std u64                       1
                        time:   [2.3305 ns 2.3462 ns 2.3656 ns]
                        change: [-4.1262% -1.9850% +0.2412%] (p = 0.07 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) high mild
  6 (6.00%) high severe
short string negative std i64              -1
                        time:   [3.7983 ns 3.8177 ns 3.8402 ns]
                        change: [-1.4979% -0.0694% +1.5137%] (p = 0.94 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) high mild
  4 (4.00%) high severe
short string u64                           1
                        time:   [2.0024 ns 2.0097 ns 2.0184 ns]
                        change: [-3.4351% -1.3017% +0.5198%] (p = 0.22 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe
short string i64                           1
                        time:   [2.4245 ns 2.4356 ns 2.4499 ns]
                        change: [-2.9298% -1.3203% +0.3535%] (p = 0.12 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  3 (3.00%) high mild
  6 (6.00%) high severe
short string negative i64                  -1
                        time:   [2.5191 ns 2.5233 ns 2.5285 ns]
                        change: [-2.8014% -0.9235% +0.7916%] (p = 0.35 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) high mild
  6 (6.00%) high severe

Additional 15-character benchmarks

15 chars string std u64              123456789012345
                        time:   [8.4146 ns 8.4352 ns 8.4604 ns]
                        change: [-2.5855% -1.0348% +0.5767%] (p = 0.21 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe
15 chars string negative std i64     -123456789012345
                        time:   [10.268 ns 10.331 ns 10.415 ns]
                        change: [-0.7653% +0.9929% +2.7733%] (p = 0.30 > 0.05)
                        No change in performance detected.
Found 13 outliers among 100 measurements (13.00%)
  7 (7.00%) high mild
  6 (6.00%) high severe
15 chars string u64                  123456789012345
                        time:   [1.8990 ns 1.9042 ns 1.9103 ns]
                        change: [-1.8510% -0.3256% +0.9332%] (p = 0.70 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe
15 chars string i64                  123456789012345
                        time:   [2.3780 ns 2.3831 ns 2.3892 ns]
                        change: [-2.1490% -0.6463% +0.8095%] (p = 0.41 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) high mild
  4 (4.00%) high severe
15 chars string negative i64         -123456789012345
                        time:   [2.5323 ns 2.5445 ns 2.5589 ns]
                        change: [-2.8686% -0.9755% +0.9693%] (p = 0.34 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe

Rust 1.60, Windows 10, Intel i7 9700K, "target-feature" set

long string std u64                  1234567890123456
                        time:   [15.136 ns 15.172 ns 15.220 ns]
                        change: [-1.0266% +1.4318% +4.7776%] (p = 0.42 > 0.05)
                        No change in performance detected.
Found 14 outliers among 100 measurements (14.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  3 (3.00%) high mild
  9 (9.00%) high severe

Parsing to i64 with the standard .parse::<i64>() is somewhat faster than parsing to u64 with .parse::<u64>()

long string negative std i64         -1234567890123456
                        time:   [12.451 ns 12.468 ns 12.489 ns]
                        change: [-2.8201% -1.8197% -0.9578%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 15 outliers among 100 measurements (15.00%)
  2 (2.00%) low mild
  5 (5.00%) high mild
  8 (8.00%) high severe
long string u64                      1234567890123456
                        time:   [2.1173 ns 2.1212 ns 2.1254 ns]
                        change: [-1.7643% -0.7705% +0.0464%] (p = 0.11 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high severe
long string i64                      1234567890123456
                        time:   [2.0971 ns 2.1018 ns 2.1083 ns]
                        change: [-1.4917% -0.3822% +0.4114%] (p = 0.53 > 0.05)
                        No change in performance detected.
Found 16 outliers among 100 measurements (16.00%)
  3 (3.00%) low mild
  5 (5.00%) high mild
  8 (8.00%) high severe
long string negative i64             -1234567890123456
                        time:   [2.1659 ns 2.1689 ns 2.1729 ns]
                        change: [-1.8464% -0.6673% +0.2406%] (p = 0.25 > 0.05)
                        No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) low mild
  1 (1.00%) high mild
  7 (7.00%) high severe
short string std u64                       1
                        time:   [2.7282 ns 2.7315 ns 2.7355 ns]
                        change: [-0.3423% +0.5560% +1.4297%] (p = 0.25 > 0.05)
                        No change in performance detected.
Found 16 outliers among 100 measurements (16.00%)
  6 (6.00%) high mild
  10 (10.00%) high severe
short string negative std i64              -1
                        time:   [3.4122 ns 3.4210 ns 3.4304 ns]
                        change: [-0.4427% +0.2415% +1.0592%] (p = 0.57 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe
short string u64                           1
                        time:   [2.0971 ns 2.0989 ns 2.1014 ns]
                        change: [-0.4568% +0.1569% +0.7932%] (p = 0.63 > 0.05)
                        No change in performance detected.
Found 16 outliers among 100 measurements (16.00%)
  2 (2.00%) low mild
  2 (2.00%) high mild
  12 (12.00%) high severe

This one should be a bit lower, around 2.3 ns

short string i64                           1
                        time:   [2.6629 ns 2.6704 ns 2.6789 ns]
                        change: [-0.2341% +0.4340% +0.9879%] (p = 0.19 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe
short string negative i64                  -1
                        time:   [2.3049 ns 2.3077 ns 2.3115 ns]
                        change: [-0.8049% -0.1058% +0.5989%] (p = 0.79 > 0.05)
                        No change in performance detected.
Found 16 outliers among 100 measurements (16.00%)
  5 (5.00%) low mild
  3 (3.00%) high mild
  8 (8.00%) high severe

Additional 15-character benchmarks

15 chars string std u64              123456789012345
                        time:   [14.314 ns 14.347 ns 14.386 ns]
                        change: [+0.5781% +1.5775% +3.0108%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 10 outliers among 100 measurements (10.00%)
  7 (7.00%) high mild
  3 (3.00%) high severe
15 chars string negative std i64     -123456789012345
                        time:   [11.797 ns 11.869 ns 11.952 ns]
                        change: [-2.0623% -0.8216% +0.4470%] (p = 0.21 > 0.05)
                        No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
  3 (3.00%) high mild
  8 (8.00%) high severe
15 chars string u64                  123456789012345
                        time:   [1.8545 ns 1.8559 ns 1.8576 ns]
                        change: [-1.0279% -0.3076% +0.3114%] (p = 0.40 > 0.05)
                        No change in performance detected.
Found 16 outliers among 100 measurements (16.00%)
  3 (3.00%) low mild
  4 (4.00%) high mild
  9 (9.00%) high severe
15 chars string i64                  123456789012345
                        time:   [2.3638 ns 2.3734 ns 2.3825 ns]
                        change: [-1.8528% -0.7356% +0.2488%] (p = 0.17 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
15 chars string negative i64         -123456789012345
                        time:   [2.3077 ns 2.3109 ns 2.3152 ns]
                        change: [-1.9844% -1.2570% -0.5860%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 15 outliers among 100 measurements (15.00%)
  3 (3.00%) low mild
  2 (2.00%) high mild
  10 (10.00%) high severe
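The "about 7 times" observation can be checked against the median times listed above for the 16-digit input. The sketch below assumes that the "long string std u64" rows measure str::parse and that the "long string u64" rows measure this crate's parser; the constant names and the speedup helper are made up for illustration.

```rust
// Median times in nanoseconds, copied from the "long string" results above.
const STD_U64_RUST_1_60_NS: f64 = 15.172;
const SIMD_U64_RUST_1_60_NS: f64 = 2.1212;
const STD_U64_RUST_1_63_NS: f64 = 9.0843;
const SIMD_U64_RUST_1_63_NS: f64 = 1.9346;

/// How many times faster `fast_ns` is than `baseline_ns`.
fn speedup(baseline_ns: f64, fast_ns: f64) -> f64 {
    baseline_ns / fast_ns
}

/// Ratio on Rust 1.60: 15.172 / 2.1212, roughly 7.15.
fn speedup_rust_1_60() -> f64 {
    speedup(STD_U64_RUST_1_60_NS, SIMD_U64_RUST_1_60_NS)
}

/// Ratio on Rust 1.63: 9.0843 / 1.9346, roughly 4.70.
fn speedup_rust_1_63() -> f64 {
    speedup(STD_U64_RUST_1_63_NS, SIMD_U64_RUST_1_63_NS)
}
```

Under those assumptions the ratio is about 7.15 on Rust 1.60, matching the observation, and about 4.70 on Rust 1.63, where the standard parser for u64 became faster.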

No runtime dependencies