#bom #read-stream #read #io-stream #endianness #byte #order

skip_bom

Skip the optional encoding byte order mark (BOM) at the start of a file, if present

8 unstable releases (3 breaking)

0.5.1 Nov 8, 2022
0.5.0 Nov 7, 2022
0.4.0 Nov 6, 2022
0.3.0 Nov 6, 2022
0.2.3 Nov 6, 2022

#1804 in Parser implementations


104 downloads per month

MIT/Apache

24KB
295

skip_bom


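Skip the optional encoding BOM (byte order mark) at the start of an I/O stream, if present.

The SkipEncodingBom data structure performs no dynamic allocation and supports progressive stream reading.

The list of supported BOMs can be found in the crate documentation.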
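For readers unfamiliar with BOMs, the following is a minimal, std-only sketch of how a BOM can be recognized from the first bytes of a buffer. It does not use this crate: the `Bom` enum and the `detect_bom` function are hypothetical names introduced here for illustration, and the five signatures shown are the standard Unicode ones, which may not match the crate's supported list exactly (see the crate documentation for that).

```rust
/// Byte order marks of the common Unicode encodings (illustrative only,
/// not the crate's `BomType`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Bom {
    Utf8,
    Utf16Le,
    Utf16Be,
    Utf32Le,
    Utf32Be,
}

// Longer signatures come first, because the UTF-32 LE BOM (FF FE 00 00)
// begins with the UTF-16 LE BOM (FF FE).
const SIGNATURES: [(Bom, &[u8]); 5] = [
    (Bom::Utf32Le, &[0xFF, 0xFE, 0x00, 0x00]),
    (Bom::Utf32Be, &[0x00, 0x00, 0xFE, 0xFF]),
    (Bom::Utf8, &[0xEF, 0xBB, 0xBF]),
    (Bom::Utf16Le, &[0xFF, 0xFE]),
    (Bom::Utf16Be, &[0xFE, 0xFF]),
];

/// Returns the BOM found at the start of `bytes` together with its length
/// in bytes, or `None` if `bytes` does not start with a known BOM.
fn detect_bom(bytes: &[u8]) -> Option<(Bom, usize)> {
    SIGNATURES
        .iter()
        .find(|(_, signature)| bytes.starts_with(signature))
        .map(|(bom, signature)| (*bom, signature.len()))
}
```

Given `Some((bom, len))`, the payload is `&bytes[len..]`. Note that this one-shot check needs the whole prefix to be available up front, which is exactly the limitation that progressive reading removes.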

Examples

use skip_bom::{BomType, SkipEncodingBom};
use std::io::{Cursor, Read};

// Read a stream after checking that it starts with the BOM
const BOM_BYTES: &'static [u8] = b"\xEF\xBB\xBFThis stream starts with a UTF-8 BOM.";
let mut reader = SkipEncodingBom::new(BomType::all(), Cursor::new(BOM_BYTES));
assert_eq!(Some(BomType::UTF8), reader.read_bom().unwrap());
let mut string = Default::default();
let _ = reader.read_to_string(&mut string).unwrap();
assert_eq!("This stream starts with a UTF-8 BOM.", &string);

// Read a stream without a starting BOM
const NO_BOM_BYTES: &'static [u8] = b"This stream does not start with the UTF-8 BOM: \xEF\xBB\xBF.";
let mut reader = SkipEncodingBom::new(BomType::all(), Cursor::new(NO_BOM_BYTES));
assert_eq!(None, reader.read_bom().unwrap());
let mut buf = Default::default();
let _ = reader.read_to_end(&mut buf).unwrap();
assert_eq!(b"This stream does not start with the UTF-8 BOM: \xEF\xBB\xBF.", buf.as_slice());

// Read a stream and disregard the starting BOM completely
let mut reader = SkipEncodingBom::new(&[BomType::UTF8], Cursor::new(BOM_BYTES));
let mut buf = Default::default();
let _ = reader.read_to_end(&mut buf).unwrap();
assert_eq!(b"This stream starts with a UTF-8 BOM.", buf.as_slice());
// Check the BOM after the read is over.
assert_eq!(Some(Some(BomType::UTF8)), reader.bom_found());

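Progressive reading

This crate supports I/O streams that are incomplete at first and receive more data later, even when the missing part is the initial BOM itself. Example: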

use skip_bom::{BomType, SkipEncodingBom};
use std::io::{Cursor, Read};

let mut reader = SkipEncodingBom::new(&[BomType::UTF8], Cursor::new(b"\xEF\xBB".to_vec()));
let mut buf = Default::default();
let _ = reader.read_to_end(&mut buf).unwrap();
// The stream is incomplete: there are only the first two bytes of the BOM yet
assert_eq!(0, buf.len(), "{:?}", buf.as_slice());
assert_eq!(None, reader.bom_found());
// Add the next bytes and check that the UTF-8 BOM is accounted for
reader.get_mut().get_mut().extend_from_slice(b"\xBFThis stream has a BOM.");
let _ = reader.read_to_end(&mut buf).unwrap();
assert_eq!(b"This stream has a BOM.", buf.as_slice());
assert_eq!(Some(BomType::UTF8), reader.bom_found().unwrap());
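To illustrate how progressive BOM skipping can work without dynamic allocation, here is a minimal, std-only sketch of a chunk-by-chunk matcher for the UTF-8 BOM. It is not the crate's implementation: `Utf8BomSkipper`, `State`, `feed` and `decision` are hypothetical names, and only a single BOM type is handled. The idea is that withheld bytes never need a buffer, because any partially matched input is by definition a prefix of the BOM constant.

```rust
static UTF8_BOM: [u8; 3] = [0xEF, 0xBB, 0xBF];
const NOTHING: &[u8] = &[];

#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    /// Still undecided; holds the number of BOM bytes matched so far.
    Matching(usize),
    /// Decided; `true` means a complete BOM was found and skipped.
    Done(bool),
}

struct Utf8BomSkipper {
    state: State,
}

impl Utf8BomSkipper {
    fn new() -> Self {
        Self { state: State::Matching(0) }
    }

    /// `None` while undecided, then `Some(true)` if a BOM was skipped.
    fn decision(&self) -> Option<bool> {
        match self.state {
            State::Matching(_) => None,
            State::Done(found) => Some(found),
        }
    }

    /// Feeds the next chunk and returns `(replay, rest)`. `replay` holds
    /// bytes withheld from earlier chunks that turned out to be data;
    /// `rest` is the part of `chunk` to pass through to the consumer.
    fn feed<'a>(&mut self, chunk: &'a [u8]) -> (&'static [u8], &'a [u8]) {
        let held = match self.state {
            State::Done(_) => return (NOTHING, chunk),
            State::Matching(n) => n,
        };
        let mut matched = held;
        for (i, &byte) in chunk.iter().enumerate() {
            if byte != UTF8_BOM[matched] {
                // Not a BOM: replay what was withheld and pass the chunk on.
                self.state = State::Done(false);
                return (&UTF8_BOM[..held], chunk);
            }
            matched += 1;
            if matched == UTF8_BOM.len() {
                // Complete BOM: drop it and pass on the remainder.
                self.state = State::Done(true);
                return (NOTHING, &chunk[i + 1..]);
            }
        }
        // Chunk ended in the middle of a possible BOM: withhold it for now.
        self.state = State::Matching(matched);
        (NOTHING, NOTHING)
    }
}
```

Feeding `b"\xEF\xBB"` and then `b"\xBFThis stream has a BOM."` mirrors the example above: the first call yields nothing and leaves the decision open, the second yields the text without the BOM. The sketch leaves out end-of-stream handling: if the stream really ends after a partial match, the withheld bytes would still have to be flushed by the caller.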

References

Documentation

Module documentation with examples

License

This project may be used under either the MIT or the Apache license, at your option.

No runtime dependencies