1 unstable release
Uses old Rust 2015
0.1.0 | June 4, 2018
#15 in #html5ever
15KB
212 lines
html5ever-stream
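Adapters that make it easy to stream data into an html5ever parser.
Overview
This crate aims to provide shims that make it relatively painless to parse HTML from a stream of data. The stream can be consumed either through the standard IO reader/writer traits or as a Stream from the futures crate.
- Supports any Stream whose items implement AsRef<[u8]>
- Works out of the box with hyper and unstable (async) reqwest types
- Supports reqwest's copy_to method
- Helper wrappers for RcDom that make it easier to work with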
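The AsRef<[u8]> bound in the first bullet is what lets a single adapter accept chunks of different concrete types. The sketch below illustrates the idea using only the standard library. It is not this crate's implementation: `collect_chunks` is a hypothetical helper, and a plain iterator stands in for a futures Stream.

```rust
/// Hypothetical helper: concatenates every chunk from a source into one buffer.
/// A plain iterator stands in for a futures `Stream` here.
fn collect_chunks<I, T>(chunks: I) -> Vec<u8>
where
    I: IntoIterator<Item = T>,
    T: AsRef<[u8]>,
{
    let mut buf = Vec::new();
    for chunk in chunks {
        // `as_ref()` yields a `&[u8]` view regardless of the concrete chunk type.
        buf.extend_from_slice(chunk.as_ref());
    }
    buf
}
```

Because `&str`, `String`, and `Vec<u8>` all implement AsRef<[u8]>, the same function accepts `vec!["<p>", "hi", "</p>"]` as readily as a list of byte vectors.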
Examples
Using Hyper 0.11
extern crate futures;
extern crate html5ever;
extern crate html5ever_stream;
extern crate hyper;
extern crate hyper_tls;
extern crate tokio_core;
extern crate num_cpus;

use html5ever::rcdom;
use futures::{Future, Stream};
use hyper::Client;
use hyper_tls::HttpsConnector;
use tokio_core::reactor::Core;
use html5ever_stream::{ParserFuture, NodeStream};

fn main() {
    let mut core = Core::new().unwrap();
    let handle = core.handle();
    let client = Client::configure()
        .connector(HttpsConnector::new(num_cpus::get(), &handle).unwrap())
        .build(&handle);

    // NOTE: Errors are discarded in two places here. In real code, convert them into your
    // own custom error type so they can propagate.
    let req_fut = client.get("https://github.com".parse().unwrap()).map_err(|_| ());
    let parser_fut = req_fut.and_then(|res| {
        ParserFuture::new(res.body().map_err(|_| ()), rcdom::RcDom::default())
    });
    let nodes = parser_fut.and_then(|dom| {
        NodeStream::new(&dom).collect()
    });
    let print_fut = nodes.and_then(|vn| {
        println!("found {} elements", vn.len());
        Ok(())
    });
    core.run(print_fut).unwrap();
}
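The NOTE in the example above recommends a custom error type instead of `map_err(|_| ())`. A minimal sketch of that pattern follows, using only the standard library. `AppError` and `decode` are hypothetical names, and `std::io::Error` and `Utf8Error` are stand-ins for whatever error types the HTTP client and parser actually produce.

```rust
use std::fmt;
use std::io;
use std::str::Utf8Error;

/// Hypothetical application error wrapping the failure of each stage.
#[derive(Debug)]
enum AppError {
    /// Stand-in for an HTTP client error.
    Transport(io::Error),
    /// Stand-in for a body decoding or parsing error.
    Decode(Utf8Error),
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::Transport(e) => write!(f, "transport error: {}", e),
            AppError::Decode(e) => write!(f, "decode error: {}", e),
        }
    }
}

impl std::error::Error for AppError {}

// `From` impls let `?` and `map_err(AppError::from)` do the conversion.
impl From<io::Error> for AppError {
    fn from(e: io::Error) -> Self {
        AppError::Transport(e)
    }
}

impl From<Utf8Error> for AppError {
    fn from(e: Utf8Error) -> Self {
        AppError::Decode(e)
    }
}

/// Hypothetical stage: `?` converts the `Utf8Error` into `AppError`.
fn decode(bytes: &[u8]) -> Result<&str, AppError> {
    Ok(std::str::from_utf8(bytes)?)
}
```

With such a type in place, each `map_err(|_| ())` in the future chain could become `map_err(AppError::from)`, provided a `From` impl exists for the real error type at that stage.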
Using unstable async Reqwest 0.8.6
extern crate futures;
extern crate html5ever;
extern crate html5ever_stream;
extern crate reqwest;
extern crate tokio_core;

use html5ever::rcdom;
use futures::{Future, Stream};
use reqwest::unstable::async as async_reqwest;
use tokio_core::reactor::Core;
use html5ever_stream::{ParserFuture, NodeStream};

fn main() {
    let mut core = Core::new().unwrap();
    let client = async_reqwest::Client::new(&core.handle());

    // NOTE: Errors are discarded in two places here. In real code, convert them into your
    // own custom error type so they can propagate.
    let req_fut = client.get("https://github.com").send().map_err(|_| ());
    let parser_fut = req_fut.and_then(|res| {
        ParserFuture::new(res.into_body().map_err(|_| ()), rcdom::RcDom::default())
    });
    let nodes = parser_fut.and_then(|dom| {
        NodeStream::new(&dom).collect()
    });
    let print_fut = nodes.and_then(|vn| {
        println!("found {} elements", vn.len());
        Ok(())
    });
    core.run(print_fut).unwrap();
}
Using stable Reqwest 0.8.6
extern crate html5ever;
extern crate html5ever_stream;
extern crate reqwest;

use html5ever::rcdom;
use html5ever_stream::{ParserSink, NodeIter};

fn main() {
    let mut resp = reqwest::get("https://github.com").unwrap();
    let mut parser = ParserSink::new(rcdom::RcDom::default());
    resp.copy_to(&mut parser).unwrap();
    let document = parser.finish();
    let nodes: Vec<rcdom::Handle> = NodeIter::new(&document).collect();
    println!("found {} elements", nodes.len());
}
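The stable example works because `copy_to` writes the response body into anything that implements `std::io::Write`. The sketch below shows that mechanism in isolation with the standard library. `BufferSink` and `pump` are hypothetical and only buffer bytes, whereas the crate's ParserSink feeds them to the parser; `std::io::copy` plays the role of `copy_to` here.

```rust
use std::io::{self, Read, Write};

/// Hypothetical sink: buffers every byte written to it.
struct BufferSink {
    buf: Vec<u8>,
}

impl BufferSink {
    fn new() -> Self {
        BufferSink { buf: Vec::new() }
    }

    /// Consumes the sink and returns what was collected, loosely mirroring
    /// the `parser.finish()` call in the example above.
    fn finish(self) -> String {
        String::from_utf8_lossy(&self.buf).into_owned()
    }
}

impl Write for BufferSink {
    fn write(&mut self, chunk: &[u8]) -> io::Result<usize> {
        // A real parser sink would hand `chunk` to the tokenizer here.
        self.buf.extend_from_slice(chunk);
        Ok(chunk.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Drives any reader into the sink and returns the number of bytes copied.
fn pump<R: Read>(mut body: R, sink: &mut BufferSink) -> io::Result<u64> {
    io::copy(&mut body, sink)
}
```

Since `&[u8]` implements `Read`, calling `pump(&b"<html></html>"[..], &mut sink)` is enough to try the sink without a network connection.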
License
Licensed under the MIT license
Dependencies
~1.2–3MB
~56K SLoC