#domain #extract #tld

tldextract

Extract domain info from a URL

6 releases (breaking)

0.6.0 Aug 10, 2022
0.5.1 Sep 11, 2018
0.5.0 Oct 30, 2017
0.4.0 Feb 9, 2017
0.1.0 Nov 12, 2016

#97 in Internationalization (i18n)


7,127 downloads per month
Used in 10 crates (4 directly)

MIT license

155KB
7.5K SLoC

tldExtract

A Rust implementation of tldExtract. tldExtract accurately separates the TLD, whether a gTLD (generic top-level domain) or a ccTLD (country code top-level domain), from the registered domain and subdomains of a URL. For example, it extracts 'google' from 'http://www.google.com'.

Splitting the URL on '.' and taking the last two elements only works for simple cases such as .com domains. It fails for multi-label suffixes such as http://forums.bbc.co.uk . The naive split gives you 'co' as the domain and 'uk' as the TLD, instead of 'bbc' and 'co.uk' respectively.
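To make the failure concrete, here is a minimal sketch of the naive split described above. The function name `naive_split` is hypothetical and is not part of this crate; it takes a bare host name rather than a full URL to keep the example short.

```rust
/// Naive approach: treat the last label as the TLD and the label
/// before it as the domain. Hypothetical helper, not the crate's API.
fn naive_split(host: &str) -> Option<(&str, &str)> {
    let labels: Vec<&str> = host.split('.').collect();
    if labels.len() < 2 {
        return None;
    }
    // (domain, tld) taken blindly from the last two labels.
    Some((labels[labels.len() - 2], labels[labels.len() - 1]))
}

fn main() {
    // Looks right for a single-label suffix...
    println!("{:?}", naive_split("www.google.com"));
    // ...but splits the two-label suffix "co.uk" in the wrong place.
    println!("{:?}", naive_split("forums.bbc.co.uk"));
}
```

The second call returns 'co' and 'uk', because the split has no way of knowing that 'co.uk' is a single public suffix.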

tldExtract, by contrast, knows what every gTLD and ccTLD looks like, because it looks up the currently active ones in the Public Suffix List. As a result, tldExtract can tell the subdomain and the domain apart from the suffix, even when the suffix has several labels.
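The lookup idea can be sketched as a longest-suffix match against a set of known suffixes. This is only an illustration under stated assumptions: the `Parts` struct, the `extract` function, and the three-entry suffix set are all hypothetical, and the real Public Suffix List also has wildcard and exception rules that this sketch ignores. It is not the crate's actual implementation or API.

```rust
use std::collections::HashSet;

/// Hypothetical result type for this sketch.
#[derive(Debug, PartialEq)]
struct Parts {
    subdomain: Option<String>,
    domain: Option<String>,
    suffix: Option<String>,
}

/// Find the longest trailing run of labels that is a known suffix,
/// then take the label before it as the domain and the rest as the subdomain.
fn extract(host: &str, suffixes: &HashSet<&str>) -> Parts {
    let labels: Vec<&str> = host.split('.').collect();
    // i = 0 tries the whole host first, so the longest candidate wins.
    for i in 0..labels.len() {
        let candidate = labels[i..].join(".");
        if suffixes.contains(candidate.as_str()) {
            let domain = if i > 0 { Some(labels[i - 1].to_string()) } else { None };
            let subdomain = if i > 1 { Some(labels[..i - 1].join(".")) } else { None };
            return Parts { subdomain, domain, suffix: Some(candidate) };
        }
    }
    // No known suffix: this sketch simply reports nothing.
    Parts { subdomain: None, domain: None, suffix: None }
}

fn main() {
    // Tiny stand-in for the Public Suffix List (assumption for illustration).
    let suffixes: HashSet<&str> = ["com", "uk", "co.uk"].iter().copied().collect();
    println!("{:?}", extract("forums.bbc.co.uk", &suffixes));
    println!("{:?}", extract("www.google.com", &suffixes));
}
```

Because 'co.uk' is tried before 'uk', the host 'forums.bbc.co.uk' resolves to subdomain 'forums', domain 'bbc', and suffix 'co.uk'.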

Thanks to john-kurkowski: this project is mainly inspired by his work in Python.

Documentation

Dependencies

~4–20MB
~268K SLoC