2 unstable releases
new 0.1.0 | Aug 15, 2024 |
---|---|
0.0.0 | Dec 1, 2021 |
#1288 in Command line utilities
76 downloads per month
21KB
385 lines
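analyst
analyst is a command-line tool for quickly exploring CSV data. It reads a CSV file in streaming mode and analyzes it on the fly. Among other things, it can report the missing values in a CSV file, find frequent patterns in the data, count how often each value occurs in a column, and find a column's maximum and minimum values.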
Commands
show
: Show rows (10 by default, 100 at most). Usage: analyst show file.csv --start {start} --end {end}
missing-values
: Show missing values. Usage: analyst missing-values file.csv
frequent-patterns
: Show frequent patterns. Usage: analyst frequent-patterns file.csv --min-support {ratio}
column-stats
: Show column statistics. Usage: analyst column-stats file.csv --column {column}
extrema
: Show a column's extrema. Usage: analyst extrema file.csv --column {column}
Example
Below is a sample CSV file. The commands that follow refer to it as test_data.csv.
ID,Name,Age,Grade,Subject,Score,Attendance
1,Alice Smith,18,12,Math,95,98%
2,Bob Johnson,17,11,Physics,88,95%
3,Charlie Brown,16,10,Chemistry,78,92%
4,Diana Lee,,12,Biology,92,97%
5,Eva Martinez,18,12,Math,91,99%
6,Frank Wilson,17,11,,85,93%
7,Grace Taylor,16,10,Physics,89,96%
8,Henry Davis,18,12,Chemistry,,90%
9,Ivy Chen,17,11,Biology,94,98%
10,Jack Thompson,16,10,Math,82,
analyst show test_data.csv
+----+---------------+-----+-------+-----------+-------+------------+
| ID | Name | Age | Grade | Subject | Score | Attendance |
+----+---------------+-----+-------+-----------+-------+------------+
| 1 | Alice Smith | 18 | 12 | Math | 95 | 98% |
+----+---------------+-----+-------+-----------+-------+------------+
| 2 | Bob Johnson | 17 | 11 | Physics | 88 | 95% |
+----+---------------+-----+-------+-----------+-------+------------+
| 3 | Charlie Brown | 16 | 10 | Chemistry | 78 | 92% |
+----+---------------+-----+-------+-----------+-------+------------+
| 4 | Diana Lee | | 12 | Biology | 92 | 97% |
+----+---------------+-----+-------+-----------+-------+------------+
| 5 | Eva Martinez | 18 | 12 | Math | 91 | 99% |
+----+---------------+-----+-------+-----------+-------+------------+
| 6 | Frank Wilson | 17 | 11 | | 85 | 93% |
+----+---------------+-----+-------+-----------+-------+------------+
| 7 | Grace Taylor | 16 | 10 | Physics | 89 | 96% |
+----+---------------+-----+-------+-----------+-------+------------+
| 8 | Henry Davis | 18 | 12 | Chemistry | | 90% |
+----+---------------+-----+-------+-----------+-------+------------+
| 9 | Ivy Chen | 17 | 11 | Biology | 94 | 98% |
+----+---------------+-----+-------+-----------+-------+------------+
| 10 | Jack Thompson | 16 | 10 | Math | 82 | |
+----+---------------+-----+-------+-----------+-------+------------+
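The row window that show prints can be approximated with a lazy slice over the CSV reader, so rows outside the window are never kept in memory. This is a minimal Python sketch, not analyst's implementation: the function name `show` and its parameters are hypothetical, and it assumes a 0-based `start` and an exclusive `end`, which the listing above does not specify. Only the default of 10 rows and the cap of 100 rows come from the command description.

```python
import csv
import io
from itertools import islice

SAMPLE = """ID,Name,Age,Grade,Subject,Score,Attendance
1,Alice Smith,18,12,Math,95,98%
2,Bob Johnson,17,11,Physics,88,95%
3,Charlie Brown,16,10,Chemistry,78,92%
4,Diana Lee,,12,Biology,92,97%
5,Eva Martinez,18,12,Math,91,99%
6,Frank Wilson,17,11,,85,93%
7,Grace Taylor,16,10,Physics,89,96%
8,Henry Davis,18,12,Chemistry,,90%
9,Ivy Chen,17,11,Biology,94,98%
10,Jack Thompson,16,10,Math,82,
"""


def show(lines, start=0, end=None, default_rows=10, max_rows=100):
    """Return the header and the data rows in [start, end).

    Assumption: start is 0-based and end is exclusive.
    """
    if end is None:
        end = start + default_rows
    # Never return more than max_rows rows, whatever end was requested.
    end = min(end, start + max_rows)
    reader = csv.reader(lines)
    header = next(reader, None)
    # islice consumes the reader lazily and stops at end.
    rows = list(islice(reader, start, end))
    return header, rows


header, rows = show(io.StringIO(SAMPLE), start=2, end=4)
print(header)
for row in rows:
    print(row)
```

Any iterable of lines works as input, so an open file handle can be passed in place of the `io.StringIO` wrapper.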
analyst missing-values test_data.csv
Total rows analyzed: 10
Missing value analysis:
Age: 1 missing values (10.00%)
Name: 0 missing values (0.00%)
Subject: 1 missing values (10.00%)
Score: 1 missing values (10.00%)
Attendance: 1 missing values (10.00%)
ID: 0 missing values (0.00%)
Grade: 0 missing values (0.00%)
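A report like the one above can be computed in a single pass over the file. The following Python sketch is only an illustration of that idea, not analyst's code: `missing_values` is a hypothetical helper, and it assumes that a cell counts as missing when it is empty or contains only whitespace.

```python
import csv
import io

SAMPLE = """ID,Name,Age,Grade,Subject,Score,Attendance
1,Alice Smith,18,12,Math,95,98%
2,Bob Johnson,17,11,Physics,88,95%
3,Charlie Brown,16,10,Chemistry,78,92%
4,Diana Lee,,12,Biology,92,97%
5,Eva Martinez,18,12,Math,91,99%
6,Frank Wilson,17,11,,85,93%
7,Grace Taylor,16,10,Physics,89,96%
8,Henry Davis,18,12,Chemistry,,90%
9,Ivy Chen,17,11,Biology,94,98%
10,Jack Thompson,16,10,Math,82,
"""


def missing_values(lines):
    """Count missing cells per column while streaming over the rows."""
    reader = csv.DictReader(lines)
    columns = reader.fieldnames or []
    missing = {name: 0 for name in columns}
    total = 0
    for row in reader:
        total += 1
        for name in columns:
            value = row.get(name)
            # Assumption: empty or whitespace-only cells are "missing".
            if value is None or not value.strip():
                missing[name] += 1
    return total, missing


total, missing = missing_values(io.StringIO(SAMPLE))
print(f"Total rows analyzed: {total}")
for name, count in missing.items():
    share = count / total if total else 0.0
    print(f"{name}: {count} missing values ({share:.2%})")
```

The sketch keeps one counter per column and nothing else, so memory use does not grow with the number of rows.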
analyst column-stats test_data.csv --column Age
Total rows analyzed: 10
Column statistics:
Column: Age
18: 3 occurrences (30.00%)
17: 3 occurrences (30.00%)
16: 3 occurrences (30.00%)
: 1 occurrences (10.00%)
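The per-value counts shown above amount to a frequency table over one column. A minimal Python sketch of that calculation follows; `column_stats` is a hypothetical name, and treating the empty string as a value of its own is an assumption based on the blank entry in the output above.

```python
import csv
import io
from collections import Counter

SAMPLE = """ID,Name,Age,Grade,Subject,Score,Attendance
1,Alice Smith,18,12,Math,95,98%
2,Bob Johnson,17,11,Physics,88,95%
3,Charlie Brown,16,10,Chemistry,78,92%
4,Diana Lee,,12,Biology,92,97%
5,Eva Martinez,18,12,Math,91,99%
6,Frank Wilson,17,11,,85,93%
7,Grace Taylor,16,10,Physics,89,96%
8,Henry Davis,18,12,Chemistry,,90%
9,Ivy Chen,17,11,Biology,94,98%
10,Jack Thompson,16,10,Math,82,
"""


def column_stats(lines, column):
    """Count how often each value occurs in one column, in a single pass."""
    reader = csv.DictReader(lines)
    if column not in (reader.fieldnames or []):
        raise ValueError(f"unknown column: {column}")
    counts = Counter()
    total = 0
    for row in reader:
        total += 1
        # Assumption: an empty cell is counted as its own value ("").
        counts[row[column] or ""] += 1
    return total, counts


total, counts = column_stats(io.StringIO(SAMPLE), "Age")
for value, n in counts.most_common():
    print(f"{value}: {n} occurrences ({n / total:.2%})")
```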
analyst extrema test_data.csv --column Score
Extrema for column 'Score':
Minimum value: 78
Maximum value: 95
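Finding a column's extrema needs only a running minimum and maximum, which fits the streaming model well. The Python sketch below is illustrative rather than a description of analyst's internals: `extrema` is a hypothetical helper, and it assumes that values are compared numerically and that empty or non-numeric cells are skipped.

```python
import csv
import io

SAMPLE = """ID,Name,Age,Grade,Subject,Score,Attendance
1,Alice Smith,18,12,Math,95,98%
2,Bob Johnson,17,11,Physics,88,95%
3,Charlie Brown,16,10,Chemistry,78,92%
4,Diana Lee,,12,Biology,92,97%
5,Eva Martinez,18,12,Math,91,99%
6,Frank Wilson,17,11,,85,93%
7,Grace Taylor,16,10,Physics,89,96%
8,Henry Davis,18,12,Chemistry,,90%
9,Ivy Chen,17,11,Biology,94,98%
10,Jack Thompson,16,10,Math,82,
"""


def extrema(lines, column):
    """Return (minimum, maximum) of a numeric column, or (None, None)."""
    reader = csv.DictReader(lines)
    lo = hi = None
    for row in reader:
        cell = (row.get(column) or "").strip()
        try:
            value = float(cell)
        except ValueError:
            # Assumption: empty and non-numeric cells are ignored.
            continue
        lo = value if lo is None else min(lo, value)
        hi = value if hi is None else max(hi, value)
    return lo, hi


lo, hi = extrema(io.StringIO(SAMPLE), "Score")
print(f"Minimum value: {lo:g}")
print(f"Maximum value: {hi:g}")
```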
analyst frequent-patterns test_data.csv --min-support 0.3
Frequent patterns (min support: 30.00%):
1-item frequent patterns:
Age:16,Grade:10 (support: 30.00%)
Age:17,Grade:11 (support: 30.00%)
Age:18,Grade:12 (support: 30.00%)
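The support of a pattern is the share of rows that contain every item in it. For the sample file, the three column:value pairs listed above can be reproduced by brute-force counting, as in the Python sketch below. This shows the definition of support only. The listing does not say which mining algorithm analyst uses, `frequent_pairs` is a hypothetical helper, and skipping empty cells is an assumption.

```python
import csv
import io
from collections import Counter
from itertools import combinations

SAMPLE = """ID,Name,Age,Grade,Subject,Score,Attendance
1,Alice Smith,18,12,Math,95,98%
2,Bob Johnson,17,11,Physics,88,95%
3,Charlie Brown,16,10,Chemistry,78,92%
4,Diana Lee,,12,Biology,92,97%
5,Eva Martinez,18,12,Math,91,99%
6,Frank Wilson,17,11,,85,93%
7,Grace Taylor,16,10,Physics,89,96%
8,Henry Davis,18,12,Chemistry,,90%
9,Ivy Chen,17,11,Biology,94,98%
10,Jack Thompson,16,10,Math,82,
"""


def frequent_pairs(lines, min_support):
    """Return {(item_a, item_b): support} for pairs meeting min_support."""
    reader = csv.DictReader(lines)
    pair_counts = Counter()
    total = 0
    for row in reader:
        total += 1
        # One item per non-empty cell, written as "Column:value".
        items = sorted(
            f"{name}:{value}"
            for name, value in row.items()
            if name is not None and value
        )
        pair_counts.update(combinations(items, 2))
    if total == 0:
        return {}
    return {
        pair: n / total
        for pair, n in pair_counts.items()
        if n / total >= min_support
    }


patterns = frequent_pairs(io.StringIO(SAMPLE), 0.3)
for pair, support in sorted(patterns.items()):
    print(f"{','.join(pair)} (support: {support:.2%})")
```

Counting every pair in every row grows quadratically with the number of columns, so this approach only suits narrow files. Algorithms such as Apriori or FP-growth prune the search instead.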
Dependencies
~4–13MB
~124K SLoC