1 unstable release
0.1.0 | Nov 5, 2022 |
#334 in Machine learning
4.5K SLoC
A preprint of Stitch is available here.
Run cargo run --release --bin=compress -- data/cogsci/nuts-bolts.json --max-arity=3 --iterations=10
=======Compression Summary=======
Found 10 inventions
Cost Improvement: (11.93x better) 1919558 -> 160946
fn_0 (1.78x wrt orig): utility: 837792 | final_cost: 1079238 | 1.78x | uses: 320 | body: [fn_0 arity=2: (T (repeat (T l (M 1 0 -0.5 (/ 0.5 (tan (/ pi #1))))) #1 (M 1 (/ (* 2 pi) #1) 0 0)) (M #0 0 0 0))]
fn_1 (3.81x wrt orig): utility: 572767 | final_cost: 503538 | 2.14x | uses: 190 | body: [fn_1 arity=3: (repeat (T (T #2 (M 0.5 0 0 0)) (M 1 0 (* #1 (cos (/ pi 4))) (* #1 (sin (/ pi 4))))) #0 (M 1 (/ (* 2 pi) #0) 0 0))]
fn_2 (6.06x wrt orig): utility: 185436 | final_cost: 316890 | 1.59x | uses: 168 | body: [fn_2 arity=1: (T (T c (M 2 0 0 0)) (M #0 0 0 0))]
fn_3 (7.18x wrt orig): utility: 48984 | final_cost: 267198 | 1.19x | uses: 82 | body: [fn_3 arity=2: (C #1 (T r (M #0 0 0 0)))]
fn_4 (8.29x wrt orig): utility: 35046 | final_cost: 231646 | 1.15x | uses: 88 | body: [fn_4 arity=2: (C (fn_0 4 #1) (fn_0 #0 6))]
fn_5 (9.04x wrt orig): utility: 18885 | final_cost: 212456 | 1.09x | uses: 95 | body: [fn_5 arity=3: (C #2 (fn_1 #1 1.5 #0))]
fn_6 (9.93x wrt orig): utility: 18885 | final_cost: 193266 | 1.10x | uses: 95 | body: [fn_6 arity=3: (C #2 (fn_1 #1 3 #0))]
fn_7 (10.53x wrt orig): utility: 10604 | final_cost: 182358 | 1.06x | uses: 54 | body: [fn_7 arity=2: (C #1 (fn_0 #0 6))]
fn_8 (11.20x wrt orig): utility: 10503 | final_cost: 171450 | 1.06x | uses: 36 | body: [fn_8 arity=2: (C (fn_0 4 #1) (fn_2 #0))]
fn_9 (11.93x wrt orig): utility: 10202 | final_cost: 160946 | 1.07x | uses: 52 | body: [fn_9 arity=0: (fn_4 4.25 6)]
Time: 227ms
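- fn_0 is the automatically generated name of the abstraction.
- (1.78x wrt orig) means the compressed programs are 1.78x smaller than the original programs, while the other 1.78x later in the line is the compression ratio relative to the previous step (for the first step the two are the same).
- utility: 837792 measures how many primitives are removed from the programs when they are rewritten with the new primitive (divide by 100 to get the approximate number of primitives removed).
- uses: 320 means the abstraction is used in 320 places in the set of programs.
- Note that in these abstractions …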
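As a rough illustration of how the two ratios on each summary line relate to the costs, the Python sketch below parses two of the lines shown above (with the body shortened to `[...]`) and recomputes both ratios. It assumes that "wrt orig" is the original cost divided by `final_cost`, and that the second ratio is the previous step's `final_cost` divided by the current one. The regular expression and the helper names `parse_line` and `ratios` are illustrative and are not part of stitch.

```python
import re

# Original corpus cost, from "Cost Improvement: (11.93x better) 1919558 -> 160946".
ORIGINAL_COST = 1919558

# Two summary lines from the run above; the body is shortened to "[...]".
SUMMARY_LINES = [
    "fn_0 (1.78x wrt orig): utility: 837792 | final_cost: 1079238 | 1.78x | uses: 320 | body: [...]",
    "fn_1 (3.81x wrt orig): utility: 572767 | final_cost: 503538 | 2.14x | uses: 190 | body: [...]",
]

# Illustrative pattern for one summary line (an assumption about the layout,
# based only on the output shown above).
LINE_RE = re.compile(
    r"(?P<name>fn_\d+) \((?P<wrt_orig>[\d.]+)x wrt orig\): "
    r"utility: (?P<utility>\d+) \| final_cost: (?P<final_cost>\d+) \| "
    r"(?P<wrt_prev>[\d.]+)x \| uses: (?P<uses>\d+)"
)


def parse_line(line):
    """Extract the numeric fields of one summary line into a dict."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognized summary line: {line!r}")
    fields = match.groupdict()
    return {
        "name": fields["name"],
        "utility": int(fields["utility"]),
        "final_cost": int(fields["final_cost"]),
        "uses": int(fields["uses"]),
    }


def ratios(lines, original_cost):
    """Return (name, ratio vs. original, ratio vs. previous step) per line."""
    previous_cost = original_cost
    result = []
    for line in lines:
        row = parse_line(line)
        wrt_orig = round(original_cost / row["final_cost"], 2)
        wrt_prev = round(previous_cost / row["final_cost"], 2)
        result.append((row["name"], wrt_orig, wrt_prev))
        previous_cost = row["final_cost"]
    return result


RATIOS = ratios(SUMMARY_LINES, ORIGINAL_COST)
print(RATIOS)
```

Under these assumptions the recomputed values agree with the ratios printed on the two summary lines.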
From cargo run --release --bin=compress -- --help:
<FILE> json file to read compression input programs from
-a, --max-arity <MAX_ARITY>
max arity of abstractions to find (will find all from 0 to this number inclusive)
[default: 2]
extracts argument values from the json; specifically assumes a key value pair like
"stitch_args": "data/dc/logo_iteration_1_stitchargs.json -a3 -t8 --fmt=dreamcoder
--dreamcoder-drop-last --no-mismatch-check", in the toplevel dictionary of the json. All
other commandline args get discarded when you specify this option
-b, --batch <BATCH>
how many worklist items a thread will take at once [default: 1]
anything related to running a dreamcoder comparison
threads will autoadjust how large their batches are based on the worklist size
--fmt <FMT>
the format of the input file, e.g. 'programs-list' for a simple JSON array of programs
or 'dreamcoder' for a JSON in the style expected by the original dreamcoder codebase.
See [formats.rs] for options or to add new ones [default: programs-list] [possible
values: dreamcoder, programs-list, split-programs-list]
for debugging: prunes all branches except the one that leads to the `--track`
-h, --help
Print help information
--hole-choice <HOLE_CHOICE>
Method for choosing hole to expand at each step, doesn't have a huge effect [default:
depth-first] [possible values: random, breadth-first, depth-first, max-largest-subset,
high-entropy, low-entropy, max-cost, min-cost, many-groups, few-groups, few-apps]
-i, --iterations <ITERATIONS>
Number of iterations to run compression for (number of inventions to find) [default: 3]
-n, --inv-candidates <INV_CANDIDATES>
Number of invention candidates compression_step should return in a *single* step. Note
that these will be the top n optimal candidates modulo subsumption pruning (and the top-
1 is guaranteed to be globally optimal) [default: 1]
--no-mismatch-check
disables the safety check for the utility being correct; you only want to do this if you
truly dont mind unsoundness for a minute
--no-opt
disable all optimizations
disable the arity zero priming optimization
disable the force multiuse pruning optimization
disable the free variable pruning optimization
disable the single task pruning optimization
disable the single structurally hashed subtree match pruning
disable the upper bound pruning optimization
disable the useless abstraction pruning optimization
makes it so utility is based purely on corpus size without adding in the abstraction
Disable stat logging - note that stat logging in multithreading requires taking a mutex
so it can be a source of slowdown in the massively multithreaded case, hence this flag
to disable it
makes it so inventions cant start with a lambda at the top
-o, --out <OUT>
json output file [default: out/out.json]
--print-stats <PRINT_STATS>
print stats this often (0 means never) [default: 0]
-r, --show-rewritten
print out programs rewritten under abstraction
whenever you finish an invention do a full rewrite to check that rewriting doesnt raise
a cost mismatch exception
--save-rewritten <SAVE_REWRITTEN>
saves the rewritten frontiers in an input-readable format
shuffle order of set of inventions
-t, --threads <THREADS>
number of threads (no parallelism if set to 1) [default: 1]
--track <TRACK>
for debugging: pattern or abstraction to track
--truncate <TRUNCATE>
truncate set of inventions to include only this many (happens after shuffle if shuffle
is also specified)
calculate utility exhaustively by performing a full rewrite; mainly used when cost
mismatches are happening and we need something slow but accurate
prints whenever a new best abstraction is found
prints every worklist item as it is processed (will slow things down a ton due to
rendering out expressions)
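The help text above describes the programs-list format as a simple JSON array of programs. The Python sketch below writes such a file and assembles a matching command line. It assumes each program is an s-expression string; the two toy programs, the temporary output path, and the helper name `write_programs_list` are illustrative. The command is only printed, not executed.

```python
import json
import os
import tempfile

# Two toy programs written as s-expression strings (an assumption about
# what a "program" looks like in the programs-list format).
PROGRAMS = ["(a a a)", "(b b b)"]


def write_programs_list(programs, path):
    """Write the programs as a plain JSON array of strings."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(programs, handle, indent=2)


# Illustrative location; any writable path works.
INPUT_PATH = os.path.join(tempfile.mkdtemp(), "programs.json")
write_programs_list(PROGRAMS, INPUT_PATH)

# Read the file back to confirm it is a JSON array of the same programs.
with open(INPUT_PATH, encoding="utf-8") as handle:
    LOADED = json.load(handle)

# Command line built only from flags listed in the help text above.
COMMAND = [
    "cargo", "run", "--release", "--bin=compress", "--",
    INPUT_PATH,
    "--fmt=programs-list",
    "--max-arity=2",
    "--iterations=1",
]
print(" ".join(COMMAND))
```

Because programs-list is the default value of --fmt, the flag could be left out; it is spelled out here only to make the format explicit.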
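To disable all optimizations, run cargo run --release --bin=compress -- data/cogsci/nuts-bolts.json --no-opt
or see the flags that begin with --no-opt- to disable individual optimizations.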
Python bindings
Initial Python bindings are currently available.
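- Run the build command for your operating system. If this command does not work, let me know or open an issue! It may vary by operating system, and the current commands may be overfit to my computer.
- Add the bindings directory to your Python path, for example by adding export PYTHONPATH="$PYTHONPATH:path/to/stitch/bindings/" to your shell configuration, or to whatever your specific shell / venv uses. This puts the stitch.so file on your Python path, which lets you import it.
- Launch Python and try import stitch (if it succeeds, nothing should be printed).
- As a simple example, running the Python code
import stitch,json; result = json.loads(stitch.compression(["(a a a)", "(b b b)"], iterations=1, max_arity=2)); print("Result:", result)
should find the (#0 #0 #0) abstraction.
- Note that for now it outputs a large Python dictionary similar to stitch's regular out/out.json output.
- More keyword arguments are available (the full list is in the source, part of which is a workaround for making the project generate a cdylib for the Python bindings). Basically, you can use the options listed by cargo run --release --bin=compress -- --help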
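The import check from the steps above can also be done from inside Python. The sketch below is a minimal illustration: it appends a bindings directory to `sys.path` at runtime, instead of exporting PYTHONPATH, and reports whether a module named `stitch` can be found. The directory is the same placeholder path used in the export line, and the helper name `stitch_importable` is hypothetical.

```python
import importlib.util
import sys

# Placeholder path copied from the PYTHONPATH example; point it at the
# directory that contains the built stitch.so in your checkout.
BINDINGS_DIR = "path/to/stitch/bindings/"


def stitch_importable(bindings_dir=BINDINGS_DIR):
    """Return True if a module named `stitch` can be located.

    The bindings directory is appended to sys.path first, which has the
    same effect for this process as extending PYTHONPATH.
    """
    if bindings_dir not in sys.path:
        sys.path.append(bindings_dir)
    return importlib.util.find_spec("stitch") is not None


AVAILABLE = stitch_importable()
if AVAILABLE:
    print("stitch found; `import stitch` should succeed silently")
else:
    print("stitch not found; check the build step and the bindings path")
```

Note that this only checks that some module named `stitch` is discoverable; it does not verify that the module is the one built from this repository.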
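… means: do not run any benchmarks, just load the file as if it were the results you had just generated. --baseline=master
… avoids the verbose "unrecognized option" error described here.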
If you have not installed it yet: cargo install flamegraph
cargo flamegraph --root --open --deterministic --output=out/flamegraph.svg --bin=compress -- data/cogsci/nuts-bolts.json
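This work was supported by the National Science Foundation (NSF) under grant no. 1918839, "Understanding the World Through Code" http://www.neurosymbolic.org/
This work was supported in part by the Defense Advanced Research Projects Agency (DARPA) under the program Symbiotic Design for Cyber Physical Systems (SDCPS), Contract No. FA8750-20-C-0542 (Systemic Generative Engineering). The views, opinions, and/or findings expressed are those of the author(s) and do not necessarily reflect the views of DARPA.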