23 stable releases
1.21.0 | January 20, 2024 |
---|---|
1.20.0 | July 19, 2023 |
1.19.0 | June 4, 2023 |
1.18.0 | December 17, 2022 |
0.0.0 | |
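iced-x86 is a fast and correct x86 (16/32/64-bit) instruction decoder, disassembler and assembler written in Rust.
- 👍 Supports all Intel and AMD instructions
- 👍 Correct: all instructions are tested, and iced has been tested against other disassemblers/assemblers (xed, gas, objdump, masm, dumpbin, nasm, ndisasm) and fuzzed
- 👍 100% Rust code
- 👍 The formatter supports masm, nasm, gas (AT&T) and Intel (XED) syntax, with many options to customize the output
- 👍 Very fast: decodes more than 250 MB/s and decodes+formats more than 130 MB/s
- 👍 Small decoded instructions, only 40 bytes, and the decoder doesn't allocate any memory
- 👍 Create instructions with the code assembler, eg. asm.mov(eax, edx)
- 👍 The encoder can be used to re-encode decoded instructions at any address
- 👍 API to get instruction info, eg. read/written registers, memory and rflags bits; CPUID feature flag, control flow info, etc
- 👍 Supports #![no_std] and WebAssembly
- 👍 Supports rustc 1.57.0 or later
- 👍 Few dependencies (lazy_static)
- 👍 License: MIT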
Usage
Add this to your Cargo.toml:
[dependencies]
iced-x86 = "1.21.0"
Or to customize which features to use:
[dependencies.iced-x86]
version = "1.21.0"
default-features = false
# See below for all features
features = ["std", "decoder", "masm"]
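Crate feature flags
You can enable/disable these in your Cargo.toml file.
- decoder: (👍 Enabled by default) Enables the decoder
- encoder: (👍 Enabled by default) Enables the encoder
- block_encoder: (👍 Enabled by default) Enables the BlockEncoder. This feature enables encoder
- op_code_info: (👍 Enabled by default) Enables getting instruction metadata (OpCodeInfo). This feature enables encoder
- instr_info: (👍 Enabled by default) Enables the instruction info code
- gas: (👍 Enabled by default) Enables the GNU Assembler (AT&T) formatter
- intel: (👍 Enabled by default) Enables the Intel (XED) formatter
- masm: (👍 Enabled by default) Enables the masm formatter
- nasm: (👍 Enabled by default) Enables the nasm formatter
- fast_fmt: (👍 Enabled by default) Enables SpecializedFormatter<TraitOptions> (and FastFormatter) (masm syntax), which is ~3.3x faster than the other formatters (the time includes decoding + formatting). Use it if formatting speed is more important than being able to re-assemble formatted instructions, or if you target wasm (this formatter uses less code).
- code_asm: Enables the CodeAssembler, which lets you easily create instructions, eg. a.xor(ecx, dword_ptr(edx)), instead of using the more verbose Instruction::with*() methods.
- serde: Enables serialization support (Instruction). It is not guaranteed to work if different versions of iced are used to serialize and deserialize.
- std: (👍 Enabled by default) Enables the std crate. Either std or no_std must be defined, but not both.
- no_std: Enables #![no_std]. Either std or no_std must be defined, but not both. This feature uses the alloc crate.
- mvex: Enables MVEX instructions (Knights Corner). You must also pass DecoderOptions::KNC to the Decoder constructor.
- exhaustive_enums: Enables exhaustive enums, ie. no enum has the #[non_exhaustive] attribute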
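How-tos
- Disassemble (decode and format instructions)
- Assemble instructions
- Disassemble with a symbol resolver
- Disassemble with colorized text
- Move code in memory (eg. hook a function)
- Get instruction info, eg. read/written regs/mem, control flow info, etc
- Get the virtual address of a memory operand
- Disassemble old/deprecated CPU instructions
- Disassemble as fast as possible
- Create and encode instructions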
Disassemble (decode and format instructions)
This example uses a Decoder and one of the Formatters to decode and format the code, eg. GasFormatter, IntelFormatter, MasmFormatter, NasmFormatter, SpecializedFormatter<TraitOptions> (or FastFormatter).
use iced_x86::{Decoder, DecoderOptions, Formatter, Instruction, NasmFormatter};
/*
This method produces the following output:
00007FFAC46ACDA4 48895C2410 mov [rsp+10h],rbx
00007FFAC46ACDA9 4889742418 mov [rsp+18h],rsi
00007FFAC46ACDAE 55 push rbp
00007FFAC46ACDAF 57 push rdi
00007FFAC46ACDB0 4156 push r14
00007FFAC46ACDB2 488DAC2400FFFFFF lea rbp,[rsp-100h]
00007FFAC46ACDBA 4881EC00020000 sub rsp,200h
00007FFAC46ACDC1 488B0518570A00 mov rax,[rel 7FFA`C475`24E0h]
00007FFAC46ACDC8 4833C4 xor rax,rsp
00007FFAC46ACDCB 488985F0000000 mov [rbp+0F0h],rax
00007FFAC46ACDD2 4C8B052F240A00 mov r8,[rel 7FFA`C474`F208h]
00007FFAC46ACDD9 488D05787C0400 lea rax,[rel 7FFA`C46F`4A58h]
00007FFAC46ACDE0 33FF xor edi,edi
*/
#[allow(dead_code)]
pub(crate) fn how_to_disassemble() {
    let bytes = EXAMPLE_CODE;
    let mut decoder =
        Decoder::with_ip(EXAMPLE_CODE_BITNESS, bytes, EXAMPLE_CODE_RIP, DecoderOptions::NONE);

    // Formatters: Masm*, Nasm*, Gas* (AT&T) and Intel* (XED).
    // For fastest code, see `SpecializedFormatter` which is ~3.3x faster. Use it if formatting
    // speed is more important than being able to re-assemble formatted instructions.
    let mut formatter = NasmFormatter::new();

    // Change some options, there are many more
    formatter.options_mut().set_digit_separator("`");
    formatter.options_mut().set_first_operand_char_index(10);

    // String implements FormatterOutput
    let mut output = String::new();

    // Initialize this outside the loop because decode_out() writes to every field
    let mut instruction = Instruction::default();

    // The decoder also implements Iterator/IntoIterator so you could use a for loop:
    //      for instruction in &mut decoder { /* ... */ }
    // or collect():
    //      let instructions: Vec<_> = decoder.into_iter().collect();
    // but can_decode()/decode_out() is a little faster:
    while decoder.can_decode() {
        // There's also a decode() method that returns an instruction but that also
        // means it copies an instruction (40 bytes):
        //     instruction = decoder.decode();
        decoder.decode_out(&mut instruction);

        // Format the instruction ("disassemble" it)
        output.clear();
        formatter.format(&instruction, &mut output);

        // Eg. "00007FFAC46ACDB2 488DAC2400FFFFFF     lea       rbp,[rsp-100h]"
        print!("{:016X} ", instruction.ip());
        let start_index = (instruction.ip() - EXAMPLE_CODE_RIP) as usize;
        let instr_bytes = &bytes[start_index..start_index + instruction.len()];
        for b in instr_bytes.iter() {
            print!("{:02X}", b);
        }
        // Pad the hex column: each missing byte takes two hex digits, so two spaces
        if instr_bytes.len() < HEXBYTES_COLUMN_BYTE_LENGTH {
            for _ in 0..HEXBYTES_COLUMN_BYTE_LENGTH - instr_bytes.len() {
                print!("  ");
            }
        }
        println!(" {}", output);
    }
}
const HEXBYTES_COLUMN_BYTE_LENGTH: usize = 10;
const EXAMPLE_CODE_BITNESS: u32 = 64;
const EXAMPLE_CODE_RIP: u64 = 0x0000_7FFA_C46A_CDA4;
static EXAMPLE_CODE: &[u8] = &[
0x48, 0x89, 0x5C, 0x24, 0x10, 0x48, 0x89, 0x74, 0x24, 0x18, 0x55, 0x57, 0x41, 0x56, 0x48, 0x8D,
0xAC, 0x24, 0x00, 0xFF, 0xFF, 0xFF, 0x48, 0x81, 0xEC, 0x00, 0x02, 0x00, 0x00, 0x48, 0x8B, 0x05,
0x18, 0x57, 0x0A, 0x00, 0x48, 0x33, 0xC4, 0x48, 0x89, 0x85, 0xF0, 0x00, 0x00, 0x00, 0x4C, 0x8B,
0x05, 0x2F, 0x24, 0x0A, 0x00, 0x48, 0x8D, 0x05, 0x78, 0x7C, 0x04, 0x00, 0x33, 0xFF,
];
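The column layout printed by the loop above can be isolated in a small helper. The following is a minimal sketch using only the standard library; `format_listing_line` and `HEX_COLUMN_BYTES` are hypothetical names, not part of iced-x86, and it assumes the padding is meant to be two spaces per missing byte so that the instruction text lines up.

```rust
/// Width of the hex-bytes column, in bytes (mirrors HEXBYTES_COLUMN_BYTE_LENGTH above).
const HEX_COLUMN_BYTES: usize = 10;

/// Builds one listing line: 16-digit address, hex bytes padded to a fixed
/// column width, then the already formatted instruction text.
/// Hypothetical helper for illustration, not an iced-x86 API.
fn format_listing_line(ip: u64, instr_bytes: &[u8], text: &str) -> String {
    let mut line = format!("{:016X} ", ip);
    for b in instr_bytes {
        line.push_str(&format!("{:02X}", b));
    }
    // Each missing byte would have taken two hex digits, so pad with two spaces per byte.
    // The range is empty when the instruction is longer than the column.
    for _ in instr_bytes.len()..HEX_COLUMN_BYTES {
        line.push_str("  ");
    }
    line.push(' ');
    line.push_str(text);
    line
}
```

Inside the decode loop, this could be called with `instruction.ip()`, the instruction's byte slice and the formatter's `output` string.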
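Assemble instructions
This lets you easily create instructions (eg. a.xor(eax, ecx)?) without having to use the more verbose Instruction::with*() functions.
This requires the code_asm feature, which is not enabled by default. Add it to your Cargo.toml: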
[dependencies.iced-x86]
version = "1.21.0"
features = ["code_asm"]
use iced_x86::code_asm::*;
#[allow(dead_code)]
pub(crate) fn how_to_use_code_assembler() -> Result<(), IcedError> {
let mut a = CodeAssembler::new(64)?;
// Anytime you add something to a register (or subtract from it), you create a
// memory operand. You can also call word_ptr(), dword_bcst() etc to create memory
// operands.
let _ = rax; // register
let _ = rax + 0; // memory with no size hint
let _ = ptr(rax); // memory with no size hint
let _ = rax + rcx * 4 - 123; // memory with no size hint
// To create a memory operand with only a displacement or only a base register,
// you can call one of the memory fns:
let _ = qword_ptr(123); // memory with a qword size hint
let _ = dword_bcst(rcx); // memory (broadcast) with a dword size hint
// To add a segment override, call the segment methods:
let _ = ptr(rax).fs(); // fs:[rax]
// Each mnemonic is a method
a.push(rcx)?;
// There are a few exceptions where you must append `_<opcount>` to the mnemonic to
// get the instruction you need:
a.ret()?;
a.ret_1(123)?;
// Use byte_ptr(), word_bcst(), etc to force the arg to a memory operand and to add a
// size hint
a.xor(byte_ptr(rdx+r14*4+123), 0x10)?;
// Prefixes are also methods
a.rep().stosd()?;
// Sometimes, you must add an integer suffix to help the compiler:
a.mov(rax, 0x1234_5678_9ABC_DEF0u64)?;
// Create labels that can be referenced by code
let mut loop_lbl1 = a.create_label();
let mut after_loop1 = a.create_label();
a.mov(ecx, 10)?;
a.set_label(&mut loop_lbl1)?;
// If needed, a zero-bytes instruction can be used as a label but this is optional
a.zero_bytes()?;
a.dec(ecx)?;
a.jp(after_loop1)?;
a.jne(loop_lbl1)?;
a.set_label(&mut after_loop1)?;
// It's possible to reference labels with RIP-relative addressing
let mut skip_data = a.create_label();
let mut data = a.create_label();
a.jmp(skip_data)?;
a.set_label(&mut data)?;
a.db(b"\x90\xCC\xF1\x90")?;
a.set_label(&mut skip_data)?;
a.lea(rax, ptr(data))?;
// AVX512 opmasks, {z}, {sae}, {er} and broadcasting are also supported:
a.vsqrtps(zmm16.k2().z(), dword_bcst(rcx))?;
a.vsqrtps(zmm1.k2().z(), zmm23.rd_sae())?;
// Sometimes, the encoder doesn't know if you want VEX or EVEX encoding.
// You can force EVEX globally like so:
a.set_prefer_vex(false);
a.vucomiss(xmm31, xmm15.sae())?;
a.vucomiss(xmm31, ptr(rcx))?;
// or call vex()/evex() to override the encoding option:
a.evex().vucomiss(xmm31, xmm15.sae())?;
a.vex().vucomiss(xmm15, xmm14)?;
// Encode all added instructions.
// Use `assemble_options()` if you must get the address of a label
let bytes = a.assemble(0x1234_5678)?;
assert_eq!(bytes.len(), 82);
// If you don't want to encode them, you can get all instructions by calling
// one of these methods:
let instrs = a.instructions(); // Get a reference to the internal vec
assert_eq!(instrs.len(), 19);
let instrs = a.take_instructions(); // Take ownership of the vec with all instructions
assert_eq!(instrs.len(), 19);
assert_eq!(a.instructions().len(), 0);
Ok(())
}
Disassemble with a symbol resolver
Creates a custom SymbolResolver that is called by a Formatter.
use iced_x86::{
Decoder, DecoderOptions, Formatter, Instruction, MasmFormatter, SymbolResolver, SymbolResult,
};
use std::collections::HashMap;
struct MySymbolResolver {
map: HashMap<u64, String>,
}
impl SymbolResolver for MySymbolResolver {
fn symbol(
&mut self, _instruction: &Instruction, _operand: u32, _instruction_operand: Option<u32>,
address: u64, _address_size: u32,
) -> Option<SymbolResult> {
if let Some(symbol_string) = self.map.get(&address) {
// The 'address' arg is the address of the symbol and doesn't have to be identical
// to the 'address' arg passed to symbol(). If it's different from the input
// address, the formatter will add +N or -N, eg. '[rax+symbol+123]'
Some(SymbolResult::with_str(address, symbol_string.as_str()))
} else {
None
}
}
}
#[allow(dead_code)]
pub(crate) fn how_to_resolve_symbols() {
let bytes = b"\x48\x8B\x8A\xA5\x5A\xA5\x5A";
let mut decoder = Decoder::new(64, bytes, DecoderOptions::NONE);
let instr = decoder.decode();
let mut sym_map: HashMap<u64, String> = HashMap::new();
sym_map.insert(0x5AA5_5AA5, String::from("my_data"));
let mut output = String::new();
let resolver = Box::new(MySymbolResolver { map: sym_map });
// Create a formatter that uses our symbol resolver
let mut formatter = MasmFormatter::with_options(Some(resolver), None);
// This will call the symbol resolver for each immediate / displacement
// it finds in the instruction.
formatter.format(&instr, &mut output);
// Prints: mov rcx,[rdx+my_data]
println!("{}", output);
}
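The comment in the resolver above notes that the returned symbol address may differ from the queried address, in which case the formatter adds +N or -N. A minimal sketch of such a lookup, assuming symbols are kept in a `BTreeMap` instead of a `HashMap`, is shown below; `nearest_symbol` is a hypothetical helper and not part of iced-x86.

```rust
use std::collections::BTreeMap;

/// Finds the closest symbol at or below `address`.
/// Returns the symbol's own address and its name; the difference
/// `address - symbol_address` is the "+N" offset.
/// Hypothetical helper for illustration, not an iced-x86 API.
fn nearest_symbol(map: &BTreeMap<u64, String>, address: u64) -> Option<(u64, &str)> {
    map.range(..=address)
        .next_back()
        .map(|(&addr, name)| (addr, name.as_str()))
}
```

A resolver built this way would presumably pass the returned symbol address (not the queried one) to `SymbolResult::with_str()`, leaving the offset to the formatter.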
Disassemble with colorized text
Creates a custom FormatterOutput that is called by a Formatter.
This example will fail to compile unless you install the colored crate, see below.
// This example uses crate colored = "2.0.0"
use colored::{ColoredString, Colorize};
use iced_x86::{
Decoder, DecoderOptions, Formatter, FormatterOutput, FormatterTextKind, IntelFormatter,
};
// Custom formatter output that stores the output in a vector.
struct MyFormatterOutput {
vec: Vec<(String, FormatterTextKind)>,
}
impl MyFormatterOutput {
pub fn new() -> Self {
Self { vec: Vec::new() }
}
}
impl FormatterOutput for MyFormatterOutput {
fn write(&mut self, text: &str, kind: FormatterTextKind) {
// This allocates a string. If that's a problem, just call print!() here
// instead of storing the result in a vector.
self.vec.push((String::from(text), kind));
}
}
#[allow(dead_code)]
pub(crate) fn how_to_colorize_text() {
let bytes = EXAMPLE_CODE;
let mut decoder =
Decoder::with_ip(EXAMPLE_CODE_BITNESS, bytes, EXAMPLE_CODE_RIP, DecoderOptions::NONE);
let mut formatter = IntelFormatter::new();
formatter.options_mut().set_first_operand_char_index(8);
let mut output = MyFormatterOutput::new();
for instruction in &mut decoder {
output.vec.clear();
// The formatter calls output.write() which will update vec with text/colors
formatter.format(&instruction, &mut output);
for (text, kind) in output.vec.iter() {
print!("{}", get_color(text.as_str(), *kind));
}
println!();
}
}
fn get_color(s: &str, kind: FormatterTextKind) -> ColoredString {
match kind {
FormatterTextKind::Directive | FormatterTextKind::Keyword => s.bright_yellow(),
FormatterTextKind::Prefix | FormatterTextKind::Mnemonic => s.bright_red(),
FormatterTextKind::Register => s.bright_blue(),
FormatterTextKind::Number => s.bright_cyan(),
_ => s.white(),
}
}
const EXAMPLE_CODE_BITNESS: u32 = 64;
const EXAMPLE_CODE_RIP: u64 = 0x0000_7FFA_C46A_CDA4;
static EXAMPLE_CODE: &[u8] = &[
0x48, 0x89, 0x5C, 0x24, 0x10, 0x48, 0x89, 0x74, 0x24, 0x18, 0x55, 0x57, 0x41, 0x56, 0x48, 0x8D,
0xAC, 0x24, 0x00, 0xFF, 0xFF, 0xFF, 0x48, 0x81, 0xEC, 0x00, 0x02, 0x00, 0x00, 0x48, 0x8B, 0x05,
0x18, 0x57, 0x0A, 0x00, 0x48, 0x33, 0xC4, 0x48, 0x89, 0x85, 0xF0, 0x00, 0x00, 0x00, 0x4C, 0x8B,
0x05, 0x2F, 0x24, 0x0A, 0x00, 0x48, 0x8D, 0x05, 0x78, 0x7C, 0x04, 0x00, 0x33, 0xFF,
];
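If adding the colored crate is not an option, the same mapping can be approximated with raw ANSI escape sequences. This is only a sketch under that assumption: `TextKind` is a hypothetical stand-in for the FormatterTextKind variants matched in get_color() above, and the terminal is assumed to understand standard SGR color codes.

```rust
/// Hypothetical stand-in for the text kinds matched in get_color() above.
#[allow(dead_code)]
#[derive(Clone, Copy)]
enum TextKind {
    Directive,
    Keyword,
    Prefix,
    Mnemonic,
    Register,
    Number,
    Other,
}

/// Wraps `text` in an ANSI SGR color sequence and resets the color afterwards.
fn ansi_colorize(text: &str, kind: TextKind) -> String {
    let code = match kind {
        TextKind::Directive | TextKind::Keyword => 93, // bright yellow
        TextKind::Prefix | TextKind::Mnemonic => 91,   // bright red
        TextKind::Register => 94,                      // bright blue
        TextKind::Number => 96,                        // bright cyan
        TextKind::Other => 37,                         // white
    };
    format!("\x1b[{}m{}\x1b[0m", code, text)
}
```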
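Move code in memory (eg. hook a function)
Uses the instruction info API and the encoder to patch a function so that it jumps to the programmer's function.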
use iced_x86::{
BlockEncoder, BlockEncoderOptions, Code, Decoder, DecoderOptions, FlowControl, Formatter,
IcedError, Instruction, InstructionBlock, NasmFormatter, OpKind,
};
// Decodes instructions from some address, then encodes them starting at some
// other address. This can be used to hook a function. You decode enough instructions
// until you have enough bytes to add a JMP instruction that jumps to your code.
// Your code will then conditionally jump to the original code that you re-encoded.
//
// This code uses the BlockEncoder which will help with some things, eg. converting
// short branches to longer branches if the target is too far away.
//
// 64-bit mode also supports RIP relative addressing, but the encoder can't rewrite
// those to use a longer displacement. If any of the moved instructions have RIP
// relative addressing and it tries to access data too far away, the encoder will fail.
// The easiest solution is to use OS alloc functions that allocate memory close to the
// original code (+/-2GB).
/*
This method produces the following output:
Original code:
00007FFAC46ACDA4 mov [rsp+10h],rbx
00007FFAC46ACDA9 mov [rsp+18h],rsi
00007FFAC46ACDAE push rbp
00007FFAC46ACDAF push rdi
00007FFAC46ACDB0 push r14
00007FFAC46ACDB2 lea rbp,[rsp-100h]
00007FFAC46ACDBA sub rsp,200h
00007FFAC46ACDC1 mov rax,[rel 7FFAC47524E0h]
00007FFAC46ACDC8 xor rax,rsp
00007FFAC46ACDCB mov [rbp+0F0h],rax
00007FFAC46ACDD2 mov r8,[rel 7FFAC474F208h]
00007FFAC46ACDD9 lea rax,[rel 7FFAC46F4A58h]
00007FFAC46ACDE0 xor edi,edi
Original + patched code:
00007FFAC46ACDA4 mov rax,123456789ABCDEF0h
00007FFAC46ACDAE jmp rax
00007FFAC46ACDB0 push r14
00007FFAC46ACDB2 lea rbp,[rsp-100h]
00007FFAC46ACDBA sub rsp,200h
00007FFAC46ACDC1 mov rax,[rel 7FFAC47524E0h]
00007FFAC46ACDC8 xor rax,rsp
00007FFAC46ACDCB mov [rbp+0F0h],rax
00007FFAC46ACDD2 mov r8,[rel 7FFAC474F208h]
00007FFAC46ACDD9 lea rax,[rel 7FFAC46F4A58h]
00007FFAC46ACDE0 xor edi,edi
Moved code:
00007FFAC48ACDA4 mov [rsp+10h],rbx
00007FFAC48ACDA9 mov [rsp+18h],rsi
00007FFAC48ACDAE push rbp
00007FFAC48ACDAF push rdi
00007FFAC48ACDB0 jmp 00007FFAC46ACDB0h
*/
#[allow(dead_code)]
pub(crate) fn how_to_move_code() -> Result<(), IcedError> {
let example_code = EXAMPLE_CODE.to_vec();
println!("Original code:");
disassemble(&example_code, EXAMPLE_CODE_RIP);
let mut decoder = Decoder::with_ip(
EXAMPLE_CODE_BITNESS,
&example_code,
EXAMPLE_CODE_RIP,
DecoderOptions::NONE,
);
// In 64-bit mode, we need 12 bytes to jump to any address:
// mov rax,imm64 // 10
// jmp rax // 2
// We overwrite rax because it's probably not used by the called function.
// In 32-bit mode, a normal JMP is just 5 bytes
let required_bytes = 10 + 2;
let mut total_bytes = 0;
let mut orig_instructions: Vec<Instruction> = Vec::new();
for instr in &mut decoder {
orig_instructions.push(instr);
total_bytes += instr.len() as u32;
if instr.is_invalid() {
panic!("Found garbage");
}
if total_bytes >= required_bytes {
break;
}
match instr.flow_control() {
FlowControl::Next => {}
FlowControl::UnconditionalBranch => {
if instr.op0_kind() == OpKind::NearBranch64 {
let _target = instr.near_branch_target();
// You could check if it's just jumping forward a few bytes and follow it
// but this is a simple example so we'll fail.
}
panic!("Not supported by this simple example");
}
FlowControl::IndirectBranch
| FlowControl::ConditionalBranch
| FlowControl::Return
| FlowControl::Call
| FlowControl::IndirectCall
| FlowControl::Interrupt
| FlowControl::XbeginXabortXend
| FlowControl::Exception => panic!("Not supported by this simple example"),
}
}
if total_bytes < required_bytes {
panic!("Not enough bytes!");
}
assert!(!orig_instructions.is_empty());
// Create a JMP instruction that branches to the original code, except those instructions
// that we'll re-encode. We don't need to do it if it already ends in 'ret'
let (jmp_back_addr, add) = {
let last_instr = orig_instructions.last().unwrap();
if last_instr.flow_control() != FlowControl::Return {
(last_instr.next_ip(), true)
} else {
(last_instr.next_ip(), false)
}
};
if add {
orig_instructions.push(Instruction::with_branch(Code::Jmp_rel32_64, jmp_back_addr)?);
}
// Relocate the code to some new location. It can fix short/near branches and
// convert them to short/near/long forms if needed. This also works even if it's a
// jrcxz/loop/loopcc instruction which only have short forms.
//
// It can currently only fix RIP relative operands if the new location is within 2GB
// of the target data location.
//
// Note that a block is not the same thing as a basic block. A block can contain any
// number of instructions, including any number of branch instructions. One block
// should be enough unless you must relocate different blocks to different locations.
let relocated_base_address = EXAMPLE_CODE_RIP + 0x20_0000;
let block = InstructionBlock::new(&orig_instructions, relocated_base_address);
// This method can also encode more than one block but that's rarely needed, see above comment.
let result = match BlockEncoder::encode(decoder.bitness(), block, BlockEncoderOptions::NONE) {
Err(err) => panic!("{}", err),
Ok(result) => result,
};
let new_code = result.code_buffer;
// Patch the original code. Pretend that we use some OS API to write to memory...
// We could use the BlockEncoder/Encoder for this but it's easy to do yourself too.
// This is 'mov rax,imm64; jmp rax'
const YOUR_FUNC: u64 = 0x1234_5678_9ABC_DEF0; // Address of your code
let mut example_code = example_code.to_vec();
example_code[0] = 0x48; // \ 'MOV RAX,imm64'
example_code[1] = 0xB8; // /
let mut v = YOUR_FUNC;
for p in &mut example_code[2..10] {
*p = v as u8;
v >>= 8;
}
example_code[10] = 0xFF; // \ JMP RAX
example_code[11] = 0xE0; // /
// Disassemble it
println!("Original + patched code:");
disassemble(&example_code, EXAMPLE_CODE_RIP);
// Disassemble the moved code
println!("Moved code:");
disassemble(&new_code, relocated_base_address);
Ok(())
}
fn disassemble(data: &[u8], ip: u64) {
let mut formatter = NasmFormatter::new();
let mut output = String::new();
let mut decoder = Decoder::with_ip(EXAMPLE_CODE_BITNESS, data, ip, DecoderOptions::NONE);
for instruction in &mut decoder {
output.clear();
formatter.format(&instruction, &mut output);
println!("{:016X} {}", instruction.ip(), output);
}
println!();
}
const EXAMPLE_CODE_BITNESS: u32 = 64;
const EXAMPLE_CODE_RIP: u64 = 0x0000_7FFA_C46A_CDA4;
static EXAMPLE_CODE: &[u8] = &[
0x48, 0x89, 0x5C, 0x24, 0x10, 0x48, 0x89, 0x74, 0x24, 0x18, 0x55, 0x57, 0x41, 0x56, 0x48, 0x8D,
0xAC, 0x24, 0x00, 0xFF, 0xFF, 0xFF, 0x48, 0x81, 0xEC, 0x00, 0x02, 0x00, 0x00, 0x48, 0x8B, 0x05,
0x18, 0x57, 0x0A, 0x00, 0x48, 0x33, 0xC4, 0x48, 0x89, 0x85, 0xF0, 0x00, 0x00, 0x00, 0x4C, 0x8B,
0x05, 0x2F, 0x24, 0x0A, 0x00, 0x48, 0x8D, 0x05, 0x78, 0x7C, 0x04, 0x00, 0x33, 0xFF,
];
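Two pieces of arithmetic in the example above can be checked in isolation: the 12-byte `mov rax,imm64; jmp rax` patch, and the ±2GB limit on rel32 branches and RIP-relative operands. The sketch below uses only the standard library; `make_jump_patch` and `rel32` are hypothetical helper names, not iced-x86 APIs.

```rust
use std::convert::TryFrom;

/// Builds the 12-byte patch `mov rax, imm64; jmp rax` written by the example above.
fn make_jump_patch(target: u64) -> [u8; 12] {
    let mut patch = [0u8; 12];
    patch[0] = 0x48; // \ MOV RAX,imm64
    patch[1] = 0xB8; // /
    // The immediate is stored little-endian, same as the shift loop in the example
    patch[2..10].copy_from_slice(&target.to_le_bytes());
    patch[10] = 0xFF; // \ JMP RAX
    patch[11] = 0xE0; // /
    patch
}

/// Returns the signed 32-bit displacement from `next_ip` (the address right after
/// the instruction) to `target`, or None if the target is outside the +/-2GB range.
fn rel32(next_ip: u64, target: u64) -> Option<i32> {
    let diff = target.wrapping_sub(next_ip) as i64;
    i32::try_from(diff).ok()
}
```

For the moved code in the expected output, the 5-byte jmp at 00007FFAC48ACDB0 ends at 00007FFAC48ACDB5, so its displacement back to 00007FFAC46ACDB0 is -0x200005, which fits in 32 bits.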
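Get instruction info, eg. read/written regs/mem, control flow info, etc
Shows how to get used registers/memory and other info. It uses Instruction methods and an InstructionInfoFactory to get this info.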
use iced_x86::{
ConditionCode, Decoder, DecoderOptions, Instruction, InstructionInfoFactory, OpKind, RflagsBits,
};
/*
This method produces the following output:
00007FFAC46ACDA4 mov [rsp+10h],rbx
OpCode: o64 89 /r
Instruction: MOV r/m64, r64
Encoding: Legacy
Mnemonic: Mov
Code: Mov_rm64_r64
CpuidFeature: X64
FlowControl: Next
Displacement offset = 4, size = 1
Memory size: 8
Op0Access: Write
Op1Access: Read
Op0: r64_or_mem
Op1: r64_reg
Used reg: RSP:Read
Used reg: RBX:Read
Used mem: [SS:RSP+0x10;UInt64;Write]
00007FFAC46ACDA9 mov [rsp+18h],rsi
OpCode: o64 89 /r
Instruction: MOV r/m64, r64
Encoding: Legacy
Mnemonic: Mov
Code: Mov_rm64_r64
CpuidFeature: X64
FlowControl: Next
Displacement offset = 4, size = 1
Memory size: 8
Op0Access: Write
Op1Access: Read
Op0: r64_or_mem
Op1: r64_reg
Used reg: RSP:Read
Used reg: RSI:Read
Used mem: [SS:RSP+0x18;UInt64;Write]
00007FFAC46ACDAE push rbp
OpCode: o64 50+ro
Instruction: PUSH r64
Encoding: Legacy
Mnemonic: Push
Code: Push_r64
CpuidFeature: X64
FlowControl: Next
SP Increment: -8
Op0Access: Read
Op0: r64_opcode
Used reg: RBP:Read
Used reg: RSP:ReadWrite
Used mem: [SS:RSP+0xFFFFFFFFFFFFFFF8;UInt64;Write]
00007FFAC46ACDAF push rdi
OpCode: o64 50+ro
Instruction: PUSH r64
Encoding: Legacy
Mnemonic: Push
Code: Push_r64
CpuidFeature: X64
FlowControl: Next
SP Increment: -8
Op0Access: Read
Op0: r64_opcode
Used reg: RDI:Read
Used reg: RSP:ReadWrite
Used mem: [SS:RSP+0xFFFFFFFFFFFFFFF8;UInt64;Write]
00007FFAC46ACDB0 push r14
OpCode: o64 50+ro
Instruction: PUSH r64
Encoding: Legacy
Mnemonic: Push
Code: Push_r64
CpuidFeature: X64
FlowControl: Next
SP Increment: -8
Op0Access: Read
Op0: r64_opcode
Used reg: R14:Read
Used reg: RSP:ReadWrite
Used mem: [SS:RSP+0xFFFFFFFFFFFFFFF8;UInt64;Write]
00007FFAC46ACDB2 lea rbp,[rsp-100h]
OpCode: o64 8D /r
Instruction: LEA r64, m
Encoding: Legacy
Mnemonic: Lea
Code: Lea_r64_m
CpuidFeature: X64
FlowControl: Next
Displacement offset = 4, size = 4
Op0Access: Write
Op1Access: NoMemAccess
Op0: r64_reg
Op1: mem
Used reg: RBP:Write
Used reg: RSP:Read
00007FFAC46ACDBA sub rsp,200h
OpCode: o64 81 /5 id
Instruction: SUB r/m64, imm32
Encoding: Legacy
Mnemonic: Sub
Code: Sub_rm64_imm32
CpuidFeature: X64
FlowControl: Next
Immediate offset = 3, size = 4
RFLAGS Written: OF, SF, ZF, AF, CF, PF
RFLAGS Modified: OF, SF, ZF, AF, CF, PF
Op0Access: ReadWrite
Op1Access: Read
Op0: r64_or_mem
Op1: imm32sex64
Used reg: RSP:ReadWrite
00007FFAC46ACDC1 mov rax,[7FFAC47524E0h]
OpCode: o64 8B /r
Instruction: MOV r64, r/m64
Encoding: Legacy
Mnemonic: Mov
Code: Mov_r64_rm64
CpuidFeature: X64
FlowControl: Next
Displacement offset = 3, size = 4
Memory size: 8
Op0Access: Write
Op1Access: Read
Op0: r64_reg
Op1: r64_or_mem
Used reg: RAX:Write
Used mem: [DS:0x7FFAC47524E0;UInt64;Read]
00007FFAC46ACDC8 xor rax,rsp
OpCode: o64 33 /r
Instruction: XOR r64, r/m64
Encoding: Legacy
Mnemonic: Xor
Code: Xor_r64_rm64
CpuidFeature: X64
FlowControl: Next
RFLAGS Written: SF, ZF, PF
RFLAGS Cleared: OF, CF
RFLAGS Undefined: AF
RFLAGS Modified: OF, SF, ZF, AF, CF, PF
Op0Access: ReadWrite
Op1Access: Read
Op0: r64_reg
Op1: r64_or_mem
Used reg: RAX:ReadWrite
Used reg: RSP:Read
00007FFAC46ACDCB mov [rbp+0F0h],rax
OpCode: o64 89 /r
Instruction: MOV r/m64, r64
Encoding: Legacy
Mnemonic: Mov
Code: Mov_rm64_r64
CpuidFeature: X64
FlowControl: Next
Displacement offset = 3, size = 4
Memory size: 8
Op0Access: Write
Op1Access: Read
Op0: r64_or_mem
Op1: r64_reg
Used reg: RBP:Read
Used reg: RAX:Read
Used mem: [SS:RBP+0xF0;UInt64;Write]
00007FFAC46ACDD2 mov r8,[7FFAC474F208h]
OpCode: o64 8B /r
Instruction: MOV r64, r/m64
Encoding: Legacy
Mnemonic: Mov
Code: Mov_r64_rm64
CpuidFeature: X64
FlowControl: Next
Displacement offset = 3, size = 4
Memory size: 8
Op0Access: Write
Op1Access: Read
Op0: r64_reg
Op1: r64_or_mem
Used reg: R8:Write
Used mem: [DS:0x7FFAC474F208;UInt64;Read]
00007FFAC46ACDD9 lea rax,[7FFAC46F4A58h]
OpCode: o64 8D /r
Instruction: LEA r64, m
Encoding: Legacy
Mnemonic: Lea
Code: Lea_r64_m
CpuidFeature: X64
FlowControl: Next
Displacement offset = 3, size = 4
Op0Access: Write
Op1Access: NoMemAccess
Op0: r64_reg
Op1: mem
Used reg: RAX:Write
00007FFAC46ACDE0 xor edi,edi
OpCode: o32 33 /r
Instruction: XOR r32, r/m32
Encoding: Legacy
Mnemonic: Xor
Code: Xor_r32_rm32
CpuidFeature: INTEL386
FlowControl: Next
RFLAGS Cleared: OF, SF, CF
RFLAGS Set: ZF, PF
RFLAGS Undefined: AF
RFLAGS Modified: OF, SF, ZF, AF, CF, PF
Op0Access: Write
Op1Access: None
Op0: r32_reg
Op1: r32_or_mem
Used reg: RDI:Write
*/
#[allow(dead_code)]
pub(crate) fn how_to_get_instruction_info() {
let mut decoder = Decoder::with_ip(
EXAMPLE_CODE_BITNESS,
EXAMPLE_CODE,
EXAMPLE_CODE_RIP,
DecoderOptions::NONE,
);
// Use a factory to create the instruction info if you need register and
// memory usage. If it's something else, eg. encoding, flags, etc, there
// are Instruction methods that can be used instead.
let mut info_factory = InstructionInfoFactory::new();
let mut instr = Instruction::default();
while decoder.can_decode() {
decoder.decode_out(&mut instr);
// Gets offsets in the instruction of the displacement and immediates and their sizes.
// This can be useful if there are relocations in the binary. The encoder has a similar
// method. This method must be called after decode() and you must pass in the last
// instruction decode() returned.
let offsets = decoder.get_constant_offsets(&instr);
// For quick hacks, it's fine to use the Display trait to format an instruction,
// but for real code, use a formatter, eg. MasmFormatter. See other examples.
println!("{:016X} {}", instr.ip(), instr);
let op_code = instr.op_code();
let info = info_factory.info(&instr);
let fpu_info = instr.fpu_stack_increment_info();
println!(" OpCode: {}", op_code.op_code_string());
println!(" Instruction: {}", op_code.instruction_string());
println!(" Encoding: {:?}", instr.encoding());
println!(" Mnemonic: {:?}", instr.mnemonic());
println!(" Code: {:?}", instr.code());
println!(
" CpuidFeature: {}",
instr
.cpuid_features()
.iter()
.map(|&a| format!("{:?}", a))
.collect::<Vec<String>>()
.join(" and ")
);
println!(" FlowControl: {:?}", instr.flow_control());
if fpu_info.writes_top() {
if fpu_info.increment() == 0 {
println!(" FPU TOP: the instruction overwrites TOP");
} else {
println!(" FPU TOP inc: {}", fpu_info.increment());
}
println!(
" FPU TOP cond write: {}",
if fpu_info.conditional() { "true" } else { "false" }
);
}
if offsets.has_displacement() {
println!(
" Displacement offset = {}, size = {}",
offsets.displacement_offset(),
offsets.displacement_size()
);
}
if offsets.has_immediate() {
println!(
" Immediate offset = {}, size = {}",
offsets.immediate_offset(),
offsets.immediate_size()
);
}
if offsets.has_immediate2() {
println!(
" Immediate #2 offset = {}, size = {}",
offsets.immediate_offset2(),
offsets.immediate_size2()
);
}
if instr.is_stack_instruction() {
println!(" SP Increment: {}", instr.stack_pointer_increment());
}
if instr.condition_code() != ConditionCode::None {
println!(" Condition code: {:?}", instr.condition_code());
}
if instr.rflags_read() != RflagsBits::NONE {
println!(" RFLAGS Read: {}", flags(instr.rflags_read()));
}
if instr.rflags_written() != RflagsBits::NONE {
println!(" RFLAGS Written: {}", flags(instr.rflags_written()));
}
if instr.rflags_cleared() != RflagsBits::NONE {
println!(" RFLAGS Cleared: {}", flags(instr.rflags_cleared()));
}
if instr.rflags_set() != RflagsBits::NONE {
println!(" RFLAGS Set: {}", flags(instr.rflags_set()));
}
if instr.rflags_undefined() != RflagsBits::NONE {
println!(" RFLAGS Undefined: {}", flags(instr.rflags_undefined()));
}
if instr.rflags_modified() != RflagsBits::NONE {
println!(" RFLAGS Modified: {}", flags(instr.rflags_modified()));
}
if instr.op_kinds().any(|op_kind| op_kind == OpKind::Memory) {
let size = instr.memory_size().size();
if size != 0 {
println!(" Memory size: {}", size);
}
}
for i in 0..instr.op_count() {
println!(" Op{}Access: {:?}", i, info.op_access(i));
}
for i in 0..op_code.op_count() {
println!(" Op{}: {:?}", i, op_code.op_kind(i));
}
for reg_info in info.used_registers() {
println!(" Used reg: {:?}", reg_info);
}
for mem_info in info.used_memory() {
println!(" Used mem: {:?}", mem_info);
}
}
}
fn flags(rf: u32) -> String {
fn append(sb: &mut String, s: &str) {
if !sb.is_empty() {
sb.push_str(", ");
}
sb.push_str(s);
}
let mut sb = String::new();
if (rf & RflagsBits::OF) != 0 {
append(&mut sb, "OF");
}
if (rf & RflagsBits::SF) != 0 {
append(&mut sb, "SF");
}
if (rf & RflagsBits::ZF) != 0 {
append(&mut sb, "ZF");
}
if (rf & RflagsBits::AF) != 0 {
append(&mut sb, "AF");
}
if (rf & RflagsBits::CF) != 0 {
append(&mut sb, "CF");
}
if (rf & RflagsBits::PF) != 0 {
append(&mut sb, "PF");
}
if (rf & RflagsBits::DF) != 0 {
append(&mut sb, "DF");
}
if (rf & RflagsBits::IF) != 0 {
append(&mut sb, "IF");
}
if (rf & RflagsBits::AC) != 0 {
append(&mut sb, "AC");
}
if (rf & RflagsBits::UIF) != 0 {
append(&mut sb, "UIF");
}
if sb.is_empty() {
sb.push_str("<empty>");
}
sb
}
const EXAMPLE_CODE_BITNESS: u32 = 64;
const EXAMPLE_CODE_RIP: u64 = 0x0000_7FFA_C46A_CDA4;
static EXAMPLE_CODE: &[u8] = &[
0x48, 0x89, 0x5C, 0x24, 0x10, 0x48, 0x89, 0x74, 0x24, 0x18, 0x55, 0x57, 0x41, 0x56, 0x48, 0x8D,
0xAC, 0x24, 0x00, 0xFF, 0xFF, 0xFF, 0x48, 0x81, 0xEC, 0x00, 0x02, 0x00, 0x00, 0x48, 0x8B, 0x05,
0x18, 0x57, 0x0A, 0x00, 0x48, 0x33, 0xC4, 0x48, 0x89, 0x85, 0xF0, 0x00, 0x00, 0x00, 0x4C, 0x8B,
0x05, 0x2F, 0x24, 0x0A, 0x00, 0x48, 0x8D, 0x05, 0x78, 0x7C, 0x04, 0x00, 0x33, 0xFF,
];
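The flags() helper above repeats the same test for every bit. A table-driven variant is sketched below as one possible alternative. The bit values in `FLAG_NAMES` are illustrative assumptions so that the snippet compiles on its own; real code should use the RflagsBits constants instead.

```rust
/// (bit, name) pairs in the same order as the flags() helper above.
/// The numeric values are assumptions for illustration only; use the
/// RflagsBits constants in real code.
const FLAG_NAMES: [(u32, &str); 10] = [
    (1 << 0, "OF"),
    (1 << 1, "SF"),
    (1 << 2, "ZF"),
    (1 << 3, "AF"),
    (1 << 4, "CF"),
    (1 << 5, "PF"),
    (1 << 6, "DF"),
    (1 << 7, "IF"),
    (1 << 8, "AC"),
    (1 << 9, "UIF"),
];

/// Joins the names of all set bits with ", ", or returns "<empty>" if none is set.
fn flags_to_string(rf: u32) -> String {
    let names: Vec<&str> = FLAG_NAMES
        .iter()
        .filter(|&&(bit, _)| (rf & bit) != 0)
        .map(|&(_, name)| name)
        .collect();
    if names.is_empty() {
        String::from("<empty>")
    } else {
        names.join(", ")
    }
}
```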
Get the virtual address of a memory operand
use iced_x86::{Decoder, DecoderOptions, Register};
#[allow(dead_code)]
pub(crate) fn how_to_get_virtual_address() {
    // add [rdi+r12*8-5AA5EDCCh],esi
    let bytes = b"\x42\x01\xB4\xE7\x34\x12\x5A\xA5";
    let mut decoder = Decoder::new(64, bytes, DecoderOptions::NONE);
    let instr = decoder.decode();

    let va = instr.virtual_address(0, 0, |register, _element_index, _element_size| {
        match register {
            // The base address of ES, CS, SS and DS is always 0 in 64-bit mode
            Register::ES | Register::CS | Register::SS | Register::DS => Some(0),
            Register::RDI => Some(0x0000_0000_1000_0000),
            Register::R12 => Some(0x0000_0004_0000_0000),
            _ => None,
        }
    });
    assert_eq!(va, Some(0x0000_001F_B55A_1234));
}
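The asserted value follows from the usual effective-address formula, base + index * scale + displacement, with a segment base of 0. The sketch below reproduces that arithmetic with the standard library only; `effective_address` is a hypothetical helper, not an iced-x86 API.

```rust
/// Computes base + index * scale + displacement with 64-bit wrap-around.
/// The segment base is assumed to be 0, as it is for ES/CS/SS/DS in 64-bit mode.
fn effective_address(base: u64, index: u64, scale: u64, displacement: i64) -> u64 {
    base.wrapping_add(index.wrapping_mul(scale))
        .wrapping_add(displacement as u64)
}
```

With RDI = 0x1000_0000, R12 = 0x4_0000_0000, a scale of 8 and a displacement of -0x5AA5_EDCC, this gives 0x1F_B55A_1234, the same value the example asserts.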
Disassemble old/deprecated CPU instructions
use iced_x86::{Decoder, DecoderOptions, Formatter, Instruction, NasmFormatter};
/*
This method produces the following output:
731E0A03 bndmov bnd1, [eax]
731E0A07 mov tr3, esi
731E0A0A rdshr [eax]
731E0A0D dmint
731E0A0F svdc [eax], cs
731E0A12 cpu_read
731E0A14 pmvzb mm1, [eax]
731E0A17 frinear
731E0A19 altinst
*/
#[allow(dead_code)]
pub(crate) fn how_to_disassemble_old_instrs() {
#[rustfmt::skip]
let bytes = &[
// bndmov bnd1,[eax]
0x66, 0x0F, 0x1A, 0x08,
// mov tr3,esi
0x0F, 0x26, 0xDE,
// rdshr [eax]
0x0F, 0x36, 0x00,
// dmint
0x0F, 0x39,
// svdc [eax],cs
0x0F, 0x78, 0x08,
// cpu_read
0x0F, 0x3D,
// pmvzb mm1,[eax]
0x0F, 0x58, 0x08,
// frinear
0xDF, 0xFC,
// altinst
0x0F, 0x3F,
];
// Enable decoding of Cyrix/Geode instructions, Centaur ALTINST, MOV to/from TR
// and MPX instructions.
// There are other options to enable other instructions such as UMOV, KNC, etc.
// These are deprecated instructions or only used by old CPUs so they're not
// enabled by default. Some newer instructions also use the same opcodes as
// some of these old instructions.
const DECODER_OPTIONS: u32 = DecoderOptions::MPX
| DecoderOptions::MOV_TR
| DecoderOptions::CYRIX
| DecoderOptions::CYRIX_DMI
| DecoderOptions::ALTINST;
let mut decoder = Decoder::with_ip(32, bytes, 0x731E_0A03, DECODER_OPTIONS);
let mut formatter = NasmFormatter::new();
formatter.options_mut().set_space_after_operand_separator(true);
let mut output = String::new();
let mut instruction = Instruction::default();
while decoder.can_decode() {
decoder.decode_out(&mut instruction);
output.clear();
formatter.format(&instruction, &mut output);
println!("{:08X} {}", instruction.ip(), &output);
}
}
Disassemble as fast as possible
For the fastest possible disassembly, set ENABLE_DB_DW_DD_DQ to false and also override the unsafe verify_output_has_enough_bytes_left() so that it returns false.
use iced_x86::{
Decoder, DecoderOptions, Instruction, SpecializedFormatter, SpecializedFormatterTraitOptions,
};
#[allow(dead_code)]
pub(crate) fn how_to_disassemble_really_fast() {
    struct MyTraitOptions;
    impl SpecializedFormatterTraitOptions for MyTraitOptions {
        // If you never create a db/dw/dd/dq 'instruction', we don't need this feature.
        const ENABLE_DB_DW_DD_DQ: bool = false;
        // For a few percent faster code, you can also override `verify_output_has_enough_bytes_left()` and return `false`
        // unsafe fn verify_output_has_enough_bytes_left() -> bool {
        //     false
        // }
    }
    type MyFormatter = SpecializedFormatter<MyTraitOptions>;

    // Assume this is a big slice and not just one instruction
    let bytes = b"\x62\xF2\x4F\xDD\x72\x50\x01";
    let mut decoder = Decoder::new(64, bytes, DecoderOptions::NONE);

    let mut output = String::new();
    let mut instruction = Instruction::default();
    let mut formatter = MyFormatter::new();
    while decoder.can_decode() {
        decoder.decode_out(&mut instruction);
        output.clear();
        formatter.format(&instruction, &mut output);
        // do something with 'output' here, eg.:
        //     println!("{}", output);
    }
}
Also add this to your Cargo.toml file:
[profile.release]
codegen-units = 1
lto = true
opt-level = 3
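The decoding loop above calls output.clear() on a single String instead of creating a new one per instruction. The following is a minimal sketch of why that matters, using only the standard library: clear() resets the length but keeps the allocation, so once the buffer is large enough no further allocations occur. The function name format_all_reusing_buffer and the use of plain u64 values in place of decoded instructions are assumptions made for illustration; this is not part of iced-x86.

```rust
use std::fmt::Write;

/// Formats every item into one reused buffer and returns how many times
/// the buffer's capacity changed (that is, how often it had to reallocate).
/// Hypothetical helper for illustration only.
fn format_all_reusing_buffer(items: &[u64], output: &mut String) -> usize {
    let mut growths = 0;
    for item in items {
        let capacity_before = output.capacity();
        // clear() sets the length to 0 but keeps the existing allocation.
        output.clear();
        write!(output, "{:016X}", item).expect("writing to a String cannot fail");
        if output.capacity() != capacity_before {
            growths += 1;
        }
    }
    growths
}
```

If the buffer is created with String::with_capacity(64), each 16-character result fits and the function reports zero reallocations, which is the same effect the disassembly loop relies on.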
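Create and encode instructions
NOTE: It is much easier to use CodeAssembler directly; see the example above. This example shows how to create instructions without it.
This example uses a BlockEncoder to encode the created Instruction values.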
use iced_x86::{
BlockEncoder, BlockEncoderOptions, Code, Decoder, DecoderOptions, Formatter, GasFormatter,
IcedError, Instruction, InstructionBlock, MemoryOperand, Register,
};
#[allow(dead_code)]
pub(crate) fn how_to_encode_instructions() -> Result<(), IcedError> {
let bitness = 64;
// All created instructions get an IP of 0. The label id is just an IP.
// The branch instruction's *target* IP should be equal to the IP of the
// target instruction.
let mut label_id: u64 = 1;
let mut create_label = || {
let id = label_id;
label_id += 1;
id
};
fn add_label(id: u64, mut instruction: Instruction) -> Instruction {
instruction.set_ip(id);
instruction
}
let label1 = create_label();
let mut instructions = vec![
Instruction::with1(Code::Push_r64, Register::RBP)?,
Instruction::with1(Code::Push_r64, Register::RDI)?,
Instruction::with1(Code::Push_r64, Register::RSI)?,
Instruction::with2(Code::Sub_rm64_imm32, Register::RSP, 0x50)?,
Instruction::with(Code::VEX_Vzeroupper),
Instruction::with2(
Code::Lea_r64_m,
Register::RBP,
MemoryOperand::with_base_displ(Register::RSP, 0x60),
)?,
Instruction::with2(Code::Mov_r64_rm64, Register::RSI, Register::RCX)?,
Instruction::with2(
Code::Lea_r64_m,
Register::RDI,
MemoryOperand::with_base_displ(Register::RBP, -0x38),
)?,
Instruction::with2(Code::Mov_r32_imm32, Register::ECX, 0x0A)?,
Instruction::with2(Code::Xor_r32_rm32, Register::EAX, Register::EAX)?,
Instruction::with_rep_stosd(bitness)?,
Instruction::with2(Code::Cmp_rm64_imm32, Register::RSI, 0x1234_5678)?,
// Create a branch instruction that references label1
Instruction::with_branch(Code::Jne_rel32_64, label1)?,
Instruction::with(Code::Nopd),
// Add the instruction that is the target of the branch
add_label(label1, Instruction::with2(Code::Xor_r32_rm32, Register::R15D, Register::R15D)?),
];
// Create an instruction that accesses some data using an RIP relative memory operand
let data1 = create_label();
instructions.push(Instruction::with2(
Code::Lea_r64_m,
Register::R14,
MemoryOperand::with_base_displ(Register::RIP, data1 as i64),
)?);
instructions.push(Instruction::with(Code::Nopd));
let raw_data: &[u8] = &[0x12, 0x34, 0x56, 0x78];
instructions.push(add_label(data1, Instruction::with_declare_byte(raw_data)?));
// Use BlockEncoder to encode a block of instructions. This block can contain any
// number of branches and any number of instructions. It does support encoding more
// than one block but it's rarely needed.
// It uses Encoder to encode all instructions.
// If the target of a branch is too far away, it can fix it to use a longer branch.
// This can be disabled by enabling some BlockEncoderOptions flags.
let target_rip = 0x0000_1248_FC84_0000;
let block = InstructionBlock::new(&instructions, target_rip);
let result = match BlockEncoder::encode(bitness, block, BlockEncoderOptions::NONE) {
Err(error) => panic!("Failed to encode it: {}", error),
Ok(result) => result,
};
// Now disassemble the encoded instructions. Note that the 'jne near'
// instruction was turned into a 'jne short' instruction because we
// didn't disable branch optimizations.
let bytes = result.code_buffer;
let mut output = String::new();
let bytes_code = &bytes[0..bytes.len() - raw_data.len()];
let bytes_data = &bytes[bytes.len() - raw_data.len()..];
let mut decoder = Decoder::with_ip(bitness, bytes_code, target_rip, DecoderOptions::NONE);
let mut formatter = GasFormatter::new();
formatter.options_mut().set_first_operand_char_index(8);
for instruction in &mut decoder {
output.clear();
formatter.format(&instruction, &mut output);
println!("{:016X} {}", instruction.ip(), output);
}
let db = Instruction::with_declare_byte(bytes_data)?;
output.clear();
formatter.format(&db, &mut output);
println!("{:016X} {}", decoder.ip(), output);
Ok(())
}
/*
Output:
00001248FC840000 push %rbp
00001248FC840001 push %rdi
00001248FC840002 push %rsi
00001248FC840003 sub $0x50,%rsp
00001248FC84000A vzeroupper
00001248FC84000D lea 0x60(%rsp),%rbp
00001248FC840012 mov %rcx,%rsi
00001248FC840015 lea -0x38(%rbp),%rdi
00001248FC840019 mov $0xA,%ecx
00001248FC84001E xor %eax,%eax
00001248FC840020 rep stos %eax,(%rdi)
00001248FC840022 cmp $0x12345678,%rsi
00001248FC840029 jne 0x00001248FC84002C
00001248FC84002B nop
00001248FC84002C xor %r15d,%r15d
00001248FC84002F lea 0x1248FC840037,%r14
00001248FC840036 nop
00001248FC840037 .byte 0x12,0x34,0x56,0x78
*/
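The addresses in the listing above can be checked by hand. A relative operand stores the target address minus the address of the next instruction. The sketch below reproduces that arithmetic with the standard library only. It is not how iced-x86 implements BlockEncoder; the function names are hypothetical, and the instruction lengths (2 bytes for the short jne, 7 bytes for the RIP-relative lea) are read off the IP column of the listing.

```rust
/// Displacement stored in a relative operand: target minus the address of
/// the *next* instruction (ip + len). Hypothetical helper for illustration.
fn relative_displacement(ip: u64, len: u64, target: u64) -> i64 {
    target.wrapping_sub(ip.wrapping_add(len)) as i64
}

/// Returns true if a 2-byte short conditional branch (opcode + rel8)
/// located at `ip` can reach `target`.
fn fits_short_branch(ip: u64, target: u64) -> bool {
    let disp = relative_displacement(ip, 2, target);
    (i8::MIN as i64..=i8::MAX as i64).contains(&disp)
}
```

For the jne at 00001248FC840029 with target 00001248FC84002C, the displacement is 1, which fits in a signed byte, so the short form can be used. For the lea at 00001248FC84002F, the next instruction starts at 00001248FC840036 and the data sits at 00001248FC840037, so the stored displacement is also 1.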
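Minimum supported rustc version
iced-x86 supports rustc 1.57.0 or later. This is checked in CI builds, where both the minimum supported version and the latest stable version are used to build the source code and run the tests.
Bumping the minimum supported rustc version is considered a minor breaking change: the minor version of iced-x86 will be incremented.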