Using Cf, it's easy to define complex structures with simple oneliners. See pf? for more information. Remember that all these C commands can also be accessed from the visual mode by pressing the d (data conversion) key. Note that unlike t commands Cf doesn't change analysis results. It is only a visual boon.
Sometimes just adding a single line of comments is not enough, in this case radare2 allows you to create a link for a particular text file. You can use it with CC, command or by pressing , key in the visual mode. This will open an $EDITOR to create a new file, or if filename does exist, just will create a link. It will be shown in the disassembly comments:
[0x00003af7 11% 290 /bin/ls]> pd $r @ main+55 # 0x3af7
│0x00003af7 call sym.imp.setlocale ;[1] ; ,(locale-help.txt) ; char *setlocale(int category, const char *locale)
│0x00003afc lea rsi, str.usr_share_locale ; 0x179cc ; "/usr/share/locale"
│0x00003b03 lea rdi, [0x000179b2] ; "coreutils"
│0x00003b0a call sym.imp.bindtextdomain ;[2] ; char *bindtextdomain(char *domainname, char *dirname)
Note ,(locale-help.txt) appeared in the comments, if we press , again in the visual mode, it will open the file. Using this mechanism we can create a long descriptions of some particular places in disassembly, link datasheets or related articles.
ESIL
ESIL stands for 'Evaluable Strings Intermediate Language'. It aims to describe a Forth-like representation for every target CPU opcode semantics. ESIL representations can be evaluated (interpreted) in order to emulate individual instructions. Each command of an ESIL expression is separated by a comma. Its virtual machine can be described as this:
while ((word=haveCommand())) {
if (word.isOperator()) {
esilOperators[word](esil);
} else {
esil.push (word);
}
nextCommand();
}
As we can see ESIL uses a stack-based interpreter similar to what is commonly used for calculators. You have two categories of inputs: values and operators. A value simply gets pushed on the stack, an operator then pops values (its arguments if you will) off the stack, performs its operation and pushes its results (if any) back on. We can think of ESIL as a post-fix notation of the operations we want to do.
So let's see an example:
4,esp,-=,ebp,esp,=[4]
Can you guess what this is? If we take this post-fix notation and transform it back to in-fix we get
esp -= 4
4bytes(dword) [esp] = ebp
We can see that this corresponds to the x86 instruction push ebp! Isn't that cool? The aim is to be able to express most of the common operations performed by CPUs, like binary arithmetic operations, memory loads and stores, processing syscalls. This way if we can transform the instructions to ESIL we can see what a program does while it is running even for the most cryptic architectures you definitely don't have a device to debug on for.
Using ESIL
r2's visual mode is great to inspect the ESIL evaluations.
There are 3 environment variables that are important for watching what a program does:
[0x00000000]> e emu.str = true
asm.emu tells r2 if you want ESIL information to be displayed. If it is set to true, you will see comments appear to the right of your disassembly that tell you how the contents of registers and memory addresses are changed by the current instruction. For example, if you have an instruction that subtracts a value from a register it tells you what the value was before and what it becomes after. This is super useful so you don't have to sit there yourself and track which value goes where.
One problem with this is that it is a lot of information to take in at once and sometimes you simply don't need it. r2 has a nice compromise for this. That is what the emu.str variable is for (asm.emustr on <= 2.2). Instead of this super verbose output with every register value, this only adds really useful information to the output, e.g., strings that are found at addresses a program uses or whether a jump is likely to be taken or not.
The third important variable is asm.esil. This switches your disassembly to no longer show you the actual disassembled instructions, but instead now shows you corresponding ESIL expressions that describe what the instruction does. So if you want to take a look at how instructions are expressed in ESIL simply set "asm.esil" to true.
[0x00000000]> e asm.esil = true
In visual mode you can also toggle this by simply typing O.
ESIL Commands
• "ae" : Evaluate ESIL expression.
[0x00000000]> "ae 1,1,+"
0x2
[0x00000000]>
• "aes" : ESIL Step.
[0x00000000]> aes
[0x00000000]>10aes
• "aeso" : ESIL Step Over.
[0x00000000]> aeso
[0x00000000]>10aeso
• "aesu" : ESIL Step Until.
[0x00001000]> aesu 0x1035
ADDR BREAK
[0x00001019]>
• "ar" : Show/modify ESIL registry.
[0x00001ec7]> ar r_00 = 0x1035
[0x00001ec7]> ar r_00
0x00001035
[0x00001019]>
ESIL Instruction Set
Here is the complete instruction set used by the ESIL VM:
ESIL Opcode | Operands | Name | Operation | example |
---|---|---|---|---|
TRAP | src | Trap | Trap signal | |
$ | src | Interrupt | interrupt | 0x80,$ |
() | src | Syscall | syscall | rax,() |
$$ | src | Instruction address | Get address of current instruction stack=instruction address | |
== | src,dst | Compare | stack = (dst == src) ; update_eflags(dst - src) | |
< | src,dst | Smaller (signed comparison) | stack = (dst < src) ; update_eflags(dst - src) | [0x0000000]> "ae 1,5,<" 0x0 > "ae 5,5" 0x0" |
<= | src,dst | Smaller or Equal (signed comparison) | stack = (dst <= src) ; update_eflags(dst - src) | [0x0000000]> "ae 1,5,<" 0x0 > "ae 5,5" 0x1" |
> | src,dst | Bigger (signed comparison) | stack = (dst > src) ; update_eflags(dst - src) | > "ae 1,5,>" 0x1 > "ae 5,5,>" 0x0 |
>= | src,dst | Bigger or Equal (signed comparison) | stack = (dst >= src) ; update_eflags(dst - src) | > "ae 1,5,>=" 0x1 > "ae 5,5,>=" 0x1 |
<< | src,dst | Shift Left | stack = dst << src | > "ae 1,1,<<" 0x2 > "ae 2,1,<<" 0x4 |
>> | src,dst | Shift Right | stack = dst >> src | > "ae 1,4,>>" 0x2 > "ae 2,4,>>" 0x1 |
<<< | src,dst | Rotate Left | stack=dst ROL src | > "ae 31,1,<<<" 0x80000000 > "ae 32,1,<<<" 0x1 |
>>> | src,dst | Rotate Right | stack=dst ROR src | > "ae 1,1,>>>" 0x80000000 > "ae 32,1,>>>"0x1 |
& | src,dst | AND | stack = dst & src | > "ae 1,1,&" 0x1 > "ae 1,0,&" 0x0 > "ae 0,1,&" 0x0 > "ae 0,0,&" 0x0 |
| | src,dst | OR | stack = dst | src | > "ae 1,1,|" 0x1 > "ae 1,0,|" 0x1 > "ae 0,1,|" 0x1 > "ae 0,0,|" 0x0 |
^ | src,dst | XOR | stack = dst ^src | > "ae 1,1,^" 0x0 > "ae 1,0,^" 0x1 > "ae 0,1,^" 0x1 > "ae 0,0,^" 0x0 |
+ | src,dst | ADD | stack = dst + src | > "ae 3,4,+" 0x7 > "ae 5,5,+" 0xa |
- | src,dst | SUB | stack = dst - src | > "ae 3,4,-" 0x1 > "ae 5,5,-" 0x0 > "ae 4,3,-" 0xffffffffffffffff |
* | src,dst | MUL | stack = dst * src | > "ae 3,4,*" 0xc > "ae 5,5,*" 0x19 |
/ | src,dst | DIV | stack = dst / src | > "ae 2,4,/" 0x2 > "ae 5,5,/" 0x1 > "ae 5,9,/" 0x1 |
% | src,dst | MOD | stack = dst % src | > "ae 2,4,%" 0x0 > "ae 5,5,%" 0x0 > "ae 5,9,%" 0x4 |
~ | bits,src | SIGNEXT | stack = src sign extended | > "ae 8,0x80,~" 0xffffffffffffff80 |
~/ | src,dst | SIGNED DIV | stack = dst / src (signed) | > "ae 2,-4,~/" 0xfffffffffffffffe |
~% | src,dst | SIGNED MOD | stack = dst % src (signed) | > "ae 2,-5,~%" 0xffffffffffffffff |
! | src | NEG | stack = !!!src | > "ae 1,!" 0x0 > "ae 4,!" 0x0 > "ae 0,!" 0x1 |
++ | src | INC | stack = src++ | > ar r_00=0;ar r_00 0x00000000 > "ae r_00,++" 0x1 > ar r_00 0x00000000 > "ae 1,++" 0x2 |
-- | src | DEC | stack = src-- | > ar r_00=5;ar r_00 0x00000005> "ae r_00,--" 0x4 > ar r_00 0x00000005 > "ae 5,--" 0x4 |
= | src,reg | EQU | reg = src | > "ae 3,r_00,=" > aer r_00 0x00000003 > "ae r_00,r_01,=" > aer r_01 0x00000003 |
:= | src,reg | weak EQU | reg = src without side effects | > "ae 3,r_00,:=" > aer r_00 0x00000003 > "ae r_00,r_01,:=" > aer r_01 0x00000003 |
+= | src,reg | ADD eq | reg = reg + src | > ar r_01=5;ar r_00=0;ar r_00 0x00000000 > "ae r_01,r_00,+=" > ar r_00 0x00000005 > "ae 5,r_00,+=" > ar r_00 0x0000000a |
-= | src,reg | SUB eq | reg = reg - src | > "ae r_01,r_00,-=" > ar r_00 0x00000004 > "ae 3,r_00,-=" > ar r_00 0x00000001 |
*= | src,reg | MUL eq | reg = reg * src | > ar r_01=3;ar r_00=5;ar r_00 0x00000005> "ae r_01,r_00,*=" > ar r_00 0x0000000f > "ae 2,r_00,*=" > ar r_00 0x0000001e |
/= | src,reg | DIV eq | reg = reg / src | > ar r_01=3;ar r_00=6;ar r_00 0x00000006 > "ae r_01,r_00,/=" > ar r_00 0x00000002 > "ae 1,r_00,/=" > ar r_00 0x00000002 |
%= | src,reg | MOD eq | reg = reg % src | > ar r_01=3;ar r_00=7;ar r_00 0x00000007 > "ae r_01,r_00,%=" > ar r_00 0x00000001 > ar r_00=9;ar r_00 0x00000009 > "ae 5,r_00,%=" > ar r_00 0x00000004 |
<<= | src,reg | Shift Left eq | reg = reg << src | > ar r_00=1;ar r_01=1;ar r_01 0x00000001 > "ae r_00,r_01,<<=" > ar r_01 0x00000002 > "ae 2,r_01,<<=" > ar r_010x00000008 |
>>= | src,reg | Shift Right eq | reg = reg << src | > ar r_00=1;ar r_01=8;ar r_01 0x00000008 > "ae r_00,r_01,>>=" > ar r_01 0x00000004 > "ae 2,r_01,>>=" > ar r_01 0x00000001 |
&= | src,reg | AND eq | reg = reg & src | > ar r_00=2;ar r_01=6;ar r_01 0x00000006 > "ae r_00,r_01,&=" > ar r_01 0x00000002 > "ae 2,r_01,&=" > ar r_01 0x00000002 > "ae 1,r_01,&=" > ar r_01 0x00000000 |
|= | src,reg | OR eq | reg = reg | src | > ar r_00=2;ar r_01=1;ar r_01 0x00000001 > "ae r_00,r_01,|=" > ar r_01 0x00000003 > "ae 4,r_01,|=" > ar r_01 0x00000007 |
^= | src,reg | XOR eq | reg = reg ^ src | > ar r_00=2;ar r_01=0xab;ar r_01 0x000000ab > "ae r_00,r_01,^=" > ar r_01 0x000000a9 > "ae 2,r_01,^=" > ar r_01 0x000000ab |
++= | reg | INC eq | reg = reg + 1 | > ar r_00=4;ar r_00 0x00000004 > "ae r_00,++=" > ar r_00 0x00000005 |
--= | reg | DEC eq | reg = reg - 1 | > ar r_00=4;ar r_00 0x00000004 > "ae r_00,--=" > ar r_00 0x00000003 |
!= | reg | NOT eq | reg = !reg | > ar r_00=4;ar r_00 0x00000004 > "ae r_00,!=" > ar r_00 0x00000000 > "ae r_00,!=" > ar r_00 0x00000001 |
--- | --- | --- | --- | ---------------------------------------------- |
=[] =[*] =[1] =[2] =[4] =[8] | src,dst | poke | *dst=src | > "ae 0xdeadbeef,0x10000,=[4]," > pxw 4@0x10000 0x00010000 0xdeadbeef .... > "ae 0x0,0x10000,=[4]," > pxw 4@0x10000 0x00010000 0x00000000 |
[] [*] [1] [2] [4] [8] | src | peek | stack=*src | > w test@0x10000 > "ae 0x10000,[4]," 0x74736574 > ar r_00=0x10000 > "ae r_00,[4]," 0x74736574 |
|=[] |=[1] |=[2] |=[4] |=[8] | reg | nombre | code | >> |
SWAP | Swap | Swap two top elements | SWAP | |
DUP | Duplicate | Duplicate top element in stack | DUP | |
NUM | Numeric | If top element is a reference (register name, label, etc), dereference it and push its real value | NUM | |
CLEAR | Clear | Clear stack | CLEAR | |
BREAK | Break | Stops ESIL emulation | BREAK | |
GOTO | n | Goto | Jumps to Nth ESIL word | GOTO 5 |
TODO | To Do | Stops execution (reason: ESIL expression not completed) | TODO |
ESIL Flags
ESIL VM provides by default a set of helper operations for calculating flags. They fulfill their purpose by comparing the old and the new value of the dst operand of the last performed eq-operation. On every eq-operation (e.g. =) ESIL saves the old and new value of the dst operand. Note, that there also exist weak eq operations (e.g. :=), which do not affect flag operations. The == operation affects flag operations, despite not being an eq operation. Flag operations are prefixed with $ character.
z - zero flag, only set if the result of an operation is 0
b - borrow, this requires to specify from which bit (example: 4,$b - checks if borrow from bit 4)
c - carry, same like above (example: 7,$c - checks if carry from bit 7)
o - overflow
p - parity
r - regsize ( asm.bits/8 )
s - sign
ds - delay slot state
jt - jump target
js - jump target set