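This video course material was produced by Professor Yong Surk Lee's laboratory in the Department of Electrical and Electronic Engineering, Yonsei University, with support from the Ministry of Information and Communication's 1999 Information and Communication Academic Promotion Support Program.

High-Performance Microprocessor ALU (Arithmetic Logic Unit) and Register File Architecture
2.

Professor Yong Surk Lee, Department of Electrical and Electronic Engineering, Yonsei University
Homepage: http://mpu.yonsei.ac.kr
E-mail: yonglee@yonsei.ac.kr
Phone: 2-392-794

Biography of Professor Yong Surk Lee
- 1973: B.S., Electrical Engineering, Yonsei University
- 9: Ph.D., University of Michigan
- 92 ~ 1992: microprocessor design in Silicon Valley, USA; Pentium design at Intel
- 1993 ~ : Professor, Department of Electronic Engineering, Yonsei University

High-Performance Microprocessor Architecture and Design Lecture Series (Homepage: http://mpu.yonsei.ac.kr)
1. Measures for fostering the semiconductor industry and the non-memory sector (99.2)
2. Overview of high-performance microprocessor architecture (99.2)
3. Architecture of the high-performance microprocessor instruction decoder (99.3)
4. Execution of branch instructions in high-performance microprocessors (99.3)
5. Architecture of the high-performance microprocessor multiplier (99.3)
6. Architecture of the high-performance microprocessor floating-point unit (1999.3)
7. Architecture of high-performance microprocessor cache memory (1999.3)
8. Architecture of the high-performance microprocessor divider (1999.3)
9. Architecture of the high-performance microprocessor transcendental function unit (1999.3)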
10. Architecture of the high-performance microprocessor ALU and register file (2.)
11. Architecture of the direct digital frequency synthesizer (DDFS) (2.)
12. Overview of VLSI architecture and design for cryptography (2.)
13. Architecture of the high-performance microprocessor floating-point unit (2) (2.)

References
The references marked with * are stored on the homepage.
[1] N. Weste & K. Eshraghian, Principles of CMOS VLSI Design, 2nd edition, Addison-Wesley Publishing Co., 1993
[2] J. L. Hennessy & D. A. Patterson, Computer Architecture: A Quantitative Approach, 2nd edition, Morgan Kaufmann Publishers, 1996
* [3] Yong Surk Lee, An IEEE Standard Floating Point ALU with a 6MHz Clock Frequency, Journal of the Institute of Electronics Engineers of Korea, 99
[4] Chao & B. A. Wooley, 1.3ns 32-Word x 32-Bit Three-Port BiCMOS Register File, IEEE Journal of Solid-State Circuits, June 1996
* [5] C. Asato, 14-Port 3.8ns 116-Word 64b Renaming Register File, IEEE Journal of Solid-State Circuits, November 1995
* [6] L. A. Lev, et al., 64b Microprocessor with Multimedia Support, IEEE Journal of Solid-State Circuits, November 1995
* [7] Professor Yong Surk Lee's notes

Topics
ALU (Arithmetic Logic Unit)
- Carry look-ahead adder (Reference [1])
- Carry select adder
- Carry chain adder
- Barrel shifter
Register file
- Single port
- Multi-port
Functions of the ALU
- Add (A + B), subtract (A - B = A + (-B))
- Logical operations: AND, OR, XOR, NOT
- Shift, rotate: left / right
- Multiply (Video lecture [5])
- Divide (Video lecture [8])

Structure of the Reg-ALU
[Figure: Reg file (2 read, 1 write) with ports W, R1, R2; FF, FF; Adder, Logic, Barrel shifter]

Pipelining (Reference [2], Video lecture [2])
Example instruction: R1 + R2
- Fetch: read the instruction from memory, increment the PC
- Decode: decode the instruction and read R1, R2 from the register file
- Execute: R1 + R2
- Memory: no operation
- Write back: write the result to the register file
[Figure: clock timing of R1 + R2 followed by R5 + R4]

Register Bypassing
[Figure: R1 + R2 followed by R5 + R4; Reg file, Bypass, FF, MUX, R4]
[Figure: pipeline datapath: PC, PC + 4, MUX, Inst cache, Decode, Add, Reg file, Bypass, R addr, MUX, ALU, Data cache, MUX, FF; R4 written to the Reg file]
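The bypass idea in the figures above can be sketched in a few lines. This is a minimal illustration, not the datapath from the slides: it assumes the first instruction (R1 + R2) writes its result to R4, as the figure labels suggest, and that the second instruction (R5 + R4) reads its operands before that write-back has happened. The function name `run_pair` and the register values are hypothetical.

```python
def run_pair(regs, bypass):
    """Return the result of R5 + R4 when it directly follows R4 <- R1 + R2.

    Assumption: the second instruction reads its operands before the first
    instruction has reached the write-back stage.
    """
    regs = dict(regs)  # work on a copy so the caller's registers are untouched

    # Instruction 1, Execute stage: the sum sits in a pipeline flip-flop
    # and has not yet been written to the register file.
    alu_out = regs["R1"] + regs["R2"]

    # Instruction 2 reads its operands from the register file.
    operand_a = regs["R5"]
    operand_b_from_file = regs["R4"]  # stale: instruction 1 has not written back

    # Bypass MUX: pick the in-flight ALU result instead of the stale value.
    operand_b = alu_out if bypass else operand_b_from_file
    result = operand_a + operand_b

    # Instruction 1, Write back stage: R4 is finally updated.
    regs["R4"] = alu_out
    return result


regs = {"R1": 1, "R2": 2, "R4": 100, "R5": 10}
print(run_pair(regs, bypass=True))   # uses the new R4 value (1 + 2)
print(run_pair(regs, bypass=False))  # uses the stale R4 value (100)
```

Without the bypass path, the dependent instruction would have to wait until the write-back completes to get the correct operand.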
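[Figure: multi-bit adder built from full adders (FA) and a half adder (HA), bits 3 ~ 0. Full adder: inputs A, B, Cin; outputs S, Cout. Half adder: inputs A, B; outputs S, Cout]

64-bit Carry Select Adder (Reference [3])
[Figure: 8-bit carry chain adders, each producing results for Cin = 0 and Cin = 1; MUXes select Sum[63:56], ..., Sum[23:16], Sum[15:8], Sum[7:0] and Cout]

Carry Chain Adder
[Figure: carry chain adder cell producing A + B (OR), A·B (AND), and A ⊕ B (XOR)]

Carry Chain Adder Truth Table
[Table: S and Cout as functions of A, B for Cin = 0 and Cin = 1]

8-bit Carry Chain Adder
[Figure: 4 cells, a buffer, 4 cells; carry chain nodes Cn between Cin and Cout]

Advantages of the Carry Chain Adder
- The circuit is simple
- The AND, OR, and XOR results are available at the same time
- Optimal when the results for both Cin = 0 and Cin = 1 are needed (carry select adder)
Design consideration
- The capacitance of the carry chain nodes Cn must be minimized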
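The carry select scheme above can be sketched behaviorally. This is a minimal sketch under stated assumptions: each 8-bit block is modeled with ordinary integer addition rather than a carry chain circuit, and the names `block_add` and `carry_select_add` are hypothetical. What it illustrates is the selection step: every block computes both the Cin = 0 and Cin = 1 results, and the incoming carry only drives a MUX.

```python
BLOCK_BITS = 8
BLOCK_MASK = (1 << BLOCK_BITS) - 1


def block_add(a_blk, b_blk, cin):
    """Add two 8-bit slices with a carry-in; return (8-bit sum, carry-out)."""
    total = a_blk + b_blk + cin
    return total & BLOCK_MASK, total >> BLOCK_BITS


def carry_select_add(a, b, cin=0, width=64):
    """Behavioral carry select adder built from 8-bit blocks."""
    result = 0
    carry = cin
    for shift in range(0, width, BLOCK_BITS):
        a_blk = (a >> shift) & BLOCK_MASK
        b_blk = (b >> shift) & BLOCK_MASK
        # Both candidates are computed without waiting for the real carry.
        sum0, cout0 = block_add(a_blk, b_blk, 0)
        sum1, cout1 = block_add(a_blk, b_blk, 1)
        # The MUX picks one candidate once the carry from below is known.
        sum_blk, carry = (sum1, cout1) if carry else (sum0, cout0)
        result |= sum_blk << shift
    return result, carry


total, cout = carry_select_add(0xFFFFFFFFFFFFFFFF, 1)
print(hex(total), cout)
```

In hardware the two candidate sums of all blocks are formed in parallel, so the critical path is one block addition plus the chain of MUX selections.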
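Carry chain adders
[Figure: carry chain adders connected from Cin to Cout through the carry chain nodes Cn]

Subtraction
A - B = A + (-B)
A - B = A + B̄ + 1 (2's complement)
[Figure: Reg file, XOR gates on the B operand, Adder; control C = Carry in; 0: Add, 1: Subtract]

Exclusive OR (XOR) Gate (1)
[Figure: XOR gate circuit (1); inputs A, B; output O]

Exclusive OR (XOR) Gate (2)
[Figure: XOR gate circuit (2); output O]

Exclusive OR (XOR) Gate (3)
[Figure: XOR gate circuit (3)]

Exclusive OR (XOR) Gate (4)
[Figure: XOR gate circuit (4); weak output O]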
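The XOR gates above act as a controlled inverter in the subtraction datapath. The following is a minimal behavioral sketch, assuming a 32-bit word and a single control bit that both drives the XOR row and serves as the carry-in; the function name `add_sub` is hypothetical.

```python
def add_sub(a, b, subtract, width=32):
    """Compute A + B or A - B with one adder.

    The control bit is XORed with every bit of B and is also the carry-in:
    0 gives A + B + 0, 1 gives A + (NOT B) + 1, the 2's complement form of A - B.
    Returns (result, carry-out).
    """
    mask = (1 << width) - 1
    control = 1 if subtract else 0
    xor_pattern = mask if control else 0      # control bit replicated on every XOR gate
    b_conditioned = (b & mask) ^ xor_pattern  # B unchanged, or bitwise inverted
    total = (a & mask) + b_conditioned + control
    return total & mask, total >> width


print(add_sub(7, 5, subtract=False))
print(add_sub(7, 5, subtract=True))
```

A carry-out of 1 on subtraction indicates that no borrow occurred (A >= B for unsigned operands).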
XOR Comparison
XOR                 (1)      (2)    (3)    (4)
Tr count            6 7
Drive capability    Medium   Low    High   Low

Barrel Shifter
- Shifts multiple bits left or right at the same time (a shift register shifts 1 bit per clock)
- Built from multi-stage multiplexers (MUX)

Two Stage Barrel Shifter (Reference [3][7], Video lecture [6])
(0 ~ 31 bit left shift)
- First stage: 0, 4, 8, 12, 16, 20, 24, 28 bit left shift
- Second stage: 0, 1, 2, 3 bit left shift
- Example: to left shift 19 bits, (1) the first stage shifts 16 bits left and (2) the second stage shifts 3 bits left

Two Stage Barrel Shifter
[Figure: first stage of 32 8-input MUXes selecting among inputs n, n-4, n-8, n-12, n-16, n-20, n-24, n-28 to produce m[31:0]; second stage of 32 4-input MUXes selecting among m, m-1, m-2, m-3]

Multiplexer (MUX)
[Figure: MUX circuit with a weak PMOS]

Left/right Shift, Rotate Barrel Shifter (Reference [7])
[Figure: Input 2 [31:0] and Input 1, 32 bits each, feed two MUX stages that produce a 32-bit Output]
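The two-stage decomposition described above (a coarse shift by a multiple of 4, then a fine shift of 0 to 3) can be sketched as follows. This is a behavioral sketch only: the MUX stages are modeled with integer shifts, and the names `split_shift_amount` and `two_stage_shift_left` are hypothetical.

```python
WIDTH = 32
WORD_MASK = (1 << WIDTH) - 1


def split_shift_amount(amount):
    """Split a 0..31 shift amount into (first-stage, second-stage) shifts."""
    if not 0 <= amount < WIDTH:
        raise ValueError("shift amount must be in 0..31")
    coarse = amount & ~0x3  # upper 3 bits: 0, 4, 8, ..., 28 (8-input MUX select)
    fine = amount & 0x3     # lower 2 bits: 0, 1, 2, 3 (4-input MUX select)
    return coarse, fine


def two_stage_shift_left(value, amount):
    """Logical left shift of a 32-bit word using two MUX stages."""
    coarse, fine = split_shift_amount(amount)
    stage1 = (value << coarse) & WORD_MASK  # first stage output m[31:0]
    return (stage1 << fine) & WORD_MASK     # second stage output


print(split_shift_amount(19))
print(hex(two_stage_shift_left(0x1, 19)))
```

Splitting the 5-bit shift amount this way keeps each MUX small (8 and 4 inputs) instead of requiring a single 32-input MUX per output bit.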
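Case 1: 5 bit left shift
- Input 2 = A[31:0]
- Input 1 = all zero
[Figure: the Output takes A[26:0] from Input 2 followed by 5 zero bits from Input 1]

Case 2: 5 bit right shift (32 - 5 = 27 bit left shift)
- Input 2 = sign/zero extension
- Input 1 = A[31:0]
- Sign extension: arithmetic shift
- Zero extension: logical shift
[Figure: the Output takes 5 extension bits from Input 2 followed by A[31:5] from Input 1]

Case 3: 5 bit rotate left
- Input 2 = A[31:0]
- Input 1 = A[31:0]
[Figure: the Output takes A[26:0] from Input 2 followed by A[31:27] from Input 1]

Implementation (Reference [7])
[Figure: first stage of 35 8-input MUXes selecting among n, n-4, n-8, n-12, n-16, n-20, n-24, n-28; second stage of 32 4-input MUXes selecting among m, m-1, m-2, m-3]

Register File (2-read 1-write)
[Figure: R1 + R2 followed by R5 + R4; Reg file, Bypass, FF, MUX, R4]

Register File (2-read 1-write) (Reference [1])
[Figure: block diagram with addr 1 and addr 2 decoders, a third address decoder, Enable signals, write data through the W buffer, the cell array, and Sense amp 1 and Sense amp 2 producing data 1 and data 2]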
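A behavioral model helps make the port structure of the 2-read 1-write register file concrete. The sketch below is not the circuit from the slides: the size (32 words of 32 bits), the class and method names, and the read-before-write ordering within a cycle are all assumptions chosen for illustration.

```python
class RegisterFile:
    """Behavioral 2-read, 1-write register file (sizes are assumptions)."""

    def __init__(self, words=32, bits=32):
        self.words = words
        self.mask = (1 << bits) - 1
        self.cells = [0] * words

    def _decode(self, addr):
        """Address decoder: return a one-hot list of word lines."""
        if not 0 <= addr < self.words:
            raise ValueError("address out of range")
        return [int(i == addr) for i in range(self.words)]

    def _read(self, addr):
        """Read port: the word whose word line is active drives the output."""
        word_lines = self._decode(addr)
        return sum(cell for cell, line in zip(self.cells, word_lines) if line)

    def cycle(self, addr1, addr2, write_addr=None, write_data=0, write_enable=False):
        """One clock cycle: two reads, then at most one write.

        Assumption: reads see the old contents, so a value written in this
        cycle is visible only from the next cycle on.
        """
        data1 = self._read(addr1)
        data2 = self._read(addr2)
        if write_enable:
            word_lines = self._decode(write_addr)
            self.cells[word_lines.index(1)] = write_data & self.mask
        return data1, data2


rf = RegisterFile()
print(rf.cycle(4, 5, write_addr=4, write_data=3, write_enable=True))
print(rf.cycle(4, 5))
```

Under this read-before-write assumption, the first call still returns the old value of the written register, which is the situation a bypass MUX is meant to cover.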
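Cell Array
[Figure: addr 1 decoder, addr 2 decoder, and a third address decoder driving the word lines of the cell array]

Decoder Truth Table
[Table: address bits 5 ~ 0 decoded to outputs 0 ~ 63]

Decoder Logic
[Figure: decoder logic circuit]

6 Tr SRAM Cell
[Figure: 6-transistor SRAM cell]

6 Tr SRAM Cycle
[Figure: timing with Clock, PC, the 6 tr cell, Sense amp, and Data out; steps ❶ ❷ ❸ ❹]

6 Tr SRAM Cycle
[Figure: timing with Clock, PC, WE, Data in, and the 6 tr cell; steps ❶ ❷ ❸]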
3-port SRAM Cell (1)
Differential (1)
[Figure: differential 3-port cell with lines W, R1, R2]

3-port SRAM Cell (2) (Reference [4])
Differential (2)
[Figure: differential 3-port cell with transistors M1, M2, M3, M4 and lines W, R1, R2]

3-port SRAM Cell (3)
Single-ended
[Figure: single-ended 3-port cell with a strong and a weak inverter and lines W, R1, R2]

Comparison
            Differential   Single-ended
Area        Large          Small
Speed       Fast           Slow

Differential Sense Amp
[Figure: Enable, Current, and Data out waveforms; steps ❶ ❷ ❸ ❹]
- High current spike
- Fast

Single-ended Sense Amp
[Figure: Data out]
- Low power (no current spike)
- Slow
(1) XCHG R1, R2 (R1 <-> R2)
[Figure: exchange through a temporary register RT and a MUX, using R1, R2, RT]
- 3 clock cycles

(2)
[Figure: Regs, MUX, R1, R2]
- 1 clock cycle

Future Studies (Reference [5][6])
(1) Superscalar, VLIW
- UltraSPARC: 4-issue, 10-port reg file (7 read, 3 write)
(2) SMT (Simultaneous Multi-Threading)