[Figure: Booth encoders BE 0 to BE 16 placed over the multiplier bits. Upper row: y 17 to y 0 followed by an appended 0, with BE 0 to BE 8. Lower row: y 32 to y 18, with BE 9 to BE 16.]
[Logic equations: operand names were lost in extraction; only the operators ~, & and ^ survive.]
[Figure: partial product rows with sign extension. First form: each row's sign bit (S 0 to S 3) is replicated across the upper bit positions ahead of the product bits x. Second form: the replicated sign bits are replaced by short prefixes, "1 1 -S 0 S 0" for the first row and "1 -S i S i" for rows 1 to 3, as extracted.]
[Figure: partial product selector. Inputs: pp_sel[4:0], x i, x i-1, and a zero or sign generated input. The selected terms are ORed to form pp[i].]
[Figure: adder tree built from full adders (FA), each with carry (C) and sum (SUM) outputs. The numeric labels 2, 3 and 4 mark groups of adders in the tree.]
[Figure: floating-point formats, msb on the left and lsb on the right]
(a) single format, 32 bits:  s (1 bit)   e (8 bits)    f (23 bits)
(b) double format, 64 bits:  s (1 bit)   e (11 bits)   f (52 bits)
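The field widths in the format figure above can be checked with a short sketch. This is an illustration only, written in Python rather than in hardware; the function names decode_single and decode_double are hypothetical and do not come from the source.

```python
import struct


def decode_single(value):
    """Split a float into the s (1), e (8), f (23) fields of the single format."""
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    s = bits >> 31
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    return s, e, f


def decode_double(value):
    """Split a float into the s (1), e (11), f (52) fields of the double format."""
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    s = bits >> 63
    e = (bits >> 52) & 0x7FF
    f = bits & ((1 << 52) - 1)
    return s, e, f
```

For example, -1.5 is -1.1 (binary) times 2^0, so the single format holds s = 1, a biased exponent e = 127, and only the top fraction bit set.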
[Label: "'s complement"; the leading digit was lost in extraction.]
[Figure: floating-point adder block diagram. Blocks in extracted order: Exponent Subtractor, Aligner, Sticky Gen, ST 1 (S/D), LOP, LZC, Fraction Adder, Rnd Ctl, ST 2 (S/D), 1-bit shift, Exponent Adder, Norm, Inc, ST 3 (S/D), MUX.]
[Figure: floating-point multiplier stages. EX1: sign xor, exponent adder, fraction multiplier. EX2: rnd ctrl, sticky bit gen. EX3: exp adder, shifter, Inc.]
[Figure: plot of P i+1 /D against 4P i /D for quotient digit selection, with one segment for each digit q i+1 = -2, -1, 0, 1, 2 and the corner points (8/3, 2/3) and (-8/3, -2/3).]
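The recurrence behind the plot above, P i+1 = 4P i - q i+1 D with digits in {-2, ..., 2} and the residual bounded by (2/3)D, can be sketched as follows. This is a minimal sketch under stated assumptions: the digit is chosen by exact rounding of 4P i /D, which is not the table or ROM lookup a hardware divider would use, and the name srt4_divide and the operand range [1, 2) are assumptions for illustration.

```python
from fractions import Fraction


def srt4_divide(x, d, steps):
    """Radix-4 digit recurrence for x / d, with x and d assumed in [1, 2).

    Returns the quotient digits, the accumulated quotient, and the final
    residual. Digit selection here is exact rounding, clamped to [-2, 2].
    """
    d = Fraction(d)
    p = Fraction(x) / 4          # initial residual, so |P 0| <= (2/3) D
    quotient = Fraction(0)
    digits = []
    for i in range(1, steps + 1):
        shifted = 4 * p
        q = max(-2, min(2, round(shifted / d)))
        p = shifted - q * d      # P i+1 = 4 P i - q i+1 D
        assert abs(p) <= Fraction(2, 3) * d
        digits.append(q)
        quotient += Fraction(q, 4 ** i)
    return digits, 4 * quotient, p
```

Because |4P i /D| never exceeds 8/3, clamping the rounded digit to 2 leaves a residual of at most (2/3)D, which matches the corner point (8/3, 2/3) in the plot. After n steps the quotient error is at most (8/3) * 4^-n.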
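[Figure: P-D diagram with P = rP on the vertical axis and D on the horizontal axis (D min, D 1, D 2, D max). The region for q = j+1 lies between the lines (-k+j+1)d and (k+j+1)d, the region for q = j between (-k+j)d and (k+j)d; x and y mark points in the overlap.]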
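[Figure: two system organizations. Left: separate DSP chip (cont, reg, MAC, ld/st, AU), IU chip (cont, reg, exe, ld/st) and FPU chip (cont, reg, FP AU, FP Mult, FP Div, ld/st) joined to memory over an on-board bus. Right: a single IU/DSP/FPU core sharing cont, decoder, reg and ld/st, with exe, AU, MAC, FP AU, FP Mult and FP Div.]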
[Figure: DSP unit. Blocks: MAC/MAS, DSP Sub-Decoder, AU (pack/unpack, extend/clamp, lzc, min/max), Mux, XY Memory Address Gen, DSP XY Memory.]
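[Figure: 32-bit status register layout, bits 31 to 0, with the flag fields P, L, C, Z, S, V, R; the exact bit position of each flag is not recoverable from the extraction.]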
[Figure: five-stage pipeline IF, ID, EX, MA, WB. IU blocks: inst address gen, BUS, inst memory, inst decoder/control, inst folding, branch target gen, stack pointer, SPR, execution, forward, data memory, system controller (TLB, MMU), register file access, GPR. DSP blocks: DSP prefetch queue, DSP address gen, DSP XY memory, DSP unit.]
[Figure: MAC datapath with inputs C (64 bits), A and B. Path 1: Booth encoder R4, Booth selector R4, Wallace tree. Path 2: 3A generation, Booth encoder R8, Booth selector R8, Wallace tree. A mux selects C or 0, and a 64-bit final adder produces the 64-bit result.]
[Table: radix-4 Booth encoding]
Yi+1  Yi  Yi-1  output
 0    0    0     0X
 0    0    1     1X
 0    1    0     1X
 0    1    1     2X
 1    0    0    -2X
 1    0    1    -1X
 1    1    0    -1X
 1    1    1     0X
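The radix-4 table above can be exercised directly in software. The following is a minimal sketch, not the hardware encoder: the names RADIX4_TABLE, booth_radix4_digits and booth_multiply are hypothetical, and the multiplier width is assumed to be even.

```python
# Keys are (Yi+1, Yi, Yi-1); values are the multiple of X to add.
RADIX4_TABLE = {
    (0, 0, 0): 0, (0, 0, 1): 1, (0, 1, 0): 1, (0, 1, 1): 2,
    (1, 0, 0): -2, (1, 0, 1): -1, (1, 1, 0): -1, (1, 1, 1): 0,
}


def booth_radix4_digits(y, width):
    """Recode a signed `width`-bit multiplier into radix-4 Booth digits.

    Digits are returned least significant first. `width` must be even.
    """
    unsigned = y & ((1 << width) - 1)
    padded = unsigned << 1            # append the implicit 0 below Y0
    digits = []
    for i in range(0, width, 2):
        group = (padded >> i) & 0b111
        key = ((group >> 2) & 1, (group >> 1) & 1, group & 1)
        digits.append(RADIX4_TABLE[key])
    return digits


def booth_multiply(x, y, width=16):
    """Sum the selected multiples of x, each weighted by a power of 4."""
    digits = booth_radix4_digits(y, width)
    return sum(d * x * 4 ** k for k, d in enumerate(digits))
```

Each group of three overlapping bits yields one digit, so a 16-bit multiplier produces 8 partial products instead of 16.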
[Table: radix-8 Booth encoding]
Yi+2  Yi+1  Yi  Yi-1  output
 0     0    0    0     0X
 0     0    0    1     1X
 0     0    1    0     1X
 0     0    1    1     2X
 0     1    0    0     2X
 0     1    0    1     3X
 0     1    1    0     3X
 0     1    1    1     4X
 1     0    0    0    -4X
 1     0    0    1    -3X
 1     0    1    0    -3X
 1     0    1    1    -2X
 1     1    0    0    -2X
 1     1    0    1    -1X
 1     1    1    0    -1X
 1     1    1    1     0X

[Figure: multiplier bit string 0 0 Y32 Y31 ... Y16 Y15 ... Y1 Y0 0, annotated with a radix-8 Booth encoding region and a radix-4 Booth encoding region.]
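Every row of the radix-8 table above follows from one weighted sum, -4*Yi+2 + 2*Yi+1 + Yi + Yi-1. The sketch below checks that and shows why radix-8 needs a precomputed 3X multiple, while 1X, 2X and 4X are plain shifts. All names are hypothetical, and the multiplier width is assumed to be a multiple of 3.

```python
# Output column of the radix-8 table, indexed by the 4-bit group value.
RADIX8_TABLE = [0, 1, 1, 2, 2, 3, 3, 4, -4, -3, -3, -2, -2, -1, -1, 0]


def booth_radix8_digit(y3, y2, y1, y0):
    """Digit for the overlapping group (Yi+2, Yi+1, Yi, Yi-1)."""
    return -4 * y3 + 2 * y2 + y1 + y0


def booth_radix8_digits(y, width):
    """Recode a signed `width`-bit multiplier; `width` must be a multiple of 3."""
    padded = (y & ((1 << width) - 1)) << 1   # implicit 0 below Y0
    digits = []
    for i in range(0, width, 3):
        g = (padded >> i) & 0b1111
        digits.append(booth_radix8_digit((g >> 3) & 1, (g >> 2) & 1,
                                         (g >> 1) & 1, g & 1))
    return digits


def multiply_radix8(x, y, width=18):
    """Multiply using only 0, X, 2X, 3X, 4X and their negations."""
    multiples = {0: 0, 1: x, 2: x << 1, 3: (x << 1) + x, 4: x << 2}
    total = 0
    for k, d in enumerate(booth_radix8_digits(y, width)):
        m = multiples[abs(d)]
        total += (-m if d < 0 else m) << (3 * k)
    return total
```

The 3X entry is the only multiple that needs an addition, which is the cost radix-8 pays for producing one partial product per three multiplier bits.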
[Figure: partial product bit-position diagrams, columns 0 to 63. The rows are 34-bit to 42-bit partial products with sign-extension prefixes ("1 1 ~" or "1 ~"), 1's complement rows, correction carry-in bits (c), the 64-bit C operand, and the rows fed to the final adder.]
[Figure: sign extension. (a) 8-bit sign extension: the sign bit of the data is copied into the upper bits. (b) 16-bit sign extension: sign and data fields.]
[Figure: clamp unit. Blocks: Input, Sign Count, Overflow check, Clamp, output.]
[Figure: pack with clamp. The four 16-bit inputs (dst + 1)[31:16], (dst + 1)[15:0], dst[31:16] and dst[15:0] each pass through an 8-bit clamp, and the four 8-bit results form dst[31:24], dst[23:16], dst[15:8] and dst[7:0].]
[Figure: unpack with sign extension. The four 8-bit inputs dst[31:24], dst[23:16], dst[15:8] and dst[7:0] each pass through a 16-bit sign extend, and the four 16-bit results form dst[31:16], dst[15:0], (dst + 1)[31:16] and (dst + 1)[15:0].]
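The per-lane operations in the two figures above, an 8-bit clamp on each 16-bit value and a 16-bit sign extend on each byte, can be sketched as follows. Several points are assumptions and not stated in the source: the clamp is taken to be signed saturation to [-128, 127], lanes are listed most significant first, and the function names are hypothetical. The mapping of lanes onto the dst and dst + 1 registers is left out because the extracted figures do not fix it.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def clamp16_to_8(value):
    """Saturate a signed 16-bit value to the signed 8-bit range (assumed)."""
    return max(-128, min(127, value))


def pack4(halfwords):
    """Clamp four 16-bit lanes to 8 bits and pack them into one 32-bit word."""
    word = 0
    for h in halfwords:                      # most significant lane first
        word = (word << 8) | (clamp16_to_8(sign_extend(h, 16)) & 0xFF)
    return word


def unpack4(word):
    """Sign-extend the four bytes of a 32-bit word to 16-bit lanes."""
    return [sign_extend(word >> shift, 8) & 0xFFFF for shift in (24, 16, 8, 0)]
```

With these assumptions, 0x7FFF saturates to 0x7F and 0x8000 to 0x80, and unpacking 0x80 gives 0xFF80.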
[Figure: address generator. Inputs: L-bit, M-bit and Memory Index (Address). Blocks: Mask bits Generator, ADDER, Bits Reverse, Logical shift (15-l), and AND/OR gates that combine the paths into Output.]
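[Figure: floating-point data formats]
(a) single: bit 31 = S, bits 30 to 23 = E, bits 22 to 0 = F
(b) double: bit 63 = S, bits 62 to 52 = E, bits 51 to 0 = F

[Figure: significand formats]
(a) single: bits 31 to 24 = 8'b0, bit 23 = 1, bits 22 to 0 = F[22:0]
(b) double: bits 63 to 53 = 11'b0, bit 52 = 1, bits 51 to 0 = F[51:32] (higher 32-bit data) and F[31:0] (lower 32-bit data)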
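[Figure: FPU interface. The IU Core passes a 16-bit instruction with a valid signal to the FPU Sub-decoder and exchanges load data and store data through the load/store register. The IU 32-bit register file drives Source A and Source B (32 bits each) with forwarding into FP-ALU, FP-MUL and FP-DIV, which share a 32-bit result bus. FP-MUL uses the DSP MAC Unit through the Multiplier control and busy signals.]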
[Figure: floating-point adder datapath in three stage groups separated by pipeline registers.
ST 1 (single precision) / ST 1,2 (double precision): 11-bit exp sub (exp_diff, exp_diff_cout), 32-bit comparator (significand_comparator_out, compare_out), effective operation (effective_out), swap, aligner (aligned_out, 53 bits), sticky gen (55-bit sticky_gen_input, sticky_bit), exp_max, sign_final, opcode_out, double_state_out.
ST 2 (single precision) / ST 3,4 (double precision): complement mux, 27-bit significand adder, 30-bit lzc, g_bit, r_bit, sticky_bit, 54-bit reg, 6-bit reg, 3-bit reg.
ST 3 (single precision) / ST 5,6 (double precision): 54-bit normalizer (nor_rnd_sel), round selector, rounding controller (round control, roundin), increaser, mux, 27-bit reg.]
[Figure: stage-to-cycle mapping for the three stage groups ES/Comparator/Aligner/Sticky gen, Significand Adder, and Normalization/Rounding, with cycle labels 1 to 6, panels (a) and (b).]
[Figure: pipeline timing for Inst, Inst+1, Inst+2. (a) Each instruction passes through EX1, EX2, EX3 once; annotated 1 cycle and 3 cycles. (b) Each instruction occupies EX1, EX2 and EX3 twice; annotated 2 cycles and 6 cycles.]
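[Figure: aligner built from 2-to-1 MUX stages. shft_cnta[0] / shft_cntb[0] select a shift of 2^0 = 1, shft_cnta[1] / shft_cntb[1] a shift of 2^1 = 2, shft_cnta[2] / shft_cntb[2] a shift of 2^2 = 4, and shft_cnta[5:3] / shft_cntb[5:3] the remaining shift; the result is Aligned out.]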
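[Figure: sticky bit generation. shift_cnta[2:0] drives Mask Control Generator 1 (5 bits) alongside BS 1, which shifts the 53-bit 1.f by 0 to 7 positions and outputs 1.f with G and R. shift_cnta[5:3] drives Mask Control Gen. 2 (6 bits), which masks 8-bit groups. The masked bits are ORed to form the Sticky bit.]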
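The staged alignment and sticky generation described in the two figures above can be modelled in a few lines. This is a simplified sketch under stated assumptions: shifts of 1, 2 and 4 are taken from shift count bits [2:0] and multiples of 8 from bits [5:3], the sticky bit is the OR of every bit shifted out, and the separate guard and round bits of the figure are not modelled. The function name and the 53-bit default width are assumptions for illustration.

```python
def align_with_sticky(significand, shift_count, width=53):
    """Right-shift a significand in stages and collect a sticky bit.

    Returns (aligned value, sticky), where sticky is 1 if any nonzero
    bit was shifted out.
    """
    assert 0 <= shift_count < 64
    value = significand & ((1 << width) - 1)
    sticky = 0
    # Fine stages: shift by 1, 2, 4 under control of shift_count[2:0].
    for bit in range(3):
        if (shift_count >> bit) & 1:
            amount = 1 << bit
            sticky |= int(value & ((1 << amount) - 1) != 0)
            value >>= amount
    # Coarse stage: shift by a multiple of 8 under shift_count[5:3].
    coarse = (shift_count >> 3) * 8
    sticky |= int(value & ((1 << coarse) - 1) != 0)
    value >>= coarse
    return value, sticky
```

The result matches a single shift by shift_count, with the sticky bit equal to the OR of the discarded low bits, so the staging changes the structure and not the arithmetic.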
[Figure: 27-bit CLA adder with inputs A and B (27 bits each) and a 27-bit sum; the carry out is registered and fed back as carry in.]
[Figure: adder output formats, bits 29 to 0]
(a) bits 29 to 3 = {3'b0, 1, SUM[22:0]}, bits 2 to 0 = G, R, S
(b) Cycle 1: bits 29 to 3 = SUM[26:0], bits 2 to 0 = G, R, S
    Cycle 2: bits 29 to 3 = SUM[26:0], bits 2 to 0 = 3'b0
[Figure: rounding unit. Inputs: G, R, S and C, I, K, L; Round mode; Exception mode. Blocks: Rounding Controller, Increaser, MSB 1-bit shift. Outputs: Rounding value, Final significand, and a signal to the exponent adjust adder.]
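[Figure: exponent and shift-count formats, bits 11 to 0]
(a) bits 11 to 3 = Single exponent, bits 2 to 0 = 3'b0
(b) bits 11 to 0 = Double exponent
(c) shift, LZC when single: fields at bits 11 to 3 and bits 2 to 0, with 3'b0 padding
(d) bits 11 to 6 = 6'b0, bits 5 to 0 = Leading zero count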
[Figure: double precision significand multiplication. X and Y are each split into a 21-bit high part (XH, YH) and a 32-bit low part (XL, YL). The four products XH·YH, XH·YL, XL·YH and XL·YL are combined by a 53-bit adder and a 21-bit Incrementor.]
[Figure: floating-point multiplier datapath in three register-separated stages.
Stage 1: inputs X and Y (32 bits), opcode (3), multiply_state (2), round_mode (2); Sign Selector (sign final); Exponent Adder (exp max, 11 bits); 3X Gen; Booth Encoder R8 with Booth Selector R8; Booth Encoder R4 with Booth Selector R4; two Multiplier arrays; Mux; 64-bit registers.
Stage 2: 64-bit Multiply Final Adder, 53-bit Significand Adder (sadder_cout), 21-bit Incrementor, Mux, Sticky0, Sticky1, State_Machine (double_state).
Stage 3: Exponent Adj Adder (exp final), Round Controller (roundin), 32-bit Increaser (cout), Shifter, significand final (32-bit reg).]
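[Figure: multiplier operand pairs]
single:        {8'b0,1,x[22:0]}     x  {8'b0,1,y[22:0]}
double, XH·YH: {11'b0,1,xh[19:0]}   x  {11'b0,1,yh[19:0]}
double, XL·YL: XL[31:0]             x  YL[31:0]
double, XH·YL: {11'b0,1,xh[19:0]}   x  YL[31:0]
double, XL·YH: XL[31:0]             x  {11'b0,1,yh[19:0]}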
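The operand pairs above imply that a 53-bit by 53-bit significand product is assembled from four narrower products. The sketch below checks that identity in software. It assumes normalized operands (hidden bit equal to 1), and the function names are hypothetical. The order in which the hardware accumulates the four products is not modelled.

```python
def split_double_significand(frac52):
    """Split {1, f[51:0]} into a 21-bit high part and a 32-bit low part."""
    significand = (1 << 52) | (frac52 & ((1 << 52) - 1))
    high = significand >> 32          # {1, f[51:32]}, 21 bits
    low = significand & 0xFFFFFFFF    # f[31:0], 32 bits
    return high, low


def multiply_double_significands(fx, fy):
    """Form the full product from XH*YH, XH*YL, XL*YH and XL*YL."""
    xh, xl = split_double_significand(fx)
    yh, yl = split_double_significand(fy)
    return ((xh * yh) << 64) + ((xh * yl + xl * yh) << 32) + xl * yl
```

Each factor is at most 32 bits wide, so every one of the four products fits a 32-bit by 32-bit multiplier array.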
[Figure: multiplier pipeline timing for Inst, Inst+1, Inst+2. (a) Each instruction passes through EX1, EX2, EX3 once; annotated 1 cycle and 3 cycles. (b) Each instruction occupies EX1 four times, then EX2 and EX3, with separate rows for the lower 32 bits and the higher 32 bits; annotated 4 cycles and 7 cycles.]
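[Figure: pipeline timing for the sequence fadd, fdiv, fsub through stages F, D, EX1, EX2, EX3, where fdiv repeats EX1 over several cycles. (a) With a dependency stall: fsub is held in D while fdiv occupies EX1. (b) Without a stall: fsub proceeds through EX1, EX2, EX3 while fdiv is still in EX1.]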
[Figure: floating-point divider. Blocks in extracted order: X mux, Y mux, Sign, Exponent Adder, ROM, Divisor Formation, Significand Subtractor, qpos, qneg, mux, Partial+1, Partial quotient mux, Exponent Adj Adder, Shifter, Inc, Shifter, Sticky.]