Shader Programming on GPU (Cg: C for graphics)
Spring Semester 2008
Professor Insung Ihm (임인성)
Dept. of Computer Sci. & Eng., College of Engineering, Sogang University, Seoul, Korea
(c) 2008 Insung Ihm, Dept. of Computer Sci. & Eng., Sogang University
Cg (C for graphics): Introduction
- C-like high-level shading language for GPUs
- API- and platform-independent
(Courtesy of M. Kilgard)
Cg 1.4
Reference: Cg Toolkit User's Manual: A Developer's Guide to Programmable Graphics (Release 1.4, September 2005)

Cg's Programming Model for GPU
- GPUs consist of programmable processors and other non-programmable units.
- Programmable processors
  - Vertex processor
  - Pixel processor
  - Geometry processor (Cg 2.0)
Cg Language Profile
- Unlike that of CPUs, GPU programmability has not yet reached full generality.
- Cg uses the concept of a language profile, which defines the subset of the full Cg language that is supported on a particular hardware platform or API.
- Cg 2.0
Program Inputs and Outputs
Recall that the GPU is a (massively parallel) streaming processor!
How are the input and output data of each shader program specified?

vertex stream -> VS -> primitive stream -> GS (Cg 2.0) -> primitive stream -> Rasterization -> pixel stream -> PS -> pixel stream -> FB

Varying inputs versus uniform inputs
- Varying inputs: used for data that is specified with each element of the stream of input data.
- Uniform inputs: used for values that are specified separately from the main stream of input data and don't change with each stream element.
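The distinction between varying and uniform inputs can be sketched on the CPU in plain C. This is only an illustration of the streaming model, not Cg or its runtime: the names Vec3, Uniforms, shade_element, and run_stream are hypothetical, and the "shader" body (scale then offset) is an arbitrary placeholder.

```c
#include <stddef.h>

/* Hypothetical types for illustration only; they are not part of Cg. */
typedef struct { float x, y, z; } Vec3;

/* Uniform inputs: set once, identical for every element of the stream. */
typedef struct {
    Vec3  offset;
    float scale;
} Uniforms;

/* The "shader": it sees exactly one varying element plus the shared uniforms. */
static Vec3 shade_element(Vec3 varying_in, const Uniforms *u)
{
    Vec3 out;
    out.x = varying_in.x * u->scale + u->offset.x;
    out.y = varying_in.y * u->scale + u->offset.y;
    out.z = varying_in.z * u->scale + u->offset.z;
    return out;
}

/* The processor applies the shader to every stream element. No iteration
   depends on another, which is what allows a GPU to run them in parallel. */
static void run_stream(const Vec3 *in, Vec3 *out, size_t n, const Uniforms *u)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = shade_element(in[i], u);
    }
}
```

Calling run_stream with a two-element input and one Uniforms value transforms both elements with the same scale and offset; only the varying data differs per element.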
What kinds of attributes are associated with a stream element?
Example: vertex (arbvp1)
[Diagram: Vertex Shader]
All vertex programs must declare and set a vector output that uses the POSITION binding semantic.
Example: pixel (arbfp1)
[Diagram: Pixel Shader]
Interoperability between vertex programs and fragment programs
Varying outputs from a vertex program (e.g. COLOR0) -> Rasterization -> Varying inputs to fragment programs (e.g. COLOR0)
The value associated with the POSITION binding semantic may not be read in the fragment program (this is no longer true on G80).
What does Cg shader code look like?

    // This is C5E2v_fragmentLighting from "The Cg Tutorial"
    // (Addison-Wesley, ISBN 0321194969) by Randima Fernando
    // and Mark J. Kilgard. See page 124.

    // vertex.cg
    void C5E2v_fragmentLighting(float4 position  : POSITION,   // vector type
                                float3 normal    : NORMAL,
                            out float4 oPosition : POSITION,
                            out float3 objectPos : TEXCOORD0,
                            out float3 oNormal   : TEXCOORD1,
                        uniform float4x4 modelViewProj)        // matrix type
    {
        oPosition = mul(modelViewProj, position);
        objectPos = position.xyz;                              // swizzle operator
        oNormal   = normal;
    }

    // fragment.cg
    void C5E3f_basicLight(float4 position : TEXCOORD0,
                          float3 normal   : TEXCOORD1,
                      out float4 color    : COLOR,
                  uniform float3 globalAmbient,
                  uniform float3 lightColor,
                  uniform float3 lightPosition,
                  uniform float3 eyePosition,
                  uniform float3 Ke,
                  uniform float3 Ka,
                  uniform float3 Kd,
                  uniform float3 Ks,
                  uniform float  shininess)                    // scalar type
    {
        float3 P = position.xyz;
        float3 N = normalize(normal);                          // Cg standard library function

        // Compute emissive term
        float3 emissive = Ke;

        // Compute ambient term
        float3 ambient = Ka * globalAmbient;

        // Compute the diffuse term
        float3 L = normalize(lightPosition - P);
        float diffuseLight = max(dot(L, N), 0);
        float3 diffuse = Kd * lightColor * diffuseLight;

        // Compute the specular term
        float3 V = normalize(eyePosition - P);
        float3 H = normalize(L + V);
        float specularLight = pow(max(dot(H, N), 0), shininess);
        if (diffuseLight <= 0) specularLight = 0;              // control flow: if/else, while, for, return
        float3 specular = Ks * lightColor * specularLight;

        color.xyz = emissive + ambient + diffuse + specular;   // write mask operator
        color.w = 1;
    }
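A Phong Shading Example in Cg

In this example program:
- Vertex program: transforms the vertex position from object coordinates (OC) to clip coordinates (CC) (POSITION), and passes the vertex position and normal vector in eye coordinates (EC) through two registers intended for texture coordinates (TEXCOORD0, TEXCOORD1).
- Fragment program: receives as input the EC position and normal vector of the object point visible through the pixel, obtained by interpolation during rasterization, and evaluates the Phong illumination model in EC using the given light source and material information.

References
- Cg Toolkit User's Manual: A Developer's Guide to Programmable Graphics, Release 1.4.1, NVIDIA, March 2006.
- Cg Toolkit Reference Manual: A Developer's Guide to Programmable Graphics, Release 1.4.1, NVIDIA, March 2006.
- The Cg Tutorial: The Definitive Guide to Programmable Real-Time Graphics, R. Fernando et al., NVIDIA, 2003.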
The OpenGL Part
Profile: arbvp1/arbfp1 or later, or vp40/fp40 or later
(Coded by 김원태, updated by 손성진)

    static CGcontext Context;
    static CGprogram VertexProgram, FragmentProgram;

    /* A Cg profile indicates a subset of the full Cg language that is
       supported on a particular hardware platform or API. */
    static CGprofile VertexProfile = CG_PROFILE_VP40;
    static CGprofile FragmentProfile = CG_PROFILE_FP40;

    static CGparameter NormalFlag;

    static void CheckCgError(void) {
        CGerror err = cgGetError();
        if (err != CG_NO_ERROR) {
            printf("\n%s\n", cgGetErrorString(err));
            printf("%s\n", cgGetLastListing(Context));
            exit(1);
        }
    }

    void init_cg(void) {
        /* Create a Cg context that all shaders will use.
           Do this after OpenGL is initialized. */
        printf("Creating context...");
        Context = cgCreateContext();
        cgSetErrorCallback(CheckCgError);
        printf("Completed.\n");

        /* Compile a Cg program by adding it to a context. */
        printf("Compiling vertex program...");
        VertexProgram = cgCreateProgramFromFile(Context, CG_SOURCE, "vp.cg",
                                                VertexProfile, NULL, NULL);
        printf("Completed.\n");

        printf("Compiling fragment program...");
        FragmentProgram = cgCreateProgramFromFile(Context, CG_SOURCE, "fp.cg",
                                                  FragmentProfile, NULL, NULL);
        printf("Completed.\n");

        /* Get parameters: retrieve a shader parameter directly by name to get
           a handle to it, then set the value of scalar and vector parameters. */
        NormalFlag = cgGetNamedParameter(FragmentProgram, "nv_flag");
        cgGLSetParameter1f(NormalFlag, nv_flag);
        printf("Completed.\n");

        /* Create the vertex & fragment programs: pass the compiled object
           code to OpenGL by loading the shaders. */
        printf("Loading Cg program...");
        cgGLLoadProgram(VertexProgram);
        cgGLLoadProgram(FragmentProgram);
        printf("Completed.\n");

        /* Enable the shaders before executing a program in OpenGL. */
        cgGLEnableProfile(VertexProfile);
        cgGLEnableProfile(FragmentProfile);
    }

Instead of hard-coding a profile, the best available one can be queried:

    /* Get the best available vertex profile for the current OpenGL rendering context. */
    VertexProfile = cgGLGetLatestProfile(CG_GL_VERTEX);
    /* Ask the compiler to optimize for the specific HW underlying the OpenGL rendering context. */
    cgGLSetOptimalOptions(VertexProfile);
    CheckCgError();

Bind the shaders to the current state:

    cgGLBindProgram(VertexProgram);
    cgGLBindProgram(FragmentProgram);
    void exit_program(void) {
        /* Free all resources allocated for the shaders. */
        cgDestroyProgram(VertexProgram);
        cgDestroyProgram(FragmentProgram);
        /* Free all resources allocated for the context. */
        cgDestroyContext(Context);
        exit(1);
    }

    void draw_object(void) {
        int i, j, loop;
        MyPolygon *ptr;
        /* The initialization of ptr and loop is not shown on the slide. */

        glPushMatrix();
        glRotatef(angle, 0.0, 1.0, 0.0);
        glRotatef(xrot, 0.0, 1.0, 0.0);
        glRotatef(yrot, 1.0, 0.0, 0.0);
        if (draw_flag != COW) {
            glTranslatef(0.0, -1.0, 0.0);
            glRotatef(-90.0, 1.0, 0.0, 0.0);
        }

        i = 0;
        while (i++ < loop) {
            glBegin(GL_POLYGON);
            for (j = 0; j < ptr->nvertex; j++) {
                glNormal3fv(ptr->normal[j]);
                glVertex3fv(ptr->vertex[j]);
            }
            glEnd();
            ptr++;
        }
        glPopMatrix();
    }
Cg Vertex Shader

    struct _output {
        float4 position : POSITION;   // a vector type with x, y, z, and w fields
        float4 pEC      : TEXCOORD0;
        float3 nEC      : TEXCOORD1;
    };

    _output main(float4 position : POSITION,   // set by glVertex*()
                 float4 normal   : NORMAL,     // set by glNormal*()
                 // float4x4: a 4x4 matrix type with 16 elements
                 uniform float4x4 ModelViewProj : state.matrix.mvp,
                 uniform float4x4 ModelView     : state.matrix.modelview,
                 uniform float4x4 ModelViewIT   : state.matrix.modelview.invtrans)
    {
        _output OUT;
        // mul() is a Cg standard library function
        OUT.position = mul(ModelViewProj, position);   // to CC
        OUT.pEC = mul(ModelView, position);            // position in EC
        OUT.nEC = mul(ModelViewIT, normal).xyz;        // normal in EC
        return OUT;
    }

[Figure: the GPU instruction MAD R4, R1, R2, R3 computes R4 <- R1 * R2 + R3 component-wise on four-component registers.]
Binding Semantics for the arbvp1 Vertex Program Profile (Vertex Shader)
Cg Standard Library Functions
- mul(x, y)
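In Cg, mul(M, v) treats v as a column vector and mul(v, M) treats it as a row vector, so the argument order matters. The two products can be sketched in C as follows; Mat4, Vec4, mul_mat_vec, and mul_vec_mat are hypothetical names used only for this illustration, with the matrix stored row-major.

```c
/* Hypothetical CPU stand-ins for Cg's float4x4 and float4. */
typedef struct { float m[4][4]; } Mat4;   /* row-major: m[row][col] */
typedef struct { float v[4]; } Vec4;

/* Counterpart of mul(M, x): x is a column vector, result[i] = row i of M dot x. */
static Vec4 mul_mat_vec(const Mat4 *M, Vec4 x)
{
    Vec4 r;
    for (int i = 0; i < 4; i++) {
        r.v[i] = 0.0f;
        for (int j = 0; j < 4; j++) {
            r.v[i] += M->m[i][j] * x.v[j];
        }
    }
    return r;
}

/* Counterpart of mul(x, M): x is a row vector, result[j] = x dot column j of M. */
static Vec4 mul_vec_mat(Vec4 x, const Mat4 *M)
{
    Vec4 r;
    for (int j = 0; j < 4; j++) {
        r.v[j] = 0.0f;
        for (int i = 0; i < 4; i++) {
            r.v[j] += x.v[i] * M->m[i][j];
        }
    }
    return r;
}
```

For a translation matrix with the offset in the fourth column, mul_mat_vec moves the point (1, 2, 3, 1) as expected, while mul_vec_mat leaves x, y, z unchanged and alters w instead. This is why the shaders in these slides write mul(ModelViewProj, position) with the matrix first.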
Cg Pixel Shader

    struct _output {
        float3 color : COLOR;
    };

    _output main(float4 position : TEXCOORD0,   // position in EC
                 float3 normal   : TEXCOORD1,   // normal in EC
                 uniform int nv_flag,
                 uniform float4 global_ambient : state.lightmodel.ambient,
                 uniform float4 light_position : state.light[0].position,
                 uniform float4 light_ambient  : state.light[0].ambient,
                 uniform float4 light_diffuse  : state.light[0].diffuse,
                 uniform float4 light_specular : state.light[0].specular,
                 uniform float4 mat_ambient    : state.material.ambient,
                 uniform float4 mat_diffuse    : state.material.diffuse,
                 uniform float4 mat_specular   : state.material.specular,
                 uniform float  mat_shininess  : state.material.shininess)
    {
        _output OUT;
        if (nv_flag == 1) {
            OUT.color = normal;   // visualize the interpolated normal
        }
        else {
            OUT.color = (global_ambient * mat_ambient).xyz;
            float3 N = normalize(normal);
            float3 L = normalize(light_position.xyz - position.xyz);
            float NdotL = dot(N, L);
            if (NdotL >= 0.0) {
                OUT.color += (mat_ambient * light_ambient).xyz;
                float3 V = normalize(-position.xyz);
                float3 H = normalize(L + V);
                float NdotH = dot(N, H);
                float4 lighting = lit(NdotL, NdotH, mat_shininess);
                float3 diffuse = mat_diffuse.xyz * lighting.y;
                float3 specular = mat_specular.xyz * lighting.z;
                OUT.color += (light_diffuse.xyz * diffuse) + (light_specular.xyz * specular);
            }
        }
        return OUT;
    }

The state.* uniform parameters above are fed by the corresponding OpenGL state, set on the host side:

    void set_material(void) {
        glMaterialfv(GL_FRONT, GL_AMBIENT, mat_ambient);
        glLightModelfv(GL_LIGHT_MODEL_AMBIENT, global_ambient);
        glMaterialfv(GL_FRONT, GL_DIFFUSE, mat_diffuse);
        glMaterialfv(GL_FRONT, GL_SPECULAR, mat_specular);
        glMaterialf(GL_FRONT, GL_SHININESS, mat_shininess);
    }

    void set_light(void) {
        glLightfv(GL_LIGHT0, GL_POSITION, light_position);
        glLightfv(GL_LIGHT0, GL_AMBIENT, light_ambient_color);
        glLightfv(GL_LIGHT0, GL_DIFFUSE, light_diff_spec_color);
        glLightfv(GL_LIGHT0, GL_SPECULAR, light_diff_spec_color);
        glLightModeli(GL_LIGHT_MODEL_LOCAL_VIEWER, GL_TRUE);
    }
Binding Semantics for the arbfp1 Fragment Program Profile (Pixel Shader)
Cg Standard Library Functions
- dot(a, b)
- normalize(v)
- lit(NdotL, NdotH, m)
A Simple Texture Mapping Example in Cg

Vertex Shader

    struct _output {
        float4 position : POSITION;
        float2 texcoord : TEXCOORD0;
        float4 pEC      : TEXCOORD1;
        float3 nEC      : TEXCOORD2;
    };

    _output main(float4 position : POSITION,
                 float4 normal   : NORMAL,
                 float2 texcoord : TEXCOORD0,
                 uniform float4x4 ModelViewProj : state.matrix.mvp,
                 uniform float4x4 ModelView     : state.matrix.modelview,
                 uniform float4x4 ModelViewIT   : state.matrix.modelview.invtrans)
    {
        _output OUT;
        OUT.position = mul(ModelViewProj, position);   // to CC
        OUT.pEC = mul(ModelView, position);            // position in EC
        OUT.nEC = mul(ModelViewIT, normal).xyz;        // normal in EC
        OUT.texcoord = texcoord;
        return OUT;
    }
Pixel Shader

    struct _output {
        float3 color : COLOR;
    };

    _output main(float2 texcoord : TEXCOORD0,
                 float4 position : TEXCOORD1,   // position in EC
                 float3 normal   : TEXCOORD2,   // normal in EC
                 uniform sampler2D decal,
                 uniform int tex_flag,
                 uniform float4 global_ambient : state.lightmodel.ambient,
                 uniform float4 light_position : state.light[0].position,
                 uniform float4 light_ambient  : state.light[0].ambient,
                 uniform float4 light_diffuse  : state.light[0].diffuse,
                 uniform float4 light_specular : state.light[0].specular,
                 uniform float4 mat_ambient    : state.material.ambient,
                 uniform float4 mat_diffuse    : state.material.diffuse,
                 uniform float4 mat_specular   : state.material.specular,
                 uniform float  mat_shininess  : state.material.shininess)
    {
        _output OUT;
        OUT.color = (global_ambient * mat_ambient).xyz;
        float3 N = normalize(normal);
        float3 L = normalize(light_position.xyz - position.xyz);
        float NdotL = dot(N, L);
        if (NdotL >= 0.0) {
            OUT.color += (mat_ambient * light_ambient).xyz;
            float3 V = normalize(-position.xyz);
            float3 H = normalize(L + V);
            float NdotH = dot(N, H);
            float4 lighting = lit(NdotL, NdotH, mat_shininess);
            float3 diffuse = mat_diffuse.xyz * lighting.y;
            float3 specular = mat_specular.xyz * lighting.z;
            OUT.color += light_diffuse.xyz * diffuse;
            if (tex_flag == 1) {
                float3 decalColor = tex2D(decal, texcoord).xyz;
                OUT.color *= decalColor;   // 'GL_MODULATE' part
            }
            // Specular is added after modulation so the highlight is not tinted by the texture.
            OUT.color += light_specular.xyz * specular;
        }
        return OUT;
    }

The OpenGL part:

    // During init_opengl
    void set_textures(void) {
        read_texture();
        glGenTextures(1, &tex_name);
        glBindTexture(GL_TEXTURE_2D, tex_name);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, tex_br.ns, tex_br.nt, 0,
                     GL_RGB, GL_UNSIGNED_BYTE, tex_br.tmap);
    }

- For the rest of the OpenGL part, refer to the corresponding program.
Normal Map and Normal-Map Space
Generating normals from height fields
Height field: z_ij = z(i, j)
[Figure: a height field over the x-y plane and the normals derived from it; axes x, y, z]
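Bump Mapping a Torus
Parametric surfaces
[Figure: a torus with radii M and N; axes y, z]
Tangent, binormal, and normal vectors
[Figure: the parameter domain (s, t) in the unit square mapped to the surface point S(s, t), with tangent T, binormal B, and normal N at that point]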
Shading Normal-Map Space
Transformation from object space to normal-map space
[Figure: a vector V_o in the object-space frame (x_o, y_o, z_o) and the same vector V_n expressed in the normal-map-space frame (x_n, y_n, z_n) spanned by T, B, and N at S(s, t)]
A Bump Mapping Example
- Before and after applying the bump mapping technique
- An ordinary image map and a bump map
- Why is the bump map bluish in color?
Vertex Shader

    struct _output {
        float4 oPosition      : POSITION;
        float2 oTexCoord      : TEXCOORD0;
        float3 lightDirection : TEXCOORD1;
        float3 eye_direction  : TEXCOORD2;
    };

    _output main(float2 parametric : POSITION,   // torus parameters (s, t)
                 uniform float3 lightPosition,   // object space
                 uniform float3 eyePosition,     // object space
                 uniform float4x4 modelViewProj,
                 uniform float2 torusInfo,
                 uniform float decalFlag)
    {
        _output OUT;
        const float pi2 = 6.28318530;   // 2 * Pi

        // Stretch texture coordinates CCW over the torus
        // to repeat the normal map in a 6 by 2 pattern
        float M = torusInfo[1];   // 1.5
        float N = torusInfo[0];   // 1.0
        OUT.oTexCoord = parametric * float2(6, 2);   // texture coordinates
        if (decalFlag == 0) {
            if (OUT.oTexCoord.y > -1.0) {
                OUT.oTexCoord.y = -OUT.oTexCoord.y;
            }
        }

        // Compute the torus position from its parametric equation
        float cosS, sinS;
        sincos(pi2 * parametric.x, sinS, cosS);
        float cosT, sinT;
        parametric.y = -parametric.y;
        sincos(pi2 * parametric.y, sinT, cosT);

        // Make the torus position from the torus info (meridian_slices, core_slices)
        //   x = (M + N cos(2*pi*t)) cos(2*pi*s)
        //   y = (M + N cos(2*pi*t)) sin(2*pi*s)
        //   z = N sin(2*pi*t)
        float3 torusPosition = float3((M + N * cosT) * cosS,
                                      (M + N * cosT) * sinS,
                                      N * sinT);
        OUT.oPosition = mul(modelViewProj, float4(torusPosition, 1));   // CC position of the vertex

        // Compute the per-vertex rotation matrix
        float3 dPds = float3(-sinS * (M + N * cosT), cosS * (M + N * cosT), 0);
        float3 norm_dPds = normalize(dPds);                         // T vector
        float3 normal = float3(cosS * cosT, sinS * cosT, sinT);     // N vector
        float3 dPdt = cross(normal, norm_dPds);                     // B vector
        if (decalFlag == 0)
            if (OUT.oTexCoord.y > -1.0)
                dPdt = -dPdt;
        float3x3 rotation = float3x3(norm_dPds, dPdt, normal);

        // Rotate object-space vectors to texture (normal-map) space
        float3 eyeDirection = eyePosition - torusPosition;
        float3 tmp_lightDirection = lightPosition - torusPosition;
        OUT.lightDirection = mul(rotation, tmp_lightDirection);
        eyeDirection = mul(rotation, eyeDirection);
        OUT.eye_direction = eyeDirection;
        return OUT;
    }
Pixel Shader

    struct _output {
        float3 color : COLOR;
    };

    // Expand a range-compressed vector: (0, 1) ==> (-1, 1)
    float3 expand(float3 v) {
        return (v - 0.5) * 2;
    }

    _output main(float2 normalMapTexCoord : TEXCOORD0,
                 float3 lightDirection    : TEXCOORD1,
                 float3 eyeDirection      : TEXCOORD2,
                 uniform float3 ambient,
                 uniform sampler2D decalMap,
                 uniform sampler2D normalMap)
    {
        _output OUT;
        float3 decalColor;
        float3 V = eyeDirection;
        float3 L = lightDirection;
        L = normalize(L.xyz);
        V = normalize(V.xyz);

        // Fetch and expand the range-compressed normal
        float3 normalTex = tex2D(normalMap, normalMapTexCoord).xyz;
        float3 normal = expand(normalTex);
        normal.xyz = normalize(normal.xyz);

        decalColor = tex2D(decalMap, normalMapTexCoord.xy).xyz;
        float temp_1 = dot(normal, L);   // N dot L
        float temp_2 = dot(normal, V);   // N dot V
        OUT.color = ambient + decalColor * temp_1 + pow(temp_2, 25);
        return OUT;
    }
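The expand() helper undoes the range compression used when a normal is stored as a color, and the same arithmetic explains the bluish tint of a normal map. The C sketch below pairs expand() with its inverse; the function name compress and the Vec3 type are hypothetical, introduced only for this illustration.

```c
typedef struct { float x, y, z; } Vec3;

/* Store a unit normal with components in [-1, 1] as a color in [0, 1]. */
static Vec3 compress(Vec3 n)
{
    Vec3 c = { 0.5f * n.x + 0.5f, 0.5f * n.y + 0.5f, 0.5f * n.z + 0.5f };
    return c;
}

/* Same arithmetic as the shader's expand(): [0, 1] back to [-1, 1]. */
static Vec3 expand(Vec3 c)
{
    Vec3 n = { (c.x - 0.5f) * 2.0f, (c.y - 0.5f) * 2.0f, (c.z - 0.5f) * 2.0f };
    return n;
}
```

An unperturbed normal in normal-map space is (0, 0, 1), which compresses to the color (0.5, 0.5, 1.0). Since most texels of a typical normal map deviate only slightly from that direction, the map as a whole looks light blue.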
From Pixar RenderMan Shader to Cg Shader
windowhighlight()
Let us convert this RenderMan shader into a Cg shader.
    /* Copyrighted Pixar 1989 */
    /* From the RenderMan Companion p.357 */
    /* Listing 16.20 Surface shader providing a paned-window highlight */

    /*
     * windowhighlight(): Give a surface a window-shaped specular highlight.
     */
    surface
    windowhighlight(
        point center = point "world" (0, 0, -4),   /* center of the window */
              in     = point "world" (0, 0, 1),    /* normal to the wall */
              up     = point "world" (0, 1, 0);    /* 'up' on the wall */
        color specularcolor = 1;
        float Ka = .3,
              Kd = .5,
              xorder = 2,        /* number of panes horizontally */
              yorder = 3,        /* number of panes vertically */
              panewidth = 6,     /* horizontal size of a pane */
              paneheight = 6,    /* vertical size of a pane */
              framewidth = 1,    /* sash width between panes */
              fuzz = .2;)        /* transition region between pane and sash */
    {
        uniform point
              in2,       /* normalized in */
              right,     /* unit vector perpendicular to in2 and up2 */
              up2,       /* normalized up perpendicular to in */
              corner;    /* location of lower left corner of window */
        point path,      /* incident vector I reflected about normal N */
              PtoC,      /* vector from surface point to window corner */
              PtoF;      /* vector from surface point to wall along path */
        float offset, modulus, yfract, xfract;
        point Nf = faceforward( normalize(N), I );

        /* Set up uniform variables as described above */
        in2 = normalize(in);
        right = up ^ in2;
        up2 = normalize(in2 ^ right);
        right = up2 ^ in2;
        corner = center - right*xorder*panewidth/2 - up2*yorder*paneheight/2;

        path = reflect(I, normalize(Nf));   /* trace source of highlight */
        PtoC = corner - Ps;
        if (path.PtoC <= 0) {               /* outside the room */
            xfract = yfract = 0;
        } else {
            /*
             * Make PtoF be a vector from the surface point to the wall by
             * adjusting the length of the reflected vector path.
             */
            PtoF = path * (PtoC.in2)/(path.in2);

            /*
             * Calculate the vector from the corner to the intersection point, and
             * project it onto up2. This length is the vertical offset of the
             * intersection point within the window.
             */
            offset = (PtoF - PtoC).up2;
            modulus = mod(offset, paneheight);
            if( offset > 0 && offset/paneheight < yorder ) {   /* inside the window */
                if( modulus > (paneheight/2))   /* symmetry about pane center */
                    modulus = paneheight - modulus;
                yfract = smoothstep(            /* fuzz at the edge of a pane */
                    (framewidth/2) - (fuzz/2),
                    (framewidth/2) + (fuzz/2),
                    modulus);
            } else {
                yfract = 0;
            }

            /* Repeat the process for horizontal offset */
            offset = (PtoF - PtoC).right;
            modulus = mod(offset, panewidth);
            if( offset > 0 && offset/panewidth < xorder ) {
                if( modulus > (panewidth/2))
                    modulus = panewidth - modulus;
                xfract = smoothstep(
                    (framewidth/2) - (fuzz/2),
                    (framewidth/2) + (fuzz/2),
                    modulus);
            } else {
                xfract = 0;
            }
        }

        /* specular calculation using the highlight */
        Ci = Cs * (Kd*diffuse(Nf) + Ka*ambient()) + yfract*xfract*specularcolor;
    }

Note: in this RenderMan listing, '.' between two points is the dot product and '^' is the cross product.
Cg 2.0
Reference: Cg-2.0_Jan2008_ReferenceManual.pdf

Example: vertex (gp4vp)
- Vertex Attribute Input Semantics
- Output Semantics

Example: geometry (gp4gp)
- Primitive Instance Input Semantic
- Vertex Instance Input Semantic
- Vertex Attribute Input Semantics
- Output Semantics

Example: pixel (gp4fp)
- Interpolated Input Semantic
- Interpolation Semantic Modifiers
- Per-primitive Input Semantics
- Output Semantics