cDNA Microarray Experiment: Design Issues in Early Stage and the Need of Normalization

Byung Soo Kim, Ph.D.1, Sunho Lee, Ph.D.2, Sun Young Rha, M.D., Ph.D.3,4 and Hyun Cheol Chung, M.D., Ph.D.3,4

1Department of Applied Statistics, Yonsei University, Seoul 120-749, Korea; 2Department of Applied Mathematics, Sejong University, Seoul 143-747, Korea; 3Cancer Metastasis Research Center, College of Medicine, Yonsei University, Seoul 120-752, Korea; 4Brain Korea 21 Project for Medical Science, College of Medicine, Yonsei University, Seoul 120-752, Korea

Purpose: The cDNA microarray has become a useful tool for observing the expression of thousands of genes simultaneously. However, obtaining good-quality microarray data is not easy because of the inherent noise at various stages of the experiment. It is therefore essential, as an initial step of the data analysis, to understand the sources of variation in the microarray experiment and their sizes.

Materials and Methods: Total RNA extracted from HT-1080 fibrosarcoma cells and from normal rat tissues was hybridized to cDNA microarrays with 0.5 K human and 5 K rat genes, respectively. Homotypic reactions and dye-swap experiments were used to identify the sources of variation.

Results: The relative fluorescent intensities of the microarray, if unnormalized, show large variation, particularly in the lower-intensity region. The distribution of the log intensity ratios also departs from a band around zero, which is the pattern expected when the majority of genes on the microarray are not regulated. Normalization of the log ratios is therefore usually required as a preprocessing step. We claim that within-print-tip-group, intensity-dependent normalization through a loess fit adjustment is useful for this purpose, particularly in the initial stages of a microarray experiment.
Conclusion: For proper data analysis, an understanding of the sources of variation and preprocessing of the data with a suitable normalization method are important. It is also important to have interactive cooperation between the researcher and a statistician from the early stages of study design through the final stages of data analysis. (Cancer Research and Treatment 2003;35:533-540)

Key Words: cDNA microarray, Homotypic experiment
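The within-print-tip-group, intensity-dependent normalization named in the abstract can be illustrated with a minimal sketch. It assumes the conventional M-A definitions, M = log2(R/G) and A = (1/2) log2(R·G), for red and green channel intensities R and G. The function names, the span `frac=0.4`, and the synthetic data are hypothetical choices made for illustration and are not taken from the paper. The smoother is a simplified tricube-weighted local linear fit standing in for a full loess (it has no robustness iterations).

```python
import math
from collections import defaultdict


def ma_values(red, green):
    """Return (M, A) lists with M = log2(R/G) and A = 0.5 * log2(R*G)."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * math.log2(r * g) for r, g in zip(red, green)]
    return m, a


def local_linear_fit(x, y, frac=0.4):
    """Loess-style smoother: tricube-weighted local linear fit at each x[i].

    Simplified stand-in for loess (no robustness iterations); O(n^2 log n).
    """
    n = len(x)
    k = max(2, min(n, math.ceil(frac * n)))  # number of neighbours in the window
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)  # always > 0 because x0 itself has weight 1
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 1e-12 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def normalize_within_print_tip(m, a, tip, frac=0.4):
    """Subtract a per-print-tip smooth trend of M on A from each M value."""
    groups = defaultdict(list)
    for i, t in enumerate(tip):
        groups[t].append(i)
    normalized = list(m)
    for idx in groups.values():
        trend = local_linear_fit([a[i] for i in idx], [m[i] for i in idx], frac)
        for i, c in zip(idx, trend):
            normalized[i] = m[i] - c
    return normalized


# Synthetic illustration: two print tips with different intensity-dependent bias.
demo_a = [6 + 0.5 * i for i in range(20)] * 2
demo_tip = [1] * 20 + [2] * 20
demo_m = ([0.8 - 0.05 * v for v in demo_a[:20]]
          + [-0.3 + 0.02 * v for v in demo_a[20:]])
demo_norm = normalize_within_print_tip(demo_m, demo_a, demo_tip)
print(max(abs(v) for v in demo_norm))  # largest residual M after normalization
```

Fitting the trend separately for each print tip is what lets the adjustment absorb tip-specific shifts as well as the overall intensity dependence. For real data, an established loess implementation would be preferable to this simplified smoother.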
Byung Soo Kim, et al: Variations of cDNA Microarray and Adequate Normalization Method 539

Fig. 6. M-A plots of the 5 K rat cDNA microarray. (A) Black and orange circles denote M values before and after within-print-tip-group, intensity-dependent normalization, respectively. (B) and (C) show box plots of M values for the 16 print tips before and after within-print-tip-group, intensity-dependent normalization, respectively. Note in (B) that print tips 4, 8, 12, and 16 yield lower M values than the other print tips. These four blocks are located at the right edge of the array, so this effect is referred to as the edge effect, a special type of spatial effect.

... correlation coefficient was used.

Discussion

As described above, a cDNA microarray experiment passes through several steps, and error can be introduced at each of them. It is therefore important, first of all, to determine the size of the error at each step in properly designed initial experiments and to understand its type and meaning. First, the biological question to be addressed by the microarray experiment must be translated into a statistical question. Because this requires choosing an experimental design and analysis methods that can reduce the error at each step, it is desirable to consult a statistician in advance and to keep exchanging views as the work proceeds.

How many replicate observations are needed remains an unresolved problem. In general, the sample size n is determined once the difference to be detected (δ), the probability of a type I error (α), the probability of a type II error (β), and the variance of the observations are given. However, the variances of the expression levels of the thousands to tens of thousands of genes spotted on one microarray are not constant, and their sizes are also unknown, so it is difficult to calculate n with the theory available to date (16,20,21).

Conclusion

A cDNA microarray experiment involves several steps, and error can enter at every step, so it takes considerable time to obtain reproducible results. In the initial experiments it is very important to identify the sources of variation and their sizes, and for this purpose 1:1 homotypic experiments and replicate experiments should be performed. In this process, appropriate normalization is