A GIS-Based Method for Delineating Spatial Clusters: A Modified AMOEBA Technique

Sang-Il Lee*, Daeheon Cho**, Hakgi Sohn***, Miok Chae****

Abstract: The main objective of this paper is to develop a GIS-based method for delineating spatial clusters. The major tasks are (i) to devise a sustainable algorithm with reference to various methods developed in the fields of geographic boundary analysis and cluster detection, and (ii) to develop a GIS-based program that implements the algorithm. The main results are as follows. First, the AMOEBA technique utilizing LISA is identified as the best candidate. Second, a modified version of the AMOEBA technique is proposed and implemented in a GIS environment. Third, the validity and usefulness of the modified AMOEBA algorithm are confirmed by applying it to test and real data sets.

Keywords: geographic boundary analysis, cluster detection, delineation of spatial clusters, wombling, local indicators of spatial association (LISA), AMOEBA technique

* Associate Professor, Department of Geography Education, Seoul National University, si_lee@snu.ac.kr
** Professor for Special Appointment, The Graduate School of Education, Ewha Womans University, dhncho@gmail.com
*** Associate Research Fellow, Korea Research Institute for Human Settlements, hgsohn@krihs.re.kr
**** Senior Research Fellow, Korea Research Institute for Human Settlements, mochae@krihs.re.kr
(Thurstain-Goodwin and Unwin, 2000; Office of the Deputy Prime Minister, 2002), (Sutton, 2003; Balk et al., 2006), (Sohn, 2008; Sohn and Park, 2008) GIS (area objects) GIS GIS GIS (continuous field) (discrete objects) (Jacquez et al., 2000, 222) (Lee et al., 2009, 86-90)
(regular grid data) (spatially intensive variables) (Goodchild and Lam, 1980) (normalized) 1 (line objects) 1 2 (Jacquez et al., 2000, 224-225) Figure 1 Figure 1 (a) (open boundary) (difference boundary) (Jacquez et al., 2000, 225) (b) (c) 2 (closed boundary) (areal boundary)

Figure 1. Three situations in geographic boundary analysis, shown in panels (a), (b) and (c). (Source: Jacquez et al., 2000)
(Jacquez et al., 2000, 225) (b) (c) (b) (c) (b) (patch) (b) Figure 1 (a) (wombling) (b) (spatially constrained cluster analysis) (c) (cluster detection) Womble (1951) ecotone (Fortin and Dale, 2005, 184) (lattice wombling) (Fortin and Dale, 2005, 190) 10 30 (Jacquez et al., 2000, 228) (Lu and Carlin, 2005) (adjacency requirement) k- (Legendre and Legendre, 1998)
(Fortin and Dale, 2005, 177) (Fortin and Dale, 2005, 180) (Figure 1 (c)) (Waller and Gotway, 2004; Lawson and Kleinman, 2005; Rogerson and Yamada, 2009; Tango, 2010) LISA (local indicators of spatial association) (Anselin, 1995) LISA Moran Ii Geary ci (Anselin, 1995), Getis-Ord Gi Gi* (Getis and Ord, 1992; Ord and Getis, 1995; Getis and Ord, 1996) LISA ESDA (exploratory spatial data analysis) 1990 (Anselin, 1996; Unwin, 1996; Anselin and Bao, 1997; Anselin, 1998; Brunsdon, 1998; Dykes, 1998; Unwin and Unwin, 1998). LISA Figure 1 LISA Figure 1(a) (difference boundary) LISA Moran Ii Anselin (1996) Moran H-L L-H LISA Getis-Ord Gi* Boots (2001) Getis-Ord Gi*
Gi* LISA Anselin Moran (Anselin and Bao, 1997) Moran H-H L-L H-L L-H Moran LISA Moran LISA LISA GeoDa (Anselin, 2003) Wulder and Boots (1998) Getis-Ord Gi* Gi* Gi* (maximum Gi*) (maximum Gi* distance) Fortin and Dale (2005, 159) Gi* Gi* AMOEBA AMOEBA (A Multidirectional Optimal Ecotope-Based Algorithm) LISA (Aldstadt and Getis, 2006). AMOEBA LISA Getis-Ord Gi*
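For reference, the Getis-Ord Gi* statistic on which AMOEBA relies is usually written in the standardized form of Ord and Getis (1995), which is also the form used by Aldstadt and Getis (2006). The block below is a sketch of that standard definition rather than a reproduction of this paper's own typeset equation; the symbols x_j (the value observed at cell j) and \bar{x} (the mean over all n cells) are notation assumed here, while s is the standard deviation over all cells and w_ij is the spatial weight between cells i and j, with w_ii = 1.

```latex
G_i^{*} =
\frac{\displaystyle \sum_{j=1}^{n} w_{ij} x_j \;-\; \bar{x} \sum_{j=1}^{n} w_{ij}}
     {\displaystyle s \sqrt{\frac{n \sum_{j=1}^{n} w_{ij}^{2} - \left( \sum_{j=1}^{n} w_{ij} \right)^{2}}{n-1}}},
\qquad
\bar{x} = \frac{1}{n} \sum_{j=1}^{n} x_j,
\qquad
s = \sqrt{\frac{1}{n} \sum_{j=1}^{n} x_j^{2} - \bar{x}^{2}}
```

With binary weights (w_ij = 1 for the m cells currently in the region, 0 otherwise), both weight sums reduce to m, so the statistic depends only on the sum of the member values and on the region size.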
1 s wij n i j wij 1 wij 0 wii 1 0 1 (Aldstadt and Getis, 2006, 330) LISA Moran Ii Geary ci Moran Ii Geary ci Moran Ii Getis-Ord Gi* LISA Moran Ii 0 0 Moran Ii (spatial outlier) Geary ci Geary ci Moran Ii Getis-Ord Gi* Gi* AMOEBA Getis-Ord Gi* Moran Ii Geary ci LISA Getis-Ord Gi* Figure 2 AMOEBA 1 Gi* Gi*(0) Gi* 1 Gi*(0) z
Gi*(0) 0 Gi*(0) 0 Gi* 1 15 4 6 4 1 15 Gi* Gi*(1) Gi*(0) 0 Gi*(1) Gi*(0) 1 Gi*(0) 0 Gi*(1) Gi*(0) 1 Figure 2 (a) 1

Figure 2. The AMOEBA algorithm: (a) Stage 1, (b) Stage 2, (c) Stage 3, (d) Final Stage. (Source: Aldstadt and Getis, 2006)
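The region-growing procedure illustrated in Figure 2 can be sketched in Python as follows. This is a minimal illustration of the original AMOEBA idea of Aldstadt and Getis (2006), not the program described in this paper and not either of the modified versions. Several things are assumptions made for the sketch: the function names (`gi_star`, `rook_neighbors`, `amoeba`), a regular grid stored as a flat row-major list, rook contiguity, and binary weights. Instead of enumerating every combination of contiguous cells (15 combinations for four neighbors), the sketch ranks the candidates by value and tests only the top-k prefixes; under binary weights this finds the same optimum, because for a fixed number of added cells the denominator of Gi* is fixed and the numerator is largest for the highest values.

```python
import math
from statistics import mean, pstdev


def gi_star(values, members):
    """Gi* of a set of cells under binary weights (w_ij = 1 for members)."""
    n, m = len(values), len(members)
    numerator = sum(values[j] for j in members) - mean(values) * m
    denominator = pstdev(values) * math.sqrt((n * m - m * m) / (n - 1))
    return numerator / denominator


def rook_neighbors(idx, rows, cols):
    """Indices of the cells sharing an edge with cell idx (row-major grid)."""
    r, c = divmod(idx, cols)
    out = []
    if r > 0:
        out.append(idx - cols)
    if r < rows - 1:
        out.append(idx + cols)
    if c > 0:
        out.append(idx - 1)
    if c < cols - 1:
        out.append(idx + 1)
    return out


def amoeba(values, rows, cols, seed):
    """Grow a region from seed while Gi* keeps moving away from zero.

    Assumes the values are not all identical (otherwise s = 0).
    Returns the set of member cell indices and the final Gi*.
    """
    region = {seed}
    excluded = set()  # candidates rejected at an earlier stage stay out
    best = gi_star(values, region)
    sign = 1 if best >= 0 else -1  # maximize if Gi*(0) >= 0, else minimize
    while True:
        frontier = {j for i in region for j in rook_neighbors(i, rows, cols)}
        frontier -= region | excluded
        if not frontier:
            break
        ranked = sorted(frontier, key=lambda j: values[j], reverse=sign > 0)
        step_best, step_k = None, 0
        for k in range(1, len(ranked) + 1):
            if len(region) + k >= len(values):
                break  # Gi* is undefined when the region covers every cell
            g = gi_star(values, region | set(ranked[:k]))
            if step_best is None or sign * g > sign * step_best:
                step_best, step_k = g, k
        if step_best is None or sign * step_best <= sign * best:
            break  # no combination improves on the previous stage
        region |= set(ranked[:step_k])
        excluded |= set(ranked[step_k:])
        best = step_best
    return region, best
```

As a usage example, on a 5 x 5 grid of zeros containing a 2 x 2 block of high values, `amoeba(values, 5, 5, seed)` started from any cell of the block grows to exactly that block and then stops, because adding any surrounding zero lowers Gi*.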
2 1 Figure 2 (b) (a) 7 Gi* Gi*(2) Gi*(2) Gi*(1) Gi*(0) 0 2 Gi* Gi*(0) 0 Figure 2 4 6 (d) Aldstadt and Getis (2006, 336) Gi* Gi* Gi* Aldstadt and Getis (2006) Gi* Gi* Gi* Gi* Gi* Gi* Gi* Gi*
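... is likely to be larger than the value of the center cell, so the Gi* statistic can keep increasing; where gradually varying positive spatial autocorrelation exists, a fairly wide area therefore ends up being detected as a single cluster.

The problem becomes evident when a cluster searched from a cell with a very large value overlaps a cluster searched from a cell whose value is close to the mean. A cluster that starts from the smaller value expands toward the place where high values are concentrated, and in pursuing the maximization of the Gi* statistic it eventually covers most or all of the area of the cluster that started from the large value, while its Gi* statistic also becomes larger. Figure 3, in which clusters were searched from the point with the largest value (standard score 5.97) and from a point close to the mean (standard score 0.35) using sample data, illustrates this well. The two clusters started from different cells but overlap (the cluster started from the high value is contained in the cluster started from the low value), and the cluster searched from the point with the largest value ...

Figure 3. A comparison of different spatial cluster boundaries due to different starting cells.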
Gi* 46 56 47 57 Gi* AMOEBA Aldstadt and Getis (2006, 335) AMOEBA AMOEBA AMOEBA AMOEBA AMOEBA AMOEBA 1 AMOEBA 2 AMOEBA 2 AMOEBA Gi* Gi* AMOEBA Gi* Gi* 0 1 99 2 58

Figure 4. Conceptual framework for a GIS-based program for delineating spatial clusters.
GIS GIS ESRI ArcGIS (9.x) Microsoft Visual Basic 6.0 ArcGIS COM DLL (Figure 4) ArcGIS Figure 5 ID Gi* GIS

Figure 5. A GIS-based program for delineating spatial clusters.
AMOEBA AMOEBA AMOEBA AMOEBA Boots and Tiefelsdorf (2000) Boots (2001) (projection matrix) 2 (quadratic form) Moran 20 20 400 0 0 05 4 Table 1 4 AMOEBA AMOEBA Gi* 2 58 Figure 6 4 AMOEBA AMOEBA 1 AMOEBA 2 1 AMOEBA AMOEBA AMOEBA 2 Moran Ii 9 AMOEBA 7 AMOEBA 9 3 Moran Ii AMOEBA AMOEBA 16 AMOEBA 9 AMOEBA 16 4 3

Table 1. Statistical summary for test data.

Descriptive statistics   Pattern 1   Pattern 2   Pattern 3   Pattern 4
----------------------   ---------   ---------   ---------   ---------
Mean                         0.000       0.000       0.000       0.000
Standard deviation           0.050       0.050       0.050       0.050
Maximum                      0.102       0.097       0.091       0.091
Minimum                     -0.102      -0.095      -0.091      -0.091
Moran's I                    1.035       0.926       0.881       0.875
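The Moran's I row of Table 1 summarizes the overall spatial autocorrelation of each test pattern. As a rough illustration of how such a global value can be computed for a regular grid, the sketch below evaluates Moran's I under assumptions that the surviving text does not confirm: binary rook-contiguity weights without row standardization, a flat row-major list of cell values, and a hypothetical function name `morans_i`. It will therefore not necessarily reproduce the figures in the table.

```python
def morans_i(values, rows, cols):
    """Global Moran's I on a rows x cols grid with binary rook weights."""
    n = rows * cols
    xbar = sum(values) / n
    z = [v - xbar for v in values]  # deviations from the mean
    cross = 0.0  # sum over i, j of w_ij * z_i * z_j
    s0 = 0       # sum of all weights (each adjacency counted in both directions)
    for idx in range(n):
        r, c = divmod(idx, cols)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                cross += z[idx] * z[rr * cols + cc]
                s0 += 1
    return (n / s0) * cross / sum(v * v for v in z)
```

For example, a checkerboard of alternating +1 and -1 values gives perfectly negative autocorrelation, while a grid split into a high half and a low half gives a positive value.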
Figure 6. Results of spatial cluster delineation for test data (black lines are cluster boundaries). Columns: AMOEBA, Modified AMOEBA 1, Modified AMOEBA 2. Rows: Pattern 1, Pattern 2, Pattern 3, Pattern 4.
15 1 AMOEBA 9 AMOEBA 1 AMOEBA 2 AMOEBA 1 AMOEBA 2

Figure 7. Results of spatial cluster delineation for the population density pattern of Gangnam-gu, Seoul: (a) AMOEBA, (b) Modified AMOEBA 1, (c) Modified AMOEBA 2.
Gi* 3 55 0 035 70 AMOEBA 2 AMOEBA (dasymetric areal interpolation) 100 100m (Lee and Kim, 2007) Figure 7 Table 2 AMOEBA 25 AMOEBA 1 AMOEBA 2 31 30 AMOEBA AMOEBA 1 AMOEBA 2 AMOEBA AMOEBA AMOEBA AMOEBA AMOEBA AMOEBA AMOEBA AMOEBA 5 AMOEBA AMOEBA AMOEBA AMOEBA 2 AMOEBA 1

Table 2. Statistical summary for results of spatial cluster delineation.

Summary statistics                                              AMOEBA   Modified AMOEBA 1   Modified AMOEBA 2
-------------------------------------------------------------   ------   -----------------   -----------------
Number of spatial clusters                                          25                  31                  30
Average area of spatial clusters (km2)                            0.42                0.20                0.30
Total area of spatial clusters (km2)                             10.39                6.23                9.11
Average population density of spatial clusters (persons/km2)    33,572              41,781              39,906
Standard deviation of population density of spatial clusters    10,126              13,854              12,887
Maximum population density of spatial clusters (persons/km2)    54,672              77,836              77,836
Minimum population density of spatial clusters (persons/km2)    20,900              24,700              24,700
AMOEBA 1 AMOEBA AMOEBA AMOEBA AMOEBA AMOEBA AMOEBA AMOEBA AMOEBA 1 AMOEBA 2 LISA AMOEBA AMOEBA GIS AMOEBA 2 AMOEBA Aldstadt and Getis (2006) Getis-Ord Gi* Moran Ii (Kulldorff, 1997; 2009) Geary ci Lee Si (Lee, 2001; 2004; 2009) Lee Si AMOEBA (artificial neural network) (Moreira et al., 2007)
References

Aldstadt, J. and Getis, A., 2006, Using AMOEBA to create a spatial weights matrix and identify spatial clusters, Geographical Analysis, 38(4), 327-343.
Anselin, L., 1995, Local indicators of spatial association - LISA, Geographical Analysis, 27(2), 93-115.
Anselin, L., 1996, The Moran scatterplot as an ESDA tool to assess local instability in spatial association, in Fischer, M., Scholten, H., and Unwin, D. (eds.), Spatial Analytical Perspectives on GIS, Taylor & Francis, London, 111-125.
Anselin, L., 1998, Exploratory spatial data analysis in a geocomputational environment, in Longley, P. A., Brooks, S. M., McDonnell, R., and MacMillan, B. (eds.), Geocomputation: A Primer, John Wiley & Sons, Chichester, West Sussex, 77-94.
Anselin, L., 2003, GeoDa 0.9 User's Guide, Spatial Analysis Laboratory, Department of Agricultural and Consumer Economics, University of Illinois.
Anselin, L. and Bao, S., 1997, Exploratory spatial data analysis linking SpaceStat and ArcView, in Fischer, M. and Getis, A. (eds.), Recent Developments in Spatial Analysis, Springer-Verlag, Berlin, 35-59.
Balk, D. L., Deichmann, U., Yetman, G., Pozzi, F., Hay, S. I., and Nelson, A., 2006, Determining global population distribution: Methods, applications and data, Advances in Parasitology, 62, 119-157.
Boots, B., 2001, Using local statistics for boundary characterization, GeoJournal, 53(4), 339-345.
Boots, B. and Tiefelsdorf, M., 2000, Global and local spatial autocorrelation in bounded regular tessellations, Journal of Geographical Systems, 2(4), 319-348.
Brunsdon, C., 1998, Exploratory spatial data analysis and local indicators of spatial association with XLISP-STAT, Journal of the Royal Statistical Society Series D: The Statistician, 47(3), 471-484.
Dykes, J., 1998, Cartographic visualization: Exploratory spatial data analysis with local indicators of spatial association using Tcl/Tk and cdv, Journal of the Royal Statistical Society Series D: The Statistician, 47(3), 485-497.
Fortin, M.-J. and Dale, M., 2005, Spatial Analysis: A Guide for Ecologists, Cambridge University Press, Cambridge.
Getis, A. and Ord, J. K., 1992, The analysis of spatial association by use of distance statistics, Geographical Analysis, 24(3), 189-206.
Getis, A. and Ord, J. K., 1996, Local spatial statistics: An overview, in Longley, P. and Batty, M. (eds.), Spatial Analysis: Modelling in a GIS Environment, GeoInformation International, Cambridge, 261-277.
Goodchild, M. F. and Lam, N. S.-N., 1980, Areal interpolation: A variant of the traditional spatial problem, Geoprocessing, 1, 297-312.
Jacquez, G. M., Maruca, S., and Fortin, M.-J., 2000, From fields to objects: A review of geographic boundary analysis, Journal of Geographical Systems, 2(3), 221-241.
Kulldorff, M., 1997, A spatial scan statistic, Communications in Statistics: Theory and Methods, 26(6), 1487-1496.
Kulldorff, M., 2009, SaTScan User Guide (version 8.0), Available at http://www.satscan.org/.
Lawson, A. B. and Kleinman, K., 2005, Spatial and Syndromic Surveillance for Public Health, John Wiley & Sons, Chichester, West Sussex.
Lee, S.-I., 2001, Developing a bivariate spatial association measure: An integration of Pearson's r and Moran's I, Journal of Geographical Systems, 3(4), 369-385.
Lee, S.-I., 2004, A generalized significance testing method for global measures of spatial association: An extension of the Mantel test, Environment and Planning A, 36(9), 1687-1703.
Lee, S.-I., 2009, A generalized randomization approach to local measures of spatial association, Geographical Analysis, 41(2), 221-248.
Lee, S.-I. and Kim, K., 2007, Representing the population density distribution of Seoul using dasymetric mapping techniques in a GIS environment, Journal of the Korean Cartographic Association, 7(2), 53-67 (in Korean).
Lee, S.-I., Shin, J., Kim, H.-M., Hong, I., Kim, K., Chun, Y., Cho, D., Kim, J.-G., and Lee, G. (translation), 2009, Geographic Information Systems and Science, 2nd Edition, Sigmapress, Seoul (translation of Longley, P. A., Goodchild, M., Maguire, D. J., and Rhind, D. W., 2005, Geographic Information Systems and Science, 2nd Edition, John Wiley & Sons, Chichester, West Sussex).
Legendre, P. and Legendre, L., 1998, Numerical Ecology, 2nd English Edition, Elsevier, New York.
Lu, H. and Carlin, B. P., 2005, Bayesian areal wombling for geographical boundary analysis, Geographical Analysis, 37(3), 265-285.
Moreira, G. J. P., Takahashi, R. H. C., and Duczmal, L., 2007, Delineating spatial clusters with artificial neural networks, Advances in Disease Surveillance, 4, 104.
Office of the Deputy Prime Minister, 2002, Producing Boundaries and Statistics for Town Centres: London Pilot Study Summary Report, The Stationery Office, UK.
Ord, J. K. and Getis, A., 1995, Local spatial autocorrelation statistics: Distributional issues and an application, Geographical Analysis, 27(4), 286-306.
Rogerson, P. and Yamada, I., 2009, Statistical Detection and Surveillance of Geographic Clusters, Chapman & Hall/CRC, Boca Raton, FL.
Sohn, H., 2008, Modeling spatial patterns of an overheated speculation area, Journal of the Korean Geographical Society, 43(1), 104-116 (in Korean).
Sohn, H. and Park, K., 2008, A spatial statistical method for exploring hotspots of house price volatility, Journal of the Korean Geographical Society, 43(3), 392-411 (in Korean).
Sutton, P. C., 2003, A scale-adjusted measure of urban sprawl using nighttime satellite imagery, Remote Sensing of Environment, 86, 353-369.
Tango, T., 2010, Statistical Methods for Disease Clustering, Springer, New York.
Thurstain-Goodwin, M. and Unwin, D. J., 2000, Defining and delimiting the central areas of towns for statistical monitoring using continuous surface representations, Transactions in GIS, 4(4), 305-317.
Unwin, A., 1996, Exploratory spatial analysis and local statistics, Computational Statistics, 11, 387-400.
Unwin, A. and Unwin, D. J., 1998, Exploratory spatial data analysis with local statistics, Journal of the Royal Statistical Society Series D: The Statistician, 47(3), 415-421.
Waller, L. A. and Gotway, C. A., 2004, Applied Spatial Statistics for Public Health Data, John Wiley & Sons, Hoboken, NJ.
Womble, W. H., 1951, Differential systematics, Science, 114, 315-322.
Wulder, M. and Boots, B., 1998, Local spatial autocorrelation characteristics of remotely sensed imagery assessed with the Getis statistics, International Journal of Remote Sensing, 19(11), 2223-2231.

Correspondence: Sang-Il Lee, Department of Geography Education, College of Education, Seoul National University, 599 Gwanak-ro, Gwanak-gu, Seoul 151-748, Korea (e-mail: si_lee@snu.ac.kr, phone: +82-2-880-9028)