Optimization of Support Vector Machines for Financial Forecasting

Kyoung-jae Kim, Hyunchul Ahn

Journal of Intelligence and Information Systems, Vol. 17, No. 4, December 2011 (pp. 241-254)

* This work was supported by a 2009 grant (NRF-2009-327-B00212).

Abstract

Kyoung-jae Kim*, Hyunchul Ahn**

Financial time-series forecasting is an important research issue because it is essential to the risk management of financial institutions. Researchers have therefore tried to forecast financial time series with various data mining techniques such as regression, artificial neural networks, decision trees, and k-nearest neighbor. Recently, support vector machines (SVMs) have been widely applied in this area because they do not require large amounts of training data and have a low risk of overfitting. However, a user must determine several design factors heuristically in order to use an SVM. For example, the choice of an appropriate kernel function and its parameters and the selection of a proper feature subset are major design factors. Beyond these, selecting a proper instance subset may also improve the forecasting performance of SVMs by eliminating irrelevant and distorting training instances. Nonetheless, few studies have applied instance selection to SVMs, especially in the domain of stock market prediction. Instance selection tries to choose a proper subset of instances from the original training data. It may be regarded as a method of knowledge refinement that maintains the instance base. This study proposes a novel instance selection algorithm for SVMs. The proposed technique uses a genetic algorithm (GA) to optimize instance selection and the kernel parameters simultaneously. We call the model ISVM (SVM with Instance selection). Experiments on stock market data are carried out using ISVM. The GA searches for optimal or near-optimal values of the kernel parameters and the relevant instances for SVMs. The GA chromosome therefore needs two sets of codes: one for the kernel parameters and one for instance selection.
For the controlling parameters of the GA search, the population size is set to 50 organisms, the crossover rate to 0.7, and the mutation rate to 0.1. As the stopping condition, 50 generations are permitted. The application data consist of technical indicators and the direction of change in the daily Korea stock price index (KOSPI). The total number of samples is 2,218 trading days. We separate the whole data set into three subsets: training, test, and hold-out. The numbers of samples in the subsets are 1,056, 581, and 581, respectively.

* Department of Management Information Systems, Dongguk University_Seoul
** School of Management Information Systems, Kookmin University
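The chromosome layout and GA settings described in the abstract can be illustrated with a minimal Python sketch. Only the population size (50), crossover rate (0.7), mutation rate (0.1), and generation limit (50) come from the abstract. The binary encoding, the two 8-bit kernel parameters, the log-scale decoding, elitism, binary tournament selection, single-point crossover, and the per-chromosome reading of the mutation rate are all assumptions made for illustration, and `toy_fitness` is a hypothetical stand-in rather than an SVM. A realistic fitness function would presumably train an SVM on the selected instances with the decoded kernel parameters and return its accuracy on validation data.

```python
import random

# Settings stated in the abstract.
POP_SIZE = 50
CROSSOVER_RATE = 0.7
MUTATION_RATE = 0.1
GENERATIONS = 50

# Assumptions made for this sketch only.
N_PARAMS = 2     # e.g. a penalty term and a kernel width
PARAM_BITS = 8   # bits used to encode each kernel parameter


def decode(chromosome, n_instances):
    """Split a bit list into kernel parameters and an instance mask."""
    params = []
    for i in range(N_PARAMS):
        bits = chromosome[i * PARAM_BITS:(i + 1) * PARAM_BITS]
        value = int("".join(str(b) for b in bits), 2)
        # Assumed mapping: log scale over [2**-5, 2**5].
        params.append(2.0 ** (-5 + 10 * value / (2 ** PARAM_BITS - 1)))
    mask = chromosome[N_PARAMS * PARAM_BITS:]
    if len(mask) != n_instances:
        raise ValueError("chromosome length does not match n_instances")
    return params, mask


def pick(pop, scores, rng):
    """Binary tournament selection (assumption)."""
    a, b = rng.randrange(len(pop)), rng.randrange(len(pop))
    return pop[a] if scores[a] >= scores[b] else pop[b]


def run_ga(fitness, n_instances, seed=0):
    """Evolve bit-string chromosomes and return the fittest one found."""
    rng = random.Random(seed)
    length = N_PARAMS * PARAM_BITS + n_instances
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        scores = [fitness(c) for c in pop]
        elite = pop[scores.index(max(scores))]
        next_pop = [elite[:]]  # elitism: carry over the best chromosome (assumption)
        while len(next_pop) < POP_SIZE:
            child = pick(pop, scores, rng)[:]
            if rng.random() < CROSSOVER_RATE:
                cut = rng.randrange(1, length)  # single-point crossover
                child = child[:cut] + pick(pop, scores, rng)[cut:]
            if rng.random() < MUTATION_RATE:
                child[rng.randrange(length)] ^= 1  # flip one random bit
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)


def toy_fitness(chromosome):
    """Hypothetical stand-in: rewards small instance subsets. Not an SVM."""
    return -sum(chromosome[N_PARAMS * PARAM_BITS:])


best = run_ga(toy_fitness, n_instances=20)
params, mask = decode(best, 20)
print(params, sum(mask), "of", len(mask), "instances kept")
```

With 1,056 training instances the instance-selection part of the chromosome would hold 1,056 bits under this encoding, so each fitness evaluation trains on a different subset of the training data.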

This study compares ISVM with several benchmark models: logistic regression (Logit), backpropagation neural networks (ANN), nearest neighbor (1-NN), conventional SVM (SVM), and SVM with optimized parameters (PSVM). In particular, PSVM uses kernel parameters optimized by the genetic algorithm. The experimental results show that ISVM outperforms 1-NN by 15.32%, ANN by 6.89%, Logit and SVM by 5.34%, and PSVM by 4.82% on the hold-out data. ISVM uses only 556 of the 1,056 original training instances to produce this result. In addition, the two-sample test for proportions is used to examine whether ISVM significantly outperforms the other models. The results indicate that ISVM outperforms ANN and 1-NN at the 1% statistical significance level and performs better than Logit, SVM, and PSVM at the 5% statistical significance level.

Key Words : Instance Selection, Support Vector Machines, Hybrid Model, Financial Forecasting, Data Mining
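The two-sample test for proportions mentioned in the abstract can be sketched as follows. This is a generic pooled z-test with a two-sided p-value. The abstract does not say whether the authors used a pooled standard error or a one-sided alternative, so both choices are assumptions. The hit counts in the usage line are hypothetical and are not results from the paper; only the hold-out size of 581 is taken from the abstract.

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Return (z, two-sided p-value) for H0: both proportions are equal."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)  # pooled estimate under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Standard normal CDF evaluated at |z| via the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return z, 2 * (1 - cdf)


# Hypothetical hit counts on a 581-sample hold-out set (not from the paper).
z, p = two_proportion_z_test(350, 581, 320, 581)
print(f"z = {z:.3f}, p = {p:.4f}")
```

The function assumes the pooled proportion is strictly between 0 and 1; otherwise the standard error is zero and the statistic is undefined.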

About the Authors

Kyoung-jae Kim, Hyunchul Ahn

Institution mentioned: KAIST.

Journals listed: Annals of Operations Research, Applied Intelligence, Applied Soft Computing, Asia Pacific Journal of Information Systems, Computers and Operations Research, Computers in Human Behavior, Expert Systems, Expert Systems with Applications, Intelligent Data Analysis, International Journal of Electronic Commerce, Intelligent Systems in Accounting, Finance and Management, Neural Computing and Applications, Neurocomputing.