SVM
SVM 200112
200112
1 1 2 3 2.1 3 2.2 Labeled Unlabeled 8 2.2.1 8 2.2.2 9 2.2.3 Unlabeled 10 2.2.4 11 3 Unlabeled 12 3.1 Support Vector Machne 12 3.1.1 SVM 12 3.1.2 15 3.1.3 SVM 17 3.2 18 3.2.1 Unlabeled 19 3.2.2 Unlabeled 21 4 23 4.1 23 4.2 24 4.3 25
5 31 33 39
1. 4 2. Unlabeled 10 3. 2 SVM 12 4. Unlabeled 16 5. Unlabeled SVM 19 21 22 25 26 26 27 27 28 28 29 29 30 32
1. 20 2. 31
.,. 1960. 1980,.. k-,, Support Vector Machne,, Nave Bayes...,, unlabeled. unlabeled, unlabeled. labeled unlabeled. Support Vector Machne,, unlabeled,,
Reuter. Labeled, unlabeled., SVM, SVM. :, Support Vector Machne,,
ï
j w = j j j df N tf w log j tf j df d K w w w,...,, 2 1
2 χ
R D CSV f c : j d c j f c d CSV j f c d CSV τ j f c d CSV τ j d
ï
w ρ x ρ b = 0 x ρ w ρ b,, ξ ρ V w ρ b N ρ ρ 1 ρ ρ V w, b, ξ w w + C ξ 2 ξ ρ ρ ρ N 1 [ w x + b] ξ = : y 1 = 1 N = 1 : ξ > 0 W a ρ N N N ρ 1 ρ ρ W α = α + y yαα j x x j 2 = 1 = 1 j= 1 α ρ α α ρ N y α 0 1 : 0 α C = =
χ m c 1 } { = t log log k k k k k k t p c P c t P c t P t P c P c t P c t P t IG + =
log log, B A C A N A c P t P c t P c t MI + + = = c χ χ 2 2 D C B A D B C A CB AD N c t + + + + = χ χ χ χ χ χ d c t P d c t P c t RS k k k + + = log t k c P
1 } 1 k k k k k c t P c t P c t P c t P c t OR = 2 2 Y X XY Joachms [Hers94]
θ 1 var θ I T θ 2 ; ln = θ θ θ θ x f E I
w ρ x ρ b = 0 x ρ x ρ w ρ x ρ b = 0 w ρ x ρ b = 0 dst x ρ x ρ dst x ρ
Gven labeled example set L={x 1, y 1..., x l,y l } and unlabeled example set U={x 1,..., x u } and test example set T={s 1, y 1,..., s s,y s }; Intalze SVM classfer f 0 by tranng wth L 0 =L, U 0 =U Do 1. Set margn t as margn of f t 2. Set = margn t * 2 3. Set dstx = dstance between 4. Set U add = {x 1,y 1,...,x t,y t xu t, y=f t x, dstx } 5. Set L +1 = 6. Set U t+1 =U t -U add 7. Tran f t+1 wth L t+1 8.Classfy test set T wth f t+1 9.Set PR t as classfcaton precson of T 10. Set t = t+1 Whle U add >0 & PR t >PR t-1 Store the fnal SVM classfer
c C A A + B A A + Re Pr Re Pr 1 2 2 + + = β β β F β β β F β β β
χ
χ
[ 93],,, 1993. [ 00], FAQ,, 2000. [Angl87] D. Anglun, Learnng regular sets from queres and counterexamples, Informaton and Computaton, 75, pp. 87-106, 1993. [Angl93] D. Anglun, L. Hellersten and M. Karpnsk, Learnng read-once formulas wth queres, Journal of the ACM, 40, pp. 185-210, 1993. [Balu99] S. Baluja, Probablstc modelng for face orentaton dscrmnaton: Learnng from labeled and unlabeled examples, Advances n Neural Informaton Processng Systems 11, pp. 854-860, 1999. [Blum97] A. L. Blum and P. Langley, Selecton of Relevant Features and Examples n Machne Learnng, Artfcal Intellgence, Vol. 97, pp. 245-271, 1997. [Cast96] V. Castell and T. Cover, The relatve value of labeled and unlabeled samples n pattern recognton wth an unknown mxng parameter, IEEE Transactons on Informaton Theory, 426, pp. 2101-2117, 1996. [Cohn94] D. Cohn, L. Atlas and R. Landner, Improvng generalzaton wth actve learnng, Machne Learnng, 152, pp. 201-221, 1994. [Cohn96] D. Cohn, Z. Ghahraman and M. Jordan, Actve learnng wth statstcal models, Journal of Artfcal Intellgence Research, 4, pp. 29-145, 1996. [Crav98] M. Craven, D. Fretag, A. McCallum, T. Mtchell, K. Ngam and C. Quek, Learnng to extract symbolc knowledge from the World Wde Web, Techncal Report, School of Computer Scence, CMU, 1998. [Crav00] M. Craven, D. DPasquo, D. Fretag, A. McCallum, T. Mtchell, K. Ngam, and S. Slattery, Learnng to construct knowledge bases from the World Wde Web, Artfcal Intellgence, 1181-2, pp. 69-113, 2000.
[Day69] N. Day, Estmatng the components of a mxture of normal dstrbutons, Bometrka, 563, pp. 463-474, 1969. [Demp77] A. Dempster, N. Lard and D. Rubn, Maxmum lkelhood from ncomplete data va the EM algorthm, Journal of the Royal Statstcal Socety, Seres B, 391, pp. 1-38, 1977 [Dyer89] M. Dyer, A. Freze and R. Kannan, A random polynomal tme algorthm for approxmatng the volume of convex bodes, Proceedngs of the Annual ACM Symposum on the Theory of Computng, pp. 375-381, 1989. [Ghah94] Z. Ghahraman and M. Jordan, Supervsed learnng from ncomplete data va an EM approach. Advances n Neural Informaton Processng Systems 6, pp.120-127, 1994. [Gane89] S. Ganesalngam, Classfcaton and mxture approaches to clusterng va maxmum lkelhood, Appled Statstcs, 383, pp. 455-466, 1989. [Gane78] S. Ganesalngam and G. McLachlan, "The effcency of a lnear dscrmnant functon based on unclassfed ntal samples," Bometrka, 65, pp. 658-662, 1978 [Gros91] K. P. Gross, Concept acquston through attrbute evoluton and experment selecton, Doctoral dssertaton, School of Computer Scence, Carnege Mellon Unversty, Pttsburgh, PA., 1991. [Gud97] V. N. Gudvada, et.al, "Informaton Retreval on the World Wde Web," IEEE Internet Computng, Vol. 1, no. 5, September/October, 1997. [Hart98] H. Hardley and J. Rao, Classfcaton and estmaton n analyss of varance problems, Revew of Internatonal Statstcal Insttute, 36, pp. 141-147, 1968. [Hers94] W. R. Hersh, C. Buckley, T. J. Leone and D. H. Hckam, OHSUMED: An nteractve retreval evaluaton and new large test collecton for research, In Proceedngs of the 17th Annual ACM SIGIR Conference, pp. 192-201, 1994 [Jaak00] T. Jaakkola, M. Mela and T. Jebara, Maxmum entropy dscrmnaton, Advances n Neural Informaton Processng Systems 12, pp. 470-476, 2000.
[Joac97] T. Joachms, Text categorzaton wth support vector machnes: Learnng wth many relevant features, Techncal Report 23, Unverstät Dortmund, LS VIII, 1997. [Joac98] T.Joachms, Text Categorzaton wth Support Vector Machnes: Learnng wth Many Relevant Features, In Machne Learnng: ECML-98, Tenth European Conference on Machne Learnng, pp. 137-142, 1998 [Joac00] T. Joachms, Estmatng the Generalzaton Performance of an SVM Effcently, Proceedngs of the 17th Internatonal Conference on Machne Learnng ICML 2000, pp. 431-438, 2000. [Knob77] B. Knobe and K. Knobe, A method for nferrng context-free grammars, Informaton and Control, 31, pp.129-146, 1977. [Kwok98] J. T.-Y. Kwok, Automated text categorzaton usng support vector machne, In Proceedngs of the Internatonal Conference on Neural Informaton Processng, Ktakyushu, Japan, Oct, pp. 347-351, 1998. [Lang95] K. Lang, Newsweeder: Learnng to flter netnews, In Internatonal Conference on Machne Learnng ICML, pp. 331-339, 1995. [Lew94] D. Lews and Gale, A sequental algorthm for tranng text classfers, Proceedngs of ACM SIGIR Conference, 1994. [Lew97] D. Lews and K. Knowles, Threadng electronc mal: A prelmnary study, Informaton Processng and Management 332, pp. 209-207, 1997. [Ltt77] R. Lttle, Dscusson on the paper by Professor Dempster, Professor Lard and Dr. Rubn, Journal of the Royal Statstcal Socety, Seres B, 391, pp. 25, 1977 [Lova92] L. Lovasz and M. Smonovts, On the randomzed complexty of volume and dameter, Proceedngs of the IEEE Symposum on Foundatons of Computer Scence, IEEE, pp. 482-492, 1992. [Mcla75] G. McLachlan, Iteratve reclassfcaton procedure for constructng an asymptotcally optmal rule of allocaton n dscrmnant analyss, Jounal of the
Amercal Statstcal Assocaton, 79350, pp. 365-369, 1975. [Mcla82] G. McLachlan, Updatng a dscrmnant functon on the bass of unclassfed data, Communcatons n Statstcs: Smulaton and Computaton, 116, pp. 753-767, 1982. [Mere00] D. Meretaks, D. Fragouds, H. Lu and S. Lkothanasss, Scalable Assocatonbased Text Classfcaton, Proceedngs of CIKM-00, 9th ACM Internatonal Conference on Informaton and Knowledge Management,pp. 5-11, Washngton, US,2000. [Mll96] D. Mller and H. Uyar, A generalzed Gaussan mxture classfer wth learnng based on both labelled and unlabelled data, Proceedngs of the 1996 Conference on Informaton Scence and Systems. 1996. [Moon00] R. Mooney and L. Roy, Context-based book recommendng usng learnng for text categorzaton, In Proceedngs of the Ffth ACM Conference on Dgtal Lbrares, pp.195-204, 2000. [Murr78] G. Murray and D. Ttterngton, Estmaton problems wth data from a mxture, Appled Statstcs, 273, pp. 325-334, 1978. [Nga98] K. Ngam, A. McCallum, S. Thrun and T. Mtchell, Learnng to Classfy Text from Labeled and Unlabeled Documents, In Proceedngs of AAAI-98, 15th Conference of the Amercan Assocaton for Artfcal Intellgence, pp. 792-799, 1998. [One78] T. O'Nell, Normal dscrmnaton wth unclassfed observatons, Journal of the Amercan Statstcal Assocaton, 73364, pp. 821-826, 1978. [Pazz96] M. Pazzan and D. Bllsus, Syskll & Webert: Identfyng nterestng Web stes, In Proceedngs of the Thrteenth Natonal Conference on Artfcal Intellgence,pp. 54-61, Portland, 1996. [Rats95] J. Ratsaby and S. Venkatesh, Learnng from a mxture of labeled and unlabeled examples wth parametrc sde nformaton, Proceedngs of the Eghth Annual Conference on Computatonal Learnng Theory, pp. 412-417, 1995.
[Rve93] R. L. Rvest, and R. E. Schapre, Inference of fnte automata usng homng sequences, Informaton and Computaton, 103, pp. 299-347, 1993. [Samm86] C. Sammut and R.B. Banerj, Learnng concepts by askng questons, In R.S. Mchalsk, J.G. Carbonell, & T.M. Mtchell Eds., Machne learnng: An artfcal ntellgence approach Vol.2. San Francsco, CA: Morgan Kaufmann, 1986. [Seba99] F. Sebastan, "Machne Learnng n Automated Text Categorsaton," Techncal Report IEI-B4-31-1999, Isttuto d Elaborazone dell'informazone, Consglo Nazonale delle Rcerche, Psa, IT, 1999. [Seun92] H. Seung, M. Opper and H. Sompolnsky, Query by Commttee, Proceedngs of the Ffth Annual Workshop on Computatonal Learnng Theory, pp. 287-294, New York, ACM Press, 1992. [Shah94] B. Shahshahan and D. Landgrebe, The effect of unlabeled samples n reducng the small sze problem and mtgatng the Huges phenomenon, IEEE Transactons on Geoscence and Remote Sensng, 325, pp. 1087-1095, 1994. [Shav98] J. Shavlk and T. Elass-Rad, Intellgent agents for web-based tasks: An advce-takng approach, Learnng for Text Categorzaton: Papers from the AAAI Workshop, pp. 63-70. Tech. rep. WS-98-05, AAAI Press. 1998. [Snc89] A. Snclar and M. Jerrum, Approxmate countng, unform generaton and rapdly mxng Markov chans, Informaton and Computaton, 82, pp. 93-133, 1998. [Szum01] M. Szummer and T. Jaakkola, Kernel expansons wth unlabeled data, Advances n Neural Informaton Processng Systems 13, 2001. [Ttt76] D. Ttterngton, Updatng a dagnostc system usng unconfrmed cases, Appled Statstcs, 253, pp. 238-247, 1976. [Vapn95] V. Vanpk, The Nature of Statstcal Learnng Theory, Sprnger-Verlag, 1995 [Yang99a] Y. Yang, et al., "Learnng Approaches for Detectng and Trackng News Events," IEEE Intellgent System, pp. 32-43, July/August 1999.
[Yang99b] Y. Yang and X. Lu, "A Re-examnaton of Text Categorzaton Methods," Proceedngs of the 22h Annual Internatonal ACM SIGIR Conference on Research and Development n Informaton Retreval SIGIR 99, pp. 42-49, 1999. [Yang99c] Y. Yang, "An Evaluaton of Statstcal Approaches to Text Categorzaton," Journal of Informaton Retreval, vol 1, no. 1/2, pp. 67-88, 1999. [Zhan00] T. Zhang and F. Oles, A probablty analyss on the value of unlabeled data for clasfcaton problems, Proceedngs of the Seventeenth Internatonal Conference on Machne Learnng, pp. 1191-1198, 2000.
ABSTRACT Incremental Supervsed Learnng based on SVM wth Unlabeled Documents Soo-Young Km Dept. of Computer Scence The Graduate School Yonse Unversty
Keywords : Unlabeled data, Support Vector Machne, Incremental Supervsed Learnng, Text Categorzaton