Journal of Educational Innovation Research 2018, Vol. 28, No. 4, pp. 461-487
DOI: http://dx.doi.org/10.21024/pnuedi.28.4.201812.461

A Study on the Change of Issues with Adolescent Problem by Using Text Mining
- The Internet News Articles for the Years 2008 to 2018 -

Purpose: This study aimed to identify the issues surrounding adolescent problems, and the trends in those issues, as reported in internet news articles.

Method: Data were collected from Daily Chosun (chosun.com), Dong-A Daily (donga.com), and JoongAng Daily (news.joins.com), which were named Korea's top three news media in 2018. A total of 8,110 articles on adolescent problems published from 2008 to March 2018 were analyzed. Python 3.6 was used for data collection, and Latent Dirichlet Allocation (LDA) was used for topic analysis.

Results: First, between 2008 and 2012, the main issues were literature contents, education policy, popular culture contents, and career counselling programs. This was a period when the commercialization of cultural contents was accelerating, which led to the appearance of entertainment topics, and career problems emerged alongside cultural keywords. Second, between 2013 and 2018, the main issues included juvenile sex crime, communication between parents and children, and social contribution activities. In this period, topics were evenly distributed, and social issues and adolescent issues followed the same trend. Adolescent issues were also more diverse than in the earlier period, and each problem was covered in similar proportion.

Conclusion: This research focused on identifying which topics were covered in news articles, and it suggests that changes in the social-cultural environment and in education policies have to be considered when identifying adolescent issues.

Key words: adolescent problem, adolescent issue, text mining, big data, internet news article

Corresponding Author: Kim, Hyun-Sook. Silla University, Dept. of Education, Baegyang-daero 700, Sasang-gu, Busan, Korea, e-mail: khs@silla.ac.kr
..,. (, 2013). 2017 3 (, 2017.03.30.). (, 2017.09.03.),, (, 2017.09.05.). (, 2017.10.08.).,,,. Hall 1904 (, 1992). (, 1990).,, (, 2017)., 4 (, 2018).. 1990 (, 2006;, 1990),,,,, (,,, 1993;, 1992;, 1992;, 1990;, 1990;, 1992;, 1998;, 1998;, 1992)., 21,
.. (, 2006;, 1998). (, 2004;, 1992;, 1998), 21 (,,,,, 2013).,, (, 2000) (,, 2016)., (, 1998),. (1998),,,. (1998), (2005)., 4... 10.,,.,..,
(,,, 2016). (,, 2014). Choi, Lee Sohn(2017) 30 200,000. (2014)., (2017). 2008 2018.,...,?,, 10 (2012-2018)?...,, (http://www.rankey.com/), 2018 3 3,,., (http://www.alexa.com/), 2018 3,,. 3 (www.chosun.com), (www.donga.com), (www.news.joins.com)
.. 2008-2018,,. 2008 1 1 2018 3 31,,, Python 3.6 csv. Python,, HTML < -1>. HTML div > #date_text #news_title_text_id #news_body_id div.title_foot> span.date01 div.article_title>h2 div.article_view> div.article_txt div.article_head> div.clearfx> div.byline>em #article_title #article_body,,,,.,.,, csv. 50. 2008, < -2>, 2018 10 8,110., 1993 5 5 5 10 2008-2012, 2013-2018 3.
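The collection step described above (Python 3.6, per-site HTML selectors, results saved to csv) can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the authors' crawler. It handles only the three id selectors listed above (#date_text, #news_title_text_id, #news_body_id); the class IdTextExtractor, the helpers parse_article and to_csv, the field names, and the sample HTML are all hypothetical.

```python
import csv
import io
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not change the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}

# Hypothetical field names mapped to the id selectors quoted in the text.
FIELDS = {"date": "date_text",
          "title": "news_title_text_id",
          "body": "news_body_id"}


class IdTextExtractor(HTMLParser):
    """Collect the text found inside elements whose id is in wanted_ids."""

    def __init__(self, wanted_ids):
        super().__init__()
        self.wanted_ids = set(wanted_ids)
        self.texts = {elem_id: [] for elem_id in self.wanted_ids}
        self._active = None  # id of the element currently being captured
        self._depth = 0      # nesting depth inside that element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._active is not None:
            self._depth += 1
            return
        elem_id = dict(attrs).get("id")
        if elem_id in self.wanted_ids:
            self._active = elem_id
            self._depth = 1

    def handle_endtag(self, tag):
        if self._active is None or tag in VOID_TAGS:
            return
        self._depth -= 1
        if self._depth == 0:
            self._active = None

    def handle_data(self, data):
        if self._active is not None:
            self.texts[self._active].append(data)


def parse_article(html):
    """Return one row (date, title, body) with whitespace normalized."""
    parser = IdTextExtractor(FIELDS.values())
    parser.feed(html)
    parser.close()
    return {field: " ".join(" ".join(parser.texts[elem_id]).split())
            for field, elem_id in FIELDS.items()}


def to_csv(rows):
    """Serialize parsed rows as csv text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(FIELDS))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up page used only to exercise the sketch.
SAMPLE_HTML = """
<html><body>
<h1 id="news_title_text_id">Sample <b>headline</b></h1>
<p id="date_text">2008.01.01</p>
<div id="news_body_id"><p>First paragraph.</p><br><p>Second paragraph.</p></div>
</body></html>
"""

article = parse_article(SAMPLE_HTML)
print(to_csv([article]))
```

The descendant selectors in the list (for example div.article_title>h2) would need a fuller selector engine than this sketch provides. A real crawler would also need to fetch pages and handle each site's encoding.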
Year      2008  2009  2010  2011  2012   2013   2014  2015  2016  2017  2018  Total
Articles  584   696   891   828   1,010  1,057  727   781   646   686   206   8,110
(%)       7.2   8.6   11.0  10.2  12.5   13.0   9.0   9.6   8.0   8.4   2.5   100

(NIA) R KoNLP NIADic (983,012 ),. 2,000,,,,,,,,,,,,,, 14. (Topic) (Modeling). LSA(Latent Semantic Analysis), pLSA(probabilistic Latent Semantic Analysis), LDA(Latent Dirichlet Allocation), LDA. (,,, 2018;,,, 2018;,,, 2017;, 2017) (,, 2013;, 2015;,,, 2017). LDA. perplexity, 5 25 perplexity perplexity. perplexity 0. < -4> perplexity, 17. LDA R LDA lda.collapsed.gibbs.sampler,.
    lda.collapsed.gibbs.sampler(
      documents      = ldaform$documents,
      K              = 17,              # number of topics
      vocab          = ldaform$vocab,   # vocabulary
      num.iterations = 5000,            # number of Gibbs sampling sweeps (updates)
      burnin         = 1000,            # sweeps discarded as burn-in
      alpha          = 0.01,            # Dirichlet prior on document-topic proportions
      eta            = 0.01)            # Dirichlet prior on topic-word distributions

20, 4. 1 2 3 4 45 48 40 44

Topics  Perplexity  Difference
3       331.99      13.27
4       318.72      14.17
5       304.54      7.96
6       296.59      8.43
7       288.16      6.34
8       281.82      3.85
9       277.98      3.79
10      274.19      4.57
11      269.61      3.00
12      266.62      1.94
13      264.68      1.76
14      262.92      2.33
15      260.59      2.29
16      258.30      0.76
17      257.54      0.24
18      257.30      1.75
19      255.56      1.39
20      254.17      1.60
21      252.57      1.92
22      250.65      0.17
23      250.48      1.20
24      249.27      2.25
25      247.02
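The choice of 17 topics can be illustrated with the perplexity values listed above. The sketch below recomputes the drop in perplexity from each topic count to the next and returns the first count at which the drop falls below a threshold. The threshold of 0.5 and the function names perplexity_drops and first_flat_k are assumptions made for this illustration; the source states only that 17 topics were selected where the perplexity difference approached 0.

```python
# Perplexity by number of topics, copied from the values listed in the text.
PERPLEXITY = {
    3: 331.99, 4: 318.72, 5: 304.54, 6: 296.59, 7: 288.16, 8: 281.82,
    9: 277.98, 10: 274.19, 11: 269.61, 12: 266.62, 13: 264.68, 14: 262.92,
    15: 260.59, 16: 258.30, 17: 257.54, 18: 257.30, 19: 255.56, 20: 254.17,
    21: 252.57, 22: 250.65, 23: 250.48, 24: 249.27, 25: 247.02,
}


def perplexity_drops(perplexity):
    """Map each topic count K to perplexity(K) - perplexity(next K)."""
    ks = sorted(perplexity)
    return {k: round(perplexity[k] - perplexity[k_next], 2)
            for k, k_next in zip(ks, ks[1:])}


def first_flat_k(perplexity, threshold):
    """Return the smallest K whose drop to the next K is below threshold."""
    for k, drop in perplexity_drops(perplexity).items():
        if drop < threshold:
            return k
    return None


# threshold=0.5 is an assumed cut-off, not a value given in the source.
chosen_k = first_flat_k(PERPLEXITY, threshold=0.5)
print(chosen_k)
```

Under this assumed threshold the rule picks the same topic count as the paper. A stricter threshold would select a larger count, because the drop also becomes small at 22 topics.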
1) Frequency analysis

2008 2012 100 < -1>, 30. 1,. 10. 10 (3 ). (7 ), (8 ), (10 ).., (6 ), (9 ),., (16 ), (26 ), (28 ), (30 )., (15 ), (22 ), (24 ),,.

2008-2012
Rank  Frequency    Rank  Frequency    Rank  Frequency    Rank  Frequency
1     6,892        26    1,306        51    924          76    782
2     5,010        27    1,302        52    903          77    778
3     4,027        28    1,298        53    902          78    776
4     2,883        29    1,284        54    899          79    767
5     2,854        30    1,253        55    895          80    762
6     2,470        31    1,217        56    893          81    759
7     2,408        32    1,181        57    892          82    757
8     2,308        33    1,154        58    877          83    757
9     2,112        34    1,115        59    872          84    754
10    2,068        35    1,112        60    861          85    751
11    1,960        36    1,092        61    861          86    744
12    1,919        37    1,090        62    847          87    742
13    1,903        38    1,081        63    844          88    740
14    1,779        39    1,071        64    843          89    738
15    1,711        40    1,015        65    841          90    736
16    1,672        41    1,010        66    832          91    736
17    1,594        42    1,009        67    822          92    735
18    1,574        43    1,000        68    818          93    734
19    1,453        44    988          69    816          94    732
20    1,430        45    949          70    810          95    731
21    1,415        46    949          71    808          96    729
22    1,392        47    947          72    805          97    729
23    1,363        48    926          73    804          98    726
24    1,338        49    926          74    792          99    722
25    1,329        50    926          75    792          100   719

2) Topic analysis

2008 2012 < -2>.. 2008 2012 5. 17 100. 1 2008. (, 2015). 2 17 (2007. 12. 19) 2008. 2012 18 10, (, 2013).. 3, (,, 2016)., (, 2002). 4 5 2012. 4 (2008-2012) 5., (, 2012).
5 2000, 2011 4. 6,. (2012) 2007 8% 4-5 20%,, (2012), 15-24 70%, 8.8% 1. (2011) 10 1. (2011),,. 7, 2002 (Youth). (2010),, 2 16, 2,, (, 2009) (, 2012). 8,,,,,. 2000 (, 2011). 2000, 2007, 2010. (, 2008),. 9 1990, 2004, 2010 2 5, 2012 2013 (,,, 2010). 2011 2012 2.
10,,,,,,,.. (,,, 2018). 2004 765 2013 1,735 68.8% (, 2016). 11,,,, 12,,,,,,,.,, (,,,,, 2013). 13,,,,,,.. 2008,, (,, 2009).. 14 15 N (,,,,, 2013).,,,,,,,,,, 4 (, 2014)., 2011 PC (,, 2014).,,,,,,,,,. 2007, 2000 (,, 2015),
,,. 16,,,,,,. 2000, (,, 2016), (,,, 2012),. 17,,,,. 1997 1998 12 3,823 (, 2013), 2000 (, 2014). 2009 (, 2014),.

Topic  Proportion
1      0.145
2      0.126
3      0.124
4      0.122
5      0.118
6      0.068
7      0.051
8      0.040
9      0.040
10     0.038
11     0.035
12     0.024
13     0.023
14     0.016
15     0.011
16     0.010
17     0.008

3) Topic trends by year

2008 2012 < -1>..,, 2008 2009. 2007 12 2008. 2012. 2012. 2009 2010.,,,,,,,., 2009..
1) Frequency analysis

2013 2018 100 < -3>, 30. 10 (3 ). (4 ), (6 ), (7 ), (8 ), (10 ),,. (9 ) 5. (20) 30,. (28), (29), 2016 2017. 2016, (,, 2017).
2013-2018
Rank  Frequency    Rank  Frequency    Rank  Frequency    Rank  Frequency
1     8,095        26    1,510        51    1,071        76    948
2     4,325        27    1,475        52    1,068        77    935
3     3,800        28    1,459        53    1,065        78    934
4     3,447        29    1,443        54    1,056        79    934
5     3,348        30    1,396        55    1,049        80    930
6     3,155        31    1,383        56    1,040        81    924
7     2,470        32    1,379        57    1,038        82    919
8     2,420        33    1,355        58    1,037        83    919
9     2,381        34    1,341        59    1,034        84    918
10    2,376        35    1,314        60    1,025        85    917
11    2,348        36    1,311        61    1,017        86    911
12    2,331        37    1,279        62    1,013        87    910
13    2,207        38    1,278        63    1,011        88    906
14    2,178        39    1,226        64    1,004        89    902
15    2,098        40    1,224        65    1,001        90    899
16    2,018        41    1,206        66    995          91    897
17    1,990        42    1,198        67    991          92    887
18    1,976        43    1,184        68    987          93    884
19    1,922        44    1,109        69    986          94    882
20    1,884        45    1,102        70    986          95    881
21    1,850        46    1,102        71    983          96    881
22    1,796        47    1,094        72    954          97    880
23    1,661        48    1,080        73    952          98    878
24    1,593        49    1,076        74    952          99    871
25    1,565        50    1,075        75    950          100   857

2) Topic analysis

2013 2018 < -4>. 2013 2018 5. 17 5 100. 1,,,,,,,. 2016, 10 3, 29%,, (,,, 2018). 2,,,,,,,,,,. (,, 2017), OECD (,,, 2016)
(2015). 3, 4, 5.,,,,. 1980, (,, 2018). 4,,,,,,,,,, 2016-2017. (, 2016). 5. (,,,, 2015). 6 7. (,, 2011). 90, (, 2016).,, (,, 2016). (,, 2016;, 2016). 8,,,. 3 (2015-2017) (, 2018.07.16.). (2018) 1 10, 4 1
. 9,,,,,,, 2018 2. 2012, 2013, 2015, 2013 10, (, 2017). 10,,,,,,,,,,. 2015 2016. 2016 2001 2015 5. 11,,,,,,,,,,.,, (, 2014).. 12,,,,,,,,,, 2017,,. (, 2014). 13,,,,,,,,, 2014,. 14,,,,,. 2017 (, 2018).
15 96.4% (,, 2014).. (2014) -TV- -,. 16,,,,,,,., 2016 7, (, 2018).,,. 17,,,,,,,,,,. 13, 13 19 (, 2013).,,, (,,,, 2017).

3) Topic trends by year

2013 2018 < -2> 2013-2018 2008-2012. 2016 2017, 2015 2017 2018. 2018. 2018,.,. 2015 2016. 2015.
.,, 2008-2018 3 10 8,110. 2008-2012 2013-2018. 2008-2012,,,,,,,,,,,,,,,,. 2008-2012,,,, (, 2018). 2008 2009
. 2007 12.,,,,. 2008 2012,.,. 1980 (,,,,, 2013),.,. 4 (2008-2012) (, 2014). 2013-2018,,,,,,,,,,,,,,,,. 2013-2018, 5. 2013-2018 2015, 2018. 1990 1997 (, 2006), (,,, 2018). 2008, 2018,,,., 5, 2013-2017 5. 5,,,., 5,
. 2017 2018, 2016. 2017 2018. (, 2006)., 5.,. (2014) 57.1%, SNS(36.3%), (13.5%), (12.8%), (11.9%), (9.1%)..,,. 1993... 1990 (, 2006). 10,,,,.,., (, 1998;, 1998). (,,, 1993;, 1992;, 1992;, 1990;, 1990;, 1992;, 1998;, 1998;, 1992),,, (,,, 2004). (2009),
. 10.,,,,, (,,, ) (, 2000)..,. (, 2005).., 2008-2018 30. (, 2005). (, 2005),.,..,.,,,,,.,, (2018).. (1), 431-451.,, (1993).. (1), 12-27. (2011)..
,, (2012).,,. (2013).. (2017.09.03). SNS http://m.news.naver.com 2018.05.06, (2016). :. (2013). 18. (3), 153-180.,,,, (2011). : (2000-2009). (3), 521-542. (2015). 2008. (3), 43-64. (2005).. 109-131. (2012).. (2), 27-43. (1992). :. 57-71., (2016).. (1), 37-48. (2018).,?: ( ). (2), 45-86., (2017).. (6), 311-318. (2013).. (3), 145-177. (1992). :. 17-35., (2011). :. (1), 145-172. (2017).. (1), 31-53. (2014).. (1), 40-57.,, (2018).. 38-69. (2011). : 2000. (4), 301-327., (2009)..
(2), 109-133. (2013)...,, (2018).. (2), 139-161. (2006). : 1990. (1), 1-33., (2013).. (1), 7-32.,, (2017).. (4), 683-711.,, (2010).. (4), 27-34. (1998). :. (2), 131-144. (2016). (2008)., (2014).. (2016). SNS?:,. (7), 154-167.,,, (2015). :. (1998). IMF. (3), 115-145. (2014)...,, (2016).. (2), 165-185. (2012). 5. 3-21. (2011). (2017.03.30.). 8, www.yonhapnews.co.kr 2018.03.30.,, (2016).., (2016). :.
, (2014).. (2015).. (1), 153-169. (2017).. (6(A)), 73-86.,,, (2017).. (2014).. (1), 125-149. (1990).. (2) 33-48. (2017).. (2), 53-72. (2004).. 1-10. (1990). 90. (3) 85-96., (2018). :,,. (1), 81-92. (2018).. (4), 363-371., (2014).. (2), 149-175. (2006).. (2), 5-35. (1990).. (1), 34-60. (1992). :. 37-55. (1998).. 31-48. (2018). 4. 143-150. (2016). :. (4), 163-175.,, (2017).. (4), 213-224. (2017.10.08).... https://news.joins.com/article/21993283 2018.10.26, (2015). :. (2), 73-97.
,, (2004).? 11-28. (1990).. (1998).. (2), 63-80. (2002). 1970.. (2009).. 139-153. (1998).. (2), 17-32. (2010).., (2012).,., (2014).,., (2017).,., (2018).,. (2018).. (2018.07.16)....3 18 5 http://www.sporbiz.co.kr 2018.10.26. (2009).. (2000).. (2012).. (2014).. (2015)..,,,, (2013). :. (1992). :. 1-16. Choi, H. S., Lee, W. S., & Sohn, S. Y. (2017). Analyzing research trends in personal information privacy using topic modeling. Computers & Security, 67, 244-253. : 2018.10.31. / : 2018.11.12. / : 2018.12.20.