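Understanding Deep Learning and Its Media Applications. Hyung Il Koo (구형일), Ajou University
Artificial intelligence / machine learning / deep learning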
Four perspectives on AI
            Humanly             Rationally
Thinking    Thinking humanly    Thinking rationally
Acting      Acting humanly      Acting rationally
Acting humanly: machines that work and act like people. "The art of creating machines that perform functions that require intelligence when performed by people." (Kurzweil, 1990) "The study of how to make computers do things at which, at the moment, people are better." (Rich and Knight, 1991)
Artificial intelligence?
Artificial Flavor?
AI history 2016
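Turing test: proposed by Turing in 1950. If an interrogator cannot tell a person from a machine (robot) based on the answers to the questions asked, let us say that the machine has intelligence. (B.J. Copeland, 2000)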
Deep Blue: Deep Blue vs Garry Kasparov, 1997. Final score: Deep Blue 3½, Kasparov 2½. Brute-force search power: looks 6 to 8 moves ahead.
IBM Watson (supercomputer). Example clues: "Kathleen Kenyon's excavation of this city mentioned in Joshua showed the wall had been repaired 17 times." Answer: "What is Jericho?" "This child star got his first onscreen kiss in My Girl." Answer: "Who is Macaulay Culkin?"
AlphaGo
Machine learning, with a focus on artificial neural networks
Let's build an AI gesture recognition system!
Hand gesture recognition: Gesture A, Gesture B, Gesture C
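Approach 1: design the logic from my own experience and intuition. (1) Find the end points (points of high curvature). (2) Find the center of the hand. (3) Draw a circle around the center of the hand. (4) Perform a bitwise operation with the circle. https://gogul09.github.io/software/hand-gesture-recognition-p2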
Approach 2: machine learning. Training data are fed to a machine learning algorithm, which produces a model (the training, or learning, process); the model is then used in the test process (inference).
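The training and test processes can be illustrated with a very small sketch. The slides do not name a specific algorithm; a nearest-centroid classifier is swapped in here only because it is short. The 2-D feature vectors and the labels "A" and "B" are hypothetical stand-ins for features extracted from gesture images.

```python
# Minimal sketch of "training data -> learning algorithm -> model -> inference".
# Assumption: nearest-centroid classification, chosen for brevity only.

def train(samples, labels):
    """Training process: compute one mean vector (centroid) per class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    # The "model" is simply the dictionary of class centroids.
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict(model, x):
    """Test process (inference): return the class with the closest centroid."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda y: squared_distance(model[y]))

# Hypothetical training data: two features per example, two gesture classes.
train_x = [[1.0, 1.0], [1.2, 0.8], [4.0, 4.2], [3.8, 4.0]]
train_y = ["A", "A", "B", "B"]

model = train(train_x, train_y)
```

Calling `predict(model, [1.1, 1.0])` then runs inference on an unseen example. The point of the sketch is only that the decision rule comes from the data rather than from hand-written logic as in Approach 1.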
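Black-box approach: Images, video, and audio can all be represented as vectors (arrays of numbers). Outputs can also be represented as vectors (cat: [1,0,0], dog: [0,1,0], ...). The black box takes in the observed numbers and puts out the desired numbers. The black box can be implemented in many ways, but neural networks are currently preferred. The parameters of the input-output relation are determined from examples (the training process).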
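As a rough illustration of "everything is a vector", the sketch below flattens a tiny image into an input vector and encodes a class label as a one-hot output vector. The class list is an assumption: the slide names only cat and dog, so the third entry, "other", is a hypothetical placeholder, and both helper names are invented for this example.

```python
# Minimal sketch: inputs and outputs of the black box as plain lists of numbers.
# Assumption: "other" is a made-up third class; the slide lists only cat and dog.
CLASSES = ["cat", "dog", "other"]

def one_hot(label, classes=CLASSES):
    """Encode a class label as a vector with a single 1 at the label's index."""
    vector = [0] * len(classes)
    vector[classes.index(label)] = 1
    return vector

def flatten_image(rows):
    """Turn a 2-D grid of pixel values into one flat input vector."""
    return [float(pixel) for row in rows for pixel in row]

# A hypothetical 2x2 grayscale image and its label.
x = flatten_image([[0, 255], [128, 64]])
y = one_hot("cat")
```

Here `x` is the observed vector that goes into the black box and `y` is the desired vector that should come out. Training adjusts the box's parameters so that such pairs match.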
Neuron: the basic unit of a neural network
Artificial neuron: a real neuron and the mathematical model of a neuron
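Example: telling salmon from sea bass. Features: lightness (l) and width (w). Decision boundary: 7.3l + 3.4w = 100, which splits the plane into the regions 7.3l + 3.4w > 100 and 7.3l + 3.4w < 100, one for each class (sea bass, salmon).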
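Example: telling salmon from sea bass, as a neuron. [Figure: the inputs l and w are multiplied by the weights 7.3 and 3.4, summed (Σ), and compared with the threshold 100 to output salmon or sea bass.]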
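The salmon / sea bass example can be written as a single artificial neuron. The sketch below uses the weights 7.3 and 3.4 and the threshold 100 from the slide. Which class lies on which side of the boundary is an assumption (sea bass above the threshold, salmon below), and the function names are made up for illustration.

```python
# Minimal sketch of one artificial neuron: weighted sum followed by a threshold.
# Weights and threshold come from the slide; the class-to-side mapping is assumed.

def weighted_sum(lightness, width, w_lightness=7.3, w_width=3.4):
    """Compute 7.3*l + 3.4*w, the quantity compared against the threshold."""
    return w_lightness * lightness + w_width * width

def neuron_fires(lightness, width, threshold=100.0):
    """Step activation: True when the weighted sum reaches the threshold."""
    return weighted_sum(lightness, width) >= threshold

def classify(lightness, width, above="sea bass", below="salmon"):
    """Map the neuron's binary output to a class name (mapping is an assumption)."""
    return above if neuron_fires(lightness, width) else below
```

For example, a fish with l = 10 and w = 10 gives 7.3·10 + 3.4·10 = 107, which is above 100, while l = 5 and w = 5 gives 53.5, which is below. In a learned model the two weights and the threshold would be found from training data instead of being fixed by hand.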
Multilayer network (artificial neural network)
Example: digit recognition. Input: x ∈ {0,1}^(10×14), i.e. 140 inputs. Layer 1: 12 perceptrons. Layer 2: 10 perceptrons, each having 12 inputs. Softmax outputs: p(c = "0" | x), p(c = "1" | x), ..., p(c = "9" | x).
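The layer sizes in the digit recognition example (140 inputs, 12 units, 10 units, softmax) can be sketched as a forward pass. This is an illustration under assumptions, not the network from the slide: the weights are random rather than trained, the hidden layer uses a sigmoid activation (the slide does not state one), and all function names are invented here.

```python
import math
import random

def dense(inputs, weights, biases):
    """One layer of units: each unit takes a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Turn 10 scores into probabilities p(c = "0" | x), ..., p(c = "9" | x)."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_units, n_inputs, rng):
    """Random small weights and zero biases (untrained; for shape checking only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
               for _ in range(n_units)]
    return weights, [0.0] * n_units

def forward(x, layer1, layer2):
    hidden = [sigmoid(z) for z in dense(x, *layer1)]  # 12 hidden activations
    return softmax(dense(hidden, *layer2))            # 10 class probabilities

rng = random.Random(0)
layer1 = init_layer(12, 140, rng)  # layer 1: 12 units, 140 inputs each
layer2 = init_layer(10, 12, rng)   # layer 2: 10 units, 12 inputs each
x = [rng.choice([0, 1]) for _ in range(10 * 14)]  # hypothetical binary 10x14 image
probs = forward(x, layer1, layer2)
```

With untrained weights the ten probabilities are close to uniform. Training would adjust the 140·12 + 12 parameters of layer 1 and the 12·10 + 10 parameters of layer 2 so that the correct digit receives the highest probability.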
Deep learning
Neural networks vs deep neural networks
Large-scale recognition: 1,000,000 images and 1,000 categories
AlexNet: won the 2012 ImageNet competition. 5 convolutional layers followed by 3 fully connected layers (2 hidden, plus the 1000-way output layer). The input is a 224x224 color image. The network is split across 2 GPUs.
AlexNet results (2012): AlexNet TensorFlow code and some results
GoogLeNet (2014), ResNet-34 (2015) http://ethereon.github.io/netscope/quickstart.html
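Characteristics of deep learning: high flexibility (applicable to a wide range of problems; many architectures are possible); representation learning (hierarchical feature learning; distributed representations).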
Characteristics of deep learning: performance improves as the amount of data grows; well suited to parallel processing (GPU, TPU, ...).
Media applications
NETFLIX
Deep Neural Networks for YouTube Recommendations https://research.google.com/pubs/pub45530.html
MUSIC AI
Cognitive Movie Trailer
[Diagram: training] 100 horror movie trailers; manual editing; moments; machine learning (visual analysis, audio analysis, scene composition); integration with statistical approach.
[Diagram: trailer generation] Morgan full movie (1h 32m); automation (visual analysis, audio analysis, scene composition; integration with statistical approach); 10 moments; IBM filmmaker (black title cards, musical overlay, order of moments); Morgan trailer (2m 32s).
What kinds of problems can it solve?
Moravec's paradox
Moravec's paradox: RoboCup 2016, NimbRo vs AUTMan
Honda ASIMO
Why? "Encoded in the large, highly evolved sensory and motor portions of the human brain is a billion years of experience about the nature of the world and how to survive in it. We are all prodigious Olympians in perceptual and motor areas, so good that we make the difficult look easy. Abstract thought, though, is a new trick, perhaps less than 100 thousand years old. We have not yet mastered it. It is not all that intrinsically difficult; it just seems so when we do it." (Moravec, Hans (1988), Mind Children, Harvard University Press)
An easy problem or a hard one? MacGyver, 1985~
An easy problem or a hard one?
Human           Computer
Conversation    Speech recognition, thinking
Speaking        Speech synthesis (TTS)
Drawing         Graphics
[Diagram] Axes: easy vs hard problems for humans; easy vs hard problems for programmers. Hard problems for programmers: talk, read, walk, drive, Atari, ... Easy problems for programmers: model-based things (e.g., physics simulation).
[Diagram] Axes: easy vs hard problems for humans; easy vs hard problems for programmers. Hard problems for programmers: talk, read, walk, drive, Atari, ... Deep learning to the rescue. Easy problems for programmers: model-based things (e.g., physics simulation).
[Diagram] Axes: easy vs hard problems for humans; easy vs hard problems for programmers. Hard problems for programmers: talk, read, walk, drive, Atari, ... Deep learning to the rescue? Easy problems for programmers: model-based things (e.g., physics simulation).
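Summary. Artificial intelligence: aims to build machines that work and act like people. Machine learning: one way of realizing AI; the field that seeks input-output relations from data. Deep learning: a machine learning method that uses large-scale neural networks; it has many practical advantages as well as theoretical ones. The power of AI: predicting the future is not easy, but even at its current level AI can be applied in many fields to help human expertise and creativity come through more fully.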
BACKUPS
To Scale: The Solar System
Details of the process: training. The research team trained a system on the trailers of 100 horror movies by segmenting out each scene from the trailers. Once each trailer was segmented into "moments", the system completed the following: (1) a visual analysis and identification of the people, objects, and scenery; (2) an audio analysis of the ambient sounds (such as the character's tone of voice and the musical score); (3) an analysis of each scene's composition (such as the location of the shot, the image framing, and the lighting). The analysis was performed on each area separately and in combination with the others using statistical approaches.
Details of the process. The full-length feature film "Morgan" was fed to the system, which identified 10 moments that would be the best candidates for a trailer. Manual processing: an IBM filmmaker arranges and edits the moments together into a comprehensive trailer (adding black title cards, the musical overlay, and the order of moments). Footage on the cutting room floor: some moments in the movie were not included.