KRnet 2009, June 25, 2009
Cloud Computing Platform
Jaesun Han
- CEO, NexR
- Chairman, Korea Cloud Computing Research Association
- Adjunct Professor, KAIST Information Media MBA
jshan@nexr.co.kr
Next Revolution, Toward Open Platform

Agenda
- Introduction to Cloud Computing
  - Cloud Computing background and definition
  - Cloud Computing industry and market trends
  - Cloud Computing classification
- Cloud Computing Technologies and Use Cases
  - Amazon Cloud Infrastructure
  - Google App Engine
  - Microsoft Azure
  - Hadoop Platform
  - Eucalyptus Platform
- Cloud Computing Issues and Challenges

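Background: The Shift in the Computing Paradigm
Figure: the computing paradigm shift inferred from the transformation of the electricity industry
- Electricity: Burden Iron Works → Edison Power Plant & Power Grid
- Computing: Corporate Data Center, PC → Cloud Computing Center & Internet
Change in how computing resources are owned
- Growing outsourcing of in-house IT resources and services
- Division of labor and economies of scale
Expansion of Internet-based services
- Software and content delivered as online services
- Reliable service delivery over high-speed networks
Both trends lead to cloud computing.
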
Definition: Cloud Computing
- Enterprise view: providing IT infrastructure and an environment to develop, host, and run services and apps, on demand, with pay-as-you-go pricing, as a service
- End-user view: providing resources and services to store data and run applications, on any device, anytime, anywhere, as a service
Gartner ranked cloud computing second among its top 10 strategic technologies for 2009 (the first, virtualization, is itself an enabling technology for cloud computing).

Characteristics and Benefits of Cloud Computing
Characteristics (Source: Is Cloud Computing Ready For The Enterprise, Forrester Research)
- Prescripted and abstracted infrastructure
- Fully virtualized
- Equipped with dynamic infrastructure software
- Pay by consumption
- Free of long-term contracts
- Application and OS independent
- Free of software or hardware installation
Benefits
- Economies of scale
- Cost: no upfront CapEx (capital expenditure); pay-as-you-go pricing model
- Scalability: scale capacity on demand; handle dynamic workloads
- Productivity: easy to use; reduced time-to-market
- Maintenance: easy or no management; instant software updates

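The Economics of Cloud Computing
Provider view: economies of scale
- Comparison of a ~1,000-server data center with a ~50,000-server data center (2006 figures)
User view: cost savings
- Additional factors to consider: power, cooling, and floor-space costs; operations and management costs
Source: Above the Clouds: A Berkeley View of Cloud Computing, UC Berkeley TR 2009
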
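Cloud Computing Classification
Public Cloud
- Cloud Services/Applications (Software as a Service): delivers end-user services
  - Offerings: services + infrastructure resources
  - Target: individuals + enterprises
  - Examples: Apple MobileMe, Google Apps, Nokia Ovi, Salesforce.com Apps, etc.
- Cloud Platform (Platform as a Service): provides a development and operating environment together with IT infrastructure resources
  - Offerings: development environment + infrastructure resources
  - Target: enterprises
  - Examples: Google App Engine, force.com, MS Azure, Facebook F8, Bungee Labs, etc.
- Cloud Infrastructure (Infrastructure as a Service): provides IT infrastructure resources
  - Offerings: infrastructure resources
  - Target: enterprises
  - Examples: Amazon S3 & EC2, Joyent, GoGrid, AT&T, etc.
Private Cloud
- Enterprise Cloud Computing: IBM, HP, Sun, Red Hat, EMC, Dell, etc.
- Cloud Computing Software: Hadoop, 3Tera, Xen, VMware, NexR VCC, Eucalyptus, Enomaly ECP, OpenNebula, etc.
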
Public Cloud Players
- Cloud Infrastructure
- Cloud Service
- Cloud Platform

Amazon Web Services (AWS)
- Launched its cloud infrastructure services in 2004
- In 2008, its traffic surpassed that of the Amazon.com site
- Expanded from SQS, EC2, and S3 to a broader range of infrastructure resources such as SimpleDB and CloudFront
- The first and most successful cloud computing service
Services
- E-Commerce Service
- Data as a Service: Historical Pricing
- People as a Service: Mechanical Turk
- Alexa Web Information Service
- Search as a Service: Alexa Top Sites, Alexa Site Thumbnail, Alexa Web Search Platform
- Infrastructure as a Service (Cloud Infrastructure): Simple Queue Service, Simple Storage Service, Elastic Compute Cloud, SimpleDB
Pricing
- S3: $0.15 per GB-month
- EC2: $0.10 per instance-hour
- SimpleDB: $1.50 per GB-month

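Amazon Cloud Infrastructure Use Cases
An online video mixing service
- Uses Amazon Cloud Infrastructure services instead of its own infrastructure
- Sudden surge in users: 25,000 → 250,000 (over 3 days), peaking at 20,000 new sign-ups per hour
- Responded quickly with EC2: 50 → 4,000 instances (over 5 days), peaking at 40 new instances per hour
- A success story showing that cloud computing suits online services
The New York Times: converting TIFF images of 11 million articles (1851-1980) to PDF
- Used Amazon EC2 and S3 instead of buying new hardware, and the open-source Hadoop platform instead of buying software
- Elapsed time: 1 day. Cost: ?
- A success story showing that cloud computing suits batch jobs
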
The New York Times Case Analysis
System architecture
- Amazon S3: TIFF images (4 TB) as input, PDFs (1.5 TB) as output
- Amazon EC2 (100 instances) running Hadoop MapReduce from an AMI
Cost
- S3 storage: 5.5 TB
- Data transfer in: 4 TB
- EC2 instances: 100 × 24 hours
- Total: only $1,465
http://calculator.s3.amazonaws.com/calc5.html

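The $1,465 total can be reproduced with simple arithmetic. The sketch below is a rough check, not an official billing formula. It uses the S3 and EC2 unit prices quoted on the AWS slide of this deck. The $0.10 per GB data-transfer-in rate, the one-month storage period, and the 1 TB = 1,000 GB conversion are assumptions that do not appear on the slides; they are chosen because they reproduce the quoted figure.

```python
# Prices are kept in integer cents to avoid floating-point rounding.
S3_CENTS_PER_GB_MONTH = 15        # from the deck: S3 $0.15 per GB-month
EC2_CENTS_PER_INSTANCE_HOUR = 10  # from the deck: EC2 $0.10 per instance-hour
TRANSFER_IN_CENTS_PER_GB = 10     # ASSUMPTION: rate not stated on the slides
GB_PER_TB = 1000                  # ASSUMPTION: decimal terabytes


def job_cost_cents(storage_gb, transfer_in_gb, instances, hours, months=1):
    """Return a per-item cost breakdown (in cents) for a batch job."""
    storage = storage_gb * S3_CENTS_PER_GB_MONTH * months
    transfer_in = transfer_in_gb * TRANSFER_IN_CENTS_PER_GB
    compute = instances * hours * EC2_CENTS_PER_INSTANCE_HOUR
    return {
        "storage": storage,
        "transfer_in": transfer_in,
        "compute": compute,
        "total": storage + transfer_in + compute,
    }


# 5.5 TB stored, 4 TB transferred in, 100 instances for 24 hours.
cost = job_cost_cents(
    storage_gb=5500,
    transfer_in_gb=4 * GB_PER_TB,
    instances=100,
    hours=24,
)
for item, cents in cost.items():
    print(f"{item:12s} ${cents / 100:,.2f}")
```

Under these assumptions storage is the largest item ($825), followed by data transfer ($400) and compute ($240).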
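Amazon EC2 & S3 Technical Architecture
Key technologies
- Open interface technologies (Web Services, SOA, Open API, etc.)
- Large-scale distributed storage technologies (distributed file system, distributed datastore, distributed query language, distributed cache, etc.)
- Virtualization technologies (server virtualization, storage virtualization, network virtualization, etc.)
Components
- AWS Interface (SOAP, REST)
- EC2 Manager and S3 Manager
- EC2 Instance Pool: EC2 hosts, each running EC2 instances on the Xen hypervisor
- Amazon S3, holding AMIs
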
How to Use Amazon AWS
- Command Line Tools
- AWS Management Console (web interface)
- Third-party management tools and services
  - Elasticfox: Firefox plug-in for Amazon EC2
  - S3 Firefox Organizer: Firefox plug-in for Amazon S3
  - RightScale: management service for EC2 & S3
- SOAP & Query interfaces
- Programming libraries: Java, Python, Ruby, PHP, Perl, C#, VB.NET, etc.

Google App Engine
Run your web applications on Google's infrastructure
http://code.google.com/appengine/
Cloud platform service
- Google infrastructure resources offered free of charge (launched in 2008): 500 MB storage, 10 GB bandwidth in and out per day, 5 million page views per month
- Additional resources under usage-based pricing: CPU $0.10 per hour, storage $0.15 per GB-month
- Python and Java web development environments
- System capabilities such as performance, scalability, and fault handling are provided by the platform
Google App Engine technology
- Scalable service infrastructure built on the Google platform
- Python runtime and Open APIs for a variety of services
- Software Development Kit
- Web-based Admin Console
- Scalable datastore (GFS, Bigtable, Memcached, etc.)

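Google App Engine Platform Technology
- Web-based Admin Console: management and status information
- Apps running on the Python Runtime (the service execution environment)
- Python frameworks: webapp, Django
- Service APIs: Account, Image, Mail, URL Fetch, Datastore, Memcache
- App Engine SDK for development and testing: local versions of the web server and APIs, plus an uploader for uploading apps
- Memcache: global memory cache
- Bigtable: distributed database
- GFS: distributed file system
- Commodity server cluster
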
GAE HelloWorld Programming

app.yaml

    application: helloworld
    version: 1
    runtime: python
    api_version: 1

    handlers:
    - url: /.*
      script: helloworld.py

helloworld.py

    from google.appengine.ext import webapp
    from google.appengine.ext.webapp.util import run_wsgi_app

    class MainPage(webapp.RequestHandler):
        def get(self):
            # Respond to GET / with a plain-text greeting
            self.response.headers['Content-Type'] = 'text/plain'
            self.response.out.write('Hello, webapp World!')

    # Route "/" to MainPage
    application = webapp.WSGIApplication(
        [('/', MainPage)],
        debug=True)

    def main():
        run_wsgi_app(application)

    if __name__ == "__main__":
        main()

helloworld.py with the Users API

    from google.appengine.api import users
    from google.appengine.ext import webapp
    from google.appengine.ext.webapp.util import run_wsgi_app

    class MainPage(webapp.RequestHandler):
        def get(self):
            user = users.get_current_user()
            if user:
                # Signed-in users get a personal greeting
                self.response.headers['Content-Type'] = 'text/plain'
                self.response.out.write('Hello, ' + user.nickname())
            else:
                # Everyone else is sent to the login page
                self.redirect(users.create_login_url(self.request.uri))

    application = webapp.WSGIApplication(
        [('/', MainPage)],
        debug=True)

    def main():
        run_wsgi_app(application)

    if __name__ == "__main__":
        main()

helloworld.py with the Datastore API

    from google.appengine.ext import db

    class Greeting(db.Model):  # defining data model
        author = db.UserProperty()
        content = db.StringProperty(multiline=True)
        date = db.DateTimeProperty(auto_now_add=True)

    class Guestbook(webapp.RequestHandler):  # storing data
        def post(self):
            greeting = Greeting()
            if users.get_current_user():
                greeting.author = users.get_current_user()
            greeting.content = self.request.get('content')
            greeting.put()
            self.redirect('/')

    class MainPage(webapp.RequestHandler):  # querying data
        def get(self):
            greetings = db.GqlQuery("SELECT * FROM Greeting ORDER BY date DESC LIMIT 10")
            for greeting in greetings:
                if greeting.author:
                    ...

Testing the app

    google_appengine/dev_appserver.py helloworld/

Then open http://localhost:8080/

Microsoft Azure
- PaaS platform
- Supports both web apps and batch apps
- Offers computing resources, development tools, and web services
- Pay-as-you-go pricing model
- Integrated with Windows Live Services

Windows Azure & Azure Services Platform
OS for the cloud
- Compute Service
  - VMs on a cloud-optimized hypervisor
  - Web Role (web app) & Worker Role (batch app)
- Storage Service
  - Not a relational system
  - Types: Blobs, Tables, Queues
Database services in the cloud
- SQL Data Services
  - Built on MS SQL Server
  - Hierarchical data model, not relational (scalability, availability, reliability)
  - SOAP/RESTful interface & LINQ query
- Others (future): reporting, analysis, ETL, etc.
Common services for creating distributed apps
- Access Control: identity federation, claims transformation
- Service Bus: intermediary between apps
- Workflow: workflows for cross-organizational composite apps
Live framework in the cloud
- Live Services
  - Access to Windows Live Services data
  - HTTP & Atom/RSS feed
  - SOAP & RESTful interface
- Mesh Services
  - Data synchronization
  - Creating a mesh of devices
  - Creating mesh-enabled web apps
Source: Introducing the Azure Services Platform, David Chappell, White Paper 2009

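Cloud Software Example: Hadoop
- A system for storing and processing large-scale distributed data
- A clone of the Google platform
- An Apache open-source project
- Grew out of distribution issues in the Nutch open-source search engine
- A Java-based software solution for large-scale data storage and distributed processing on clusters of low-cost commodity servers
- Has spawned numerous sub-projects and an ecosystem
- Powered by Hadoop: biggest Hadoop cluster (20,000 servers)
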
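Hadoop Platform Technology
Hadoop components and their Google Platform counterparts
- Nutch: open-source search engine (Google counterpart: Google Search)
- MapReduce: distributed data processing system (Google counterpart: MapReduce)
- HBase: distributed database (Google counterpart: Bigtable)
- HDFS: distributed file system (Google counterpart: GFS)
- Commodity server cluster
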
Open Platform Ecosystem
An open-source platform has the advantage of encouraging voluntary participation by developers, which builds an ecosystem that can compete with commercial platforms.
- HDFS, MapReduce, HBase, HOD, Streaming, Fuse-DFS, EC2 support
- NexR VCC: Hadoop on virtualization?
- Cascading: workflow management for Hadoop MapReduce
- Yahoo Pig: query language interface on Hadoop
- Yahoo Zookeeper: distributed management
- IBM MapReduce Tools: Eclipse plug-in for MapReduce programs
- Facebook Hive: data warehousing on Hadoop
- Parhely: ORM for HBase
- Katta: distributed indexing with Hadoop
- Mahout & Hama: machine learning using Hadoop MapReduce
See also: Korea Hadoop Community, http://www.hadoop.or.kr

Cloud Software Example: Eucalyptus
- Open-source cloud computing software created by UCSB
  - Originally aimed at researchers, to support cloud computing experiments
  - Designed to install easily on common academic cluster configurations
- Technical features
  - Hierarchical architecture: node controller, cluster controller, cloud controller
  - Uses many open-source components: Axis2, Mule, Rampart, libvirt, JiBX, jetty, HSQLDB, Hibernate, etc.
  - Compatible with Amazon EC2, S3, and EBS
  - Support for Xen & KVM virtualization
  - Network virtualization with VLAN
  - Rocks-based automatic installation

Eucalyptus Architecture
- Cloud Interface
- Cloud Controller
- Cluster Controller
- Node Controller

Example of Programming in Cloud Computing
GrepTheWeb
- Greps (filters) actual web documents with a regular expression
- Why cloud computing is needed: large datasets (even hundreds of TB), complex regexes, unknown request patterns
- Built on Amazon S3, EC2, SQS, SimpleDB + Hadoop MapReduce
  - S3: retrieving input datasets and storing the output dataset
  - SQS: buffering requests, acting as glue between controllers
  - SimpleDB: storing intermediate status, logs, and user data about tasks
  - EC2: running a large distributed-processing Hadoop cluster on demand

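The core of GrepTheWeb, filtering documents with a regular expression, maps naturally onto a map-only job. The following is a minimal in-memory sketch of that idea. The function names and sample documents are hypothetical, and plain Python stands in for the S3, SQS, SimpleDB, and Hadoop pieces that the real system relies on.

```python
import re


def grep_map(doc_id, text, pattern):
    """Map step: emit (doc_id, line) for every line that matches."""
    regex = re.compile(pattern)
    for line in text.splitlines():
        if regex.search(line):
            yield doc_id, line


def grep_documents(documents, pattern):
    """Apply the map step to every document and collect the matches."""
    matches = []
    for doc_id, text in documents.items():
        matches.extend(grep_map(doc_id, text, pattern))
    return matches


# Hypothetical sample input; a real run would read documents from storage.
documents = {
    "doc1": "Hadoop runs MapReduce jobs\nnothing to see here",
    "doc2": "cloud storage\nMapReduce on EC2",
}
results = grep_documents(documents, r"MapReduce")
for doc_id, line in results:
    print(doc_id, line)
```

Because each document is filtered independently, the map step can be spread over as many nodes as the dataset requires, which is why the workload fits an on-demand cluster.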
Programming in Cloud Computing
- Programming loosely coupled systems
  - Programming with separate cloud computing resources: computing, storage, database, queue, etc.
  - Using messaging queues
  - Support for concurrency, high availability, and load spikes
- Programming elastic resources
  - "As a Service" model for accessing and controlling resources
  - Almost-zero infrastructure before and after the execution
- Programming with scalable ingredients
  - Scale capacity on demand
  - Be a pessimist when using cloud resources
- Thinking parallel
  - Low cost and easy management for large clusters
  - Multi-threading and multi-node programming
  - Programming with a share-nothing philosophy
- Programming cost-effectively
  - Usage-based costing
  - Infrastructure cost: CAPEX → OPEX
  - Difficult to predict the overall cost (rethinking ROI)

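The queue-based, loosely coupled style and the advice to be a pessimist about cloud resources can be sketched with Python's standard library. In this sketch `queue.Queue` is a local stand-in for a cloud queue service such as SQS, and `flaky_square` simulates a resource that fails on first use; both are illustrative assumptions rather than part of any real service API.

```python
import queue
import threading


def worker(tasks, results, handler, max_attempts=3):
    """Pull tasks until a None sentinel arrives, retrying failed tasks."""
    while True:
        task = tasks.get()
        if task is None:
            tasks.task_done()
            break
        for attempt in range(1, max_attempts + 1):
            try:
                results.put((task, handler(task)))
                break
            except RuntimeError:
                # Pessimistic design: expect failures and retry.
                if attempt == max_attempts:
                    results.put((task, None))
        tasks.task_done()


_seen = set()
_seen_lock = threading.Lock()


def flaky_square(n):
    """Simulated resource that fails the first time it sees each input."""
    with _seen_lock:
        first_time = n not in _seen
        _seen.add(n)
    if first_time:
        raise RuntimeError("simulated transient failure")
    return n * n


tasks, results = queue.Queue(), queue.Queue()
workers = [
    threading.Thread(target=worker, args=(tasks, results, flaky_square))
    for _ in range(3)
]
for w in workers:
    w.start()
for n in range(5):
    tasks.put(n)          # producers only know about the queue
for _ in workers:
    tasks.put(None)       # one shutdown sentinel per worker
for w in workers:
    w.join()

outcome = dict(results.get() for _ in range(5))
print(sorted(outcome.items()))
```

The producer and the workers share nothing except the queues, so workers can be added or removed to follow the load without changing the producer.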
Cloud Computing Applications
- Data-intensive computing
  - Document processing: convert hundreds of thousands of documents from Microsoft Word to PDF, OCR millions of pages/images into raw searchable text
  - Image processing: create thumbnails or low-resolution variants of an image, resize millions of images
  - Video transcoding: transcode AVI to MPEG movies
  - Indexing: create an index of web crawl data
  - Data mining: perform search over millions of records
- Batch processing systems
  - Back-office applications (in financial, insurance, or retail sectors)
  - Log analysis: analyze and generate daily/weekly reports
  - Nightly builds: perform automated builds of the source code repository every night in parallel
  - Automated unit testing and deployment testing: test, deploy, and run automated unit tests (functional, load, quality) on different deployment configurations every night
- Websites
  - Websites that sleep at night and auto-scale during the day
  - Instant websites: websites for conferences or events (Super Bowl, sports tournaments)
  - Promotion websites
  - Seasonal websites: websites that only run during the tax season or the holiday season (Black Friday or Christmas)
Source: Cloud Architectures, Jinesh Varia

MapReduce: Programming for Data-Intensive Computing
- Distributed processing framework invented by Google
  - map (k1, v1) → list (k2, v2)
  - reduce (k2, list (v2)) → list (v2)
- Proposed for parallel processing of large data sets: parallelization, fault tolerance, and data distribution are handled in the framework
- Applications: log analysis, search indexing, collaborative filtering, clustering, machine learning, data mining, etc.
- Features
  - Mapper locality
  - Overlap of maps, shuffle, and sort
  - Speculative execution

How MapReduce Works
- map(k, v) → list(k', v')
- reduce(k', list(v')) → list(v')
- Figures: logical processing flow of MapReduce; parallel processing of tasks

MapReduce Programming: WordCount Map

    package org.myorg;

    import java.io.IOException;
    import java.util.*;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.conf.*;
    import org.apache.hadoop.io.*;
    import org.apache.hadoop.mapred.*;
    import org.apache.hadoop.util.*;

    public class WordCount {

        // Mapper: emits (word, 1) for every token of every input line
        public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
            private final static IntWritable one = new IntWritable(1);
            private Text word = new Text();

            public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
                String line = value.toString();
                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens()) {
                    word.set(tokenizer.nextToken());
                    output.collect(word, one);
                }
            }
        }

MapReduce Programming: WordCount Reduce

        // Reducer: sums the counts collected for each word
        public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
            public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
                int sum = 0;
                while (values.hasNext()) {
                    sum += values.next().get();
                }
                output.collect(key, new IntWritable(sum));
            }
        }

        // Driver: configures the job and submits it
        public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf(WordCount.class);
            conf.setJobName("wordcount");

            conf.setOutputKeyClass(Text.class);
            conf.setOutputValueClass(IntWritable.class);

            conf.setMapperClass(Map.class);
            conf.setCombinerClass(Reduce.class);
            conf.setReducerClass(Reduce.class);

            conf.setInputFormat(TextInputFormat.class);
            conf.setOutputFormat(TextOutputFormat.class);

            // args[0]: input path, args[1]: output path
            FileInputFormat.setInputPaths(conf, new Path(args[0]));
            FileOutputFormat.setOutputPath(conf, new Path(args[1]));

            JobClient.runJob(conf);
        }
    }

MapReduce Programming: WordCount Execution
Input files (copied by the user from local disk to HDFS)
- file01.txt: Hello World Bye World
- file02.txt: Hello Hadoop Goodbye Hadoop
The JobTracker runs the job (wordcount) on the input files
- Map (M) output for file01.txt: (Hello, 1) (World, 1) (Bye, 1) (World, 1)
- Map (M) output for file02.txt: (Hello, 1) (Hadoop, 1) (Goodbye, 1) (Hadoop, 1)
- Sorter output: (Bye, 1) (Goodbye, 1) (Hadoop, 1) (Hadoop, 1) (Hello, 1) (Hello, 1) (World, 1) (World, 1)
- Reduce (R) output: (Bye, 1) (Goodbye, 1) (Hadoop, 2) (Hello, 2) (World, 2)

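The map, sort, and reduce stages of this trace can be imitated in a few lines of plain Python. This is a single-process illustration of the data flow only, not Hadoop's implementation. The helper `run_mapreduce` is a hypothetical name, and the file name is used as the map key for simplicity, whereas Hadoop's text input format passes a byte offset.

```python
from itertools import groupby
from operator import itemgetter


def map_words(_key, line):
    """Map: emit (word, 1) for every whitespace-separated token."""
    for word in line.split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce: sum all counts collected for one word."""
    yield word, sum(counts)


def run_mapreduce(records, mapper, reducer):
    """Run map, then sort by key, then reduce each key group."""
    mapped = [pair for key, value in records for pair in mapper(key, value)]
    mapped.sort(key=itemgetter(0))  # plays the role of the sorter
    output = []
    for key, group in groupby(mapped, key=itemgetter(0)):
        output.extend(reducer(key, [value for _, value in group]))
    return output


files = [
    ("file01.txt", "Hello World Bye World"),
    ("file02.txt", "Hello Hadoop Goodbye Hadoop"),
]
word_counts = run_mapreduce(files, map_words, reduce_counts)
print(word_counts)
```

The result matches the reduce output on the slide: Bye 1, Goodbye 1, Hadoop 2, Hello 2, World 2.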
Real MapReduce Programming (Google)
- Workflow of the MapReduce programs in the indexing part of the Google search engine
- Borrowed from Michael Kleber's presentation

Research in MapReduce
- Performance issues
  - Evaluating MapReduce for Multi-core and Multiprocessor Systems (HPCA 2007)
  - Improving MapReduce Performance in Heterogeneous Environments (OSDI 2008)
- Applications & algorithm issues
  - Map-reduce for machine learning on multicore (NIPS 2007)
  - MRPGA: An Extension of MapReduce for Parallelizing Genetic Algorithms (eScience 2008)
  - CloudBLAST: Combining MapReduce and Virtualization on Distributed Resources for Bioinformatics Applications (eScience 2008)
  - MapReduce for Data Intensive Scientific Analyses (eScience 2008)
  - Apache Mahout Project: Implementing Machine Learning Algorithms in MapReduce
  - CloudBurst: Highly Sensitive Read Mapping with MapReduce (Oxford Bioinformatics 2009)
- Frameworks/implementations issues
  - MapReduce for the Cell B.E. Architecture (TR 2007)
  - Mars: A MapReduce Framework on Graphics Processors (PACT 2008)
  - A Map Reduce Framework for Programming Graphics Processors (STMCS 2008)
- Workflow & language extension issues
  - Pig Latin: A Not-So-Foreign Language for Data Processing (SIGMOD 2008)
  - Map-reduce-merge: simplified relational data processing on large clusters (SIGMOD 2007)
  - Interpreting the data: Parallel analysis with Sawzall (Scientific Programming Journal 2005)
  - Facebook Hive: Data warehousing using Hadoop & MapReduce
  - Cascading: Data processing workflow on a Hadoop cluster
  - NexR MR.Flow: MapReduce workflow management service

Problems of Cloud Computing
- From the Forrester Research report
  - Concerns about stability
  - Few big-name players offering clouds
  - Few enterprise reference accounts
  - Concerns around security
  - Lack of commercial ISV support
  - Little geographic locality
  - Not for the faint-of-tech
  - Not very enterprise friendly
- Other problems
  - Integration with in-house systems
  - Application licensing complexity
  - Privacy
  - Constant network connectivity
  - Confidence in service providers
  - Open standards
  - Interoperability between services

Cloud Computing Incidents Database
Source: CloudComputing:Incidents Database, Wikipedia

Service Outage Cases
- Amazon S3 outage: 8 hours on July 20, 2008 (affected: all). Cause: design fault (server-to-server communication)
- Flexiscale outage: 2 days from August 26, 2008 (affected: all). Cause: engineer mistake
- Gmail outage: 2 hours on August 11, 2008 (affected: many). Cause: change management
- Apple MobileMe outage: several hours on July 10, 2008 (affected: many). Cause: migration from .Mac to MobileMe
Source: CloudComputing:Incidents Database, Wikipedia

Service Closure Cases
- MediaMax/Linkup
  - Cloud storage service
  - Lost half of its users' files in July 2007
  - 20,000 paying users were affected
  - The service finally closed in July 2008
- Zimki
  - Early cloud platform service (from 2006)
  - Service closed in December 2007
  - Caused by the withdrawal of investment
Source: CloudComputing:Incidents Database, Wikipedia

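Solutions
- Use multiple cloud computing services (redundancy across cloud computing services)
- Technical standardization (interfaces, development environments, SLAs, etc.)
- Development and standardization of inter-cloud interoperation technology: Cloud Federation
- SLA-based service-level guarantees and QoS
- Security through data encryption and virtualization technology
- Regional data centers to comply with national regulations
- Usage-based license models and volume purchasing policies
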
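The first item, redundancy across several cloud services, can be illustrated with a small sketch. The classes `InMemoryStore` and `RedundantStore` are hypothetical and stand in for real provider SDKs, which differ in their APIs. The sketch only shows the idea of writing to every provider and reading from the first one that responds.

```python
class InMemoryStore:
    """Hypothetical stand-in for a single cloud storage provider."""

    def __init__(self, name):
        self.name = name
        self.available = True
        self._objects = {}

    def put(self, key, data):
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        self._objects[key] = data

    def get(self, key):
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        return self._objects[key]


class RedundantStore:
    """Writes to all providers and reads from the first that answers."""

    def __init__(self, providers):
        self.providers = providers

    def put(self, key, data):
        written = 0
        for provider in self.providers:
            try:
                provider.put(key, data)
                written += 1
            except ConnectionError:
                continue  # tolerate an outage at one provider
        if written == 0:
            raise ConnectionError("no provider accepted the write")
        return written

    def get(self, key):
        for provider in self.providers:
            try:
                return provider.get(key)
            except (ConnectionError, KeyError):
                continue  # fall through to the next provider
        raise KeyError(key)


primary = InMemoryStore("provider-a")
backup = InMemoryStore("provider-b")
store = RedundantStore([primary, backup])

copies = store.put("report.pdf", b"data")
primary.available = False            # simulate an outage at one provider
recovered = store.get("report.pdf")  # served by the backup
print(copies, recovered)
```

A production design would also need to reconcile providers that missed a write during an outage, which this sketch leaves out.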
Top 10 Obstacles to and Opportunities for Adoption and Growth of Cloud Computing
Source: Above the Clouds: A Berkeley View of Cloud Computing, UC Berkeley TR 2009

Thank You!!!
Jaesun Han
jshan@nexr.co.kr
Korea Hadoop Community
http://www.hadoop.or.kr