"When we speak of free software, we are referring to freedom, not price." - Richard Stallman, Free Software Foundation, GNU Project

http://amzn.github.io

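RDS: relational databases
Quick and easy setup
Performs repetitive administrative tasks for you
Offers a choice of relational database engines
Quick and easy scaling
Simple high-availability configuration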

RDS database engines: Aurora

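What is Aurora?
A MySQL-compatible relational database engine
Offers the performance and availability of commercial databases
Offers the efficiency and cost of open-source databases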

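A database architecture for the cloud
1. Logging and storage are moved into a multi-tenant, scale-out, database-optimized storage service.
2. Other AWS services such as EC2, VPC, DynamoDB, SWF, and Route 53 are used inside the service.
3. Integration with S3 for continuous backup provides 99.999999999% durability.
Diagram labels: Data Plane (SQL, Transactions, Caching, Logging + Storage); Control Plane (DynamoDB, SWF, Route 53); S3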

Aurora key features
High performance; strong security; MySQL compatibility; excellent scalability; high availability and durability; fully managed

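Strong security
Encryption at rest: AES-256 with hardware acceleration; all blocks on disk and in S3 are encrypted; keys are managed through AWS KMS
Encryption in transit: SSL
Network isolation through VPC
No direct access to nodes
Supports industry-standard security and data protection certifications
Diagram labels: Application; SQL, Transactions, Caching; Storage; S3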

You've probably heard about our benchmark numbers

SQL performance test results
MySQL SysBench performance test using Aurora on r3.8xl (32 vCPU, 244 GiB RAM)
WRITE PERFORMANCE: 4 client machines with 1,000 connections each
READ PERFORMANCE: a single client machine with 1,600 connections

5X faster than RDS MySQL 5.6 & 5.7
[Charts: WRITE PERFORMANCE and READ PERFORMANCE, MySQL SysBench results on R3.8XL (32 cores / 244 GiB RAM), comparing Aurora, MySQL 5.6, and MySQL 5.7]
Five times higher throughput than stock MySQL based on industry standard benchmarks.

Performance by instance size
[Charts: WRITE PERFORMANCE and READ PERFORMANCE for Aurora, MySQL 5.6, and MySQL 5.7]
Aurora scales with instance size for both read and write.

Reduced lag with read replicas: UP TO 500x LOWER LAG

Updates per second   Aurora    RDS MySQL, 30K IOPS (single AZ)
1,000                2.62 ms   0 s
2,000                3.42 ms   1 s
5,000                3.94 ms   60 s
10,000               5.38 ms   300 s

SysBench OLTP workload, 250 tables

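Aurora architecture for performance
DO LESS WORK: reduce I/O; minimize network packets; cache prior results; offload the database engine
BE MORE EFFICIENT: process asynchronously; reduce the latency path; use lock-free data structures; batch operations together
DATABASES ARE ALL ABOUT I/O
NETWORK-ATTACHED STORAGE IS ALL ABOUT PACKETS/SECOND
HIGH-THROUGHPUT PROCESSING DOES NOT ALLOW CONTEXT SWITCHES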

Aurora cluster
Diagram labels: AZ 1, AZ 2, AZ 3; Aurora primary instance; cluster volume spanning 3 Availability Zones; S3

Aurora cluster with read replicas
Diagram labels: AZ 1, AZ 2, AZ 3; Aurora primary instance, Aurora replica, Aurora replica; cluster volume spanning 3 Availability Zones; S3

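Aurora I/O traffic
MYSQL READ SCALING: the MySQL master (70% writes, 30% reads) ships the binlog over a single thread to the MySQL replica (70% writes, 30% new reads). Each node has its own data volume.
Logical: SQL statements are applied on the replica. The write load is similar on both nodes. Storage is separate. Data can drift between master and replica.
AMAZON AURORA READ SCALING: the Aurora master (70% writes, 30% reads) sends page cache updates to the Aurora replica (100% new reads). Both use shared Multi-AZ storage.
Physical: the master ships redo to the replica. The replica shares the storage and performs no writes. Redo is applied to cached pages.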

Aurora high availability

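Aurora storage
Highly available by default: 6-way replication across 3 Availability Zones; 4/6 write quorum and 3/6 read quorum; continuous backup to S3
SSD-based, scale-out, multi-tenant storage
Seamless storage scaling: up to 64 TB; pay only for what you use
Log-structured storage
Diagram labels: AZ 1, AZ 2, AZ 3; SQL, Transactions, Caching; S3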
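The 4/6 write and 3/6 read quorum sizes can be sanity-checked with a little arithmetic. The sketch below is only an illustration, assuming the usual quorum conditions (every read set overlaps every write set, and any two write sets overlap each other). The function name is hypothetical and this is not Aurora's actual implementation.

```python
def quorum_is_consistent(copies: int, write_quorum: int, read_quorum: int) -> bool:
    """Return True if the quorum sizes guarantee overlapping read/write sets."""
    # Every read set must intersect every write set, so a read sees the latest write.
    reads_see_latest_write = read_quorum + write_quorum > copies
    # Any two write sets must intersect, so two conflicting writes cannot both succeed.
    writes_do_not_conflict = 2 * write_quorum > copies
    return reads_see_latest_write and writes_do_not_conflict


# Values from the slide: 6 copies, 4/6 write quorum, 3/6 read quorum.
print(quorum_is_consistent(copies=6, write_quorum=4, read_quorum=3))  # True
# A 3/6 write quorum would let two disjoint sets of copies accept different writes.
print(quorum_is_consistent(copies=6, write_quorum=3, read_quorum=3))  # False
```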

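Self-healing, fault-tolerant storage
Automatic failure detection, replication, and repair
The failure of 2 copies, or of 1 Availability Zone, does not affect read or write availability
Even the failure of 3 copies does not affect read availability
Diagram labels: AZ 1, AZ 2, AZ 3; SQL, Transaction, Caching; Read availability; Read and write availability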
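These failure-tolerance claims follow from the 4/6 write and 3/6 read quorum. A minimal sketch, assuming the 6 copies are spread evenly at 2 per Availability Zone (the slides say 6-way replication across 3 zones but do not state the per-zone count); the names are hypothetical:

```python
COPIES_PER_AZ = 2      # assumption: 6 copies spread evenly over 3 Availability Zones
WRITE_QUORUM = 4       # from the slides: 4/6 write quorum
READ_QUORUM = 3        # from the slides: 3/6 read quorum


def availability(failed_copies: int) -> dict:
    """Report read/write availability given how many of the 6 copies are down."""
    healthy = 3 * COPIES_PER_AZ - failed_copies
    return {
        "read": healthy >= READ_QUORUM,
        "write": healthy >= WRITE_QUORUM,
    }


print(availability(COPIES_PER_AZ))  # one whole AZ lost: {'read': True, 'write': True}
print(availability(3))              # three copies lost: {'read': True, 'write': False}
```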

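Automatic instance failover in Aurora
With a read replica (Automatic Failover to Replica Instance):
An existing replica is promoted to the new primary instance
The priority of failover target instances can be specified
The DB cluster endpoint is kept, and its DNS record is changed to point to the new primary instance
Typically completes within 1 minute
Without a read replica (Create new primary Instance):
Aurora attempts to create a new DB instance in the same Availability Zone
If that is not possible, it attempts to create a new DB instance in a different Availability Zone
Typically completes within 15 minutes
Diagram labels: AZ 1, AZ 2, AZ 3; Primary instance; Replica instance; Shared Multi-AZ Storage; with an Aurora Replica; without an Aurora Replica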
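The choice of failover target can be pictured as a simple selection over the available replicas. This is a rough sketch only: the `Replica` type, its field names, and the rule that a lower tier number wins are assumptions made for illustration, since the slide states only that a priority can be specified.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Replica:
    name: str           # hypothetical instance identifier
    priority_tier: int  # assumption: lower number = preferred failover target


def pick_failover_target(replicas: Sequence[Replica]) -> Optional[Replica]:
    """Return the replica to promote, or None if a new instance must be created."""
    if not replicas:
        return None  # no replica: the slower "create new primary instance" path
    # min() returns the first minimum, so ties keep the order given by the caller.
    return min(replicas, key=lambda r: r.priority_tier)


replicas = [Replica("replica-az2", 1), Replica("replica-az3", 0)]
print(pick_failover_target(replicas).name)  # replica-az3
print(pick_failover_target([]))             # None
```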

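Fast crash recovery
Traditional databases: logs written since the last checkpoint must be replayed. In MySQL this is single-threaded and requires a large number of disk accesses.
Crash at T0 requires a re-application of the SQL in the redo log since last checkpoint
Aurora: redo records are replayed on demand at the storage level, as part of a read. Parallel, distributed, asynchronous.
Crash at T0 will result in redo logs being applied to each segment on demand, in parallel, asynchronously
Diagram labels: Checkpointed Data; Redo Log; T0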
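The idea of replaying redo on demand, rather than replaying the whole log before serving traffic, can be shown with a toy model. The class below is hypothetical and heavily simplified: each redo record just carries the page's new value, and the parallel, distributed aspects are not modeled.

```python
from collections import defaultdict


class LazyRedoStore:
    """Toy page store that applies pending redo records only when a page is read."""

    def __init__(self, checkpointed_pages, redo_log):
        self.pages = dict(checkpointed_pages)   # page_id -> value at last checkpoint
        self.pending = defaultdict(list)        # page_id -> redo records not yet applied
        for page_id, new_value in redo_log:     # group the log by page; nothing is replayed yet
            self.pending[page_id].append(new_value)
        self.records_applied = 0

    def read(self, page_id):
        # Apply this page's redo records in log order, then serve the page.
        for new_value in self.pending.pop(page_id, []):
            self.pages[page_id] = new_value
            self.records_applied += 1
        return self.pages[page_id]


store = LazyRedoStore(
    checkpointed_pages={"p1": "a0", "p2": "b0", "p3": "c0"},
    redo_log=[("p1", "a1"), ("p2", "b1"), ("p1", "a2")],
)
print(store.read("p1"))        # a2
print(store.records_applied)   # 2 (the record for p2 is still pending)
```

In this model the store can answer reads immediately after a crash, and the cost of replay is paid only for the pages that are actually touched.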

Survivable cache
The cache is separated from the database process
The cache stays warm even across a database restart event
The full cache becomes active quickly
Caching process is outside the DB process and remains warm across a database restart.
Instant crash recovery + survivable cache = fast and easy recovery from DB failures

Compatible with the MySQL ecosystem

Well established MySQL ecosystem
"We ran our compatibility test suites against Aurora and everything just worked." - Dan Jewett, Vice President of Product Management at Tableau
Categories: Business Intelligence; Data Integration; Query and Monitoring; SI and Consulting
Source:

How does Open-Source & Cloud fit into Data Analytics?

Generation; Collection & Storage; Analytics & Computation; Collaboration & Sharing

More devices; lower cost; higher throughput
Generation; Collection & Storage; Analytics & Computation; Collaboration & Sharing
Constraints
Web Services helps remove constraints

Three types of data analytics
Retrospective: analysis and reporting
Here-and-now: real-time analytics and dashboards
Predictions: smarter services

How Fast is Real-Time?

"There's no such thing as real time, only near-real time. Typically when we talk about real-time, we mean architectures that allow to respond to data without persisting it to a database first!" - John Akred, CTO, Silicon Valley Data Science

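So what is near real-time?
The ability to process data as soon as it arrives
In other words, processing data in its "present" state rather than its "future" state
So what counts as the "present"?
ecommerce: attention span of a potential customer
Options trader: milliseconds
Guided missile: microseconds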

Solution: stream processing
Stream storage that lets you process events as they come in and react accordingly
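As a rough illustration of processing events as they come in and reacting accordingly, the sketch below consumes events one at a time and raises an alert when too many errors fall inside a sliding time window. The event shape `(timestamp, kind)`, the function name, and the thresholds are assumptions made for the example; nothing is persisted to a database first.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def alerts(events: Iterable[Tuple[float, str]], window_seconds: float, threshold: int) -> Iterator[str]:
    """Yield an alert whenever more than `threshold` errors arrive within the window."""
    recent_errors = deque()  # timestamps of errors still inside the window
    for timestamp, kind in events:       # events are handled one by one, as they arrive
        if kind != "error":
            continue
        recent_errors.append(timestamp)
        while recent_errors and recent_errors[0] <= timestamp - window_seconds:
            recent_errors.popleft()      # drop errors that fell out of the window
        if len(recent_errors) > threshold:
            yield f"{len(recent_errors)} errors in the last {window_seconds}s at t={timestamp}"


stream = [(0.0, "ok"), (1.0, "error"), (1.5, "error"), (2.0, "error"), (9.0, "error")]
for alert in alerts(stream, window_seconds=5.0, threshold=2):
    print(alert)  # 3 errors in the last 5.0s at t=2.0
```

Because `alerts` is a generator, it works the same way on an unbounded source as on this short list.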

What do we expect from a real-time data stream?

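Expectations for a real-time data stream
What do we expect from a real-time data stream? High availability; scalability; fault recovery; durability (temporary)
How is that possible? Multiple data center facilities; automatically scalable infrastructure; global load balancing; and so on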

AWS Global Infrastructure
12 Regions: Oregon, GovCloud, Northern California, N. Virginia, São Paulo, Ireland, Frankfurt, Beijing, Seoul, Tokyo, Singapore, Sydney
33 Availability Zones
55 Edge Locations
Continuous Expansion