Optimizing Data-Centric IT Environments
Accelerate time to insights for HPC and Analytics apps
Next-Generation Computing Solutions with IBM Power CPUs and NVIDIA GPUs
Wook Huh, Director, Systems-Hardware, IBM (010-45-50, whuh@kr.ibm.com)

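Market Trends and IBM's Innovation Strategy for HPC and HPA Systems
- Continuous improvement in price/performance is required
- Drive innovation across the full stack, from systems to software, to make data centers more efficient
- Open collaboration
- Build an application ecosystem through little-endian support
- Improve application performance with the POWER + combination
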
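IBM's Strategic Direction for Next-Generation Computing Solutions
- Improve large-scale compute throughput
  - Leverage: GPU, FPGA
  - Integrate with the CPU: shared coherent memory
- Improve ease of use
  - CPU-GPU programming model: UVM, OpenMP 4.0/OpenACC
  - CAPI-FPGA
- Improve large-scale data processing performance
  - Move computation to data
  - Burst Buffer, NVRAM
  - Spectrum Scale
- Pursue open collaboration and innovation through the OpenPOWER Foundation
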
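IBM's Technology Stack for Next-Generation Computing Solutions
- POWER
- Workload- and workflow-centric design
- Optimization to achieve peak performance
- Tightly integrated solutions built from modular building blocks
- Innovation based on OpenPOWER
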
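Computing: POWER Processor
- More than 2x the per-core performance of x6
- Powerful compute capability, high system bandwidth, large memory capacity
POWER
- 12 cores/socket
- threads/core
- 6MB on-chip L3 cache/socket
- 230GB/s memory bandwidth/socket
- 6GB/s I/O bandwidth/socket
- Wide SMP bus
- CAPI
- memory channels/socket
- 1TB(2TB) memory/socket
- 12MB buffer cache/socket
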
Computing: POWER Processor & Open Collaboration
Open innovation through OpenPOWER
[Diagram labels: POWER/+ Processors; GPU; NVLink; CAPI; DMI; IBM & Partner Devices; Memory Interface Control; Server Class Memory]

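Computing: POWER Processor Roadmap
Leading continuous innovation in processor technology
- Die sizes: 650 mm², 65 mm², 567 mm²
- Generations: POWER7+ 32 nm; POWER 22 nm; POWER+ 22 nm; POWER xx nm; POWER10 xx nm
- Features: cores; on-chip; 2x SPFP; power gating; very large L3 cache; 12 cores; SMT; 2X DPFP; PCIe Gen 3 support (CAPI); NVLINK1.0; data center computing optimization; CAPI and NVLink improvements; efficiency improvements; extreme analytics performance gains; data center computing optimization
- Timeline: 2012, 2014, 20, 20, Future
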
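Computing: Technology Integration
Maximizing computing efficiency through technology
- POWER
- GPU compute acceleration
- FPGA compute acceleration: Compression, Encryption, Monte Carlo,
- Storage I/O acceleration: CAPI attached
- Network I/O acceleration
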
Computing: NVIDIA GPU
Technology collaboration between IBM and NVIDIA
POWER
- NVLink
  - GPU high-speed interconnect
  - 0-200 GB/s; 5-12X PCI-E Gen3
  - POWER CPU support
- Stacked Memory
  - 4x higher bandwidth (~1 TB/s)
  - 3x larger capacity
  - 4x more energy efficient per bit

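Computing: NVIDIA GPU
What sets the NVLink interconnect apart
[Figure: two configurations side by side.
- Current: NVIDIA GPU with graphics memory, attached over PCIe to a Power chip with system memory
- Future (20 ~ ): NVIDIA GPU w/ NVLink with graphics memory, attached at 40+40 GB/s to a Power chip with NVLink and system memory
- Other labels: POWER; GB/s; PCIe x; 0 GB/s Peak*]
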
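To make the NVLink versus PCIe comparison concrete, the following is a minimal sketch that estimates idealized transfer times from peak link bandwidth alone. It reads the slide's "40+40 GB/s" as 40 GB/s in each direction, which is an interpretation rather than something the slide states. The PCIe figure did not survive in the slide text, so the sketch assumes a nominal 16 GB/s per direction for a PCIe Gen3 x16 link. The function and constant names are hypothetical, and real transfers will be slower than these peak-rate estimates.

```python
# Idealized link-transfer estimate. All names here are illustrative.

# Interpretation of the slide's "40+40 GB/s": 40 GB/s per direction.
NVLINK_GB_PER_S = 40.0

# ASSUMPTION (not from the slide): nominal PCIe Gen3 x16 rate per direction.
PCIE_GEN3_X16_GB_PER_S = 16.0


def transfer_seconds(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Return the time to move size_gb gigabytes at the given peak bandwidth.

    Ignores latency, protocol overhead, and contention, so the result is a
    lower bound on the real transfer time.
    """
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_gb / bandwidth_gb_per_s


def speedup(fast_gb_per_s: float, slow_gb_per_s: float) -> float:
    """Return how many times faster the first link is than the second."""
    if slow_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return fast_gb_per_s / slow_gb_per_s


if __name__ == "__main__":
    size_gb = 80.0  # hypothetical working set copied from host to GPU
    print("NVLink:", transfer_seconds(size_gb, NVLINK_GB_PER_S), "s")
    print("PCIe:  ", transfer_seconds(size_gb, PCIE_GEN3_X16_GB_PER_S), "s")
    print("Ratio: ", speedup(NVLINK_GB_PER_S, PCIE_GEN3_X16_GB_PER_S))
```

Under these assumptions the one-direction advantage is the plain ratio of the two bandwidths; substituting the actual PCIe figure from the original slide would change the ratio accordingly.
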
Computing: NVIDIA GPU
Roadmap for systems based on POWER CPUs and NVIDIA GPUs
- 2014-2015: Kepler (K40/K0, PCIe); CUDA 5.5 7.0, Unified Memory; POWER
- 20: Pascal (SXM2, NVLink 1.0); CUDA, Full GPU Paging; POWER+
- 20: Volta (SXM2, NVLink 2.0); CUDA, Cache Coherent; POWER
- Buffered Memory

Networking - Mellanox
Collaboration with Mellanox
POWER
- Networking evolution for system-to-system connection
- High-bandwidth, low-latency networking
- Ethernet (RoCE), InfiniBand
- RDMA
- Connect-X, SwitchIB, LinkX
- CAPI

Storage: Spectrum Scale
Spectrum Scale:
[Figure labels:
- Clients: client workstations; users and applications; compute farm; server
- Single name space
- Access methods: POSIX; NFS; SMB/CIFS; Map Reduce Connector; OpenStack (Cinder, Manila, Swift, Glance)
- Locations: Site A; Site B; Site C; Off Premise
- IBM Spectrum Scale: automated data placement and data migration
- POWER; Elastic Storage Server]

Storage: Server
- Best scalability
- Best performance
- Excellent management features
- Software defined
Components: Power S22L; large-capacity POWER; JBOD Enclosure
Drive options: 2, 4 or 6 TB NL SAS; 1.2TB SAS or 400/00GB SSD
GL product line
- Model 22: Analytics Focus; 2 Enclosures, 12U; 1 NL-SAS, 2 SSD; 5 GB/sec
- Model 24: Analytics & Cloud; 4 Enclosures, 20U; 232 NL-SAS, 2 SSD; 15+ GB/sec
- Model 26: Petascale; 6 Enclosures, 2U; 34 NL-SAS, 2 SSD; 25+ GB/sec
GS product line
- Model 21s: 24 SSD
- Model 22s: 4 SAS or SSD; 12 GB/sec
- Model 24s: 6 SAS or SSD; 1+ GB/sec
- Model 26s: 144 SAS

Solution Stack: IBM Platform Computing
A comprehensive system for HPC and HPA
[Diagram labels: Workload Management; POWER; Application Runtime; System Management; Infrastructure Services]

The Start of the Journey Toward Next-Generation Supercomputing Systems
- Project to build two supercomputers for Oak Ridge and Lawrence Livermore Labs (completion planned for 20)
- Current top-tier DoE supercomputer systems: Titan (ORNL) 2012-20; Sequoia (LLNL) 2012-20; Mira (ANL) 2012-20
- 5X to 10X higher application performance than the current systems
- >100 PF, >2GB/core main memory, 00 GB/node local NVRAM, ~10MW
- 120 PB, 1 TB/s GPFS™ File System
- Mellanox Dual-Rail InfiniBand, IBM POWER CPUs, NVIDIA Volta™ GPUs

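The ">100 PF" and "~10MW" figures together imply an energy-efficiency target, which the short sketch below works out. It treats both numbers as exact for the sake of the arithmetic, although the slide gives one as a lower bound and the other as an approximation, so the result is only an order-of-magnitude indication. The function name is hypothetical.

```python
def gflops_per_watt(peak_pflops: float, power_mw: float) -> float:
    """Convert a peak rate in PFLOPS and a power draw in MW to GFLOPS/W."""
    if power_mw <= 0:
        raise ValueError("power must be positive")
    flops = peak_pflops * 1e15   # 1 PFLOPS = 1e15 FLOPS
    watts = power_mw * 1e6       # 1 MW = 1e6 W
    return flops / watts / 1e9   # 1 GFLOPS = 1e9 FLOPS


if __name__ == "__main__":
    # Figures from the slide, taken at face value: 100 PF and 10 MW.
    print(gflops_per_watt(100.0, 10.0), "GFLOPS/W")
```

With the slide's figures taken at face value, the target works out to roughly 10 GFLOPS per watt.
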
The Start of the Journey Toward Next-Generation Supercomputing Systems
- Components: POWER CPU; Volta GPU (SXM2)
- POWER 2 Socket Server
  - 2 P + 4/6 Volta GPU
  - 512+ GiB SMP Memory & GPU Memory (HBM stacks)
  - General-purpose 2U design for HPC and HPA
- Scalable system software and data architecture
- LLVM Open Source compiler
- Water cooling
- Integrated Local Active Storage
- Scalable Active Network: Mellanox IB4X EDR Switch
- Compute Rack: 1 Servers/rack
- ESS Rack: 1
- 256 Compute Racks; 40 Racks
- System: 200 Pflops compute + 120 PB

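As a rough sanity check on the system-level figures, the sketch below divides the stated 200 Pflops across the stated 256 compute racks. It assumes the racks are identical and that the peak figure is spread evenly across them, neither of which the slide confirms. The servers-per-rack count is garbled in the slide text, so no per-server figure is derived. The function name is hypothetical.

```python
def pflops_per_rack(system_pflops: float, compute_racks: int) -> float:
    """Return the average peak PFLOPS per rack, assuming identical racks."""
    if compute_racks <= 0:
        raise ValueError("rack count must be positive")
    return system_pflops / compute_racks


if __name__ == "__main__":
    # Figures from the slide: 200 Pflops total, 256 compute racks.
    print(pflops_per_rack(200.0, 256), "PFLOPS per rack")
```

Under these assumptions each compute rack would account for a little under 0.8 Pflops of the system peak.
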
IBM's Solutions for HPC and HPA
- Firestone (2015), released in October
  - Power S22LC
  - 2 x POWER processor
  - 2 x NVIDIA K0 via PCIe slot
  - 2U form factor
- Successor model (20)
  - 2 x POWER+ processor
  - 2~4 x NVIDIA Pascal via SXM2, NVLink 1.0
  - 2U form factor
- Successor model (20)
  - 2 x POWER processor
  - 4~6 x NVIDIA Volta via SXM2, NVLink 2.0
  - 2U form factor

Thank You