File Systems for Flash Memories

Similar documents
<4D F736F F F696E74202D20BCD2C7C1C6AEBFFEBEEEC6AFB7D03038B3E22E BC8A3C8AF20B8F0B5E55D>

6.24-9년 6월

(72) 발명자 이동희 서울 동작구 여의대방로44길 10, 101동 802호 (대 방동, 대림아파트) 노삼혁 서울 중구 정동길 21-31, B동 404호 (정동, 정동상 림원) 이 발명을 지원한 국가연구개발사업 과제고유번호 부처명 교육과학기술부

휠세미나3 ver0.4

solution map_....

리뉴얼 xtremI 최종 softcopy

Microsoft PowerPoint - 알고리즘_1주차_2차시.pptx

GNU/Linux 1, GNU/Linux MS-DOS LOADLIN DOS-MBR LILO DOS-MBR LILO... 6

FMX M JPG 15MB 320x240 30fps, 160Kbps 11MB View operation,, seek seek Random Access Average Read Sequential Read 12 FMX () 2

Microsoft PowerPoint - Flash Memory Based Bottom Up Analysis for Smart Phone System _Final [호환 모드]

R50_51_kor_ch1

PowerPoint 프레젠테이션

CD-RW_Advanced.PDF

Output file

슬라이드 1

Chap06(Interprocess Communication).PDF

Mango220 Android How to compile and Transfer image to Target

2 / 26

°í¼®ÁÖ Ãâ·Â

<BBEABEF7B5BFC7E22DA5B12E687770>

#Ȳ¿ë¼®

DE1-SoC Board

¹Ìµå¹Ì3Â÷Àμâ

PJTROHMPCJPS.hwp

김기남_ATDC2016_160620_[키노트].key

Microsoft Word _반도체-최종

Copyright 2012, Oracle and/or its affiliates. All rights reserved.,.,,,,,,,,,,,,.,...,. U.S. GOVERNMENT END USERS. Oracle programs, including any oper


OP_Journalism

PowerChute Personal Edition v3.1.0 에이전트 사용 설명서

목차 1. 제품 소개 특징 개요 Function table 기능 소개 Copy Compare Copy & Compare Erase

LXR 설치 및 사용법.doc

2009년 국제법평론회 동계학술대회 일정

인켈(국문)pdf.pdf

Simplify your Job Automatic Storage Management DB TSC

원고스타일 정의

다음 사항을 꼭 확인하세요! 도움말 안내 - 본 도움말에는 iodd2511 조작방법 및 활용법이 적혀 있습니다. - 본 제품 사용 전에 안전을 위한 주의사항 을 반드시 숙지하십시오. - 문제가 발생하면 문제해결 을 참조하십시오. 중요한 Data 는 항상 백업 하십시오.

Development of culture technic for practical cultivation under structure in Gastrodia elate Blume

Microsoft PowerPoint - o8.pptx

Vol.257 C O N T E N T S M O N T H L Y P U B L I C F I N A N C E F O R U M


DBPIA-NURIMEDIA

서론 34 2

05Àå

Voice Portal using Oracle 9i AS Wireless

Microsoft PowerPoint - eSlim SV [080116]

6장.indd


Microsoft PowerPoint - eSlim SV [ ]

04-다시_고속철도61~80p

<31325FB1E8B0E6BCBA2E687770>

K7VT2_QIG_v3

Microsoft PowerPoint - ch03ysk2012.ppt [호환 모드]

Oracle9i Real Application Clusters

MS-SQL SERVER 대비 기능

Microsoft Word - USB복사기.doc

¨ìÃÊÁ¡2

untitled

<3034B1E2B9DD32302DBAB8B0EDBCAD2D DC0FCC6C4C0DABFF BAB0C3A53420C8A8B3D7C6AEBFF6C5A9292E687770>

PowerPoint Presentation

Remote UI Guide

<3132BFF93136C0CFC0DA2E687770>

PCServerMgmt7

SRC PLUS 제어기 MANUAL

Hi-MO 애프터케어 시스템 편 5. 오비맥주 카스 카스 후레쉬 테이블 맥주는 천연식품이다 편 처음 스타일 그대로, 부탁 케어~ Hi-MO 애프터케어 시스템 지속적인 모발 관리로 끝까지 스타일이 유지되도록 독보적이다! 근데 그거 아세요? 맥주도 인공첨가물이

Orcad Capture 9.x

Integ

소프트웨어개발방법론

강의지침서 작성 양식

untitled

SMB_ICMP_UDP(huichang).PDF

±èÇö¿í Ãâ·Â

H3050(aap)

MAX+plus II Getting Started - 무작정따라하기

Vol.258 C O N T E N T S M O N T H L Y P U B L I C F I N A N C E F O R U M

#KM-235(110222)

歯15-ROMPLD.PDF

05( ) CPLV12-04.hwp

Preliminary spec(K93,K62_Chip_081118).xls

Microsoft PowerPoint - 알고리즘_5주차_1차시.pptx

02이용배(239~253)ok

DBPIA-NURIMEDIA

KDTÁ¾ÇÕ-1-07/03


Microsoft PowerPoint - 알고리즘_2주차_1차시.pptx

결과보고서

KDTÁ¾ÇÕ-2-07/03

public key private key Encryption Algorithm Decryption Algorithm 1

04_오픈지엘API.key

2011´ëÇпø2µµ 24p_0628

Microsoft PowerPoint ppt

Á¶Áø¼º Ãâ·Â-1

PRO1_04E [읽기 전용]

공학박사학위 논문 운영 중 터널확대 굴착시 지반거동 특성분석 및 프로텍터 설계 Ground Behavior Analysis and Protector Design during the Enlargement of a Tunnel in Operation 2011년 2월 인하대

6주차.key

<4D F736F F F696E74202D2037C0E52DC4B3BDC3BFCDB8DEB8F0B8AE>


- 2 -

퇴좈저널36호-4차-T.ps, page Preflight (2)

歯김병철.PDF

Transcription:

File Systems for Flash Memories (jinsoo@cs.kaist.ac.kr)

Outline Introduction to Flash Memories File Systems for Flash Memories JFFS/JFFS2 LFFS

Introduction to Flash Memories

Memory Types EPROM FLASH High-density Low-cost High-speed Low-power High reliability Non-volatile High-density Ultraviolet light for erasure EEPROM Non-volatile Lower reliability Higher cost Lowest density Electrically byte-erasable DRAM High-density Low-cost High-speed High-power ROM High-density Reliable Low-cost Suitable for high production with stable code Source: Intel Corporation.

Flash Memory Characteristics Operations Read Write or Program change state from 1 to 0 Erase change state from 0 to 1 1 1 1 1 1 1 1 1 write 1 1 0 1 1 0 1 0 Unit Page (sector) management or program unit Block erase unit erase 1 1 1 1 1 1 1 1

NOR vs. NAND Flash (1) NOR Flash Random, direct access interface Fast random reads Slow erase and write Mainly for code storage Intel (28%), Spansion (25%), STMicro (13%), Samsung (7%), Toshiba (5%), Source: isuppli Corp. (Q2/2005) NAND Flash I/O mapped access Smaller cell size Lower cost Smaller size erase blocks Better performance for erase and write Mainly for data storage Samsung (55%), Toshiba (23%), Hynix (10%), Renesas (6%), STMicro (2%), Infineon (2%), Micron (2%)

NOR vs. NAND Flash (2) Mass Storage-NAND Memory Cards (mobile computers) Solid-State Disk (rugged & reliable storage) Digital Camera (still & moving pictures) Voice/Audio Recorder (near CD quality) Low Cost and High Density Good P/E Cycling Endurance Code Memory-NOR BIOS/Networking (PC/router/hub) Telecommunications (switcher) Cellular Phone (code & data) POS / PDA / PCA (code & data) Fast Random Access XIP Source: Samsung Electronics

NOR vs. NAND Flash (3) Access times comparison Media Read Write Erase DRAM 60ns (2B) 2.56us (512B) 60ns (2B) 2.56us (512B) - NOR Flash 150ns (2B) 14.4us (512B) 211ns (2B) 3.53ms (512B) 1.2s (128KB) NAND Flash 10.2us (2B) 35.9us (512B) 201us (2B) 226us (512B) 2ms (16KB) Disk 12.4ms (512B) (average) 12.4ms (512B) (average) -

Flash: Beauty and the Beast Flash memory is a beauty. Small, light-weight, robust, low-cost, low-power nonvolatile device Flash memory is a beast. Much slower program/erase operations No in-place-update Erase unit > write unit Limited lifetime (100K~1M program/erase cycles) Bad blocks (for NAND), Software support for flash memory is very important for performance & reliability.

NAND Flash Memory NAND Flash memory structure Small Block NAND: (512+16)B/page, 32pages/block Large Block NAND: (2K+64)B/page, 64pages/block Limited NOP (Number of Programming): Usually 4

NAND Flash-based Storage (1) Flash cards CompactFlash, MMC, SD/miniSD, Memory Stick, xd, Source: IDC (from http://www.bitmicro.com)

NAND Flash-based Storage (2) Flash SSDs (Solid State Disks) M-Systems FFD (Fast Flash Disk) 2.5 Solid-state flash disk in a 2.5 disk Up to 90GB ATA-6: interface speed of 100MB/s 40MB/s sustained read/write rates Released: March 10, 2004 ~$40,000 for 90GB BiTMICRO E-Disk Battery-backed DRAM + NAND Flash Samsung Flash SSDs

NAND Flash-based Storage (3) OKI Wireless Base Station In-Flight Entertainment Patriot Air and Missile Defense System Advanced Amphibious Assault Vehicle F-18 Hornet Nortel Networks Optical Switches Eurocopter AS 532U2 Cougar

NAND Flash-based Storage (4) Flash-embedded devices Handheld phones MP3 players PMPs PDAs Digital TVs Set-top boxes Car navigation & entertainment systems

File Systems for Flash Memories

Storage: A Logical View Abstraction given by block device drivers: 512B 512B 512B 0 1 N-1 Operations Identify(): returns N Read(start sector #, # of sectors) Write(start sector #, # of sectors) Source: Sang Lyul Min (Seoul National Univ.)

File System Basics (1) For each file, we have File contents (data) Nobody cares what they are. File attributes (metadata) File size Owner, access control lists Creation time, last access time, last modification time, File name File access begins with File name open ( /etc/passwd, O_RDONLY);

File System Basics (2) File system: A mapping problem <filename, data, metadata> <a set of blocks> a.out meta1 dog.jpg meta2 1 2 3 1 2 3 4 meta1 a.out 1 4 3 2 1 2 3 dog.jpg meta2

File System Basics (3) Goals Performance + Reliability Design Issues What information should be kept in metadata? How to locate metadata? Mapping from pathname to metadata How to locate data blocks? How to manage metadata and data blocks? Allocation, reclamation, free space management, etc. How to recover the file system after a crash?

File System Example Ext2 file system A disk-based file system for Linux Similar to UNIX Fast File System (FFS) Evolved to Ext3 File system (with journaling) Directory: pathname metadata (i-node) Direct/indirect block pointers: i-node data blocks Boot Block Block group 0 Block group n Super Group Data block i-node i-node Data Block Descriptors Bitmap Bitmap Table blocks 1 block n blocks 1 block 1 block n block n blocks

Flash File Systems Disks vs. NAND Flash No seek time Asymmetric read/write cost No in-place-update Wear-leveling Approaches to flash file systems Layered approach Block device emulation using FTL (Flash Translation Layer) Native (or cross-layer) approach

Layered Approach (1) Flash Translation Layer (FTL) A software layer to make NAND flash fully emulate magnetic disks. Sector mapping Garbage collection Power-off recovery Bad block management Wear-leveling Error correction code (ECC) Power management

Layered Approach (2) Page mapping in FTL Logical Sector Number Physical Sector Number Flash 0 (-1,-1) Block 0 Page 0 sector 6 1 2 (1,2) (1,0) Page 1 Page 2 3 (0,0) Page 3 4 5 (-1,-1) (1,3) Block 1 Page 0 6 (0,2) Page 1 7 (-1,-1) Page 2 Page 3

Layered Approach (3) Block mapping in FTL Logical Block Number Physical Block Number Flash sector 6 = 0 1 1 0 0-1 Block 0 Page 0 Page 1 1 0 Page 2 Page 3 Block 1 Page 0 Page 1 Logical offset = Physical offset Page 2 Page 3

Layered Approach (4) Benefits Easy to deploy. No modification is required for upper layers. Legacy file systems or swap space can be built. Flash cards or flash SSDs already come with FTL. Limitations Most FTLs are patented. FTL can not make use of kernel-level information. Kernel is not aware of the presence of flash memory.

Layered Approach (5) What happens on file deletion? File: abc.txt 1 2 3 4 5 6 Datablock bitmap Superblk 01 00 00 10 10 11.. i-node for abc.txt 1 2 6 5 3 4 Logical blocks i-node bitmap 1 2 6 5 4 3 Raw NAND Flash Memory

Native Approach Cross-layer optimization Kernel manages raw flash memory directly. More opportunities to optimize the performance. Kernel is involved in some FTL functionalities. Sector mapping, garbage collection, wear-leveling, power-off recovery, etc. Example: Flash-aware file systems: JFFS/JFFS2, YAFFS Limitations Need to change the host operating system Only applicable Flash-embedded devices

JFFS/JFFS2

JFFS (1) JFFS (Journaling Flash File Systems) Developed by Axis Communications, Sweden in 1999. Released under GNU GPL Designed for small NOR flashes A log-structured file system Any file system modification is appended to the log. The log is the only data structure on the flash media. Log = <metadata, (name), (data)> A file is obsoleted by a later log in whole or in part. Obsoleted logs are reclaimed via garbage collection. Rely on special in-core data structures for filename metadata, metadata data mappings.

JFFS (2) JFFS architecture metadata name data jffs_raw_inode magic ino pino version mode uid gid atime mtime ctime offset dsize rsize nsize nlink flags dchksum nchksum chksum : magic number : inode number : parent inode number : version number : file s type or mode : file s owner and group : last access time : last modification time : creation time : where to begin to write : size of the node s data : how much are going to be replaced? : name length, number of links, flags for rename/deleted/accurate : checksum for the data : checksums for the name and the raw inode

JFFS (3) Garbage collection The free space is eventually exhausted. Now what? Erase the oldest block in the log. head head tail tail Live nodes should be moved. Perfectly wear-leveled.

JFFS2 (1) JFFS limitations Poor garbage collection performance A block is garbage collected even if it contains only clean nodes. In many cases, there are static data. (libraries, program executables, etc.) No compression support Flash memories are expensive. No support for hard links File name and parent i-node are stored in each i-node. No support for NAND flashes

JFFS2 (2) Node types JFFS2_NODETYPE_INODE Similar to jffs_raw_inode No filename, no parent i-node number Compression support JFFS2_NODETYPE_DIRENT Represent a directory entry, or a link File name, i-node, parent i-node (directory s i-node), etc. File name with i-node = 0: deleted file JFFS2_NODETYPE_CLEANMARKER To deal with the problem of partially-erased blocks due to the power failure during erase operation

JFFS2 (3) JFFS2 architecture jffs2_inode_cache next_in_ino next_phys offset totlen jffs2_raw_node_ref raw next version ino = 10 nhash type a next_in_ino next_phys offset totlen raw next version ino = 20 nhash type b scan_dents next nodes ino = 3 nlink status jffs2_full_dirent next_in_ino next_phys offset totlen next_in_ino next_phys offset totlen scan_dents next nodes ino = 20 nlink status next_in_ino next_phys offset totlen next_in_ino next_phys offset totlen next_in_ino next_phys offset totlen scan_dents next nodes ino = 10 nlink status a, i10 i10, #1 b, i20 i20, #1 i10, #2 i20, #2 i10, #3

JFFS2 (4) What happens on mount: Physically scan the whole flash media Check CRC Build in-core data structures» jffs2_raw_node_ref, jffs2_inode_cache, jffs2_full_dirent, etc. Scan the directory tree, calculating nlink for each inode Scan for inodes with nlink == 0 and remove them Free temporary data structures e.g., jffs2_full_dirent

JFFS2 (5) Block lists free_list: empty blocks clean_list: blocks full of valid nodes dirty_list: blocks containing at least one obsoleted node Garbage collection Invoked if the size of free_list is less than the threshold. Which blocks? 99% from dirty_list (jiffies % 100!= 0) 1% from clean_list (for wear-leveling) Small nodes can be merged by GC.

JFFS2 (6) JFFS2 limitations Large memory consumption In-core data structures» jffs2_raw_node_ref (16bytes/node), jffs2_inode_cache Slow mount time 4 sec for 4MB! Runtime overheads (space & time) Build child directory entries from flash on directory access Build node fragments on file access All the inode s nodes should be examined (with CRC checked) Do not utilize NAND OOB area

JFFS2 (7) JFFS2 memory consumption example JFFS2 with 64MB NAND flash Typical Linux root FS: 2.2MB (719 directories, 2995 regular files) 64MB file with 512bytes/node: 6.7MB 64MB file with 10bytes/node: 47.6MB JFFS2 with 1GB NAND flash (estimated) Typical Linux root FS: 34.7MB 64MB file with 512bytes/node: 104.2MB 64MB file with 10bytes/node: 743.6MB (Source: JFFS3 Design Issues, June 4, 2005)

LFFS

Design Objectives LFFS (Log-structured Flash File System) A file system for large block NAND flash memories running over Linux MTD Scalable file system Supporting up to several GB Fast mount Small memory footprint Comparable performance to JFFS2

LFFS Approach (1) Back to the original LFS design VFS-compliant metadata structure and caching Fast mount and recovery using checkpoints

LFFS Approach (2) LFFS differences Use multiple checkpoint blocks LFS: small (two) fixed checkpoint slots Avoid wear-out of checkpoint region Introduce Indirect inode map block Points to the locations of inode map blocks Reduces the size of checkpoint data Make use of OOB area in NAND flash Segment summary info.

LFFS (1) LFFS data structures Checkpoint Block Indirect Inode map Indirect Inode map Inode Map Block Inode Map Block Inode Inode Inode File Information Data Block Direct Pointer Indirect Double Triple Indirect Block Data Block

LFFS (2) LFFS architecture

LFFS (3) Blocks in LFFS Inode block Indirect block Data (directory) block Inodemap Points to inode positions within flash memory Indirect inode map Points to inode map blocks (fully cached, 128KB) Checkpoint block OOB data Bad block indicator, next log block, ECC Inode number and offset for recovery

LFFS (4) Multiple checkpoint blocks Checkpoint area Recovery data and file system metadata 256KB or 512KB Log Area Checkpoint Area Super Block Flash Block Flash Block Flash Block Flash Block Flash Block Flash Block Flash Block Flash Block Total lifetime (checkpointing at every 15sec) 15sec * (512KB/1KB * 100,000) 24 years

LFFS Performance (1) Mount time 16 JFFS 0MB 30 JFFS2 LFFS 14 JFFS 20MB JFFS 40MB 25 12 LFFS 40MB Mount time (sec) 10 8 6 4 Mount Time (sec) 20 15 10 2 5 0 64 128 256 0 0 20 40 60 80 100 Flash Memory Size (MB) Data Size (MB) (256MB total)

LFFS Performance (2) File read and write performance Raw performance with nandsim Read: 4.51MB/s Write: 2.02MB/s LFFS performance Read: 4.31MB/s Write: 1.98MB/s Bandwidth (MB/s) 5 4.5 4 3.5 3 2.5 2 1.5 1 JFFS2 LFFS 0.5 0 Read Write

Conclusion Flash-aware file system has many opportunities. JFFS2 has drawbacks. Large memory footprints Slow mount time No clear winner, yet. Is an LFS-style file system the answer?