
Parallel Programming with MPI
KISTI Supercomputing Center

Objective: to enable participants to write message-passing parallel programs using MPI.

Contents
1. Introduction to MPI
2. Basics of parallel programming with MPI
3. Parallel programming with MPI in practice
4. Example MPI parallel programs
Appendix: MPI-2 / glossary / references

Chapter 1: Introduction to MPI
Introduces MPI and the basic concepts needed to understand it.

Message Passing (1/2)
[Figure: serial execution (S, P1..P4, S) compared with message-passing execution, where the parallel sections P1..P4 run as separate processes on separate nodes and data is transmitted over the interconnect.]

Message Passing (2/2)
- A communication model in which processes, each with its own local memory, share data by sending and receiving messages (data).
- The programmer is responsible for everything needed for parallelization: work assignment, data distribution, and management of communication. Difficult, but very flexible.
- Can be implemented on a wide range of hardware platforms: distributed-memory multiprocessor systems, shared-memory multiprocessor systems, and single-processor systems.
- Message-passing libraries: MPI, PVM, Shmem.

What Is MPI?
- Message Passing Interface.
- A standardized data-communication library for message-passing parallel programming.
- MPI-1 standard established (MPI Forum): 1994.
- MPI-2 released: 1997.

Goals of MPI
- Portability
- Efficiency
- Functionality

Basic Concepts of MPI (1/4)
[Figure: several processes inside a communicator exchanging a message identified by a tag.]

Basic Concepts of MPI (2/4)
- Process and processor: MPI assigns work per process; the processor-to-process mapping is one-to-one or one-to-many.
- Message ( = data + envelope (Envelope) ). The envelope answers: which process sends it, where the data to be sent is located, what data is sent, how much is sent, which process receives it, where it will be stored, and how much the receiver must be prepared to accept.

Basic Concepts of MPI (3/4)
- Tag: used to match and distinguish messages, so they can be processed in order of arrival; wildcards may be used.
- Communicator: the set of processes that are allowed to communicate with one another.
- Rank: an identifier distinguishing the processes within the same communicator.

Basic Concepts of MPI (4/4)
- Point-to-point communication (Point to Point Communication): communication between exactly two processes; one sending process matched by one receiving process.
- Collective communication (Collective Communication): several processes participate at once; one-to-many, many-to-one, and many-to-many patterns are possible; replaces a series of point-to-point calls with a single collective call — fewer chances for error, and generally faster because it is optimized.

Chapter 2: Basics of Parallel Programming with MPI
Covers the basics of writing MPI parallel programs: communication, the use of derived datatypes, and virtual topologies.

- Basic structure of an MPI program
- Communicators
- Messages
- MPI datatypes

Basic Structure of an MPI Program
  include MPI header file
  variable declarations
  initialize the MPI environment
  ... do computation and MPI communication calls ...
  close MPI environment

MPI Header File
- Fortran: INCLUDE 'mpif.h'
- C: #include "mpi.h"
- Declares the prototypes of the MPI subroutines and functions; defines macros, MPI-related arguments, and datatypes.
- Location (IBM PE): /usr/lpp/ppe.poe/include/

MPI Handles
- Pointer-like variables used to refer to MPI's internal data structures.
- In C, handles have special datatypes defined by typedef: MPI_Comm, MPI_Datatype, MPI_Request, ...
- In Fortran, handles are of type INTEGER.

Calling MPI Routines and Their Return Values (1/2)
Fortran:
  Format:     CALL MPI_XXXXX(parameter, ..., ierr)
  Example:    CALL MPI_INIT(ierr)
  Error code: returned in the ierr parameter; MPI_SUCCESS if successful.
C:
  Format:     err = MPI_Xxxxx(parameter, ...);  or  MPI_Xxxxx(parameter, ...);
  Example:    err = MPI_Init(&argc, &argv);
  Error code: returned as err; MPI_SUCCESS if successful.

Calling MPI Routines and Their Return Values (2/2)
- An MPI routine returns an error code indicating whether the call succeeded.
- On success it returns the integer constant MPI_SUCCESS.
- In Fortran, the last integer argument of a subroutine carries the error code.
- MPI_SUCCESS is declared in the header file.

Fortran:
  INTEGER ierr
  CALL MPI_INIT(ierr)
  IF (ierr .EQ. MPI_SUCCESS) THEN
    ...
  ENDIF

C:
  int err;
  err = MPI_Init(&argc, &argv);
  if (err == MPI_SUCCESS) {
    ...
  }

MPI Initialization
Fortran: CALL MPI_INIT(ierr)
C:       int MPI_Init(int *argc, char ***argv)
- Initializes the MPI environment.
- Must be the first MPI routine called, and must be called exactly once.

Communicators (1/3)
- A handle representing a set of processes that can communicate with one another.
- Every MPI communication routine takes a communicator argument.
- Only processes that share a communicator can communicate.
- MPI_COMM_WORLD: the communicator containing all processes available when the program starts; it is defined when MPI_Init is called.

Communicators (2/3)
- Process rank: the identification number of a process within a communicator. With n processes, ranks 0 through n-1 are assigned; ranks identify the sender and receiver of a message.
Getting the rank:
  Fortran: CALL MPI_COMM_RANK(comm, rank, ierr)
  C:       int MPI_Comm_rank(MPI_Comm comm, int *rank)
- Returns in the argument rank the rank, within communicator comm, of the process that called the routine.

Communicators (3/3)
- Communicator size: the total number of processes contained in a communicator.
Getting the size:
  Fortran: CALL MPI_COMM_SIZE(comm, size, ierr)
  C:       int MPI_Comm_size(MPI_Comm comm, int *size)
- Returns the size of communicator comm through the argument size.

Terminating an MPI Program
Fortran: CALL MPI_FINALIZE(ierr)
C:       int MPI_Finalize(void);
- Cleans up all MPI data structures.
- Must be called last, once, by every process.
- Does not itself terminate the process.

bones.f
  PROGRAM skeleton
  INCLUDE 'mpif.h'
  INTEGER ierr, rank, size
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)
  ! your code here
  CALL MPI_FINALIZE(ierr)
  END

bones.c
  /* program skeleton */
  #include "mpi.h"
  void main(int argc, char *argv[]){
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    /* your code here */
    MPI_Finalize();
  }

MPI Messages (1/2)
- Message = data + envelope.
- Data (buffer, count, datatype):
  buffer: the variable holding the data to receive (send);
  count: the number of data elements to receive (send);
  datatype: the MPI datatype of those elements.
- Envelope (destination (source), tag, communicator):
  destination (source): rank of the receiving (sending) process;
  tag: an integer identifying the data to receive (send) — valid values run from 0 up to an implementation-defined upper bound (MPI_TAG_UB);
  communicator: the process group containing the sending and receiving processes.

MPI Messages (2/2)
- MPI data consist of an array of elements of a particular MPI datatype.
- MPI datatypes: basic types and derived types (derived type).
- Derived types can be built from basic types or from other derived types.
- C datatypes and Fortran datatypes are not the same.
- The send and receive datatypes must match.

MPI Basic Datatypes (1/2)
MPI Data Type          | Fortran Data Type
MPI_INTEGER            | INTEGER
MPI_REAL               | REAL
MPI_DOUBLE_PRECISION   | DOUBLE PRECISION
MPI_COMPLEX            | COMPLEX
MPI_LOGICAL            | LOGICAL
MPI_CHARACTER          | CHARACTER(1)
MPI_BYTE               |
MPI_PACKED             |

MPI Basic Datatypes (2/2)
MPI Data Type          | C Data Type
MPI_CHAR               | signed char
MPI_SHORT              | signed short int
MPI_INT                | signed int
MPI_LONG               | signed long int
MPI_UNSIGNED_CHAR      | unsigned char
MPI_UNSIGNED_SHORT     | unsigned short int
MPI_UNSIGNED           | unsigned int
MPI_UNSIGNED_LONG      | unsigned long int
MPI_FLOAT              | float
MPI_DOUBLE             | double
MPI_LONG_DOUBLE        | long double
MPI_BYTE               |
MPI_PACKED             |

Point-to-Point Communication and Communication Modes
- Blocking communication
- Non-blocking communication
- One-way and two-way communication

Point-to-Point Communication (1/2)
[Figure: within a communicator, one process (source) sends a message to another (destination).]
- Exactly two processes participate in the communication.
- Communication takes place only within a communicator.
- The communicator and the ranks identify the sending and receiving processes.

Point-to-Point Communication (2/2)
- Completion of communication means the memory locations used in the transfer can again be accessed safely. Send: the send variable may be reused once the communication has completed. Receive: the receive variable may be used once the communication has completed.
- Blocking vs. non-blocking communication. Blocking: the routine returns only after the communication has completed. Non-blocking: the routine returns as soon as the communication has started, regardless of completion; completion is checked afterwards.
- Communication modes are classified by the conditions required for completion.

Communication Modes
Mode             | Blocking call | Non-blocking call
Synchronous send | MPI_SSEND     | MPI_ISSEND
Ready send       | MPI_RSEND     | MPI_IRSEND
Buffered send    | MPI_BSEND     | MPI_IBSEND
Standard send    | MPI_SEND      | MPI_ISEND
Receive          | MPI_RECV      | MPI_IRECV

Synchronous Send: MPI_SSEND (blocking synchronous send)
[Figure: the sending task waits until the receiver posts MPI_RECV; the data transfer from the source then proceeds, and the receiving task waits until its buffer is filled.]
- Send start: begins regardless of whether a matching receive has been posted.
- Transfer: starts once the receiving side is ready to receive.
- Send completion: the receive routine has started receiving the message and the transfer has finished.
- The safest form of communication.
- A non-local send mode.

Ready Send: MPI_RSEND (blocking ready send)
[Figure: the data transfer from the source starts immediately; the receiving task waits until its buffer is filled.]
- Starts the send assuming the receiving side is already prepared to receive.
- Sending when no receive has been posted is an error.
- Advantageous for performance.
- A non-local send mode.

Buffered Send: MPI_BSEND (buffered send)
[Figure: the data is copied into a user-supplied buffer and transferred from there; the receiving task waits in MPI_RECV until the data arrives.]
- Send start: begins regardless of whether a matching receive has been posted.
- Send completion: complete as soon as the copy into the buffer finishes, independent of the receive.
- The user manages the buffer space directly: MPI_BUFFER_ATTACH / MPI_BUFFER_DETACH.
- A local send mode.

Standard Send: MPI_SEND
- Direct copy: send buffer → receive buffer, or
- Buffered: send buffer → system buffer → receive buffer.
- Behaves differently depending on circumstances.
- No buffer management required.
- Send completion does not necessarily mean the message has arrived.
- A non-local send mode.

Blocking Send: Standard
C:       int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
Fortran: MPI_SEND(buf, count, datatype, dest, tag, comm, ierr)
- (CHOICE) buf: starting address of the send buffer (IN)
- INTEGER count: number of elements to send (IN)
- INTEGER datatype: MPI datatype of each element (handle) (IN)
- INTEGER dest: rank of the receiving process (IN); MPI_PROC_NULL if no communication is needed
- INTEGER tag: message tag (IN)
- INTEGER comm: MPI communicator (handle) (IN)
Example: CALL MPI_SEND(a, 5, MPI_REAL, 5, 1, MPI_COMM_WORLD, ierr)

Blocking Receive (1/4)
C:       int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
Fortran: MPI_RECV(buf, count, datatype, source, tag, comm, status, ierr)
- (CHOICE) buf: starting address of the receive buffer (OUT)
- INTEGER count: number of elements to receive (IN)
- INTEGER datatype: MPI datatype of each element (handle) (IN)
- INTEGER source: rank of the sending process (IN); MPI_PROC_NULL if no communication is needed
- INTEGER tag: message tag (IN)
- INTEGER comm: MPI communicator (handle) (IN)
- INTEGER status(MPI_STATUS_SIZE): stores information about the received message (OUT)
Example: CALL MPI_RECV(a, 5, MPI_REAL, 0, 1, MPI_COMM_WORLD, status, ierr)

Blocking Receive (2/4)
- The receiver may use wildcards:
- Receive a message from any process: MPI_ANY_SOURCE.
- Receive a message with any tag: MPI_ANY_TAG.

Blocking Receive (3/4)
- Information stored in the receiver's status argument: the sending process, the tag, and the data size (obtained with MPI_GET_COUNT).
Information | Fortran            | C
source      | status(MPI_SOURCE) | status.MPI_SOURCE
tag         | status(MPI_TAG)    | status.MPI_TAG
count       | MPI_GET_COUNT      | MPI_Get_count

Blocking Receive (4/4)
- MPI_GET_COUNT: returns the number of elements in the received message.
C:       int MPI_Get_count(MPI_Status *status, MPI_Datatype datatype, int *count)
Fortran: MPI_GET_COUNT(status, datatype, count, ierr)
- INTEGER status(MPI_STATUS_SIZE): status of the received message (IN)
- INTEGER datatype: datatype of each element (IN)
- INTEGER count: number of elements (OUT)

Blocking Communication Example: Fortran
  PROGRAM isend
  INCLUDE 'mpif.h'
  INTEGER err, rank, size, count
  REAL data(100), value(200)
  INTEGER status(MPI_STATUS_SIZE)
  CALL MPI_INIT(err)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD,rank,err)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD,size,err)
  IF (rank.EQ.0) THEN
    data=3.0
    CALL MPI_SEND(data,100,MPI_REAL,1,55,MPI_COMM_WORLD,err)
  ELSEIF (rank.EQ.1) THEN
    CALL MPI_RECV(value,200,MPI_REAL,MPI_ANY_SOURCE,55,
 &                MPI_COMM_WORLD,status,err)
    PRINT *, "P:",rank," got data from processor ",
 &           status(MPI_SOURCE)
    CALL MPI_GET_COUNT(status,MPI_REAL,count,err)
    PRINT *, "P:",rank," got ",count," elements"
    PRINT *, "P:",rank," value(5)=",value(5)
  ENDIF
  CALL MPI_FINALIZE(err)
  END

Blocking Communication Example: C
  #include <stdio.h>
  #include <mpi.h>
  void main(int argc, char *argv[]) {
    int rank, i, count;
    float data[100], value[200];
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
      for (i = 0; i < 100; ++i) data[i] = i;
      MPI_Send(data, 100, MPI_FLOAT, 1, 55, MPI_COMM_WORLD);
    }
    else if (rank == 1) {
      MPI_Recv(value, 200, MPI_FLOAT, MPI_ANY_SOURCE, 55,
               MPI_COMM_WORLD, &status);
      printf("P:%d Got data from processor %d \n", rank, status.MPI_SOURCE);
      MPI_Get_count(&status, MPI_FLOAT, &count);
      printf("P:%d Got %d elements \n", rank, count);
      printf("P:%d value[5]=%f \n", rank, value[5]);
    }
    MPI_Finalize();
  }

Points to Watch for Successful Communication
- The sender must specify the receiver's rank clearly.
- The receiver must specify the sender's rank clearly.
- The communicators must be the same.
- The message tags must match.
- The receive buffer must be large enough.

Non-Blocking Communication
Communication is divided into three stages:
1. Initiate the non-blocking communication: post a send or a receive.
2. Perform other work that does not use the data being transferred: overlap communication with computation.
3. Complete the communication: wait or test.
- Eliminates the possibility of deadlock and reduces communication overhead.

Initiating Non-Blocking Communication
C:       int MPI_Isend(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request *request)
Fortran: MPI_ISEND(buf, count, datatype, dest, tag, comm, request, ierr)
C:       int MPI_Irecv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Request *request)
Fortran: MPI_IRECV(buf, count, datatype, source, tag, comm, request, ierr)
- INTEGER request: identifies the initiated communication (handle) (OUT)
- A non-blocking receive has no status argument.

Completing Non-Blocking Communication
- By waiting (waiting) or testing (testing).
- Wait: the routine blocks the process until the communication completes; non-blocking communication + wait = blocking communication.
- Test: the routine returns true or false according to whether the communication has completed.

Wait
C:       int MPI_Wait(MPI_Request *request, MPI_Status *status)
Fortran: MPI_WAIT(request, status, ierr)
- INTEGER request: identifies the posted communication (handle) (INOUT)
- INTEGER status(MPI_STATUS_SIZE): information about the received message, or the error code of a send routine (OUT)

Test
C:       int MPI_Test(MPI_Request *request, int *flag, MPI_Status *status)
Fortran: MPI_TEST(request, flag, status, ierr)
- INTEGER request: identifies the posted communication (handle) (INOUT)
- LOGICAL flag: returns true if the communication has completed, false otherwise (OUT)
- INTEGER status(MPI_STATUS_SIZE): information about the received message, or the error code of a send routine (OUT)

Non-Blocking Communication Example: Fortran
  PROGRAM isend
  INCLUDE 'mpif.h'
  INTEGER err, rank, count, req
  REAL data(100), value(100)
  INTEGER status(MPI_STATUS_SIZE)
  CALL MPI_INIT(err)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD,rank,err)
  IF (rank.EQ.0) THEN
    data=3.0
    CALL MPI_ISEND(data,100,MPI_REAL,1,55,MPI_COMM_WORLD,req,err)
    CALL MPI_WAIT(req, status, err)
  ELSE IF (rank.EQ.1) THEN
    CALL MPI_IRECV(value,100,MPI_REAL,0,55,MPI_COMM_WORLD,req,err)
    CALL MPI_WAIT(req, status, err)
    PRINT *, "P:",rank," value(5)=",value(5)
  ENDIF
  CALL MPI_FINALIZE(err)
  END

Non-Blocking Communication Example: C
  /* isend */
  #include <stdio.h>
  #include <mpi.h>
  void main(int argc, char *argv[]) {
    int rank, i;
    float data[100], value[100];
    MPI_Request req;
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
      for (i = 0; i < 100; ++i) data[i] = i;
      MPI_Isend(data, 100, MPI_FLOAT, 1, 55, MPI_COMM_WORLD, &req);
      MPI_Wait(&req, &status);
    }
    else if (rank == 1) {
      MPI_Irecv(value, 100, MPI_FLOAT, 0, 55, MPI_COMM_WORLD, &req);
      MPI_Wait(&req, &status);
      printf("P:%d value[5]=%f \n", rank, value[5]);
    }
    MPI_Finalize();
  }

Using Point-to-Point Communication
- One-way and two-way communication.
- Two-way communication must beware of deadlock.
[Figure: one-way — rank 0's sendbuf goes to rank 1's recvbuf; two-way — each of rank 0 and rank 1 both sends from its sendbuf and receives into its recvbuf.]

One-Way Communication (1/2)
- Blocking send, blocking receive:
  IF (myrank==0) THEN
    CALL MPI_SEND(sendbuf, icount, MPI_REAL, 1, itag, MPI_COMM_WORLD, ierr)
  ELSEIF (myrank==1) THEN
    CALL MPI_RECV(recvbuf, icount, MPI_REAL, 0, itag, MPI_COMM_WORLD, istatus, ierr)
  ENDIF
- Non-blocking send, blocking receive:
  IF (myrank==0) THEN
    CALL MPI_ISEND(sendbuf, icount, MPI_REAL, 1, itag, MPI_COMM_WORLD, ireq, ierr)
    CALL MPI_WAIT(ireq, istatus, ierr)
  ELSEIF (myrank==1) THEN
    CALL MPI_RECV(recvbuf, icount, MPI_REAL, 0, itag, MPI_COMM_WORLD, istatus, ierr)
  ENDIF

One-Way Communication (2/2)
- Blocking send, non-blocking receive:
  IF (myrank==0) THEN
    CALL MPI_SEND(sendbuf, icount, MPI_REAL, 1, itag, MPI_COMM_WORLD, ierr)
  ELSEIF (myrank==1) THEN
    CALL MPI_IRECV(recvbuf, icount, MPI_REAL, 0, itag, MPI_COMM_WORLD, ireq, ierr)
    CALL MPI_WAIT(ireq, istatus, ierr)
  ENDIF
- Non-blocking send, non-blocking receive:
  IF (myrank==0) THEN
    CALL MPI_ISEND(sendbuf, icount, MPI_REAL, 1, itag, MPI_COMM_WORLD, ireq, ierr)
  ELSEIF (myrank==1) THEN
    CALL MPI_IRECV(recvbuf, icount, MPI_REAL, 0, itag, MPI_COMM_WORLD, ireq, ierr)
  ENDIF
  CALL MPI_WAIT(ireq, istatus, ierr)

Two-Way Communication (1/9)
- Send first, then receive, case 1: may deadlock depending on the message size.
  IF (myrank==0) THEN
    CALL MPI_SEND(sendbuf, ...)
    CALL MPI_RECV(recvbuf, ...)
  ELSEIF (myrank==1) THEN
    CALL MPI_SEND(sendbuf, ...)
    CALL MPI_RECV(recvbuf, ...)
  ENDIF

Two-Way Communication (2/9)
- Send first, then receive, case 2 (same behavior as case 1):
  IF (myrank==0) THEN
    CALL MPI_ISEND(sendbuf, ..., ireq, ...)
    CALL MPI_WAIT(ireq, ...)
    CALL MPI_RECV(recvbuf, ...)
  ELSEIF (myrank==1) THEN
    CALL MPI_ISEND(sendbuf, ..., ireq, ...)
    CALL MPI_WAIT(ireq, ...)
    CALL MPI_RECV(recvbuf, ...)
  ENDIF

Two-Way Communication (3/9)
- Send first, then receive, case 3: no deadlock, regardless of message size.
  IF (myrank==0) THEN
    CALL MPI_ISEND(sendbuf, ..., ireq, ...)
    CALL MPI_RECV(recvbuf, ...)
    CALL MPI_WAIT(ireq, ...)
  ELSEIF (myrank==1) THEN
    CALL MPI_ISEND(sendbuf, ..., ireq, ...)
    CALL MPI_RECV(recvbuf, ...)
    CALL MPI_WAIT(ireq, ...)
  ENDIF

Two-Way Communication (4/9)
- Check whether deadlock occurs depending on the size of the transferred data.
- Use non-blocking communication to avoid deadlock.
  INTEGER N
  PARAMETER (N=4)
  REAL a(N), b(N)
  IF ( myrank .EQ. 0 ) THEN
    CALL MPI_SEND( a, N, ... )
    CALL MPI_RECV( b, N, ... )
  ELSE IF ( myrank .EQ. 1 ) THEN
    CALL MPI_SEND( a, N, ... )
    CALL MPI_RECV( b, N, ... )
  ENDIF

Two-Way Communication (5/9)
- Receive first, then send, case 1: deadlocks regardless of message size.
  IF (myrank==0) THEN
    CALL MPI_RECV(recvbuf, ...)
    CALL MPI_SEND(sendbuf, ...)
  ELSEIF (myrank==1) THEN
    CALL MPI_RECV(recvbuf, ...)
    CALL MPI_SEND(sendbuf, ...)
  ENDIF

Two-Way Communication (6/9)
- Receive first, then send, case 2: no deadlock, regardless of message size.
  IF (myrank==0) THEN
    CALL MPI_IRECV(recvbuf, ..., ireq, ...)
    CALL MPI_SEND(sendbuf, ...)
    CALL MPI_WAIT(ireq, ...)
  ELSEIF (myrank==1) THEN
    CALL MPI_IRECV(recvbuf, ..., ireq, ...)
    CALL MPI_SEND(sendbuf, ...)
    CALL MPI_WAIT(ireq, ...)
  ENDIF

Two-Way Communication (7/9)
- Confirm that deadlock occurs independent of the transferred data size.
- Use non-blocking communication to avoid the deadlock.
  REAL a(100), b(100)
  IF (myrank==0) THEN
    CALL MPI_RECV(b, 100, ...)
    CALL MPI_SEND(a, 100, ...)
  ELSE IF (myrank==1) THEN
    CALL MPI_RECV(b, 100, ...)
    CALL MPI_SEND(a, 100, ...)
  ENDIF

Two-Way Communication (8/9)
- One side sends first while the other receives first: no deadlock, whether blocking or non-blocking routines are used.
  IF (myrank==0) THEN
    CALL MPI_SEND(sendbuf, ...)
    CALL MPI_RECV(recvbuf, ...)
  ELSEIF (myrank==1) THEN
    CALL MPI_RECV(recvbuf, ...)
    CALL MPI_SEND(sendbuf, ...)
  ENDIF

Two-Way Communication (9/9)
- Recommended code:
  IF (myrank==0) THEN
    CALL MPI_ISEND(sendbuf, ..., ireq1, ...)
    CALL MPI_IRECV(recvbuf, ..., ireq2, ...)
  ELSEIF (myrank==1) THEN
    CALL MPI_ISEND(sendbuf, ..., ireq1, ...)
    CALL MPI_IRECV(recvbuf, ..., ireq2, ...)
  ENDIF
  CALL MPI_WAIT(ireq1, ...)
  CALL MPI_WAIT(ireq2, ...)

Collective Communication
- Broadcast (Broadcast)
- Gather (Gather)
- Reduce (Reduce)
- Scatter (Scatter)
- Barrier (Barrier)
- Others

Collective Communication (1/3)
- Communication in which a whole group of processes participates.
- Built on point-to-point communication.
- More convenient than implementations written with point-to-point calls, and advantageous for performance.
- Collective routines: called by every process in the communicator; synchronization is not guaranteed (except for MPI_BARRIER); there are no non-blocking versions; there are no tags.

Collective Communication (2/3)
Category                               | Subroutines
One buffer                             | MPI_BCAST
One send buffer and one receive buffer | MPI_GATHER, MPI_SCATTER, MPI_ALLGATHER, MPI_ALLTOALL, MPI_GATHERV, MPI_SCATTERV, MPI_ALLGATHERV, MPI_ALLTOALLV
Reduction                              | MPI_REDUCE, MPI_ALLREDUCE, MPI_SCAN, MPI_REDUCE_SCATTER
Others                                 | MPI_BARRIER, MPI_OP_CREATE, MPI_OP_FREE

Collective Communication (3/3)
[Figure: schematic data movement of each collective, where * is some operator — broadcast copies A from the root to every process; reduce combines A, B, C, D from processes 0-3 into A*B*C*D on the root; scatter distributes the root's blocks A, B, C, D to processes 0-3, and gather is its inverse; allreduce leaves A*B*C*D on every process; allgather leaves A, B, C, D on every process; scan leaves the partial results A, A*B, A*B*C, A*B*C*D on processes 0-3; alltoall sends block j of process i to block i of process j; reduce-scatter combines elementwise and scatters the result blocks.]

Broadcast: MPI_BCAST
C:       int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm)
Fortran: MPI_BCAST(buffer, count, datatype, root, comm, ierr)
- (CHOICE) buffer: starting address of the buffer (INOUT)
- INTEGER count: number of elements in the buffer (IN)
- INTEGER datatype: MPI datatype of the buffer elements (IN)
- INTEGER root: rank of the root process (IN)
- INTEGER comm: communicator (IN)
- Sends the same data from the root process to every other process in the communicator: one-to-many communication.

MPI_BCAST Example
[Figure: in MPI_COMM_WORLD, root rank 0 broadcasts the four MPI_INTEGER elements of imsg to ranks 1 and 2.]

MPI_BCAST Example: Fortran
  PROGRAM bcast
  INCLUDE 'mpif.h'
  INTEGER imsg(4)
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
  IF (myrank==0) THEN
    DO i=1,4
      imsg(i) = i
    ENDDO
  ELSE
    DO i=1,4
      imsg(i) = 0
    ENDDO
  ENDIF
  PRINT *, 'Before:',imsg
  CALL MPI_BCAST(imsg, 4, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
  PRINT *, 'After :',imsg
  CALL MPI_FINALIZE(ierr)
  END

MPI_BCAST Example: C
  /* broadcast */
  #include <mpi.h>
  #include <stdio.h>
  void main(int argc, char *argv[]) {
    int i, myrank;
    int imsg[4];
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    if (myrank == 0) for (i = 0; i < 4; i++) imsg[i] = i + 1;
    else             for (i = 0; i < 4; i++) imsg[i] = 0;
    printf("%d: BEFORE:", myrank);
    for (i = 0; i < 4; i++) printf(" %d", imsg[i]);
    printf("\n");
    MPI_Bcast(imsg, 4, MPI_INT, 0, MPI_COMM_WORLD);
    printf("%d: AFTER:", myrank);
    for (i = 0; i < 4; i++) printf(" %d", imsg[i]);
    printf("\n");
    MPI_Finalize();
  }

Gather: MPI_GATHER
C:       int MPI_Gather(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)
Fortran: MPI_GATHER(sendbuf, sendcount, sendtype, recvbuf, recvcount, recvtype, root, comm, ierr)
- (CHOICE) sendbuf: starting address of the send buffer (IN)
- INTEGER sendcount: number of elements in the send buffer (IN)
- INTEGER sendtype: MPI datatype of the send-buffer elements (IN)
- (CHOICE) recvbuf: address of the receive buffer (OUT)
- INTEGER recvcount: number of elements received from each process (IN)
- INTEGER recvtype: MPI datatype of the receive-buffer elements (IN)
- INTEGER root: rank of the receiving (root) process (IN)
- INTEGER comm: communicator (IN)
- Gathers the data sent by every process (including the root) and stores them in rank order: many-to-one communication.

MPI_GATHER: Cautions
- Take care that the send buffer (sendbuf) and the receive buffer (recvbuf) do not overlap in memory; that is, the same name must not be used for both. This applies to every collective that uses both a send buffer and a receive buffer.
- Every transferred block must be the same size.
- To gather blocks of different sizes, use MPI_GATHERV.

MPI_GATHER Example
[Figure: in MPI_COMM_WORLD, ranks 0-2 each send one MPI_INTEGER (isend); root rank 0 stores the three values in rank order in irecv.]

MPI_GATHER Example: Fortran
  PROGRAM gather
  INCLUDE 'mpif.h'
  INTEGER irecv(3)
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
  isend = myrank + 1
  CALL MPI_GATHER(isend, 1, MPI_INTEGER, irecv, 1, MPI_INTEGER,
 &                0, MPI_COMM_WORLD, ierr)
  IF (myrank==0) THEN
    PRINT *, 'irecv =',irecv
  ENDIF
  CALL MPI_FINALIZE(ierr)
  END

MPI_GATHER Example: C
  /* gather */
  #include <mpi.h>
  #include <stdio.h>
  void main(int argc, char *argv[]) {
    int i, nprocs, myrank;
    int isend, irecv[3];
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    isend = myrank + 1;
    MPI_Gather(&isend, 1, MPI_INT, irecv, 1, MPI_INT, 0, MPI_COMM_WORLD);
    if (myrank == 0) {
      printf("irecv =");
      for (i = 0; i < 3; i++) printf(" %d", irecv[i]);
      printf("\n");
    }
    MPI_Finalize();
  }

Gather: MPI_GATHERV
C:       int MPI_Gatherv(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int *recvcounts, int *displs, MPI_Datatype recvtype, int root, MPI_Comm comm)
Fortran: MPI_GATHERV(sendbuf, sendcount, sendtype, recvbuf, recvcounts, displs, recvtype, root, comm, ierr)
- (CHOICE) recvbuf: address of the receive buffer (OUT)
- INTEGER recvcounts(*): integer array whose i-th entry holds the number of elements received from process i (IN)
- INTEGER displs(*): integer array whose i-th entry gives the position, in the receive buffer, where the data arriving from process i is stored (IN)
- Used when the amount of data sent by each process differs.
- The differing message sizes are given in the array recvcounts; the array displs records where in the root process's buffer each block is placed.

MPI_GATHERV Example
[Figure: ranks 0, 1, 2 send 1, 2, and 3 elements respectively; on root rank 0, recvcounts = (1, 2, 3) and displs = (0, 1, 3), so recvbuf receives the six elements in rank order.]

MPI_GATHERV Example: Fortran
  PROGRAM gatherv
  INCLUDE 'mpif.h'
  INTEGER isend(3), irecv(6)
  INTEGER ircnt(0:2), idisp(0:2)
  DATA ircnt/1,2,3/ idisp/0,1,3/
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
  DO i=1,myrank+1
    isend(i) = myrank + 1
  ENDDO
  iscnt = myrank + 1
  CALL MPI_GATHERV(isend,iscnt,MPI_INTEGER,irecv,ircnt,idisp,
 &                 MPI_INTEGER,0,MPI_COMM_WORLD,ierr)
  IF (myrank==0) THEN
    PRINT *, 'irecv =',irecv
  ENDIF
  CALL MPI_FINALIZE(ierr)
  END

MPI_GATHERV Example: C
  /* gatherv */
  #include <mpi.h>
  #include <stdio.h>
  void main(int argc, char *argv[]) {
    int i, myrank;
    int isend[3], irecv[6];
    int iscnt, ircnt[3] = {1, 2, 3}, idisp[3] = {0, 1, 3};
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    for (i = 0; i < myrank + 1; i++) isend[i] = myrank + 1;
    iscnt = myrank + 1;
    MPI_Gatherv(isend, iscnt, MPI_INT, irecv, ircnt, idisp,
                MPI_INT, 0, MPI_COMM_WORLD);
    if (myrank == 0) {
      printf("irecv =");
      for (i = 0; i < 6; i++) printf(" %d", irecv[i]);
      printf("\n");
    }
    MPI_Finalize();
  }

Gather: MPI_ALLGATHER
C:       int MPI_Allgather(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, MPI_Comm comm)
Fortran: MPI_ALLGATHER(sendbuf, sendcount, sendtype, recvbuf, recvcount, recvtype, comm, ierr)
- (CHOICE) sendbuf: starting address of the send buffer (IN)
- INTEGER sendcount: number of elements in the send buffer (IN)
- INTEGER sendtype: MPI datatype of the send-buffer elements (IN)
- (CHOICE) recvbuf: address of the receive buffer (OUT)
- INTEGER recvcount: number of elements received from each process (IN)
- INTEGER recvtype: MPI datatype of the receive buffer (IN)
- INTEGER comm: communicator (IN)
- MPI_GATHER + MPI_BCAST: the data of process j is stored in the j-th block of every process's receive buffer.

MPI_ALLGATHER Example
[Figure: ranks 0-2 each contribute one element from sendbuf; every rank's recvbuf ends up holding all three elements in rank order.]

MPI_ALLGATHER Example: Fortran
  PROGRAM allgather
  INCLUDE 'mpif.h'
  INTEGER irecv(3)
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
  isend = myrank + 1
  CALL MPI_ALLGATHER(isend, 1, MPI_INTEGER,
 &                   irecv, 1, MPI_INTEGER, MPI_COMM_WORLD, ierr)
  PRINT *, 'irecv =', irecv
  CALL MPI_FINALIZE(ierr)
  END

MPI_ALLGATHER Example: C
  /* allgather */
  #include <mpi.h>
  #include <stdio.h>
  void main(int argc, char *argv[]) {
    int i, myrank;
    int isend, irecv[3];
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    isend = myrank + 1;
    MPI_Allgather(&isend, 1, MPI_INT, irecv, 1, MPI_INT, MPI_COMM_WORLD);
    printf("%d: irecv =", myrank);
    for (i = 0; i < 3; i++) printf(" %d", irecv[i]);
    printf("\n");
    MPI_Finalize();
  }

Gather: MPI_ALLGATHERV
C:       int MPI_Allgatherv(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int *recvcounts, int *displs, MPI_Datatype recvtype, MPI_Comm comm)
Fortran: MPI_ALLGATHERV(sendbuf, sendcount, sendtype, recvbuf, recvcounts, displs, recvtype, comm, ierr)
- Works like MPI_ALLGATHER, but is used to gather data of different sizes.

MPI_ALLGATHERV Example
[Figure: ranks 0, 1, 2 send 1, 2, and 3 elements respectively; with recvcounts = (1, 2, 3) and displs = (0, 1, 3), every rank's recvbuf receives all six elements in rank order.]

Reduction: MPI_REDUCE
C:       int MPI_Reduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)
Fortran: MPI_REDUCE(sendbuf, recvbuf, count, datatype, op, root, comm, ierr)
- (CHOICE) sendbuf: starting address of the send buffer (IN)
- (CHOICE) recvbuf: address of the receive buffer (OUT)
- INTEGER count: number of elements in the send buffer (IN)
- INTEGER datatype: MPI datatype of the send-buffer elements (IN)
- INTEGER op: reduction operator (IN)
- INTEGER root: rank of the root process (IN)
- INTEGER comm: communicator (IN)
- Collects data from every process, reduces them to a single value, and stores the result on the root process.

MPI_REDUCE: Operations and Datatypes (1/3)
Operation                                    | Data Type (Fortran)
MPI_SUM (sum), MPI_PROD (product)            | MPI_INTEGER, MPI_REAL, MPI_DOUBLE_PRECISION, MPI_COMPLEX
MPI_MAX (maximum), MPI_MIN (minimum)         | MPI_INTEGER, MPI_REAL, MPI_DOUBLE_PRECISION
MPI_MAXLOC (max value and location),         | MPI_INTEGER, MPI_REAL, MPI_DOUBLE_PRECISION
MPI_MINLOC (min value and location)          |
MPI_LAND (logical AND), MPI_LOR (logical OR),| MPI_LOGICAL
MPI_LXOR (logical XOR)                       |
MPI_BAND (bitwise AND), MPI_BOR (bitwise OR),| MPI_INTEGER, MPI_BYTE
MPI_BXOR (bitwise XOR)                       |

MPI_REDUCE: Operations and Datatypes (2/3)
Operation                                    | Data Type (C)
MPI_SUM (sum), MPI_PROD (product)            | MPI_INT, MPI_LONG, MPI_SHORT, MPI_UNSIGNED_SHORT, MPI_UNSIGNED, MPI_UNSIGNED_LONG, MPI_FLOAT, MPI_DOUBLE, MPI_LONG_DOUBLE
MPI_MAX (maximum), MPI_MIN (minimum)         | (same as above)
MPI_MAXLOC (max value and location),         | MPI_FLOAT_INT, MPI_DOUBLE_INT, MPI_LONG_INT, MPI_2INT, MPI_SHORT_INT, MPI_LONG_DOUBLE_INT
MPI_MINLOC (min value and location)          |
MPI_LAND (logical AND), MPI_LOR (logical OR),| MPI_INT, MPI_LONG, MPI_SHORT, MPI_UNSIGNED_SHORT, MPI_UNSIGNED, MPI_UNSIGNED_LONG
MPI_LXOR (logical XOR)                       |
MPI_BAND (bitwise AND), MPI_BOR (bitwise OR),| MPI_INT, MPI_LONG, MPI_SHORT, MPI_UNSIGNED_SHORT, MPI_UNSIGNED, MPI_UNSIGNED_LONG, MPI_BYTE
MPI_BXOR (bitwise XOR)                       |

MPI_REDUCE: Operations and Datatypes (3/3)
- Datatypes used in C for MPI_MAXLOC and MPI_MINLOC:
Data Type            | Description (C)
MPI_FLOAT_INT        | {MPI_FLOAT, MPI_INT}
MPI_DOUBLE_INT       | {MPI_DOUBLE, MPI_INT}
MPI_LONG_INT         | {MPI_LONG, MPI_INT}
MPI_2INT             | {MPI_INT, MPI_INT}
MPI_SHORT_INT        | {MPI_SHORT, MPI_INT}
MPI_LONG_DOUBLE_INT  | {MPI_LONG_DOUBLE, MPI_INT}

MPI_REDUCE: User-Defined Operations (1/2)
- Define a new operation (my_operator) in the following form:
C:
  void my_operator (void *invec, void *inoutvec, int *len,
                    MPI_Datatype *datatype)
Fortran:
  SUBROUTINE MY_OPERATOR(INVEC, INOUTVEC, LEN, DATATYPE)
  <type> INVEC(LEN), INOUTVEC(LEN)
  INTEGER LEN, DATATYPE

MPI_REDUCE: User-Defined Operations (2/2)
- Register the user-defined operation (here, registering my_operator as op).
- If the argument commute is true, the reduction can be performed faster.
C:
  int MPI_Op_create (MPI_User_function *my_operator, int commute,
                     MPI_Op *op)
Fortran:
  EXTERNAL MY_OPERATOR
  INTEGER OP, IERR
  LOGICAL COMMUTE
  CALL MPI_OP_CREATE (MY_OPERATOR, COMMUTE, OP, IERR)

MPI_REDUCE Example
[Figure: ranks 0, 1, 2 each hold three elements of array a; their local sums 6, 15, 24 are reduced with MPI_SUM into tmp on root rank 0.]

MPI_REDUCE Example: Fortran
  PROGRAM reduce
  INCLUDE 'mpif.h'
  REAL a(9)
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
  ista = myrank*3 + 1
  iend = ista + 2
  DO i=ista,iend
    a(i) = i
  ENDDO
  sum = 0.0
  DO i=ista,iend
    sum = sum + a(i)
  ENDDO
  CALL MPI_REDUCE(sum,tmp,1,MPI_REAL,MPI_SUM,0,MPI_COMM_WORLD,ierr)
  sum = tmp
  IF (myrank==0) THEN
    PRINT *, 'sum =',sum
  ENDIF
  CALL MPI_FINALIZE(ierr)
  END

MPI_REDUCE Example: C
  /* reduce */
  #include <mpi.h>
  #include <stdio.h>
  void main(int argc, char *argv[]) {
    int i, myrank, ista, iend;
    double a[9], sum, tmp;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    ista = myrank * 3;
    iend = ista + 2;
    for (i = ista; i < iend + 1; i++) a[i] = i + 1;
    sum = 0.0;
    for (i = ista; i < iend + 1; i++) sum = sum + a[i];
    MPI_Reduce(&sum, &tmp, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    sum = tmp;
    if (myrank == 0) printf("sum = %f \n", sum);
    MPI_Finalize();
  }

MPI_REDUCE Example: Arrays
[Figure: with count > 1 the reduction is applied elementwise — recvbuf(i) on the root is sendbuf(i) of rank 0 op sendbuf(i) of rank 1 op sendbuf(i) of rank 2; e.g. 6 = 1+2+3 and 15 = 4+5+6.]

Reduction: MPI_ALLREDUCE
C:       int MPI_Allreduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)
Fortran: MPI_ALLREDUCE(sendbuf, recvbuf, count, datatype, op, comm, ierr)
- Collects data from every process, reduces them to a single value, and stores the result on every process.

MPI_ALLREDUCE Example
[Figure: the same elementwise reduction as MPI_REDUCE (e.g. 6 = 1+2+3, 15 = 4+5+6), but the result array appears in every rank's recvbuf rather than only on the root.]

Scatter: MPI_SCATTER
C:       int MPI_Scatter(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)
Fortran: MPI_SCATTER(sendbuf, sendcount, sendtype, recvbuf, recvcount, recvtype, root, comm, ierr)
- (CHOICE) sendbuf: address of the send buffer (IN)
- INTEGER sendcount: number of elements sent to each process (IN)
- INTEGER sendtype: MPI datatype of the send-buffer elements (IN)
- (CHOICE) recvbuf: address of the receive buffer (OUT)
- INTEGER recvcount: number of elements in the receive buffer (IN)
- INTEGER recvtype: MPI datatype of the receive buffer (IN)
- INTEGER root: rank of the sending (root) process (IN)
- INTEGER comm: communicator (IN)
- The root process divides its data into equal-sized blocks and sends one block to each process in rank order.

MPI_SCATTER Example
[Figure: root rank 0 holds three elements in sendbuf; each of ranks 0-2 receives one element into its recvbuf, in rank order.]

MPI_SCATTER Example: Fortran
  PROGRAM scatter
  INCLUDE 'mpif.h'
  INTEGER isend(3)
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
  IF (myrank==0) THEN
    DO i=1,nprocs
      isend(i)=i
    ENDDO
  ENDIF
  CALL MPI_SCATTER(isend, 1, MPI_INTEGER, irecv, 1, MPI_INTEGER,
 &                 0, MPI_COMM_WORLD, ierr)
  PRINT *, 'irecv =',irecv
  CALL MPI_FINALIZE(ierr)
  END

MPI_SCATTER Example: C
  /* scatter */
  #include <mpi.h>
  #include <stdio.h>
  void main(int argc, char *argv[]) {
    int i, myrank, nprocs;
    int isend[3], irecv;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    for (i = 0; i < nprocs; i++) isend[i] = i + 1;
    MPI_Scatter(isend, 1, MPI_INT, &irecv, 1, MPI_INT, 0,
                MPI_COMM_WORLD);
    printf("%d: irecv = %d\n", myrank, irecv);
    MPI_Finalize();
  }

Scatter: MPI_SCATTERV
C:       int MPI_Scatterv(void *sendbuf, int *sendcounts, int *displs, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)
Fortran: MPI_SCATTERV(sendbuf, sendcounts, displs, sendtype, recvbuf, recvcount, recvtype, root, comm, ierr)
- (CHOICE) sendbuf: address of the send buffer (IN)
- INTEGER sendcounts(*): integer array whose i-th entry holds the number of elements to send to process i (IN)
- INTEGER displs(*): integer array whose i-th entry holds the position, relative to the start of the send buffer, of the data to send to process i (IN)
- The root process divides its data into blocks of different sizes and sends one block to each process in rank order.

MPI_SCATTERV Example
[Figure: on root rank 0, sendcounts = (1, 2, 3) and displs = (0, 1, 3); ranks 0, 1, 2 receive 1, 2, and 3 elements respectively into their recvbuf.]

Barrier: MPI_BARRIER
C:       int MPI_Barrier(MPI_Comm comm)
Fortran: MPI_BARRIER(comm, ierr)
- Blocks further progress of each process until every process in the communicator has called MPI_BARRIER.

Others: MPI_ALLTOALL
C:       int MPI_Alltoall(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, MPI_Comm comm)
Fortran: MPI_ALLTOALL(sendbuf, sendcount, sendtype, recvbuf, recvcount, recvtype, comm, ierr)
- Delivers an individual, equal-sized message from every process to every process.
- The j-th data block sent by process i is received by process j and stored in the i-th block of its receive buffer.

MPI_ALLTOALL Example
[Figure: with sendcount = recvcount = 1, send buffers {1,2,3} on rank 0, {4,5,6} on rank 1, and {7,8,9} on rank 2 become receive buffers {1,4,7}, {2,5,8}, and {3,6,9} respectively.]

Others: MPI_ALLTOALLV
C:       int MPI_Alltoallv(void *sendbuf, int *sendcounts, int *sdispls, MPI_Datatype sendtype, void *recvbuf, int *recvcounts, int *rdispls, MPI_Datatype recvtype, MPI_Comm comm)
Fortran: MPI_ALLTOALLV(sendbuf, sendcounts, sdispls, sendtype, recvbuf, recvcounts, rdispls, recvtype, comm, ierr)
- Delivers an individual message of a different size from every process to every process.
- The sendcounts(j) elements sent by process i are received by process j and stored starting at position rdispls(i) of its receive buffer.

56 MPI_ALLTOALLV 예제 COMM sendcounts() sendcounts() sendcounts() sendbuf rank= rank= rank= = sdispls() = sdispls() = sdispls() 4 5 sendbuf recvcounts() recvcounts() recvcounts() 4 7 recvbuf recvbuf = rdispls() = rdispls() = rdispls() 7 8 recvbuf Supercomputing Center 기타 : MPI_REDUCE_SCATTER C Fortran int MPI_Reduce_scatter(void *sendbuf* sendbuf,, void *recvbuf* recvbuf, int recvcounts, MPI_Datatype datatype,, MPI_Op op, MPI_Comm comm) MPI_REDUCE_SCATTER(sendbuf, recvbuf, recvcounts, datatype,, op, comm, ierr) q 각프로세스의송신버퍼에저장된데이터를환산연산 (reduction Operation) 을수행하고그결과를차례대로 recvcounts(i) 개씩모아서프로세스 i로송신 q MPI_REDUCE + MPI_SCATTERV Supercomputing Center 56

57 MPI_REDUCE_SCATTER 예제 COMM rank= rank= rank= recvcounts() recvcounts() recvcounts() sendbuf sendbuf + op recvbuf 9 recvbuf recvbuf Supercomputing Center 기타 : MPI_SCAN C Fortran int MPI_Scan(void *sendbuf* sendbuf,, void *recvbuf* recvbuf, int count, MPI_Datatype datatype,, MPI_Op op, MPI_Comm comm) MPI_SCAN(sendbuf, recvbuf,, count, datatype,, op, comm, ierr) q 프로세스 i의수신버퍼에프로세스 에서프로세스 i까지의수신버퍼데이터들에대한환산 (reduction) 값을저장한다 Supercomputing Center 4 57

58 MPI_SCAN 예제 COMM rank= rank= rank= count sendbuf sendbuf + op + op count = + 6 = + + recvbuf recvbuf recvbuf Supercomputing Center 5 유도데이터타입 부분배열의전송 Supercomputing Center 6 58

59 유도데이터타입 (/) q 데이터타입이다르거나불연속적인데이터의전송 동일한데이터타입을가지는불연속데이터 다른데이터타입을가지는연속데이터다른데이터타입을가지는불연속데이터. 각각을따로따로전송. 새로운버퍼로묶어서전송후묶음을풀어원위치로저장 : MPI_PACK/MPI_UNPACK, MPI_PACKED( 데이터타입 ) è 느린속도, 불편, 오류의위험 Supercomputing Center 7 유도데이터타입 (/) q a(4), a(5), a(7), a(8), a(), a() 의전송 a() :REAL itype itype 유도데이터타입 itype, 한개전송 CALL MPI_SEND(a(4),, itype, idst, itag, 유도데이터타입 itype, 세개전송 MPI_COMM_WORLD, ierr) CALL MPI_SEND(a(4),, itype, idst, itag, MPI_COMM_WORLD, ierr) Supercomputing Center 8 59

Using derived datatypes

- CONSTRUCT: build a new datatype with the MPI type-constructor routines
  MPI_Type_contiguous, MPI_Type_(h)vector, MPI_Type_struct
- COMMIT: register the constructed datatype
  MPI_Type_commit
- USE: use the new datatype in sends, receives, etc.

MPI_TYPE_COMMIT

C:       int MPI_Type_commit(MPI_Datatype *datatype)
Fortran: MPI_TYPE_COMMIT(datatype, ierr)
INTEGER datatype : datatype to commit (handle) (INOUT)

- Makes a newly defined datatype usable in communication.
- A committed datatype can be used until it is released with MPI_TYPE_FREE(datatype, ierr).

MPI_TYPE_CONTIGUOUS

C:       int MPI_Type_contiguous(int count, MPI_Datatype oldtype,
                                 MPI_Datatype *newtype)
Fortran: MPI_TYPE_CONTIGUOUS(count, oldtype, newtype, ierr)
INTEGER count : number of elements to combine (IN)
INTEGER oldtype : old datatype (handle) (IN)
INTEGER newtype : new datatype (handle) (OUT)

- Defines a new datatype (newtype) that groups count contiguous elements of the same datatype (oldtype).

MPI_TYPE_CONTIGUOUS example

(Figure: newtype consists of count consecutive MPI_INTEGER (oldtype) elements.)

MPI_TYPE_CONTIGUOUS example: Fortran

PROGRAM type_contiguous
INCLUDE 'mpif.h'
INTEGER ibuf(20)   ! array size and counts below are illustrative
INTEGER inewtype
CALL MPI_INIT(ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
IF (myrank==0) THEN
   DO i=1,20
      ibuf(i) = i
   ENDDO
ENDIF
CALL MPI_TYPE_CONTIGUOUS(3, MPI_INTEGER, inewtype, ierr)
CALL MPI_TYPE_COMMIT(inewtype, ierr)
CALL MPI_BCAST(ibuf, 3, inewtype, 0, MPI_COMM_WORLD, ierr)
PRINT *, 'ibuf =', ibuf
CALL MPI_FINALIZE(ierr)
END

MPI_TYPE_CONTIGUOUS example: C

/*type_contiguous*/
#include <mpi.h>
#include <stdio.h>
void main (int argc, char *argv[]){
   int i, myrank, ibuf[20];   /* array size and counts are illustrative */
   MPI_Datatype inewtype;
   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
   if(myrank==0)
      for(i=0; i<20; i++) ibuf[i]=i+1;
   else
      for(i=0; i<20; i++) ibuf[i]=0;
   MPI_Type_contiguous(3, MPI_INT, &inewtype);
   MPI_Type_commit(&inewtype);
   MPI_Bcast(ibuf, 3, inewtype, 0, MPI_COMM_WORLD);
   printf("%d : ibuf =", myrank);
   for(i=0; i<20; i++) printf(" %d", ibuf[i]);
   printf("\n");
   MPI_Finalize();
}

MPI_TYPE_VECTOR (1/2)

C:       int MPI_Type_vector(int count, int blocklength, int stride,
                             MPI_Datatype oldtype, MPI_Datatype *newtype)
Fortran: MPI_TYPE_VECTOR(count, blocklength, stride, oldtype, newtype, ierr)
INTEGER count : number of blocks (IN)
INTEGER blocklength : number of oldtype elements in each block (IN)
INTEGER stride : spacing, in elements, between the start points of two adjacent blocks (IN)
INTEGER oldtype : old datatype (handle) (IN)
INTEGER newtype : new datatype (handle) (OUT)

- Defines a new datatype made of count equally spaced blocks.
- Each block contains blocklength elements of the old datatype.

MPI_TYPE_VECTOR (2/2)

(Figure: blocks of blocklength elements whose start points are stride = 5 elements apart.)

64 MPI_TYPE_VECTOR 예제 blocklength 4=count newtype stride(# of units) :MPI_INTEGER(oldtype) Supercomputing Center 7 MPI_TYPE_VECTOR 예제 : Fortran PROGRAM type_vector INCLUDE mpif.h INTEGER ibuf(), inewtype CALL MPI_INIT(ierr) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) IF (myrank==) THEN DO i=, ibuf(i) = i ENDIF CALL MPI_TYPE_VECTOR(4,,, MPI_INTEGER, inewtype, ierr) CALL MPI_TYPE_COMMIT(inewtype, ierr) CALL MPI_BCAST(ibuf,, inewtype,, MPI_COMM_WORLD, ierr) PRINT *, ibuf =, ibuf CALL MPI_FINALIZE(ierr) END Supercomputing Center 8 64

65 MPI_TYPE_VECTOR 예제 : C /*type_vector*/ #include <mpi.h> #include <stdio.h> void main (int argc, char *argv[]){ int i, myrank, ibuf[]; MPI_Datatype inewtype ; MPI_Init(&argc, &argv); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); if(myrank==) for(i=; i<; i++) ibuf[i]=i+; else for(i=; i<; i++) ibuf[i]=; MPI_Type_vector(4,,, MPI_INT, &inewtype); MPI_Type_commit(&inewtype); MPI_Bcast(ibuf,, inewtype,, MPI_COMM_WORLD); printf( %d : ibuf =, myrank); for(i=; i<; i++) printf( %d, ibuf[i]); printf( \n ); MPI_Finalize(); Supercomputing Center 9 MPI_TYPE_HVECTOR C Fortran int MPI_Type_hvector (int count, int blocklength, int stride, MPI_Datatype oldtype, MPI_Datatype *newtype) MPI_TYPE_HVECTOR (count, blocklength, stride, oldtype, newtype, ierr) q stride 를바이트단위로표시 bytes = stride : MPI_TYPE_HVECTOR bytes = stride*extent(oldtype extent(oldtype) ) : MPI_TYPE_VECTOR blocklength newtype 4=count stride(# of bytes) :MPI_INTEGER(oldtype) Supercomputing Center 65

66 MPI_TYPE_STRUCT (/) C Fortran int MPI_Type_struct (int count, int *array_of_blocklengths, MPI_Aint *array_of_displacements, MPI_Datatype *array_of_type, MPI_Datatype *newtype) MPI_TYPE_STRUCT (count, array_of_blocklengths, array_of_displacements, array_of_types, newtype, ierr) INTEGER count : 블록의개수, 동시에배열 array_of_blocklengths, array_of_displacements, array_of_types 의원소의개수를나타냄 (IN) INTEGER array_of_blocklengths(*) : 각블록당데이터의개수, array_of_blocklengths(i) 는데이터타입이 array_of_types(i) 인 i번째블록의데이터개수 (IN) INTEGER array_of_displacements(*) : 바이트로나타낸각블록의위치 (IN) INTEGER array_of_types(*) : 각블록을구성하는데이터타입, i 번째블록은데이터타입이 array_of_types(i) 인데이터로구성 (IN) INTEGER newtype : 새로운데이터타입 (OUT) Supercomputing Center MPI_TYPE_STRUCT (/) q 가장일반적인유도데이터타입 q 서로다른데이터타입들로구성된변수정의가능 C 구조체 Fortran 커먼블록 q count 개블록으로구성된새로운데이터타입정의, i 번째블록은데이터타입이 array_of_types(i) 인 array_of_blocklengths(i) 개의데이터로구성되며그위치는 array_of_displacements(i) 가됨 Supercomputing Center 66

67 MPI_TYPE_STRUCT (/) count = array_of_blocklengths = {, array_of_types = {MPI_INT, MPI_DOUBLE array_of_displacements = {, extent(mpi_int) Supercomputing Center MPI_TYPE_STRUCT 예제 (/) = count array_of_blocklengths()= array_of_blocklengths()= array_of_types() newtype array_of_types() : MPI_INTEGER array_of_displacements()= array_of_displacements()=5*4 Supercomputing Center 4 67

68 MPI_TYPE_STRUCT 예제 (/) = count array_of_blocklengths()= array_of_blocklengths()= newtype array_of_types() L array_of_types() array_of_displacements()=*4 : MPI_INTEGER L : MPI_LB array_of_displacements()= Supercomputing Center 5 MPI_TYPE_STRUCT 예제 : Fortran (/) PROGRAM type_struct INCLUDE mpif.h INTEGER ibuf(), ibuf() INTEGER iblock(), idisp(), itype() CALL MPI_INIT(ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) IF (myrank==) THEN DO i=, ibuf(i) = i ibuf(i) = i ENDIF iblock() = ; iblock() = idisp() = ; idisp() = 5 * 4 itype() = MPI_INTEGER; itype() = MPI_INTEGER CALL MPI_TYPE_STRUCT(, iblock, idisp, itype, inewtype, ierr) CALL MPI_TYPE_COMMIT(inewtype, ierr) Supercomputing Center 6 68

69 MPI_TYPE_STRUCT 예제 : Fortran (/) CALL MPI_BCAST(ibuf,, inewtype,, MPI_COMM_WORLD, ierr) PRINT *, Ex. :,ibuf iblock() = ; iblock() = idisp() = ; idisp() = * 4 itype() = MPI_LB itype() = MPI_INTEGER CALL MPI_TYPE_STRUCT(, iblock, idisp, itype,inewtype, ierr) CALL MPI_TYPE_COMMIT(inewtype, ierr) CALL MPI_BCAST(ibuf,, inewtype,, MPI_COMM_WORLD, ierr) PRINT *, Ex. :,ibuf CALL MPI_FINALIZE(ierr) END MPI_UB, MPI_LB : MPI 유사 (pseudo) 타입 차지하는공간없이데이터타입의시작, 끝에서빈공간이나타나도록해야할때사용 Supercomputing Center 7 MPI_TYPE_STRUCT 예제 : C (/) /*type_struct*/ #include <mpi.h> #include <stdio.h> void main (int argc, char *argv[]){ int i, myrank ; int ibuf[], ibuf[], iblock[]; MPI_Datatype inewtype, inewtype; MPI_Datatype itype[]; MPI_Aint idisp[]; MPI_Init(&argc, &argv); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); if(myrank==) for(i=; i<; i++) { ibuf[i]=i+; ibuf[i]=i+; Supercomputing Center 8 69

70 MPI_TYPE_STRUCT 예제 : C (/) else for(i=; i<; i++){ ibuf[i]=; ibuf[i]=; iblock[] = ; iblock[] = ; idisp[] = ; idisp[] = 5*4; itype[] = MPI_INT; itype[] = MPI_INT; MPI_Type_struct(, iblock, idisp, itype, &inewtype); MPI_Type_commit(&inewtype); MPI_Bcast(ibuf,, inewtype,, MPI_COMM_WORLD); printf( %d : Ex. :, myrank); for(i=; i<; i++) printf( %d, ibuf[i]); printf( \n ); Supercomputing Center 9 MPI_TYPE_STRUCT 예제 : C (/) iblock[] = ; iblock[] = ; idisp[] = ; idisp[] = *4; itype[] = MPI_LB; itype[] = MPI_INT; MPI_Type_struct(, iblock, idisp, itype, &inewtype); MPI_Type_commit(&inewtype); MPI_Bcast(ibuf,, inewtype,, MPI_COMM_WORLD); printf( %d : Ex. :, myrank); for(i=; i<; i++) printf( %d, ibuf[i]); printf( \n ); MPI_Finalize(); Supercomputing Center 4 7

71 MPI_TYPE_EXTENT C int MPI_Type_extent (MPI_Datatype( *datatype, MPI_Aint *extent) Fortran MPI_TYPE_EXTENT (datatype ( datatype, extent, ierr) Fortran INTEGER datatype : 데이터타입 ( 핸들 ) (IN) INTEGER extent : 데이터타입의범위 (OUT) q 데이터타입의범위 = 메모리에서차지하는바이트수 Supercomputing Center 4 MPI_TYPE_EXTENT 예제 : Fortran (/) PROGRAM structure INCLUDE 'mpif.h' INTEGER err, rank, num INTEGER status(mpi_status_size) REAL x COMPLEX data(4) COMMON /result/num,x,data INTEGER blocklengths() DATA blocklengths/,,4/ INTEGER displacements() INTEGER types(), restype DATA types/mpi_integer,mpi_real,mpi_complex/ INTEGER intex,realex CALL MPI_INIT(err) CALL MPI_COMM_RANK(MPI_COMM_WORLD,rank,err) CALL MPI_TYPE_EXTENT(MPI_INTEGER,intex,err) CALL MPI_TYPE_EXTENT(MPI_REAL,realex,err) displacements()=; displacements()=intex Supercomputing Center 4 7

72 MPI_TYPE_EXTENT 예제 : Fortran (/) displacements()=intex+realex CALL MPI_TYPE_STRUCT(,blocklengths,displacements, & types,restype,err) CALL MPI_TYPE_COMMIT(restype,err) IF(rank.eq.) THEN num=6; x=.4 DO i=,4 data(i)=cmplx(i,i) CALL MPI_SEND(num,,restype,,,MPI_COMM_WORLD,err) ELSE IF(rank.eq.) THEN CALL MPI_RECV(num,,restype,,,MPI_COMM_WORLD,status,err) PRINT *,'P:',rank,' I got' PRINT *,num PRINT *,x PRINT *,data END IF CALL MPI_FINALIZE(err) END Supercomputing Center 4 MPI_TYPE_EXTENT 예제 : C (/) #include <stdio.h> #include<mpi.h> void main(int argc, char *argv[]) { int rank,i; MPI_Status status; struct { int num; float x; double data[4]; a; int blocklengths[]={,,4; MPI_Datatype types[]={mpi_int,mpi_float,mpi_double; MPI_Aint displacements[]; MPI_Datatype restype; MPI_Aint intex,floatex; MPI_Init(&argc,&argv); MPI_Comm_rank(MPI_COMM_WORLD,&rank); MPI_Type_extent(MPI_INT,&intex); MPI_Type_extent(MPI_FLOAT,&floatex); Supercomputing Center 44 7

73 MPI_TYPE_EXTENT 예제 : C (/) displacements[]=; displacements[]=intex; displacements[]=intex+floatex; MPI_Type_struct(,blocklengths,displacements,types,&restype); MPI_Type_commit(&restype); if (rank==){ a.num=6; a.x=.4; for(i=;i<4;++i) a.data[i]=(double) i; MPI_Send(&a,,restype,,5,MPI_COMM_WORLD); else if(rank==) { MPI_Recv(&a,,restype,,5,MPI_COMM_WORLD,&status); printf("p:%d my a is %d %f %lf %lf %lf %lf\n", rank,a.num,a.x,a.data[],a.data[],a.data[],a.data[]); MPI_Finalize(); Supercomputing Center 45 부분배열의전송 (/) C Fortran int MPI_Type_create_subarray (int ndims,int *array_of_sizes, int *array_of_subsizes, int *array_of_starts, int order, MPI_Datatype oldtype, MPI_Datatype *newtype); MPI_TYPE_CREATE_SUBARRAY (ndims, array_of_sizes, array_of_subsizes, array_of_starts, order, oldtype, newtype, ierr) INTEGER ndims : 배열의차원 ( 양의정수 ) (IN) INTEGER array_of_sizes(*) : 전체배열의각차원의크기, i 번째원소는 i번째차원의크기 ( 양의정수 ) (IN) INTEGER array_of_subsizes(*) : 부분배열의각차원의크기, i 번째원소는 i 번째차원의크기 ( 양의정수 ) (IN) INTEGER array_of_starts(*) : 부분배열의시작좌표, i 번째원소는 i번째차원의시작좌표 ( 부터시작 ) (IN) INTEGER order : 배열저장방식 ( 행우선또는열우선 ) 결정 (IN) INTEGER oldtype : 전체배열원소의데이터타입 (IN) INTEGER newtype : 부분배열로구성된새로운데이터타입 (OUT) Supercomputing Center 46 7

74 부분배열의전송 (/) q 부분배열로구성되는유도데이터타입생성루틴 q order : 배열을읽고저장하는방식결정 order = MPI_ORDER_FORTRAN : 열우선 order = MPI_ORDER_C : 행우선 MPI_TYPE_CREATE_SUBARRAY 는 MPI- 에서지원하는루틴으로 KISTI IBM 시스템에서컴파일하는경우 _r 을붙일것 % mpxlf9_r o % mpcc_r o Supercomputing Center 47 부분배열의전송예제 a(:7,:6) ndims = array_of_sizes() = 6; array_of_sizes() = 7 array_of_subsizes() = ; array_of_subsizes() = 5 array_of_starts() = ; array_of_starts() = order = MPI_ORDER_FORTRAN Supercomputing Center 48 74

75 부분배열의전송예제 : Fortran (/) PROGRAM sub_array INCLUDE 'mpif.h' INTEGER ndims PARAMETER(ndims=) INTEGER ibuf(:7,:6) INTEGER array_of_sizes(ndims), array_of_subsizes(ndims) INTEGER array_of_starts(ndims) CALL MPI_INIT(ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) DO j =, 6 DO i =, 7 IF (myrank==) THEN ibuf(i,j) = i ELSE ibuf(i,j) = ENDIF Supercomputing Center 49 부분배열의전송예제 : Fortran (/) array_of_sizes()=6; array_of_sizes()=7 array_of_subsizes()=; array_of_subsizes()=5 array_of_starts()=; array_of_starts()= CALL MPI_TYPE_CREATE_SUBARRAY(ndims, array_of_sizes, & array_of_subsizes, array_of_starts, MPI_ORDER_FORTRAN,& MPI_INTEGER, newtype, ierr) CALL MPI_TYPE_COMMIT(newtype, ierr) CALL MPI_BCAST(ibuf,, newtype,, MPI_COMM_WORLD, ierr) PRINT *, I am :, myrank DO i=,7 PRINT *, (ibuf(i,j), j=,6) CALL MPI_FINALIZE(ierr) END Supercomputing Center 5 75

76 부분배열의전송예제 : C (/) #include <mpi.h> #define ndims void main(int argc, char *argv[]){ int ibuf[6][7]; int array_of_sizes[ndims], array_of_subsizes[ndims], array_of_starts[ndims]; int i, j, myrank; MPI_Datatype newtype; MPI_Init(&argc, &argv); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); if(myrank==) for(i=; i<6; i++) for(j=; j<7; j++) ibuf[i][j] = i+; else for(i=; i<6; i++) for(j=; j<7; j++) ibuf[i][j] = ; array_of_sizes[]=6; array_of_sizes[]=7; array_of_subsizes[]=; array_of_subsizes[]=5; array_of_starts[]=; array_of_startst[]=; Supercomputing Center 5 부분배열의전송예제 : C (/) MPI_Type_create_subarray(ndims, array_of_sizes, array_of_subsizes, array_of_starts, MPI_ORDER_C, MPI_INT, &newtype); MPI_Type_commit(&newtype); MPI_Bcast(ibuf,, newtype,, MPI_COMM_WORLD); if(myrank!= ) { printf(" I am : %d \n ", myrank); for(i=; i<6; i++) { for(j=; j<7; j++) printf(" %d", ibuf[i][j]); printf("\n"); MPI_Finalize(); Supercomputing Center 5 76

Creating process groups: MPI_COMM_SPLIT / Virtual topologies

Creating process groups: MPI_COMM_SPLIT

C:       int MPI_Comm_split(MPI_Comm comm, int color, int key,
                            MPI_Comm *newcomm)
Fortran: MPI_COMM_SPLIT(comm, color, key, newcomm, ierr)
INTEGER comm : communicator (handle) (IN)
INTEGER color : processes with the same color are placed in the same group (IN)
INTEGER key : new ranks within each group are assigned in key order (IN)
INTEGER newcomm : new communicator (handle) (OUT)

- Partitions the processes of comm into groups and creates a new communicator, newcomm, for each group.
- color must be non-negative; a process that passes color = MPI_UNDEFINED gets newcomm = MPI_COMM_NULL.

78 MPI_COMM_SPLIT 예제 MPI_COMM_WORLD newcomm newcomm icolor= icolor= ikey= ikey= ikey= ikey= rank= rank= rank= rank= rank= rank= rank= rank= Supercomputing Center 55 MPI_COMM_SPLIT 예제 : Fortran PROGRAM comm_split INCLUDE mpif.h CALL MPI_INIT(ierr) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) IF (myrank==) THEN icolor = ; ikey = ELSEIF (myrank==) THEN icolor = ; ikey = ELSEIF (myrank==) THEN icolor = ; ikey = ELSEIF (myrank==) THEN icolor = ; ikey = ENDIF CALL MPI_COMM_SPLIT(MPI_COMM_WORLD, icolor, ikey, newcomm, ierr) CALL MPI_COMM_SIZE(newcomm, newprocs, ierr) CALL MPI_COMM_RANK(newcomm, newrank, ierr) PRINT *, newcomm=, newcomm, newprocs=,newprocs, newrank=,newrank CALL MPI_FINALIZE(ierr) END Supercomputing Center 56 78

79 MPI_COMM_SPLIT 예제 : C (/) /*comm_split*/ #include <mpi.h> #include <stdio.h> void main (int argc, char *argv[]){ int i, nprocs, myrank ; int icolor, ikey; int newprocs, newrank; MPI_Comm newcomm; MPI_Init(&argc, &argv); MPI_Comm_size(MPI_COMM_WORLD, &nprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); if(myrank == ){ icolor = ; ikey = ; else if (myrank == ){ icolor = ; ikey = ; Supercomputing Center 57 MPI_COMM_SPLIT 예제 : C (/) else if (myrank == ){ icolor = ; ikey = ; else if (myrank == ){ icolor = ; ikey = ; MPI_Comm_split(MPI_COMM_WORLD, icolor, ikey, &newcomm); MPI_Comm_size(newcomm, &newprocs); MPI_Comm_rank(newcomm, &newrank); printf( %d, myrank); printf( newcomm = %d, newcomm); printf( newprocs = %d, newprocs); printf( newrank = %d, newrank); printf( \n ); MPI_Finalize(); Supercomputing Center 58 79

80 가상토폴로지 (/) q 통신패턴에적합하도록프로세스에적절한이름을부여한새로운커뮤니케이터를구성하는것 q 코드작성을쉽게하고최적화된통신을가능케함 q 직교가상토폴로지 가상적인그리드상에서각프로세스가인접한이웃과연결 각프로세스는직교좌표값으로식별됨 주기적경계조건 (periodic boundary) Supercomputing Center 59 dim 가상토폴로지 (/) dim (,) (,) 6 (,) 9 (,) (,) 4 (,) 7 (,) (,) (,) 5 (,) 8 (,) (,) Supercomputing Center 6 8

Using a virtual topology

- Build the topology and create a new communicator:
  MPI_CART_CREATE
- Compute process ranks from the topology's naming scheme with the mapping functions:
  MPI_CART_RANK
  MPI_CART_COORDS
  MPI_CART_SHIFT

Creating a topology: MPI_CART_CREATE

C:       int MPI_Cart_create(MPI_Comm oldcomm, int ndims, int *dimsize,
                             int *periods, int reorder, MPI_Comm *newcomm)
Fortran: MPI_CART_CREATE(oldcomm, ndims, dimsize, periods, reorder,
                         newcomm, ierr)
INTEGER oldcomm : existing communicator (IN)
INTEGER ndims : number of dimensions of the Cartesian grid (IN)
INTEGER dimsize(*) : length of each coordinate axis; array of size ndims (IN)
LOGICAL periods(*) : periodicity of each coordinate axis; array of size ndims (IN)
LOGICAL reorder : whether MPI may reorder the process ranks (IN)
INTEGER newcomm : new communicator (OUT)

- Returns a communicator, newcomm, that carries the virtual-topology layout.
- If reorder is false, each process keeps its rank from the old communicator and only the mapping between ranks and grid coordinates is established.

82 MPI_CART_CREATE 예제 4 5, (), (), (4), (), (), (5) Supercomputing Center 6 MPI_CART_CREATE 예제 : Fortran PROGRAM cart_create INCLUDE mpif.h INTEGER oldcomm, newcomm, ndims, ierr INTEGER dimsize(:) LOGICAL periods(:), reorder CALL MPI_INIT(ierr) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) oldcomm = MPI_COMM_WORLD ndims = dimsize() = ; dimsize() = periods() =.TRUE.; periods() =.FALSE. reorder =.FALSE. CALL MPI_CART_CREATE(oldcomm,ndims,dimsize,periods,reorder, newcomm, ierr) CALL MPI_COMM_SIZE(newcomm, newprocs, ierr) CALL MPI_COMM_RANK(newcomm, newrank, ierr) PRINT*,myrank, :newcomm=,newcomm, newprocs=,newprocs, & newrank=,newrank CALL MPI_FINALIZE(ierr) END Supercomputing Center 64 8

83 MPI_CART_CREATE 예제 : C /*cart_create*/ #include <mpi.h> #include <stdio.h> void main (int argc, char *argv[]){ int nprocs, myrank ; int ndims, newprocs, newrank; MPI_Comm newcomm; int dimsize[], periods[], reorder; MPI_Init(&argc, &argv); MPI_Comm_size(MPI_COMM_WORLD, &nprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); ndims = ; dimsize[] = ; dimsize[] = ; periods[] = ; periods[] = ; reorder = ; MPI_Cart_create(MPI_COMM_WORLD,ndims,dimsize,periods,reorder, &newcomm); MPI_Comm_size(newcomm, &newprocs); MPI_Comm_rank(newcomm, &newrank); printf( %d, myrank); printf( newcomm= %d, newcomm); printf( newprocs= %d, newprocs); printf( newrank= %d, newrank); printf( \n ); MPI_Finalize(); Supercomputing Center 65 대응함수 : MPI_CART_RANK C Fortran int MPI_Cart_rank(MPI_Comm comm, int *coords, int *rank) MPI_CART_RANK(comm, coords,, rank, ierr) INTEGER comm : 가상토폴로지로생성된커뮤니케이터 (IN) INTEGER coords(*) : 직교좌표를나타내는크기 ndims 의배열 (IN) INTEGER rank : coords 에의해표현되는프로세스의랭크 (OUT) q 프로세스직교좌표를대응하는프로세스랭크로나타냄 q 좌표를알고있는경우그좌표에해당하는프로세스와의통신을위해사용 Supercomputing Center 66 8

84 MPI_CART_RANK 예제 : Fortran CALL MPI_CART_CREATE(oldcomm, ndims, dimsize, periods, reorder, newcomm, ierr) IF (myrank == ) THEN DO i =, dimsize()- DO j =, dimsize()- ENDIF END coords() = i coords() = j CALL MPI_CART_RANK(newcomm, coords, rank, ierr) PRINT *, coords =, coords, rank = rank MPI_CART_CREATE 예제에첨부 Supercomputing Center 67 MPI_CART_RANK 예제 : C MPI_Cart_create(oldcomm,ndims,dimsize,periods,reorder, &newcomm); if(myrank == ) { for(i=; i<dimsize[]; i++){ for(j=; j<dimsize[]; j++){ coords[] = i; coords[] = j; MPI_Cart_rank(newcomm, coords, &rank); printf( coords = %d, %d, rank = %d\n, coords[], coords[], rank); MPI_CART_CREATE 예제에첨부 Supercomputing Center 68 84

85 대응함수 : MPI_CART_COORDS C Fortran int MPI_Cart_coords(MPI_Comm comm, int rank, int ndims, int *coords) MPI_CART_COORDS(comm, rank, ndims, coords, ierr) INTEGER comm : 가상토폴로지로생성된커뮤니케이터 (IN) INTEGER rank : 루틴을호출한프로세스의랭크 (IN) INTEGER ndims : 직교좌표의차원 (IN) INTEGER coords(*) : 랭크에대응하는직교좌표 (IN) q 프로세스랭크를대응하는직교좌표로나타냄 q MPI_CART_RANK 의역함수 Supercomputing Center 69 MPI_CART_COORDS 예제 : Fortran CALL MPI_CART_CREATE(oldcomm, ndims, dimsize, periods, reorder, newcomm, ierr) IF (myrank == ) THEN DO rank =, nprocs - ENDIF END CALL MPI_CART_COORDS(newcomm,rank,ndims,coords,ierr) PRINT *,, rank = rank, coords =, coords MPI_CART_CREATE 예제에첨부 Supercomputing Center 7 85

86 MPI_CART_COORDS 예제 : C MPI_Cart_create(oldcomm,ndims,dimsize,periods,reorder, &newcomm); if(myrank == ) { for(rank=; rank<nprocs; rank++){ MPI_Cart_coords(newcomm,rank,ndims,coords); printf( rank = %d, coords = %d, %d\n, rank, coords[], coords[]); MPI_CART_CREATE 예제에첨부 Supercomputing Center 7 대응함수 : MPI_CART_SHIFT C Fortran int MPI_Cart_shift(MPI_Comm comm, int direction, int displ, int *source, int *dest) MPI_CART_SHIFT(comm, direction, displ, source, dest, ierr) INTEGER comm : 가상토폴로지로생성된커뮤니케이터 (IN) INTEGER direction : 시프트할방향 (IN) INTEGER displ : 프로세스좌표상의시프트할거리 (+/-) (IN) INTEGER source : direction 방향으로 displ 떨어진거리에있는프로세스, displ > 일때직교좌표가작아지는방향의프로세스랭크 (OUT) INTEGER dest : direction 방향으로 displ 떨어진거리에있는프로세스, displ > 일때직교좌표가커지는방향의프로세스랭크 (OUT) q 실제시프트를실행하는것은아님 q 특정방향을따라루틴을호출한프로세스의이웃프로세스의랭크를발견하는데사용 Supercomputing Center 7 86

87 MPI_CART_SHIFT 예제 direction= direction= source calling process dest direction= displ= Supercomputing Center 7 MPI_CART_SHIFT 예제 : Fortran ndims = dimsize() = 6; dimsize() = 4 periods() =.TRUE.; periods() =.TRUE. reorder =.TRUE. CALL MPI_CART_CREATE(oldcomm,ndims,dimsize,periods,reorder, & newcomm, ierr) CALL MPI_COMM_RANK(newcomm, newrank, ierr) CALL MPI_CART_COORDS(newcomm, newrank, ndims, coords, ierr) direction= displ= CALL MPI_CART_SHIFT(newcomm, direction, displ, source, dest, ierr) PRINT *,' myrank =',newrank, 'coords=', coords PRINT *, 'source =', source, 'dest =', dest MPI_CART_CREATE 예제에첨부 Supercomputing Center 74 87

88 MPI_CART_SHIFT 예제 : C ndims = ; dimsize[] = 6; dimsize[] = 4; periods[] = ; periods[] = ; reorder = ; MPI_Cart_create(MPI_COMM_WORLD, ndims, dimsize, periods, reorder, &newcomm); MPI_Comm_rank(newcomm, &newrank); MPI_Cart_coords(newcomm, newrank, ndims, coords); direction=; displ=; MPI_Cart_shift(newcomm, direction, displ, &source, &dest); printf( myrank= %d, coords= %d, %d \n, newrank, coords[], coords[]); printf( source= %d, dest= %d \n, source, dest); MPI_CART_CREATE 예제에첨부 Supercomputing Center 75 토폴로지분해 : MPI_CART_SUB C Fortran int MPI_Cart_sub(MPI_Comm oldcomm, int *belongs, MPI_Comm *newcomm) MPI_CART_SUB(oldcomm, belongs, newcomm, ierr) INTEGER oldcomm : 가상토폴로지로생성된커뮤니케이터 (IN) LOGICAL belongs(*) : 토폴로지상에서해당좌표축방향으로의분해여부를나타내는 ndims 크기의배열 (IN) INTEGER newcomm : 토폴로지를분해한새로운커뮤니케이터 (OUT) q q q 직교토폴로지를축방향으로분해하여여러개의서브토폴로지로구성되는새로운커뮤니케이터생성 토폴로지상의특정행또는열들에대해서만통신이필요한경우사용 MPI_COMM_SPLIT 과유사 Supercomputing Center 76 88

89 MPI_CART_SUB 예제, (), (), () (), () (), () (), () (), (), (4), (), (5), () (), () (), () (), () (), (4) (), (5) (), (4) (), (5) () belongs()=.false. belongs()=.true. belongs()=.true. belongs()=.false. Supercomputing Center 77 MPI_CART_SUB 예제 : Fortran ndims = dimsize() = ; dimsize() = CALL MPI_CART_CREATE(oldcomm, ndims, dimsize, periods, & reorder, newcomm, ierr) CALL MPI_COMM_RANK(newcomm, newrank, ierr) CALL MPI_CART_COORDS(newcomm, newrank, ndims, coords, ierr) belongs()=.false.; belongs()=.true. CALL MPI_CART_SUB(newcomm, belongs, commrow,ierr) CALL MPI_comm_rank(commrow, rank, ierr) PRINT *,' myrank =',newrank, 'coords=', coords PRINT *, 'commrow =', commrow PTINT *, 'rank=', rank MPI_CART_CREATE 예제에첨부 Supercomputing Center 78 89

90 MPI_CART_SUB 예제 : C ndims = ; dimsize[] = ; dimsize[] = ; MPI_Cart_create(MPI_COMM_WORLD,ndims,dimsize,periods, reorder,&newcomm); MPI_Comm_rank(newcomm, &newrank); MPI_Cart_coords(newcomm, newrank, ndims, coords); belongs[]=; belongs[]=; MPI_Cart_sub(newcomm, belongs, &commrow); MPI_Comm_rank(commrow, &rank); printf( myrank= %d, coords= %d,%d \n, newrank, coords[], coords[]); printf( commrow = %d \n, commrow); printf( rank= %d \n, rank); MPI_CART_CREATE 예제에첨부 Supercomputing Center 79 제 장 MPI 를이용한 병렬프로그래밍실제 장의내용을기본으로 MPI 를이용한병렬 프로그램작성시염두에둬야할데이터처 리방식과, 프로그래밍테크닉, 주의할점 등에대해알아본다. 9

91 병렬프로그램의입출력 DO 루프의병렬화 블록분할 순환분할 배열수축 Supercomputing Center 8 병렬프로그램의입력 (/4) q 공유파일시스템으로부터동시에읽어오기 rank rank rank indata indata indata read() indata Shared File System Supercomputing Center 8 9

92 병렬프로그램의입력 (/4) q 입력파일의복사본을각각따로가지는경우 rank indata rank indata rank indata READ() indata Local File System Local File System Local File System Supercomputing Center 8 병렬프로그램의입력 (/4) q 한프로세스가입력파일을읽어다른프로세스에전달. rank indata rank indata rank indata IF (myrank==) THEN READ () indata ENDIF CALL MPI_BCAST(indata,) Local or Shared File System Supercomputing Center 84 9

93 병렬프로그램의입력 (4/4) q 한프로세스가입력파일을읽어다른프로세스에전달. rank rank rank indata indata indata IF (myrank==) THEN READ () indata ENDIF CALL MPI_SCATTER(indata,) Local or Shared File System Supercomputing Center 85 q 표준출력 병렬프로그램의출력 (/) print *, I am :, myrank, Hello world! 모든프로세스들이출력 (IBM 환경변수 ) MP_STDOUTMODE = unordered (: 디폴트 ) MP_STDOUTMODE = ordered if(myrank==) then print *, I am :, myrank, Hello world! endif 원하는프로세스만출력 (IBM 환경변수 ) MP_STDOUTMODE = rank_id Supercomputing Center 86 9

94 병렬프로그램의출력 (/) q 한프로세스가데이터를모아로컬파일시스템에저장 rank rank rank outdata outdata outdata CALL MPI_GATHER(outdata,) IF (myrank==) THEN WRITE() outdata ENDIF Local or Shared File System Supercomputing Center 87 병렬프로그램의출력 (/) q 각프로세스가공유파일시스템에순차적으로저장 rank rank rank outdata outdata outdata First Second Third Shared File System Supercomputing Center 88 94

95 DO 루프의병렬화 q 루프내에서반복되는계산의할당문제는루프인덱스 와관련된인덱스를가지는배열의할당문제가됨 배열을어떻게나눌것인가? 나눠진배열과관련계산을어떻게효율적으로할당할것 인가? Supercomputing Center 89 배열의분할 q 블록분할 (Block Distribution) iteration rank q 순환분할 (Cyclic Distribution) iteration rank q 블록 -순환분할 (Block-Cyclic Distribution) iteration rank Supercomputing Center 9 95

Block distribution (1/3)

- n = p × q + r
  n : number of loop iterations
  p : number of processes
  q : quotient of n divided by p
  r : remainder of n divided by p
- Assign q+1 iterations to r of the processes and q iterations to the remaining p-r processes:
  n = r(q+1) + (p-r)q

Block distribution (2/3)

- Block-distribution code example: Fortran

SUBROUTINE para_range(n1, n2, nprocs, irank, ista, iend)
   iwork1 = (n2 - n1 + 1) / nprocs
   iwork2 = MOD(n2 - n1 + 1, nprocs)
   ista = irank * iwork1 + n1 + MIN(irank, iwork2)
   iend = ista + iwork1 - 1
   IF (iwork2 > irank) iend = iend + 1
END

Distributes the loop iterations n1 through n2 over nprocs processes by blocks: process irank is assigned iterations ista through iend.

Block distribution (3/3)

- Block-distribution code example: C

void para_range(int n1, int n2, int nprocs, int myrank,
                int *ista, int *iend){
   int iwork1, iwork2;
   iwork1 = (n2 - n1 + 1) / nprocs;
   iwork2 = (n2 - n1 + 1) % nprocs;
   *ista = myrank*iwork1 + n1 + min(myrank, iwork2);
   *iend = *ista + iwork1 - 1;
   if(iwork2 > myrank) *iend = *iend + 1;
}

Block distribution example

S = a(1) + a(2) + ... + a(N): each process sums its own block,
S_k = a(ista_k) + ... + a(iend_k), and the partial sums are combined into
S = S_0 + S_1 + ... + S_{p-1}.

Block distribution example: Fortran

PROGRAM para_sum
INCLUDE 'mpif.h'
PARAMETER (n = 1000)   ! n is illustrative
INTEGER(8) :: a(n)
INTEGER(8) :: sum, ssum
CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
CALL para_range(1, n, nprocs, myrank, ista, iend)
DO i = ista, iend
   a(i) = i
ENDDO
sum = 0
DO i = ista, iend
   sum = sum + a(i)
ENDDO
CALL MPI_REDUCE(sum, ssum, 1, MPI_INTEGER8, MPI_SUM, 0, MPI_COMM_WORLD, ierr)
sum = ssum
IF (myrank == 0) PRINT *, 'sum =', sum
CALL MPI_FINALIZE(ierr)
END

Block distribution example: C (1/2)

/*parallel_main*/
#include <mpi.h>
#include <stdio.h>
#define n 1000   /* n is illustrative */
void para_range(int, int, int, int, int*, int*);
int min(int, int);
void main (int argc, char *argv[]){
   int i, nprocs, myrank;
   int ista, iend;
   double a[n], sum, tmp;
   MPI_Init(&argc, &argv);
   MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
   MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
   para_range(1, n, nprocs, myrank, &ista, &iend);
   for(i = ista-1; i<iend; i++) a[i] = i+1;
   sum = 0.0;
   for(i = ista-1; i<iend; i++) sum = sum + a[i];

Block distribution example: C (2/2)

   MPI_Reduce(&sum, &tmp, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
   sum = tmp;
   if(myrank == 0) printf("sum = %f \n", sum);
   MPI_Finalize();
}

int min(int x, int y){
   int v;
   if (x>=y) v = y;
   else v = x;
   return v;
}

void para_range(int n1, int n2, int nprocs, int myrank, int *ista, int *iend){
   ...
}

Cyclic distribution

DO i = n1, n2
   computation
ENDDO

becomes

DO i = n1+myrank, n2, nprocs
   computation
ENDDO

- More effective load balancing than block distribution
- More cache misses than block distribution

Block-cyclic distribution

DO ii = n1+myrank*iblock, n2, nprocs*iblock
   DO i = ii, MIN(ii+iblock-1, n2)
      computation
   ENDDO
ENDDO

(Figure: iterations are handed out in blocks of iblock, cycling over the ranks.)

Cyclic and block-cyclic distribution exercise

- Rewrite the block distribution example using cyclic distribution, then block-cyclic distribution.

101 배열수축 (/4) q 병렬화작업에참여하는프로세스는전체배열을가져올필요없이자신이계산을담당한부분의데이터만메모리에가져와계산을수행하면됨다른프로세스의데이터가필요하면통신을통해송 / 수신 q n개의프로세서를연결한분산메모리시스템은 개의프로세서를가진시스템보다 n배의메모리를사용할수있음 q 사용자는분산메모리시스템에서의병렬화작업을통하여처리하는데이터크기를증가시킬수있게됨 è 배열수축 (Shrinking Arrays) 기술 Supercomputing Center q 순차실행 배열수축 (/4) a(i,j) Supercomputing Center

102 배열수축 (/4) q 병렬실행 Process Process Process Process Supercomputing Center 배열수축 (4/4) Process Process Process Process Supercomputing Center 4

103 메모리동적할당예제 : Fortran (/) PROGRAM dynamic_alloc INCLUDE mpif.h PARAMETER (n =, n = ) REAL(8), ALLOCATABLE :: a(:) Real(8) :: sum, ssum CALL MPI_INIT(ierr) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) CALL para_range(n, n, nprocs, myrank, ista, iend) ALLOCATE (a(ista:iend)) DO i = ista, iend a(i) = real(i) sum =. DO i = ista, iend sum = sum + a(i) DEALLOCATE (a) Supercomputing Center 5 메모리동적할당예제 : Fortran (/) CALL MPI_REDUCE(sum, ssum,, MPI_REAL8, MPI_SUM,, & MPI_COMM_WORLD, ierr) sum = ssum PRINT *, sum =,sum CALL MPI_FINALIZE(ierr) END SUBROUTINE para_range( ) Supercomputing Center 6

104 메모리동적할당예제 : C (/) /*dynamic_alloc*/ #include <mpi.h> #include <stdio.h> #define n void para_range(int, int, int, int, int*, int*); int min(int, int); void main (int argc, char *argv[]){ int i, nprocs, myrank ; int ista, iend, diff; double sum, tmp; double *a; MPI_Init(&argc, &argv); MPI_Comm_size(MPI_COMM_WORLD, &nprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); para_range(, n-, nprocs, myrank, &ista, &iend); diff = iend-ista+; a = (double *)malloc(diff*sizeof(double)); Supercomputing Center 7 메모리동적할당예제 : C (/) for(i = ista-; i<iend; i++) a[i] = i+; sum =.; for(i = ista-; i<iend; i++) sum = sum + a[i]; free(a); MPI_Reduce(&sum, &tmp,, MPI_DOUBLE, MPI_SUM,, MPI_COMM_WORLD); sum = tmp; if(myrank == ) { printf( sum = %f \n, sum); MPI_Finalize(); Supercomputing Center 8 4

105 내포된루프의병렬화 캐시미스줄이기 통신량줄이기 Supercomputing Center 9 내포된루프의병렬화 q 캐시미스와통신의최소화에유념할것 q 차원배열의메모리저장방식 a(i,j) j i ㆍㆍㆍ N a[i][j] j i 4 4 ㆍㆍㆍ N- ㆍㆍㆍ ㆍㆍㆍ N N- (a) Fortran (b) C Supercomputing Center 5

106 캐시미스줄이기 (/) q 메모리저장방식에의한캐시미스차이 Loop A (column-major) DO j =, n DO i =, n a(i,j) = b(i,j) + c(i,j) Loop B (row-major ) DO i =, n DO j =, n a(i,j) = b(i,j) + c(i,j) Fortran 은루프 B에서더많은캐시미스가발생하므로, 루프 A가더빠르게실행됨 è 루프 A의병렬화 Supercomputing Center 캐시미스줄이기 (/) q 바깥쪽루프와안쪽루프의병렬화에따른캐시미스차이 Loop A ( 바깥쪽병렬화 ) DO j = jsta, jend DO i =, n a(i,j) = b(i,j) + c(i,j) Loop A ( 안쪽병렬화 ) DO j =, n DO i = ista, iend a(i,j) = b(i,j) + c(i,j) Fortran 은루프 A 에서더많은캐시미스가발생하므로, 루프 A 이더빠르게실행됨 ( 다음장그림참조 ) Supercomputing Center 6

107 캐시미스줄이기 (/) a(i,j) i j Process Process Process a(i,j) i j Process Process Process () Loop A () Loop A Supercomputing Center 통신량줄이기 (/6) q 한방향으로필요한데이터를통신해야하는경우 i j b(i,j-) a(i,j) b(i,j+) Loop C DO j =, n DO i =, n a(i,j) = b(i, j-) + b(i, j+) Supercomputing Center 4 7

108 통신량줄이기 (/6) q 바깥쪽루프 ( 열방향 ) 병렬화에의해요구되는통신 Process r- Process r Process r+ j i jsta jend b(i,jsta-) a(i,jsta) b(i,jsta+) b(i,jend-) a(i,jend) b(i,jend+) n q 이경우안쪽루프 ( 행방향 ) 의병렬화는통신필요없음 Supercomputing Center 5 통신량줄이기 (/6) q 양방향으로필요한데이터를통신해야하는경우 i j b(i-,j) b(i,j-) a(i,j) b(i,j+) b(i+,j) Loop D DO j =, n DO i =, m a(i,j) = b(i-,j) + b(i,j-) + b(i,j+) + b(i+,j) Supercomputing Center 6 8

109 Reducing communication volume (4/6)
q Which distribution is better depends on the sizes of m and n.
[Figure: (1) column-wise distribution of b(i,j) — boundary columns of length m are exchanged; (2) row-wise distribution — boundary rows of length n are exchanged.]

Reducing communication volume (5/6)
q Parallelizing both the inner and the outer loop (2-D block distribution):

Loop E
DO j = jsta, jend
  DO i = ista, iend
    a(i,j) = b(i-1,j) + b(i,j-1) + b(i,j+1) + b(i+1,j)
  ENDDO
ENDDO

[Figure: the domain is split into rectangular blocks, one per process.]

110 Reducing communication volume (6/6)
q The number of processes should be a composite number, so that it can be arranged into a process grid.
q The closer each block is to a square, the smaller the communication volume (for a given area, a square minimizes the block's perimeter).

- Referencing data owned by other processes
- Parallelizing the 1-D finite difference method
- Transferring large amounts of data
- Data synchronization

111 A simple example of referencing external data
q During parallelization a process may need to reference data owned by another process.

Serial:
REAL a(9), b(9)
DO i = 1, 9
  a(i) = i
ENDDO
DO i = 1, 9
  b(i) = b(i)*a(1)
ENDDO

Parallel:
REAL a(9), b(9)
DO i = ista, iend
  a(i) = i
ENDDO
CALL MPI_BCAST(a(1), 1, MPI_REAL, 0, &
     MPI_COMM_WORLD, ierr)
DO i = ista, iend
  b(i) = b(i)*a(1)
ENDDO

1-D finite difference method (1/3)
q Core of the 1-D finite difference method (FDM): serial program

Fortran
PROGRAM oned_fdm_serial
PARAMETER (n=11)
DIMENSION a(n), b(n)
DO i = 1, n
  b(i) = i
ENDDO
DO i = 2, n-1
  a(i) = b(i-1) + b(i+1)
ENDDO
END

C
/* oned_fdm_serial */
#define n 11
int main(void) {
    double a[n], b[n];
    int i;
    for (i = 0; i < n; i++) b[i] = i + 1;
    for (i = 1; i < n-1; i++) a[i] = b[i-1] + b[i+1];
    return 0;
}

112 1-D finite difference method (2/3)
q Data dependence of the 1-D FDM
[Figure: each interior element a(2)..a(10) depends on the two neighboring elements b(i-1) and b(i+1) of b(1)..b(11).]

1-D finite difference method (3/3)
q Data transfers in the parallelized 1-D FDM
[Figure: each process sends its boundary elements b(ista) and b(iend) to its neighbors and receives b(ista-1) and b(iend+1) from them before computing a(ista)..a(iend).]
A 2-D FDM transfers the boundary lines of 2-D arrays; a 3-D FDM transfers the boundary faces of 3-D arrays.

113 1-D FDM parallelized code: Fortran (1/2)

PROGRAM parallel_oned_fdm
INCLUDE 'mpif.h'
PARAMETER (n=11)
DIMENSION a(n), b(n)
INTEGER istatus(MPI_STATUS_SIZE)
CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
CALL para_range(1, n, nprocs, myrank, ista, iend)
ista2 = ista; iend1 = iend
IF (myrank == 0) ista2 = 2
IF (myrank == nprocs-1) iend1 = n - 1
inext = myrank + 1; iprev = myrank - 1
IF (myrank == nprocs-1) inext = MPI_PROC_NULL
IF (myrank == 0) iprev = MPI_PROC_NULL
DO i = ista, iend
  b(i) = i
ENDDO

1-D FDM parallelized code: Fortran (2/2)

CALL MPI_ISEND(b(iend), 1, MPI_REAL, inext, 1, &
     MPI_COMM_WORLD, isend1, ierr)
CALL MPI_ISEND(b(ista), 1, MPI_REAL, iprev, 1, &
     MPI_COMM_WORLD, isend2, ierr)
CALL MPI_IRECV(b(ista-1), 1, MPI_REAL, iprev, 1, &
     MPI_COMM_WORLD, irecv1, ierr)
CALL MPI_IRECV(b(iend+1), 1, MPI_REAL, inext, 1, &
     MPI_COMM_WORLD, irecv2, ierr)
CALL MPI_WAIT(isend1, istatus, ierr)
CALL MPI_WAIT(isend2, istatus, ierr)
CALL MPI_WAIT(irecv1, istatus, ierr)
CALL MPI_WAIT(irecv2, istatus, ierr)
DO i = ista2, iend1
  a(i) = b(i-1) + b(i+1)
ENDDO
CALL MPI_FINALIZE(ierr)
END

114 1-D FDM parallelized code: C (1/2)

/* parallel_oned_fdm */
#include <mpi.h>
#define n 11
void para_range(int, int, int, int, int*, int*);
int min(int, int);
int main(int argc, char *argv[]) {
    int i, nprocs, myrank;
    double a[n], b[n];
    int ista, iend, ista2, iend1, inext, iprev;
    MPI_Request isend1, isend2, irecv1, irecv2;
    MPI_Status istatus;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    para_range(0, n-1, nprocs, myrank, &ista, &iend);
    ista2 = ista; iend1 = iend;
    if (myrank == 0) ista2 = 1;
    if (myrank == nprocs-1) iend1 = n - 2;

1-D FDM parallelized code: C (2/2)

    inext = myrank + 1; iprev = myrank - 1;
    if (myrank == nprocs-1) inext = MPI_PROC_NULL;
    if (myrank == 0) iprev = MPI_PROC_NULL;
    for (i = ista; i <= iend; i++) b[i] = i + 1;
    MPI_Isend(&b[iend], 1, MPI_DOUBLE, inext, 1, MPI_COMM_WORLD, &isend1);
    MPI_Isend(&b[ista], 1, MPI_DOUBLE, iprev, 1, MPI_COMM_WORLD, &isend2);
    MPI_Irecv(&b[ista-1], 1, MPI_DOUBLE, iprev, 1, MPI_COMM_WORLD, &irecv1);
    MPI_Irecv(&b[iend+1], 1, MPI_DOUBLE, inext, 1, MPI_COMM_WORLD, &irecv2);
    MPI_Wait(&isend1, &istatus);
    MPI_Wait(&isend2, &istatus);
    MPI_Wait(&irecv1, &istatus);
    MPI_Wait(&irecv2, &istatus);
    for (i = ista2; i <= iend1; i++) a[i] = b[i-1] + b[i+1];
    MPI_Finalize();
    return 0;
}

115 대량데이터전송 q 각프로세스의모든데이터를한프로세스로취합. 연속데이터 : 송 / 수신버퍼가중복되지않는경우. 연속데이터 : 송 / 수신버퍼가중복되는경우. 불연속데이터 : 송 / 수신버퍼가중복되는경우 4. 불연속데이터 : 송 / 수신버퍼가중복되지않는경우 q 모든프로세스가가진전체배열을동시에갱신. 송 / 수신버퍼가중복되지않는경우. 송 / 수신버퍼가중복되는경우 q 블록분할된데이터의전환 ( 행방향 열방향 ) 과재할당 Supercomputing Center 9 연속데이터취합 : 버퍼중복없음 (/) q 송 / 수신버퍼가중복되지않는경우 : Fortran j A(i,j) i M jsta 4 5 Process 6 7 jend N A M Process Process 8 9 N A M jsta jend jsta jend N idisp() idisp() idisp() B M 4 5 jjlen()= jjlen()= jjlen()= N Supercomputing Center 5

116 연속데이터취합 : 버퍼중복없음 (/) q 병렬화코드 : Fortran REAL a(m,n), b(m,n) INTEGER, ALLOCATABLE :: idisp(:), jjlen(:)... ALLOCATE (idisp(:nprocs-), jjlen(:nprocs-)) DO irank =, nprocs - CALL para_range(, n, nprocs, irank, jsta, jend) jjlen(irank) = m * (jend - jsta + ) idisp(irank) = m * (jsta - ) CALL para_range(, n, nprocs, myrank, jsta, jend)... CALL MPI_GATHERV(a(,jsta), jjlen(myrank), MPI_REAL, & b, jjlen, idisp, MPI_REAL,, MPI_COMM_WORLD, ierr) DEALLOCATE (idisp, jjlen) Supercomputing Center 연속데이터취합 : 버퍼중복없음 (/) q 병렬화코드 : C double a[m][n], b[m][n]; int *idisp, *iilen; idisp = (int *)malloc(nprocs*sizeof(int)); iilen = (int *)malloc(nprocs*sizeof(int)); for(irank = ; irank<nprocs; irank++){ para_range(, m-, nprocs, irank, &ista, &iend); iilen[irank] = n*(iend-ista+); idisp[irank] = n*ista; para_range(, m-, nprocs, myrank, &ista, &iend); MPI_Gatherv(&a[ista][], iilen[myrank], MPI_DOUBLE, b, iilen, idisp, MPI_DOUBLE,, MPI_COMM_WORLD); free(idisp); free(iilen); Supercomputing Center 6

117 연속데이터취합 : 버퍼중복 (/5) q 송 / 수신버퍼가중복되는경우 : Fortran Process Process Process j jjlen()= 4 jjlen()= jjlen()= A(i,j) N i A 8 N A N M jsta() jsta() jsta() M M jsta jend jsta jend Supercomputing Center 연속데이터취합 : 버퍼중복 (/5) q 병렬화코드 : Fortran REAL a(m,n) INTEGER, ALLOCATABLE :: jjsta(:), jjlen(:), iireq(:) INTEGER istatus(mpi_status_size)... ALLOCATE (jjsta(:nprocs-)) ALLOCATE (jjlen(:nprocs-)) ALLOCATE (iireq(:nprocs-)) DO irank =, nprocs - CALL para_range(, n, nprocs, irank, jsta, jend) jjsta(irank) = jsta jjlen(irank) = m * (jend - jsta + ) CALL para_range(, n, nprocs, myrank, jsta, jend)... Supercomputing Center 4 7

118 연속데이터취합 : 버퍼중복 (/5) q 병렬화코드 : Fortran ( 계속 ) IF (myrank == ) THEN ELSE ENDIF DO irank =, nprocs - CALL MPI_IRECV(a(,jjsta(irank)),jjlen(irank),MPI_REAL,& DO irank =, nprocs - irank,, MPI_COMM_WORLD, iireq(irank),ierr) CALL MPI_WAIT(iireq(irank), istatus, ierr) CALL MPI_ISEND(a(,jsta), jjlen(myrank), MPI_REAL, &,, MPI_COMM_WORLD, ireq, ierr) CALL MPI_WAIT(ireq, istatus, ierr) DEALLOCATE (jjsta, jjlen, iireq) Supercomputing Center 5 연속데이터취합 : 버퍼중복 (4/5) q 병렬화코드 : C double a[m][n]; int *iista, *iilen; MPI_Request *iireq; MPI_Status istatus; iista = (int *)malloc(nprocs*sizeof(int)); iilen = (int *)malloc(nprocs*sizeof(int)); iireq = (int *)malloc(nprocs*sizeof(int)); for(irank = ; irank<nprocs; irank++){ para_range(, m-, nprocs, irank, &ista, &iend); iista[irank] = ista; iilen[irank] = n*(iend-ista+); para_range(, m-, nprocs, myrank, &ista, &iend); Supercomputing Center 6 8

119 연속데이터취합 : 버퍼중복 (5/5) q 병렬화코드 : C ( 계속 ) if (myrank == ) { for(irank = ; irank<nprocs; irank++) MPI_Irecv(&a[][iista[irank]], iilen[irank], MPI_DOUBLE, irank,, MPI_COMM_WORLD, &iireq[irank]); for(irank = ; irank<nprocs; irank++) else { MPI_Wait(&iireq[irank], &istatus); MPI_Isend(&a[][ista], iilen[myrank], MPI_DOUBLE,,, MPI_COMM_WORLD, &iireq); MPI_Wait(&ireq, &istatus); free(iista); free(iilen); free(iireq); Supercomputing Center 7 불연속데이터취합 : 버퍼중복 (/5) q 송 / 수신버퍼가중복되는경우 : Fortran Process Process Process A(i,j) i j N ista A N A N itype() iend itype() ista 5 itype() iend ista 8 M 9 9 M M 9 9 iend Supercomputing Center 8 9

120 불연속데이터취합 : 버퍼중복 (/5) q 병렬화코드 : Fortran REAL a(m,n) PARAMETER(ndims=) INTEGER sizes(ndims), subsizes(ndims), starts(ndims) INTEGER, ALLOCATABLE :: itype(:), iireq(:) INTEGER istatus(mpi_status_size)... ALLOCATE (itype(:nprocs-), iireq(:nprocs-)) sizes()=m; sizes()=n DO irank =, nprocs - CALL para_range(, m, nprocs, irank, ista, iend) subsizes() = iend-ista+; subsizes() = n starts() = ista-; starts() = CALL MPI_TYPE_CREATE_SUBARRAY(ndims, sizes, subsizes, & starts, MPI_ORDER_FORTRAN, MPI_REAL, itype(irank), ierr) CALL MPI_TYPE_COMMIT(itype(irank), ierr) Supercomputing Center 9 불연속데이터취합 : 버퍼중복 (/5) q 병렬화코드 : Fortran ( 계속 ) CALL para_range(, m, nprocs, myrank, ista, iend) IF (myrank == ) THEN DO irank =, nprocs - CALL MPI_IRECV(a,, itype(irank), irank, &, MPI_COMM_WORLD, iireq(irank), ierr) DO irank =, nprocs - CALL MPI_WAIT(iireq(irank), istatus, ierr) ELSE CALL MPI_ISEND(a,, itype(myrank),,, MPI_COMM_WORLD, & ireq, ierr) CALL MPI_WAIT(ireq, istatus, ierr) ENDIF DEALLOCATE (itype, iireq) Supercomputing Center 4

121 불연속데이터취합 : 버퍼중복 (4/5) q 병렬화코드 : C #define ndims double a[m][n]; MPI_Datatype *itype; MPI_Request *iireq; int sizes[ndims], subsizes[ndims], starts[ndims]; MPI_Status istatus; itype = (int *)malloc(nprocs*sizeof(int)); iireq = (int *)malloc(nprocs*sizeof(int)); sizes[]=m; sizes[]=n; for(irank = ; irank<nprocs; irank++){ para_range(, n-, nprocs, irank, &jsta, &jend); subsizes[]= m; subsizes[] = jend-jsta+; starts[] = ; starts[] = jsta; MPI_Type_create_subarray(ndims, sizes, subsizes, starts, MPI_ORDER_C, MPI_DOUBLE, &itype[irank]); MPI_Type_commit(&itype[irank]); Supercomputing Center 4 불연속데이터취합 : 버퍼중복 (5/5) q 병렬화코드 : C ( 계속 ) para_range(, n, nprocs, myrank, &jsta, &jend); if (myrank == ) { for(irank = ; irank<nprocs; irank++) MPI_Irecv(a,, itype[irank], irank,, MPI_COMM_WORLD, &iireq[irank]); for(irank = ; irank<nprocs; irank++) MPI_Wait(&iireq[irank], &istatus); else { MPI_Isend(a,, itype[myrank],,, MPI_COMM_WORLD, &ireq); MPI_Wait(&ireq, &istatus); free(itype); free(iireq); Supercomputing Center 4

122 데이터동기화 : 버퍼중복없음 (/) q 송 / 수신버퍼가중복되지않는경우 : Fortran Process Process Process A 4 6 N A 8 N A N M jsta 5 7 jend M 9 M jsta jend jsta jend idisp() idisp() idisp() B 4 6 N B 4 6 N B 4 6 N M M M jjlen()= 4 jjlen()= jjlen()= Supercomputing Center 4 데이터동기화 : 버퍼중복없음 (/) q 병렬화코드 : Fortran REAL a(m,n), b(m,n) INTEGER, ALLOCATABLE :: idisp(:), jjlen(:)... ALLOCATE (idisp(:nprocs-), jjlen(:nprocs-)) DO irank =, nprocs - CALL para_range(, n, nprocs, irank, jsta, jend) jjlen(irank) = m * (jend - jsta + ) idisp(irank) = m * (jsta - ) CALL para_range(, n, nprocs, myrank, jsta, jend)... CALL MPI_ALLGATHERV(a(,jsta), jjlen(myrank), MPI_REAL, & b, jjlen, idisp, MPI_REAL, MPI_COMM_WORLD, ierr) DEALLOCATE (idisp, jjlen) Supercomputing Center 44

123 데이터동기화 : 버퍼중복없음 (/) q 병렬화코드 : C double a[m][n], b[m][n]; int *idisp, *iilen; idisp = (int *)malloc(nprocs*sizeof(int)); iilen = (int *)malloc(nprocs*sizeof(int)); for(irank = ; irank<nprocs; irank++){ para_range(, m-, nprocs, irank, &ista, &iend); iilen[irank] = n*(iend-ista+); idisp[irank] = n*ista; para_range(, m-, nprocs, myrank, &ista, &iend); MPI_Allgathrev(&a[ista][], iilen[myrank], MPI_DOUBLE, b, iilen, idisp, MPI_DOUBLE,, MPI_COMM_WORLD); free(idisp); free(iilen); Supercomputing Center 45 데이터동기화 : 버퍼중복 (/) q 송 / 수신버퍼가중복되는경우 : Fortran Process Process Process jjlen()= 4 jjlen()= jjlen()= A 4 6 N A 4 6 N A 4 6 N M jjsta() jjsta() jjsta() M M Supercomputing Center 46

124 데이터동기화 : 버퍼중복 (/) q 병렬화코드 : Fortran REAL a(m,n) INTEGER, ALLOCATABLE :: jjsta(:), jjlen(:)... ALLOCATE (jjsta(:nprocs-), jjlen(:nprocs-)) DO irank =, nprocs - CALL para_range(, n, nprocs, irank, jsta, jend) jjsta(irank) = jsta jjlen(irank) = m * (jend - jsta + ) CALL para_range(, n, nprocs, myrank, jsta, jend)... DO irank =, nprocs - CALL MPI_BCAST(a(,jjsta(irank)), jjlen(irank), MPI_REAL, & irank, MPI_COMM_WORLD, ierr) DEALLOCATE (jjsta, jjlen) Supercomputing Center 47 데이터동기화 : 버퍼중복 (/) q 병렬화코드 : C double a[m][n]; int *iista, *iilen ; iista = (int *)malloc(nprocs*sizeof(int)); iilen = (int *)malloc(nprocs*sizeof(int)); for(irank = ; irank<nprocs; irank++){ para_range_sta(, m-, nprocs, irank, &ista, &iend); iista[irank] = ista; iilen[irank] = n*(iend-ista+); para_range_sta(, m-, nprocs, myrank, &ista, &iend); for(irank = ; irank<nprocs; irank++){ MPI_BCAST(&a[iista[irank]][], iilen[irank], MPI_DOUBLE, irank, MPI_COMM_WORLD); free(iista); free(iilen); Supercomputing Center 48 4

125 블록분할의전환 중첩 파이프라인방법 비틀림분해 프리픽스합 Supercomputing Center 49 블록분할의전환 (/8) q 열방향블록분할 è 행방향블록분할 Process Process A(i,j) j N i M Process A(i,j) i M j N Supercomputing Center 5 5

126 블록분할의전환 (/8) q 유도데이터타입정의 : MPI_TYPE_CREATE_SUBARRAY A(i,j) i j itype(,) itype(,) itype(,) itype(,) itype(,) itype(,) N M itype(,) itype(,) itype(,) Supercomputing Center 5 블록분할의전환 (/8) q 병렬화코드 : Fortran PARAMETER (m=7, n=8) PARAMETER (ncpu=) PARAMETER(ndims=) REAL a(m,n) INTEGER sizes(ndims), subsizes(ndims), starts(ndims) INTEGER itype(:ncpu-, :ncpu-) INTEGER ireq(:ncpu-), ireq(:ncpu-) INTEGER istatus(mpi_status_size) sizes()=m sizes()=n CALL MPI_INIT(ierr) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) Supercomputing Center 5 6

127 블록분할의전환 (4/8) q 병렬화코드 : Fortran ( 계속 ) DO jrank =, nprocs- CALL para_range(, n, nprocs, jrank, jsta, jend) DO irank =, nprocs- CALL para_range(, m, nprocs, irank, ista, iend) subsizes() = iend-ista+; subsizes() = jend-jsta+ starts() = ista-; starts() = jsta-; CALL MPI_TYPE_CREATE_SUBARRAY(ndims, sizes, subsizes, & starts, MPI_ORDER_FORTRAN, MPI_REAL, & itype(irank,jrank), ierr) CALL MPI_TYPE_COMMIT(itype(irank,jrank), ierr) CALL para_range(, m, nprocs, myrank, ista, iend) CALL para_range(, n, nprocs, myrank, jsta, jend) Supercomputing Center 5 블록분할의전환 (5/8) q 병렬화코드 : Fortran ( 계속 ) DO irank =, nprocs- IF (irank /= myrank) THEN CALL MPI_ISEND(a,, itype(irank, myrank), irank,, & MPI_COMM_WORLD, ireq(irank), ierr) CALL MPI_IRECV(a,, itype(myrank, irank), irank,, & MPI_COMM_WORLD, ireq(irank), ierr) ENDIF DO irank =, nprocs- IF (irank /= myrank) THEN CALL MPI_WAIT(ireq(irank), istatus, ierr) CALL MPI_WAIT(ireq(irank), istatus, ierr) ENDIF Supercomputing Center 54 7

128 q 병렬화코드 : C 블록분할의전환 (6/8) #define ndims #define ncpu #define m 7 #define n 8 double a[m][n]; int *itype; int sizes[ndims], subsizes[ndims], starts[ndims]; MPI_Request ireq[ncpu], ireq[ncpu]; MPI_Datatype itype[ncpu][ncpu]; MPI_Status istatus; sizes[]=m; sizes[]=n; MPI_Init(&argc, &argv); MPI_Comm_size(MPI_COMM_WORLD, &nprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); Supercomputing Center 55 블록분할의전환 (7/8) q 병렬화코드 : C ( 계속 ) for(jrank = ; jrank<nprocs; jrank++){ para_range(, n-, nprocs, jrank, &jsta, &jend); for(irank = ; irank<nprocs; irank++){ para_range(, m-, nprocs, irank, &ista, &iend); subsizes[] =iend-ista+; subsizes[] = jend-jsta+; starts[] = ista; starts[] = jsta; MPI_Type_create_subarray(ndims, sizes, subsizes, starts, MPI_ORDER_C, itype[irank][jrank]); MPI_Type_commit(&itype[irank][jrank]); para_range(, m-, nprocs, myrank, &ista, &iend); para_range(, n-, nprocs, myrank, &jsta, &jend); MPI_DOUBLE, Supercomputing Center 56 8

129 Switching the block distribution (8/8)
q Parallelized code: C (continued)

for (irank = 0; irank < nprocs; irank++) {
  if (irank != myrank) {
    MPI_Isend(a, 1, itype[irank][myrank], irank, 1, MPI_COMM_WORLD, &ireq1[irank]);
    MPI_Irecv(a, 1, itype[myrank][irank], irank, 1, MPI_COMM_WORLD, &ireq2[irank]);
  }
}
for (irank = 0; irank < nprocs; irank++) {
  if (irank != myrank) {
    MPI_Wait(&ireq1[irank], &istatus);
    MPI_Wait(&ireq2[irank], &istatus);
  }
}

Superposition (1/3)
q Gathering data with a reduction operation: when every process holds the entire array (no array shrinking) and performs its computation only on a non-contiguous part of it, and the results must be collected on one process → set every element other than the computed non-contiguous data to 0 and perform the reduction.

130 Superposition (2/3)
q Fortran code

REAL a(n,n), aa(n,n)
...
DO j = 1, n
  DO i = 1, n
    a(i,j) = 0.0
  ENDDO
ENDDO
DO j = 1, n
  DO i = 1 + myrank, n, nprocs
    a(i,j) = (computation)
  ENDDO
ENDDO
CALL MPI_REDUCE(a, aa, n*n, MPI_REAL, MPI_SUM, 0, &
     MPI_COMM_WORLD, ierr)
...

Superposition (3/3)
[Figure: the full array A(i,j) equals the element-wise sum of the per-process arrays — each process holds nonzero values only in its own strided rows, so the MPI_SUM reduction reassembles the complete result on the root process.]

131 파이프라인방법 (/9) q 실행에의존성을가지는루프. x(i,j) i j Proc Proc Proc PROGRAM main PARAMETER (mx =, my = ) DIMENSION x(:mx, :my) DO j =, my DO i =, mx x(i,j) = x(i,j)+x(i-,j)+x(i,j-) Supercomputing Center 6 파이프라인방법 (/9) q 실행에의존성을가지는루프. x(i,j) i j Proc Proc Proc PROGRAM main PARAMETER (mx =, my = ) DIMENSION x(:mx, :my) DO j =, my DO i =, mx x(i,j) = x(i,j) + x(i,j-) Supercomputing Center 6

132 파이프라인방법 (/9) q 의존성을가지는루프의병렬화 : Fortran x(i,j) i deg= j Proc Proc Proc iblock deg(degree of parallelism): 동시에계산을수행하는프로세스개수 deg= 4 deg= deg= Process Process Process time deg= deg= deg= (a) Blocks and data distribution (b)how processing of blocks is scheduled Supercomputing Center 6 파이프라인방법 (4/9) q 의존성을가지는루프의병렬화코드 : Fortran PROGRAM main_pipe INCLUDE mpif.h PARAMETER (mx=, my=) DIMENSION x(:mx, :my) INTEGER istatus(mpi_status_size) CALL MPI_INIT(ierr) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) CALL para_range(, my, nprocs, myrank, jsta, jend) inext = myrank + IF (inext == nprocs) inext = MPI_PROC_NULL iprev = myrank IF (iprev == -) iprev = MPI_PROC_NULL iblock = Supercomputing Center 64

133 파이프라인방법 (5/9) q 의존성을가지는루프의병렬화코드 : Fortran ( 계속 ) DO ii =, mx, iblock iblklen = MIN(iblock, mx-ii+) CALL MPI_IRECV (x(ii, jsta-), iblklen, MPI_REAL, iprev,, & MPI_COMM_WORLD, ireqr, ierr) CALL MPI_WAIT(ireqr, istatus, ierr) DO j = jsta, jend DO i = ii, ii+iblklen- x(i,j) = x(i,j) + x(i-,j) + x(i,j-) CALL MPI_ISEND (x(ii, jend), iblklen, MPI_REAL, inext,, & MPI_COMM_WORLD, ireqs, ierr) CALL MPI_WAIT(ireqs, istatus, ierr) Supercomputing Center 65 파이프라인방법 (6/9) q 의존성을가지는루프의병렬화코드 : C /* main_pipe */ main(int argc, char *argv[]){ double x[mx+][my+]; MPI_Init(&argc, &argv); MPI_Comm_size(MPI_COMM_WORLD, &nprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); para_range(, mx, nprocs, myrank, &ista, &iend); inext = myrank + ; if (inext == nprocs) inext = MPI_PROC_NULL; iprev = myrank ; if (iprev == -) iprev = MPI_PROC_NULL; jblock = ; Supercomputing Center 66

134 파이프라인방법 (7/9) q 의존성을가지는루프의병렬화코드 : C ( 계속 ) for(jj=; jj<=my; jj+=jblock){ jblklen = min(jblock, my-jj+); MPI_Irecv(&x[ista-][jj], jblklen, MPI_DOUBLE, iprev,, MPI_COMM_WORLD, &ireqr); MPI_Wait(&ireqr, &istatus); for(i=ista; i<=iend; i++) for(j=jj; j<=jj+jblklen-; j++){ if((i-)==) x[i-][j]=.; if((j-)==) x[i][j-]=.; x[i][j] = x[i][j] + x[i-][j] + x[i][j-]; MPI_Isend(&x[iend][jj], jblklen, MPI_DOUBLE, inext,, MPI_COMM_WORLD, &ireqs); MPI_Wait(&ireqs, &istatus); Supercomputing Center 67 파이프라인방법 (8/9) q 데이터의흐름 : 교착에빠지지않음 Process Process Process Process(nprocs-) Supercomputing Center 68 4

135 파이프라인방법 (9/9) q 블록 (iblock/jblock) 크기결정 동시수행되는부분이많을수록통신부담이커지게됨 A(i,j) j A(i,j) j A(i,j) j iblock iblock iblock i i i Proc Proc Proc Proc Proc Proc Proc Proc Proc Proc Proc Proc Supercomputing Center 69 비틀림분해 (/4) q 항상모든프로세스가참여하는병렬화가능 파이프라인방법보다성능면에서유리 효과적인로드밸런싱데이터분배가복잡해코드작성이어려움 Supercomputing Center 7 5

136 Twisted decomposition (2/14)
q Example loops for the twisted decomposition: Fortran

! Loop A
DO j = 1, my
  DO i = 2, mx
    x(i,j) = x(i,j) + x(i-1,j)
  ENDDO
ENDDO
! Loop B
DO j = 2, my
  DO i = 1, mx
    x(i,j) = x(i,j) + x(i,j-1)
  ENDDO
ENDDO

Twisted decomposition (3/14)
1. Decompose the rows and the columns each into nprocs blocks.
2. Denote each block's position by coordinates (I, J).
3. Assign the computation of block (I, J) to the process whose rank equals MOD(J - I + nprocs, nprocs).
[Figure: (a) Loop A, (b) Loop B, (c) the twisted decomposition — every process owns exactly one block in each block-row and each block-column.]

137 비틀림분해 (4/4) q 비틀림분해를이용한병렬화코드 : Fortran PROGRAM main_twist INCLUDE mpif.h INTEGER istatus(mpi_status_size) INTEGER, ALLOCATABLE :: is(:), ie(:), js(:). je(:) PARAMETER (mx=, my=, m=) DIMENSION x(:mx, :my) DIMENSION bufs(m), bufr(m) CALL MPI_INIT(ierr) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) ALLOCATE (is(:nprocs-), ie(:nprocs-)) ALLOCATE (js(:nprocs-), je(:nprocs-)) Supercomputing Center 7 비틀림분해 (5/4) q 비틀림분해를이용한병렬화코드 : Fortran ( 계속 ) DO ix =, nprocs- CALL para_range(, mx, nprocs, ix, is(ix), ie(ix)) CALL para_range(, my, nprocs, ix, js(ix), je(ix)) inext = myrank + IF (inext == nprocs) inext = iprev = myrank IF (iprev == -) iprev = nprocs-! Loop A DO ix =, nprocs- iy = MOD(ix+myrank, nprocs) ista = is(ix); iend = ie(ix); jsta = js(iy); jend = je(iy) jlen = jend jsta + Supercomputing Center 74 7

138 비틀림분해 (6/4) q 비틀림분해를이용한병렬화코드 : Fortran ( 계속 ) IF (ix /= ) THEN CALL MPI_IRECV (bufr(jsta), jlen, MPI_REAL, inext,, & MPI_COMM_WORLD, ireqr, ierr) CALL MPI_WAIT(ireqr, istatus, ierr) CALL MPI_WAIT(ireqs, istatus, ierr) DO j = jsta, jend ENDIF x(ista-,j) = bufr(j) DO j = jsta, jend DO i = ista, iend x(i,j) = x(i,j) + x(i-,j) Supercomputing Center 75 비틀림분해 (7/4) q 비틀림분해를이용한병렬화코드 : Fortran ( 계속 ) IF (ix /= nprocs-) THEN DO j = jsta, jend bufs(j) = x(iend,j) CALL MPI_ISEND (bufs(jsta), jlen, MPI_REAL, iprev,, & ENDIF MPI_COMM_WORLD, ireqs, ierr)! Loop B DO iy =, nprocs- ix = MOD(iy+nprocs-myrank, nprocs) ista = is(ix); iend = ie(ix); jsta = js(iy); jend = je(iy) jlen = jend jsta + Supercomputing Center 76 8

139 비틀림분해 (8/4) q 비틀림분해를이용한병렬화코드 : Fortran ( 계속 ) IF (iy /= ) THEN CALL MPI_IRECV (x(ista,jsta-), ilen, MPI_REAL, iprev,, & MPI_COMM_WORLD, ireqr, ierr) CALL MPI_WAIT(ireqr, istatus, ierr) CALL MPI_WAIT(ireqs, istatus, ierr) ENDIF DO j = jsta, jend DO i = ista, iend x(i, j) = x(i,j) + x(i,j-) IF (iy /= nprocs-) THEN CALL MPI_ISEND (x(ista, jend), ilen, MPI_REAL, inext,, & MPI_COMM_WORLD, ireqs, ierr) ENDIF Supercomputing Center 77 비틀림분해 (9/4) q 비틀림분해를이용한병렬화코드 : C /* main_twist*/ main(int argc, char *argv[]){ double x[mx+][my+]; double bufs[m], bufr[m]; int *is, *ie, *js, *je; MPI_Init(&argc, &argv); MPI_Comm_size(MPI_COMM_WORLD, &nprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); is = (int *)malloc(nprocs*sizeof(int)); ie = (int *)malloc(nprocs*sizeof(int)); js = (int *)malloc(nprocs*sizeof(int)); je = (int *)malloc(nprocs*sizeof(int)); Supercomputing Center 78 9

140 비틀림분해 (/4) q 비틀림분해를이용한병렬화코드 : C ( 계속 ) for(ix=; ix<nprocs; ix++){ para_range(, mx, nprocs, myrank, &is[ix], &ie[ix]); para_range(, my, nprocs, myrank, &js[ix], &je[ix]); inext = myrank + ; if (inext == nprocs) inext = ; iprev = myrank ; if (iprev == -) iprev = nprocs-; c Loop A for(ix=; ix<nprocs; ix++){ iy = (ix+myrank)%nprocs; ista=is[ix]; iend=ie[ix]; jsta=js[iy]; jend=je[iy]; jlen = jend-jsta+; Supercomputing Center 79 비틀림분해 (/4) q 비틀림분해를이용한병렬화코드 : C ( 계속 ) if(ix!= ){ MPI_Irecv(&x[ista-][jsta], jlen, MPI_DOUBLE, inext,, MPI_COMM_WORLD, &ireqr); MPI_Wait(&ireqr, &istatus); MPI_Wait(&ireqs, &istatus); for(i=ista; i<=iend; i++) for(j=jsta; j<=jend; j++){ if((i-)==) x[i-][j]=.; x[i][j] = x[i][j] + x[i-][j]; if(ix!= nprocs-){ MPI_Isend(&x[iend][jsta], jlen, MPI_DOUBLE, iprev,, MPI_COMM_WORLD, &ireqs); Supercomputing Center 8 4

141 비틀림분해 (/4) q 비틀림분해를이용한병렬화코드 : C ( 계속 ) c Loop B for(iy=; iy<nprocs; iy++){ ix = (iy+nprocs-myrank)%nprocs ista=is[ix]; iend=ie[ix]; jsta=js[iy]; jend=je[iy]; ilen = iend-ista+; if(iy!= ){ MPI_Irecv(&bufr[ista], ilen, MPI_DOUBLE, inext,, MPI_COMM_WORLD, &ireqr); MPI_Wait(&ireqr, &istatus); MPI_Wait(&ireqs, &istatus); for(i=ista; i<=iend; i++) x[i][jsta-]=bufr[i]; for(i=ista; i<=iend; i++) for(j=jsta; j<=jend; j++){ Supercomputing Center 8 비틀림분해 (/4) q 비틀림분해를이용한병렬화코드 : C ( 계속 ) if((j-)==) x[i][j-]=.; x[i][j] = x[i][j] + x[i][j-]; if(iy!= nprocs-){ for(i=ista; i<=iend; i++) bufs[i]=x[i][jend]; MPI_Isend(&bufs[ista], ilen, MPI_DOUBLE, iprev,, MPI_COMM_WORLD, &ireqs); free(is); free(ie); free(js); free(je); Supercomputing Center 8 4

142 Twisted decomposition (14/14)
q Data flow: beware of deadlock.
[Figure: at each stage data flows Process 0 → Process 1 → Process 2 → ... → Process (nprocs-1).]

! Loop B
DO iy = 0, nprocs-1
  IF (iy /= 0) THEN
    (receive)
    (wait for receive to complete)
    (wait for send to complete)
  ENDIF
  (computation)
  IF (iy /= nprocs-1) THEN
    (send)
  ENDIF
ENDDO

Prefix sum (1/6)
q A non-nested loop (1-D array) carrying a dependence:

DO i = 1, n
  a(i) = a(i-1) op b(i)
ENDDO

PROGRAM main
PARAMETER (n=15)
REAL a(0:n), b(n)
...
DO i = 1, n
  a(i) = a(i-1) + b(i)
ENDDO
END

143 Prefix sum (2/6)
[Figure: b(1:15) is split into three blocks; each process computes its local sum S1, S2, S3; MPI_SCAN returns the running totals S1, S1+S2, S1+S2+S3; each process then completes its own block with a serial prefix loop starting from its scanned offset, e.g. a(6) = a(0) + S1 + b(6) and a(11) = a(0) + S1 + S2 + b(11).]

Prefix sum (3/6)
q Parallelized code using a prefix sum: Fortran

PROGRAM main_prefix_sum
INCLUDE 'mpif.h'
PARAMETER (n = 15)
REAL a(0:n), b(n)
CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
CALL para_range(1, n, nprocs, myrank, ista, iend)
sum = 0.0
DO i = ista, iend
  sum = sum + b(i)
ENDDO

144 Prefix sum (4/6)
q Parallelized code using a prefix sum: Fortran (continued)

IF (myrank == 0) THEN
  sum = sum + a(0)
ENDIF
CALL MPI_SCAN(sum, ssum, 1, MPI_REAL, MPI_SUM, &
     MPI_COMM_WORLD, ierr)
a(ista) = b(ista) + ssum - sum
IF (myrank == 0) THEN
  a(ista) = a(ista) + a(0)
ENDIF
DO i = ista+1, iend
  a(i) = a(i-1) + b(i)
ENDDO

Prefix sum (5/6)
q Parallelized code using a prefix sum: C

/* prefix_sum */
#include <mpi.h>
#define n 15
void para_range(int, int, int, int, int*, int*);
int main(int argc, char *argv[]) {
    int i, nprocs, myrank, ista, iend;
    double sum, ssum;
    double a[n+1], b[n+1];
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    para_range(1, n, nprocs, myrank, &ista, &iend);
    sum = 0.0;
    for (i = ista; i <= iend; i++) sum = sum + b[i];

145 Prefix sum (6/6)
q Parallelized code using a prefix sum: C (continued)

    if (myrank == 0) sum = sum + a[0];
    MPI_Scan(&sum, &ssum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    a[ista] = b[ista] + ssum - sum;
    if (myrank == 0) a[ista] = a[ista] + a[0];
    for (i = ista+1; i <= iend; i++) a[i] = a[i-1] + b[i];
    MPI_Finalize();
}

Chapter 4. MPI parallel program examples
Examines how MPI is used in real computational problems such as the 2-D finite difference method and molecular dynamics.

146 차원유한차분법의병렬화 몬테카를로방법의병렬화 분자동역학 MPMD 모델 Supercomputing Center 9 차원유한차분법의병렬화 (/) q 차원유한차분법 (FDM) 의핵심부 : 순차프로그램 Fortran PARAMETER (m=6,n=9) DIMENSION a(m,n), b(m,n) DO j =, n DO i =, m a(i,j) = i+.*j DO j =, n- DO i =, m- b(i,j) = a(i-,j)+a(i,j-) & + a(i,j+) + a(i+,j) C #define m 6 #define n 9 main(){ double a[m][n], b[m][n]; for(i=; i<m; i++) for(j=; j<n; j++) a[i][j] = (i+)+.*(j+); for(i=; i<m-; i++) for(j=; j<n-; j++) b[i][j] = a[i-][j] + a[i][j-] + a[i][j+] + a[i+][j] Supercomputing Center 9 46

147 차원유한차분법의병렬화 (/) q 양방향의존성을모두가지고있음 q 통신량을최소화하는데이터분배방식결정 열방향블록분할 행방향블록분할양방향블록분할 Supercomputing Center 9 열방향블록분할 q 경계데이터 : Fortran( 연속 ), C( 불연속 ) Process Process Process a(i,j) j jsta jend jsta jend i M N jsta jend Supercomputing Center 94 47

148 열방향블록분할코드 : Fortran (/) PROGRAM parallel_d_fdm_column INCLUDE mpif.h PARAMETER (m = 6, n = 9) DIMENSION a(m,n), b(m,n) INTEGER istatus(mpi_status_size) CALL MPI_INIT(ierr) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) CALL para_range(, n, nprocs, myrank, jsta, jend) jsta = jsta; jend = jend IF (myrank == ) jsta = IF (myrank == nprocs - ) jend = n - inext = myrank + iprev = myrank - Supercomputing Center 95 열방향블록분할코드 : Fortran (/) IF (myrank == nprocs - ) inext = MPI_PROC_NULL IF (myrank == ) iprev = MPI_PROC_NULL DO j = jsta, jend DO i =, m a(i,j) = i +. * j CALL MPI_ISEND(a(,jend), m, MPI_REAL, inext,, & MPI_COMM_WORLD, isend, ierr) CALL MPI_ISEND(a(,jsta), m, MPI_REAL, iprev,, & MPI_COMM_WORLD, isend, ierr) CALL MPI_IRECV(a(,jsta-), m, MPI_REAL, iprev,, & MPI_COMM_WORLD, irecv, ierr) CALL MPI_IRECV(a(,jend+), m, MPI_REAL, inext,, & MPI_COMM_WORLD, irecv, ierr) Supercomputing Center 96 48

149 열방향블록분할코드 : Fortran (/) CALL MPI_WAIT(isend, istatus, ierr) CALL MPI_WAIT(isend, istatus, ierr) CALL MPI_WAIT(irecv, istatus, ierr) CALL MPI_WAIT(irecv, istatus, ierr) DO j = jsta, jend DO i =, m - b(i,j) = a(i-,j) + a(i,j-) + a(i,j+) + a(i+,j) CALL MPI_FINALIZE(ierr) END Supercomputing Center 97 열방향블록분할코드 : C (/4) /*parallel_d_fdm_column*/ #include <mpi.h> #define m 6 #define n 9 void para_range(int, int, int, int, int*, int*); int min(int, int); main(int argc, char *argv[]){ int i, j, nprocs, myrank ; double a[m][n],b[m][n]; double works[m],workr[m],works[m],workr[m]; int jsta, jend, jsta, jend, inext, iprev; MPI_Request isend, isend, irecv, irecv; MPI_Status istatus; MPI_Init(&argc, &argv); MPI_Comm_size(MPI_COMM_WORLD, &nprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); Supercomputing Center 98 49

150 열방향블록분할코드 : C (/4) para_range(, n-, nprocs, myrank, &jsta, &jend); jsta = jsta; jend = jend; if(myrank==) jsta=; if(myrank==nprocs-) jend=n-; inext = myrank + ; iprev = myrank ; if (myrank == nprocs-) inext = MPI_PROC_NULL if (myrank == ) iprev = MPI_PROC_NULL for(i=; i<m; i++) for(j=jsta; j<=jend; j++) a[i][j] = i +. * j if(myrank!= nprocs-) for(i=; i<m; i++) works[i]=a[i][jend]; if(myrank!= ) for(i=; i<m; i++) works[i]=a[i][jsta]; Supercomputing Center 99 열방향블록분할코드 : C (/4) MPI_Isend(works, m, MPI_DOUBLE, inext,, MPI_COMM_WORLD, &isend); MPI_Isend(works, m, MPI_DOUBLE, iprev,, MPI_COMM_WORLD, &isend); MPI_Irecv(workr, m, MPI_DOUBLE, iprev,, MPI_COMM_WORLD, &irecv); MPI_Irecv(workr, m, MPI_DOUBLE, inext,, MPI_COMM_WORLD, &irecv); MPI_Wait(&isend, &istatus); MPI_Wait(&isend, &istatus); MPI_Wait(&irecv, &istatus); MPI_Wait(&irecv, &istatus); if (myrank!= ) for(i=; i<m; i++) a[i][jsta-] = workr[i]; if (myrank!= nprocs-) for(i=; i<m; i++) a[i][jend+] = workr[i]; Supercomputing Center 5

151 열방향블록분할코드 : C (4/4) for (i=; i<=m-; i++) for(j=jsta; j<=jend; j++) b[i][j] = a[i-][j] + a[i][j-] + a[i][j+] + a[i+][j]; MPI_Finalize(); Supercomputing Center 행방향블록분할 q 경계데이터 : Fortran( 불연속 ), C( 연속 ) Process Process Process j a(i,j) N ista i iend works workr M workr works iend ista works workr workr works ista end Supercomputing Center 5

152 행방향블록분할코드 : Fortran (/4) PROGRAM parallel_d_fdm_row INCLUDE mpif.h PARAMETER (m =, n = ) DIMENSION a(m,n), b(m,n) DIMENSION works(n), workr(n), works(n), workr(n) INTEGER istatus(mpi_status_size) CALL MPI_INIT(ierr) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) CALL para_range(, m, nprocs, myrank, ista, iend) ista = ista; iend = iend IF (myrank == ) ista = IF (myrank == nprocs - ) iend = m- inext = myrank + ; iprev = myrank Supercomputing Center 행방향블록분할코드 : Fortran (/4) IF (myrank == nprocs - ) inext = MPI_PROC_NULL IF (myrank == ) iprev = MPI_PROC_NULL DO j =, n DO i = ista, iend a(i,j) = i +. * j IF (myrank /= nprocs - ) THEN DO j =, n works(j) = a(iend,j) ENDIF IF (myrank /= ) THEN DO j =, n works(j) = a(ista,j) ENDIF Supercomputing Center 4 5

153 행방향블록분할코드 : Fortran (/4) CALL MPI_ISEND(works,n,MPI_REAL,inext,, & MPI_COMM_WORLD, isend,ierr) CALL MPI_ISEND(works,n,MPI_REAL,iprev,, & MPI_COMM_WORLD, isend,ierr) CALL MPI_IRECV(workr,n,MPI_REAL,iprev,, & MPI_COMM_WORLD, irecv,ierr) CALL MPI_IRECV(workr,n,MPI_REAL,inext,, & MPI_COMM_WORLD, irecv,ierr) CALL MPI_WAIT(isend, istatus, ierr) CALL MPI_WAIT(isend, istatus, ierr) CALL MPI_WAIT(irecv, istatus, ierr) CALL MPI_WAIT(irecv, istatus, ierr) Supercomputing Center 5 행방향블록분할코드 : Fortran (4/4) IF (myrank /= ) THEN DO j =, n a(ista-,j) = workr(j) ENDIF IF (myrank /= nprocs - ) THEN DO j =, n a(iend+,j) = workr(j) ENDIF DO j =, n - DO i = ista, iend b(i,j) = a(i-,j) + a(i,j-) + a(i,j+) + a(i+,j) CALL MPI_FINALIZE(ierr) END Supercomputing Center 6 5

154 행방향블록분할코드 : C (/) /*parallel_d_fdm_row*/ #include <mpi.h> #define m #define n void para_range(int, int, int, int, int*, int*); int min(int, int); main(int argc, char *argv[]){ int i, j, nprocs, myrank ; double a[m][n],b[m][n]; int ista, iend, ista, iend, inext, iprev; MPI_Request isend, isend, irecv, irecv; MPI_Status istatus; MPI_Init(&argc, &argv); MPI_Comm_size(MPI_COMM_WORLD, &nprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); Supercomputing Center 7 행방향블록분할코드 : C (/) para_range(, m-, nprocs, myrank, &ista, &iend); ista = ista; iend = iend; if(myrank==) ista=; if(myrank==nprocs-) iend=m-; inext = myrank + ; iprev = myrank ; if (myrank == nprocs-) inext = MPI_PROC_NULL if (myrank == ) iprev = MPI_PROC_NULL for(i=ista; i<=iend; i++) for(j=; j<n; j++) a[i][j] = i +. * j MPI_Isend(&a[iend][], n, MPI_DOUBLE, inext,, MPI_COMM_WORLD, &isend); MPI_Isend(&a[ista][], n, MPI_DOUBLE, iprev,, MPI_COMM_WORLD, &isend); Supercomputing Center 8 54

155 행방향블록분할코드 : C (/) MPI_Irecv(&a[ista-][], n, MPI_DOUBLE, iprev,, MPI_COMM_WORLD, &irecv); MPI_Irecv(&a[iend+][], n, MPI_DOUBLE, inext,, MPI_COMM_WORLD, &irecv); MPI_Wait(&isend, &istatus); MPI_Wait(&isend, &istatus); MPI_Wait(&irecv, &istatus); MPI_Wait(&irecv, &istatus); for (i=ista; i<=iend; i++) for(j=; j<=n-; j++) b[i][j] = a[i-][j] + a[i][j-] + a[i][j+] + a[i+][j]; MPI_Finalize(); Supercomputing Center 9 q 프로세스그리드이용 양방향블록분할 (/) a(i,j) i M j 각숫자는그영역을할당받은프로세스랭크를나타냄 N itable(i,j) i - - null null null null null null 6 null j null 4 7 null 프로세스 5 번은좌표 (,) 에해당 null 5 8 null null null null null null (a) The distribution of a() (b) The process grid Supercomputing Center 55

156 56 Supercomputing Center Supercomputing Center 양방향방향블록블록분할분할 (/) (/) Process Process Process Process 7 Process 4 Process 5 Process 6 Process Process 8 a(i,j) i j M N Supercomputing Center Supercomputing Center 양방향방향블록블록분할분할코드코드 : Fortran (/6) : Fortran (/6) PROGRAM parallel_d_fdm_both INCLUDE mpif.h PARAMETER (m =, n = 9) DIMENSION a(m,n), b(m,n) DIMENSION works(n), workr(n), works(n), workr(n) INTEGER istatus(mpi_status_size) PARAMETER (iprocs =, jprocs = ) INTEGER itable(-:iprocs, -:jprocs) CALL MPI_INIT(ierr) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) IF(nprocs /= iprocs*jprocs) THEN PRINT *, === ERROR === STOP ENDIF DO j = -, jprocs DO i = -, iprocs itable(i,j) = MPI_PROC_NULL

157 양방향블록분할코드 : Fortran (/6) irank = DO i =, iprocs- DO j =, jprocs- itable(i,j) = irank IF (myrank == irank) THEN myranki = i; myrankj = j ENDIF irank = irank + CALL para_range(, n, jprocs, myrankj, jsta, jend) jsta = jsta; jend = jend IF (myrankj == ) jsta = IF (myrankj == jprocs-) jend = n- CALL para_range(, m, iprocs, myranki, ista, iend) ista = ista; iend = iend IF (myranki == ) ista = IF (myranki == iprocs-) iend = m- ilen = iend ista + ; jlen = jend jsta Supercomputing Center 양방향블록분할코드 : Fortran (/6) jnext = itable(myranki, myrankj + ) jprev = itable(myranki, myrankj ) inext = itable(myranki+, myrankj) iprev = itable(myranki-, myrankj) DO j = jsta, jend DO i = ista, iend a(i,j) = i +.*j IF (myranki /= iprocs-) THEN DO j = jsta, jend works(j) = a(iend,j) ENDIF IF (myranki /= ) THEN DO j = jsta, jend works(j) = a(ista,j) ENDIF Supercomputing Center 4 57

158 양방향블록분할코드 : Fortran (4/6) CALL MPI_ISEND(a(ista,jend), ilen, MPI_REAL, jnext,,& MPI_COMM_WORLD, isend, ierr) CALL MPI_ISEND(a(ista,jsta), ilen, MPI_REAL, jprev,,& MPI_COMM_WORLD, isend, ierr) CALL MPI_ISEND(works(jsta), jlen, MPI_REAL, inext,,& MPI_COMM_WORLD, jsend, ierr) CALL MPI_ISEND(works(jsta), jlen, MPI_REAL, iprev,,& MPI_COMM_WORLD, jsend, ierr) CALL MPI_IRECV(a(ista,jsta -), ilen, MPI_REAL, jprev,,& MPI_COMM_WORLD, irecv, ierr) CALL MPI_IRECV (a(ista,jend+), ilen, MPI_REAL, jnext,,& MPI_COMM_WORLD, irecv, ierr) CALL MPI_IRECV (workr(jsta), jlen, MPI_REAL, iprev,,& MPI_COMM_WORLD, jrecv, ierr) CALL MPI_IRECV (workr(jsta), jlen, MPI_REAL, inext,,& MPI_COMM_WORLD, jrecv, ierr) Supercomputing Center 5 양방향블록분할코드 : Fortran (5/6) CALL MPI_WAIT(isend, istatus, ierr) CALL MPI_WAIT(isend, istatus, ierr) CALL MPI_WAIT(jsend, istatus, ierr) CALL MPI_WAIT(jsend, istatus, ierr) CALL MPI_WAIT(irecv, istatus, ierr) CALL MPI_WAIT(irecv, istatus, ierr) CALL MPI_WAIT(jrecv, istatus, ierr) CALL MPI_WAIT(jrecv, istatus, ierr) IF (myranki /= ) THEN DO j = jsta, jend a(ista-,j) = workr(j) ENDIF IF (myranki /= iprocs-) THEN DO j = jsta, jend a(iend+,j) = workr(j) ENDIF Supercomputing Center 6 58

159 양방향블록분할코드 : Fortran (6/6) DO j = jsta, jend DO i = ista, iend b(i,j) = a(i-,j) + a(i,j-) + a(i,j+) + a(i+,j) CALL MPI_FINALIZE(ierr) END Supercomputing Center 7 양방향블록분할코드 : C (/6) /*parallel_d_fdm_both*/ #include <mpi.h> #define m #define n 9 #define iprocs #define jprocs void para_range(int, int, int, int, int*, int*); int min(int, int); main(int argc, char *argv[]){ int i, j, irank, nprocs, myrank ; double a[m][n],b[m][n]; double works[m],workr[m],works[m],workr[m]; int jsta, jend, jsta, jend, jnext, jprev, jlen; int ista, iend, ista, iend, inext, iprev, ilen; int itable[iprocs+][jprocs+]; int myranki, myrankj; MPI_Request isend,isend,irecv,irecv,jsend,jsend,jrecv,jrecv; MPI_Status istatus; MPI_Init(&argc, &argv); MPI_Comm_size(MPI_COMM_WORLD, &nprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); Supercomputing Center 8 59

160 양방향블록분할코드 : C (/6) for(i=; i<=iprocs+; i++) for(j=; j<=jprocs+; j++) itable[i][j]=mpi_proc_null; irank = ; for(i=; i<=iprocs; i++) for(j=; j<=jprocs; j++){ itable[i][j]=irank; if(myrank==irank){ myranki = i-; myrankj = j-; irank = irank + ; para_range(, n-, jprocs, myrankj, &jsta, &jend); jsta = jsta; jend = jend; if(myrankj==) jsta=; if(myrankj==jprocs-) jend=n-; para_range(, m-, iprocs, myranki, &ista, &iend); ista = ista; iend = iend; if(myranki==) ista=; if(myranki==iprocs-) iend=m-; Supercomputing Center 9 양방향블록분할코드 : C (/6) ilen = iend-ista+; jlen = jend-jsta+; jnext = itable[myranki][myrankj+]; jprev = itable[myranki][myrankj-]; inext = itable[myranki+][myrankj]; iprev = itable[myranki ][myrankj]; for(i=ista; i<=iend; i++) for(j=jsta; j<=jend; j++) a[i][j] = i +. * j if(myrankj!= jprocs-) for(i=ista; i<=iend; i++) works[i]=a[i][jend]; if(myrankj!= ) for(i=ista; i<=iend; i++) works[i]=a[i][jsta]; Supercomputing Center 6

161 양방향블록분할코드 : C (4/6) MPI_Isend(&works[ista], ilen, MPI_DOUBLE, jnext,, MPI_COMM_WORLD, &isend); MPI_Isend(&works[ista], ilen, MPI_DOUBLE, jprev,, MPI_COMM_WORLD, &isend); MPI_Isend(&a[iend][jsta], jlen, MPI_DOUBLE, inext,, MPI_COMM_WORLD, &jsend); MPI_Isend(&a[ista][jsta], jlen, MPI_DOUBLE, iprev,, MPI_COMM_WORLD, &jsend); MPI_Irecv(&workr[ista], ilen, MPI_DOUBLE, jprev,, MPI_COMM_WORLD, &irecv); MPI_Irecv(&workr[ista], ilen, MPI_DOUBLE, jnext,, MPI_COMM_WORLD, &irecv); MPI_Irecv(&a[ista-][jsta], jlen, MPI_DOUBLE, iprev,, MPI_COMM_WORLD, &jrecv); MPI_Irecv(&a[iend+][jsta], jlen, MPI_DOUBLE, inext,, MPI_COMM_WORLD, &jrecv); Supercomputing Center 양방향블록분할코드 : C (5/6) MPI_Wait(&isend, &istatus); MPI_Wait(&isend, &istatus); MPI_Wait(&jsend, &istatus); MPI_Wait(&jsend, &istatus); MPI_Wait(&irecv, &istatus); MPI_Wait(&irecv, &istatus); MPI_Wait(&jrecv, &istatus); MPI_Wait(&jrecv, &istatus); if (myrankj!= ) for(i=ista; i<=iend; i++) a[i][jsta-] = workr[i]; if (myrankj!= jprocs-) for(i=ista; i<=iend; i++) a[i][jend+] = workr[i]; Supercomputing Center 6

162 양방향블록분할코드 : C (6/6) for (i=ista; i<=iend; i++) for(j=jsta; j<=jend; j++) b[i][j] = a[i-][j] + a[i][j-] + a[i][j+] + a[i+][j]; MPI_Finalize(); Supercomputing Center 몬테카를로방법의병렬화 (/) q 차원임의행로 y x Supercomputing Center 4 6

163 몬테카를로방법의병렬화 (/) PROGRAM random_serial PARAMETER (n = ) INTEGER itotal(:9) REAL seed pi =.4596 DO i =, 9 itotal(i) = seed =.5 CALL srand(seed) DO i =, n x =.; y =. DO istep =, angle =.*pi*rand() x = x + cos(angle) y = y + sin(angle) itemp = sqrt(x** + y**) itotal(itemp) = & itotal(itemp) + PRINT *, total =, itotal END /*random serial*/ #include <math.h> #define n main(){ int i,istep,itotal[],itemp; double r, seed, pi, x, y, angle; pi =.4596; for(i=;i<;i++) itotal[i]=; seed =.5; srand(seed); for(i=; i<n; i++){ x =.; y =.; for(istep=;istep<;istep++){ r = (double)rand(); angle =.*pi*r/768.; x = x + cos(angle); y = y + sin(angle); itemp = sqrt(x*x + y*y); itotal[itemp]=itotal[itemp]+; for(i=; i<; i++){ printf( %d :, i); printf( total=%d\n,itotal[i]); Supercomputing Center 5 차원임의행로코드 : Fortran (/) PROGRAM random_parallel INCLUDE mpif.h PARAMETER (n = ) INTEGER itotal(:9), iitotal(:9) CALL MPI_INIT(ierr) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) CALL para_range(, n, nprocs, myrank, ista, iend) pi =.4596 DO i =, 9 itotal(i) = seed =.5 + myrank CALL srand(seed) Supercomputing Center 6 6

164 차원임의행로코드 : Fortran (/) DO i = ista, iend x =.; y =. DO istep =, angle =.*pi*rand() x = x + cos(angle) y = y + sin(angle) itemp = sqrt(x** + y**) itotal(itemp) = itotal(itemp) + CALL MPI_REDUCE(itotal, iitotal,, MPI_INTEGER, & MPI_SUM,, MPI_COMM_WORLD, ierr) PRINT *, total =, iitotal CALL MPI_FINALIZE(ierr) END Supercomputing Center 7 차원임의행로코드 : C (/) /*para_random*/ #include <mpi.h> #include <stdio.h> #include <math.h> #define n void para_range(int, int, int, int, int*, int*); int min(int, int); main (int argc, char *argv[]){ int i, istep, itotal[], iitotal[], itemp; int ista, iend, nprocs, myrank; double r, seed, pi, x, y, angle; MPI_Init(&argc, &argv); MPI_Comm_size(MPI_COMM_WORLD, &nprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); para_range(, n-, nprocs, myrank, &ista, &iend); pi =.4596; for(i=; i<; i++) itotal[i] = ; Supercomputing Center 8 64

165 차원임의행로코드 : C (/) seed =.5 + myrank; srand(seed); for(i=ista; i<=iend; i++){ x =.; y =.; for(istep=; istep<; istep++){ r = (double)rand(); angle =.*pi*r/768.; x = x + cos(angle); y = y + sin(angle); itemp = sqrt(x*x + y*y); itotal[itemp] = itotal[itemp] + ; MPI_Reduce(itotal, iitotal,, MPI_INT, MPI_SUM,, MPI_COMM_WORLD); for(i=; i<; i++){ printf( %d :, i); printf( total = %d\n,iitotal[i]); MPI_Finalize(); Supercomputing Center 9 분자동역학 (/7) q 차원상에서상호작용하는두개입자 fij = /(x(j)-x(i)) x(i) fji = -fij x(j) x q 입자 i가받게되는힘의총합 f i = å j¹ i f ij = - å j< i f ji + å j> i f ij Supercomputing Center 65

166 분자동역학 (/7) q 7개의입자 f() = +f +f +f4 +f5 +f6 +f7 f() = -f +f +f4 +f5 +f6 +f7 f() = -f -f +f4 +f5 +f6 +f7 f(4) = -f4 -f4 -f4 +f45 +f46 +f47 f(5) = -f5 -f5 -f5 -f45 +f56 +f57 f(6) = -f6 -f6 -f6 -f46 -f56 +f67 f(7) = -f7 -f7 -f7 -f47 -f57 -f67 Supercomputing Center 분자동역학 (/7) q Fortran 순차코드 삼각형모양의루프실행 순환분할 ( i 또는 j 에대해 )... PARAMETER (n =...) REAL f(n), x(n)... DO itime =, DO i =, n f(i) =. hot spot DO i =, n- DO j = i+, n fij =. / (x(j)-x(i)) f(i) = f(i) + fij f(j) = f(j) - fij DO i =, n x(i) = x(i) + f(i)... Supercomputing Center 66

167 Process 분자동역학 (4/7) f() = +f +f +f4 +f5 +f6 +f7 f() = -f f() = -f f(4) = -f4 +f45 +f46 +f47 f(5) = -f5 -f45 f(6) = -f6 -f46 f(7) = -f7 -f47 변수 i 에대한순환분할 Process f() = f() = f() = f(4) = f(5) = f(6) = f(7) = +f +f4 +f5 +f6 +f7 -f -f4 -f5 +f56 +f57 -f6 -f56 -f7 -f57 MPI_ALLREDUCE Process All Processes f() = f() = f() = f(4) = f(5) = f(6) = f(7) = +f4 +f5 +f6 +f7 -f4 -f5 -f6 +f67 -f7 -f67 ff() = ff() = ff() = ff(4) = ff(5) = ff(6) = ff(7) = -f -f -f4 -f5 -f6 -f7 +f -f -f4 -f5 -f6 -f7 +f +f4 +f5 +f6 +f7 +f +f4 +f5 +f6 +f7 +f4 +f5 +f6 +f7 -f4 +f45 +f46 +f47 -f5 -f45 +f56 +f57 -f6 -f46 -f56 +f67 -f7 -f47 -f57 -f67 Supercomputing Center 분자동역학 (5/7) PARAMETER (n = ) REAL f(n), x(n), ff(n) 변수 i에대한 DO itime =, 순환분할 DO i =, n f(i) =. DO i = +myrank, n-, nprocs DO j = i+, n fij =. /(x(j) x(i)) f(i) = f(i) + fij f(j) = f(j) fij CALL MPI_ALLREDUCE(f, ff, n, MPI_REAL, MPI_SUM, & MPI_COMM_WORLD, ierr) DO i =, n x(i) = x(i) + ff(i) Supercomputing Center 4 67

168 분자동역학 (6/7) Process f() = +f +f5 f() = -f +f +f6 f() = -f +f5 f(4) = +f45 f(5) = -f5 -f -f45 +f56 f(6) = -f6 -f56 f(7) = 변수 j 에대한순환분할 Process f() = +f +f6 f() = +f4 +f7 f() = -f +f6 f(4) = -f4 +f46 f(5) = +f57 f(6) = -f6 -f6 -f46 f(7) = -f7 -f57 MPI_ALLREDUCE Process All Processes f() = +f4 f() = +f5 f() = +f4 +f7 f(4) = -f4 -f4 +f47 f(5) = -f5 f(6) = +f67 f(7) = -f7 -f7 -f47 -f67 ff() = ff() = ff() = ff(4) = ff(5) = ff(6) = ff(7) = -f -f -f4 -f5 -f6 -f7 +f -f -f4 -f5 -f6 -f7 +f +f -f4 -f5 -f6 -f7 +f4 +f5 +f6 +f7 +f4 +f5 +f6 +f7 +f4 +f5 +f6 +f7 +f45 +f46 +f47 -f45 +f56 +f57 -f46 -f56 +f67 -f47 -f57 -f67 Supercomputing Center 5 분자동역학 (7/7) PARAMETER (n = ) REAL f(n), x(n), ff(n) DO itime =, DO i =, n f(i) =. irank = - DO i =, n- DO j = i+, n irank = irank + IF (irank == nprocs) irank = IF (myrank == irank) THEN fij =. /(x(j) x(i)) f(i) = f(i) + fij f(j) = f(j) fij ENDIF 변수 j 에대한순환분할 CALL MPI_ALLREDUCE (f,ff,n,mpi_real,mpi_sum,mpi_comm_world,ierr ) DO i =, n x(i) = x(i) + ff(i) Supercomputing Center 6 68
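순환 분할이 순차 계산과 같은 힘을 주는지는 MPI 없이도 확인할 수 있다. 아래 C 스케치는 슬라이드의 쌍 상호작용 f_ij = 1/(x(j)-x(i)) 를 쌍 (i,j) 단위로 랭크에 순환 배정하여 랭크별 부분 힘을 만들고, MPI_ALLREDUCE(MPI_SUM)에 해당하는 합산으로 전체 힘 ff 를 복원한다. 입자 수 N 과 좌표는 예시용 가정값이다.

```c
#include <assert.h>

#define N 7   /* 입자 수: 슬라이드의 7-입자 예에 맞춘 가정값 */

/* 쌍 상호작용을 nprocs 개 "랭크"에 순환 분할해 부분 힘을 계산하고
 * 랭크별 부분 힘을 합산(allreduce 에 해당)하여 전체 힘 ff 를 만든다. */
void forces_cyclic(const double x[N], int nprocs, double ff[N]) {
    for (int i = 0; i < N; i++) ff[i] = 0.0;
    for (int rank = 0; rank < nprocs; rank++) {
        double f[N] = {0.0};                 /* 이 랭크의 부분 힘 */
        int turn = -1;                       /* 쌍 (i,j) 를 돌아가며 배정 */
        for (int i = 0; i < N - 1; i++)
            for (int j = i + 1; j < N; j++) {
                turn = (turn + 1) % nprocs;
                if (turn != rank) continue;
                double fij = 1.0 / (x[j] - x[i]);
                f[i] += fij;                 /* 작용 */
                f[j] -= fij;                 /* 뉴턴 제3법칙: 반작용 */
            }
        for (int i = 0; i < N; i++) ff[i] += f[i];  /* MPI_ALLREDUCE(MPI_SUM) */
    }
}
```

내부 힘만 있으므로 전체 힘의 합은 0이 되어야 하며, 랭크 수를 바꿔도 결과는 (부동소수 오차 범위에서) 같다.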

MPMD 모델 (1/4)

Process 0 :
PROGRAM fluid
INCLUDE 'mpif.h'
CALL MPI_INIT
CALL MPI_COMM_SIZE
CALL MPI_COMM_RANK
DO itime = 1, n
  ! Computation of Fluid Dynamics
  CALL MPI_SEND
  CALL MPI_RECV
ENDDO
END

Process 1 :
PROGRAM struct
INCLUDE 'mpif.h'
CALL MPI_INIT
CALL MPI_COMM_SIZE
CALL MPI_COMM_RANK
DO itime = 1, n
  ! Computation of Structural Analysis
  CALL MPI_RECV
  CALL MPI_SEND
ENDDO
END

Supercomputing Center

MPMD 모델 (2/4)

q MPMD 병렬실행 : IBM AIX(ksh)
$ mpxlf90 fluid.f -o fluid
$ mpxlf90 struct.f -o struct
$ export MP_PGMMODEL=mpmd
$ export MP_CMDFILE=cmdfile
$ poe -procs 2

q cmdfile : 실행명령 구성파일
fluid
struct

Supercomputing Center

MPMD 모델 (3/4)

q 마스터 / 워커 MPMD 프로그램

PROGRAM main
PARAMETER (njobmax = 100)
DO njob = 1, njobmax
  CALL work(njob)
ENDDO
END

서로 독립적으로 수행되는 njobmax 개의 작업

Supercomputing Center

MPMD 모델 (4/4)

Master:
PROGRAM master
INCLUDE 'mpif.h'
PARAMETER (njobmax = 100)
INTEGER istatus(MPI_STATUS_SIZE)
CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
itag = 1
DO njob = 1, njobmax
  CALL MPI_RECV(iwk, 1, MPI_INTEGER, MPI_ANY_SOURCE, itag, &
                MPI_COMM_WORLD, istatus, ierr)
  idest = istatus(MPI_SOURCE)
  CALL MPI_SEND(njob, 1, MPI_INTEGER, idest, itag, MPI_COMM_WORLD, ierr)
ENDDO
DO i = 1, nprocs-1
  CALL MPI_RECV(iwk, 1, MPI_INTEGER, MPI_ANY_SOURCE, itag, &
                MPI_COMM_WORLD, istatus, ierr)
  idest = istatus(MPI_SOURCE)
  CALL MPI_SEND(-1, 1, MPI_INTEGER, idest, itag, MPI_COMM_WORLD, ierr)
ENDDO
CALL MPI_FINALIZE(ierr)
END

Worker:
PROGRAM worker
INCLUDE 'mpif.h'
INTEGER istatus(MPI_STATUS_SIZE)
CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
itag = 1
iwk = 0
DO
  CALL MPI_SEND(iwk, 1, MPI_INTEGER, 0, itag, MPI_COMM_WORLD, ierr)
  CALL MPI_RECV(njob, 1, MPI_INTEGER, 0, itag, MPI_COMM_WORLD, istatus, ierr)
  IF(njob == -1) EXIT
  CALL work(njob)
ENDDO
CALL MPI_FINALIZE(ierr)
END

Supercomputing Center
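마스터/워커 방식의 핵심은 "먼저 작업을 요청한 워커에게 다음 작업 번호를 주고, 작업이 소진되면 -1 을 보내 종료시킨다"는 동적 스케줄링이다. 아래는 MPI 송수신을 빼고 그 제어 논리만 C로 옮긴 스케치이며, 작업 수 NJOBMAX 와 함수 이름은 예시용 가정이다.

```c
#include <assert.h>

#define NJOBMAX 10        /* 작업 수: 예시용 가정값 */

static int next_job = 1;  /* 마스터가 관리하는 전역 작업 카운터 */

/* 마스터: 남은 작업이 있으면 다음 작업 번호를, 없으면 종료 신호 -1 을 준다 */
int master_dispatch(void) {
    return (next_job <= NJOBMAX) ? next_job++ : -1;
}

/* 워커: -1 을 받을 때까지 작업을 요청/수행하고 처리한 개수를 돌려준다 */
int worker_run(int done[NJOBMAX + 1]) {
    int count = 0, njob;
    while ((njob = master_dispatch()) != -1) {
        done[njob] = 1;   /* CALL work(njob) 에 해당 */
        count++;
    }
    return count;
}
```

실제 MPI 버전에서는 master_dispatch 의 요청/응답이 MPI_RECV(MPI_ANY_SOURCE)/MPI_SEND 쌍으로 바뀌고, 작업 시간이 불균일해도 빨리 끝난 워커가 곧바로 다음 작업을 받으므로 부하가 자동으로 균형을 이룬다.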

부록 A. MPI-2

MPI-2

q MPI-2 에 추가된 영역
1. 병렬 I/O (MPI I/O)
2. 원격 메모리 접근 ( 일방통신 )
3. 동적 프로세스 운영

q IBM 시스템의 병렬환경 지원 소프트웨어 PE(v.) 에서는 동적 프로세스 운영을 제외한 대부분의 MPI-2 규약을 지원하고 있다.

Supercomputing Center

172 병렬 I/O q MPI- 에서지원 q 운영체제가지원하는일반적인순차 I/O 기능기반에추가적인 성능과안정성지원 Supercomputing Center 4 병렬프로그램과순차 I/O () (/6) q 하나의프로세스가모든 I/O 담당 q 편리할수있지만성능과범위성에한계가있음 Memory Process File Supercomputing Center 44 7

173 병렬프로그램과순차 I/O () (/6) q 예 : 각프로세스가가진 개의정수출력. 각프로세스가배열 buf() 초기화. 프로세스 를제외한모든프로세스는배열 buf() 를프로세스 으로송신. 프로세스 은가지고있는배열을파일에먼저기록하고, 다른프로세스로부터차례로데이터를받아파일에기록 Supercomputing Center 45 병렬프로그램과순차 I/O () (/6) q Fortran 코드 PROGRAM serial_io INCLUDE mpif.h INTEGER BUFSIZE PARAMETER (BUFSIZE = ) INTEGER nprocs, myrank, ierr, buf(bufsize) INTEGER status(mpi_status_size) Call MPI_INIT(ierr) Call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) Call MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) DO i =, BUFSIZE buf(i) = myrank * BUFSIZE + i Supercomputing Center 46 7

174 병렬프로그램과순차 I/O () (4/6) q Fortran 코드 ( 계속 ) IF (myrank /= ) THEN CALL MPI_SEND(buf, BUFSIZE, MPI_INTEGER,, 99, & ELSE MPI_COMM_WORLD, ierr) OPEN (UNIT=,FILE= testfile,status= NEW,ACTION= WRITE ) WRITE(,*) buf DO i =, nprocs- CALL MPI_RECV(buf, BUFSIZE, MPI_INTEGER, i, 99, & WRITE (,*) buf ENDIF CALL MPI_FINALIZE(ierr) END MPI_COMM_WORLD, status, ierr) Supercomputing Center 47 병렬프로그램과순차 I/O () (5/6) q C 코드 /*example of serial I/O*/ #include <mpi.h> #include <stdio.h> #define BUFSIZE void main (int argc, char *argv[]){ int i, nprocs, myrank, buf[bufsize] ; MPI_Status status; FILE *myfile; MPI_Init(&argc, &argv); MPI_Comm_size(MPI_COMM_WORLD, &nprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); for(i=; i<bufsize; i++) buf[i] = myrank * BUFSIZE + i; Supercomputing Center 48 74

175 병렬프로그램과순차 I/O () (6/6) q C 코드 ( 계속 ) if(myrank!= ) else{ MPI_Send(buf, BUFSIZE, MPI_INT,, 99, MPI_COMM_WORLD); myfile = fopen( testfile, wb ); fwrite(buf, sizeof(int), BUFSIZE, myfile); for(i=; i<nprocs; i++){ MPI_Recv(buf, BUFSIZE, MPI_INT, i, 99, MPI_COMM_WORLD, &status); fwrite(buf, sizeof(int), BUFSIZE, myfile); fclose(myfile); MPI_Finalize(); Supercomputing Center 49 병렬프로그램과순차 I/O () (/) q 모든프로세스가독립적으로각자의 I/O 기능담당 q 여러파일생성으로결과처리가불편 Memory Process File Supercomputing Center 5 75

176 병렬프로그램과순차 I/O () (/) q Fortran 코드 : 개별적인파일생성 PROGRAM serial_io INCLUDE mpif.h INTEGER BUFSIZE PARAMETER (BUFSIZE = ) INTEGER nprocs, myrank, ierr, buf(bufsize) CHARACTER* number CHARACTER* fname(:8) CALL MPI_INIT(ierr) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) DO i =, BUFSIZE buf(i) = myrank * BUFSIZE + i WRITE(number, *) myrank fname(myrank) = testfile. //number OPEN(UNIT=myrank+,FILE=fname(myrank),STATUS= NEW,ACTION= WRITE ) WRITE(myrank+,*) buf CLOSE(myrank+) CALL MPI_FINALIZE(ierr) END Supercomputing Center 5 병렬프로그램과순차 I/O () (/) q C 코드 : 개별적인파일생성 /*example of parallel UNIX write into separate files */ #include <mpi.h> #include <stdio.h> #define BUFSIZE void main (int argc, char *argv[]){ int i, nprocs, myrank, buf[bufsize] ; char filename[8]; FILE *myfile; MPI_Init(&argc, &argv); MPI_Comm_size(MPI_COMM_WORLD, &nprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); for(i=; i<bufsize; i++) buf[i] = myrank * BUFSIZE + i; sprintf(filename, testfile.%d, myrank); myfile = fopen(filename, wb ); fwrite(buf, sizeof(int), BUFSIZE, myfile); fclose(myfile); MPI_Finalize(); Supercomputing Center 5 76

177 병렬 I/O 의사용 (/5) q 기본적인병렬 I/O 루틴의사용 MPI_FILE_OPEN MPI_FILE_WRITE MPI_FILE_CLOSE Supercomputing Center 5 병렬 I/O 의사용 (/5) q Fortran 코드 : 개별적인파일생성 PROGRAM parallel_io_ INCLUDE mpif.h INTEGER BUFSIZE PARAMETER (BUFSIZE = ) INTEGER nprocs, myrank, ierr, buf(bufsize), myfile CHARACTER* number CHARACTER* filename(:8) CALL MPI_INIT(ierr) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) DO i =, BUFSIZE buf(i) = myrank * BUFSIZE + i Supercomputing Center 54 77

178 병렬 I/O 의사용 (/5) q Fortran 코드 ( 계속 ) : 개별적인파일생성 WRITE(number, *) myrank filename(myrank) = testfile. //number CALL MPI_FILE_OPEN(MPI_COMM_SELF, filename, & MPI_MODE_WRONLY+MPI_MODE_CREATE, MPI_INFO_NULL, & myfile, ierr) CALL MPI_FILE_WRITE(myfile, buf, BUFSIZE, MPI_INTEGER, & MPI_STATUS_IGNORE, ierr) CALL MPI_FILE_CLOSE(myfile, ierr) CALL MPI_FINALIZE(ierr) END Supercomputing Center 55 병렬 I/O 의사용 (4/5) q C 코드 : 개별적인파일생성 /*example of parallel MPI write into separate files */ #include <mpi.h> #include <stdio.h> #define BUFSIZE void main (int argc, char *argv[]){ int i, nprocs, myrank, buf[bufsize] ; char filename[8]; MPI_File myfile; MPI_Init(&argc, &argv); MPI_Comm_size(MPI_COMM_WORLD, &nprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); for(i=; i<bufsize; i++) buf[i] = myrank * BUFSIZE + i; Supercomputing Center 56 78

179 병렬 I/O 의사용 (5/5) q C 코드 ( 계속 ) : 개별적인파일생성 sprintf(filename, testfile.%d, myrank); MPI_File_open(MPI_COMM_SELF, filename, MPI_MODE_WRONLY MPI_MODE_CREATE, MPI_INFO_NULL, &myfile); MPI_File_write(myfile, buf, BUFSIZE, MPI_INT, MPI_STATUS_IGNORE); MPI_File_close(&myfile); MPI_Finalize(); Supercomputing Center 57 병렬 I/O 루틴 : MPI_FILE_OPEN (/) C Fortran int MPI_File_open(MPI_Comm comm,, char *filename, int amode,, MPI_Info info, MPI_File *fh* fh) MPI_FILE_OPEN(comm, filename, amode, info, fh, ierr) INTEGER comm : 커뮤니케이터 ( 핸들 ) (IN) CHARACTER filename : 오픈하는파일이름 (IN) INTEGER amode : 파일접근모드 (IN) INTEGER info : info 객체 ( 핸들 ) (IN) INTEGER fh : 새파일핸들 ( 핸들 ) (OUT) q 집합통신 : 동일커뮤니케이터의프로세스는같은파일오픈 MPI_COMM_SELF : 프로세스하나로구성되는커뮤니케이터 Supercomputing Center 58 79

병렬 I/O 루틴 : MPI_FILE_OPEN (2/2)

q 파일접근모드 : OR( | : C), IOR(+ : Fortran) 로 연결가능

MPI_MODE_APPEND : 파일포인터의 시작위치를 파일 마지막에 설정
MPI_MODE_CREATE : 파일생성, 만약 파일이 있으면 덮어씀
MPI_MODE_DELETE_ON_CLOSE : 파일을 닫으면 삭제
MPI_MODE_EXCL : 파일생성시 파일이 있으면 에러 리턴
MPI_MODE_RDONLY : 읽기만 가능
MPI_MODE_RDWR : 읽기와 쓰기 가능
MPI_MODE_SEQUENTIAL : 파일을 순차적으로만 접근가능
MPI_MODE_UNIQUE_OPEN : 다른 곳에서 동시에 열 수 없음
MPI_MODE_WRONLY : 쓰기만 가능

q info 객체 : 시스템환경에 따른 프로그램 구현의 변화에 대한 정보제공, 통상적으로 MPI_INFO_NULL 사용

Supercomputing Center

병렬 I/O 루틴 : MPI_FILE_WRITE

C : int MPI_File_write(MPI_File fh, void *buf, int count, MPI_Datatype datatype, MPI_Status *status)
Fortran : MPI_FILE_WRITE(fh, buf, count, datatype, status(MPI_STATUS_SIZE), ierr)

INTEGER fh : 파일핸들 ( 핸들 ) (INOUT)
CHOICE buf : 버퍼의 시작주소 (IN)
INTEGER count : 버퍼의 원소 개수 (IN)
INTEGER datatype : 버퍼 원소의 데이터타입 ( 핸들 ) (IN)
INTEGER status(MPI_STATUS_SIZE) : 상태객체 (OUT)
MPI_STATUS_IGNORE (MPI-2) : 상태저장 없음

Supercomputing Center

181 병렬 I/O 루틴 : MPI_FILE_CLOSE C Fortran int MPI_File_close(MPI_File *fh* fh) MPI_FILE_CLOSE(fh, ierr) INTEGER fh : 파일핸들 ( 핸들 ) (INOUT) Supercomputing Center 6 병렬프로그램과병렬 I/O (/6) q 병렬프로세스가하나의공유파일작성 Memory Process File File View Supercomputing Center 6 8

182 병렬프로그램과병렬 I/O (/6) q 여러프로세스들이하나의파일공유 MPI_COMM_SELF à MPI_COMM_WORLD q 파일뷰 : 공유된파일에각프로세스가접근하는부분 MPI_FILE_SET_VIEW 로설정 q 각프로세스의파일뷰시작위치계산 disp = myrank * BUFSIZE * 4 4 바이트정수 disp = myrank*bufsize*sizeof(int); Supercomputing Center 6 병렬프로그램과병렬 I/O (/6) q Fortran 코드 : 하나의공유파일생성 PROGRAM parallel_io_ INCLUDE mpif.h INTEGER BUFSIZE PARAMETER (BUFSIZE = ) INTEGER nprocs, myrank, ierr, buf(bufsize), thefile INTEGER(kind=MPI_OFFSET_KIND) disp CALL MPI_INIT(ierr) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) DO i =, BUFSIZE buf(i) = myrank * BUFSIZE + i Supercomputing Center 64 8

183 병렬프로그램과병렬 I/O (4/6) q Fortran 코드 ( 계속 ) : 하나의공유파일생성 CALL MPI_FILE_OPEN(MPI_COMM_WORLD, testfile, & MPI_MODE_WRONLY + MPI_MODE_CREATE, MPI_INFO_NULL, & thefile, ierr) disp = myrank * BUFSIZE * 4 CALL MPI_FILE_SET_VIEW(thefile, disp, MPI_INTEGER, & MPI_INTEGER, native, MPI_INFO_NULL, ierr) CALL MPI_FILE_WRITE(thefile, buf, BUFSIZE, MPI_INTEGER, & MPI_STATUS_IGNORE, ierr) CALL MPI_FILE_CLOSE(thefile, ierr) CALL MPI_FINALIZE(ierr) END Supercomputing Center 65 병렬프로그램과병렬 I/O (5/6) q C 코드 : 하나의공유파일생성 /*example of parallel MPI write into single files */ #include <mpi.h> #include <stdio.h> #define BUFSIZE void main (int argc, char *argv[]){ int i, nprocs, myrank, buf[bufsize] ; MPI_File thefile; MPI_Offset disp; MPI_Init(&argc, &argv); MPI_Comm_size(MPI_COMM_WORLD, &nprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); Supercomputing Center 66 8

병렬프로그램과 병렬 I/O (6/6)

q C 코드 ( 계속 ) : 하나의 공유파일 생성

    for(i=0; i<BUFSIZE; i++)
        buf[i] = myrank * BUFSIZE + i;
    MPI_File_open(MPI_COMM_WORLD, "testfile", MPI_MODE_WRONLY | MPI_MODE_CREATE,
                  MPI_INFO_NULL, &thefile);
    disp = myrank*BUFSIZE*sizeof(int);
    MPI_File_set_view(thefile, disp, MPI_INT, MPI_INT, "native", MPI_INFO_NULL);
    MPI_File_write(thefile, buf, BUFSIZE, MPI_INT, MPI_STATUS_IGNORE);
    MPI_File_close(&thefile);
    MPI_Finalize();
}

Supercomputing Center

병렬 I/O 루틴 : MPI_FILE_SET_VIEW (1/2)

C : int MPI_File_set_view(MPI_File fh, MPI_Offset disp, MPI_Datatype etype, MPI_Datatype filetype, char *datarep, MPI_Info info)
Fortran : MPI_FILE_SET_VIEW(fh, disp, etype, filetype, datarep, info, ierr)

INTEGER fh : 파일핸들 (IN)
INTEGER(kind=MPI_OFFSET_KIND) disp : 파일뷰의 시작위치 (IN)
INTEGER etype : 기본 데이터타입, 파일 안의 데이터타입 (IN)
INTEGER filetype : 파일뷰의 데이터타입, 유도 데이터타입을 이용하여 뷰 접근을 불연속적으로 할 수 있도록 함 (IN)
CHARACTER datarep(*) : 데이터 표현 (IN)
INTEGER info : info 객체 (IN)

Supercomputing Center

185 병렬 I/O 루틴 : MPI_FILE_SET_VIEW (/) q 커뮤니케이터의모든프로세스가호출하는집합통신루틴 q 데이터표현 시스템에따른데이터표현방식의기술파일의이식성을높여줌 native : 데이터가메모리에있는것과똑같이파일에저장됨 동종환경시스템에적합하며, 대부분사용 internal 시스템에서정의된내부포맷으로데이터전환 동종환경또는이기종환경에서사용가능 external MPI- 에서정의한포맷으로데이터전환 Supercomputing Center 69 병렬 I/O : 파일읽기 (/5) q 여러프로세스가하나의파일을공유하여병렬로읽기가능 q 프로그램내에서파일크기계산 MPI_FILE_GET_SIZE q 근사적으로동일한크기로설정된파일뷰로부터각프로세스는동시에데이터를읽어들임 MPI_FILE_READ Supercomputing Center 7 85
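읽기 예제의 분할 규칙, 즉 bufsize = 전체 원소 수/nprocs + 1 과 각 랭크의 뷰 시작 = myrank*bufsize*원소크기가 파일 전체를 빠짐없이 덮는다는 점은 다음 스케치로 확인할 수 있다. 마지막 랭크는 파일 끝을 넘는 요청을 하게 되고, 실제로 읽힌 개수는 본문처럼 MPI_GET_COUNT 로 확인한다 (4바이트 정수는 예제의 가정값).

```c
#include <assert.h>

#define INTSIZE 4   /* 예제가 가정하는 4바이트 정수 */

/* 각 랭크가 읽을 원소 개수: 본문의 bufsize = filesize/nprocs + 1 */
long long read_bufsize(long long nints, int nprocs) {
    return nints / nprocs + 1;
}

/* 랭크 myrank 의 파일 뷰 시작 바이트 오프셋 */
long long read_offset(int myrank, long long nints, int nprocs) {
    return (long long)myrank * read_bufsize(nints, nprocs) * INTSIZE;
}
```

bufsize*nprocs >= nints 이므로 어떤 원소도 누락되지 않고, 랭크별 구간은 서로 겹치지 않는다.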

186 q Fortran 코드 PROGRAM parallel_io_ INCLUDE mpif.h 병렬 I/O : 파일읽기 (/5) INTEGER nprocs, myrank, ierr INTEGER count, bufsize, thefile INTEGER (kind=mpi_offset_kind) filesize, disp INTEGER, ALLOCATABLE :: buf(:) INTEGER status(mpi_status_size) CALL MPI_INIT(ierr) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) CALL MPI_FILE_OPEN(MPI_COMM_WORLD, testfile, & MPI_MODE_RDONLY, MPI_INFO_NULL, thefile, ierr) Supercomputing Center 7 병렬 I/O : 파일읽기 (/5) q Fortran 코드 ( 계속 ) CALL MPI_FILE_GET_SIZE(thefile, filesize, ierr) filesize = filesize/4 bufsize = filesize/nprocs + ALLOCATE(buf(bufsize)) disp = myrank * bufsize * 4 CALL MPI_FILE_SET_VIEW(thefile, disp, MPI_INTEGER, & MPI_INTEGER, native, MPI_INFO_NULL, ierr) CALL MPI_FILE_READ(thefile, buf, bufsize, MPI_INTEGER, & status, ierr) CALL MPI_GET_COUNT(status, MPI_INTEGER, count, ierr) print *, process, myrank, read, count, ints CALL MPI_FILE_CLOSE(thefile, ierr) CALL MPI_FINALIZE(ierr) END Supercomputing Center 7 86

병렬 I/O : 파일읽기 (4/5)

q C 코드

/* parallel MPI read with arbitrary number of processes */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
void main (int argc, char *argv[]){
    int nprocs, myrank, bufsize, *buf, count;
    MPI_File thefile;
    MPI_Status status;
    MPI_Offset filesize;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

Supercomputing Center

병렬 I/O : 파일읽기 (5/5)

q C 코드 ( 계속 )

    MPI_File_open(MPI_COMM_WORLD, "testfile", MPI_MODE_RDONLY,
                  MPI_INFO_NULL, &thefile);
    MPI_File_get_size(thefile, &filesize);
    filesize = filesize / sizeof(int);
    bufsize = filesize / nprocs + 1;
    buf = (int *) malloc(bufsize * sizeof(int));
    MPI_File_set_view(thefile, myrank*bufsize*sizeof(int), MPI_INT, MPI_INT,
                      "native", MPI_INFO_NULL);
    MPI_File_read(thefile, buf, bufsize, MPI_INT, &status);
    MPI_Get_count(&status, MPI_INT, &count);
    printf("process %d read %d ints\n", myrank, count);
    MPI_File_close(&thefile);
    MPI_Finalize();
}

Supercomputing Center

188 병렬 I/O 루틴 : MPI_FILE_GET_SIZE C Fortran int MPI_File_get_size(MPI_File fh,, MPI_Offset *size) MPI_FILE_GET_SIZE(fh, size, ierr) INTEGER fh : 파일핸들 ( 핸들 ) (IN) INTEGER (kind=mpi_offset_kind) size : 파일의크기 (OUT) q 파일의크기를바이트단위로저장 Supercomputing Center 75 병렬 I/O 루틴 : MPI_FILE_READ C Fortran int MPI_File_read(MPI_File fh,, void *buf* buf, int MPI_Datatype datatype, MPI_Status *status) MPI_FILE_READ(fh, buf,, count, datatype, status(mpi_status_size), ierr) count, INTEGER fh : 파일핸들 ( 핸들 ) (INOUT) CHOICE buf : 버퍼의시작주소 (OUT ) INTEGER count : 버퍼의원소개수 (IN) INTEGER datatype : 버퍼원소의데이터타입 ( 핸들 ) (IN) INTEGER status : 상태객체 (OUT) q 파일에서지정된개수의데이터를읽어들여버퍼에저장 Supercomputing Center 76 88

189 일방통신 q 메시지패싱모델의통신 한쌍의통신연산 ( 송신과수신 ) 을통한데이터전송 q 일방통신 (one-sided communication) 송 / 수신조합이아닌한쪽프로세스만으로데이터전송가능 원격메모리접근 (RMA) 알고리즘설계를위한유연성제공 get, put, accumulate 등 Supercomputing Center 77 메모리윈도우 q 단일프로세스의메모리일부 q 다른프로세스들에게메모리연산을허용하는공간 메모리연산 : 읽기 (get), 쓰기 (put), 갱신 (accumulate) q MPI_WIN_CREATE 으로생성 get Local address spaces put RMA windows Address space of Process Address space of Process Supercomputing Center 78 89

MPI 프로그램 : PI 계산 (1/4)

q Fortran 코드

PROGRAM parallel_pi
INCLUDE 'mpif.h'
DOUBLE PRECISION mypi, pi, h, sum, x
LOGICAL continue
CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
continue = .TRUE.
DO WHILE(continue)
  IF(myrank==0) THEN
    PRINT*, 'Enter the Number of intervals: (0 quits)'
    READ*, n
  ENDIF
  CALL MPI_BCAST(n, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
  IF(n==0) THEN
    continue = .FALSE.
  ELSE

Supercomputing Center

MPI 프로그램 : PI 계산 (2/4)

q Fortran 코드 ( 계속 )

    h = 1.0d0/DBLE(n)
    sum = 0.0d0
    DO i = myrank+1, n, nprocs
      x = h*(DBLE(i)-0.5d0)
      sum = sum + 4.d0/(1.d0+x*x)
    ENDDO
    mypi = h*sum
    CALL MPI_REDUCE(mypi, pi, 1, MPI_DOUBLE_PRECISION, &
                    MPI_SUM, 0, MPI_COMM_WORLD, ierr)
    IF(myrank==0) THEN
      PRINT*, 'pi is approximately ', pi
    ENDIF
  ENDIF
ENDDO
CALL MPI_FINALIZE(ierr)
END

Supercomputing Center

MPI 프로그램 : PI 계산 (3/4)

q C 코드

#include <mpi.h>
#include <stdio.h>
void main (int argc, char *argv[]){
    int n, i, myrank, nprocs;
    double mypi, x, pi, h, sum;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    while(1){
        if(myrank==0) {
            printf("Enter the Number of Intervals: (0 quits)\n");
            scanf("%d", &n);
        }
        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

Supercomputing Center

MPI 프로그램 : PI 계산 (4/4)

q C 코드 ( 계속 )

        if(n==0) break;
        else{
            h = 1.0/(double) n;
            sum = 0.0;
            for (i=myrank+1; i<=n; i+=nprocs) {
                x = h*((double)i-0.5);
                sum += 4.0/(1.0+x*x);
            }
            mypi = h*sum;
            MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
            if(myrank==0)
                printf("pi is approximately %f\n", pi);
        }
    }
    MPI_Finalize();
}

Supercomputing Center
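이 예제의 수학적 바탕은 pi = ∫₀¹ 4/(1+x²) dx 를 중점 직사각형 공식으로 근사하는 것이다. 아래 C 스케치는 같은 적분을 한 프로세스가 전부 더하는 경우와, MPI 버전처럼 i = rank+1 부터 nprocs 간격으로 순환 분할한 부분합을 (MPI_REDUCE 에 해당하게) 더하는 경우가 같은 값을 줌을 보인다.

```c
#include <math.h>
#include <assert.h>

/* 중점 공식: pi ≈ h * Σ_{i=1..n} 4/(1+x_i^2), x_i = h*(i-0.5) */
double pi_midpoint(int n) {
    double h = 1.0 / (double)n;
    double sum = 0.0;
    for (int i = 1; i <= n; i++) {
        double x = h * ((double)i - 0.5);
        sum += 4.0 / (1.0 + x * x);
    }
    return h * sum;
}

/* MPI 버전의 순환 분할을 흉내낸 것: 랭크별 부분합(mypi)을 만들어
 * 마지막에 모두 더한다 (MPI_REDUCE(MPI_SUM) 에 해당) */
double pi_midpoint_split(int n, int nprocs) {
    double h = 1.0 / (double)n, pi = 0.0;
    for (int rank = 0; rank < nprocs; rank++) {
        double sum = 0.0;                    /* 랭크 rank 의 mypi */
        for (int i = rank + 1; i <= n; i += nprocs) {
            double x = h * ((double)i - 0.5);
            sum += 4.0 / (1.0 + x * x);
        }
        pi += h * sum;
    }
    return pi;
}
```

중점 공식의 오차는 O(h²)이므로 n 을 10배 늘리면 오차가 약 100배 줄어든다.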

192 일방통신을이용한병렬화코드 (/7) q 메모리윈도우생성 MPI_WIN_CREATE q 일방통신루틴을이용한데이터전송 MPI_BCAST è MPI_GET MPI_REDUCE è MPI_ACCUMULATE q 동기화루틴 MPI_WIN_FENCE Supercomputing Center 8 일방통신을이용한병렬화코드 (/7) q Fortran 코드 PROGRAM PI_RMA INCLUDE 'mpif.h' INTEGER nwin DOUBLE PRECISION piwin DOUBLE PRECISION mypi, pi, h, sum, x LOGICAL continue CALL MPI_INIT(ierr) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) IF(myrank==) THEN CALL MPI_WIN_CREATE(n, 4,, MPI_INFO_NULL, MPI_COMM_WORLD, & nwin, ierr) CALL MPI_WIN_CREATE(pi, 8,, MPI_INFO_NULL, MPI_COMM_WORLD, & piwin, ierr) ELSE CALL MPI_WIN_CREATE(MPI_BOTTOM,,, MPI_INFO_NULL, & MPI_COMM_WORLD, nwin, ierr) CALL MPI_WIN_CREATE(MPI_BOTTOM,,, MPI_INFO_NULL, & MPI_COMM_WORLD, piwin, ierr) ENDIF Supercomputing Center 84 9

일방통신을 이용한 병렬화코드 (3/7)

q Fortran 코드 ( 계속 )

continue = .TRUE.
DO WHILE(continue)
  IF(myrank == 0) THEN
    PRINT*, 'Enter the Number of intervals: (0 quits)'
    READ*, n
    pi = 0.d0
  ENDIF
  CALL MPI_WIN_FENCE(0, nwin, ierr)
  IF(myrank /= 0) THEN
    CALL MPI_GET(n, 1, MPI_INTEGER, 0, 0, 1, MPI_INTEGER, &
                 nwin, ierr)
  ENDIF
  CALL MPI_WIN_FENCE(0, nwin, ierr)
  IF(n==0) THEN
    continue = .FALSE.
  ELSE
    h = 1.0d0/DBLE(n)
    sum = 0.0d0

Supercomputing Center

일방통신을 이용한 병렬화코드 (4/7)

q Fortran 코드 ( 계속 )

    DO i = myrank+1, n, nprocs
      x = h*(DBLE(i)-0.5d0)
      sum = sum + 4.d0/(1.d0+x*x)
    ENDDO
    mypi = h*sum
    CALL MPI_WIN_FENCE(0, piwin, ierr)
    CALL MPI_ACCUMULATE(mypi, 1, MPI_DOUBLE_PRECISION, 0, 0, &
                        1, MPI_DOUBLE_PRECISION, MPI_SUM, piwin, ierr)
    CALL MPI_WIN_FENCE(0, piwin, ierr)
    IF(myrank==0) THEN
      PRINT*, 'pi is approximately ', pi
    ENDIF
  ENDIF
ENDDO
CALL MPI_WIN_FREE(nwin, ierr)
CALL MPI_WIN_FREE(piwin, ierr)
CALL MPI_FINALIZE(ierr)
END

Supercomputing Center

194 일방통신을이용한병렬화코드 (5/7) q C 코드 #include <mpi.h> void main (int argc, char *argv[]){ int n, i, myrank, nprocs; double pi, mypi, x, h, sum; MPI_Win nwin, piwin; MPI_Init(&argc, &argv) ; MPI_Comm_rank(MPI_COMM_WORLD, &myrank) ; MPI_Comm_size(MPI_COMM_WORLD, &nprocs) ; if (myrank==) { MPI_Win_create(&n, sizeof(int),, MPI_INFO_NULL, MPI_COMM_WORLD, &nwin); MPI_Win_create(&pi, sizeof(double),, MPI_INFO_NULL, MPI_COMM_WORLD, &piwin); else{ MPI_Win_create(MPI_BOTTOM,,, MPI_INFO_NULL, MPI_COMM_WORLD, &nwin); MPI_Win_create(MPI_BOTTOM,,, MPI_INFO_NULL, MPI_COMM_WORLD, &piwin); Supercomputing Center 87 일방통신을이용한병렬화코드 (6/7) q C 코드 ( 계속 ) while(){ if(myrank==) { printf( Enter the Number of Intervals: ( quits)\n ); scanf( %d, &n); pi=.; MPI_Win_fence(, nwin); if(myrank!= ) MPI_Get(&n,, MPI_INT,,,, MPI_INT, nwin); MPI_Win_fence(, nwin); if(n==) break; else{ h =./(double) n; sum=.; for (i=myrank+; i<=n ; i+=nprocs) { x = h*((double)i-.5); sum += 4./(.+x*x); Supercomputing Center 88 94

195 일방통신을이용한병렬화코드 (7/7) q C 코드 ( 계속 ) mypi = h*sum; MPI_Win_fence(, piwin); MPI_Accumulate(&mypi,, MPI_DOUBLE,,,, MPI_DOUBLE, MPI_SUM, piwin); MPI_Win_fence(, piwin); if(myrank==) printf( pi is approximately %f \n, pi); MPI_Win_free(&nwin); MPI_Win_free(&piwin); MPI_Finalize(); Supercomputing Center 89 일방통신루틴 : MPI_WIN_CREATE (/) C Fortran int MPI_Win_create(void *base, MPI_Aint size, int disp_unit,, MPI_Info info, MPI_Comm comm, MPI_Win *win) MPI_WIN_CREATE(base, size, disp_unit, info, comm, win, ierr) CHOICE base : 윈도우의시작주소 (IN) INTEGER size : 바이트로나타낸윈도우의크기 ( 음아닌정수 ) (IN) INTEGER disp_unit : 바이트로나타낸변위의크기 ( 양의정수 ) (IN) INTEGER info : info 객체 ( 핸들 ) (IN) INTEGER comm : 커뮤니케이터 ( 핸들 ) (IN) INTEGER win : 리턴되는윈도우객체 ( 핸들 ) (OUT) q 메모리윈도우생성루틴 q 커뮤니케이터내부의모든프로세스들이참여하는집합통신 Supercomputing Center 9 95

196 일방통신루틴 : MPI_WIN_CREATE (/) q CALL MPI_WIN_CREATE(n, 4,, MPI_INFO_NULL, MPI_COMM_WORLD, nwin, ierr) 프로세스 의정수 n 에접근허용하는윈도우객체 nwin 생성, 윈도 우시작주소는 n, 길이는 4 바이트임을나타냄 변수가하나인윈도우객체이므로변수간의변위는의미없음 커뮤니케이터내의모든프로세스는직접 n 값을 get 할수있다. q CALL MPI_WIN_CREATE(MPI_BOTTOM,,, ierr) MPI_INFO_NULL, MPI_COMM_WORLD, nwin, 다른프로세스에서는접근허용하는윈도우생성이없음을나타내기 위해주소는 MPI_BOTTOM, 길이를 으로두었음 Supercomputing Center 9 일방통신루틴 : MPI_WIN_FENCE C Fortran int MPI_Win_fence(int assert, MPI_Win *win) MPI_WIN_FENCE(assert, win, ierr) INTEGER assert : 성능향상관련인수, 은항상허용됨 (IN) INTEGER win : 펜스연산이수행되는윈도우객체 (IN) q 원격연산에서의동기화함수 q 원격연산에서는 MPI_BARRIER 를쓸수없음 q 원격연산과지역연산또는두개의원격연산사이를분리시켜줌 q 원격연산은논블록킹이기때문에연산의완료를확인하기위해서는반드시동기화함수를호출해야함 Supercomputing Center 9 96

197 일방통신루틴 : MPI_GET (/) C Fortran int MPI_Get(void *origin_addr* origin_addr, int origin_count, MPI_Datatype origin_datatype, int target_rank, MPI_Aint target_disp, int target_count, MPI_Datatype target_datatype,, MPI_Win win) MPI_GET(origin_addr,, origin_count, origin_datatype, target_rank, target_disp,, target_count, target_datatype,, win, ierr) CHOICE origin_addr : 데이터를가져오는 (get) 버퍼 ( 원버퍼 ) 의 시작주소 (IN) INTEGER origin_count : 원버퍼의데이터개수 (IN) INTEGER origin_datatype : 원버퍼의데이터타입 ( 핸들 ) (IN) INTEGER target_rank : 메모리접근을허용하는목적프로세스의랭크 (IN) INTEGER target_disp : 윈도우시작위치에서목적버퍼까지의변위 (IN) INTEGER target_count : 목적버퍼의데이터원소개수 (IN) INTEGER target_datatype : 목적버퍼원소의데이터타입 ( 핸들 ) (IN) INTEGER win : 윈도우객체 ( 핸들 ) (IN) Supercomputing Center 9 일방통신루틴 : MPI_GET (/) q CALL MPI_GET(n,, MPI_INTEGER,,,, & MPI_INTEGER, nwin, ierr) 수신지정보 (n,, MPI_INTEGER) MPI_INTEGER 타입의 개데이터를 n 에저장 송신지정보 (,,, MPI_INTEGER) 번프로세스의 윈도우 MPI_INTEGER 타입데이터를 개가져옴 시작위치에서 만큼떨어져있는 Supercomputing Center 94 97

일방통신루틴 : MPI_PUT

C : int MPI_Put(void *origin_addr, int origin_count, MPI_Datatype origin_datatype, int target_rank, MPI_Aint target_disp, int target_count, MPI_Datatype target_datatype, MPI_Win win)
Fortran : MPI_PUT(origin_addr, origin_count, origin_datatype, target_rank, target_disp, target_count, target_datatype, win, ierr)

CHOICE origin_addr : 데이터를보내는 (put) 버퍼 ( 원버퍼 ) 의시작주소 (IN)
INTEGER origin_count : 원버퍼의데이터개수 (IN)
INTEGER origin_datatype : 원버퍼의데이터타입 ( 핸들 ) (IN)
INTEGER target_rank : 메모리접근을허용하는프로세스의랭크 (IN)
INTEGER target_disp : 윈도우시작점에서목적버퍼까지의변위 (IN)
INTEGER target_count : 목적버퍼의데이터원소개수 (IN)
INTEGER target_datatype : 목적버퍼원소의데이터타입 ( 핸들 ) (IN)
INTEGER win : 윈도우객체 ( 핸들 ) (IN)

일방통신루틴 : MPI_ACCUMULATE (1/2)

C : int MPI_Accumulate(void *origin_addr, int origin_count, MPI_Datatype origin_datatype, int target_rank, MPI_Aint target_disp, int target_count, MPI_Datatype target_datatype, MPI_Op op, MPI_Win win)
Fortran : MPI_ACCUMULATE(origin_addr, origin_count, origin_datatype, target_rank, target_disp, target_count, target_datatype, op, win, ierr)

CHOICE origin_addr : 데이터를갱신 (accumulate) 하는버퍼 ( 원버퍼 ) 의시작주소 (IN)
INTEGER origin_count : 원버퍼의데이터개수 (IN)
INTEGER origin_datatype : 원버퍼의데이터타입 ( 핸들 ) (IN)
INTEGER target_rank : 메모리접근을허용하는프로세스의랭크 (IN)
INTEGER target_disp : 윈도우시작점에서목적버퍼까지의변위 (IN)
INTEGER target_count : 목적버퍼의데이터원소개수 (IN)
INTEGER target_datatype : 목적버퍼원소의데이터타입 ( 핸들 ) (IN)
INTEGER op : 환산 (reduction) 연산 ( 핸들 ) (IN)
INTEGER win : 윈도우객체 ( 핸들 ) (IN)

일방통신루틴 : MPI_ACCUMULATE (2/2)

q CALL MPI_ACCUMULATE(mypi, 1, MPI_DOUBLE_PRECISION, 0, 0, 1, MPI_DOUBLE_PRECISION, MPI_SUM, piwin, ierr)
갱신에사용될지역변수정보 (mypi, 1, MPI_DOUBLE_PRECISION) : mypi 에서시작되는 MPI_DOUBLE_PRECISION 타입의 1 개데이터를목적지의윈도우정보갱신에이용
갱신할목적지정보 (0, 0, 1, MPI_DOUBLE_PRECISION, MPI_SUM) : 0 번프로세스의윈도우시작위치에서 0 만큼떨어져있는데이터 1 개를연산 MPI_SUM 을이용해갱신

일방통신루틴 : MPI_WIN_FREE

C : int MPI_Win_free(MPI_Win *win)
Fortran : MPI_WIN_FREE(win, ierr)

INTEGER win : 윈도우객체 ( 핸들 ) (INOUT)

q 윈도우객체를풀어널 (null) 핸들을리턴함

용어정리 / 참고자료

용어정리 (1/3)

MPI : Message Passing Interface
봉투 : Envelope
꼬리표 : Tag
식별자 : Identifier
유도데이터타입 : Derived Data Type
가상토폴로지 : Virtual Topology
핸들 : Handle
송신 : Send
수신 : Receive
동기송신 : Synchronous Send
준비송신 : Ready Send
버퍼송신 : Buffered Send
표준송신 : Standard Send
포스팅 : Posting
교착 : Deadlock
통신부하 : Communication Overhead
점대점통신 : Point-To-Point Communication
집합통신 : Collective Communication
단방향통신 : Unidirectional Communication

용어정리 (2/3)

양방향통신 : Bidirectional Communication
취합 : Gather
환산 : Reduction
연산자 : Operator
확산 : Scatter
장벽 : Barrier
커먼블록 : Common Block
유사타입 : Pseudo Type
부분행렬 : Subarray
행우선순 : Row Major Order
열우선순 : Column Major Order
메모리대응 : Memory Mapping
직교가상토폴로지 : Cartesian Virtual Topology
그래프가상토폴로지 : Graph Virtual Topology
대응함수 : Mapping Function
블록분할 : Block Distribution
순환분할 : Cyclic Distribution
블록-순환분할 : Block-Cyclic Distribution
로드밸런싱 : Load Balancing

용어정리 (3/3)

캐시미스 : Cache Miss
배열수축 : Shrinking Array
내포된루프 : Nested Loop
유한차분법 : Finite Difference Method (FDM)
의존성 : Dependence
대량데이터 : Bulk Data
중첩 : Superposition
비틀림분해 : Twisted Decomposition
프리픽스합 : Prefix Sum
임의행로 : Random Walk
분자동역학 : Molecular Dynamics
원격메모리접근 : Remote Memory Access (RMA)
일방통신 : One-Sided Communication
동적메모리운영 : Dynamic Memory Management
객체 : Object

참고자료 (1/2)

q Gropp, Lusk, and Skjellum. Using MPI. Second edition. MIT Press. 1999.
q Gropp, Lusk, and Thakur. Using MPI-2. MIT Press. 1999.
q Snir, Otto, Huss-Lederman, Walker, and Dongarra. MPI: The Complete Reference, Volume 1. Second edition. MIT Press. 1998.
q Andrews. Foundations of Multithreaded, Parallel, and Distributed Programming. Addison-Wesley. 2000.
q SP Parallel Programming Workshop
q Parallel Programming Concepts

참고자료 (2/2)

q Introduction to Parallel Computing
q Practical MPI Programming
q Lawrence Livermore National Laboratory
q Parallel Programming with MPI : oscinfo.osc.edu/training/mpi/

기술지원

q Helpdesk
q 교육센터게시판 : webedu.ksc.re.kr
q [email protected]
