Parallel Programming with MPI

KISTI Supercomputing Center

Goal: enable participants to write message-passing parallel programs using MPI.

Supercomputing Center

Contents

1. Introduction to MPI
2. Basics of parallel programming with MPI
3. Parallel programming with MPI in practice
4. Example MPI parallel programs
Appendix: MPI-2, glossary / references

Chapter 1: Introduction to MPI

Introduces MPI and the basic concepts needed to understand it.

Message passing (1/2)

[Diagram: a serial program versus a message-passing program. Serial sections S and parallel sections P1-P4 run as separate processes on separate nodes, with data transmitted over the interconnect.]

Message passing (2/2)

- A communication style in which processes, each with its own local memory, share data by sending and receiving messages (data).
  - The programmer is responsible for everything: dividing the work for parallelization, distributing the data, and managing the communication. Difficult, but very flexible.
  - Can be implemented on a wide range of hardware platforms: distributed-memory multiprocessor systems, shared-memory multiprocessor systems, and single-processor systems.
- Message-passing libraries: MPI, PVM, Shmem

What is MPI?

- Message Passing Interface
- A standardized data-communication library for message-passing parallel programming
  - MPI-1 standard established (MPI Forum): 1994
  - MPI-2 released: 1997

Goals of MPI

- Portability
- Efficiency
- Functionality

Basic concepts of MPI (1/4)

[Diagram: processes 0-5 grouped in a communicator, exchanging a message identified by a tag.]

Basic concepts of MPI (2/4)

- Processes and processors
  - MPI assigns work on a per-process basis.
  - Processor-to-process mapping: one-to-one or one-to-many.
- Message ( = data + envelope )
  - Which process is sending? Where is the data to be sent located? What kind of data is it? How much is sent? Which process receives it? Where should it be stored? How much must the receiver be prepared to accept?

Basic concepts of MPI (3/4)

- Tag: used to match and distinguish messages; lets messages be processed in the intended order; wildcards are allowed.
- Communicator: the set of processes permitted to communicate with one another.
- Process rank: an identifier that distinguishes the processes within one communicator.

Basic concepts of MPI (4/4)

- Point-to-point communication
  - Communication between exactly two processes: one sending process matched with one receiving process.
- Collective communication
  - Several processes participate at once; one-to-many, many-to-one, and many-to-many patterns are possible.
  - Replaces a series of point-to-point calls with a single collective call: less chance of error, and usually faster because the implementation is optimized.

Chapter 2: Basics of Parallel Programming with MPI

Covers the fundamentals of writing MPI programs and communication, the use of derived datatypes, and virtual topologies.

- Basic structure of an MPI program
- Communicators
- Messages
- MPI datatypes

Basic structure of an MPI program

  include MPI header file
  variable declarations
  initialize the MPI environment
  do computation and MPI communication calls
  close MPI environment

- Including the MPI header file
  - Fortran: INCLUDE 'mpif.h'
  - C: #include <mpi.h>
  - Declares the prototypes of the MPI subroutines and functions; defines macros, MPI-related arguments, and datatypes.
  - Location (e.g., on IBM PE systems): /usr/lpp/ppe.poe/include/

MPI handles

- Pointer-like variables used to reference MPI's internal data structures.
- In C, handles have special datatypes defined with typedef: MPI_Comm, MPI_Datatype, MPI_Request, ...
- In Fortran, handles are of type INTEGER.

Calling MPI routines and return values (1/2)

- Fortran
  - Format: CALL MPI_XXXXX(parameter, ..., ierr)
  - Example: CALL MPI_INIT(ierr)
  - Error code: returned in the ierr parameter; MPI_SUCCESS if successful.
- C
  - Format: err = MPI_Xxxxx(parameter, ...); or MPI_Xxxxx(parameter, ...);
  - Example: err = MPI_Init(&argc, &argv);
  - Error code: returned as err; MPI_SUCCESS if successful.

Calling MPI routines and return values (2/2)

- An MPI routine returns an error code indicating whether the call succeeded.
- On success it returns the integer constant MPI_SUCCESS.
- In Fortran, the last integer argument of each subroutine carries the error code.
- MPI_SUCCESS is declared in the header file.

Fortran:

  INTEGER ierr
  CALL MPI_INIT(ierr)
  IF (ierr .EQ. MPI_SUCCESS) THEN
    ...
  ENDIF

C:

  int err;
  err = MPI_Init(&argc, &argv);
  if (err == MPI_SUCCESS) {
    ...
  }

MPI initialization

- Fortran: CALL MPI_INIT(ierr)
- C: int MPI_Init(int *argc, char ***argv)

- Initializes the MPI environment.
- Must be the first MPI routine called, and must be called exactly once.

Communicators (1/3)

- A handle representing a group of processes that can communicate with one another.
- Every MPI communication routine takes a communicator argument.
- Processes can communicate only if they share a communicator.
- MPI_COMM_WORLD
  - The communicator containing all processes available when the program starts.
  - Defined when MPI_Init is called.

Communicators (2/3)

- Process rank
  - The identification number of a process within a communicator.
  - With n processes, ranks 0 through n-1 are assigned.
  - Used to name the sender and receiver of a message.
- Getting the process rank
  - Fortran: CALL MPI_COMM_RANK(comm, rank, ierr)
  - C: int MPI_Comm_rank(MPI_Comm comm, int *rank)
  - Returns in the argument rank the rank of the calling process within communicator comm.

Communicators (3/3)

- Communicator size
  - The total number of processes contained in the communicator.
- Getting the communicator size
  - Fortran: CALL MPI_COMM_SIZE(comm, size, ierr)
  - C: int MPI_Comm_size(MPI_Comm comm, int *size)
  - When called, returns the size of communicator comm through the argument size.

Terminating an MPI program

- Fortran: CALL MPI_FINALIZE(ierr)
- C: int MPI_Finalize(void);

- Cleans up all MPI data structures.
- Must be called last, exactly once, by every process.
- Does not itself terminate the process.

bones.f:

  PROGRAM skeleton
  INCLUDE 'mpif.h'
  INTEGER ierr, rank, size
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)
  ! your code here
  CALL MPI_FINALIZE(ierr)
  END

bones.c:

  /* program skeleton */
  #include <mpi.h>
  int main(int argc, char *argv[])
  {
      int rank, size;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);
      /* your code here */
      MPI_Finalize();
      return 0;
  }

MPI messages (1/2)

- Data + envelope
  - Data: (buffer, count, datatype)
    - Buffer: the variable holding the data to receive (send).
    - Count: the number of elements to receive (send).
    - Datatype: the MPI datatype of the elements.
  - Envelope: (destination (source), tag, communicator)
    - Destination (source): the rank of the receiving (sending) process.
    - Tag: an integer that uniquely identifies the message; valid tags run from 0 up to an implementation-defined maximum (at least 32767, given by the attribute MPI_TAG_UB; IBM MPI and MPICH allow much larger values).
    - Communicator: the process group containing the sending and receiving processes.

MPI messages (2/2)

- MPI data consists of an array of elements of a particular MPI datatype.
- MPI datatypes: basic types and derived types.
- Derived types can be built from basic types or from other derived types.
- C datatypes and Fortran datatypes are not the same.
- The send and receive datatypes must match.

MPI basic datatypes (1/2) — Fortran

  MPI Data Type          → Fortran Data Type
  MPI_INTEGER            → INTEGER
  MPI_REAL               → REAL
  MPI_DOUBLE_PRECISION   → DOUBLE PRECISION
  MPI_COMPLEX            → COMPLEX
  MPI_LOGICAL            → LOGICAL
  MPI_CHARACTER          → CHARACTER(1)
  MPI_BYTE, MPI_PACKED   → (no Fortran equivalent)

MPI basic datatypes (2/2) — C

  MPI Data Type          → C Data Type
  MPI_CHAR               → signed char
  MPI_SHORT              → signed short int
  MPI_INT                → signed int
  MPI_LONG               → signed long int
  MPI_UNSIGNED_CHAR      → unsigned char
  MPI_UNSIGNED_SHORT     → unsigned short int
  MPI_UNSIGNED           → unsigned int
  MPI_UNSIGNED_LONG      → unsigned long int
  MPI_FLOAT              → float
  MPI_DOUBLE             → double
  MPI_LONG_DOUBLE        → long double
  MPI_BYTE, MPI_PACKED   → (no C equivalent)

Point-to-Point Communication and Communication Modes

- Blocking communication
- Non-blocking communication
- One-way and two-way communication

Point-to-point communication (1/2)

[Diagram: a source process and a destination process within a communicator of six processes.]

- Exactly two processes participate in the communication.
- Communication takes place only within a communicator.
- The communicator and the ranks identify the sending and receiving processes.

Point-to-point communication (2/2)

- Completion of communication
  - Means the memory locations used in the transfer can safely be accessed again.
  - Send: the send variable may be reused once the communication completes.
  - Receive: the receive variable may be used from the moment the communication completes.
- Blocking and non-blocking communication
  - Blocking: the routine returns only after the communication has completed.
  - Non-blocking: the routine returns as soon as the communication has started, regardless of completion; completion is checked later.
- Communication modes are classified by the condition required for completion.

Communication modes

  Mode              Blocking MPI routine   Non-blocking MPI routine
  Synchronous send  MPI_SSEND              MPI_ISSEND
  Ready send        MPI_RSEND              MPI_IRSEND
  Buffered send     MPI_BSEND              MPI_IBSEND
  Standard send     MPI_SEND               MPI_ISEND
  Receive           MPI_RECV               MPI_IRECV

Synchronous send: MPI_SSEND (blocking synchronous send)

[Diagram: the sending task waits; the transfer from the source completes only as the receiving task's MPI_RECV fills its buffer.]

- Send start: may begin regardless of whether a matching receive has been posted.
- Transfer: begins once the receiver is ready to receive.
- Send completion: the matching receive has started and the transfer has finished.
- The safest form of communication.
- A non-local send mode.

Ready send: MPI_RSEND (blocking ready send)

[Diagram: the transfer from the source completes while the receiving task's MPI_RECV fills its buffer.]

- Assumes the matching receive has already been posted when the send starts.
- Sending when no receive has been posted is an error.
- Advantageous for performance.
- A non-local send mode.

Buffered send: MPI_BSEND (buffered send)

[Diagram: the data is copied into a user-supplied buffer, then transferred to the receiver's MPI_RECV while the receiving task waits.]

- Send start: may begin regardless of whether a matching receive has been posted.
- Send completion: as soon as the copy into the buffer finishes, independently of the receive.
- The user manages the buffer space directly: MPI_Buffer_attach / MPI_Buffer_detach.
- A local send mode.

Standard send: MPI_SEND

- Direct copy: send buffer → receive buffer, or
- Buffered: send buffer → system buffer → receive buffer.
- Which behavior occurs depends on the situation.
- No buffer management is required.
- Completion of the send does not necessarily mean the message has arrived.
- A non-local send mode.

Blocking send: standard

C: int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
Fortran: MPI_SEND(buf, count, datatype, dest, tag, comm, ierr)

- (CHOICE) buf: starting address of the send buffer (IN)
- INTEGER count: number of elements to send (IN)
- INTEGER datatype: MPI datatype of each element (handle) (IN)
- INTEGER dest: rank of the receiving process (IN); MPI_PROC_NULL if no communication is needed
- INTEGER tag: message tag (IN)
- INTEGER comm: MPI communicator (handle) (IN)

Example: CALL MPI_SEND(a, 5, MPI_REAL, 5, 1, MPI_COMM_WORLD, ierr)

Blocking receive (1/4)

C: int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
Fortran: MPI_RECV(buf, count, datatype, source, tag, comm, status, ierr)

- (CHOICE) buf: starting address of the receive buffer (OUT)
- INTEGER count: number of elements to receive (IN)
- INTEGER datatype: MPI datatype of each element (handle) (IN)
- INTEGER source: rank of the sending process (IN); MPI_PROC_NULL if no communication is needed
- INTEGER tag: message tag (IN)
- INTEGER comm: MPI communicator (handle) (IN)
- INTEGER status(MPI_STATUS_SIZE): stores information about the received message (OUT)

Example: CALL MPI_RECV(a, 5, MPI_REAL, 0, 1, MPI_COMM_WORLD, status, ierr)

Blocking receive (2/4)

- The receiver may use wildcards:
  - Receive a message from any process: MPI_ANY_SOURCE
  - Receive a message carrying any tag: MPI_ANY_TAG

Blocking receive (3/4)

- Information stored in the receiver's status argument: the sending process, the tag, and the data size (obtained with MPI_GET_COUNT).

  Information   Fortran                  C
  source        status(MPI_SOURCE)       status.MPI_SOURCE
  tag           status(MPI_TAG)          status.MPI_TAG
  count         MPI_GET_COUNT            MPI_Get_count

Blocking receive (4/4)

- MPI_GET_COUNT: returns the number of elements in the received message.

C: int MPI_Get_count(MPI_Status *status, MPI_Datatype datatype, int *count)
Fortran: MPI_GET_COUNT(status, datatype, count, ierr)

- INTEGER status(MPI_STATUS_SIZE): status of the received message (IN)
- INTEGER datatype: datatype of each element (IN)
- INTEGER count: number of elements (OUT)

Blocking communication example: Fortran

  PROGRAM isend
  INCLUDE 'mpif.h'
  INTEGER err, rank, size, count
  REAL data(100), value(200)
  INTEGER status(MPI_STATUS_SIZE)
  CALL MPI_INIT(err)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, err)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, size, err)
  IF (rank.EQ.0) THEN
     data = 1.0
     CALL MPI_SEND(data, 100, MPI_REAL, 1, 55, MPI_COMM_WORLD, err)
  ELSEIF (rank.EQ.1) THEN
     CALL MPI_RECV(value, 200, MPI_REAL, MPI_ANY_SOURCE, 55, &
                   MPI_COMM_WORLD, status, err)
     PRINT *, "P:", rank, " got data from processor ", &
              status(MPI_SOURCE)
     CALL MPI_GET_COUNT(status, MPI_REAL, count, err)
     PRINT *, "P:", rank, " got ", count, " elements"
     PRINT *, "P:", rank, " value(5)=", value(5)
  ENDIF
  CALL MPI_FINALIZE(err)
  END

Blocking communication example: C

  #include <stdio.h>
  #include <mpi.h>
  int main(int argc, char *argv[])
  {
      int rank, i, count;
      float data[100], value[200];
      MPI_Status status;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      if (rank == 0) {
          for (i = 0; i < 100; ++i) data[i] = i;
          MPI_Send(data, 100, MPI_FLOAT, 1, 55, MPI_COMM_WORLD);
      } else if (rank == 1) {
          MPI_Recv(value, 200, MPI_FLOAT, MPI_ANY_SOURCE, 55,
                   MPI_COMM_WORLD, &status);
          printf("P:%d Got data from processor %d\n", rank,
                 status.MPI_SOURCE);
          MPI_Get_count(&status, MPI_FLOAT, &count);
          printf("P:%d Got %d elements\n", rank, count);
          printf("P:%d value[5]=%f\n", rank, value[5]);
      }
      MPI_Finalize();
      return 0;
  }

Points to check for successful communication

- The sender must specify the receiver's rank correctly.
- The receiver must specify the sender's rank correctly.
- The communicators must be the same.
- The message tags must match.
- The receive buffer must be large enough.

Non-blocking communication

- The communication proceeds through three stages:
  1. Initiate the non-blocking communication: post a send or a receive.
  2. Perform other work that does not touch the data being transferred — overlapping communication with computation.
  3. Complete the communication: wait or test.
- Removes the possibility of deadlock and reduces communication overhead.

Initiating non-blocking communication

C: int MPI_Isend(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request *request)
Fortran: MPI_ISEND(buf, count, datatype, dest, tag, comm, request, ierr)

C: int MPI_Irecv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Request *request)
Fortran: MPI_IRECV(buf, count, datatype, source, tag, comm, request, ierr)

- INTEGER request: identifies the initiated communication (handle) (OUT)
- A non-blocking receive has no status argument.

Completing non-blocking communication

- Waiting or testing
  - Waiting: the routine blocks the calling process until the communication completes; non-blocking communication + wait = blocking communication.
  - Testing: the routine returns true or false depending on whether the communication has completed.

Waiting

C: int MPI_Wait(MPI_Request *request, MPI_Status *status)
Fortran: MPI_WAIT(request, status, ierr)

- INTEGER request: identifies the posted communication (handle) (INOUT)
- INTEGER status(MPI_STATUS_SIZE): information about the received message, or the error code for a send routine (OUT)

Testing

C: int MPI_Test(MPI_Request *request, int *flag, MPI_Status *status)
Fortran: MPI_TEST(request, flag, status, ierr)

- INTEGER request: identifies the posted communication (handle) (INOUT)
- LOGICAL flag: returns true if the communication has completed, false otherwise (OUT)
- INTEGER status(MPI_STATUS_SIZE): information about the received message, or the error code for a send routine (OUT)

Non-blocking communication example: Fortran

  PROGRAM isend
  INCLUDE 'mpif.h'
  INTEGER err, rank, count, req
  REAL data(100), value(200)
  INTEGER status(MPI_STATUS_SIZE)
  CALL MPI_INIT(err)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, err)
  IF (rank.EQ.0) THEN
     data = 1.0
     CALL MPI_ISEND(data, 100, MPI_REAL, 1, 55, MPI_COMM_WORLD, req, err)
     CALL MPI_WAIT(req, status, err)
  ELSE IF (rank.EQ.1) THEN
     CALL MPI_IRECV(value, 100, MPI_REAL, 0, 55, MPI_COMM_WORLD, req, err)
     CALL MPI_WAIT(req, status, err)
     PRINT *, "P:", rank, " value(5)=", value(5)
  ENDIF
  CALL MPI_FINALIZE(err)
  END

Non-blocking communication example: C

  /* isend */
  #include <stdio.h>
  #include <mpi.h>
  int main(int argc, char *argv[])
  {
      int rank, i;
      float data[100], value[100];
      MPI_Request req;
      MPI_Status status;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      if (rank == 0) {
          for (i = 0; i < 100; ++i) data[i] = i;
          MPI_Isend(data, 100, MPI_FLOAT, 1, 55, MPI_COMM_WORLD, &req);
          MPI_Wait(&req, &status);
      } else if (rank == 1) {
          MPI_Irecv(value, 100, MPI_FLOAT, 0, 55, MPI_COMM_WORLD, &req);
          MPI_Wait(&req, &status);
          printf("P:%d value[5]=%f\n", rank, value[5]);
      }
      MPI_Finalize();
      return 0;
  }

Using point-to-point communication

- One-way communication and two-way communication.
- Two-way communication requires care to avoid deadlock.

[Diagram: one-way — rank 0's sendbuf goes to rank 1's recvbuf; two-way — each of ranks 0 and 1 both sends and receives.]

One-way communication (1/2)

- Blocking send, blocking receive:

  IF (myrank==0) THEN
     CALL MPI_SEND(sendbuf, icount, MPI_REAL, 1, itag, MPI_COMM_WORLD, ierr)
  ELSEIF (myrank==1) THEN
     CALL MPI_RECV(recvbuf, icount, MPI_REAL, 0, itag, MPI_COMM_WORLD, istatus, ierr)
  ENDIF

- Non-blocking send, blocking receive:

  IF (myrank==0) THEN
     CALL MPI_ISEND(sendbuf, icount, MPI_REAL, 1, itag, MPI_COMM_WORLD, ireq, ierr)
     CALL MPI_WAIT(ireq, istatus, ierr)
  ELSEIF (myrank==1) THEN
     CALL MPI_RECV(recvbuf, icount, MPI_REAL, 0, itag, MPI_COMM_WORLD, istatus, ierr)
  ENDIF

One-way communication (2/2)

- Blocking send, non-blocking receive:

  IF (myrank==0) THEN
     CALL MPI_SEND(sendbuf, icount, MPI_REAL, 1, itag, MPI_COMM_WORLD, ierr)
  ELSEIF (myrank==1) THEN
     CALL MPI_IRECV(recvbuf, icount, MPI_REAL, 0, itag, MPI_COMM_WORLD, ireq, ierr)
     CALL MPI_WAIT(ireq, istatus, ierr)
  ENDIF

- Non-blocking send, non-blocking receive:

  IF (myrank==0) THEN
     CALL MPI_ISEND(sendbuf, icount, MPI_REAL, 1, itag, MPI_COMM_WORLD, ireq, ierr)
  ELSEIF (myrank==1) THEN
     CALL MPI_IRECV(recvbuf, icount, MPI_REAL, 0, itag, MPI_COMM_WORLD, ireq, ierr)
  ENDIF
  CALL MPI_WAIT(ireq, istatus, ierr)

Two-way communication (1/9)

- Send first, then receive — case 1: may deadlock depending on the message size.

  IF (myrank==0) THEN
     CALL MPI_SEND(sendbuf, ...)
     CALL MPI_RECV(recvbuf, ...)
  ELSEIF (myrank==1) THEN
     CALL MPI_SEND(sendbuf, ...)
     CALL MPI_RECV(recvbuf, ...)
  ENDIF

Two-way communication (2/9)

- Send first, then receive — case 2 (same behavior as case 1):

  IF (myrank==0) THEN
     CALL MPI_ISEND(sendbuf, ..., ireq, ...)
     CALL MPI_WAIT(ireq, ...)
     CALL MPI_RECV(recvbuf, ...)
  ELSEIF (myrank==1) THEN
     CALL MPI_ISEND(sendbuf, ..., ireq, ...)
     CALL MPI_WAIT(ireq, ...)
     CALL MPI_RECV(recvbuf, ...)
  ENDIF

Two-way communication (3/9)

- Send first, then receive — case 3: no deadlock, regardless of message size.

  IF (myrank==0) THEN
     CALL MPI_ISEND(sendbuf, ..., ireq, ...)
     CALL MPI_RECV(recvbuf, ...)
     CALL MPI_WAIT(ireq, ...)
  ELSEIF (myrank==1) THEN
     CALL MPI_ISEND(sendbuf, ..., ireq, ...)
     CALL MPI_RECV(recvbuf, ...)
     CALL MPI_WAIT(ireq, ...)
  ENDIF

Two-way communication (4/9)

- Check whether deadlock occurs as the transferred data size grows.
- Use non-blocking communication to avoid deadlock.

  INTEGER N
  PARAMETER (N=100000)   ! large enough to exceed system buffering
  REAL a(N), b(N)
  IF (myrank.EQ.0) THEN
     CALL MPI_SEND(a, N, ...)
     CALL MPI_RECV(b, N, ...)
  ELSE IF (myrank.EQ.1) THEN
     CALL MPI_SEND(a, N, ...)
     CALL MPI_RECV(b, N, ...)
  ENDIF

Two-way communication (5/9)

- Receive first, then send — case 1: deadlocks regardless of message size.

  IF (myrank==0) THEN
     CALL MPI_RECV(recvbuf, ...)
     CALL MPI_SEND(sendbuf, ...)
  ELSEIF (myrank==1) THEN
     CALL MPI_RECV(recvbuf, ...)
     CALL MPI_SEND(sendbuf, ...)
  ENDIF

Two-way communication (6/9)

- Receive first, then send — case 2: no deadlock, regardless of message size.

  IF (myrank==0) THEN
     CALL MPI_IRECV(recvbuf, ..., ireq, ...)
     CALL MPI_SEND(sendbuf, ...)
     CALL MPI_WAIT(ireq, ...)
  ELSEIF (myrank==1) THEN
     CALL MPI_IRECV(recvbuf, ..., ireq, ...)
     CALL MPI_SEND(sendbuf, ...)
     CALL MPI_WAIT(ireq, ...)
  ENDIF

Two-way communication (7/9)

- Confirms that this deadlock occurs regardless of the data size.
- Use non-blocking communication to avoid it.

  REAL a(100), b(100)
  IF (myrank==0) THEN
     CALL MPI_RECV(b, 100, ...)
     CALL MPI_SEND(a, 100, ...)
  ELSE IF (myrank==1) THEN
     CALL MPI_RECV(b, 100, ...)
     CALL MPI_SEND(a, 100, ...)
  ENDIF

Two-way communication (8/9)

- One side sends first, the other receives first: no deadlock, whether blocking or non-blocking routines are used.

  IF (myrank==0) THEN
     CALL MPI_SEND(sendbuf, ...)
     CALL MPI_RECV(recvbuf, ...)
  ELSEIF (myrank==1) THEN
     CALL MPI_RECV(recvbuf, ...)
     CALL MPI_SEND(sendbuf, ...)
  ENDIF

Two-way communication (9/9)

- Recommended code:

  IF (myrank==0) THEN
     CALL MPI_ISEND(sendbuf, ..., ireq1, ...)
     CALL MPI_IRECV(recvbuf, ..., ireq2, ...)
  ELSEIF (myrank==1) THEN
     CALL MPI_ISEND(sendbuf, ..., ireq1, ...)
     CALL MPI_IRECV(recvbuf, ..., ireq2, ...)
  ENDIF
  CALL MPI_WAIT(ireq1, istatus, ierr)
  CALL MPI_WAIT(ireq2, istatus, ierr)

Collective Communication

- Broadcast
- Gather
- Reduce
- Scatter
- Barrier
- Others

Collective communication (1/2)

- Communication in which an entire group of processes participates.
- Built on point-to-point communication.
- More convenient than a hand-coded point-to-point equivalent, and advantageous for performance.
- Collective routines:
  - Must be called by every process in the communicator.
  - Synchronization is not guaranteed (except for MPI_Barrier).
  - There are no non-blocking collective routines.
  - There are no tags.

Collective communication (2/2)

  Category                              Subroutines
  One buffer                            MPI_BCAST
  One send buffer and one receive       MPI_GATHER, MPI_SCATTER, MPI_ALLGATHER,
  buffer                                MPI_ALLTOALL, MPI_GATHERV, MPI_SCATTERV,
                                        MPI_ALLGATHERV, MPI_ALLTOALLV
  Reduction                             MPI_REDUCE, MPI_ALLREDUCE, MPI_SCAN,
                                        MPI_REDUCE_SCATTER
  Others                                MPI_BARRIER, MPI_OP_CREATE, MPI_OP_FREE

Collective communication (3/3)

[Diagram: data movement among processes P0-P3 for each collective operation, where * denotes some operator — broadcast copies A from the root to every process; reduce combines A, B, C, D into A*B*C*D at the root; scatter distributes blocks A, B, C, D one per process, and gather collects them back; allreduce leaves A*B*C*D on every process; allgather leaves all of A, B, C, D on every process; scan leaves the prefix results A, A*B, A*B*C, A*B*C*D on processes 0-3; alltoall transposes the blocks (block j of process i goes to block i of process j); reduce-scatter combines element-wise and distributes the combined blocks.]

Broadcast: MPI_BCAST

C: int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm)
Fortran: MPI_BCAST(buffer, count, datatype, root, comm, ierr)

- (CHOICE) buffer: starting address of the buffer (INOUT)
- INTEGER count: number of elements in the buffer (IN)
- INTEGER datatype: MPI datatype of the buffer elements (IN)
- INTEGER root: rank of the root process (IN)
- INTEGER comm: communicator (IN)

- Sends identical data from the root process to every other process in the communicator: one-to-many communication.

MPI_BCAST example

[Diagram: in MPI_COMM_WORLD, the root (rank 0) broadcasts the four-element MPI_INTEGER array imsg to ranks 1 and 2.]

MPI_BCAST example: Fortran

  PROGRAM bcast
  INCLUDE 'mpif.h'
  INTEGER imsg(4)
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
  IF (myrank==0) THEN
     DO i=1,4
        imsg(i) = i
     ENDDO
  ELSE
     DO i=1,4
        imsg(i) = 0
     ENDDO
  ENDIF
  PRINT *, 'Before:', imsg
  CALL MPI_BCAST(imsg, 4, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
  PRINT *, 'After :', imsg
  CALL MPI_FINALIZE(ierr)
  END

MPI_BCAST example: C

  /* broadcast */
  #include <mpi.h>
  #include <stdio.h>
  int main(int argc, char *argv[])
  {
      int i, myrank;
      int imsg[4];
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
      if (myrank == 0)
          for (i = 0; i < 4; i++) imsg[i] = i + 1;
      else
          for (i = 0; i < 4; i++) imsg[i] = 0;
      printf("%d: BEFORE:", myrank);
      for (i = 0; i < 4; i++) printf(" %d", imsg[i]);
      printf("\n");
      MPI_Bcast(imsg, 4, MPI_INT, 0, MPI_COMM_WORLD);
      printf("%d: AFTER:", myrank);
      for (i = 0; i < 4; i++) printf(" %d", imsg[i]);
      printf("\n");
      MPI_Finalize();
      return 0;
  }

Gather: MPI_GATHER

C: int MPI_Gather(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)
Fortran: MPI_GATHER(sendbuf, sendcount, sendtype, recvbuf, recvcount, recvtype, root, comm, ierr)

- (CHOICE) sendbuf: starting address of the send buffer (IN)
- INTEGER sendcount: number of elements in the send buffer (IN)
- INTEGER sendtype: MPI datatype of the send-buffer elements (IN)
- (CHOICE) recvbuf: address of the receive buffer (OUT)
- INTEGER recvcount: number of elements to receive from each process (IN)
- INTEGER recvtype: MPI datatype of the receive-buffer elements (IN)
- INTEGER root: rank of the receiving (root) process (IN)
- INTEGER comm: communicator (IN)

- Collects the data sent by every process (including the root) and stores it in rank order: many-to-one communication.

MPI_GATHER: caveats

- The send buffer (sendbuf) and receive buffer (recvbuf) must not overlap in memory; that is, they must not share the same name. This applies to every collective routine that uses both a send and a receive buffer.
- All transferred data must have the same size.
- To gather data of differing sizes, use MPI_GATHERV.

MPI_GATHER example

[Diagram: in MPI_COMM_WORLD, ranks 0-2 each send one MPI_INTEGER value isend; the root (rank 0) stores them in rank order in irecv.]

MPI_GATHER example: Fortran

  PROGRAM gather
  INCLUDE 'mpif.h'
  INTEGER irecv(3)
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
  isend = myrank + 1
  CALL MPI_GATHER(isend, 1, MPI_INTEGER, irecv, 1, MPI_INTEGER, &
                  0, MPI_COMM_WORLD, ierr)
  IF (myrank==0) THEN
     PRINT *, 'irecv =', irecv
  ENDIF
  CALL MPI_FINALIZE(ierr)
  END

MPI_GATHER example: C

  /* gather */
  #include <mpi.h>
  #include <stdio.h>
  int main(int argc, char *argv[])
  {
      int i, nprocs, myrank;
      int isend, irecv[3];
      MPI_Init(&argc, &argv);
      MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
      MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
      isend = myrank + 1;
      MPI_Gather(&isend, 1, MPI_INT, irecv, 1, MPI_INT, 0,
                 MPI_COMM_WORLD);
      if (myrank == 0) {
          printf("irecv =");
          for (i = 0; i < 3; i++) printf(" %d", irecv[i]);
          printf("\n");
      }
      MPI_Finalize();
      return 0;
  }

Gather: MPI_GATHERV

C: int MPI_Gatherv(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int *recvcounts, int *displs, MPI_Datatype recvtype, int root, MPI_Comm comm)
Fortran: MPI_GATHERV(sendbuf, sendcount, sendtype, recvbuf, recvcounts, displs, recvtype, root, comm, ierr)

- (CHOICE) recvbuf: address of the receive buffer (OUT)
- INTEGER recvcounts(*): integer array; entry i holds the number of elements received from process i (IN)
- INTEGER displs(*): integer array; entry i gives the position in the receive buffer where the data arriving from process i is placed (IN)

- Used when the amount of data sent by each process differs.
- The differing message sizes go in the array recvcounts; the array displs records where in the root process's buffer each contribution is placed.

MPI_GATHERV example

[Diagram: ranks 0-2 send 1, 2, and 3 elements respectively; the root (rank 0) stores them at displacements 0, 1, and 3 of recvbuf.]

MPI_GATHERV example: Fortran

  PROGRAM gatherv
  INCLUDE 'mpif.h'
  INTEGER isend(3), irecv(6)
  INTEGER ircnt(0:2), idisp(0:2)
  DATA ircnt/1,2,3/ idisp/0,1,3/
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
  DO i=1,myrank+1
     isend(i) = myrank + 1
  ENDDO
  iscnt = myrank + 1
  CALL MPI_GATHERV(isend, iscnt, MPI_INTEGER, irecv, ircnt, idisp, &
                   MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
  IF (myrank==0) THEN
     PRINT *, 'irecv =', irecv
  ENDIF
  CALL MPI_FINALIZE(ierr)
  END

MPI_GATHERV example: C

  /* gatherv */
  #include <mpi.h>
  #include <stdio.h>
  int main(int argc, char *argv[])
  {
      int i, myrank;
      int isend[3], irecv[6];
      int iscnt, ircnt[3] = {1, 2, 3}, idisp[3] = {0, 1, 3};
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
      for (i = 0; i < myrank + 1; i++) isend[i] = myrank + 1;
      iscnt = myrank + 1;
      MPI_Gatherv(isend, iscnt, MPI_INT, irecv, ircnt, idisp,
                  MPI_INT, 0, MPI_COMM_WORLD);
      if (myrank == 0) {
          printf("irecv =");
          for (i = 0; i < 6; i++) printf(" %d", irecv[i]);
          printf("\n");
      }
      MPI_Finalize();
      return 0;
  }

Gather: MPI_ALLGATHER

C: int MPI_Allgather(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, MPI_Comm comm)
Fortran: MPI_ALLGATHER(sendbuf, sendcount, sendtype, recvbuf, recvcount, recvtype, comm, ierr)

- (CHOICE) sendbuf: starting address of the send buffer (IN)
- INTEGER sendcount: number of elements in the send buffer (IN)
- INTEGER sendtype: MPI datatype of the send-buffer elements (IN)
- (CHOICE) recvbuf: address of the receive buffer (OUT)
- INTEGER recvcount: number of elements received from each process (IN)
- INTEGER recvtype: datatype of the receive buffer (IN)
- INTEGER comm: communicator (IN)

- Equivalent to MPI_GATHER + MPI_BCAST.
- The data of process j is stored in the j-th block of every process's receive buffer.

MPI_ALLGATHER example

[Diagram: ranks 0-2 each contribute one element; every process ends up with all three elements in rank order in recvbuf.]

MPI_ALLGATHER example: Fortran

  PROGRAM allgather
  INCLUDE 'mpif.h'
  INTEGER irecv(3)
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
  isend = myrank + 1
  CALL MPI_ALLGATHER(isend, 1, MPI_INTEGER, &
                     irecv, 1, MPI_INTEGER, MPI_COMM_WORLD, ierr)
  PRINT *, 'irecv =', irecv
  CALL MPI_FINALIZE(ierr)
  END

MPI_ALLGATHER example: C

  /* allgather */
  #include <mpi.h>
  #include <stdio.h>
  int main(int argc, char *argv[])
  {
      int i, myrank;
      int isend, irecv[3];
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
      isend = myrank + 1;
      MPI_Allgather(&isend, 1, MPI_INT, irecv, 1, MPI_INT,
                    MPI_COMM_WORLD);
      printf("%d: irecv =", myrank);
      for (i = 0; i < 3; i++) printf(" %d", irecv[i]);
      printf("\n");
      MPI_Finalize();
      return 0;
  }

Gather: MPI_ALLGATHERV

C: int MPI_Allgatherv(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int *recvcounts, int *displs, MPI_Datatype recvtype, MPI_Comm comm)
Fortran: MPI_ALLGATHERV(sendbuf, sendcount, sendtype, recvbuf, recvcounts, displs, recvtype, comm, ierr)

- Works like MPI_ALLGATHER, but gathers data of differing sizes.

MPI_ALLGATHERV example

[Diagram: ranks 0-2 send 1, 2, and 3 elements respectively; every process stores the contributions at displacements 0, 1, and 3 of its own recvbuf.]

Reduction: MPI_REDUCE

C: int MPI_Reduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)
Fortran: MPI_REDUCE(sendbuf, recvbuf, count, datatype, op, root, comm, ierr)

- (CHOICE) sendbuf: starting address of the send buffer (IN)
- (CHOICE) recvbuf: address of the receive buffer (OUT)
- INTEGER count: number of elements in the send buffer (IN)
- INTEGER datatype: MPI datatype of the send-buffer elements (IN)
- INTEGER op: reduction operator (IN)
- INTEGER root: rank of the root process (IN)
- INTEGER comm: communicator (IN)

- Collects data from every process, reduces it to a single value, and stores the result on the root process.

MPI_REDUCE: operations and datatypes (1/3)

  Operation                                   Data Type (Fortran)
  MPI_SUM (sum), MPI_PROD (product)           MPI_INTEGER, MPI_REAL,
                                              MPI_DOUBLE_PRECISION, MPI_COMPLEX
  MPI_MAX (maximum), MPI_MIN (minimum)        MPI_INTEGER, MPI_REAL,
                                              MPI_DOUBLE_PRECISION
  MPI_MAXLOC (max value and location),        MPI_2INTEGER, MPI_2REAL,
  MPI_MINLOC (min value and location)         MPI_2DOUBLE_PRECISION
  MPI_LAND, MPI_LOR, MPI_LXOR (logical)       MPI_LOGICAL
  MPI_BAND, MPI_BOR, MPI_BXOR (bitwise)       MPI_INTEGER, MPI_BYTE

MPI_REDUCE: operations and datatypes (2/3)

  Operation                                   Data Type (C)
  MPI_SUM (sum), MPI_PROD (product)           MPI_INT, MPI_LONG, MPI_SHORT,
                                              MPI_UNSIGNED_SHORT, MPI_UNSIGNED,
                                              MPI_UNSIGNED_LONG, MPI_FLOAT,
                                              MPI_DOUBLE, MPI_LONG_DOUBLE
  MPI_MAX (maximum), MPI_MIN (minimum)        same as MPI_SUM / MPI_PROD
  MPI_MAXLOC (max value and location),        MPI_FLOAT_INT, MPI_DOUBLE_INT,
  MPI_MINLOC (min value and location)         MPI_LONG_INT, MPI_2INT,
                                              MPI_SHORT_INT, MPI_LONG_DOUBLE_INT
  MPI_LAND, MPI_LOR, MPI_LXOR (logical)       MPI_INT, MPI_LONG, MPI_SHORT,
                                              MPI_UNSIGNED_SHORT, MPI_UNSIGNED,
                                              MPI_UNSIGNED_LONG
  MPI_BAND, MPI_BOR, MPI_BXOR (bitwise)       MPI_INT, MPI_LONG, MPI_SHORT,
                                              MPI_UNSIGNED_SHORT, MPI_UNSIGNED,
                                              MPI_UNSIGNED_LONG, MPI_BYTE

MPI_REDUCE: operations and datatypes (3/3)

- Pair datatypes used with MPI_MAXLOC and MPI_MINLOC in C:

  Data Type                Description (C)
  MPI_FLOAT_INT            { MPI_FLOAT, MPI_INT }
  MPI_DOUBLE_INT           { MPI_DOUBLE, MPI_INT }
  MPI_LONG_INT             { MPI_LONG, MPI_INT }
  MPI_2INT                 { MPI_INT, MPI_INT }
  MPI_SHORT_INT            { MPI_SHORT, MPI_INT }
  MPI_LONG_DOUBLE_INT      { MPI_LONG_DOUBLE, MPI_INT }

MPI_REDUCE: user-defined operations (1/2)

- Define a new operation (my_operator) with the following signature:

  C:
  void my_operator(void *invec, void *inoutvec, int *len,
                   MPI_Datatype *datatype)

  Fortran:
  SUBROUTINE MY_OPERATOR(INVEC, INOUTVEC, LEN, DATATYPE)
  <type> INVEC(LEN), INOUTVEC(LEN)
  INTEGER LEN, DATATYPE

MPI_REDUCE: user-defined operations (2/2)

- Register the user-defined operation (e.g., register my_operator as op).
- If the argument commute is true, the reduction can be performed faster.

  C:
  int MPI_Op_create(MPI_User_function *my_operator, int commute,
                    MPI_Op *op)

  Fortran:
  EXTERNAL MY_OPERATOR
  INTEGER OP, IERR
  LOGICAL COMMUTE
  CALL MPI_OP_CREATE(MY_OPERATOR, COMMUTE, OP, IERR)

MPI_REDUCE example

[Diagram: each of ranks 0-2 sums its local slice of array a; MPI_REDUCE combines the partial sums into tmp on the root (rank 0).]

MPI_REDUCE example: Fortran

  PROGRAM reduce
  INCLUDE 'mpif.h'
  REAL a(9)
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
  ista = myrank*3 + 1
  iend = ista + 2
  DO i=ista,iend
     a(i) = i
  ENDDO
  sum = 0.0
  DO i=ista,iend
     sum = sum + a(i)
  ENDDO
  CALL MPI_REDUCE(sum, tmp, 1, MPI_REAL, MPI_SUM, 0, &
                  MPI_COMM_WORLD, ierr)
  sum = tmp
  IF (myrank==0) THEN
     PRINT *, 'sum =', sum
  ENDIF
  CALL MPI_FINALIZE(ierr)
  END

MPI_REDUCE example: C

  /* reduce */
  #include <mpi.h>
  #include <stdio.h>
  int main(int argc, char *argv[])
  {
      int i, myrank, ista, iend;
      double a[9], sum, tmp;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
      ista = myrank * 3;
      iend = ista + 2;
      for (i = ista; i < iend + 1; i++) a[i] = i + 1;
      sum = 0.0;
      for (i = ista; i < iend + 1; i++) sum = sum + a[i];
      MPI_Reduce(&sum, &tmp, 1, MPI_DOUBLE, MPI_SUM, 0,
                 MPI_COMM_WORLD);
      sum = tmp;
      if (myrank == 0) printf("sum = %f\n", sum);
      MPI_Finalize();
      return 0;
  }

MPI_REDUCE example: arrays

[Diagram: when count > 1, the reduction is applied element-wise; element i of recvbuf on the root holds op applied across element i of every process's sendbuf.]

Reduction: MPI_ALLREDUCE

C: int MPI_Allreduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)
Fortran: MPI_ALLREDUCE(sendbuf, recvbuf, count, datatype, op, comm, ierr)

- Collects data from every process, reduces it to a single value, and stores the result on every process.

MPI_ALLREDUCE example

[Diagram: element-wise reduction across the processes' sendbufs; the reduced results appear in every process's recvbuf.]

Scatter: MPI_SCATTER

C: int MPI_Scatter(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)
Fortran: MPI_SCATTER(sendbuf, sendcount, sendtype, recvbuf, recvcount, recvtype, root, comm, ierr)

- (CHOICE) sendbuf: address of the send buffer (IN)
- INTEGER sendcount: number of elements sent to each process (IN)
- INTEGER sendtype: MPI datatype of the send-buffer elements (IN)
- (CHOICE) recvbuf: address of the receive buffer (OUT)
- INTEGER recvcount: number of elements in the receive buffer (IN)
- INTEGER recvtype: MPI datatype of the receive buffer (IN)
- INTEGER root: rank of the sending (root) process (IN)
- INTEGER comm: communicator (IN)

- The root process divides its data into equally sized pieces and sends one to each process in rank order.

MPI_SCATTER example

[Diagram: the root (rank 0) splits a three-element sendbuf; ranks 0-2 each receive one element in recvbuf.]

MPI_SCATTER example: Fortran

  PROGRAM scatter
  INCLUDE 'mpif.h'
  INTEGER isend(3)
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
  IF (myrank==0) THEN
     DO i=1,nprocs
        isend(i) = i
     ENDDO
  ENDIF
  CALL MPI_SCATTER(isend, 1, MPI_INTEGER, irecv, 1, MPI_INTEGER, &
                   0, MPI_COMM_WORLD, ierr)
  PRINT *, 'irecv =', irecv
  CALL MPI_FINALIZE(ierr)
  END

MPI_SCATTER example: C

  /* scatter */
  #include <mpi.h>
  #include <stdio.h>
  int main(int argc, char *argv[])
  {
      int i, myrank, nprocs;
      int isend[3], irecv;
      MPI_Init(&argc, &argv);
      MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
      MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
      for (i = 0; i < nprocs; i++) isend[i] = i + 1;
      MPI_Scatter(isend, 1, MPI_INT, &irecv, 1, MPI_INT, 0,
                  MPI_COMM_WORLD);
      printf("%d: irecv = %d\n", myrank, irecv);
      MPI_Finalize();
      return 0;
  }

Scatter: MPI_SCATTERV

C: int MPI_Scatterv(void *sendbuf, int *sendcounts, int *displs, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)
Fortran: MPI_SCATTERV(sendbuf, sendcounts, displs, sendtype, recvbuf, recvcount, recvtype, root, comm, ierr)

- (CHOICE) sendbuf: address of the send buffer (IN)
- INTEGER sendcounts(*): integer array; entry i holds the number of elements to send to process i (IN)
- INTEGER displs(*): integer array; entry i holds the offset within the send buffer of the data to send to process i (IN)

- The root process divides its data into pieces of differing sizes and sends one to each process in rank order.

MPI_SCATTERV example

[Diagram: the root (rank 0) sends 1, 2, and 3 elements taken at displacements 0, 1, and 3 of sendbuf to ranks 0, 1, and 2.]

Barrier: MPI_BARRIER

C: int MPI_Barrier(MPI_Comm comm)
Fortran: MPI_BARRIER(comm, ierr)

- Blocks further progress of each process until every process in the communicator has called MPI_BARRIER.

Others: MPI_ALLTOALL

C: int MPI_Alltoall(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, MPI_Comm comm)
Fortran: MPI_ALLTOALL(sendbuf, sendcount, sendtype, recvbuf, recvcount, recvtype, comm, ierr)

- Every process delivers a distinct message of identical size to every process.
- The j-th data block sent by process i is received by process j and stored in the i-th block of its receive buffer.

MPI_ALLTOALL example

[Diagram: three processes each hold three blocks; after MPI_ALLTOALL, block j of process i has moved to block i of process j, i.e., the block matrix is transposed.]

Others: MPI_ALLTOALLV

C: int MPI_Alltoallv(void *sendbuf, int *sendcounts, int *sdispls, MPI_Datatype sendtype, void *recvbuf, int *recvcounts, int *rdispls, MPI_Datatype recvtype, MPI_Comm comm)
Fortran: MPI_ALLTOALLV(sendbuf, sendcounts, sdispls, sendtype, recvbuf, recvcounts, rdispls, recvtype, comm, ierr)

- Every process delivers a distinct message of possibly different size to every process.
- The sendcounts(j) elements sent by process i are received by process j and stored starting at position rdispls(i) of its receive buffer.

MPI_ALLTOALLV example

[Figure: on each rank, the blocks described by sendcounts and sdispls in sendbuf are delivered to the other ranks and stored at the positions given by recvcounts and rdispls in recvbuf.]

Others: MPI_REDUCE_SCATTER

C: int MPI_Reduce_scatter(void *sendbuf, void *recvbuf, int *recvcounts, MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)
Fortran: MPI_REDUCE_SCATTER(sendbuf, recvbuf, recvcounts, datatype, op, comm, ierr)

- Performs an element-wise reduction (reduction operation op) over the send buffers of all processes, then distributes the result: recvcounts(i) consecutive elements of the reduced result are sent to process i.
- Equivalent to MPI_REDUCE followed by MPI_SCATTERV.

MPI_REDUCE_SCATTER example

[Figure: the send buffers of all ranks are combined element-wise with op, and the reduced result is split according to recvcounts across the ranks' receive buffers.]

Others: MPI_SCAN

C: int MPI_Scan(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)
Fortran: MPI_SCAN(sendbuf, recvbuf, count, datatype, op, comm, ierr)

- Stores in the receive buffer of process i the reduction (with op) of the send-buffer data of processes 0 through i, i.e. an inclusive prefix reduction.

MPI_SCAN example

[Figure: rank i's recvbuf holds the op-reduction of the sendbuf values of ranks 0 through i; e.g. with op = MPI_SUM, rank 2 receives sendbuf(0) + sendbuf(1) + sendbuf(2).]

Derived Datatypes
Transferring Subarrays

Derived datatypes (1/2)

- Transferring data that is of mixed datatypes or non-contiguous:
  - non-contiguous data of a single datatype
  - contiguous data of mixed datatypes
  - non-contiguous data of mixed datatypes
- Option 1: send each piece separately.
- Option 2: pack everything into a staging buffer, send it, then unpack into place: MPI_PACK / MPI_UNPACK with the MPI_PACKED datatype.
  → slow, inconvenient, error-prone

Derived datatypes (2/2)

- Sending a(4), a(5), a(7), a(8), a(10), a(11) from REAL a(12):
  - define a derived datatype itype1 covering the whole pattern and send one of it:
    CALL MPI_SEND(a(4), 1, itype1, idst, itag, MPI_COMM_WORLD, ierr)
  - or define a derived datatype itype2 covering one block of two elements and send three of it:
    CALL MPI_SEND(a(4), 3, itype2, idst, itag, MPI_COMM_WORLD, ierr)

Using derived datatypes

- CONSTRUCT: build a new datatype with an MPI routine
  - MPI_Type_contiguous
  - MPI_Type_(h)vector
  - MPI_Type_struct
- COMMIT: register the constructed datatype
  - MPI_Type_commit
- USE: use the new datatype in sends, receives, etc.

MPI_TYPE_COMMIT

C: int MPI_Type_commit(MPI_Datatype *datatype)
Fortran: MPI_TYPE_COMMIT(datatype, ierr)
- INTEGER datatype : datatype to register (handle) (INOUT)

- Makes a newly defined datatype usable in communication.
- A committed datatype remains usable until it is released with MPI_TYPE_FREE(datatype, ierr).

MPI_TYPE_CONTIGUOUS

C: int MPI_Type_contiguous(int count, MPI_Datatype oldtype, MPI_Datatype *newtype)
Fortran: MPI_TYPE_CONTIGUOUS(count, oldtype, newtype, ierr)
- INTEGER count : number of elements to combine (IN)
- INTEGER oldtype : old datatype (handle) (IN)
- INTEGER newtype : new datatype (handle) (OUT)

- Defines a new datatype (newtype) consisting of count contiguous elements of the same old datatype (oldtype).

MPI_TYPE_CONTIGUOUS example

[Figure: count consecutive MPI_INTEGER (oldtype) elements combined into one newtype.]

MPI_TYPE_CONTIGUOUS example: Fortran

PROGRAM type_contiguous
INCLUDE 'mpif.h'
INTEGER ibuf(20)
INTEGER inewtype
CALL MPI_INIT(ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
IF (myrank==0) THEN
   DO i=1,20
      ibuf(i) = i
   ENDDO
ENDIF
CALL MPI_TYPE_CONTIGUOUS(3, MPI_INTEGER, inewtype, ierr)
CALL MPI_TYPE_COMMIT(inewtype, ierr)
CALL MPI_BCAST(ibuf, 3, inewtype, 0, MPI_COMM_WORLD, ierr)
PRINT *, 'ibuf =', ibuf
CALL MPI_FINALIZE(ierr)
END

MPI_TYPE_CONTIGUOUS example: C

/* type_contiguous */
#include <mpi.h>
#include <stdio.h>
void main (int argc, char *argv[]){
   int i, myrank, ibuf[20];
   MPI_Datatype inewtype;
   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
   if(myrank==0)
      for(i=0; i<20; i++) ibuf[i]=i+1;
   else
      for(i=0; i<20; i++) ibuf[i]=0;
   MPI_Type_contiguous(3, MPI_INT, &inewtype);
   MPI_Type_commit(&inewtype);
   MPI_Bcast(ibuf, 3, inewtype, 0, MPI_COMM_WORLD);
   printf("%d : ibuf =", myrank);
   for(i=0; i<20; i++) printf(" %d", ibuf[i]);
   printf("\n");
   MPI_Finalize();
}

MPI_TYPE_VECTOR (1/2)

C: int MPI_Type_vector(int count, int blocklength, int stride, MPI_Datatype oldtype, MPI_Datatype *newtype)
Fortran: MPI_TYPE_VECTOR(count, blocklength, stride, oldtype, newtype, ierr)
- INTEGER count : number of blocks (IN)
- INTEGER blocklength : number of oldtype elements in each block (IN)
- INTEGER stride : spacing, in elements, between the starts of adjacent blocks (IN)
- INTEGER oldtype : old datatype (handle) (IN)
- INTEGER newtype : new datatype (handle) (OUT)

- Defines a new datatype made of count equally spaced blocks.
- Each block contains blocklength elements of the old datatype.

MPI_TYPE_VECTOR (2/2)

[Figure: layout with count blocks of blocklength elements each, the block starts separated by stride = 5 elements.]

MPI_TYPE_VECTOR example

[Figure: count = 4 blocks of blocklength = 2 MPI_INTEGER (oldtype) elements; the block starts are stride = 3 units apart.]

MPI_TYPE_VECTOR example: Fortran

PROGRAM type_vector
INCLUDE 'mpif.h'
INTEGER ibuf(20), inewtype
CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
IF (myrank==0) THEN
   DO i=1,20
      ibuf(i) = i
   ENDDO
ENDIF
CALL MPI_TYPE_VECTOR(4, 2, 3, MPI_INTEGER, inewtype, ierr)
CALL MPI_TYPE_COMMIT(inewtype, ierr)
CALL MPI_BCAST(ibuf, 1, inewtype, 0, MPI_COMM_WORLD, ierr)
PRINT *, 'ibuf =', ibuf
CALL MPI_FINALIZE(ierr)
END

MPI_TYPE_VECTOR example: C

/* type_vector */
#include <mpi.h>
#include <stdio.h>
void main (int argc, char *argv[]){
   int i, myrank, ibuf[20];
   MPI_Datatype inewtype;
   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
   if(myrank==0)
      for(i=0; i<20; i++) ibuf[i]=i+1;
   else
      for(i=0; i<20; i++) ibuf[i]=0;
   MPI_Type_vector(4, 2, 3, MPI_INT, &inewtype);
   MPI_Type_commit(&inewtype);
   MPI_Bcast(ibuf, 1, inewtype, 0, MPI_COMM_WORLD);
   printf("%d : ibuf =", myrank);
   for(i=0; i<20; i++) printf(" %d", ibuf[i]);
   printf("\n");
   MPI_Finalize();
}

MPI_TYPE_HVECTOR

C: int MPI_Type_hvector(int count, int blocklength, int stride, MPI_Datatype oldtype, MPI_Datatype *newtype)
Fortran: MPI_TYPE_HVECTOR(count, blocklength, stride, oldtype, newtype, ierr)

- Same as MPI_TYPE_VECTOR except that stride is given in bytes:
  - bytes between block starts = stride : MPI_TYPE_HVECTOR
  - bytes between block starts = stride * extent(oldtype) : MPI_TYPE_VECTOR

[Figure: count = 4 blocks of blocklength MPI_INTEGER (oldtype) elements; stride measured in bytes.]

MPI_TYPE_STRUCT (1/3)

C: int MPI_Type_struct(int count, int *array_of_blocklengths, MPI_Aint *array_of_displacements, MPI_Datatype *array_of_types, MPI_Datatype *newtype)
Fortran: MPI_TYPE_STRUCT(count, array_of_blocklengths, array_of_displacements, array_of_types, newtype, ierr)
- INTEGER count : number of blocks; also the number of entries in array_of_blocklengths, array_of_displacements and array_of_types (IN)
- INTEGER array_of_blocklengths(*) : number of elements in each block; array_of_blocklengths(i) is the number of elements of datatype array_of_types(i) in block i (IN)
- INTEGER array_of_displacements(*) : displacement of each block, in bytes (IN)
- INTEGER array_of_types(*) : datatype of the elements making up each block; block i consists of elements of datatype array_of_types(i) (IN)
- INTEGER newtype : new datatype (OUT)

MPI_TYPE_STRUCT (2/3)

- The most general derived datatype constructor.
- Can describe variables built from different datatypes:
  - C structures
  - Fortran common blocks
- Defines a new datatype of count blocks; block i consists of array_of_blocklengths(i) elements of datatype array_of_types(i), placed at displacement array_of_displacements(i).

MPI_TYPE_STRUCT (3/3)

count = 2
array_of_blocklengths = {1, 3}
array_of_types = {MPI_INT, MPI_DOUBLE}
array_of_displacements = {0, extent(MPI_INT)}

MPI_TYPE_STRUCT example (1/2)

[Figure: count = 2 blocks of MPI_INTEGER data with array_of_blocklengths = (2, 3) and array_of_displacements = (0, 5*4) bytes.]

MPI_TYPE_STRUCT example (2/2)

[Figure: count = 2 blocks with array_of_blocklengths = (1, 3); block 1 is the pseudo-type MPI_LB at displacement 0, block 2 is MPI_INTEGER data at displacement 2*4 bytes.]

MPI_TYPE_STRUCT example: Fortran (1/2)

PROGRAM type_struct
INCLUDE 'mpif.h'
INTEGER ibuf1(10), ibuf2(10)
INTEGER iblock(2), idisp(2), itype(2)
CALL MPI_INIT(ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
IF (myrank==0) THEN
   DO i=1,10
      ibuf1(i) = i
      ibuf2(i) = i
   ENDDO
ENDIF
iblock(1) = 2; iblock(2) = 3
idisp(1) = 0; idisp(2) = 5 * 4
itype(1) = MPI_INTEGER; itype(2) = MPI_INTEGER
CALL MPI_TYPE_STRUCT(2, iblock, idisp, itype, inewtype1, ierr)
CALL MPI_TYPE_COMMIT(inewtype1, ierr)

MPI_TYPE_STRUCT example: Fortran (2/2)

CALL MPI_BCAST(ibuf1, 1, inewtype1, 0, MPI_COMM_WORLD, ierr)
PRINT *, 'Ex. 1:', ibuf1
iblock(1) = 1; iblock(2) = 3
idisp(1) = 0; idisp(2) = 2 * 4
itype(1) = MPI_LB
itype(2) = MPI_INTEGER
CALL MPI_TYPE_STRUCT(2, iblock, idisp, itype, inewtype2, ierr)
CALL MPI_TYPE_COMMIT(inewtype2, ierr)
CALL MPI_BCAST(ibuf2, 1, inewtype2, 0, MPI_COMM_WORLD, ierr)
PRINT *, 'Ex. 2:', ibuf2
CALL MPI_FINALIZE(ierr)
END

MPI_UB, MPI_LB: MPI pseudo-datatypes. They occupy no space and are used to place an empty gap at the beginning or end of a datatype.

MPI_TYPE_STRUCT example: C (1/3)

/* type_struct */
#include <mpi.h>
#include <stdio.h>
void main (int argc, char *argv[]){
   int i, myrank;
   int ibuf1[10], ibuf2[10], iblock[2];
   MPI_Datatype inewtype1, inewtype2;
   MPI_Datatype itype[2];
   MPI_Aint idisp[2];
   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
   if(myrank==0)
      for(i=0; i<10; i++) {
         ibuf1[i]=i+1;
         ibuf2[i]=i+1;
      }

MPI_TYPE_STRUCT example: C (2/3)

   else
      for(i=0; i<10; i++){
         ibuf1[i]=0;
         ibuf2[i]=0;
      }
   iblock[0] = 2; iblock[1] = 3;
   idisp[0] = 0; idisp[1] = 5*4;
   itype[0] = MPI_INT; itype[1] = MPI_INT;
   MPI_Type_struct(2, iblock, idisp, itype, &inewtype1);
   MPI_Type_commit(&inewtype1);
   MPI_Bcast(ibuf1, 1, inewtype1, 0, MPI_COMM_WORLD);
   printf("%d : Ex. 1 :", myrank);
   for(i=0; i<10; i++) printf(" %d", ibuf1[i]);
   printf("\n");

MPI_TYPE_STRUCT example: C (3/3)

   iblock[0] = 1; iblock[1] = 3;
   idisp[0] = 0; idisp[1] = 2*4;
   itype[0] = MPI_LB; itype[1] = MPI_INT;
   MPI_Type_struct(2, iblock, idisp, itype, &inewtype2);
   MPI_Type_commit(&inewtype2);
   MPI_Bcast(ibuf2, 1, inewtype2, 0, MPI_COMM_WORLD);
   printf("%d : Ex. 2 :", myrank);
   for(i=0; i<10; i++) printf(" %d", ibuf2[i]);
   printf("\n");
   MPI_Finalize();
}

MPI_TYPE_EXTENT

C: int MPI_Type_extent(MPI_Datatype datatype, MPI_Aint *extent)
Fortran: MPI_TYPE_EXTENT(datatype, extent, ierr)
- INTEGER datatype : datatype (handle) (IN)
- INTEGER extent : extent of the datatype (OUT)

- The extent of a datatype is the number of bytes it occupies in memory.

MPI_TYPE_EXTENT example: Fortran (1/2)

PROGRAM structure
INCLUDE 'mpif.h'
INTEGER err, rank, num
INTEGER status(MPI_STATUS_SIZE)
REAL x
COMPLEX data(4)
COMMON /result/num,x,data
INTEGER blocklengths(3)
DATA blocklengths/1,1,4/
INTEGER displacements(3)
INTEGER types(3), restype
DATA types/MPI_INTEGER,MPI_REAL,MPI_COMPLEX/
INTEGER intex,realex
CALL MPI_INIT(err)
CALL MPI_COMM_RANK(MPI_COMM_WORLD,rank,err)
CALL MPI_TYPE_EXTENT(MPI_INTEGER,intex,err)
CALL MPI_TYPE_EXTENT(MPI_REAL,realex,err)
displacements(1)=0; displacements(2)=intex

MPI_TYPE_EXTENT example: Fortran (2/2)

displacements(3)=intex+realex
CALL MPI_TYPE_STRUCT(3,blocklengths,displacements, &
     types,restype,err)
CALL MPI_TYPE_COMMIT(restype,err)
IF(rank.eq.0) THEN
   num=6; x=.4
   DO i=1,4
      data(i)=cmplx(i,i)
   ENDDO
   CALL MPI_SEND(num,1,restype,1,5,MPI_COMM_WORLD,err)
ELSE IF(rank.eq.1) THEN
   CALL MPI_RECV(num,1,restype,0,5,MPI_COMM_WORLD,status,err)
   PRINT *,'P:',rank,' I got'
   PRINT *,num
   PRINT *,x
   PRINT *,data
END IF
CALL MPI_FINALIZE(err)
END

MPI_TYPE_EXTENT example: C (1/2)

#include <stdio.h>
#include <mpi.h>
void main(int argc, char *argv[])
{
   int rank,i;
   MPI_Status status;
   struct {
      int num;
      float x;
      double data[4];
   } a;
   int blocklengths[3]={1,1,4};
   MPI_Datatype types[3]={MPI_INT,MPI_FLOAT,MPI_DOUBLE};
   MPI_Aint displacements[3];
   MPI_Datatype restype;
   MPI_Aint intex,floatex;
   MPI_Init(&argc,&argv);
   MPI_Comm_rank(MPI_COMM_WORLD,&rank);
   MPI_Type_extent(MPI_INT,&intex);
   MPI_Type_extent(MPI_FLOAT,&floatex);

MPI_TYPE_EXTENT example: C (2/2)

   displacements[0]=0; displacements[1]=intex;
   displacements[2]=intex+floatex;
   MPI_Type_struct(3,blocklengths,displacements,types,&restype);
   MPI_Type_commit(&restype);
   if (rank==0){
      a.num=6; a.x=.4;
      for(i=0;i<4;++i) a.data[i]=(double) i;
      MPI_Send(&a,1,restype,1,5,MPI_COMM_WORLD);
   }
   else if(rank==1) {
      MPI_Recv(&a,1,restype,0,5,MPI_COMM_WORLD,&status);
      printf("P:%d my a is %d %f %lf %lf %lf %lf\n",
         rank,a.num,a.x,a.data[0],a.data[1],a.data[2],a.data[3]);
   }
   MPI_Finalize();
}

Transferring subarrays (1/2)

C: int MPI_Type_create_subarray(int ndims, int *array_of_sizes, int *array_of_subsizes, int *array_of_starts, int order, MPI_Datatype oldtype, MPI_Datatype *newtype)
Fortran: MPI_TYPE_CREATE_SUBARRAY(ndims, array_of_sizes, array_of_subsizes, array_of_starts, order, oldtype, newtype, ierr)
- INTEGER ndims : number of array dimensions (positive integer) (IN)
- INTEGER array_of_sizes(*) : size of the full array in each dimension; entry i is the size of dimension i (positive integer) (IN)
- INTEGER array_of_subsizes(*) : size of the subarray in each dimension; entry i is the size of dimension i (positive integer) (IN)
- INTEGER array_of_starts(*) : starting coordinates of the subarray; entry i is the start in dimension i (zero-based) (IN)
- INTEGER order : array storage order (row-major or column-major) (IN)
- INTEGER oldtype : datatype of the elements of the full array (IN)
- INTEGER newtype : new datatype made up of the subarray (OUT)

Transferring subarrays (2/2)

- Routine for creating a derived datatype made up of a subarray.
- order determines how the array is read and stored:
  - order = MPI_ORDER_FORTRAN : column-major
  - order = MPI_ORDER_C : row-major
- MPI_TYPE_CREATE_SUBARRAY is an MPI-2 routine; when compiling on the KISTI IBM system, use the _r compilers:
  % mpxlf90_r -o ...
  % mpcc_r -o ...

Subarray transfer example

[Figure: a subarray of a(1:7,1:6) — ndims = 2; array_of_sizes = (7, 6); array_of_subsizes = (5, 3); array_of_starts = (2, 1); order = MPI_ORDER_FORTRAN.]

Subarray transfer example: Fortran (1/2)

PROGRAM sub_array
INCLUDE 'mpif.h'
INTEGER ndims
PARAMETER(ndims=2)
INTEGER ibuf(7,6)
INTEGER array_of_sizes(ndims), array_of_subsizes(ndims)
INTEGER array_of_starts(ndims)
CALL MPI_INIT(ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
DO j = 1, 6
   DO i = 1, 7
      IF (myrank==0) THEN
         ibuf(i,j) = i
      ELSE
         ibuf(i,j) = 0
      ENDIF
   ENDDO
ENDDO

Subarray transfer example: Fortran (2/2)

array_of_sizes(1)=7; array_of_sizes(2)=6
array_of_subsizes(1)=5; array_of_subsizes(2)=3
array_of_starts(1)=2; array_of_starts(2)=1
CALL MPI_TYPE_CREATE_SUBARRAY(ndims, array_of_sizes, &
     array_of_subsizes, array_of_starts, MPI_ORDER_FORTRAN, &
     MPI_INTEGER, newtype, ierr)
CALL MPI_TYPE_COMMIT(newtype, ierr)
CALL MPI_BCAST(ibuf, 1, newtype, 0, MPI_COMM_WORLD, ierr)
PRINT *, 'I am :', myrank
DO i=1,7
   PRINT *, (ibuf(i,j), j=1,6)
ENDDO
CALL MPI_FINALIZE(ierr)
END

Subarray transfer example: C (1/2)

#include <mpi.h>
#include <stdio.h>
#define ndims 2
void main(int argc, char *argv[]){
   int ibuf[6][7];
   int array_of_sizes[ndims], array_of_subsizes[ndims], array_of_starts[ndims];
   int i, j, myrank;
   MPI_Datatype newtype;
   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
   if(myrank==0)
      for(i=0; i<6; i++)
         for(j=0; j<7; j++) ibuf[i][j] = i+1;
   else
      for(i=0; i<6; i++)
         for(j=0; j<7; j++) ibuf[i][j] = 0;
   array_of_sizes[0]=6; array_of_sizes[1]=7;
   array_of_subsizes[0]=3; array_of_subsizes[1]=5;
   array_of_starts[0]=1; array_of_starts[1]=2;

Subarray transfer example: C (2/2)

   MPI_Type_create_subarray(ndims, array_of_sizes, array_of_subsizes,
        array_of_starts, MPI_ORDER_C, MPI_INT, &newtype);
   MPI_Type_commit(&newtype);
   MPI_Bcast(ibuf, 1, newtype, 0, MPI_COMM_WORLD);
   if(myrank != 0) {
      printf(" I am : %d \n", myrank);
      for(i=0; i<6; i++) {
         for(j=0; j<7; j++) printf(" %d", ibuf[i][j]);
         printf("\n");
      }
   }
   MPI_Finalize();
}

Creating Process Groups: MPI_COMM_SPLIT
Virtual Topologies

Creating process groups: MPI_COMM_SPLIT

C: int MPI_Comm_split(MPI_Comm comm, int color, int key, MPI_Comm *newcomm)
Fortran: MPI_COMM_SPLIT(comm, color, key, newcomm, ierr)
- INTEGER comm : communicator (handle) (IN)
- INTEGER color : processes with the same color are placed in the same group (IN)
- INTEGER key : within each group, new ranks are assigned in order of key (IN)
- INTEGER newcomm : new communicator (handle) (OUT)

- Partitions the processes of comm into groups and creates a new communicator newcomm for each group.
- color must be a non-negative integer or MPI_UNDEFINED.
- color = MPI_UNDEFINED → newcomm = MPI_COMM_NULL.

MPI_COMM_SPLIT example

[Figure: four processes of MPI_COMM_WORLD are split by icolor into two new communicators; within each, the ikey values determine the new rank order.]

MPI_COMM_SPLIT example: Fortran

PROGRAM comm_split
INCLUDE 'mpif.h'
CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
IF (myrank==0) THEN
   icolor = 1; ikey = 2
ELSEIF (myrank==1) THEN
   icolor = 1; ikey = 1
ELSEIF (myrank==2) THEN
   icolor = 2; ikey = 2
ELSEIF (myrank==3) THEN
   icolor = 2; ikey = 1
ENDIF
CALL MPI_COMM_SPLIT(MPI_COMM_WORLD, icolor, ikey, newcomm, ierr)
CALL MPI_COMM_SIZE(newcomm, newprocs, ierr)
CALL MPI_COMM_RANK(newcomm, newrank, ierr)
PRINT *, 'newcomm=', newcomm, ' newprocs=', newprocs, ' newrank=', newrank
CALL MPI_FINALIZE(ierr)
END

MPI_COMM_SPLIT example: C

/* comm_split */
#include <mpi.h>
#include <stdio.h>
void main (int argc, char *argv[]){
   int i, nprocs, myrank;
   int icolor, ikey;
   int newprocs, newrank;
   MPI_Comm newcomm;
   MPI_Init(&argc, &argv);
   MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
   MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
   if(myrank == 0){
      icolor = 1; ikey = 2;
   }
   else if (myrank == 1){
      icolor = 1; ikey = 1;
   }
   else if (myrank == 2){
      icolor = 2; ikey = 2;
   }
   else if (myrank == 3){
      icolor = 2; ikey = 1;
   }
   MPI_Comm_split(MPI_COMM_WORLD, icolor, ikey, &newcomm);
   MPI_Comm_size(newcomm, &newprocs);
   MPI_Comm_rank(newcomm, &newrank);
   printf("%d: newprocs = %d newrank = %d\n", myrank, newprocs, newrank);
   MPI_Finalize();
}

Virtual topologies (1/2)

- Constructing a new communicator that gives the processes names suited to the communication pattern.
- Makes code easier to write and enables optimized communication.
- Cartesian virtual topology:
  - each process is connected to its neighbors on a virtual grid
  - each process is identified by its Cartesian coordinates
  - periodic boundaries are possible

Virtual topologies (2/2)

[Figure: a two-dimensional Cartesian grid (dim 0 × dim 1) in which each process is labeled with its rank and its (row, column) coordinates.]

Using virtual topologies

- Create a topology, producing a new communicator:
  - MPI_CART_CREATE
- Use the mapping functions to translate between ranks and the topology's naming scheme:
  - MPI_CART_RANK
  - MPI_CART_COORDS
  - MPI_CART_SHIFT

Creating a topology: MPI_CART_CREATE

C: int MPI_Cart_create(MPI_Comm oldcomm, int ndims, int *dimsize, int *periods, int reorder, MPI_Comm *newcomm)
Fortran: MPI_CART_CREATE(oldcomm, ndims, dimsize, periods, reorder, newcomm, ierr)
- INTEGER oldcomm : existing communicator (IN)
- INTEGER ndims : number of Cartesian dimensions (IN)
- INTEGER dimsize(*) : length of each dimension; array of size ndims (IN)
- LOGICAL periods(*) : periodicity of each dimension; array of size ndims (IN)
- LOGICAL reorder : whether MPI may reorder the process ranks (IN)
- INTEGER newcomm : new communicator (OUT)

- Returns a communicator newcomm carrying the Cartesian topology.
- If reorder is false, each process keeps its rank from the old communicator, and only the correspondence between ranks and grid coordinates is established.

MPI_CART_CREATE example

[Figure: six processes arranged as a 3×2 Cartesian grid, each labeled with its rank and its (row, column) coordinates.]

MPI_CART_CREATE example: Fortran

PROGRAM cart_create
INCLUDE 'mpif.h'
INTEGER oldcomm, newcomm, ndims, ierr
INTEGER dimsize(2)
LOGICAL periods(2), reorder
CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
oldcomm = MPI_COMM_WORLD
ndims = 2
dimsize(1) = 3; dimsize(2) = 2
periods(1) = .TRUE.; periods(2) = .FALSE.
reorder = .FALSE.
CALL MPI_CART_CREATE(oldcomm, ndims, dimsize, periods, reorder, &
     newcomm, ierr)
CALL MPI_COMM_SIZE(newcomm, newprocs, ierr)
CALL MPI_COMM_RANK(newcomm, newrank, ierr)
PRINT *, myrank, ':newcomm=', newcomm, ' newprocs=', newprocs, &
     ' newrank=', newrank
CALL MPI_FINALIZE(ierr)
END