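Subject: An MPI parallel program fails when run on certain odd numbers of cores. What is the cause?
My computer's CPU has 8 cores. When I set the number of cores in wmpiexec to 1, 2, 3, 4, 5, or 6, the computation runs. With 7 cores it reports the error below. With 8 cores it runs. With 9 or 11 cores it reports an error similar to the 7-core case, and with 10 or 12 cores it runs.
Is this caused by the program itself, or by something else?
The error output in the wmpiexec window is as follows: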
computation start:
Fatal error in PMPI_Barrier: Message truncated, error stack:
PMPI_Barrier(425)...................: MPI_Barrier(MPI_COMM_WORLD) failed
MPIR_Barrier_impl(306)..............:
MPIR_Bcast_impl(1273)...............:
MPIR_Bcast_intra(1107)..............:
MPIR_Bcast_binomial(143)............:
MPIC_Recv(101)......................:
MPIDI_CH3U_Request_unpack_uebuf(599): Message truncated; 2560 bytes received but buffer size is 1
Fatal error in PMPI_Barrier: Message truncated, error stack:
PMPI_Barrier(425)...................: MPI_Barrier(MPI_COMM_WORLD) failed
MPIR_Barrier_impl(306)..............:
MPIR_Bcast_impl(1273)...............:
MPIR_Bcast_intra(1107)..............:
MPIR_Bcast_binomial(143)............:
MPIC_Recv(101)......................:
MPIDI_CH3U_Request_unpack_uebuf(599): Message truncated; 2560 bytes received but buffer size is 1
job aborted:
rank: node: exit code[: error message]
0: my: 123
1: my: 1: process 1 exited without calling finalize
2: my: 123
3: my: 123
4: my: 1: process 4 exited without calling finalize
5: my: 123
6: my: 123
7: my: 123
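One possible reading of the log, offered as a guess rather than a diagnosis: the receive inside MPI_Barrier expected 1 byte and got 2560, which is what tends to happen when the ranks do not all make the same sequence of collective calls (for example, some ranks enter a broadcast while others are already at the barrier). A branch that depends on whether the work divides evenly among the processes could produce exactly this kind of count-dependent failure. The sketch below models that idea in plain Python, without MPI. Everything in it is hypothetical: the function names, the "only ranks with a leftover item broadcast" bug, and the problem size of 120, which was picked only because every count from 1 to 12 except 7, 9, and 11 divides it. The real program may differ.

```python
def collective_sequence(rank, nprocs, n_items):
    """Return the (hypothetical) list of collective calls one rank would make."""
    calls = ["Barrier"]
    _, extra = divmod(n_items, nprocs)
    # Hypothetical bug: only ranks that own a leftover item take part in a
    # broadcast, so the ranks disagree whenever n_items % nprocs != 0.
    if rank < extra:
        calls.append("Bcast")
    calls.append("Barrier")
    return calls


def mismatched_counts(n_items, max_procs):
    """List the process counts for which the ranks' call sequences differ."""
    bad = []
    for nprocs in range(1, max_procs + 1):
        sequences = {
            tuple(collective_sequence(rank, nprocs, n_items))
            for rank in range(nprocs)
        }
        if len(sequences) > 1:
            bad.append(nprocs)
    return bad


if __name__ == "__main__":
    # 120 is an assumed problem size, not a value taken from the post.
    print(mismatched_counts(120, 12))
```

If the real program has a branch of this kind around a collective call, checking that every rank reaches the same collectives in the same order, whatever the remainder, would be the first thing to try.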