Enhancing Parallel Computing in Python with MPI: A Deep Dive
Chapter 1: Introduction to MPI in Python
In this article, we continue our exploration of parallel computing with Python and MPI. Previously, we laid the groundwork by installing MPI and mpi4py and running a basic parallel Python script. Now we look more closely at the crucial mechanism of message passing between processes. We will focus on point-to-point communication, illustrating how to send and receive both generic Python objects and numpy arrays using the lower-case and upper-case send and receive functions. By the end of this article, you will have a solid understanding of how to use MPI for point-to-point communication, including the efficient transmission of large numpy arrays between processes.
Section 1.1: Sending and Receiving Python Objects
Our objective is to run four processes, identified as ranks 0, 1, 2, and 3. As a reminder from Part 1, each MPI process is identified by a number called its rank. Rank 1 will send a message (ping) to rank 3, which will acknowledge receipt of the message. Rank 3 will then respond to rank 1 with a thank-you message (pong). Meanwhile, ranks 0 and 2 remain idle.
The functions needed for this operation are MPI.COMM_WORLD.send and MPI.COMM_WORLD.recv for sending and receiving messages, respectively. There are also upper-case variants, MPI.COMM_WORLD.Send and MPI.COMM_WORLD.Recv, which are particularly important for numerical computations. The distinction lies in how messages are handled: the lower-case functions serialize arbitrary Python objects into binary format using pickle, while the upper-case functions accept any object that implements the Python buffer protocol, such as a numpy array, and transmit its memory directly, which is faster. If an object does not implement the buffer protocol, the lower-case versions must be used.
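A quick way to test whether an object implements the buffer protocol, and hence qualifies for the upper-case functions, is to try wrapping it in a memoryview. The helper below (`supports_buffer_protocol` is an illustrative name, not part of mpi4py) sketches this check:

```python
import numpy as np

def supports_buffer_protocol(obj):
    """Return True if obj exposes the Python buffer protocol."""
    try:
        memoryview(obj)
        return True
    except TypeError:
        return False

# numpy arrays expose their memory as a buffer; plain lists do not.
print(supports_buffer_protocol(np.zeros(3)))            # True
print(supports_buffer_protocol(["apples", "oranges"]))  # False
```

This is why the list of strings in the example below must travel via the lower-case, pickle-based send, whereas the numpy arrays later in the article can use the faster Send.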
Let's first examine the code for this communication (file ping.py):
from mpi4py import MPI

comm_world = MPI.COMM_WORLD
rank = comm_world.Get_rank()

if rank == 1:
    message = ["apples", "oranges"]
    print(f"Process {rank} is sending message: {message}")
    comm_world.send(message, dest=3)
    # Wait for a response from rank 3:
    answer = comm_world.recv(source=3)
    print(f"Process {rank} received the following answer: {answer}")
elif rank == 3:
    # Wait for a message from rank 1:
    received_message = comm_world.recv(source=1)
    print(f"Process {rank} received the following message: {received_message}")
    # Acknowledge receipt to process 1:
    comm_world.send("Thank you for your message!", dest=1)
else:
    print(f"Process {rank} is doing nothing (except printing this text).")
To execute this program in parallel across four processes, you can use the command:
mpiexec -n 4 python ping.py
The output will resemble the following:
Process 2 is doing nothing (except printing this text).
Process 0 is doing nothing (except printing this text).
Process 1 is sending message: ['apples', 'oranges']
Process 3 received the following message: ['apples', 'oranges']
Process 1 received the following answer: Thank you for your message!
By invoking mpiexec, the same Python script runs as four processes (which may share physical cores, depending on your hardware), each assigned a distinct rank. Here, rank 1 transmits a message to destination rank 3. After rank 3 receives the message, it acknowledges receipt by sending a thank-you note back to rank 1, which blocks until the acknowledgment arrives.
Section 1.2: Advanced Message Handling
Beyond basic send and receive functions, an optional tag argument allows for categorizing messages, enabling the distinction between various types of communications. For instance, you could utilize:
comm_world.send("Thank you for your message!", dest=1, tag=42)
This tagging feature can also be applied to the recv function.
When dealing with numpy arrays, while the lower case functions are user-friendly, they are not the most efficient for high-performance numerical tasks. For transmitting large arrays, it is advisable to use the faster Send and Recv functions, which are suitable for objects adhering to the Python buffer protocol, such as numpy arrays.
The usage of Recv differs slightly from recv: the upper-case version does not return the message. Instead, you pass a preallocated array of the appropriate size and dtype, which the function fills in place.
As a practical example, we will implement a scenario with two processes where process 0 defines a function on an array and sends it to process 1, which then computes the derivative of the received array:
import numpy as np
from findiff import FinDiff
from mpi4py import MPI

comm_world = MPI.COMM_WORLD
rank = comm_world.Get_rank()
size = comm_world.Get_size()

x = np.linspace(0, 1, 100, dtype=np.float64)

if rank == 0:
    f = np.sin(x)
    comm_world.Send(f, dest=1)
    print(f"Rank {rank}: {f[0]}")
else:
    d_dx = FinDiff(0, x[1] - x[0], 1, acc=4)
    f = np.empty_like(x, dtype=np.float64)
    comm_world.Recv(f, source=0)
    diff_f = d_dx(f)
    print(f"Rank {rank}: {diff_f[0]}")
Running this program with:
mpiexec -n 2 python example.py
will output:
Rank 0: 0.0
Rank 1: 0.9999999979182866
This shows rank 0 printing sin(0) = 0, while rank 1 prints the numerical derivative of sin at x = 0, which approximates cos(0) = 1.
In summary, this discussion covers the essentials of point-to-point communication using MPI with mpi4py, setting the stage for our next topic: message broadcasting.
Chapter 2: Further Exploration of MPI Techniques
The first video titled "Parallel programming with MPI (Part III): mpi4py" offers an in-depth exploration of parallel programming techniques using MPI in Python. It focuses on practical implementations and advanced concepts within the mpi4py library.
The second video, "Introduction to parallel programming with MPI and Python," serves as a foundational guide for those new to parallel programming, providing essential knowledge and skills necessary for effective implementation.