Parallel DeBugger (PDB) For Debugging MPI Programs

Performance & Correctness Tools



Parallel applications involve a large number of processes that may take different paths through the program, so understanding the precise behaviour of a parallel program is difficult. Because MPI processes communicate and share data during execution, a bug can originate in one process and propagate to others, which makes it hard to determine the bug's exact location in the source code. With traditional debuggers, programmers must manage every process by hand. We present Parallel DeBugger (PDB), a debugger that automatically tracks program activity to find potential bugs while retaining all the features of a traditional debugger. PDB spawns an instance of GDB for every process, so the debugging environment and commands are the same as GDB's, with new commands added for parallel execution. PDB follows a client-server model to provide a single interface to all processes. The client is completely decoupled from the server, which makes it possible to define multiple server front-ends without changing the client. PDB lets the user debug only a subset of processes using simple commands; for example, bugs in point-to-point communication can be detected by placing breakpoints only in the processes involved. Tracking of collective calls and tracking of buffers involved in nonblocking transfers are examples of the automated tracking that PDB performs. PDB also integrates with other tools, such as AddressSanitizer, to identify memory-related bugs. We use four real-world MPI bugs, related to collectives, nonblocking calls, and divergent behaviour of different processes, to illustrate the effectiveness of PDB. Our examples show PDB's ability to debug small-to-medium scale MPI programs.
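
To make the idea of collective-call tracking concrete, the following is a minimal sketch of one check such tracking could perform: comparing the sequence of collective calls recorded on each rank and reporting the first position where the ranks disagree. It is an illustration only and not PDB's implementation. The trace format, the `Call` type, and the function name `first_collective_mismatch` are assumptions made for this example, and the sketch assumes complete per-rank traces are available after the run has stopped.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical record of one collective call: (operation name, root rank or None).
Call = Tuple[str, Optional[int]]


def first_collective_mismatch(
    traces: Dict[int, List[Call]],
) -> Optional[Tuple[int, Dict[int, Optional[Call]]]]:
    """Return (index, per-rank call) for the first position where ranks disagree.

    A rank whose trace is shorter than the others is reported with None at
    that position. Returns None when all ranks recorded identical sequences.
    """
    if not traces:
        return None
    longest = max(len(calls) for calls in traces.values())
    for i in range(longest):
        # Collect what every rank called at position i (None if it made no call).
        at_i = {
            rank: (calls[i] if i < len(calls) else None)
            for rank, calls in traces.items()
        }
        # More than one distinct value means the ranks diverged here.
        if len(set(at_i.values())) > 1:
            return i, at_i
    return None


if __name__ == "__main__":
    # Assumed example data: rank 1 calls MPI_Reduce where the others call MPI_Barrier.
    example = {
        0: [("MPI_Bcast", 0), ("MPI_Barrier", None)],
        1: [("MPI_Bcast", 0), ("MPI_Reduce", 0)],
        2: [("MPI_Bcast", 0), ("MPI_Barrier", None)],
    }
    print(first_collective_mismatch(example))
```

A check of this kind points at the first divergent collective rather than at the place where the program eventually hangs, which is the sort of information that is hard to obtain by stepping through each process separately.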