DMTCP (Distributed MultiThreaded Checkpointing) transparently checkpoints a single-host or distributed computation in user space, with no modifications to user code or to the operating system. It works on most Linux applications, including Python, MATLAB, R, GUI desktops, and MPI. It is robust and widely used (on SourceForge since 2007).
Among the applications supported by DMTCP are MPI (various implementations), OpenMP, MATLAB, Python, Perl, R, and many other programming and shell scripting languages. With the use of TightVNC, it can also checkpoint and restart X-Window applications. The OpenGL library for 3D graphics is supported through a special plugin. DMTCP also has strong support for HPC (High Performance Computing) environments, including MPI, SLURM, InfiniBand, and other components. See QUICK-START.md for further details.
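As a rough illustration of what "no modifications to user code" means in practice, the sketch below builds the three command lines of a typical launch, checkpoint, and restart cycle. It is a minimal sketch, not a substitute for QUICK-START.md: the helper names (`launch_cmd`, `checkpoint_cmd`, `restart_cmd`), the program `job.py`, and the image name `ckpt_job.dmtcp` are hypothetical, and the code only prints the commands rather than running them. It assumes the standard DMTCP tools `dmtcp_launch`, `dmtcp_command`, and `dmtcp_restart` are on the PATH when the commands are actually used.

```python
import shlex


def launch_cmd(app_argv, interval_sec=None):
    """Build the argv that starts an unmodified program under DMTCP."""
    cmd = ["dmtcp_launch"]
    if interval_sec is not None:
        # --interval asks for a periodic checkpoint every N seconds.
        cmd += ["--interval", str(interval_sec)]
    return cmd + list(app_argv)


def checkpoint_cmd():
    """Build the argv that requests a checkpoint of the running computation."""
    return ["dmtcp_command", "--checkpoint"]


def restart_cmd(image_paths):
    """Build the argv that restarts from one or more checkpoint images."""
    return ["dmtcp_restart"] + list(image_paths)


if __name__ == "__main__":
    # Hypothetical program and image name, used only for illustration.
    steps = [
        launch_cmd(["python3", "job.py"], interval_sec=3600),
        checkpoint_cmd(),
        restart_cmd(["ckpt_job.dmtcp"]),
    ]
    for argv in steps:
        # Print each command as a shell-quoted line instead of executing it.
        print(shlex.join(argv))
```

The application command itself is passed through unchanged, which is the point of the transparent approach: only the launcher in front of it differs. Actual checkpoint image names are generated by DMTCP and will not match the placeholder used here.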
DMTCP supports the commonly used OFED API for InfiniBand, as well as its integration with various implementations of MPI and with resource managers (e.g., SLURM). See contrib/infiniband/README for more details.
We are currently looking for well-qualified applicants interested in joining a Ph.D. program to do research on checkpointing and reversible debugging. Interested applicants should write to Gene Cooperman (firstname.lastname@example.org) at Northeastern University.
DMTCP is currently maintained by Kapil Arya, Gene Cooperman, Rohan Garg, Jiajun Cao, and Artem Polyakov. The list of active developers continues to evolve.
The DMTCP project is partially supported by grants from Intel Corporation, and from the National Science Foundation under grant ACI-1440788. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of Intel Corporation or of the National Science Foundation.