DMTCP: Distributed MultiThreaded CheckPointing

About DMTCP:

DMTCP (Distributed MultiThreaded Checkpointing) is a tool to transparently checkpoint the state of multiple simultaneous applications, including multi-threaded and distributed applications. It operates directly on the user binary executable, without any Linux kernel modules or other kernel modifications.

Among the applications supported by DMTCP are Open MPI, MATLAB, Python, Perl, and many other programming languages and shell scripting languages. DMTCP also supports GNU screen sessions, including vim/cscope and emacs. With the use of TightVNC, it can also checkpoint and restart X Window applications, as long as they do not use extensions (e.g., no OpenGL, no video). See the QUICK-START file for further details.

DMTCP supports the OFED API for InfiniBand on an experimental basis. See contrib/infiniband/README for more details.


Announcement!

We are currently looking for well-qualified applicants who are interested in joining a Ph.D. program to do research on checkpointing and reversible debugging. Interested applicants should write to Gene Cooperman (gene@ccs.neu.edu) at Northeastern University.
[2014-07-14]: DMTCP 2.3.1 released!
This is primarily a bug fix release.
[2014-07-03]: DMTCP 2.3 released!
This is primarily a bug fix release. However, if you are using DMTCP for the ARM v7 CPU, or if you are using DMTCP either with the InfiniBand network or with the SLURM batch system, then it is strongly recommended to upgrade. Check release notes for more details.
[2014-03-20]: DMTCP 2.2.1 released!
This is a bug fix release. Users relying on the --enable-unique-checkpoint-filenames configure flag are strongly advised to upgrade to this release. Check release notes for more details.
[2014-03-14]: DMTCP 2.2 released!
In this release, the lowest layers have been re-organized and partially re-written for greater clarity of code and greater maintainability. Also, users relying on the use of DMTCP with MPI, InfiniBand, or the Torque or SLURM batch queues are strongly advised to upgrade. Check release notes for more details.
[2014-01-12]: DMTCP 2.1 released!
This release includes enhancements to the core feature set and some newly stable plugins. Check release notes for more details.
[2013-10-03]: DMTCP 2.0 released!
This version 2.0 release represents the future of DMTCP. DMTCP version 2.0 has been re-designed around the concept of plugins. The older DMTCP version 1.2.x branch will continue to be maintained for bug fixes. Check release notes for more details.

DMTCP is currently maintained by Kapil Arya, Gene Cooperman, Rohan Garg, Jiajun Cao, and Artem Polyakov. The list of active developers continues to evolve.

The DMTCP project is partially supported by Intel Corporation and by the National Science Foundation under grant OCI-0960978. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of Intel Corporation or of the National Science Foundation.
