DMTCP: Distributed MultiThreaded CheckPointing

About DMTCP:

DMTCP (Distributed MultiThreaded Checkpointing) is a tool to transparently checkpoint the state of multiple simultaneous applications, including multi-threaded and distributed applications. It operates directly on the user binary executable, without any Linux kernel modules or other kernel modifications.

Among the applications supported by DMTCP are Open MPI, MATLAB, R, Python, Perl, and many other programming and shell scripting languages. DMTCP also supports GNU screen sessions, including vim/cscope and emacs. With the use of TightVNC, it can also checkpoint and restart X Window applications. The OpenGL library for 3D graphics is supported through a special plugin. See the QUICK-START file for further details.

DMTCP supports the commonly used OFED API for InfiniBand. See contrib/infiniband/README for more details.

Announcement!

We are currently looking for well-qualified applicants who are interested in joining a Ph.D. program to do research on checkpointing and reversible debugging. Interested applicants should write to Gene Cooperman (gene@ccs.neu.edu) at Northeastern University.
[2015-03-25]: DMTCP 2.4.0-rc2 released!
This is release candidate 2 for DMTCP 2.4.0. It is especially important to upgrade if you are using any of MATLAB, MPI, SLURM or Torque; or if you have a newer Linux kernel using '[vvar]' (test with: grep '\[vvar]' /proc/self/maps ); or if you use glibc-2.21 or later (test with: ls -l /lib*/libc.so.6 /lib/*/libc.so.6 ).
[2015-03-17]: DMTCP 2.4.0-rc1 released!
This is release candidate 1 for DMTCP 2.4.0. (See above for latest release candidate.)
[2014-07-14]: DMTCP 2.3.1 released!
This is primarily a bug fix release.
[2014-07-03]: DMTCP 2.3 released!
This is primarily a bug fix release. However, if you are using DMTCP for the ARM v7 CPU, or if you are using DMTCP either with the InfiniBand network or with the SLURM batch system, then it is strongly recommended to upgrade. Check release notes for more details.
[2014-03-20]: DMTCP 2.2.1 released!
This is a bug fix release. Users relying on the --enable-unique-checkpoint-filenames configure flag are strongly advised to upgrade to this release. Check release notes for more details.
[2014-03-14]: DMTCP 2.2 released!
In this release, the lowest layers have been re-organized and partially re-written for greater clarity of code and greater maintainability. Also, users relying on the use of DMTCP with MPI, InfiniBand, or the Torque or SLURM batch queues are strongly advised to upgrade. Check release notes for more details.
[2014-01-12]: DMTCP 2.1 released!
This release includes enhancements to the core feature set and some newly stable plugins. Check release notes for more details.
[2013-10-03]: DMTCP 2.0 released!
This version 2.0 release represents the future of DMTCP. DMTCP version 2.0 has been re-designed around the concept of plugins. The older DMTCP version 1.2.x branch will continue to be maintained for bug fixes. Check release notes for more details.
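
The 2.4.0-rc2 announcement above gives two manual tests for whether an upgrade is especially important: a '[vvar]' entry in /proc/self/maps, and glibc-2.21 or later. The following Python sketch automates both checks under stated assumptions. The function names (has_vvar, glibc_at_least, upgrade_reasons) are hypothetical and not part of DMTCP. The glibc version is read with os.confstr("CS_GNU_LIBC_VERSION") rather than the "ls -l" test from the announcement, which is a substitution and assumes a glibc-based Linux system.

```python
import os
import re


def has_vvar(maps_text):
    """Return True if any mapping in a /proc/<pid>/maps dump is named [vvar]."""
    return any(line.rstrip().endswith("[vvar]") for line in maps_text.splitlines())


def glibc_at_least(version_text, minimum=(2, 21)):
    """Compare a string such as 'glibc 2.21' against a (major, minor) tuple."""
    match = re.search(r"(\d+)\.(\d+)", version_text)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) >= minimum


def upgrade_reasons():
    """Collect the reasons that apply to the machine running this script."""
    reasons = []
    try:
        with open("/proc/self/maps") as maps_file:
            if has_vvar(maps_file.read()):
                reasons.append("kernel maps a [vvar] region")
    except OSError:
        pass  # not Linux, or /proc is unavailable
    try:
        # Assumption: glibc reports its version through confstr.
        version_text = os.confstr("CS_GNU_LIBC_VERSION")
    except (AttributeError, ValueError, OSError):
        version_text = None  # not a glibc system
    if version_text and glibc_at_least(version_text):
        reasons.append("%s is 2.21 or later" % version_text)
    return reasons


if __name__ == "__main__":
    for reason in upgrade_reasons():
        print("upgrade advised:", reason)
```

Running the script prints one line per matching condition and nothing if neither applies. It does not detect the use of MATLAB, MPI, SLURM, or Torque, which the announcement also lists as reasons to upgrade.
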
DMTCP is currently maintained by Kapil Arya, Gene Cooperman, Rohan Garg, Jiajun Cao, and Artem Polyakov. The list of active developers continues to evolve.
The DMTCP project is partially supported by Intel Corporation and by the National Science Foundation under grant OCI-0960978. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of Intel Corporation or of the National Science Foundation.
