DMTCP Publications

Citing DMTCP (please cite this publication):

  1. DMTCP Publications (enhancements to DMTCP distribution)
  2. Publications using DMTCP in their work

A. DMTCP Publications (enhancements to DMTCP distribution; reverse chronological order):

B. Publications using DMTCP in their work (not simply citing DMTCP) (in reverse chronological order):

  1. "Smart Scene Management for IoT-based Constrained Devices using Checkpointing",
    François Aïssaoui, Gene Cooperman, Thierry Monteil, and Saïd Tazi,
    15th IEEE Int. Symp. on Network Computing and Applications (NCA'16),
    Cambridge, Boston, MA, USA, October 31 - November 2, 2016, pp. 170--174, IEEE Press, Nov., 2016, Bibtex.

  2. "Scalable System-level Transparent Checkpointing for OpenSHMEM",
    Rohan Garg, Jérôme Vienne and Gene Cooperman,
    OpenSHMEM and Related Technologies. Enhancing OpenSHMEM for Hybrid Environments --- Third Workshop, OpenSHMEM 2016, Baltimore, MD, USA, Aug. 2--4, 2016, Revised Selected Papers (OpenSHMEM'16),
    pp. 52--65, Lecture Notes in Computer Science, Volume 10007, Springer-Verlag, Aug., 2016, Bibtex.

  3. Extended Batch Sessions and Three-Phase Debugging: Using DMTCP to Enhance the Batch Environment,
    Rohan Garg, Jiajun Cao, Kapil Arya, Gene Cooperman and Jérôme Vienne,
    Proc. of the (XSEDE16) Conference on Diversity, Big Data, and Science at Scale,
    pp. 42:1--42:8, ACM Press, July, 2016, (slides), Bibtex.

  4. Deduplication Potential of HPC Applications' Checkpoints.
    Jürgen Kaiser, Ramy Gad, Tim Süß, Federico Padua, Lars Nagel and André Brinkmann,
    Proc. of IEEE Int. Conf. on Cluster Computing (Cluster'16),
    pp. 413--422, Taipei, Taiwan, IEEE Press, Sept., 2016. Bibtex.

  5. Checkpointing with DMTCP and MVAPICH2 for Supercomputing,
    Kapil Arya,
    MVAPICH User's Group (MUG'16),
    Columbus, Ohio, Aug. 17, 2016; MUG'16 program, slides;

  6. An Affinity-structure Database of Helix-turn-helix: DNA Complexes with a Universal Coordinate System,
    Mohammed AlQuraishi, Shengdong Tang and Xide Xia,
    BMC Bioinformatics 16:390, 19 pages, 2015, BioMed Central,

  7. HOL(y)Hammer: Online ATP Service for HOL Light,
    Cezary Kaliszyk and Josef Urban,
    Mathematics in Computer Science 9(1), pp. 5--22, 2015, Springer,
    (first published online on Jun 28, 2014)

  8. Parallel Application Signature for Performance Analysis and Prediction (or alt),
    Alvaro Wong, Dolores Rexachs, Emilio Luque,
    IEEE Trans. on Parallel and Distributed Systems 26(7), pp. 2009--2019, 2015, IEEE Press,

  9. Elastic Job Bundling: An Adaptive Resource Request Strategy for Large-scale Parallel Applications (or alt),
    Feng Liu and Jon B. Weissman,
    Proc. of the Int. Conf. for High Performance Computing, Networking, Storage and Analysis (SC'15), 12 pages, Nov., 2015, ACM,

  10. Performance Improvement in Automata Learning: Speeding up LearnLib using Parallelization and Checkpointing,
    Marco Henrix,
    M.S. thesis, Radboud University Nijmegen, Netherlands, Aug., 2015,

  11. Performance Improvement in Automata Learning,
    Marco Henrix,
    Master Thesis, Radboud University, Nijmegen, Aug., 2015

  12. An Android Cluster System Capable of Dynamic Node Reconfiguration,
    Yuki Sawada, Yusuke Arai, Kanemitsu Ootsu, Takashi Yokota and Takeshi Ohkawa,
    Proc. of 2015 Seventh Int. Conf. on Ubiquitous and Future Networks (ICUFN), pp. 689--694, IEEE Press, July, 2015,

  13. Enabling Sender-initiated Distributed Applications and Checkpointing in Content Centric Networks,
    Nitinder Mohan,
    Master of Technology Thesis, IIIT Delhi (Indraprastha Institute of Information Technology), July, 2015

  14. Optimizing Checkpoint Restart with Data Deduplication,
    Zhengyu Chen, Jianhua Sun and Hao Chen,
    Scientific Programming, May, 2016, Hindawi Publishing Corporation

  15. Transparent Checkpointing for Supercomputing,
    Jiajun Cao and Rohan Garg
    MVAPICH User's Group (MUG'15),
    Columbus, Ohio, Aug. 20, 2015; MUG'15 program, slides, and video;

  16. Transparent Checkpoint-Restart: Re-Thinking the HPC Environment,
    Gene Cooperman,
    MVAPICH User's Group (MUG'15),
    Columbus, Ohio, Aug. 19, 2015; MUG'15 program, slides, and video;

  17. Recent Trends towards Green Clouds by using Fuzzy based Live Migration (or alt),
    Amrinder Kaur and Anil Kumar,
    International Journal of Computer Applications 113(3) (0975--8887), pp. 17--22, Mar., 2015,

  18. Power-Check: An Energy-Efficient Checkpointing Framework for HPC Clusters,
    R.R. Chandrasekar, A. Venkatesh, K. Hamidouche and D.K. Panda,
    Proc. of 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid'15),
    pp. 261--270, IEEE Press, 2015, Bibtex.

  19. Checkpointing as a Service in Heterogeneous Cloud Environments,
    Jiajun Cao, Matthieu Simonin, Gene Cooperman and Christine Morin,
    Proc. of 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid'15),
    pp. 61--70, IEEE Press, 2015, Bibtex.

  20. Energy Efficient Rescheduling Algorithm for High Performance Computing,
    Manisha Chauhan, Nazia Parveen, Sumit Kumar Saurav and G.L. Ganga Prasad,
    Nat. Conf. on Parallel Computing Technologies (PARCOMPTECH'15), IEEE Press, 2015,

  21. CCNCheck: Enabling Checkpointed Distributed Applications in Content Centric Networks,
    Nitinder Mohan and Pushpendra Singh,
    CCNxCon'15: Content Centric Networking (technical talk abstract), 2 pages,

  22. DMTCP: Bringing Interactive Checkpoint-Restart to Python,
    Kapil Arya and Gene Cooperman,
    Computational Science & Discovery, 16 pages, 2015, IOPScience,

  23. Using Checkpointing and Virtualization for Fault Injection,
    Cyrille Artho, Masami Hagiya, Watcharin Leungwattanakit, Eric Platon, Richard Potter, Kuniyasu Suzaki, Yoshinori Tanabe, Franz Weitl and Mitsuharu Yamamoto,
    Second Int. Symp. on Computing and Networking (CANDAR'14), pp. 144--150, 2014, IEEE Press,

  24. Be Kind, Rewind --- Checkpoint & Restore Capability for Improving Reliability of Large-scale Semiconductor Design,
    Igor Ljubuncic, Ravi Giri, Avikam Rozenfeld, and Andrew Goldis,
    2014 IEEE High Performance Extreme Computing Conference (HPEC-2014),
    6 pages, IEEE Press, Sept., 2014,

  25. Performance Evaluation of Checkpoint/Restart Techniques: For MPI Applications on Amazon Cloud,
    Basma Abdel Azeem and Manal Helal,
    Informatics and Systems, 9th Int. Conf. on (INFOS'14), pp. 49--57, Sep., 2014, IEEE Press,

  26. DMTCP: System-Level Checkpoint-Restart in User-Space,
    Kapil Arya and Gene Cooperman,
    MVAPICH User's Group (MUG'14),
    Columbus, Ohio, Aug. 26, 2014; MUG'14 program, slides, and video;

  27. Metodología para Predecir el Consumo Energético de Checkpoints en Sistemas de HPC,
    Javier Balladini, Marina Morán, Dolores Rexachs and Emilio Luque,
    XX Congreso Argentino de Ciencias de la Computación (CACIC'14),
    10 pages, Oct., 2014, Bibtex.

  28. Using SAGA and the Open Science Grid to Search for Aptamers,
    Kevin Shieh, Pilib Ó Broin, David Rhee, Matthew Levy, and Aaron Golden,
    Proc. of 2014 Ann. Conf. on Extreme Science and Engineering Discovery Environment (XSEDE'14),  Art. No. 27, Jul., 2014

  29. Simulation Speedup of ns-3 using Checkpoint and Restore,
    Kyle Harrigan and George Riley,
    Proceedings of the 2014 Workshop on ns-3 (WNS3'14), Art. No. 7, 2014

  30. User-Space Process Virtualization in the Context of Checkpoint-Restart and Virtual Machines,
    Kapil Arya, PhD thesis, Northeastern University, August, 2014,

  31. Use of Checkpoint-Restart for Complex HEP Software on Traditional Architectures and Intel MIC,
    Kapil Arya, Gene Cooperman, Andrea Dotti and Peter Elmer,
    J. Physics: Conference Series 523, Conference 1,
    (from Proc. of 15th Int. Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT2013)),
    IOPScience, 8 pages, 2014, Bibtex.

  32. GemFI: A Fault Injection Tool for Studying the Behavior of Applications on Unreliable Substrates,
    K. Parasyris, S. Tziantzoulis, C.D. Antonopoulos, and N. Bellas,
    44th Ann. IEEE/IFIP Int. Conf. on Dependable Systems and Networks (DSN), pp. 622--629, IEEE Press, Jun., 2014,

  33. Алгоритмы отказоустойчивого управления ресурсами пространственно-распределённых вычислительных систем
    (Algorithms for Failover Resource Management in Distributed Computing Systems),
    А.Ю. Поляков, О.В. Молдованова, А.А. Пазников, М.Г. Курносов, С.Н. Мамойленко, А.В. Ефимов (A. Yu. Polyakov et al.),
    Vestnik SibGUTIS 11(4), (УДК 004.382.2) pp. 11--29, 2014,

  34. Optimization Tools of Parallel Simulation of Nanostructures with Quantum Dots,
    K. V. Pavskii, M. G. Kurnosov, and A. Yu. Polyakov,
    Optoelectronics, Instrumentation and Data Processing 50(3), pp. 260--265,
    May, 2014, Springer Press,
    (Original Russian Text at: K.V. Pavskii, M.G. Kurnosov, A.Yu. Polyakov, 2014, published in Avtometriya, 2014, Vol. 50, No. 3, pp.  56--61.)

  35. Modular Software Model Checking for Distributed Systems,
    Leungwattanakit, W., Artho, C., Hagiya, M., Tanabe, Y., Yamamoto, M., and Takahashi, K.,
    IEEE Trans. on Software Engineering 40(5), pp. 483--501, May, 2014, IEEE Press,

  36. Designing Scalable and Efficient I/O Middleware for Fault-Resilient High-Performance Computing Clusters,
    Raghunath Raja Chandrasekar, PhD thesis, 2014, The Ohio State University,

  37. Improving the Efficiency of Fuzz Testing Using Checkpointing,
    Ernst-Friedrich Zachow,
    Master Thesis, ETH-Zürich, April 1, 2014,

  38. Towards an Energy-Efficient Tool for Processing the Big Data,
    Eric Renault and Selma Boumerdassi,
    2nd International Conference on Future Internet of Things and Cloud (FiCloud'14), pp. 448--452, Aug., 2014, IEEE Press,

  39. Abstraction Checkpointing Levels: Problems and Solutions, Bakhta Meroufel and Ghalem Belalem,
    International Journal of Computing 13(3), pp. 158--169, 2014,

  40. Explorations of the Viability of ARM and Xeon Phi for Physics Processing,
    David Abdurachmanov, Kapil Arya, Josh Bendavid, Tommaso Boccali, Gene Cooperman, Andrea Dotti, Peter Elmer, Giulio Eulisse, Francesco Giacomini, Christopher D. Jones, Matteo Manzali and Shahzad Muzaffar,
    J. Physics: Conference Series 513, Track 5,
    (from Proc. of 20th Int. Conf. on Computing in High Energy and Nuclear Physics (CHEP13)),
    IOPScience, 7 pages, 2014, Bibtex.

  41. Selection of Nucleotide Substitution Models on the Cloud,
    Jose Manuel Santorum, Diego Darriba, Guillermo L. Taboada, and David Posada,
    Bioinformatics 30(9),
    pp. 1310--1311, Oxford Journals, Jan. 21, 2014,

  42. DMTCP: Bringing Checkpoint-Restart to Python,
    Kapil Arya and Gene Cooperman, Proc. of the 12th Python in Science Conf. (SciPy 2013),
    6 pages, 2013, Bibtex.

  43. A Framework for an In-depth Comparison of Scale-up and Scale-out,
    Michael Sevilla, Ike Nassi, Kleoni Ioannidou, Scott Brandt, and Carlos Maltzahn,
    Proc. of 2013 Int. Workshop on Data-Intensive Scalable Computing Systems (DISCS'13), pp. 13--18, 2013

  44. A Tool for Selecting the Right Target Machine for Parallel Scientific Applications,
    Javier Panadero, Alvaro Wong, Dolores Rexachs, and Emilio Luque,
    Procedia Computer Science 18, pp. 1824--1833, Elsevier, 2013,

  45. Formal Mathematics on Display: A Wiki for Flyspeck,
    Carst Tankink, Cezary Kaliszyk, Josef Urban, and Herman Geuvers,
    Intelligent Computer Mathematics,
    Lecture Notes in Computer Science, vol. 7961, pp. 152--167, Springer, 2013,

  46. Towards Computing as a Utility via Adaptive Middleware: An Experiment in Cross-paradigm Execution,
    Jaroslaw Slawinski and Vaidy Sunderam,
    Parallel Processing Letters 23(2), 18 pages,
    World Scientific,  June, 2013,

  47. Calculation of the Subgroups of a Trivial-Fitting Group,
    Alexander J. Hulpke,
    Proc. of 38th International Symposium on Symbolic and Algebraic Computation, pp. 205--210, 2013, ACM Press,

  48. Semi-Automated Debugging via Binary Search through a Process Lifetime,
    Kapil Arya, Tyler Denniston, Ana-Maria Visan, and Gene Cooperman,
    Proc. of 7th Workshop on Programming Languages and Operating Systems (PLOS) (part of Proc. of 24th ACM Symp. on Operating System Principles (SOSP)), 2013,
    ACM Press, Oct., 2013, Bibtex.

  49. Shorten Device Boot Time for Automotive IVI and Navigation Systems (slides),
    Jim Huang and Shi-Wu Lo (developers, 0xlab),
    Automotive Linux Summit (ALS2013), May 28, 2013.
    (See "Part II: Userspace solution: Checkpointing"; begins at slide 66)

  50. SweeD: Likelihood-Based Detection of Selective Sweeps in Thousands of Genomes,
    P. Pavlidis, D. Živkovic, A. Stamatakis and N. Alachiotis,
    Heidelberg Institute for Theoretical Studies, Technical report Exelixis-RRDR-2013-1, February, 2013

  51. A Survey of Fault Tolerance Mechanisms and Checkpoint/Restart Implementations for High Performance Computing Systems,
    I.P. Egwutuoha, D. Levy, B. Selic and S. Chen,
    The Journal of Supercomputing, Feb., 2013, Springer

  52. Proposal of Incremental Software Simulation for Reduction of Evaluation Time,
    Atsushi Shina, Kanemitsu Ootsu, Takeshi Ohkawa, Takashi Yokota and Takanobu Baba,
    Third Int. Conf. on Networking and Computing (ICNC), pp. 311--315, IEEE Press, Dec., 2012, Bibtex.

  53. Implement Checkpointing for Android (to speed up boot time and development process) (slides),
    Jim Huang and Kito Cheng (developers, 0xlab),
    Embedded Linux Conference Europe (ELCE2012),
    Barcelona, Spain; Nov. 5--7, 2012.

  54. Towards Fault-tolerant Energy-efficient High Performance Computing in the Cloud,
    Kurt L. Keville, Rohan Garg, David J. Yates, Kapil Arya and Gene Cooperman,
    Proc. of 2012 IEEE Computer Society International Conference on Cluster Computing, pp. 622--626, 2012, Bibtex.

  55. Adapting MPI to MapReduce PaaS Clouds: An Experiment in Cross-Paradigm Execution,
    Jaroslaw Slawinski and Vaidy Sunderam,
    Proc. of 2012 IEEE/ACM Fifth Int. Conf. on Utility and Cloud Computing (UCC '12), pp. 199--203, 2012, Bibtex.

  56. Creating and Improving Multi-Threaded Geant4.
    Xin Dong, Gene Cooperman, John Apostolakis, Sverre Jarp, Andrzej Nowak, Makoto Asai and Daniel Brandt,
    Journal of Physics: Conference Series, Volume 396, Part 5, 2012

  57. Temporal Meta-Programming: Treating Time as a Spatial Dimension,
    Ana-Maria Visan, PhD thesis, Northeastern University, April, 2012, Bibtex.

  58. Verification of Embedded Control Systems by Simulation and Program Execution Control,
    Stefan Resmerita and Wolfgang Pree,
    American Control Conference (ACC), pp. 3581--3586, June, 2012, IEEE Press, Bibtex

  59. Checkpointing in Distributed Heterogeneous Environments,
    Michael Schöttner and John Mehnert-Spahn,
    Technical Report, Heinrich Heine University, Duesseldorf, Germany, 26 pages, March, 2012,
    (from Universität Düsseldorf: Publications),

  60. Source-Level Transformation of Legacy Sequential Program into Scalable Thread-Parallel Code,
    Xin Dong, PhD thesis, Northeastern University, Dec., 2011, Bibtex.

  61. Model Checking Distributed Systems by Combining Caching and Process Checkpointing,
    Watcharin Leungwattanakit, Cyrille Artho, Masami Hagiya, Yoshinori Tanabe, and Mitsuharu Yamamoto,
    26th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 103--112,
    IEEE Press, Dec., 2011. Bibtex.

  62. Including the Workload Effect in the Parallel Program Signature,
    J.M. Canillas, A. Wong, D. Rexachs, and E. Luque,
    Proc. of 13th Int. Conf. on High Performance Computing and Communications (HPCC), pp. 304--311,
    IEEE Computer Society, Sept., 2011. Bibtex.

  63. Predicting Parallel Applications Performance Using Signatures: the Workload Effect,
    J.M. Canillas, A. Wong, D. Rexachs, and E. Luque,
    9th IEEE/ACS International Conference on Computer Systems and Applications (AICCSA), pp. 299--300,
    IEEE Computer Society, Dec., 2011. Bibtex.

  64. URDB: A Universal Reversible Debugger Based on Decomposing Debugging Histories,
    Ana-Maria Visan, Kapil Arya, Gene Cooperman, and Tyler Denniston,
    Proc. of 6th Workshop on Programming Languages and Operating Systems (PLOS) (part of Proc. of 23rd ACM Symp. on Operating System Principles (SOSP)), 2011,
    ACM Press, Oct., 2011. Bibtex.

  65. Direct Inference of Protein--DNA Interactions using Compressed Sensing Methods,
    Mohammed AlQuraishi and Harley H. McAdams,
    Proc. of National Academy of Sciences (PNAS) 108(36), pp. 14819--14824,
    Sept. 6, 2011. Full Text (html), Full Text (pdf), Bibtex.

  66. CheCL: Transparent Checkpointing and Process Migration of OpenCL Applications,
    Hiroyuki Takizawa, Kentaro Koyama, Katsuto Sato, Kazuhiko Komatsu and Hiroaki Kobayashi,
    Proc. of 2011 IEEE International Parallel and Distributed Processing Symposium, pp. 864--876,
    IEEE Computer Society, May, 2011. Bibtex.

  67. Distributed Speculative Parallelization using Checkpoint Restart,
    Devarshi Ghoshal, Sreesudhan R. Ramkumar, and Arun Chauhan,
    Procedia Computer Science 4, pp. 422--431,
    May, 2011, Slides, Bibtex.

  68. Unibus: Aspects of Heterogeneity and Fault Tolerance in Cloud Computing,
    M. Slawiñska, J. Slawinski, and V. Sunderam,
    Proc. of IEEE Int. Symp. on Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW), pp. 1--10,
    Apr., 2010, Bibtex.
