Christian Engelmann, Ph.D.
Senior Computer Scientist & Research Group Leader
Solutions
The INTERSECT Federated Architecture for the Laboratory of the Future
xSim: The Extreme-scale Simulator
Resilience Design Patterns
Characterization of Faults, Errors, and Failures in Extreme-Scale Systems
redMPI: A Redundant Message Passing Interface Implementation
Proactive Fault Tolerance Framework
Hybrid Full/Incremental System-level Checkpointing
Symmetric Active/Active High Availability for HPC System Services
Ongoing Projects
2024-…: A Resilient Federated Ecosystem for Self-Driving Laboratories
2024-…: Privacy-Preserving Federated Learning for Science: Building Sustainable and Trustworthy Foundation Models
Past Projects
2021-24: An Open Federated Architecture for the Laboratory of the Future
2018-19: rOpenMP: A Resilient Parallel Programming Model for Heterogeneous Systems
2015-21: Resilience Design Patterns: A Structured Approach to Resilience at Extreme Scale
2015-19: Catalog: Characterizing Faults, Errors, and Failures in Extreme-Scale Systems
2013-16: Hobbes: OS and Runtime Support for Application Composition
2013-16: MCREX – Monte Carlo Resilient Exascale Solvers
2012-14: Hardware/Software Resilience Co-Design Tools for Extreme-scale High-Performance Computing
2011-12: Extreme-scale Algorithms and Software Institute
2009-11: Soft-Error Resilience for Future-Generation High-Performance Computing Systems
2008-11: Reliability, Availability, and Serviceability (RAS) for Petascale High-End Computing and Beyond
2008-11: Scalable Algorithms for Petascale Systems with Multicore Architectures
2006-09: Harness Workbench: Unified and Adaptive Access to Diverse HPC Platforms
2006-08: Virtualized System Environments for Petascale Computing and Beyond
2004-07: MOLAR: Modular Linux and Adaptive Runtime Support for High-End Computing
2004-06: Reliability, Availability, and Serviceability (RAS) for Terascale Computing
2002-04: Super-Scalable Algorithms for Next-Generation High-Performance Cellular Architectures
2000-05: Harness: Heterogeneous Distributed Computing
Publications
Peer-Reviewed Journal Papers
Peer-Reviewed Conference Papers
Peer-Reviewed Workshop Papers
Peer-Reviewed Conference Posters
Whitepapers
Technical Reports
Datasets
Talks and Lectures
Co-Advised Theses
Theses
BibTeX Citations
Other Activities
Reading List
Reading List
Leadership / Management
Skills
Technology
Technology History