

HPC Documentation and Conway's Law: Everything, Everywhere All at Once
Wednesday, June 24, 2026 3:45 PM to 5:15 PM · 1 hr. 30 min. (Europe/Berlin)
Foyer D-G - 2nd Floor
Project Poster
Development of HPC Skills · Education and Training · Resource Management and Scheduling · System and Performance Monitoring
Information
Poster is on display.
This project poster presents early findings from an NSF-funded project to study and improve user documentation processes for the exascale machine Aurora at the Argonne Leadership Computing Facility (ALCF) near Chicago, Illinois. Through interviews with ALCF staff and users in 2025, we documented the communication tools and channels used to author and maintain user documentation, and the organizational structures, such as committees, that support this work. We adopted Conway’s Law from software engineering (Conway, 1968; Bailey et al., 2013) as a framework for interpreting our findings and as a resource for understanding documentation processes at ALCF and at other HPC facilities. Conway’s Law proposes a relationship between organizational structure and system design. Our map, which layers committee and organizational structures on top of communication channels and tools, shows that effective user documentation is a whole-facility endeavor. In addition, two types of relationships are essential to authoring and maintaining documentation for complex computing systems: vertical and horizontal. Vertical relationships, which align with the organizational structure, are essential for launching the documentation for new systems and for providing stability in its long-term management. For example, the Operations Lead in the Division Director’s office coordinated the pre-production initiation of Aurora’s user documentation. During the production phase, horizontal, or cross-cutting, relationships are essential for enabling subject matter experts (SMEs) from across the facility to update topics quickly in response to technical system updates or user feedback. For example, the cross-cutting UX Committee, which comprises leads from across ALCF divisions (Technology, Operations, Science, Division Director), oversees the ongoing development of the documentation.
In addition, different types of communication tools enable and maintain vertical and horizontal relationships. Pre-production documentation for new systems requires centralized coordination via project management tools, such as MS Excel and Confluence. Production-phase maintenance and development of user documentation require open, networked tools, such as Slack and GitHub, so that SMEs from across the facility’s groups can quickly locate or be assigned to topics that need updating. As Conway’s Law predicts, authoring and maintaining user documentation for a new HPC machine relies on relationships formalized by the organizational structure, and the high-level architecture of the documentation reflects that structure. Authoring and maintaining user documentation, however, also relies on networked communication tools that can expedite ad hoc relationships across the organizational structure to ensure that the documentation remains up to date. Finally, ALCF staff report that effective networked, cross-cutting relationships are supported by the facility’s culture of trust and collaboration. Next steps in 2026 include sharing the map with ALCF stakeholders and other HPC facilities, continuing to gather feedback, and reflecting on how the map informs existing processes and communication structures.
Format
on-demand · on-site
