ChapelCon ’25 welcomes anyone with computing challenges that demand performance, particularly through parallelism and scalability. It brings together Chapel users, enthusiasts, researchers, and developers to exchange ideas, present their work, and forge new collaborations. Anyone interested in parallel programming, programming languages, or high-performance computing is encouraged to attend. A wide range of sessions supports all levels of experience: Tutorial and Free Coding sessions for those looking to hone their skills, Office Hour sessions for those seeking help from Chapel developers, and Conference sessions for those looking to share and discuss their work. ChapelCon ’25 is free to attend and will be held virtually.

Program

October 7: Tutorials, Day 1

Time (PDT)
9:00 - 9:10
Brandon Neth
Opening session to welcome participants and share the schedule for the day.
9:10 - 9:40
Daniel Fedorin
This session introduces the Chapel programming language, its key features, and its place within the HPC ecosystem. The focus is on how Chapel's design philosophy makes it unique amongst programming languages. No prior experience necessary.
9:40 - 10:20
Lydia Duncan
This session demos key I/O features in Chapel and offers a short coding exercise in which participants use the demonstrated features.
10:20 - 11:00
Shreyas Khandekar
This session demos Chapel's key parallel looping constructs and offers a short coding exercise in which participants use the demonstrated features.
11:00 - 11:30
11:30 - 12:10
Brandon Neth
This session demos distributions, a key abstraction in Chapel for reasoning about data locality and movement. It also offers a short coding exercise that uses these features.
12:10 - 12:50
Jade Abraham
This session demos Chapel's aggregate data structures (classes and records) and Chapel's approach to memory management. It also offers a short coding exercise that uses these features.
12:50 - 14:00
The free-code session is unstructured time for participants to work on their own Chapel projects in the company of other Chapel users and developers. Participants are encouraged to apply lessons from the day's demo sessions and to discuss and ask questions about their projects.

October 8: Tutorials, Day 2

Time (PDT)
9:00 - 9:10
Brandon Neth
Opening session to welcome participants and share the schedule for the day.
9:10 - 9:40
Daniel Fedorin
This session introduces Chapel language features that play a role in the rest of the day's tutorial topics. Key topics include generic types and compile- and run-time expressions.
9:40 - 10:20
Engin Kayraklıoğlu
This session covers common causes of performance issues in Chapel applications and how to diagnose and resolve them. It includes a short coding exercise for participants.
10:20 - 11:00
Ben Harshbarger
This session demos Chapel's serializer and deserializer support for reading and writing custom data structures. It also includes a short coding exercise that uses these features.
11:00 - 13:20
The free-code session is unstructured time for participants to work on their own Chapel projects in the company of other Chapel users and developers. Participants are encouraged to apply lessons from the day's demo sessions and to discuss and ask questions about their projects.
13:20 - 14:00
Daniel Fedorin
This session demos Chapel's parallel iterators, which can be used to define custom iteration schemes for user-defined data structures. It includes a short exercise for participants to practice using these features.

October 9: Conference, Day 1

Time (PDT)
9:00 - 9:05
Brandon Neth
Opening session to welcome participants and share the schedule for the day.
9:05 - 9:25
Iain Moncrief
Modern hardware is parallel in a multitude of ways, including multi-core, multi-GPU, and multi-node parallelism. The Chapel language provides a unified toolbox for exploiting these varying kinds of parallelism and thus effectively leveraging computing hardware. One area that can benefit from parallelization is machine learning (ML), which has grown in popularity in recent years. To explore Chapel’s suitability for ML, the Chapel team developed ChAI, the first machine learning framework written in pure Chapel, over the course of a summer internship project. ChAI is capable of training and inference, and can load pretrained models from PyTorch and apply them to workloads distributed over any number of nodes, CPU cores, and GPUs. The framework has shown promising scaling results; using ChAI to load the MNIST model and classify 10,000 images, we measured a 105x speedup when scaling from a single node to 128 nodes on a Cray XC machine. We plan to integrate ChAI into the Chapel-powered Arkouda data science framework, enabling interactive, ML-enabled data science over massive datasets.
9:25 - 9:45
Thitrin Sastarasadhit and Kenjiro Taura
Transformer models drive current AI but require substantial computational resources to train. Chapel, a programming language designed for high-performance computing, offers an opportunity to explore efficient implementations of such models. In this talk, I present an implementation of a transformer model from scratch in Chapel and compare its performance with an equivalent from-scratch C++ implementation. PyTorch, a widely used framework for deep learning, is included as a reference. This talk highlights the strengths and limitations of Chapel for implementing modern AI models and its potential as a programming language for high-performance research.
9:45 - 10:00
10:00 - 10:10
Mohammad Dindoost, Bartosz Bryg, Ioannis Koutis, David Bader and Oliver Alvarado Rodriguez
We present HiPerMotif, a hybrid parallel algorithm for subgraph isomorphism addressing scalability limitations in large-scale property graphs. Traditional vertex-by-vertex algorithms struggle with extensive early-stage exploration and limited parallelization. HiPerMotif shifts search initialization through: (1) structural reordering prioritizing high-degree vertices, (2) systematic first-edge mapping identification, (3) efficient validation, and (4) state injection at depth 2. Implemented in Chapel within the Arachne framework, HiPerMotif achieves up to 66× speedup over state-of-the-art baselines and processes massive datasets like the H01 connectome (150M edges) that existing methods cannot handle.
10:10 - 10:30
Rinor Ramli
This presentation begins with a brief introduction to probabilistic inference for solving hypergraph algorithms, along with its strengths and implementation challenges. The PGAS model natively provided by Chapel is then presented as a way to accelerate probabilistic inference.
10:30 - 10:45
10:45 - 11:45
Chris Rackauckas
Scientific machine learning, denoted SciML, is the integration of machine learning into scientific computing. While it has become an academic discipline in its own right, one of the key drivers of the adoption of SciML has been the ongoing creation of readily available software. In this talk I will introduce the key tenets of SciML with a focus on the implications for HPC software development. Showcases of methods such as universal differential equations, and their successes in generating more accurate physical models from data, will be intertwined with stories about the software architecture that has enabled the sustainable development of the open-source SciML software ecosystem. The audience should leave with a deep understanding of how the trade-off between research software and reusable open-source development can be managed in a way that benefits both the research community and the broader scientific computing community.
11:45 - 12:00
12:00 - 12:30
Amanda Potts and Engin Kayraklıoğlu
Arkouda (https://arkouda-www.github.io/) is an open-source, NumPy-like framework for distributed exploratory data analysis (EDA), built on a Chapel backend, with growing support for pandas-like data structures and operations. Over the past year, the project has matured substantially through major architectural improvements, expanded data type support, and closer alignment with the evolving NumPy 2.0 ecosystem. These updates enhance expressiveness, performance, and reliability for large-scale, interactive data science workflows. We will begin with a general introduction to Arkouda and highlight recent use cases and success stories. Given the number of Arkouda-related talks at past ChapelCon events, the remainder of the session will focus on key improvements made since the last ChapelCon.
12:30 - 13:00
Anthony Chrun, Baptiste Arnould, Karim Zayni, Guillaume Auger, Maxime Blanchet, Eric Laurendeau and Justin Rigal
CHAMPS (CHapel MultiPhysics Software) is a multiphysics computational framework built around an aerodynamic flow solver based on the Euler and Reynolds-Averaged Navier–Stokes (RANS) equations, currently under development at Polytechnique Montreal. Since its early development, multiple research efforts have contributed to enhancing its capabilities by incorporating a range of turbulence and transition models. Additional physics modules include solvers for droplet trajectory prediction, ice accretion, condensation trail formation, structural deformation, and fluid–structure interaction. This paper presents some of the most recent and impactful advancements achieved within CHAMPS, in order to share them with the Chapel community.
13:00 - 13:15
13:15 - 13:35
Luca Ferranti
Automatic differentiation is the secret sauce allowing neural networks, and machine learning applications in general, to be more than just maths, but actually work! This talk will start with a brief general overview of automatic differentiation, then dive into two concrete workstreams. ForwardModeAD is a Chapel library for forward-mode automatic differentiation, built using operator overloading. In the first half of the talk, I’ll share the story behind its development: the design choices, the Chapel language features that made it possible, and the trade-offs along the way. I will also cover performance bottlenecks, current limitations, and where the library is headed next. Enzyme is a library for automatic differentiation at the LLVM level. It has already been integrated into languages like Julia and Rust, often achieving higher performance than native-language frameworks. In the second half of the talk, I’ll present my work on integrating Enzyme with Chapel, showing the current status, limitations, challenges encountered, and next steps.
13:35 - 13:55
Ivan Tagliaferro de Oliveira Tezoto, Guillaume Helbecque, Ezhilmathi Krishnasamy, Nouredine Melab and Gregoire Danoy
Modern high-performance computing systems increasingly rely on heterogeneous architectures combining CPUs and GPUs from multiple vendors, such as Nvidia and AMD. Ensuring both performance and portability in this context remains a key challenge. This work investigates two distinct programming approaches for parallel tree-based exact optimization, focusing on the Branch-and-Bound algorithm. The first is a low-level, performance-oriented implementation in C, combining OpenMP with CUDA and HIP for multi-GPU acceleration within a single compute node. The second leverages the PGAS-based Chapel language, which offers a unified and portable high-level framework for threaded and GPU programming. We revisit the design of a portable multi-GPU Chapel implementation and propose an optimized low-level counterpart featuring a collegial multi-pool data structure, dynamic load balancing through Work Stealing, and GPU thread-indexing optimizations. Both implementations are evaluated on the Permutation Flowshop Scheduling Problem using up to eight GPUs on Nvidia A100 and AMD MI250x architectures. Experimental results demonstrate that while CUDA and HIP versions consistently outperform Chapel in terms of raw performance, Chapel achieves comparable or superior scalability when considering absolute speedups. These findings suggest that Chapel represents a promising option for prototyping GPU-accelerated parallel applications, allowing developers to evaluate feasibility and design choices before transitioning to performance-tuned, low-level implementations.
13:55 - 14:00
Benson Muite
A summary of experience distributing and testing Chapel on Fedora GNU/Linux, and suggestions for integration within wider ecosystems.
14:00 - ??

October 10: Conference, Day 2

Time (PDT)
9:00 - 9:05
Brandon Neth
Opening session to welcome participants and share the schedule for the day.
9:05 - 9:35
Brad Chamberlain
This talk will give a brief summary of highlights and milestones achieved within the Chapel project since last year.
9:35 - 10:05
Emanuele Vitali and Jorik van Kemenade
In this talk we will introduce LUMI, starting with an overview of the consortium and how you can request resources, then focusing on its hardware (in particular, the LUMI-C and LUMI-G node architectures and the network architecture). We will then introduce the LUMI user support team, its way of working, and its duties. Finally, we will give a short demo of how to install and run a simple Chapel program.
10:05 - 10:20
10:20 - 10:30
Nelson Luís da Costa Dias
A simple (serial) recursive summation over a 1-dimensional array (a la quicksort) was implemented in 3 languages (Chapel, C, Fortran) and 4 compiler variants (Chapel 2.5 with LLVM, gcc 13.3.0, clang 18.1.3, gfortran 13.3.0), and compared with a standard non-recursive summation. Two alternatives for the recursion were tested: (i) passing array indices explicitly in the recursion (possible in all three languages) and (ii) using array slicing (only possible in Chapel and Fortran). Performance varied widely. Clang and Chapel were faster for the standard non-recursive summation; C and Fortran were faster for recursion using indices; and Fortran was much faster than Chapel for recursion using array slices. The performance of Chapel slices is a known issue (https://chapel.discourse.group/t/new-issue-improve-the-performance-of-slices-and-rank-change-operations/30503). It appears that for the standard summation and for recursive summation using indices, performance is related to the backend, i.e., GCC (gfortran and gcc) versus LLVM (Chapel and clang).
10:30 - 10:50
Jade Abraham
When writing thread-parallel applications, users of Chapel can use high-level productivity features like ‘forall’ and promotion to succinctly express their algorithms, or use lower-level features like ‘begin’ to more directly control task creation. Chapel’s GPU support is similar: high-level promoted statements can become kernels, but explicit ‘foreach’ loops can be used for greater control over the generated kernel. The missing piece is instruction-level parallelism and vectorization. The Chapel compiler usually does a great job of automatically vectorizing code, but when it fails there is no recourse. The next best option is to interoperate with C or Fortran code to write the low-level operations. To solve this problem, I have created CVL: chpl Vector Library. This library exposes a vector type as a first-class object that provides a unified set of operations across x86 and ARM, giving Chapel developers direct control over the vectorization in their applications. In this demo, I will showcase the design and implementation of the library, including the tools used to maintain it. I will then demonstrate several benchmarks where CVL beats the Chapel compiler’s auto-vectorization. Lastly, I will discuss potential improvements for the library going forward.
10:50 - 10:55
10:55 - 11:15
Daniel Fedorin
Chapel’s type system can be surprisingly powerful. In addition to “usual” features such as generics and polymorphism, Chapel provides the ability to manipulate types using functions; this involves both taking types as arguments to functions and returning them from these functions. This can enable powerful programming techniques that are typically confined to the domain of metaprogramming. For example, although Chapel’s notion of compile-time values — ‘param’s — is limited to primitive types such as integers, booleans, and strings, one can encode compile-time lists of these values as types. Such encodings can be used to create compile-time specializations of functions that would otherwise be tedious to write by hand. One use case for such specializations is the implementation of a family of functions for approximating differential equations, the Adams-Bashforth methods. Some variants of these methods can be encoded as lists of coefficients. Thus, it becomes possible to define a single function that accepts a type-level list of coefficients and produces a “stamped out” implementation of the corresponding method. This reduces the need to implement each method explicitly by hand. Another use case of function specialization is a type-safe ‘printf’ function that validates that users’ format specifiers match the types of arguments to the function. More generally, Chapel’s types can be used to encode algebraic sums (disjoint unions) and products (Cartesian products) of types. This, in turn, makes it possible to build arbitrary data structures at the type level; the lists-of-values case above is an instance of this general principle. Functions can be defined on type-level data structures by relying on overloading and type arguments. Programming in this manner starts to resemble programming in a purely functional language such as Haskell. Though this style of programming has not seen much use thus far, it can be a powerful technique for controlling types of arguments or constructing highly customized functions with no runtime overhead.
11:15 - 11:35
Jade Abraham and Lydia Duncan
Interoperability is a key tool for new languages to drive adoption and grow. Chapel has a rich set of interoperability features that allow users to write flexible applications. For example, the ability to write C code inline with Chapel code reaches the peak of interoperability: code from two languages side by side in the same file. When it comes to interoperability with Python, Chapel has taken a similar approach to other languages. Python is a slow language by its very nature, so to achieve good performance, modules are written in another language and then called from Python code. Chapel has been able to fill this role for some time. This takes a Python-first approach, working well for those who want to primarily write Python and a little bit of Chapel. The ability to call Python code from Chapel allows a Chapel programmer to write the majority of their application in Chapel and use a little bit of Python. Recent work has resulted in a Python module for Chapel that allows developers to reach the gold standard of interoperability: Python and Chapel code side by side in the same file. In this demo, we will showcase how this module is put together and some of the key features that enable Python interoperability in multiple ways. We will also show some application areas where this can be useful with libraries like numpy, scipy, and pandas.
11:35 - 11:45
Daniel Fedorin
Formal methods are a set of techniques used to validate the correctness of software. A particular category of these methods, model checking, uses the mathematical language of temporal logic to construct specifications of software’s behavior. A solver can then validate the constraints described in the formal language and ensure that undesirable states do not occur. This talk will be an experience report of using formal methods, specifically the Alloy analyzer, to detect a bug in Chapel’s ‘Dyno’ compiler front-end library. The code in which the bug was discovered is currently used in production, as well as in editor tools such as chplcheck and chpl-language-server. Specifically, Alloy was used to construct a formal specification of a part of Chapel’s use/import lookup algorithm. Chapel has a number of complicated scoping rules and possible edge cases in this area. By running this specification against a solver, a sequence of steps was discovered that could cause the algorithm to malfunction and produce incorrect results. A program that causes these steps to occur was constructed and served as a concrete reproducer for the bug. This reproducer was used to adjust the logic and fix the bug. This talk will cover the fundamentals of temporal logic required for formal specifications, the necessary parts of Chapel’s use/import lookup algorithm, and the steps taken to encode and validate the compiler’s behavior.
11:45 - 12:00
12:00 - 12:20
Michael Ferguson
Chapel 2.5 includes a new distributed-memory sort implementation. This talk will describe the interface for radix sorting and the new distributed sort algorithm, and discuss the performance of the new implementation.
12:20 - 12:40
Shreyas Khandekar and Matt Drozt
Distributed-memory parallel processing addresses computational problems requiring significantly more memory or computational resources than can be found on one node. Software written for distributed-memory parallel processing typically uses a distributed-memory parallel programming framework to enhance productivity, scalability, and portability across supercomputers and cluster systems. These frameworks vary in their capabilities and support for managing communication and synchronization overhead to achieve scalability. We implemented a communication-intensive distributed radix sort algorithm to examine and compare the performance, scalability, usability, and productivity differences between five distributed-memory parallel programming frameworks: Chapel, MPI, OpenSHMEM, Conveyors, and Lamellar.
12:40 - 12:50
12:50 - 13:50
Todd Gamblin
The past year has been transformative for the twelve-year-old Spack project, starting with its inclusion in the High Performance Software Foundation (HPSF) and culminating in its 1.0 release. Spack v1.0, released in July, is the first version to offer a stable package API and to integrate true compiler dependencies into its core model, features developed over many years. This talk will cover how the Spack community evolved to this point and detail the decision-making process behind joining the HPSF and finally taking the plunge and going 1.0.
13:50 - 14:00
14:00 - 14:10
Ryan Keck
Many parallel algorithms depend on reshaping how data is distributed across locales to achieve efficient computation. In this talk, I’ll introduce the Repartition module, a custom module implemented in Chapel designed to simplify and generalize all-to-all communication patterns. It enables each locale to specify a destination locale for each list element, then automatically redistributes the data accordingly. I’ll show how this module enables sharding patterns useful across a range of distributed algorithms. We’ll look at how Repartition integrates with Chapel’s parallelism features, explore implementation tradeoffs, and share benchmark results across various workloads. Whether you're writing high-performance algorithms or building reusable distributed libraries, Repartition offers a flexible and powerful tool for managing data layout.
14:10 - 14:20
Sosuke Hosokawa and Kenjiro Taura
While Chapel’s GPU programming model simplifies multi-GPU programming, it does not fully exploit the advanced GPU-to-GPU communication capabilities provided by modern GPUs. We present an integration of NVSHMEM into Chapel to enable efficient GPU-to-GPU communication from within CUDA kernels. Our implementation modifies Chapel’s GPU build pipeline and runtime system to support NVSHMEM’s symmetric memory model. Performance evaluation on the Miyabi supercomputer shows up to 100x speedup for small transfers and effective utilization of interconnect bandwidth for larger transfers, compared to Chapel’s native copy operations.
14:20 - 14:40
Oliver Alvarado Rodriguez, Engin Kayraklioglu, Bartosz Bryg, Mohammad Dindoost, David Bader and Brad Chamberlain
Distributed applications with fine-grained communication often suffer from performance bottlenecks. Chapel's CopyAggregation module addresses this for distributed array operations but doesn't support arbitrary remote operations. Building on Arkouda's pioneering aggregation concepts, this 20-minute talk presents a prototype for generalized destination aggregation, particularly addressing sparse matrix construction challenges. The first part analyzes Chapel's CopyAggregation module's capabilities and limitations, then introduces the generalized framework prototype. Implementation examples demonstrate aggregated array assignments and sparse matrix creation using CompressedSparseRow layouts with parallel safety mechanisms. The second part presents experimental results on both random recursive matrix (RMAT) and uniformly-created sparse matrices. Comparative analysis across HPE Cray EX and Infiniband systems reveals performance impacts of fine-grained versus aggregated communication. The third part outlines a roadmap for developing a generalized aggregation framework for Chapel, discussing ecosystem integration and applications beyond sparse matrices. This presentation targets Chapel users, HPC researchers, and practitioners working with distributed sparse data structures, providing practical insights for improving performance in irregular, communication-intensive applications.
14:40 - ??

Tutorial Days

October 7th and 8th

Guided tutorials, hands on exercises, office hours, and free coding session.

More on Tutorial Days →

Conference Days

October 9th and 10th

Keynote, talks, and demos from the Chapel community.

More on Conference Days →

Timeline

Sessions

About Tutorial Days

The first two days of ChapelCon ’25 (October 7 and October 8) will be hands-on. Each day will begin with a guided tutorial, followed by group hands-on exercises and then free coding sessions, where participants can work on their own applications or on provided project prompts.

Tutorials

Tutorial days will begin with in-depth tutorials covering a range of topics: building/installing Chapel, traditional programming language features (basic usage, classes/records, IO, standard modules), and HPC-focused topics (locality, parallelism, distributed data, synchronization). No prior knowledge or preparation needed.

Free Coding Sessions

Work on projects with other Chapel enthusiasts in the Free Coding sessions. We’ll begin with guided exercises to warm up, then shift to less structured work on personal projects or provided prompts. The Free Coding sessions will offer a relaxed working environment, with Chapel developers present to answer questions and breakout rooms for short demo sessions focused on solving specific, common problems.

Office Hours

Book an Office Hour for an in-depth pair-programming session with a Chapel contributor. The team is here to help with just about anything: understanding features, resolving bugs, or diagnosing and resolving performance issues. To sign up for a session, fill out a short survey to help us understand your problem and best match you with a Chapel developer.

About Conference Days

The two conference days will feature a mix of talks and demos from the community, a State of the Project update, a Keynote address, and Community Discussions.

Talks

If you have research or applications involving Chapel, we want to hear about it! This track is an opportunity to showcase work ranging from preliminary studies to already-published results and to get feedback from the Chapel community. Talk slots can run from 5 to 30 minutes.

Demos

If you have code or visualizations from Chapel-based work, this track is for you. You can demonstrate key parts of your implementation, show how it runs live, or advertise a new module or application you are working on. Demo slots can run from 5 to 30 minutes.

Posters and Extended Abstracts

ChapelCon ’25 will accept submissions of posters and extended abstracts, with or without accompanying presentations. These contributions will be reviewed by the program committee, and accepted work will be shared with attendees as part of the conference. These tracks are ideal for folks who are interested in sharing their work with the Chapel community but are unable to present on the day.

Community Discussions

As in previous years, conference days will include informal discussion periods to draw connections between different work presented each day.

Organization

Program Committee

Advisory Committee

Contact

If you have questions or suggestions about ChapelCon, please post to the ChapelCon ’25 Discourse thread or email us at chapel+con@discoursemail.com.