
Preserving Interactive Research Content

Challenges, Frameworks, and Best Practices

Abstract

Research is often more than its text: it can include the software and computational environments needed to reproduce the work. Researchers can share these as computational notebooks; however, scholarly communications infrastructure is currently poorly positioned to display, interact with, and archive that content. The next generation of executable and interactive research content in scientific publishing requires comprehensive strategies, guidelines, and frameworks to ensure long-term preservation of the scientific record.

Modern research is increasingly computational and interactive. From data-heavy simulations to AI-driven analyses, scientists rely on code, data, and interactive tools to interrogate and produce results. Yet scholarly communication remains largely stuck in the print era: the static PDF research paper. Even online, most papers are essentially digital facsimiles of print, disconnected from their underlying code, data, and interactive execution environments. This disconnect has serious implications and can “needlessly limit, inhibit and undermine effective knowledge transfer” (Bourne et al., 2012), impeding discoverability, reuse, verification, and long-term access to the full scientific record. Results presented in static text and figures are hard to verify or reuse, and much of the richness of computational work is lost upon publication. The 2011 FORCE11 Manifesto noted that “online versions of ‘scholarly outputs’ have tended to replicate print forms, rather than exploit the additional functionalities afforded by the digital terrain” (Bourne et al., 2012). FORCE11 itself has encouraged much progress in supporting areas such as data and software citations, FAIR principles, and transparent contributor roles; however, the unit and form of the scholarly publication itself remains largely a facsimile of print.

Today, we are increasingly seeing interactive formats where readers of a scientific narrative have frictionless access to data and interactive research outputs. For example, readers might zoom into a large-scale microscopy image, adjust a simulation’s parameters and rerun results within the narrative, experiment with embedded Jupyter notebooks to test or modify code, or explore time-series visualizations to see broader patterns beyond the author’s chosen times or locations.

Interactive research outputs demonstrate the reproducibility of the work and its connection to source materials. Frictionless reproducibility—the ability to access, rerun, and build upon shared code, data, and results with near-zero effort (Donoho, 2024)—extends this further by transforming research into a continuously verifiable process. In our discussions with stakeholders and early adopters, we typically hear about two main barriers to adoption: (1) the need for better tools to author executable and interactive content, and (2) the challenge of preserving these formats for the long term. In recent years, many platforms—such as Jupyter Notebooks, R Markdown, Quarto, Jupyter Book, MyST Markdown, and Manubot—have significantly improved the authoring experience. Some high-profile initiatives over the past five years have also attempted aspects of interactivity (eLife, AGU Notebooks Now, Microscopy Society of America, DistillPub); however, their uptake within traditional scholarly publishing remains limited.

We believe that better infrastructure for the long-term preservation of interactive research content would significantly accelerate adoption of interactive research communication tools. Today, many researchers hesitate to commit to formats such as executable notebooks or interactive visualizations because they worry that the content may not be runnable in just a few years. This uncertainty discourages experimentation, even when the tools themselves hold great promise for improving transparency and reproducibility. If publishers and tool developers had more robust preservation strategies—ensuring that interactive research objects could be reliably archived, cited, and re-executed—researchers would feel more confident that their work would endure. This reassurance would encourage researchers to learn and adopt these technologies, ultimately enriching scholarly communication with more transparent and engaging research outputs.

Fortunately, many technologies already exist to support the preservation of interactive research environments. Package managers, containers, and virtual machines provide ways to capture software dependencies and system configurations; build tools (e.g., GNU Make and its Windows ports) automate rebuilding executable binaries from source code; hosting tools such as MyBinder, Google Colab, GitHub/GitHub Pages, and Globus make it easier to host or share one’s work, to say nothing of the suites of resources from major cloud providers; and workflow managers (Galaxy, Nextflow, Snakemake, Code Ocean, Whole Tale, OpenML, ReproZip, etc.) make replication easier. Open source practices, which invite community contribution, can also sustain a tool or project even if the original creator has moved on to other projects. We expect these solutions to continue advancing, and new solutions to emerge; for example, generative AI coding assistants may make it easier to repair degraded or partially functional environments. Nevertheless, no technology is immune to obsolescence, so we propose a tiered preservation strategy: alongside a fully interactive “ideal” version of a research object, fallback formats of decreasing interactivity should also be published. To reduce the burden on authors, modern interactive tools should seamlessly generate these fallback versions during their build, export, or publication processes. Such a strategy would give researchers greater confidence that their work will remain accessible even as the underlying technologies inevitably change.
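The tiered strategy can be sketched in a few lines. This is a minimal, hypothetical illustration of the idea — the tier names, fields, and artifact paths are our own assumptions for the example, not a published schema or any tool’s actual API:

```python
# Hypothetical sketch of a tiered preservation record (illustrative only).
from dataclasses import dataclass, field

@dataclass
class PreservedObject:
    """A research object published with fallback formats of decreasing interactivity."""
    title: str
    # Ordered richest-first; each entry is (tier name, artifact path).
    tiers: list = field(default_factory=list)

    def best_available(self, working: set) -> tuple:
        """Return the richest tier whose runtime still works today."""
        for tier, artifact in self.tiers:
            if tier in working:
                return tier, artifact
        raise LookupError("no preserved format is renderable")

obj = PreservedObject(
    title="Example article",
    tiers=[
        ("live-environment", "binder/Dockerfile"),  # fully interactive
        ("executable-notebook", "analysis.ipynb"),  # re-runnable locally
        ("static-html", "article.html"),            # readable, no execution
        ("pdf", "article.pdf"),                     # archival last resort
    ],
)

# Years later, the container runtime is obsolete: degrade gracefully.
tier, artifact = obj.best_available({"static-html", "pdf"})
print(tier, artifact)  # → static-html article.html
```

The design point is that the fallback chain is declared once, at publication time, so a future reader (or archive) only needs to know which runtimes still function in order to serve the richest surviving format.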

Contents

↔️ Graceful Degradation

Because complex systems can fail or simplify incrementally rather than catastrophically, we promote designing systems that provide the richest experience possible when everything works, while still offering meaningful, functional fallback options when conditions change.

📖 Expanding the Idea: Perspectives from the Ecosystem

As we move forward with advancing the PIRC Working Group’s framework for graceful degradation, we invite readers to consider what this approach might look like from their own positions across the scientific communication landscape.

📗 Definitions

Terms and definitions used in this context, including our definitions of preservation and of interactive research content.

Working Group Contribution

Preserving interactive research content—not just code or data in isolation—means capturing the integrated environments and narratives that underpin computational science. This is essential for reproducibility, transparency, and educational value. To achieve this, we must support a spectrum of preservation strategies, from fully interactive environments, to readable and static export formats, to archived artifacts with embedded provenance. PIRC takes a higher-level mindset on adopting better practices gradually through progressive enhancement: adding computation, data, and interactivity, and then proposing how to “fall back” to today’s formats.

References
  1. Bourne, P. E., Clark, T. W., Dale, R., de Waard, A., Herman, I., Hovy, E. H., & Shotton, D. (2012). Improving The Future of Research Communications and e-Scholarship (Dagstuhl Perspectives Workshop 11331). 10.4230/DAGMAN.1.1.41
  2. Donoho, D. (2024). Data Science at the Singularity. Harvard Data Science Review, 6(1). 10.1162/99608f92.b91339ef