Unresolved Challenges in PRC

The following is an update to a chapter in my PRC vs The Cathedral book. A number of folks spontaneously reviewed the book and gave very good feedback. I haven't yet absorbed all the ideas, but I'm in the process of doing so... in the meantime, here is an update to the chapter "Unresolved Challenges in PRC", which has a large portion about tech, reflecting my belief that tech is, and always has been, an important "agent of change" (to quote Elizabeth Eisenstein) in publishing, but that its importance in today's journal and scholarly communications landscape is not well understood. Comments welcome!
I'll print the book in a few weeks for a first print edition.
Forgive the endnotes... they still need going through.
Unresolved Challenges in PRC
While the Publish-Review-Curate (PRC) model addresses some of the most pressing issues in traditional journals—most notably the time-to-market problem by prioritizing rapid dissemination—it leaves significant challenges unresolved, particularly in the areas of review and production.
Some have suggested that preprint review is quicker—or can be quicker—than traditional journal review processes [1]. However, these efficiencies may diminish if PRC retains, as it generally does now, traditional review workflows (reviewer invitations, multiple rounds, author feedback, etc.), since scaling up will introduce the same challenges that have slowed journal workflows. This underscores the need for thoughtful and innovative approaches to reimagine the review process itself.
A key question is whether every research output requires the same level of scrutiny. For some content, lightweight or selective review might suffice, reducing delays while maintaining quality. Complementary approaches, such as public annotations or open discussions, could also foster faster, iterative feedback while increasing transparency and engagement.
By exploring these innovations, PRC can ensure it not only scales effectively but also maintains the agility and accessibility that define its promise.
Production presents an even greater challenge. Integrating content into the scholarly record is a complex and technical process. Researchers often face substantial barriers, such as understanding JATS XML, navigating metadata systems like CrossRef or DataCite, or even breaking down citations into structured formats. These tasks, which may seem routine to publishing professionals, are daunting for researchers without specialized expertise.
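To make the barrier concrete, here is roughly what a single reference looks like once encoded in JATS XML. The element names are standard JATS; the citation itself is invented for illustration. This is the kind of structure researchers are implicitly asked to produce, or pay someone to produce, for every reference in a paper:

```xml
<ref id="ref1">
  <element-citation publication-type="journal">
    <person-group person-group-type="author">
      <name><surname>Doe</surname><given-names>J.</given-names></name>
    </person-group>
    <article-title>An example article</article-title>
    <source>Journal of Examples</source>
    <year>2023</year>
    <volume>12</volume>
    <fpage>45</fpage>
    <lpage>60</lpage>
    <pub-id pub-id-type="doi">10.1234/example</pub-id>
  </element-citation>
</ref>
```

A plain-text citation carries the same information in one line; the gap between the two is exactly the specialized expertise the text describes.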
Ironically, the reliance on PDFs exacerbates these issues. Preprint servers, while revolutionary in speeding up dissemination, have largely regressed to PDFs as the default format, setting the sector back in terms of accessibility and reusability. Converting unstructured PDFs into structured formats like HTML or XML—the so-called “hamburger-to-cow” problem [2]—is labor-intensive, costly, and inefficient. Transitioning to structured HTML earlier in the workflow would have transformative effects, enabling seamless conversions into formats like EPUB, PDF or XML while reducing typesetting costs and improving accessibility. HTML-first, single-source publishing systems present a compelling alternative to PDF and XML-driven workflows, simplifying engagement while maintaining expectations of document and data fidelity [3].
Another issue PRC has not yet dealt with: why are we still dealing exclusively with articles? Surely there are other formats that could lead to more dynamic and useful ways of presenting research. What about other types of research objects, such as data publications? Many experiments are already exploring this space, including JOSS (mentioned earlier) and newer platforms like Curvenote and Stencila, which, amongst other things, render data-driven dynamic charts. We need more of this, but it also comes with a plethora of problems yet to be solved. JOSS relies on GitHub for its workflows, which is not an intuitive or practical operational environment for individuals outside of development. Additionally, the challenge of ensuring portability for dynamic content in review contexts remains significant—this is especially problematic in decentralized review ecosystems where seamless workflows are critical.
Then, of course, there is the role of AI—a big topic in scholarly communications at this moment. The Open Infrastructure movement includes AI tools that, much like their proprietary counterparts, sometimes overpromise on what AI can realistically achieve—often as a means to attract funding or attention. This issue is not unique to open-source tools; proprietary systems suffer from the same tendency to sell "magic." The general problem is that when a new technology emerges, it is, almost by definition, not well understood by the broader community. This lack of understanding creates a temptation for technologists to frame their tools as a kind of magic, even when the reality is more modest. Compounding this is the fact that people often like to buy into the promise of "magic" solutions rather than grappling with the complexities of the tools themselves.
There's nothing inherently wrong with a code base that is 15 years old and is a monolith as opposed to a series of microservices. That's not it at all. [...] The technology we provide, the services that we provide, are ultimately for a very conservative and slow moving ecosystem.
--Will Schweitzer, CEO, Silverchair
On the other hand, there can very well be something wrong with tech that is 15–30 years old. The issue with statements like Will Schweitzer's is the assumptions embedded in aging technology. Being operational is not the same as being efficient, and sometimes new tech—even a new idea—comes along that fundamentally changes how you design systems. Platforms like ScholarOne, Editorial Manager, Literatum, EJP, Snapp, Silverchair, and other systems were designed in a post-paper framework, at a time when the technical foundations—hardware, software, user expectations, and systems thinking—were vastly different from where they are now. Will is betting that AI can be bolted onto these legacy systems, but the reality is that systems designed with AI as an afterthought will never fully realize its potential. Instead, new systems are emerging that treat AI as a foundational design principle, shaping workflows, infrastructure, and user experiences from the ground up.
Nearly every sector is rethinking and rebuilding its systems with an "AI-first" mentality [4]—designing with AI as a foundational element rather than an afterthought. The next phase of AI development is moving beyond simply adding a "magic button" to existing applications. Instead, systems are being designed from the ground up, with AI deeply embedded in their architecture and workflows. Scholarly publishing, however, remains an exception, often seeking to retrofit these tools onto legacy systems rather than embracing a proactive rebuild. This cautious approach aligns with the cultural norms of a skeptical and slow-moving researcher and publishing community, still grappling with the ethical and technical implications of integrating AI into established workflows.
Sooner or later, these systems are likely to be replaced. When the first brick is removed, the rest may follow in quick succession, opening the door for the sector to fully leverage AI platforms that meet the evolving demands of modern research and communication. While opinions about AI vary, and skepticism runs deep in the scholarly sector, its transformative potential is becoming increasingly apparent. The key question is: who will build these platforms—big publishers, Silicon Valley tech companies, or advocates of open infrastructure? And just as important, who should build them?
Despite this, there are excellent open tools, such as OpenAlex and Semantic Scholar, which are already making strides in integrating AI into research workflows. These tools, however, need better integration into PRC systems, and more tools of this caliber are necessary to support the ongoing evolution of PRC practices effectively.
The technical infrastructure supporting PRC workflows adds another layer of complexity. Each PRC community has its own ideas about the data it needs, where it comes from, what happens to it, how it is represented, and where it ultimately ends up. These requirements are often arbitrary in the sense that they vary significantly across communities and domains, reflecting the diverse needs and priorities of each group. Many PRC workflows also rely on novel or experimental processes that push existing systems beyond their intended capabilities. Coordinating teams across these disparate workflows and creating shareable, publishable assets—such as reviews or annotations—often involves integrating with external systems in diverse formats. This level of internal data management, differing coordination models, and system interoperability demands a technical agility that many current systems, designed with publishers rather than researchers in mind, fail to provide.
Addressing these challenges requires making strategic choices about technological development. There are three main paths: building bespoke systems tailored to specific workflows, developing abstracted platforms capable of managing diverse processes, or leaving researchers to manage workflows using general-purpose tools like spreadsheets. The latter, though ubiquitous, is inefficient and prone to errors. Spreadsheets are a frequent cause of delays in many traditional journal workflows (still!), and their over-reliance risks replicating these inefficiencies in PRC workflows.
Bespoke systems, while offering a high level of fit for specific projects, are costly to develop and often replicate up to 90% of existing functionality, leading to unnecessary duplication of effort and expense. This issue is particularly prevalent in the journal sector, where many publishers independently build tools that replicate similar features, shouldering the full burden of development themselves. It is essential to learn from these inefficiencies and avoid repeating them in PRC workflows. Flexible open infrastructure platforms, by contrast, offer greater adaptability but demand significant upfront investment and technical expertise to develop.
In my own experience working with various PRC groups, I've observed that while there is significant potential for shared features—such as notification systems, DOI management, invitations, versioning, tracking, and reporting—each project also often requires systems to be configured and extended to meet its specific needs. This variability reflects the emergent nature of PRC as a field of practice.
For instance, some groups require each review to have its own DOI, with preferences split between using CrossRef or DataCite. Some favor batch-importing content from services like Semantic Scholar, while others rely exclusively on manual submission processes. Certain groups prefer inviting authors into the review process, while others insist on formal submissions. For those that do want author involvement, the type of interaction varies—ranging from concurrent chat to threaded messaging, or even integrating the author directly into the review process itself. Collaborative review models appeal to some, whereas others prioritize isolated, independent reviews. Even the features for managing team dynamics vary—some support flat team structures, while others cater to highly hierarchical setups.
There is also significant divergence in how groups handle updated preprints within ongoing review cycles - an interesting challenge in a decoupled ecosystem. Moreover, every PRC project I’ve worked with has unique requirements for the data needed for evaluation, how it is presented to reviewers, and what happens to the review once completed.
Further, as the PRC field matures, there will be an increasing need for experimentation, and the supporting technology must adapt accordingly. Technology must not only iterate alongside these experiments but also anticipate potential avenues for further innovation and exploration.
These diverse requirements underscore the need for configurable and extensible technical systems. Each of these variations represents a challenge for platform design, necessitating either many bespoke systems or platforms that can adapt to a wide range of practices while minimizing redundancies and inefficiencies.
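One way to see how far configuration alone could go is to express the variations listed above as data rather than bespoke code. The sketch below does exactly that; every field name and value is hypothetical, chosen only to mirror the variations described in the text, not drawn from any existing platform:

```python
from dataclasses import dataclass
from typing import Optional

# Sketch: per-community PRC variations captured as configuration.
# Fields mirror the differences described above: DOI registrar choice,
# content intake, author involvement, review model, and team structure.

@dataclass
class PRCConfig:
    review_doi_registrar: Optional[str]  # "crossref", "datacite", or None
    content_intake: str                  # "batch_import" or "manual_submission"
    author_involvement: str              # "none", "chat", "threaded", "embedded"
    review_model: str                    # "collaborative" or "independent"
    team_structure: str                  # "flat" or "hierarchical"

# Two hypothetical communities with divergent practices:
community_a = PRCConfig("datacite", "batch_import", "threaded",
                        "collaborative", "flat")
community_b = PRCConfig(None, "manual_submission", "none",
                        "independent", "hierarchical")
```

The harder cases—novel data requirements for evaluation, or handling updated preprints mid-review—will not fit neatly into flags like these, which is where extensibility, not just configurability, becomes necessary.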
Endnotes
- Avissar-Whiting M, Belliard F, Bertozzi SM, Brand A, Brown K, Clément-Stoneham G, et al. (2024). Recommendations for accelerating open preprint peer review to improve the culture of science. This interesting conference summary covers research and experiences of PRC advocates and practitioners. https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3002502#sec002
- The "Hamburger-to-Cow" Problem: Popularized by Peter Murray-Rust, this concept highlights the inefficiencies of converting unstructured PDFs back into structured data formats. The analogy emphasizes the unnecessary complexity of re-engineering content for reuse. For more, see: https://council.science/blog/implementing-fair-data-principles/
- Single-Source Publishing: A streamlined approach that uses a single structured format, such as HTML, to produce multiple outputs like web content, PDFs, EPUBs, and XML. For an in-depth discussion, see my blog post (2023): https://www.robotscooking.com/single-source-publishing/
- "So now companies are going to basically come through, sweep through, and they're going to do basically AI-first versions of basically everything" —Marc Andreessen, on the Lex Fridman podcast.