Where is full-article HTML generated, and what are the sources of JATS XML?

castedo · April 17, 2024, 4:33pm

I have scanned this forum, the OJS/OPS documentation, and the code to try to understand where JATS XML comes from and where full-article HTML is generated from JATS XML. By “full-article” HTML I mean all the information one would expect to see in a PDF version, notably body text and images.

My impressions are as follows. Am I mistaken on any of these points?

OPS (Open Preprint Systems) does not ingest JATS XML.
For OJS, JATS XML does not come from authors but rather is optionally imported by journal staff.
Direct generation of full-article HTML from JATS XML only happens in galley view.
Galley view is for journal staff, not authors.
Full-article HTML for a journal website is generated from an internal OJS-specific data representation, not from JATS XML.
The internal OJS-specific data representation might have started from an import of JATS XML, but that is optional.
There is no import of MECA packages.

Am I correct on these points?

What I am unclear on is whether body text and images inside JATS XML get transferred into an internal OJS representation. Do they?

In general, where does the JATS XML that is imported into OJS typically come from? In other words, what corpus is representative of the “dialect” of JATS that OJS understands? For example, which tools generate JATS XML that OJS is able to understand?

As a related resource to share, I have recently surveyed a number of JATS related open-source projects and have been documenting my rough summary understanding at:

https://baseprints.singlesource.pub/jats/

Best regards,
Castedo Ellerman
castedo.com/about

asmecher · April 18, 2024, 6:49pm

Hi @castedo,

OJS/OMP/OPS do not (yet) have a built-in workflow for body text. We have intentionally left that to external typesetting tools. We are beginning to explore the integration of an off-the-shelf HTML editor with mappings to and from JATS, but that work is in early stages.

So when you import JATS to e.g. OJS via some mechanism (see this thread, and I believe the PKP hosting team also maintains one), OJS ingests the metadata, but will only treat the body as a file that it does not interact with (which facilitates the current workflow, whereby users generally work with word processor documents).
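To illustrate the metadata/body split described here, the following is a minimal Python sketch, not OJS code. It assumes a heavily trimmed, hypothetical JATS document (`SAMPLE_JATS`) and a hypothetical helper (`split_jats`). The helper reads only the `<front>` metadata and merely notes whether a `<body>` is present, leaving the body untouched, much as an importer that only ingests metadata might.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed JATS document for illustration only.
SAMPLE_JATS = """<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>An Example Article</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name><surname>Doe</surname><given-names>Jane</given-names></name>
        </contrib>
      </contrib-group>
    </article-meta>
  </front>
  <body>
    <sec><title>Introduction</title><p>Body text lives here.</p></sec>
  </body>
</article>"""


def split_jats(xml_text):
    """Return (metadata dict, has_body flag) for a JATS document.

    Hypothetical helper: it reads <front> metadata only and never
    parses the contents of <body>.
    """
    root = ET.fromstring(xml_text)
    meta = root.find("front/article-meta")
    if meta is None:
        raise ValueError("no <front>/<article-meta> element found")

    title_el = meta.find("title-group/article-title")
    title = "".join(title_el.itertext()).strip() if title_el is not None else ""

    authors = []
    for name in meta.findall("contrib-group/contrib[@contrib-type='author']/name"):
        given = name.findtext("given-names", default="")
        surname = name.findtext("surname", default="")
        authors.append(f"{given} {surname}".strip())

    # The body is only detected, not interpreted.
    has_body = root.find("body") is not None
    return {"title": title, "authors": authors}, has_body


metadata, has_body = split_jats(SAMPLE_JATS)
print(metadata, has_body)
```

Real JATS files usually carry a DOCTYPE, namespaced elements such as MathML, and far richer metadata, so a production importer would need considerably more than this.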

Starting with 3.5.0 (when it’s released), OJS will have a “home” for JATS documents to live; see pkp/pkp-lib issue #7505, “Add basic support for JATS XML files to publications”, for details.

Typically authors do not arrive with JATS in hand; it’s up to the journal to create it. Most journals that have this requirement will generate JATS by using 3rd-party tools or out-sourcing the creation entirely…

Articles in JATS form are presented to the reader in a variety of ways, such as the JATS Parser Plugin or Lens Galley plugin (sadly Lens Reader is no longer maintained upstream but it is used in production by journals).

A galley is the “reader-ready” form of an article, typically PDF or HTML. It is only available for journal insiders to work with during the article workflow, but when the issue is published, it becomes available to readers.

OJS does not currently support MECA. I’ve been active within the MECA standard-setting group, but that hasn’t yet led to an implementation in OJS; the main impediment is that MECA currently requires FTP accounts for transport, and that’s not going to work for the majority of the OJS user community.

As for the JATS generation side of OJS – that is currently done through the JATS Template plugin, and made available through the OAI JATS plugin.
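As a rough sketch of how a harvester might ask for these generated JATS records over OAI-PMH, the Python snippet below only builds the request URL and does not contact any server. The base URL is hypothetical, the helper name `oai_jats_url` is invented for illustration, and the `jats` metadata prefix is an assumption about how the OAI JATS plugin is configured; check the `ListMetadataFormats` response of the actual installation.

```python
from urllib.parse import urlencode


def oai_jats_url(base_url, verb="ListRecords", metadata_prefix="jats", **extra):
    """Build an OAI-PMH request URL (hypothetical helper).

    `metadata_prefix="jats"` is an assumption; confirm it against the
    target installation's ListMetadataFormats response.
    """
    params = {"verb": verb, "metadataPrefix": metadata_prefix}
    params.update(extra)  # e.g. identifier=... for GetRecord, set=... for ListRecords
    return f"{base_url}?{urlencode(params)}"


# Hypothetical journal endpoint, for illustration only.
url = oai_jats_url("https://journal.example.org/index.php/demo/oai")
print(url)
```

`verb` and `metadataPrefix` are standard OAI-PMH request parameters; only the prefix value and the endpoint are specific to a given installation.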

As you can see, the current landscape is a wide variety of tools that don’t currently cohere into a single toolchain. We have been lacking a free/open source JATS-native web-based editor for the rest of the tools to adhere to, and unfortunately the ones we’ve worked with (Texture and Libero) have both died on the vine. I don’t see a good candidate showing up in the ecosystem in the near term, unfortunately. This is why we’re starting to look at an HTML editor instead, with mappings to and from JATS.

Happy to talk further!

Regards,
Alec Smecher
Public Knowledge Project Team

castedo · April 18, 2024, 9:55pm

Thanks for all this info!

I’m starting to see different tool-chains and dialects of JATS as serving the differing needs of different groups with different priorities and tool proficiencies. I’m trying to hash out yet another dialect serving (hopefully) the needs of tech-savvy STEM researchers “self-publishing” certain types of research documents. I’m using the name “Baseprint” to disambiguate this new dialect of JATS. While I’m at it, I’ll try to document some of this JATS dialect diversity.

Given this, would it be accurate to describe the “XML Data Source” of JATSParserPlugin and lensGalley as “manual imports by journal staff”?

This is in the context of filling in the table on https://baseprints.singlesource.pub/jats/.

Does https://github.com/ajnyga/embedGalley NOT read the full-article and body text of JATS XML?

How does https://github.com/withanage/lensGalleyBits relate to the above mentioned plugins?

asmecher · April 19, 2024, 12:36am

Hi @castedo,

Given this, would it be accurate to describe the “XML Data Source” of JATSParserPlugin and lensGalley as “manual imports by journal staff”?

Yes, that’s the typical pathway – until we have a fulltext editor of some kind integrated.

However, not all JATS use in production requires the fulltext. For example, Coalition Publica operates using just the metadata supported by the JATS Template plugin.

Does https://github.com/ajnyga/embedGalley NOT read the full-article and body text of JATS XML?

I haven’t used that plugin – it’s a 3rd-party development, though Antti-Jussi (its author) is a good friend to the project. But yes, that’s what it appears to do.

How does https://github.com/withanage/lensGalleyBits relate to the above mentioned plugins?

That’s another 3rd-party development, again by a good friend to the project. My understanding of it is that it extends the Lens Galley plugin to add support for BITS.

You’d be able to reach @ajnyga and @Dulip_Withanage here, if you have any specific questions about those!

Regards,
Alec Smecher
Public Knowledge Project Team