OJS 3.3-3.5 Crossref XML exports titles incorrectly

Problem 1

The Crossref schema says: “Face markup that appears in the title, subtitle, and original_language_title elements should be retained when depositing metadata.” Here, “face markup” means bold, italics, superscripts, subscripts, etc. Yet the OJS Crossref plugin XML-encodes the titles. The problem goes beyond simple aesthetics; it is especially annoying for chemists to turn every H&lt;sub&gt;2&lt;/sub&gt;O back into H<sub>2</sub>O before depositing. The problem persists in OJS 3.5. Are there plans to fix it?
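To make the expected result concrete, here is a minimal Python sketch of the clean-up we currently do by hand. It is not the plugin's code, and the tag whitelist `FACE_TAGS` is my assumption: it covers only the tags named above, so check it against the schema before relying on it.

```python
import re

# Assumed whitelist: only the face-markup tags mentioned in this post.
FACE_TAGS = ("b", "i", "sub", "sup")

# Matches an XML-escaped opening or closing tag from the whitelist,
# e.g. "&lt;sub&gt;" or "&lt;/sub&gt;".
_ESCAPED_TAG = re.compile(r"&lt;(/?)(" + "|".join(FACE_TAGS) + r")&gt;")


def restore_face_markup(escaped_title: str) -> str:
    """Turn escaped whitelisted tags back into real tags.

    Everything else (other escaped tags, &amp;, etc.) stays escaped.
    """
    return _ESCAPED_TAG.sub(r"<\1\2>", escaped_title)


print(restore_face_markup("H&lt;sub&gt;2&lt;/sub&gt;O"))
```

Only whitelisted tags are restored, so an escaped `&lt;script&gt;` or a literal `&amp;` in a title is left untouched.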

Problem 2

We (and the lead OJS developers - hi, @asmecher) all know that Crossref offers very limited i18n options. While two multilingual JATS-formatted abstracts can be deposited, there’s no way of attaching a language attribute to a titles or title element. Yet the schema permits multiple titles (what for? who knows). For bilingual articles, the Crossref plugin exports something like this:

<journal_article xmlns:jats="http://www.ncbi.nlm.nih.gov/JATS1" publication_type="full_text" language="uz">
<titles>
  <title>sarlavha</title>
</titles>
<titles>
  <title>title</title>
</titles>

It is clear that with just one per-journal_article language attribute, there is no way of telling “which title is which”, so there had better be just one title. I think.
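A small Python sketch of what a consumer of this XML actually sees. It assumes a simplified, self-closed version of the excerpt above with no default namespace (a real deposit file declares one, so the tag lookups would need adjusting); `titles_with_language` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Simplified version of the excerpt above, closed so that it parses.
SNIPPET = """
<journal_article publication_type="full_text" language="uz">
  <titles><title>sarlavha</title></titles>
  <titles><title>title</title></titles>
</journal_article>
""".strip()


def titles_with_language(xml_text: str) -> list[tuple[str, str]]:
    """Pair every title with the only language information available:
    the language attribute on journal_article."""
    article = ET.fromstring(xml_text)
    language = article.get("language")
    return [(language, title.text) for title in article.iter("title")]


for language, title in titles_with_language(SNIPPET):
    print(language, title)
```

Both titles come back labelled "uz", including the English one, because nothing else in the markup says otherwise.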

Am I right? Can/should this be fixed as well?

Other distantly related discussions: this and this old post, and the GitHub issue.

@AhemNason,

Would you be able to weigh in on this?

-Roger
PKP Team

Just making sure you saw this: Multilingual Metadata in Crossref · Issue #9126 · pkp/pkp-lib · GitHub
And an older related issue: Way to set language for articles in OJS Export · Issue #21 · pkp/crossref-ojs · GitHub

Thank you for the references! I’m not sure if I saw these particular ones, but this is a recurrent topic across the Crossref-related forums and GitHub issues.

Adding a language attribute for titles is not something that we’ll be adding.

This comes from the Crossref team: https://community.crossref.org/t/multi-language-support/3054/22. Great. But why allow multilingual abstracts, then?

Hence, I don’t understand the logic behind the current implementation:

If the article is not a translation, but you simply want the metadata expressed in multiple languages, you should not use <original_language_title> but instead add multiple <titles> sections, one for each language.

(quoting from the PKP issue that you mentioned). XML is supposed to be machine-readable. Without an explicit language attribute for titles, the machine sees this: “this journal_article has two titles, both in the same language defined in the attributes of journal_article”, which is not the intended meaning. Only one of the titles is in the specified language.

if a member had a work whose full text was available in Spanish, French, and English and only wanted one DOI to be cited for those works in the three languages, while we’d recommend three distinct DOIs in this case, they would be able to provide titles in those multiple languages (free of a language attribute)

(from the same Crossref forum thread). Also great. And then we’ll let the implementation, aka the parser, guess which title is which. Is it the first? The last? The one in the middle? What the hell do we need XML and schemas for, then? Let’s just throw all the titles in as plain text. We said that the member should register three DOIs, but they stubbornly register one. Let’s punish them for saving a few cents on the DOI registration fees :rofl:

You may not agree with me, but I’d say that in the current scheme of things, keeping a single title in the specified language would result in less ambiguous XML than exporting several titles. But in the end it is, of course, for the developers to decide whether this logic aligns with the OJS vision.
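What I have in mind, as a rough Python sketch rather than a patch: export only the title that matches the language attribute. The function name, the `titles_by_lang` mapping and its plain language-code keys are hypothetical, and the real plugin works with OJS locale data, not a dict like this.

```python
import xml.etree.ElementTree as ET


def build_single_title_article(titles_by_lang: dict[str, str], language: str) -> ET.Element:
    """Build a journal_article with exactly one <titles> block,
    using the title whose language matches the language attribute."""
    article = ET.Element(
        "journal_article",
        {"publication_type": "full_text", "language": language},
    )
    titles = ET.SubElement(article, "titles")
    title = ET.SubElement(titles, "title")
    title.text = titles_by_lang[language]  # raises KeyError if that title is missing
    return article


article = build_single_title_article({"uz": "sarlavha", "en": "title"}, "uz")
print(ET.tostring(article, encoding="unicode"))
```

With this, the one title in the output is guaranteed to be in the language that journal_article declares, and the other translations are simply not deposited.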

In any case, I’d also like to draw attention to the “face markup” being escaped in the titles (problem 1 in the original post). What do you think about that?