Workflow with galleys from JATS XML to HTML and PDF

@ajnyga Thank you very much… I downloaded and installed the version (dev-branch)… but I still do not see the images …

What could be the problem?
I’ve been trying to figure out the error for several days…
Any help would be fantastic!

I added some CSS to the “back” button. You can check here: https://ojstest.com/recaf/index.php/RCAF/article/view/42/69

Many thanks again!

The lens parser is very strict in some cases. What does the markup of your images look like in the original JATS XML?
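A quick way to answer that question is to list every figure and the file its graphic points to. The snippet below is only a minimal sketch: the sample XML and the function name `list_graphics` are made up for illustration, and it assumes the usual JATS pattern of a `<graphic>` with an `xlink:href` attribute inside a `<fig>`. What exactly the lens parser requires beyond that is not something this check can tell you.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# Invented sample: fig2 is deliberately missing its xlink:href.
SAMPLE = """<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <body>
    <fig id="fig1">
      <label>Figure 1</label>
      <caption><p>Example caption.</p></caption>
      <graphic xlink:href="figure1.png"/>
    </fig>
    <fig id="fig2">
      <graphic/>
    </fig>
  </body>
</article>"""


def list_graphics(xml_text):
    """Return (fig id, href) pairs; href is None when xlink:href is absent."""
    root = ET.fromstring(xml_text)
    results = []
    for fig in root.iter("fig"):
        for graphic in fig.iter("graphic"):
            results.append((fig.get("id"), graphic.get(f"{{{XLINK}}}href")))
    return results


for fig_id, href in list_graphics(SAMPLE):
    print(fig_id, href if href else "MISSING xlink:href")
```

If a figure shows up as missing, or the href does not match the file name uploaded with the galley, that is probably the first thing to fix.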

Greetings,

I will adapt the plugin to the defaultManuscript theme this week or at the end of next week. This will require modifying the templates for the article detail page and several others in this theme. I will let you know when it is on GitHub.

Hi,

Although it’s still a very basic version (in beta), you might want to try our free JATS editor at: http://jats.fontoxml.com
It’s a start. We’re in the process of further developing our JATS capabilities, guided by several experts / journals.
More to come by the end of this year.

Hi @ajnyga
How do I use jats-conversion (https://github.com/PeerJ/jats-conversion)?
I have tried converting docx to XML using the Open Typesetting Stack, but it seems that the bibliography cannot be converted properly.

Hi,

The library you linked can be used only for converting JATS XML to other formats like HTML. Or at least that is how I have been using it.

I used it in the embedGalley plugin that converts JATS XML galleys on the fly to HTML and shows them on the article abstract page (https://github.com/ajnyga/embedGalley). Unfortunately I have not had the time recently to take the plugin further, but I do have plans to do that in the future.
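To make the idea of turning JATS XML into HTML on the fly more concrete, here is a minimal sketch in Python. It is not how jats-conversion or embedGalley work internally; it only maps a handful of JATS elements to HTML tags with the standard library. The tag map, the sample XML and the function name `render` are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, deliberately tiny mapping of JATS elements to HTML tags.
TAG_MAP = {"body": "div", "p": "p", "italic": "em", "bold": "strong"}

SAMPLE = (
    "<body><sec><title>Intro</title>"
    "<p>Some <italic>text</italic>.</p></sec></body>"
)


def render(elem, level=1):
    """Recursively convert a JATS element to an HTML string."""
    if elem.tag == "sec":
        # Each nested section pushes its title one heading level deeper.
        tag, child_level = "section", level + 1
    elif elem.tag == "title":
        tag, child_level = f"h{min(level, 6)}", level
    else:
        # Unknown elements fall back to a neutral <span>.
        tag, child_level = TAG_MAP.get(elem.tag, "span"), level
    parts = [f"<{tag}>", escape(elem.text or "")]
    for child in elem:
        parts.append(render(child, child_level))
        parts.append(escape(child.tail or ""))
    parts.append(f"</{tag}>")
    return "".join(parts)


print(render(ET.fromstring(SAMPLE)))
```

A real conversion has to handle figures, tables, references and attributes as well, which is why a maintained conversion library is the better choice in practice.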

For converting articles from docx to JATS XML I have used the OTS-service. But as you noticed, there are still problems there.

I have been sketching a workflow that any journal editor, even one with a limited technical skill set, could use to create JATS XML articles and PDFs. You can find the sketch here: https://github.com/ajnyga/OJS3XMLWorkflow. It is a work in progress. At the moment I think we are still missing the needed tools, but I think we will get there soon. I am hoping to have time for this next summer…

If you have the technical skills to use, for example, command-line tools, then I suggest that you look into @Vitaliy’s work. I think that he has a command-line conversion script for docx and a more sophisticated plugin for visualizing JATS XML. I do not know the details, but maybe he can add a link here.

Thanks @ajnyga,
I hope the OJS XML workflow will be usable by journal editors like me with no web programming background. I have used DocxtoXML by @Vitaliy; it is great work. However, my journal uses APA citations and I don’t have the PHP skills to change the citation style to APA.

After diving into Java and OOXML, I have started work on a new DOCX to JATS XML converter. I will certainly add several citation styles to it. It will be written entirely in Java. The most complicated part is creating my own library, because the existing libraries, including docx4j and Apache POI, are not designed for such a complicated task (they don’t support every feature I need).
Certainly, my converter will be the best one :slight_smile: But it is long, hard work.

I didn’t have it either. Trial and error, a little common sense, hard work, and searching for the relevant code will get you whatever you want. It is frankly not that difficult. I have already been able to implement Vancouver styling, along with a lot of other features.

There are several major limitations whose fixes, as far as I’m aware, are not going to be part of OTS-service development:

  • Limited reference list parsing. Even native Microsoft Office citations are not rendered properly (and this is only a matter of extracting them from appropriate XML nodes).

  • Intext citations. Similar situation.

  • Limited table parsing. All the data in OOXML tables, including colspans and rowspans, can be transferred to JATS without data loss, but this does not happen when using this parsing service.

  • Poor metadata extraction. OOXML has separate XML tags for the article’s title and the authors’ affiliations. Extracting them is not implemented, nor is it planned.

  • Nested elements, like lists (actually, are they supported now?).
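To illustrate the point about native Microsoft Office citations: as far as I know, Word keeps the sources of its built-in citation manager as a `b:Sources` XML part inside the .docx package (typically under `customXml/`), so reading them is mostly a matter of walking that tree. The following is a minimal sketch under that assumption. The sample source is invented, and `parse_sources` is a hypothetical helper, not part of OTS or of any existing converter.

```python
import xml.etree.ElementTree as ET

B = "http://schemas.openxmlformats.org/officeDocument/2006/bibliography"
NS = {"b": B}

# Invented example of the bibliography part that Word writes into a .docx.
SAMPLE = f"""<b:Sources xmlns:b="{B}">
  <b:Source>
    <b:Tag>Smi20</b:Tag>
    <b:SourceType>JournalArticle</b:SourceType>
    <b:Author><b:Author><b:NameList>
      <b:Person><b:Last>Smith</b:Last><b:First>Jane</b:First></b:Person>
    </b:NameList></b:Author></b:Author>
    <b:Title>An example article</b:Title>
    <b:JournalName>Example Journal</b:JournalName>
    <b:Year>2020</b:Year>
  </b:Source>
</b:Sources>"""


def parse_sources(xml_text):
    """Return one dict per b:Source with a few commonly used fields."""
    root = ET.fromstring(xml_text)
    sources = []
    for src in root.findall("b:Source", NS):
        authors = [
            (
                person.findtext("b:Last", default="", namespaces=NS),
                person.findtext("b:First", default="", namespaces=NS),
            )
            for person in src.findall("b:Author/b:Author/b:NameList/b:Person", NS)
        ]
        sources.append(
            {
                "tag": src.findtext("b:Tag", default="", namespaces=NS),
                "type": src.findtext("b:SourceType", default="", namespaces=NS),
                "authors": authors,
                "title": src.findtext("b:Title", default="", namespaces=NS),
                "year": src.findtext("b:Year", default="", namespaces=NS),
            }
        )
    return sources


for source in parse_sources(SAMPLE):
    print(source)
```

From a structure like this, each entry could then be written out as a JATS reference, and the `tag` value could be used to match in-text citation fields to their reference-list entries.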

Also, the main parsing module in OTS is the TEIC stylesheets, run through meTypeset, and this is where all of those limitations come from. I don’t understand why nobody has implemented a more rigorous approach in one of the major programming languages.

The metadata extraction now uses the OJS metadata extensively: https://github.com/kaschioudi/ojs3-markup/pull/31 and https://github.com/pkp/ots/pull/95

Yes, there are still limitations, but once they are fixed, the usability of OTS together with the markup plugin and the Texture editor will be excellent. You can do everything without leaving OJS.

At the moment we only have one journal doing JATS XML, and they have a workflow of their own (they first convert to markdown and then use that to create XML and PDF). But they have an editorial assistant with specialist skills that most editors do not have.

Does the metadata in the example come from OJS? I was referring more to extracting the data from the DOCX.
Also, I am hoping to integrate the Java parser into OTS (I have already successfully deployed it on my local machine).
One more question, maybe more to @axfelix: Are you planning to keep these services (OTS and markup plugin) free to use without any costs?

Where possible, the plugin will override any of the metadata parsed from the Word or PDF document with the metadata already input into OJS; it uses the parsed metadata as a fallback.
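As a rough illustration of that precedence rule (not the plugin’s actual code), the merge could be sketched like this. The function name `merge_metadata` and the field names are hypothetical.

```python
def merge_metadata(ojs, parsed):
    """Prefer non-empty OJS values; fall back to values parsed from the document."""
    merged = dict(parsed)  # start from the parsed fallback values
    for key, value in ojs.items():
        if value not in (None, "", []):  # skip fields left empty in OJS
            merged[key] = value
    return merged


# Hypothetical field names, for illustration only.
parsed = {"title": "An exampel artcle", "abstract": "Parsed abstract."}
ojs = {"title": "An example article", "abstract": ""}

print(merge_metadata(ojs, parsed))
```

Here the title comes from OJS, while the abstract falls back to the parsed value because the OJS field is empty.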

If you have a contribution for OTS, I’d be very happy to review a pull request!

As for the cost of the service – PKP has never produced any closed-source code and we aren’t about to start now! The most up-to-date version of OTS and the plugin will always be available on GitHub. We will probably restrict bulk usage of our hosted OTS instance to PKP Publishing Services clients once we officially graduate OTS to a production service, which will happen after the next release of Texture (which will fix inline images and a few other things). But you can always install it yourself, e.g. with: https://github.com/axfelix/ots-vagrant

I see that @axfelix beat me to it :smiley: But yes, the idea was to create the whole front section from the OJS metadata, since all the needed data is already there when you go to production. Also, for things like ORCIDs, that is the safest way of adding them to JATS XML, since the ORCID guidelines suggest that you should not accept hand-typed ORCIDs.
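A minimal sketch of what generating part of the front section from stored metadata could look like, assuming the author data is already available as plain dictionaries. The dictionary keys and the function name `build_contrib_group` are hypothetical, and the ORCID iD is the public example iD from the ORCID documentation. The element names (`contrib-group`, `contrib`, `contrib-id`, `name`, `surname`, `given-names`) are standard JATS.

```python
import xml.etree.ElementTree as ET


def build_contrib_group(authors):
    """Build a JATS <contrib-group> element from a list of author dicts."""
    group = ET.Element("contrib-group")
    for author in authors:
        contrib = ET.SubElement(group, "contrib", {"contrib-type": "author"})
        if author.get("orcid"):
            # The iD is copied from stored metadata, never retyped by hand.
            contrib_id = ET.SubElement(
                contrib, "contrib-id", {"contrib-id-type": "orcid"}
            )
            contrib_id.text = author["orcid"]
        name = ET.SubElement(contrib, "name")
        ET.SubElement(name, "surname").text = author["family_name"]
        ET.SubElement(name, "given-names").text = author["given_name"]
    return group


# Hypothetical author record, for illustration only.
authors = [
    {
        "given_name": "Josiah",
        "family_name": "Carberry",
        "orcid": "https://orcid.org/0000-0002-1825-0097",
    }
]

print(ET.tostring(build_contrib_group(authors), encoding="unicode"))
```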

IMHO it would make sense for the Texture editor to be limited to editing the body and back sections.

@axfelix is there a Docker recipe for OTS?

Just the vagrant I linked a couple posts up that I’m aware of – but if you want to dockerize it, go for it! I can do that myself at some point if there’s interest.

Sounds like a nice summer project. :wink:

Looks outdated, but should we use this as a starting point?

BTW, for the last 2 years I have been waiting for a project mature enough to cover the whole JATS workflow. You have all done an impressive job over nearly a decade, and now we can see the fruits.

IMHO, the suite formed by “OTS+ojs3-markup+LensGalley” sounds like a winning horse.

@ajnyga and @Vitaliy also made great contributions that can help in specific contexts.

Thank you all, guys.

Looking for time to contribute in some way… even if only by testing and sending you feedback.

Cheers,
m.

Oh, I forgot all about that docker – I think I abandoned it when I decided to maintain the vagrant instead, but it may actually work fine if I up-port the xmlps.sh from the vagrant repo (it just won’t be much lighter-weight because of all the dependencies).

Yeah, this is my hope too! It’s been a long and winding road to be sure, but I think we finally have a good kit of tools together to solve the problem in a way that will fit most workflows.
