Workflow with galleys from JATS XML to HTML and PDF

David_Alarcon_davidy · May 31, 2017, 5:37am

@ajnyga Thank you very much… I downloaded and installed the version (dev-branch)… but I still do not see the images …

What could be the problem?
I’ve been trying to figure out the error for several days…
Any help would be fantastic!

I added some CSS to the “back” button. You can check here: https://ojstest.com/recaf/index.php/RCAF/article/view/42/69

Many thanks again!

ajnyga · May 31, 2017, 5:49am

The lens parser is very strict in some cases. What does the markup of your images look like in the original JATS XML?
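
One way to answer that question is to list every figure graphic in the galley file and the file name it points to. The following is only a minimal sketch: it assumes the common JATS pattern of a `<graphic xlink:href="...">` inside a `<fig>`, and the sample document and the `list_graphics` helper are made up for illustration. It says nothing about which variants the lens parser actually accepts.

```python
import xml.etree.ElementTree as ET

# Attribute name for xlink:href in ElementTree's {namespace}local notation.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical sample galley, not taken from the article discussed here.
SAMPLE = """<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <body>
    <fig id="fig1">
      <label>Figure 1</label>
      <caption><p>Example caption.</p></caption>
      <graphic xlink:href="figure1.png"/>
    </fig>
  </body>
</article>"""


def list_graphics(jats_xml):
    """Return (fig id, xlink:href) pairs for every <graphic> inside a <fig>."""
    root = ET.fromstring(jats_xml)
    found = []
    for fig in root.iter("fig"):
        for graphic in fig.iter("graphic"):
            found.append((fig.get("id"), graphic.get(XLINK_HREF)))
    return found


for fig_id, href in list_graphics(SAMPLE):
    print(fig_id, href)
```

A `None` in the second position would mean the `<graphic>` has no `xlink:href` (or the xlink namespace is declared differently), which is worth checking against the file names of the uploaded images.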

Vitaliy · May 31, 2017, 7:37am

Greetings,

I will adapt the plugin to the defaultManuscript theme this week or at the end of next week. This will require modifying the templates for the article detail page and several others in this theme. I will let you know when it is on GitHub.

Taeke_Kuyvenhoven · September 13, 2017, 8:13am

Hi,

Although it’s still a very basic version (in beta), you might want to try our free JATS editor at: http://jats.fontoxml.com
It’s a start. We’re in the process of further developing our JATS capabilities, guided by several experts / journals.
More to come by the end of this year.

kawahyu · November 20, 2017, 10:07am

Hi @ajnyga
How do I use jats-conversion (GitHub - PeerJ/jats-conversion: Conversion and validation for JATS XML)?
I have tried converting docx to XML using the Open Typesetting Stack, but it seems that the bibliography cannot be converted properly.

ajnyga · November 20, 2017, 10:51am

Hi,

The library you linked can be used only for converting JATS XML to other formats like HTML. Or at least that is how I have been using it.

I used it in the embedGalley plugin that converts JATS XML galleys on the fly to HTML and shows them on the article abstract page (GitHub - ajnyga/embedGalley: OJS3 plugin for visualizing JATS XML galleys). Unfortunately I have not had the time recently to take the plugin further, but I do have plans to do that in the future.

For converting articles from docx to JATS XML I have used the OTS-service. But as you noticed, there are still problems there.

I have been sketching a workflow that any journal editor, even with a limited technical skill set, could use to create JATS XML articles and PDFs. You can find the sketch here: GitHub - ajnyga/OJS3XMLWorkflow. It is a work in progress. At the moment I think that we are missing the needed tools, but I think that we will get there soon. I am hoping to have time for this next summer…

If you have the technical skills to use, for example, command line tools, then I suggest that you look into @Vitaliy’s work. I think that he has a command line conversion script for docx and a more sophisticated plugin for visualizing JATS XML. I do not know the details, but maybe he can add a link here.

kawahyu · November 20, 2017, 11:14am

Thanks @ajnyga,
I hope the OJS XML workflow will be available for journal editors like me with no web programming background. I have used DocxtoXML by @Vitaliy; it is great work. However, my journal uses APA citations and I don’t have the PHP skills to change the citation style to APA.

Vitaliy · November 20, 2017, 12:11pm

After diving into Java and OOXML, I have started work on a new DOCX to JATS XML converter. I will certainly add several citation styles there. It will be written entirely in Java. The most complicated part is creating my own library, because existing libraries, including docx4j and Apache POI, are not designed for such a complicated task (they don’t support every feature I need).
Certainly, my converter will be the best one. But it is long, hard work.

varshilmehta · November 20, 2017, 12:37pm

I didn’t have it either. Trial and error, a little common sense, hard work, and searching for the code will get you whatever you want. Frankly, it is not that difficult. I was already able to implement Vancouver styling along with a lot of other features.

Vitaliy · November 20, 2017, 3:36pm

There are several major limitations that, as far as I’m aware, are not going to be addressed in OTS-service development:

Limited reference list parsing. Even native Microsoft Office citations are not rendered properly (and this is only a matter of extracting them from the appropriate XML nodes).
In-text citations. Similar situation.
Limited table parsing. All the data from OOXML tables, including colspans and rowspans, can be transferred to JATS without data loss, but this does not happen when using this parsing service.
Poor metadata extraction. OOXML has separate XML tags for the article’s title and the authors’ affiliations. Extracting them is not implemented, nor is it intended to be implemented in the future.
Nested elements, like lists (actually, are they supported now?).
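
To illustrate the table point in the list above: in WordprocessingML a horizontal span is stored as `w:gridSpan` and a vertical merge as `w:vMerge` (with `w:val="restart"` on the first cell and a bare `w:vMerge` on the continuation cells), so colspan and rowspan values can be recovered directly. The sketch below is not how OTS, meTypeset, or any converter mentioned here works. It is a small illustration under simplifying assumptions (no nested tables, rows are direct children of `w:tbl`), and `cell_spans` and the sample table are invented for the example.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical two-row table: "Header" spans two columns,
# "Side" is vertically merged across both rows.
SAMPLE_TBL = f"""<w:tbl xmlns:w="{W}">
  <w:tr>
    <w:tc><w:tcPr><w:gridSpan w:val="2"/></w:tcPr><w:p><w:r><w:t>Header</w:t></w:r></w:p></w:tc>
    <w:tc><w:tcPr><w:vMerge w:val="restart"/></w:tcPr><w:p><w:r><w:t>Side</w:t></w:r></w:p></w:tc>
  </w:tr>
  <w:tr>
    <w:tc><w:p><w:r><w:t>A</w:t></w:r></w:p></w:tc>
    <w:tc><w:p><w:r><w:t>B</w:t></w:r></w:p></w:tc>
    <w:tc><w:tcPr><w:vMerge/></w:tcPr><w:p/></w:tc>
  </w:tr>
</w:tbl>"""


def cell_spans(tbl_xml):
    """Return rows of {'text', 'colspan', 'rowspan'} dicts for a w:tbl fragment."""
    tbl = ET.fromstring(tbl_xml)
    result = []
    open_merges = {}  # grid column -> cell dict whose rowspan is still growing
    for tr in tbl.findall("w:tr", NS):
        out_row = []
        col = 0  # grid column where the current cell starts
        for tc in tr.findall("w:tc", NS):
            colspan, vmerge = 1, None
            tcpr = tc.find("w:tcPr", NS)
            if tcpr is not None:
                grid_span = tcpr.find("w:gridSpan", NS)
                if grid_span is not None:
                    colspan = int(grid_span.get(f"{{{W}}}val", "1"))
                vm = tcpr.find("w:vMerge", NS)
                if vm is not None:
                    # A w:vMerge without w:val continues the merge from above.
                    vmerge = vm.get(f"{{{W}}}val", "continue")
            if vmerge == "continue" and col in open_merges:
                # Continuation cell: extend the cell above, emit nothing.
                open_merges[col]["rowspan"] += 1
            else:
                text = "".join(t.text or "" for t in tc.iter(f"{{{W}}}t"))
                cell = {"text": text, "colspan": colspan, "rowspan": 1}
                if vmerge == "restart":
                    open_merges[col] = cell
                else:
                    open_merges.pop(col, None)
                out_row.append(cell)
            col += colspan
        result.append(out_row)
    return result


for row in cell_spans(SAMPLE_TBL):
    print(row)
```

Each resulting dict carries what a `<td>` or `<th>` in a JATS (XHTML-model) table needs for its `colspan` and `rowspan` attributes.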

Also, the main parsing module in OTS is the TEIC stylesheets, run through meTypeset, and this is the source of all of those limitations. I don’t understand why nobody has implemented a more rigorous approach in one of the major programming languages.

ajnyga · November 20, 2017, 4:04pm

Metadata extraction now uses the OJS metadata extensively: see “Support for additional metadata from the OJS database to be sent to OTS” (Pull Request #31, kaschioudi/ojs3-markup, GitHub) and “Support for additional metadata coming from OJS” (Pull Request #95, pkp/ots, GitHub).

But yes, there are still limitations. If those limitations are fixed, the usability of OTS together with the markup plugin and the Texture editor will be excellent: you can do everything without leaving OJS.

At the moment we only have one journal doing JATS XML and they have a workflow of their own (they first convert to markdown and then use that to create XML and PDF). But they have an editorial assistant with special skills that most editors do not have.

Vitaliy · November 20, 2017, 5:26pm

Does the metadata in the example come from OJS? I was referring more to extracting data from the DOCX.
Also, I am hoping to integrate the Java parser into OTS (I have already successfully deployed it on a local machine).
One more question, maybe more to @axfelix: Are you planning to keep these services (OTS and markup plugin) free to use without any costs?

axfelix · November 20, 2017, 5:41pm

Where possible, the plugin will override any of the metadata parsed from the Word or PDF document with the metadata already input into OJS; it uses the parsed metadata as a fallback.
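
The override-with-fallback rule described above could be sketched roughly as follows. This is not the plugin’s actual code: the function name `merge_metadata` and the field names are made up for illustration, and treating empty strings and empty lists as “not entered in OJS” is an assumption.

```python
def is_empty(value):
    """Treat None, empty strings, and empty containers as 'not entered'."""
    return value is None or value == "" or value == [] or value == {}


def merge_metadata(ojs, parsed):
    """Prefer values entered in OJS; fall back to values parsed from the document."""
    merged = dict(parsed)  # start from the parsed metadata (the fallback)
    for key, value in ojs.items():
        if not is_empty(value):
            merged[key] = value  # OJS wins whenever it has a real value
    return merged


# Hypothetical field names and values, for illustration only.
ojs = {"title": "Title from OJS", "keywords": [], "doi": None}
parsed = {
    "title": "Ttile parsed from DOCX",
    "keywords": ["jats"],
    "abstract": "Parsed abstract",
}

merged = merge_metadata(ojs, parsed)
print(merged)
```

Here the title comes from OJS, while the keywords and abstract fall back to what was parsed from the document because OJS has nothing for them.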

If you have a contribution for OTS, I’d be very happy to review a pull request!

As for the cost of the service: PKP has never produced any closed-source code and we aren’t about to start now! The most up-to-date version of OTS and the plugin will always be available on GitHub. We will probably restrict bulk usage of our hosted OTS instance to PKP Publishing Services clients once we officially graduate OTS to a production service, which will happen after the next release of Texture (which will fix inline images and a few other things). But you can always install it yourself, e.g. with: GitHub - axfelix/ots-vagrant

ajnyga · November 20, 2017, 5:47pm

I see that @axfelix beat me to it. But yes, the idea was to create the whole front section from the OJS metadata, since all the needed data is already there when you go to production. With things like ORCIDs, that is also the safest way of adding them to JATS XML, since ORCID guidelines suggest that you should not accept hand-typed ORCIDs.

IMHO it would make sense for the Texture editor to be limited to editing the body and back sections.
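
As a rough illustration of generating part of the front section from stored metadata rather than from typed text, the sketch below builds a JATS `<contrib-group>` with `<contrib-id contrib-id-type="orcid">` elements. The author dictionary keys (`given_name`, `family_name`, `orcid`) and the function name are hypothetical, not the OJS schema or the plugin’s code, and the ORCID shown is the fictitious sample iD from ORCID’s own documentation.

```python
import xml.etree.ElementTree as ET


def build_contrib_group(authors):
    """Build a JATS <contrib-group> element from a list of author dicts."""
    group = ET.Element("contrib-group")
    for author in authors:
        contrib = ET.SubElement(group, "contrib", {"contrib-type": "author"})
        orcid = author.get("orcid")
        if orcid:
            # The iD is copied from stored metadata, never re-typed by hand.
            contrib_id = ET.SubElement(
                contrib, "contrib-id", {"contrib-id-type": "orcid"}
            )
            contrib_id.text = orcid
        name = ET.SubElement(contrib, "name")
        ET.SubElement(name, "surname").text = author["family_name"]
        ET.SubElement(name, "given-names").text = author["given_name"]
    return group


# Hypothetical author records standing in for metadata already stored in OJS.
authors = [
    {
        "given_name": "Josiah",
        "family_name": "Carberry",
        "orcid": "https://orcid.org/0000-0002-1825-0097",
    },
    {"given_name": "Jane", "family_name": "Doe"},
]

xml_text = ET.tostring(build_contrib_group(authors), encoding="unicode")
print(xml_text)
```

Authors without a stored ORCID simply get no `<contrib-id>`, so nothing unverified ends up in the XML.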

marc · June 6, 2018, 10:28am

@axfelix is there any docker recipe for OTS?

axfelix · June 6, 2018, 2:39pm

Just the vagrant I linked a couple posts up that I’m aware of – but if you want to dockerize it, go for it! I can do that myself at some point if there’s interest.

marc · June 7, 2018, 10:54am

Sounds like a nice summer project.

Looks outdated, but should we use this as a starting point?
https://github.com/axfelix/xmlps-docker

BTW, for the last 2 years I have been waiting for a mature-enough project to cover the whole JATS workflow. You have all done an impressive job over about a decade and now we can see the fruits.

IMHO, the suite formed by “OTS+ojs3-markup+LensGalley” sounds like a winning horse.

@ajnyga and @Vitaliy also made great contributions that can help in specific contexts.

Thank you all, guys.

Looking for time to contribute in some way… even testing and sending you back feedback.

Cheers,
m.

axfelix · June 7, 2018, 3:34pm

Oh, I forgot all about that docker – I think I abandoned it when I decided to maintain the vagrant instead, but it may actually work fine if I up-port the xmlps.sh from the vagrant repo (it just won’t be much lighter-weight because of all the dependencies).

Yeah, this is my hope too! It’s been a long and winding road to be sure, but I think we finally have a good kit of tools together to solve the problem in a way that will fit most workflows.