DOCX to JATS XML converter


#92

I have made a custom EndNote style that formats references in the style required by the plugin. Download: http://jmri.org.in/files/J%20Med%20Res%20Innov.ens


#93

Hi @Vitaliy, I am working on a simple form for filling in the front-matter fields of the final file.

The idea is to receive the XML from your converter as input and fill in the extra fields, such as the title, abstract, or keywords.
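
A minimal Python sketch of that idea, for illustration only: it takes the converter's XML as a string and writes the title, abstract, and keywords into the JATS front matter. The function names (ensure_child, fill_front_matter) are hypothetical, the standard JATS element names (front, article-meta, title-group, article-title, abstract, kwd-group, kwd) are assumed, and the sketch does not enforce the element order that the JATS DTD expects inside article-meta.

```python
import xml.etree.ElementTree as ET


def ensure_child(parent, tag):
    """Return the first child named `tag`, creating it if it is missing."""
    child = parent.find(tag)
    if child is None:
        child = ET.SubElement(parent, tag)
    return child


def fill_front_matter(jats_xml, title, abstract, keywords):
    """Hypothetical helper: add form values to the <front> of a JATS document."""
    root = ET.fromstring(jats_xml)

    # <front> must come before <body>, so insert it first if it is missing.
    front = root.find("front")
    if front is None:
        front = ET.Element("front")
        root.insert(0, front)

    meta = ensure_child(front, "article-meta")

    title_group = ensure_child(meta, "title-group")
    ensure_child(title_group, "article-title").text = title

    abstract_el = ensure_child(meta, "abstract")
    ensure_child(abstract_el, "p").text = abstract

    # Replace any existing keyword group with the values from the form.
    old_group = meta.find("kwd-group")
    if old_group is not None:
        meta.remove(old_group)
    kwd_group = ET.SubElement(meta, "kwd-group")
    for keyword in keywords:
        ET.SubElement(kwd_group, "kwd").text = keyword

    return ET.tostring(root, encoding="unicode")
```

The form handler would call fill_front_matter with the uploaded XML and the submitted field values, then offer the returned string as the final file.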


#94

Hi @josuevalrob

DOCX2JATS is more of an intermediate project that I put together in a short time for our own needs. Its limitations include reliance on third-party stylesheets (TEIC). These are also used in meTypeset, which in turn is included in the Open Typesetting Stack. A lot of data is also lost in the XSL transformation. As I’m finishing the JATS to PDF transformation for JATS Parser, I’m planning to return to writing a better DOCX parser. It will be written in pure PHP, so it can be integrated into OJS without problems. This means that all metadata for articles will come from OJS.


#95

Dear @Vitaliy
Thank you for your great work!
But I haven’t been able to complete it… Could you tell me how your work on this project is progressing?
I will be grateful.
Kind regards,
McKonagan


#96

I’ve just finished the JATS Parser plugin for displaying JATS as HTML and PDF. I’m planning to start creating the main part of the DOCX to JATS converter in 2-3 weeks. The first version will transform all article body elements: paragraphs, formatted text (italic, bold, underlined…), tables with rows and cells (including complex ones with merged cells), lists (including nested ones), in-text references (with some limitations), figures, and article sections with their titles. I expect all these elements to be transformed fully into JATS without the need for any manual correction. This work can take anywhere from 2 to 6 months.

If this plugin proves useful for the community, I’ll expand it to parse the reference list and additional formatting (like text color, table cell width and height, and footnotes). But it won’t parse any metadata, as it will write them into the JATS XML from OJS.
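
To illustrate the kind of mapping involved for formatted text, here is a rough Python sketch (not the planned PHP implementation) that turns one WordprocessingML paragraph into a JATS p element with bold, italic, and underline. The function names (is_on, paragraph_to_jats) are hypothetical, and the sketch assumes runs are direct children of the paragraph, so runs nested in hyperlinks or other wrappers are ignored.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# DOCX run property -> JATS inline element (assumed mapping for this sketch).
INLINE = [("b", "bold"), ("i", "italic"), ("u", "underline")]


def is_on(rpr, name):
    """True if the toggle property w:<name> is present and not switched off."""
    el = rpr.find(f"w:{name}", NS)
    if el is None:
        return False
    return el.get(f"{{{W}}}val") not in ("0", "false", "none")


def paragraph_to_jats(w_p):
    """Hypothetical helper: convert a w:p element into a JATS <p> element."""
    p = ET.Element("p")
    last = None  # last inline element appended to <p>, used for trailing text
    for run in w_p.findall("w:r", NS):
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        if not text:
            continue
        rpr = run.find("w:rPr", NS)
        tags = [jats for w, jats in INLINE if rpr is not None and is_on(rpr, w)]
        if not tags:
            # Plain text goes into <p>.text or the tail of the previous inline.
            if last is None:
                p.text = (p.text or "") + text
            else:
                last.tail = (last.tail or "") + text
            continue
        # Nest the inline elements, e.g. <bold><italic>text</italic></bold>.
        parent = p
        for tag in tags:
            parent = ET.SubElement(parent, tag)
        parent.text = text
        last = p[-1]
    return p
```

A real converter would also need to handle tables, lists, figures, and section headings, which this sketch leaves out.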


#97

@Vitaliy

Thank you for your contributions to the OJS community. We are currently using your JATS Parser plugin to transform some legacy XML files within OJS to HTML.

We are very interested in your work on a DOCX to JATS converter. We’ve tested a number of workflows with the OTS stack, but we are still relying (for the most part) on manual encoding to XML for our journals publishing in full text. We are part of the community that would find the converter you are working on very useful.

Richard Higgins
IUScholarWorks | Scholarly Communication
Indiana University Libraries
https://scholarworks.iu.edu/journals/


#98

Hi @Vitaliy
I’m currently playing around with that parser. It’s a pretty helpful tool!

I have 2 questions:

Thanks
Jan


#99

The hard part here is handled by the TEIC XSLT stylesheets. I don’t know in which cases they misparse embedded figures as simple paragraphs, and I cannot do anything about it here. This stylesheet is responsible for parsing figures: https://github.com/Vitaliy-1/DOCX2JATS/blob/master/stylesheets/docx/from/graphics.xsl

In the second version of the DOCX parser I’ll use my own parsing mechanism, but I don’t know when it will be ready.

Example of conference paper reference: https://regex101.com/r/oYfrYx/1


#100

@Vitaliy
Thank you!