DOCX to JATS XML converter

#133

What operating system are you using?

#134

I use Windows 10.

#135

E.g.: https://www.howtogeek.com/235101/10-ways-to-open-the-command-prompt-in-windows-10/

#136

I’ve made the first non-production release of the DOCX to JATS XML Converter Plugin for OJS 3.1+: https://github.com/Vitaliy-1/docxConverter/releases/tag/0.5.0.0

It produces output suitable for the Texture Plugin.

For those who are interested in testing, there are some instructions and examples here: https://github.com/Vitaliy-1/docxConverter

Requirements:

  • PHP 7.2+
  • php-xml (usually installed by default).
  • php-zip (usually installed by default).
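
The zip and XML requirements make sense because a DOCX file is a ZIP archive whose main text lives in the XML part `word/document.xml`. The following is a minimal Python sketch of that idea, for illustration only: it is not the plugin's PHP code, and the helper name `docx_paragraphs` is hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import List

# WordprocessingML namespace used by the main document part of a DOCX file.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(data: bytes) -> List[str]:
    """Return the plain text of every paragraph in a DOCX given as bytes.

    Hypothetical helper: it only reads word/document.xml and ignores
    styles, footnotes, images and everything else a real converter handles.
    """
    # Step 1: open the DOCX as a ZIP archive and read the main XML part.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    # Step 2: join the text runs (w:t) found inside each paragraph (w:p).
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        text = "".join(t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs
```

To try it on a real file, pass the bytes read with `open(path, "rb").read()`.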
#137

Thanks a lot @Vitaliy
We are pleased to test it

#138

Hi @Vitaliy
I have tried it with the latest oldGregg. It is great, although some tables need a bit of work in Texture. Thanks so much for your work.

I have one issue:
It seems that the citation [1] is not parsed and linked to the reference. Is this not included in the alpha release? Or how could the link be made in the XML file?

#139

You mean at the front-end? Yeah, I need to fix it. It’s parsed but not linked.

#140

Yup… Waiting for the fix.
Is it planned for the next release?

#141

Yes, but before that I want to make a production release of the DOCX Converter Plugin.

#142

Hi @Vitaliy
Is there any manual way to link the citation and the reference while waiting for the fix in the next release? We are going to upgrade our journal to the latest OJS and oldGregg soon and publish a new issue.

#143

Hi @Vitaliy,
I installed the plugin without any problems, but when I click on the link nothing happens. Did I forget to activate something?
Thanks!

Bye
Tiziano

#144

It requires modifying the code of the JATS Parser library. Most probably here: https://github.com/Vitaliy-1/JATSParser/blob/master/src/JATSParser/HTML/Text.php#L42
I think I missed something there.

#145

Hi @Tiziano,

So, you are pressing the Convert to JATS XML button but nothing happens, right?
If so, there should definitely be a fatal error in the PHP logs that indicates the reason. Let me know if you find something.

#146

Hi @Vitaliy, great work!
Okay, the reason was apparently that I tested it on OJS 3.1.1.4. I installed it on version 3.1.2 and it works: it generates the XML file (but only if the file has the .DOCX extension; for .DOC files the button does not appear).
I’ll tell you two things. The first is that Edit Texture doesn’t work in Safari, but it works in Firefox (I didn’t check Chrome). Once I make corrections in Edit Texture, does the XML file update automatically in OJS?
The second is that the XML file produced by the DOCX Converter does not work with the eLens View plugin.
I hope these observations help you. A good plugin like this would be great for us.

Bye
Tiziano

#147

DOC and DOCX are quite different formats. It’s possible to convert DOC to DOCX with PHP, but it requires a heavy 3rd-party library. I’m not sure it is needed, because the same can be accomplished with almost any text editor (like MS Word or LibreOffice).
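
To illustrate how different the two containers are: a DOCX file is a ZIP archive and starts with the bytes `PK\x03\x04`, while a legacy DOC file is an OLE2 compound file and starts with `D0 CF 11 E0 A1 B1 1A E1`. Below is a small Python sketch that tells them apart by these signatures. The function name `sniff_word_format` and its return labels are hypothetical, and a ZIP signature only means the file may be a DOCX, since other formats use ZIP too.

```python
# File signatures ("magic bytes") of the two container formats.
ZIP_MAGIC = b"PK\x03\x04"                            # ZIP archive (DOCX, ODT, ...)
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")       # OLE2 compound file (legacy DOC)


def sniff_word_format(header: bytes) -> str:
    """Classify a file by its first bytes.

    Returns "docx-like" for ZIP containers, "doc-like" for OLE2
    containers and "unknown" otherwise. This is only a heuristic.
    """
    if header.startswith(ZIP_MAGIC):
        return "docx-like"
    if header.startswith(OLE2_MAGIC):
        return "doc-like"
    return "unknown"
```

Reading the first 8 bytes of the uploaded file is enough, e.g. `sniff_word_format(open(path, "rb").read(8))`.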

Can you specify the version of the Safari browser? You can open an issue on the plugin’s page with the error from the browser’s console: https://github.com/pkp/texture/issues but it may be a problem in Texture itself rather than in the plugin.

Yes. You need to press the save button (upper-right corner, as I remember).

My ability to fix the Lens Viewer plugin is quite limited. What I’m definitely planning is to make the output compatible with the JATS Parser Plugin.

Sure, no doubt! My first aim now is to make sure that it is compatible with DOCX files created from different sources. It’s problematic without tests from users.

#148

Dear @Vitaliy

I found some cases:

First, it seems that the table is not displayed properly in the article details. I tried your sample XML https://github.com/Vitaliy-1/docxToJats/blob/78b9c0fa46fa77de427b63cf93c8254b1acbd2c5/samples/output/test_jats.xml here https://dryam.website/index.php/jer/article/view/7. It is OJS 3.1.2 with the latest oldGregg.

Second, is a citation such as [1,2] parsed in the sample? Or is it just the code problem you explained?

Third, I converted a DOCX with some tables that contain over 100 words (a table with that many words may sound strange, but we have some). Only half of the text or less was parsed. I then tried editing via the Texture plugin (adding the missing text), but the changes were not saved, so I added the text manually in the XML file. Is there a limit on the number of words in a table for it to be parsed properly?

#149

As far as I remember, I probably fixed it in the JATS Parser library but haven’t updated the Old Gregg theme. Thanks, I’ll check.

I haven’t added support for citations to the DOCX Converter Plugin yet. I’m thinking about possible approaches. In DOCX, citations and references have no special markup unless specific MS Word or LibreOffice citation tools are used. I suppose the best option would be to integrate with Zotero, which can be used as a plugin in those text editors.
Until then, they can be added manually with the Texture plugin.
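
For anyone who prefers to prepare the links outside Texture: in JATS, an in-text citation is an `<xref ref-type="bibr" rid="...">` element whose `rid` matches the `id` of a `<ref>` in the reference list. The Python sketch below turns bracketed numeric citations such as [1,2] in a plain-text fragment into such elements. It is only a rough illustration, not part of the plugin: the function name `link_citations` and the `bib` id prefix are assumptions, so the prefix must be changed to whatever ids your `<ref>` elements actually use.

```python
import re
from xml.sax.saxutils import escape

# Matches bracketed numeric citations like [1] or [1, 2, 3].
CITATION = re.compile(r"\[(\d+(?:\s*,\s*\d+)*)\]")


def link_citations(text: str, id_prefix: str = "bib") -> str:
    """Wrap each cited number in a JATS <xref ref-type="bibr"> element.

    `id_prefix` is an assumption: rid values must match the ids of the
    <ref> elements in your own reference list.
    """
    def replace(match: "re.Match[str]") -> str:
        numbers = [n.strip() for n in match.group(1).split(",")]
        links = [
            f'<xref ref-type="bibr" rid="{id_prefix}{n}">{n}</xref>'
            for n in numbers
        ]
        return "[" + ",".join(links) + "]"

    # Escape &, < and > first so the plain text is safe to place in XML.
    return CITATION.sub(replace, escape(text))
```

The input must be plain text, not existing XML, because markup characters are escaped before the links are inserted.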

Hmm, tables should be converted normally. Can you send me a problematic DOCX file? But I can take a look only in 1-2 weeks.

#150

I think it is a good idea if possible.

#151

As far as I know, it works best with Chrome.

#152

How is this going? We plan to upgrade soon.