DOCX to JATS XML converter

I’m planning to fix this as soon as I return from vacation.


Regarding the issue with tables, are you comfortable working with Git? I’m planning to make many changes before the next release, and this can take a month or two given the amount of styling modifications and new pages needed. Right now the master branch is free of this issue. To download it with submodules using Git, run: git clone --recurse-submodules https://github.com/Vitaliy-1/oldGregg.git

For the next upgrade, we made some modifications to OldGregg, mainly in indexJournal.tpl.
What do you suggest for handling our modifications alongside the continuing development of OldGregg?
I just read https://help.github.com/en/articles/set-up-git; I have never used Git before.

The next version will be completely different, so it will be impossible to adapt previous changes to it. After that, the theme will be stable, and changes can be applied through a child theme or by maintaining your own version with Git.


I think waiting for the upcoming version is a good option. One or two months are fine.

The second beta release for the DOCX to JATS Converter Plugin: Release docxConverter-beta2 · Vitaliy-1/docxConverter · GitHub

Thanks to everyone who tests the plugin and helps make it better!


The third beta release for DOCX to JATS Converter Plugin: Release docxConverter-beta3 · Vitaliy-1/docxConverter · GitHub

Special thanks to @marc for testing and detecting bugs in the previous release :slight_smile:

The major feature of the new release is figure extraction; only JPEG and PNG formats are supported. Images are now extracted from the DOCX file and attached to the galley as supplementary files. The output is compatible with the Texture plugin. I’d appreciate testing, especially of this new feature; in particular, I’d like to know whether it works with different DOCX layouts.
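
For anyone who wants to check in advance which figures in a document fall into the supported formats, here is a minimal sketch. It is not the plugin’s own code (the plugin is written in PHP); it only relies on the fact that a DOCX file is a ZIP archive that normally keeps embedded images under word/media/. The function name list_docx_images is made up for this illustration.

```python
import io
import zipfile
from pathlib import PurePosixPath
from typing import List

# Extensions matching the formats named in the post (JPEG and PNG).
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_docx_images(docx_bytes: bytes) -> List[str]:
    """Return archive paths of JPEG/PNG files stored under word/media/.

    Hypothetical helper for illustration; the converter may locate
    images differently (for example through relationship files).
    """
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if name.startswith("word/media/")
            and PurePosixPath(name).suffix.lower() in SUPPORTED_EXTENSIONS
        )
```

Calling list_docx_images(open("article.docx", "rb").read()) on a manuscript (the file name is only an example) lists the images that would be candidates for extraction; anything in another format, such as EMF or TIFF, will not appear in the result.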

Thanks to you @Vitaliy

Looks really promising. I will test this new release during the next week.

Take care,
m.

Dear @Vitaliy
I have tested the plugin and found the following issues:

  • The published, accepted and received dates are always displayed one day earlier than the dates originally entered. Even though the XML file shows the correct dates, e.g. a published date of 26-Sep-2019, the front end displays 25-Sep-2019.
  • The DOI is not shown anywhere on the front end.
  • Keywords are entered and visible when I edit with Texture, but they are not displayed on the front end.
  • I have a particular issue with my manuscript type. The first page of my journal’s DOCX template is a table with the abstract and all the relevant details. The reference manuscript is here (manuscript). DOCX to JATS converts it to an XML file, yet when I try to edit that file with the Texture plugin, it gives the following error message: “ERROR: Cannot read property ‘children’ of undefined”

So I experimented a little, and when I deleted the table on the first page of the same manuscript, the Texture plugin opened the file for editing.
Yet it does not recognize the headings and subheadings from the manuscript.
May I ask for your assistance in solving these issues?
I am using the following plugins

Thanks

Hi @seisense,

Thanks for testing, I’ll look into those issues soon.


Hi @Gokmen_ARSLAN,

The docxToJats library is missing from your plugin installation. Are you using git to install it?

To install with submodules from git, use: git clone --recurse-submodules https://github.com/Vitaliy-1/docxConverter.git. If the plugin is already installed, you can use: git submodule update --init --recursive. Or reinstall from the packaged release: Release docxConverter-beta3 · Vitaliy-1/docxConverter · GitHub

Thank you for your reply. I installed the latest release of the converter: Release docxConverter-beta3 · Vitaliy-1/docxConverter · GitHub

I am sorry, I don’t understand how to install the submodules from git using: git clone --recurse-submodules https://github.com/Vitaliy-1/docxConverter.git
Can you help me?

When you install the package from a release (docxConverter-0.5.2.tar.gz), do you still receive that error?

Yes Vitaliy, I still receive it.

Something is definitely wrong with the archive. Is the docxToJats folder in the DOCX Converter plugin installation empty? Are you sure that you didn’t download the source code? This error can happen only if the source code was downloaded instead of the release package. Can you double-check?

The other way to install the plugin is through git. First, you need to install git itself: https://git-scm.com/downloads
Once again, though, installing the packaged release through the admin dashboard is enough.
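
The check for an empty docxToJats folder can also be scripted. The following is only a rough sketch: the helper name submodule_is_missing is invented for this example, and the plugin path in the usage note is an assumption about a typical OJS layout, so adjust it to your installation.

```python
from pathlib import Path


def submodule_is_missing(plugin_dir: str, submodule: str = "docxToJats") -> bool:
    """Return True when the submodule folder is absent or has no entries.

    An empty folder is what you typically get after downloading the
    repository source code instead of the packaged release, or after
    cloning without --recurse-submodules.
    """
    folder = Path(plugin_dir) / submodule
    return not folder.is_dir() or not any(folder.iterdir())
```

For example, submodule_is_missing("plugins/generic/docxConverter") returning True (path assumed) would point to the situation described above, in which case reinstalling the packaged release or running git submodule update --init --recursive inside the plugin folder should fill it.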

Hi @seisense,

regarding your comments:

The DOCX format (OOXML) is more about styling than structure, which makes creating a JATS XML file from it a real headache. Currently, all metadata in the resulting XML comes from OJS. Unfortunately, just parsing the DOCX archive isn’t enough to extract the metadata from the article. This applies to references as well. I’ll explore the possibility of integrating a machine learning approach into the process, but that is a long-term prospect. For now, the best way to extract those is Grobid, a machine learning tool that accepts PDF files and produces TEI XML output, which is comparable in structure to JATS XML. Thus, I recommend using DOCX Converter to extract the article’s body.

I’ll check this issue later; there are probably some inconsistencies between the DOCX Converter output and Texture’s requirements. Regarding the abstract, once again, the converter relies solely on the article’s metadata from OJS, so I recommend stripping it from the document before conversion, since it will be retrieved from OJS anyway. Apart from parsing the article’s body, I’m investigating how to proceed with references, and I’m planning to add some sort of support for them later.

I’ve identified the problem and will make a fix soon. In general, I’d recommend using the default styles for headings; in your example, the heading uses a custom style based on the default one. This case is manageable, as I’ll add a check for the base style, but I cannot predict all the possible custom styles that document editors allow for headings.
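
To illustrate what a base-style check could look like, here is a minimal sketch. It is not the converter’s actual fix (the plugin is written in PHP). It assumes the default heading styles carry the IDs Heading1 to Heading9, which is the case for English-language Word documents but may differ in localized ones. It reads word/styles.xml from the DOCX archive and follows each style’s w:basedOn link until it reaches a default heading; the function names are invented for this example.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict, Optional

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def load_base_styles(styles_xml: str) -> Dict[str, Optional[str]]:
    """Map each paragraph style ID to the ID of the style it is based on."""
    root = ET.fromstring(styles_xml)
    parents: Dict[str, Optional[str]] = {}
    for style in root.findall("w:style", NS):
        if style.get(f"{{{W}}}type") != "paragraph":
            continue
        based_on = style.find("w:basedOn", NS)
        parents[style.get(f"{{{W}}}styleId")] = (
            based_on.get(f"{{{W}}}val") if based_on is not None else None
        )
    return parents


def heading_level(style_id: Optional[str],
                  parents: Dict[str, Optional[str]]) -> Optional[int]:
    """Follow the basedOn chain; return 1-9 for headings, else None."""
    seen = set()  # guards against a malformed, circular basedOn chain
    while style_id is not None and style_id not in seen:
        seen.add(style_id)
        match = re.fullmatch(r"Heading([1-9])", style_id)
        if match:
            return int(match.group(1))
        style_id = parents.get(style_id)
    return None
```

With this approach, a custom style based on Heading2 resolves to level 2, while a custom style that merely looks like a heading (bold, larger font) but is based on Normal resolves to None, which is the case that cannot be predicted in general.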


Thanks for your detailed reply.

Thanks for the nice plugin. The conversion works in OJS 3.2.01, but the “Edit with Texture” option is not shown.

Hi @Amin_Salehi-Abargoue,

Is the Texture plugin installed?