I am trying to import back-issues along with articles into OJS 3.x.
I am using the native importer inside OJS 3, but I have problems configuring the XML for import.
So far I have tried this XML for importing the HTML into an article:
But it seems the native importer requires the HTML to be Base64-encoded within the XML file. How do I import HTML files along with the issue and article? The documentation only describes version 2.x. The XML above is based on an export I made from OJS, with content I created manually.
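For anyone generating the import file with a script, Base64-encoding the HTML is a small step. The following is only a sketch: the `encoding="base64"` attribute on `embed` and the helper name `make_embed` are assumptions that should be checked against an XML file exported from your own OJS version.

```python
import base64
import xml.etree.ElementTree as ET


def make_embed(file_bytes: bytes) -> ET.Element:
    """Wrap raw file bytes in an <embed> element as Base64 text."""
    # Assumption: the importer expects encoding="base64" on <embed>.
    embed = ET.Element("embed", {"encoding": "base64"})
    embed.text = base64.b64encode(file_bytes).decode("ascii")
    return embed


# In a real run the bytes would come from open(path, "rb").read().
sample = b"<html><body><p>Hello</p></body></html>"
embed = make_embed(sample)
xml_text = ET.tostring(embed, encoding="unicode")
print(xml_text)
```

The resulting element would go inside the `revision` element of a `submission_file`.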
You can see an example of a full issue import XML here: https://github.com/pkp/ojs/blob/master/tests/data/60-content/issue.xml. An article import would be similar, just starting with the element “article” (and if the article should be assigned to an issue, the issue_identification element should be added within the article element).
If your article file is online, i.e. accessible under a URL, then instead of the element “embed” (in the element “revision” in the element “submission_file”) you could use <href src="http://..." />
OJS would then (try to) import the file from that URL (defined in the attribute “src” in the element “href”).
Maybe this would be an easier solution for you?
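If the XML is generated by a script, the `href` variant is the simpler one to produce, since no file content has to be read or encoded. A minimal sketch, where the helper name `make_href` and the example URL are made up for illustration:

```python
import xml.etree.ElementTree as ET


def make_href(url: str) -> ET.Element:
    """Build a self-closing <href> element pointing at a file URL."""
    return ET.Element("href", {"src": url})


# Hypothetical URL; replace with the real location of the article file.
url = "https://example.org/files/article.html"
href = make_href(url)
xml_text = ET.tostring(href, encoding="unicode")
print(xml_text)
```

Building the element with a library rather than by string concatenation also guarantees that the tag is closed and that special characters in the URL are escaped.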
When trying to import this using the native XML importer in OJS 3.1.1-2, I get this message:
Validation errors:
Opening and ending tag mismatch: href line 34 and revision
Opening and ending tag mismatch: revision line 32 and submission_file
Opening and ending tag mismatch: submission_file line 31 and article
Premature end of data in tag article line 2
The document has no document element.
The files are in the right location, so why is there a tag mismatch? Am I setting the tags wrong?
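A cascade of "tag mismatch" errors like this usually means the file is not well-formed XML, which can be checked locally before uploading. The sketch below assumes, purely for illustration, that the cause is an `href` element that was never closed; the thread does not show the actual file.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text: str):
    """Return None if the XML parses, else a message with the error position."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position
        return f"line {line}, column {column}: {err}"
    return None


# Illustrative fragment only: <href> is opened but never closed.
broken = """<article>
  <submission_file>
    <revision>
      <href src="http://example.org/a.html">
    </revision>
  </submission_file>
</article>"""

# Closing the element in place ("/>") makes the fragment well-formed.
fixed = broken.replace('a.html">', 'a.html"/>')

print(check_well_formed(broken))
print(check_well_formed(fixed))
```

Note that this only checks well-formedness, not validity against the OJS schema.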
Haha, yes! I totally missed that one. Thank you for finding it.
However, my imported HTML file does not show up as a production galley on the site. Do I need to import it separately, or is that another element?
I noticed that I can set a stage attribute on the submission file element, but it still doesn’t show up as a published galley on the site. I am trying to generate an XML file that will import a few thousand HTML-based papers into a journal.
All the html-files will reside on the server once import is ready to go, and I need them to be imported along with any images they link to. (I can generate a list of images into the xml, but need some info on this)
I think you should have id="13" in the submission_file_ref element of your article_galley element, i.e. this: <submission_file_ref id="13" revision="1"/> , because your submission_file element has that id. The system has to know the relationship, i.e. which file belongs to the galley, and this is done with that id. The revision number seems to be OK.
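With thousands of generated files, a mismatch like this is easy to miss, so it may be worth checking the references automatically. The sketch below assumes that the revision number is stored in a `number` attribute on `revision` and that the file uses the `http://pkp.sfu.ca` namespace; both should be verified against an exported file. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def local(tag: str) -> str:
    """Strip an XML namespace prefix such as {http://pkp.sfu.ca} from a tag."""
    return tag.rsplit("}", 1)[-1]


def unresolved_galley_refs(xml_text: str):
    """List (id, revision) pairs referenced by galleys but not declared by any submission_file."""
    root = ET.fromstring(xml_text)
    declared = set()
    for el in root.iter():
        if local(el.tag) == "submission_file":
            for rev in el:
                if local(rev.tag) == "revision":
                    # Assumption: the revision number lives in the "number" attribute.
                    declared.add((el.get("id"), rev.get("number")))
    missing = []
    for el in root.iter():
        if local(el.tag) == "article_galley":
            for ref in el:
                if local(ref.tag) == "submission_file_ref":
                    key = (ref.get("id"), ref.get("revision"))
                    if key not in declared:
                        missing.append(key)
    return missing


# Illustrative fragment: the galley points at id 12, but the file has id 13.
sample = """<article xmlns="http://pkp.sfu.ca">
  <submission_file id="13">
    <revision number="1" filename="paper.html"/>
  </submission_file>
  <article_galley>
    <submission_file_ref id="12" revision="1"/>
  </article_galley>
</article>"""

print(unresolved_galley_refs(sample))
```

An empty list means every galley reference resolves to a declared file revision.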
EDIT: I will have to take a look about the images embedded in the HTML…
The images inside an HTML file are often embedded like so: <p><img src="equatn.gif"></p>
The file names of the images are indexed and saved into a list inside my software, but I need to know how to construct the XML elements for importing these image files. All files are indexed, and the XML is programmatically generated; I just need to know how to configure the actual XML element to import an image file into OJS.
I forgot to mention that I managed to import a HTML galley into OJS, and it showed up on the website too. Now I only need the images and we’re good to go.
Ok, thank you.
Is there any other way to import images from a html galley then? Maybe manually import them over ftp to a specific folder where all images are located?
It is rather important that images follow these html galleys, as they represent research data and similar.
You could upload the images manually, but you would need to use the web UI (in the article production stage, galleys grid) in order for all information to be correctly saved in the DB and, accordingly, in the files folder, so that OJS knows that they belong to that HTML file…
Also, then you should not include them in the import XML file.
How many images do you have/would need to import?
Thank you for your support. We’re currently in a suspended state. I will return later this year to continue this task. Your support is much appreciated.
The number of images is about 5000, as GIFs, PNGs and JPGs. I am currently trying to retrace my steps and remember my thought pattern on this problem.
I am manually editing the XML for now, adding the links and other values to import. I will then use this as a sort of template for the rest of the project’s files. There are somewhere between 12000 and 15000 files that need to be imported.
@bozana I saw that there were some additions to the thread over at GitHub regarding the issue of importing dependency files of the HTML. However, I cannot work out how to actually reference the images in my HTML file so that they are imported into OJS along with the HTML file.
Here is my current submission file element in the XML-file,
Here you would need to adapt the id, filename, date_uploaded, date_modified, filetype, uploader as well as src attribute. Also the value for the image name element.
The ID of this artwork_file element is referenced nowhere.
The artwork_file element references your HTML submission_file with the id=17 and revision=1.
And your article_galley element also references your HTML submission_file with the id=17 and revision=1.
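Since the import XML in this thread is generated programmatically, the description above could be turned into a small generator. This is only a sketch under assumptions: the child order inside `revision`, the `number` attribute, and the default `uploader` and `date` values are guesses or placeholders, and real exports carry further attributes not shown here. Compare the output with the snippet above and with a file exported from your own installation before relying on it.

```python
import mimetypes
import xml.etree.ElementTree as ET


def make_artwork_file(file_id, image_name, image_url, html_file_id,
                      html_revision=1, uploader="admin", date="2019-01-01"):
    """Build one <artwork_file> element for an image that an HTML file depends on."""
    # Guess the MIME type from the extension, e.g. "image/gif" for .gif.
    filetype = mimetypes.guess_type(image_name)[0] or "application/octet-stream"
    artwork = ET.Element("artwork_file", {"id": str(file_id)})
    revision = ET.SubElement(artwork, "revision", {
        "number": "1",  # assumed attribute name for the revision number
        "filename": image_name,
        "date_uploaded": date,   # placeholder date
        "date_modified": date,   # placeholder date
        "filetype": filetype,
        "uploader": uploader,    # placeholder user name
    })
    name = ET.SubElement(revision, "name")
    name.text = image_name
    # Points back at the HTML submission_file the image belongs to.
    ET.SubElement(revision, "submission_file_ref",
                  {"id": str(html_file_id), "revision": str(html_revision)})
    ET.SubElement(revision, "href", {"src": image_url})
    return artwork


# Hypothetical values: image id 18 depends on HTML file id 17, revision 1.
artwork = make_artwork_file(18, "equatn.gif",
                            "http://example.org/images/equatn.gif",
                            html_file_id=17)
print(ET.tostring(artwork, encoding="unicode"))
```

Each image gets its own id, while `submission_file_ref` always carries the id and revision of the HTML file.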
Thank you very much for this snippet.
One question: does it matter where in the structure of my XML I put it?
Does it belong within or , or anywhere within ?
@bozana
First off, I would like to say thank you for all your help and your enormous patience!
My OJS version is 3.1.1.4 using Mysqli on MariaDB 10.1.26 (Debian 9).
I have been trying and trying to import these HTML files and their dependent images, and I have seen some inconsistency in how OJS interprets the incoming XML (Native XML plugin).
In the first attempts, I followed your example and used the XML snippets from the mentioned sample (3.1.2 native). But OJS doesn’t import the HTML file at all, nor does it show it.
I don’t see any other difference than the name of the article galley, is this why the html-file won’t show up on front end? I am sure it is not, I would like it to be named HTML for better usability. But why is the sample XML not working, while this “custom” snippet is?
breaks the import somehow, and HTML-files no longer show up on front end. Removing the -element makes OJS import the HTML again, but without images. Upon importing the XML files OJS gives an error message stating “The revision “1” for submission file “17” would create a duplicate record”.
Am I doing something wrong? (Obviously)
Could you correct my code, please?
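One thing that can be checked mechanically for a "would create a duplicate record" message is whether two file elements in the same XML declare the same id and revision. Whether that is the cause here is not established in this thread; the sketch below only shows the check, and it assumes a `number` attribute on `revision` and the element names `submission_file` and `artwork_file`.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Element names treated as file declarations (assumption based on this thread).
FILE_TAGS = {"submission_file", "artwork_file"}


def local(tag: str) -> str:
    """Strip an XML namespace prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def duplicate_revisions(xml_text: str):
    """Return (id, revision) pairs that are declared by more than one file element."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for el in root.iter():
        if local(el.tag) in FILE_TAGS:
            for rev in el:
                if local(rev.tag) == "revision":
                    counts[(el.get("id"), rev.get("number"))] += 1
    return sorted(key for key, n in counts.items() if n > 1)


# Illustrative fragment: the image reuses the id of the HTML file.
sample = """<article xmlns="http://pkp.sfu.ca">
  <submission_file id="17">
    <revision number="1" filename="paper.html"/>
  </submission_file>
  <artwork_file id="17">
    <revision number="1" filename="equatn.gif"/>
  </artwork_file>
</article>"""

print(duplicate_revisions(sample))
```

If the list is empty, the ids in the file are unique and the duplicate is more likely to come from records that already exist in the database from an earlier import attempt, which is also only a guess.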