Plugin triggered on file upload to a journal galley & migration for a new field

Jose_Ares · March 20, 2019, 6:08pm

Hi all,

I’m trying to create a plugin for OJS 3 that, when an XML/text file is uploaded to a new or existing galley, copies its contents into a new field in the DB (for example, in ‘submission_files’), and I have run into several questions:

My first question is whether the hook I’m using, HookRegistry::call('ArticleGalleyDAO::insertNewGalley', array(&$galley, $galley->getId())), is the correct one for that task, or whether there is another one that should be used instead, if it exists.

I couldn’t find any hook that would do the trick in the hook list, so I chose the one mentioned above.

Is there a good way to debug a plugin while developing it? While coding, there are steps where the UI simply halts with a JavaScript error, or where I can’t tell whether my plugin is not enabled properly or is coded wrongly. Is there another way besides using “error_log” or file_put_contents with the variables used in the plugin?
Is there a specific way of creating a migration in OJS, like Phinx or similar? I mean something I can push to a local repo so that it can be replicated for everyone who needs the plugin.

Thanks a lot, Jose

asmecher · March 26, 2019, 5:09pm

Hi @Jose_Ares,

We don’t currently have a well-maintained hook list – the old Technical Reference is quite out of date (although being rewritten). If you’re not sure which hooks are available, one way to identify them is to temporarily add an error_log call to the call function in lib/pkp/classes/plugins/HookRegistry.inc.php, e.g.:

error_log('Hook call: ' . $hookName);

This will cause a list of called hooks to be dumped to the PHP error log.
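To illustrate what that temporary line does, here is a minimal sketch. DemoHookRegistry and the hook name Demo::fileUploaded are hypothetical stand-ins written for this example, not the real OJS class or a real OJS hook; only the error_log line is the part suggested above.

```php
<?php
// Simplified stand-in for illustration only; this is NOT the real OJS HookRegistry.
class DemoHookRegistry {
    /** @var array callbacks keyed by hook name */
    private static $hooks = [];

    // Attach a callback to a hook name.
    public static function register(string $hookName, callable $callback): void {
        self::$hooks[$hookName][] = $callback;
    }

    // Fire a hook: log its name, then run the registered callbacks in order.
    public static function call(string $hookName, array $args = []): bool {
        // The temporary debugging line: every hook that fires is written to the PHP error log.
        error_log('Hook call: ' . $hookName);

        foreach (self::$hooks[$hookName] ?? [] as $callback) {
            // In this sketch, a callback returning true stops further processing.
            if ($callback($hookName, $args)) {
                return true;
            }
        }
        return false;
    }
}

// Usage: register a callback, then fire the (hypothetical) hook.
$seen = [];
DemoHookRegistry::register('Demo::fileUploaded', function ($hookName, $args) use (&$seen) {
    $seen[] = $args[0];
    return false; // let any other callbacks run as well
});
$handled = DemoHookRegistry::call('Demo::fileUploaded', ['article.xml']);
```

Remember to remove the error_log line again once you have found the hook you need, since it logs on every hook call.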

About the trouble you’re having with Javascript errors, is it possible that your PHP is configured to display errors via the browser rather than logging them? If so, that’ll cause AJAX requests to be unparseable.
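As a rough sketch of the PHP settings in question (assuming a development environment where you can change them at runtime; the same directives can be set in php.ini instead):

```php
<?php
// Assumption: run early in a development-only bootstrap, or set the same directives in php.ini.
// Keep errors out of the HTTP response so AJAX/JSON replies stay parseable...
ini_set('display_errors', '0');
// ...and send them to the PHP error log instead.
ini_set('log_errors', '1');
// Report everything while developing the plugin.
error_reporting(E_ALL);
```

With these settings, notices and warnings raised by the plugin should show up in the error log rather than in the browser.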

We don’t yet support a migration toolset, though we’ll likely move to the Laravel database toolset, which we already use for its query builder. Schema management is done with ADODB, and some plugins do include schema descriptors for automatic table creation when plugins are installed (look for schema.xml in the plugins directory). We don’t recommend modifying existing OJS tables.

Regards,
Alec Smecher
Public Knowledge Project Team

Jose_Ares · March 27, 2019, 10:43am

Hi @asmecher, thanks for your response.

Regarding the hook list, maybe I’m wrong, but in the end it’s just a matter of registering a string in the HookRegistry and then calling it from the plugin, so (unless I’m adding new ones to the codebase) I can call them however I like?

Regarding the problem I’m having, I still need a way to fetch the whole content of an article, which unfortunately is in files saved in the uploads directory chosen at install. That’s the reason I added a new content column to the DB (in my local copy): to avoid reading the file/directory every time I want the content, and also because the content is spread across files in the host filesystem, which I can’t access from outside (for example, via an API).

So, the question remains, is there a possible solution for fetching the content of an article in a single query?

Kind regards, Jose

Jose_Ares · March 27, 2019, 3:27pm

Hi @asmecher,

Following up on my previous question, you mentioned that I can add a schema.xml and an install.xml when adding a new plugin, and that those files will be used on plugin install.

However, I added those files and the DB wasn’t modified. The OJS 2 documentation (the only one available) says something like:

Schema management: By overriding getInstallSchemaFile generic plugins can make use of OJS schema-management features. This function is called on install or upgrade

So it looks like I have to force an upgrade every time a plugin of this type (that is, one that modifies the DB, for example by adding a new table) is installed, right? Just registering the plugin will not work?

If you can clarify this, it will save me a lot of trouble rebuilding the DB.

Regards, Jose

asmecher · March 27, 2019, 6:16pm

Hi @Jose_Ares,

“Regarding the hook list, maybe I’m wrong, but in the end it’s just a matter of registering a string in the HookRegistry and then calling it from the plugin, so (unless I’m adding new ones to the codebase) I can call them however I like?”

Are you proposing to add new hook calls to the application (e.g. HookRegistry::call, somewhere in the core code), so that you can register that hook from your plugin? Sometimes that’ll be something to consider, but generally it should be avoided, as it means the plugin will require modifications to the core code in order to work (and the point of plugins is to have them augment the system without modification). If you think there’s a general-purpose hook that should be added, feel free to propose it and we can consider adding it for future releases. If I’ve misunderstood your question, please let me know.

“Regarding the problem I’m having, I still need a way to fetch the whole content of an article, which unfortunately is in files saved in the uploads directory chosen at install.”

The submission files, including the full text, are all listed in the submission_files table; it shouldn’t be necessary to scan the filesystem to list them.

“Following up on my previous question, you mentioned that I can add a schema.xml and an install.xml when adding a new plugin, and that those files will be used on plugin install.”

To add database tables for your plugin, write up a schema file using ADODB’s schema format (see e.g. http://adodb.org/dokuwiki/doku.php?id=v5:axmls:axmls_index). Then tell your plugin to use it by implementing the getInstallSchemaFile method so that it returns the path to that file, as you’ve noted above.

You can get the schema to be installed to the database in a few ways:

Install the plugin via the web interface
Install the plugin by running a system upgrade (tools/upgrade.php or the web-based interface) with the plugin in place (as always, back up your system before you do this)
Install/apply the schema using the command-line tool, e.g.: php tools/dbXMLtoSQL.php -schema execute path/to/schema.xml
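A minimal sketch of the getInstallSchemaFile override described above. The class name GalleyContentPlugin and its display strings are hypothetical, the other files a plugin needs are omitted, and the code only runs inside an OJS 3 installation (it relies on the application's import function and GenericPlugin base class), so treat it as an outline rather than a finished plugin:

```php
<?php
// Assumption: this file lives in the plugin's own directory inside an OJS 3 install.
import('lib.pkp.classes.plugins.GenericPlugin');

// Hypothetical plugin class name, chosen for this example only.
class GalleyContentPlugin extends GenericPlugin {

    // Name shown in the plugin list (placeholder text).
    function getDisplayName() {
        return 'Galley Content (example)';
    }

    // Description shown in the plugin list (placeholder text).
    function getDescription() {
        return 'Example plugin that ships its own database schema.';
    }

    // Point OJS at the ADODB XML schema file shipped with the plugin,
    // so the tables it describes are created on install or upgrade.
    function getInstallSchemaFile() {
        return $this->getPluginPath() . '/schema.xml';
    }
}
```

The schema.xml file itself would sit next to this class and describe the new table in ADODB's XML schema format.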

Regards,
Alec Smecher
Public Knowledge Project Team

Jose_Ares · March 28, 2019, 11:48am

Hi @asmecher, thanks for your response.

What I meant about the custom hook was that I need it to be triggered specifically when a file is uploaded to a galley. In the end, I need the contents of the file, not the name of the file itself (which I realized is in the table you mentioned).

To sum up, I couldn’t find the full text of the file as you mentioned; could you tell me how?

So, the only idea that came to me is:

Add a new hook in one of the core files (sadly, I can’t add a hook elsewhere, as I don’t know OJS well enough yet; so far I have only been able to trace the request as far as PKPSubmissionFileDAO)
Create a plugin that catches that hook and (via the schema.xml and install.xml) creates a new table with a new field (file content)
Get the content of the file while it is being uploaded and save it to this new table

This way, an external API will have direct access to that file content.

Hope this was clear, and thanks again Jose

asmecher · March 28, 2019, 4:54pm

Hi @Jose_Ares,

Just to make sure I understand – you need to have the full-text of the article available in a database table? What format would this be extracted from (e.g. HTML, PDF, DOCX, etc)? What fidelity will it need to be (i.e. searchable, or something that can be used to present the content)?

Regards,
Alec Smecher
Public Knowledge Project Team

Jose_Ares · March 29, 2019, 9:39am

Hi @asmecher,

Yes, I need the full text of either text or XML files, which are just saved as plain text anyway, not in any non-plain format.

It doesn’t have to be searchable, just easy to fetch with a request pointing specifically at that row (I have fileId and revision as the “keys” for the text).

Thanks, Jose

asmecher · April 1, 2019, 6:56pm

Hi @Jose_Ares,

I’m not sure I understand – will all of your submissions have XML files (e.g. JATS XML) uploaded to them? Generally journals use a mix of PDFs, HTML, and sometimes JATS XML, and these are only stored in the filesystem (with records kept about them in the submission_files table).

Regards,
Alec Smecher
Public Knowledge Project Team

Jose_Ares · April 2, 2019, 8:34am

Hi @asmecher,

Basically I need the content of XML files; I’m not interested in the other formats. Regarding the content, yes, I know the files are uploaded to the file system, and that’s the reason for this plugin: I won’t have access to the FS from the outside, hence the need for this extra table (which I’m including in my plugin) to hold the text content of the file.

Regards

asmecher · April 2, 2019, 5:11pm

Hi @Jose_Ares,

As you’ve found, the file contents are not available via the database. You could have a look at the docCentricWorkflow plugin (asmecher/docCentricWorkflow on GitHub), which embeds Hypothes.is onto Unoconv-generated PDF conversions of OJS submission files, for an example of a plugin that intervenes on file upload to accomplish a task. The same point of intervention could be used to stash the content in the database.
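For the "stash the content in the database" part, here is a rough, standalone sketch of the idea discussed in this thread: a table keyed by file ID and revision that holds the text of an uploaded file. Everything named here is an assumption made for the example (the table galley_file_contents, the two helper functions, and the use of plain PDO with an in-memory SQLite database); a real plugin would be called from the upload intervention point and would normally go through OJS's own database layer.

```php
<?php
// Hypothetical helper: copy a file's text into the table, keyed by (file_id, revision).
function stashFileContent(PDO $db, int $fileId, int $revision, string $path): void {
    $content = file_get_contents($path);
    if ($content === false) {
        throw new RuntimeException("Cannot read $path");
    }
    // REPLACE INTO works in SQLite and MySQL; PostgreSQL would need a different upsert.
    $stmt = $db->prepare(
        'REPLACE INTO galley_file_contents (file_id, revision, content) VALUES (?, ?, ?)'
    );
    $stmt->execute([$fileId, $revision, $content]);
}

// Hypothetical helper: fetch the stored text for one file revision in a single query.
function fetchFileContent(PDO $db, int $fileId, int $revision): ?string {
    $stmt = $db->prepare(
        'SELECT content FROM galley_file_contents WHERE file_id = ? AND revision = ?'
    );
    $stmt->execute([$fileId, $revision]);
    $content = $stmt->fetchColumn();
    return $content === false ? null : $content;
}

// Demo setup: in-memory SQLite stands in for the real OJS database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec(
    'CREATE TABLE galley_file_contents (
        file_id INTEGER NOT NULL,
        revision INTEGER NOT NULL,
        content TEXT,
        PRIMARY KEY (file_id, revision)
    )'
);

// Simulate an uploaded XML file, stash it, then read it back by its keys.
$tmp = tempnam(sys_get_temp_dir(), 'galley');
file_put_contents($tmp, '<article><body>Hello</body></article>');
stashFileContent($db, 42, 1, $tmp);
unlink($tmp);
$roundTrip = fetchFileContent($db, 42, 1);
```

Keeping the content in a separate plugin-owned table, rather than adding a column to submission_files, is in line with the earlier advice not to modify existing OJS tables.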

Regards,
Alec Smecher
Public Knowledge Project Team