Setting up scheduled task with cron

  • Application Version:

OJS 3.1.2-4

  • Description of issue:

I recently took over two existing OJS installations, and for one of them I migrated from OJS 2 to OJS 3. We haven’t gone live with that OJS 3 installation yet, but I noticed that usage statistics were no longer being updated.

  • Steps you took leading up to the issue

I migrated from RHEL 6 with Postgres to CentOS 8 with MariaDB. I used the MySQL Workbench migration tool for the db without too many issues. Because there were no customizations, I used the download method for the upgrade rather than using git. Otherwise, I didn’t change any settings within the application. Everything appeared to be working after the upgrade.

  • What you tried to resolve the issue

I checked that Acron is installed and active, as are Usage Statistics and Usage Events. In files/usageStats/usageEventLogs, there are usage event logs for a few days after the upgrade. None of the dates of those logs appear in usageStats/archive.

I decided to test the scheduled task using the cron method for OJS 3 outlined in this documentation. I disabled Acron through the admin interface and set scheduled_tasks to On in config.inc.php. When I run the command to test it, I get this error message:
Tasks file "lib/pkp/plugins/generic/usageStats/scheduledTasksAutoStage.xml" does not exist or is not readable!
When I look in lib/pkp/plugins, there is no usageStats directory. But it looks like scheduledTasksAutoStage.xml is in the location described for the OJS 2 script in the same documentation. I checked the original OJS 3 package I downloaded and it is the same there.

So, in addition to usage events no longer being logged and the ETL script not running to get those usage events into the db, I guess there is something fairly basic that I’m misunderstanding about how this should be configured. Any help or suggestions are welcome.

Hi @IOPNdev,

Congratulations on the migration between PostgreSQL and MariaDB and upgrade to 3.x – that’s quite a set of changes!

What do you see in the scheduled_tasks table?

Thanks,
Alec Smecher
Public Knowledge Project Team

Thanks! Here is what the scheduled_tasks table has. I finished the migration and moved it to the new server around Dec 10th I think.

    +--------------------------------------------------+---------------------+
    | class_name                                       | last_run            |
    +--------------------------------------------------+---------------------+
    | classes.tasks.ReviewReminder                     | 2019-12-10 19:48:23 |
    | lib.pkp.classes.task.ReviewReminder              | 2020-05-13 13:35:41 |
    | plugins.generic.pln.classes.tasks.Depositor      | 2019-12-10 09:14:32 |
    | plugins.generic.usageStats.UsageStatsLoader      | 2020-05-13 01:32:37 |
    | plugins.importexport.crossref.CrossrefInfoSender | 2020-05-12 13:59:53 |
    | plugins.importexport.datacite.DataciteInfoSender | 2020-05-12 20:49:34 |
    | plugins.importexport.doaj.DOAJInfoSender         | 2020-05-13 12:55:12 |
    | plugins.importexport.medra.MedraInfoSender       | 2020-05-13 10:28:39 |
    +--------------------------------------------------+---------------------+
  • Update:

It looks like one of the issues is that the Usage Events plugin was disabled, even though the UI always shows that plugin as enabled. I filed an issue against the pkp-lib repository. Log files are now populating in the usageEventLogs directory, but they still aren’t being processed when I run
php tools/runScheduledTasks.php plugins/generic/usageStats/scheduledTasksAutoStage.xml
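
For anyone following along, the cron side of this can be sketched as below. It is only an illustration under assumptions: `cron_line` is a hypothetical helper, the hourly schedule is an arbitrary choice, and the XML path is the one used in the command above for this OJS 3.1.2 install (the file may live elsewhere in other versions).

```shell
#!/bin/sh
# Hypothetical helper: print a crontab line that runs the auto-stage task.
# Assumptions: hourly schedule; the task XML path is relative to the OJS root,
# so the entry changes into that directory first.
cron_line() {
    ojs_root="$1"   # OJS installation directory, supplied by the caller
    printf '0 * * * * cd %s && php tools/runScheduledTasks.php plugins/generic/usageStats/scheduledTasksAutoStage.xml\n' "$ojs_root"
}
```

Calling `cron_line /var/www/ojs` prints a line that could be pasted into the crontab of the account that owns the OJS files directory, for example by running `crontab -u apache -e` as root.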

Hi @IOPNdev,

That sounds like progress. Do you have anything helpful in either your PHP error log or in the scheduledTaskLogs subdirectory of the directory configured in the files_dir setting in config.inc.php?

Regards,
Alec Smecher
Public Knowledge Project Team

Thanks for your help @asmecher. A lot closer to being fixed.

When I ran the scheduled task with the auto-staging command on the command line, there were no errors and no side effects that I could see. When I looked at the full PHP logs and not just the errors, these were the relevant entries:

[15-May-2020 01:36:22 America/Chicago] PHP Warning:  fopen(/var/www/uploads/scheduledTaskLogs/Usagestatisticsfileloadertask-5ebe386638cfe-20200515.log): failed to open stream: Permission denied in /var/www/ojs/lib/pkp/classes/scheduledTask/ScheduledTask.inc.php on line 111
[15-May-2020 01:36:22 America/Chicago] PHP Warning:  flock() expects parameter 1 to be resource, boolean given in /var/www/ojs/lib/pkp/classes/scheduledTask/ScheduledTask.inc.php on line 112
[15-May-2020 01:36:22 America/Chicago] ojs2: Couldn't lock the file.

The scheduledTaskLogs directory had entries up to 2019/12/13, and then they suddenly stopped. All of those subdirectories have httpd_sys_rw_content_t as their SELinux context type, so I’m not 100% sure what is causing the problem, but I’m pretty sure a closer look at the SELinux policy will have the answer. When I moved the usageEventLogs files to the stage directory manually as the apache user and then ran the command
php tools/runScheduledTasks.php plugins/generic/usageStats/scheduledTasks.xml
as the apache user, they were processed and the usage statistics showed up in the Articles graph. So, if nothing else, I can use that workflow for now.
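
The manual workaround described above can be sketched as a small script. This is an illustration under assumptions, not how the plugin's own auto-stage task works: `stage_usage_logs` is a hypothetical helper, it assumes the `usageEventLogs` and `stage` directories sit side by side under one usageStats directory, and it moves every `.log` file, including one that might still be written to on a live site.

```shell
#!/bin/sh
# Hypothetical helper: move usage event logs into the stage directory.
# Argument: the usageStats directory under files_dir (an assumed layout).
# Prints the number of files moved; returns non-zero if a directory is missing.
stage_usage_logs() {
    stats_dir="$1"
    src="$stats_dir/usageEventLogs"
    dst="$stats_dir/stage"
    if [ ! -d "$src" ] || [ ! -d "$dst" ]; then
        echo "missing $src or $dst" >&2
        return 1
    fi
    moved=0
    for f in "$src"/*.log; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] || continue
        mv -- "$f" "$dst"/ || return 1
        moved=$((moved + 1))
    done
    echo "$moved"
}
```

A call such as `stage_usage_logs /var/www/uploads/usageStats` would need to run as the apache user (for example through `sudo -u apache`), followed by the `scheduledTasks.xml` command from the OJS directory.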

The one last part that doesn’t seem to be working is the Counter plugin. It still isn’t showing a link for 2020 on the Counter Reports page. Is it pulling from the same data? Thanks again for your help.

Hi @IOPNdev,

I think you’re on the right track with SELinux. Unfortunately I haven’t used that toolset myself, but for a quick test you can write a short .php script to check writability:

<?php
// is_writable() returns false for a file that does not exist yet,
// so check the directory the log file would be created in.
echo is_writable('/var/www/uploads/scheduledTaskLogs') ? "Writable\n" : "Not writable\n";

Don’t forget that this will need to run in the same context as OJS, i.e. via the user account with the cron job if using cron, or with the same credentials as the web server if using the Acron plugin.
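
The same kind of check can be sketched in shell, which makes it easy to run under a specific account. This is a rough illustration: `can_write_log` and `report` are hypothetical helper names, and the logic handles a log file that does not exist yet by testing the containing directory instead.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a log file could be written at the given path.
can_write_log() {
    target="$1"
    if [ -e "$target" ]; then
        # Existing file: the file itself must be writable.
        [ -w "$target" ]
    else
        # New file: the containing directory must exist and be writable and searchable.
        dir=$(dirname -- "$target")
        [ -d "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]
    fi
}

# Print a result in the same wording as the PHP check.
report() {
    if can_write_log "$1"; then
        echo "Writable"
    else
        echo "Not writable"
    fi
}
```

Saved as a script with a `report` call on a path under the scheduledTaskLogs directory appended, it can be run as the web server account with `sudo -u apache sh`. A shell started that way usually runs in a different SELinux domain than httpd, so a "Writable" result does not rule out an SELinux denial for the web server; `ls -Zd` on the directory shows its context.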

I believe Counter stats depend on the same toolset, so fixing this may resolve the Counter issue.

Regards,
Alec Smecher
Public Knowledge Project Team

Just wanted to follow up for anyone else finding this. The above steps were all I needed to do. I was just misunderstanding what data the COUNTER reports compile. The 2020 link never appeared because I hadn’t opened any of the full-text PDF links from an article, and since the site wasn’t live, no one else had either. I read up on Journal Report 1 and Article Report 1 in the COUNTER standard for more details, and realized those reports only count successful full-text requests.
