UsageStats - Strange i18n behaviour and missing data

Description of issue or problem I’m having:
UsageStats stopped working as of Dec 31 2021, last data collection for 2021 was done on Dec 30 2021 (archive/ stops there). For Jan 4 2022 only partial data was loaded, nothing else (not in archive/).

Steps I took leading up to the issue:
We have read several forum threads.
We have looked at the metrics and scheduled_tasks tables in the database. We have done a thorough analysis of the data in the filesystem and found things like this:

$ ls -1 scheduledTaskLogs/*20211231.log
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-61ce3a0214593-20211231.log
scheduledTaskLogs/Crossrefautomaticregistrationtask-61cf727d7c2f1-20211231.log
scheduledTaskLogs/DataCiteautomatischeRegistrierung-61cf726bbde9d-20211231.log
scheduledTaskLogs/DOAJautomatischeRegistrierung-61cf71dcdba5c-20211231.log
scheduledTaskLogs/ErinnerunganGutachten-61cf72f3b7ae2-20211231.log
scheduledTaskLogs/mEDRAautomatischeRegistrierung-61cf722d1faec-20211231.log
scheduledTaskLogs/PKPPNAblieferungsaufgabe-61cf1f31aa5ac-20211231.log
scheduledTaskLogs/SubscriptionExpiryReminder-61ce3a06074aa-20211231.log
scheduledTaskLogs/TaskzumLadenderNutzungsstatistik-Datei-61cf713f98a92-20211231.log

The “TaskzumLadenderNutzungsstatistik” sometimes is called “Usagestatisticsfileloadertask” instead.

We have several “##admin[…]” entries, 17 to be exact. Looks a lot like beginning and end of the month are special?

ls -1 scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-*
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-5fbe8245dc2d8-20201125.log
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-603e66923eff7-20210302.log
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-6063428ab1005-20210330.log
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-6064f0f2be63a-20210401.log
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-608c7de10b0f2-20210501.log
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-60b40ae31d2a7-20210531.log
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-60b55c63b561a-20210601.log
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-60dce961a629e-20210701.log
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-61047662e2d77-20210731.log
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-6105c7f4654d1-20210801.log
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-612ea66ce0d51-20210901.log
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-615633618786d-20211001.log
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-617dc066094a7-20211031.log
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-617f20008e131-20211101.log
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-61a6acf9dd13c-20211201.log
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-61ce3a0214593-20211231.log
scheduledTaskLogs/##admin.scheduledTask.statisticsReport##-61cf8b8293248-20220101.log

We also see a lot of data missing in archive/ which seems to be stuck in stage/.

What I tried to resolve the issue:

We moved one file for Jan 04 2022 from processing/ to stage/. We then ran runScheduledTasks manually on the commandline, to no avail.

Application Version - e.g., OJS 3.1.2:

OJS 3.2.1-3
Usage Stats 1.0

Additional information, such as screenshots and error log messages if applicable:

We have only seen the error message on data being in processing/, which is why we moved one file to stage/.

Additional remarks:

  • Why are the file names translated? Obviously this is very error-prone and makes it harder to debug things. Searching for translated error messages means I will have to translate them back to English, because searching in random language (in our case German) won’t give me any results.
  • Missing documention of this plugin is also a major concern for us. I have not found any information on where the data comes from, where it first starts, what all the directories are meant for and how to handle data that seems to be in the wrong place (except for the error message we saw in the log file). Describing the process in more detail anywhere would be much appreciated!

We have now moved all files from both stage/ and processing/ to a different location. We still have no messages in either scheduledTaskLogs nor the webserver logs that would indicate what is wrong or what we can do to amend this situation.

We are hoping that moving the old files away from stage will show us whether or not the currently produced new files are passing the process. If they do, we might add the old files little by little. If not, we do not have the slightest idea where to start debugging this mess.

(Gabriele)

Quick update: Yesterday’s stats were successfully processed and archived. We will now leave things as they are for the weekend to see if future days are also successfully processed. After that we might start including the old data from stage file by file and see if maybe the sheer amount of files in stage was the issue or if one specific file lead to the issues.

I am still underwhelmed by the logging behaviour: We see neither any info on what went wrong in scheduledTaskLogs nor in the PHP logs (at least not when grepping for the libs-pkp-path, as we have limited possibilities here).
(Gabriele)

Process has been running without issues for a few days (Jan 12-15 2022). We will now add the old files we moved away from stage one by one. Prior to that we will remove all lines - if any - that do not match the regular expression as per the plugin’s default settings (looks like a low amount of lines compared to the general file size).
(Gabriele)

We have seen another file fail last week and are now considering to reprocess the failing files after the next OJS upgrade. This is really not at all satisfactory, but the way the plugin behaves for us we don’t see another option currently.

Our last theory was that the regex is failing on some files and that reverse-grepping might help. But some files fail even though all lines match the regex.

@asmecher do you have any idea what can be done in this situation or who to ask?

(Gabriele)

Why are the file names translated?

You probably have Acron plugin doing the scheduled tasks. This means that a visit to any of the journals in your installation can trigger the scheduled task for processing the usageStats file and the language of that journal will determine the language of the log file.

We see the same thing in Journal.fi

It seems you have an installation with more than one journal? What kind of setup do you have? Are your journals under the same domain or have some custom setup?