A few months ago we upgraded from OJS 2 to OJS 3.3.0.17. We had complications, but we managed to solve them with the support of the Forum.
A few days ago we noticed that the statistics are not being processed correctly: we only get file views and downloads, but no abstract views.
I have checked and reprocessed the statistics files, but nothing changes; I only see the downloads. The base_url is the same as in the logs, and the logs do contain records of the form article/view/245.
In the database there are rows with assoc_type 515, 256 and 259, but none with 1048585. Does this value represent a visit to the abstract? How can I diagnose why visits to the abstract are not being registered?
When reprocessing the logs, some of them show the error below. It does not stop the task, but the file is sent to reject.
[2024-09-12 19:23:34] [Error] Cannot load record: submission file is not associated with a representation object.
[2024-09-12 19:28:07] [Error] Cannot load record: submission file is not associated with a representation object.
[2024-09-12 19:32:11] [Error] Cannot load record: submission file is not associated with a representation object.
[2024-09-12 19:35:29] [Error] Cannot load record: submission file is not associated with a representation object.
[2024-09-12 19:41:18] [Error] Cannot load record: submission file is not associated with a representation object.
I have been reviewing this, and apparently it is a problem in the database: some files that are galleys do not have an associated record in the galleys table. In my case, however, the files are supplementary files and correspond to another assoc_type. We will have to wait for more clarity on these values, because I have not found documentation on this.
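For anyone else trying to decode these numbers, here is a minimal lookup sketch. It assumes the values correspond to the ASSOC_TYPE constants in the PKP code as I understand them (256 journal, 259 issue, 515 submission file, 1048585 submission, i.e. the abstract page); the helper name `describe_assoc_types` is made up for this example, so please verify the mapping against your own installation.

```python
# Assumed mapping of assoc_type values to the objects they refer to.
# Taken from the PKP constants as I understand them; verify locally.
ASSOC_TYPES = {
    256: "journal (index page view)",
    259: "issue (table of contents view)",
    515: "submission file (galley view or download)",
    1048585: "submission (abstract / landing page view)",
}

def describe_assoc_types(found):
    """Split the known assoc_type values into those present in `found` and those missing."""
    present = {t: ASSOC_TYPES.get(t, "unknown") for t in sorted(found)}
    missing = {t: label for t, label in ASSOC_TYPES.items() if t not in found}
    return present, missing

# The three values reported in this thread as present in the metrics table.
present, missing = describe_assoc_types({515, 256, 259})
for code, label in missing.items():
    print(f"no rows for assoc_type {code}: {label}")
```

With the values from my database, the only type reported as missing is 1048585, which matches the symptom that only abstract views are absent.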
After I migrated from 3.2.1-5 to 3.3.0-18, I was impacted by low MySQL performance and decided to stop the scheduled tasks. However, after the confirmation from PKP support, I disabled the plugin “Recommend Similar Articles”, and the performance improved. So, I turned the scheduled tasks back on.
After enabling the scheduled_tasks and Acron plugin, I saw some background tasks, but the metrics (abstract views) are not being updated.
I saw some scheduled tasks were in the Rejected folder, and I moved them back to Scheduled, but it hasn’t worked yet.
I did not change the Config file, so the URLs internally are the same (this may be a hypothesis that would affect the statistics).
Hi @Gustavo_Leon, did the statistics work after you applied the updates to the database? Please describe the SQL script you used to validate and fix it. Thanks
Hi @Gustavo_Leon,
Yes, assoc_type = 1048585 means an abstract (article landing page) view.
So it means you do not have any entry in your old DB table metrics (in 2.4 installation) containing abstract views?
Could you please describe how you reprocess a file? Maybe you can try to reprocess a file that contains an abstract URL again, and let me know each step you take and what you then see for that article and that date in the DB table metrics?
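One way to do that last check is to total the metrics per assoc_type for the day covered by the reprocessed file. The sketch below is self-contained and runs against an in-memory SQLite stand-in; the table is reduced to the columns I assume matter here (assoc_type, assoc_id, day, metric), the day format is an assumption, and the sample rows are invented. On a real installation the same SELECT would be run against the metrics table of the OJS database.

```python
import sqlite3

# In-memory stand-in for the OJS metrics table (simplified, assumed columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metrics (assoc_type INTEGER, assoc_id INTEGER, day TEXT, metric INTEGER)"
)
# Invented sample rows: two file rows and one journal row, no abstract rows.
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?, ?, ?)",
    [
        (515, 1798, "20240820", 4),
        (515, 1801, "20240820", 2),
        (256, 1, "20240820", 7),
    ],
)

def views_by_type(connection, day):
    """Return {assoc_type: total metric} for one day."""
    rows = connection.execute(
        "SELECT assoc_type, SUM(metric) FROM metrics "
        "WHERE day = ? GROUP BY assoc_type ORDER BY assoc_type",
        (day,),
    ).fetchall()
    return dict(rows)

totals = views_by_type(conn, "20240820")
print(totals)
print("abstract views recorded:", 1048585 in totals)
```

If 1048585 is absent for a day whose log file does contain landing page URLs, that narrows the problem to the processing of that file rather than to the logging.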
It seems your log files stay in the usageEventLogs folder.
And it seems the scheduled task that processes the log files is running: plugins.generic.usageStats.UsageStatsLoader last run 2024-09-16.
What do you see in your scheduledTaskLogs folder (in the files folder)? What is the date of the last UsageStatsLoader-…-date.log file, and what do you see in that file?
Could you also take a look in your PHP error log, if you see any errors regarding usage stats there?
Maybe you could also double check the access permissions for all folders and files in the usageStats folder – if you are using the Acron plugin the web user needs read and write access to them.
You only use Acron plugin and you do not have any cron job to run the scheduled tasks, correct?
And it seems the scheduled task that processes the log files is running: plugins.generic.usageStats.UsageStatsLoader last run 2024-09-16.
Now the last run is today at 09:51 AM.
What do you see in your scheduledTaskLogs folder (in the files folder)? What is the date of the last UsageStatsLoader-…-date.log file, and what do you see in that file?
Could you also take a look in your PHP error log, if you see any errors regarding usage stats there?
I searched for “usage” and “stats” and found nothing. I can send you the error log as a DM if you want.
Maybe you could also double check the access permissions for all folders and files in the usageStats folder – if you are using the Acron plugin the web user needs read and write access to them.
I did, and it looks correct. I granted 775 to all files and folders under ‘usageStats’ with:
find . -type d -print0 | xargs -0 chmod 0775
find . -type f -print0 | xargs -0 chmod 0775
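As a cross-check on the permissions, a small script like the one below could list anything under the usageStats folder that the current user cannot both read and write. This is only a sketch: the function name `find_inaccessible` is made up, the demo runs on a throwaway directory rather than a real files folder, and for a meaningful result it would have to be run as the web server user (whose name depends on the server).

```python
import os
import tempfile

def find_inaccessible(root):
    """Return paths under root (root included) that the current user cannot read and write."""
    problems = []
    if not os.access(root, os.R_OK | os.W_OK):
        problems.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.access(path, os.R_OK | os.W_OK):
                problems.append(path)
    return sorted(problems)

# Demo on a temporary directory that mimics a stage folder with one log file.
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "stage"))
    with open(os.path.join(demo_root, "stage", "usage_events_20240820.log"), "w") as fh:
        fh.write("")
    demo_problems = find_inaccessible(demo_root)
    print(demo_problems or "all readable and writable")
```

As a side note, log files normally do not need the execute bit, so 0664 for files would probably be enough; 0775 is not the cause of the problem either way.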
You only use Acron plugin and you do not have any cron job to run the scheduled tasks, correct?
That’s correct. I tested the migration first in the Sandbox (from 2.4.8-2 to 3.3.0-17) and let the scheduled tasks run there, and the Sandbox had no issues with the stats (same DB, code, etc.). In the Live environment, however, I faced this issue: Slow queries running in background after upgrade from OJS 3.2.1-5 to OJS 3.3.0-18 - #8. I decided to stop running background tasks and Acron, and after the feedback from PKP support I turned them back on (I only disabled the plugin ‘Recommend Similar Articles’). I remember that the stats for 09/07 worked well, but the updates from 09/04 to 09/06 and everything after 09/08 are still missing (so the visits to the abstracts are not being recorded). I saw some errors like the following, taken as an example from the log UsageStatsLoader-66eb566a38de9-20240918:
[2024-09-18 19:38:34] [Error] The directory /…/files/usageStats/processing is not empty. This could indicate that a process failed previously, or that a process is currently running. This file will be automatically reprocessed if you are using scheduledTasksAutoStage.xml; otherwise, you will need to manually move the orphaned files in the processing directory back to the main directory.
[2024-09-18 19:40:41] [Error] Cannot load record: submission file is not associated with a representation object.
[2024-09-18 19:42:45] [Error] Cannot load record: submission file is not associated with a representation object.
[2024-09-18 19:46:23] [Error] Cannot load record: submission file is not associated with a representation object.
[2024-09-18 19:49:24] [Error] Cannot load record: submission file is not associated with a representation object.
[2024-09-18 19:52:25] [Error] Cannot load record: submission file is not associated with a representation object.
[2024-09-18 19:55:30] [Error] Cannot load record: submission file is not associated with a representation object.
[2024-09-18 19:58:30] [Error] Cannot load record: submission file is not associated with a representation object.
[2024-09-18 20:01:08] [Error] Cannot load record: submission file is not associated with a representation object.
[2024-09-18 20:04:06] [Error] Cannot load record: submission file is not associated with a representation object.
[2024-09-18 20:07:42] [Error] Cannot load record: submission file is not associated with a representation object.
[2024-09-18 20:11:00] [Error] Cannot load record: submission file is not associated with a representation object.
[2024-09-18 20:14:10] [Error] Cannot load record: submission file is not associated with a representation object.
[2024-09-18 20:14:10] [Notification] The task has been stopped.
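Since these task logs get long and repetitive, a short script can condense them into one line per distinct message. The sketch below assumes only the `[timestamp] [level] message` layout visible in the excerpts in this thread; the function name `summarize` is made up and the sample lines are copied from those excerpts.

```python
import re
from collections import Counter

# Matches "[timestamp] [level] message", the layout seen in the task logs above.
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[(?P<level>[^\]]+)\] (?P<msg>.*)$")

def summarize(lines):
    """Count occurrences of each (level, message) pair, ignoring timestamps."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group("level"), match.group("msg"))] += 1
    return counts

sample = [
    "[2024-09-23 23:06:33] [Notification] The task has been started.",
    "[2024-09-18 19:40:41] [Error] Cannot load record: submission file is not associated with a representation object.",
    "[2024-09-18 19:42:45] [Error] Cannot load record: submission file is not associated with a representation object.",
]

counts = summarize(sample)
for (level, msg), n in counts.most_common():
    print(f"{n:3d}  [{level}] {msg}")
```

To run it on a real file, the lines of one UsageStatsLoader log could be passed in place of `sample`.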
I want to clarify that the base URL is exactly the same in the logs and in config.inc.php. I have set up a development environment to debug the error, and for now I’m only processing one specific file, for example usage_events_20240820.log. This file has records such as
(Does OJS take into account the log entries identified as bots?)
I take the file from files/usageStats/reject and move it to the folder files/usageStats/stage. Then I go to my OJS 3.3 directory and execute the following command.
This generates a log in files/scheduledTaskLogs indicating the following:
[2024-09-23 23:06:33] [Notification] The task has been started.
[2024-09-23 23:11:24] [Error] Cannot load record: submission file is not associated with a representation object.
[2024-09-23 23:11:24] [Notification] The task has been stopped.
Others have mentioned that you can modify lib/pkp/classes/statistics/PKPMetricsDAO.inc.php to debug what is causing the error, so I did that to get the submission_id and the assoc_id.
[2024-09-23 23:11:24] [Error] Cannot load record: submission file is not associated with a representation object.5537 1798
That record in the submission_files table has null fields. It seems to me that there was some conflict in the upgrade process, although I don’t understand how the downloads still show statistics.
Please help. I don’t know whether it is necessary to correct the values in the DB. If so, I need to know what the values of assoc_type and assoc_id refer to; and if this is not what causes the problem, what could I do to solve it?
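For repeating the re-staging step on several rejected files, the move itself can be scripted. The sketch below assumes the reject and stage folders sit side by side under the usageStats directory, as described above; the function name `restage` is made up, and the demo uses a temporary directory instead of a real files folder. It refuses to overwrite a file that is already staged.

```python
import os
import shutil
import tempfile

def restage(usage_stats_dir, filename):
    """Move one log file from <usage_stats_dir>/reject to <usage_stats_dir>/stage."""
    src = os.path.join(usage_stats_dir, "reject", filename)
    stage_dir = os.path.join(usage_stats_dir, "stage")
    os.makedirs(stage_dir, exist_ok=True)
    dst = os.path.join(stage_dir, filename)
    if os.path.exists(dst):
        # Do not silently replace a file that is already waiting to be processed.
        raise FileExistsError(dst)
    return shutil.move(src, dst)

# Demo with a throwaway directory standing in for files/usageStats.
with tempfile.TemporaryDirectory() as base:
    name = "usage_events_20240820.log"
    os.makedirs(os.path.join(base, "reject"))
    with open(os.path.join(base, "reject", name), "w") as fh:
        fh.write("")
    new_path = restage(base, name)
    moved_ok = os.path.isfile(new_path) and not os.path.exists(
        os.path.join(base, "reject", name)
    )
    print("re-staged:", moved_ok)
```

This only moves the file; the scheduled task still has to be run afterwards to process it.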
Hmmm… Yes, it seems something is wrong within your DB.
What kind of files are those problematic ones? Can you access them publicly and as editor in the backend?
The error that you see in the scheduled task log file (Cannot load record: submission file is not associated with a representation object) leads to an exception and the log file is rejected.
But that would actually mean that all stats numbers are not correct (and not only abstract views) – at least for those rejected log files.
Do you maybe have a new archived usage stats log file (i.e. one successfully processed after the upgrade) that contains an article landing page URL, so that you can check its abstract view stats and, if needed, try to reprocess it?
You can also create a few log file entries in the current log file, by accessing some article landing pages, and see what happens when that file is processed.
And to be sure: Are your stats for the time before the upgrade correct (so the problem is only with the new stats after the upgrade)?
(OJS does not count bot accesses, i.e. the bot entries in the log file.)
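To see quickly whether a given log file contains landing page requests at all, the URLs can be sorted into abstract and galley requests. The sketch below looks only at the URL path and assumes the usual OJS 3 pattern, where article/view/<submissionId> is the abstract page and a second ID segment or article/download/… points to a galley file. The function name `classify`, the host and the sample lines are invented for illustration and are not real log entries.

```python
import re
from collections import Counter

# Captures the operation, the submission ID and an optional galley ID.
URL_RE = re.compile(r"/article/(view|download)/(\d+)(?:/(\d+))?")

def classify(line):
    """Label a log line as 'abstract', 'galley' or 'other' based on its URL."""
    match = URL_RE.search(line)
    if not match:
        return "other"
    operation, _submission_id, galley_id = match.groups()
    if operation == "view" and galley_id is None:
        return "abstract"
    return "galley"

# Invented, simplified sample lines (hypothetical host and IDs).
sample = [
    '203.0.113.5 - - "2024-08-20 10:00:01" https://example.org/index.php/demo/article/view/245 200 "Mozilla/5.0"',
    '203.0.113.5 - - "2024-08-20 10:00:09" https://example.org/index.php/demo/article/view/245/1798 200 "Mozilla/5.0"',
    '203.0.113.5 - - "2024-08-20 10:00:15" https://example.org/index.php/demo/article/download/245/1798 200 "Mozilla/5.0"',
    '203.0.113.5 - - "2024-08-20 10:01:02" https://example.org/index.php/demo/issue/view/12 200 "Mozilla/5.0"',
]

kinds = Counter(classify(line) for line in sample)
print(dict(kinds))
```

A file with zero "abstract" lines cannot produce abstract views, whatever the state of the DB; a file with many of them and no resulting rows points back to the processing step.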
Hmm… That same error (Cannot load record: submission file is not associated with a representation object) rejects the log file processing.
So is your example log file usage_events_20240918.log in your reject folder, together with the other usage stats log files?
It seems like you would have the same problem, that something went wrong with the upgrade so that the submission files entries in the DB are not correct any more.
@Gustavo_Leon, how did you upgrade from OJS 2 to 3.3.0.17? Did you first upgrade to the latest 2.4.8 and then to the latest 3.2.1 version? Have you tested whether everything was OK after each upgrade?
hmmmm… but it is strange that @sergiobm apparently has the same problem although he upgraded from 3.2.1-5 to 3.3.0-18…
I will take a look if there are any known upgrade issues from 3.2.1-5 to 3.3.0-18…
It would be great if you both could investigate what kind of files those are – actually they should be galley files, because otherwise they would not appear in the usage stats log files – and whether you can access them normally as reader and editor…
EDIT:
So you should first make sure that those files are galley files and not some other files (that the file ID in the URL in the log file is not from a dependent file like CSS or similar):
I found a similar/same issue here: Article statistics aren't being generated? · Issue #5832 · pkp/pkp-lib · GitHub.
So you could possibly proceed like that user.
Or, even better, you could try to track down where in the upgrade the submission_files DB table entries go wrong, especially if coming from OJS2 (which requires upgrading first to the LATEST 2.4.8 and then to the LATEST 3.2.1). @sergiobm, have you maybe also upgraded from OJS2?
Check whether you have an assoc_id for all those wrong entries (where assoc_type is NULL):
SELECT * FROM submission_files WHERE file_stage = 10 AND assoc_id IS NULL AND assoc_type IS NULL;
(You should not get any result here)
Check whether entries exist in the publication_galleys table matching the assoc_id in the submission_files table, for all those wrong entries (where assoc_type is NULL):
SELECT sf.submission_file_id, sf.file_stage, sf.assoc_type, sf.assoc_id FROM submission_files sf LEFT JOIN publication_galleys pg ON (pg.galley_id = sf.assoc_id) WHERE sf.file_stage = 10 AND sf.assoc_type IS NULL AND pg.galley_id IS NULL;
(You should not get any result here)
If those two things are correct in your DB, then I can give you the SQL to set the assoc_type column correctly…
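To illustrate what the two checks report, here is a self-contained sketch that runs them against an in-memory SQLite stand-in. The two tables are cut down to the columns used in the queries, and the three sample rows are invented: one healthy galley file, one with both assoc_type and assoc_id NULL, and one whose assoc_id points to a galley that does not exist. The value 521 for a healthy row is what I believe to be the galley (representation) assoc_type; treat it as an assumption.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE publication_galleys (galley_id INTEGER PRIMARY KEY);
CREATE TABLE submission_files (
    submission_file_id INTEGER PRIMARY KEY,
    file_stage INTEGER,
    assoc_type INTEGER,
    assoc_id INTEGER
);
INSERT INTO publication_galleys VALUES (10);
INSERT INTO submission_files VALUES (1, 10, 521, 10);     -- healthy (assumed type 521)
INSERT INTO submission_files VALUES (2, 10, NULL, NULL);  -- no assoc_type, no assoc_id
INSERT INTO submission_files VALUES (3, 10, NULL, 99);    -- assoc_id without a galley
""")

# Check 1: galley-stage files that have neither assoc_id nor assoc_type.
check1 = [row[0] for row in db.execute(
    "SELECT submission_file_id FROM submission_files "
    "WHERE file_stage = 10 AND assoc_id IS NULL AND assoc_type IS NULL "
    "ORDER BY submission_file_id"
)]

# Check 2: galley-stage files without assoc_type whose assoc_id matches no galley.
check2 = [row[0] for row in db.execute(
    "SELECT sf.submission_file_id FROM submission_files sf "
    "LEFT JOIN publication_galleys pg ON (pg.galley_id = sf.assoc_id) "
    "WHERE sf.file_stage = 10 AND sf.assoc_type IS NULL AND pg.galley_id IS NULL "
    "ORDER BY sf.submission_file_id"
)]

print("check 1 returns file IDs:", check1)
print("check 2 returns file IDs:", check2)
```

In a consistent DB both lists would be empty. Note that a row with a NULL assoc_id shows up in both checks, because a NULL never matches a galley_id in the join.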
I also have another journal that I recently migrated from 2.2.4-8 to 3.2.1-5 and then to 3.3.0.19. I saw that both queries are returning 12 rows. But the difference is that the statistics of visits to the abstracts are recorded without issues.