You do not need to roll back the code.
Let me think about the error… At the moment I do not understand why it is happening…
Could you double check if you have foreign keys in your DB table usage_stats_total_temporary_records? – e.g. in PHPMyAdmin, if you go to the table, then Structure, then Relation view. There should be 6 foreign keys, on the following columns: context_id, issue_galley_id, issue_id, representation_id, submission_file_id, submission_id.
So the foreign key constraint should already fail on that table… and not later when inserting into metrics_submission.
EDIT: also the other temporary tables should have foreign keys…
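If it is easier, the same check can be done directly in SQL – a minimal sketch assuming MySQL/MariaDB and the standard OJS table names (run against the OJS database):
-- List the foreign keys defined on the usage stats temporary tables
SELECT TABLE_NAME, COLUMN_NAME, CONSTRAINT_NAME,
       REFERENCED_TABLE_NAME, REFERENCED_COLUMN_NAME
FROM information_schema.KEY_COLUMN_USAGE
WHERE TABLE_SCHEMA = DATABASE()
  AND REFERENCED_TABLE_NAME IS NOT NULL
  AND TABLE_NAME IN (
      'usage_stats_total_temporary_records',
      'usage_stats_unique_item_investigations_temporary_records',
      'usage_stats_unique_item_requests_temporary_records',
      'usage_stats_institution_temporary_records'
  )
ORDER BY TABLE_NAME, COLUMN_NAME;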
Hi @bozana, I have checked the journal statistics and found that usage_events_20240131.log has produced statistics, which were processed on a previous date. usage_events_20240201.log is processed and Archived, but the statistics are not generated; maybe they will be generated the next day. I will take a look and update you.
I have checked the temporary DB tables and found all foreign keys in usage_stats_total_temporary_records.
2 foreign keys (issue_galley_id and issue_id) are not found in usage_stats_unique_item_requests_temporary_records and usage_stats_unique_item_investigations_temporary_records.
I don't know whether they are needed or not.
What do you mean?
usage_events_20240201.log is processed and Archived but the statistics are not generated
It is so that first the file is parsed, the entries are saved into the temporary tables, (currently) 2 jobs are dispatched, and the file is archived. Then, on the next request (or two), the jobs will be run and the statistics calculated and saved into the metrics tables.
Your foreign keys seem to be right. Hmmmm… I do not understand how it comes that the error first occurs when saving into the metrics tables… – it should occur when inserting into the temporary tables, and it would be ignored there, i.e. the row would not be inserted into the temporary tables and the processing would continue. Because that row would not be in the temporary tables, it could not cause any problems when saving the metrics from the temporary tables into the metrics tables. But in your case it seems the row (that has a non-existing galley/representation ID) is inserted into the temporary table – as if the foreign key constraint were ignored there – and the error occurs only when it is then saved into the metrics tables.
EDIT: Have you double checked that temporary tables are empty before re-processing the log file?
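For instance, a row count over each temporary table should return 0 before re-processing – a minimal sketch using the table names mentioned in this thread:
-- All of these should return 0 before re-processing a log file
SELECT COUNT(*) FROM usage_stats_total_temporary_records;
SELECT COUNT(*) FROM usage_stats_unique_item_investigations_temporary_records;
SELECT COUNT(*) FROM usage_stats_unique_item_requests_temporary_records;
SELECT COUNT(*) FROM usage_stats_institution_temporary_records;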
I checked in the morning and the statistics (usage_events_20240131.log) are as shown. I don't know how this happened.
I have checked and re-processed usage_events_20240201.log; it is processed and stored, but the statistics are not generated. They may be generated on the next processing day.
Yes, I have run the delete commands for those tables:
DELETE FROM usage_stats_institution_temporary_records;
DELETE FROM usage_stats_total_temporary_records;
DELETE FROM usage_stats_unique_item_investigations_temporary_records;
DELETE FROM usage_stats_unique_item_requests_temporary_records;
Strange, strange…
I do not know what is happening or what happened with the 20240131 log file…
It seems you can see entries for usage_events_20240201.log in the metrics tables, i.e. in metrics_submission? If so, then it is OK. I can double check how the graph is working – it could be that it needs some time, i.e. that yesterday is the last possible date…
Yes, I don't know what is happening – whether the statistics are for 20240131 or for 20240201.
Maybe one more thing to double check: what is in the column date in the metrics_submission table where load_id = usage_events_20240201.log? Is it 2024-02-01?
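A sketch of a query that would show this, assuming the metrics_submission columns mentioned in this thread:
-- Distinct usage dates recorded for that log file
SELECT load_id, `date`, COUNT(*) AS row_count
FROM metrics_submission
WHERE load_id = 'usage_events_20240201.log'
GROUP BY load_id, `date`;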
SELECT * FROM metrics_submission WHERE load_id = "usage_events_20240201.log";
Below is the 2024-02-01 log. [screenshot of the DB table rows]
Below is the 2024-01-31 log. [screenshot of the DB table rows]
Yes, it seems to be OK – in the DB table screenshot for 20240201 I only see the assoc_type for files (515, 531), and the graph shows it too, when you switch to "files". It is a little bit strange – it seems no abstract page was accessed on that day.
You could eventually double check in the log file for 20240201 itself, e.g. if there is any entry with "assocType":1048585.
And strangely, for 20240131 there are only abstract pages (1048585) and no files. But maybe the saving failed and was not complete because of the foreign key error that you get.
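To double check this beyond the screenshots, the saved rows could be grouped by assoc_type per log file – a sketch using the metrics_submission columns mentioned above:
-- Which assoc_type values were saved from each log file?
SELECT load_id, assoc_type, COUNT(*) AS row_count
FROM metrics_submission
WHERE load_id IN ('usage_events_20240131.log', 'usage_events_20240201.log')
GROUP BY load_id, assoc_type
ORDER BY load_id, assoc_type;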
Hi @bozana
I have checked and found "assocType":1048585 in the 20240201 log.
Are those entries with “assocType”:1048585 in usage_events_20240201.log for that journal (that you sent the graph screenshot of)?
If so, something seems to be wrong with the stats processing from that file as well.
I believe you would then need to somehow debug the processing:
You could also do the same for 20240131 log file, to see why there is no insertion error for that galley ID that does not exist… But one after another…
Best,
Bozana
I have checked: the 20240204 log file is still in the processing folder. It should have been completed in the morning, but the log file is still there. There are no failed jobs.
Hi @bozana
[2024-02-05 06:10:09] https://epubs.icar.org.in/
[2024-02-05 06:10:09] [Notice] Task process started.
[2024-02-06 05:10:05] https://epubs.icar.org.in/
[2024-02-06 05:10:05] [Notice] Task process started.
As I see, the statistics compilation has not been completed for the last two days. The statistics for 20240204 have not been prepared and the statistics for 20240205 have also not been generated.
Yesterday I saw that the 20240204 log file was in the processing folder; today it is in the stage folder and the 20240205 log file is in the processing folder.
As mentioned, I also did the manual process but the compilation is not completed, the log file is stuck in the processing folder.
No failed jobs in the fail queue.
I don’t know what’s happening.
Do you see any errors in the PHP error log file? – It seems like the scheduled task cannot complete for some reason…
I have checked; no errors are reported in the log.
Hmmm… strange…
Do you see any entries in the temporary stats tables?
Are you sure the last patch was applied successfully?
Do you see these two new lines in your PKPUsageStatsLoader: pkp/pkp-lib#9679 allow processing of the log files from the last month · pkp/pkp-lib@d0862af · GitHub ?
But somehow I think you would get/see an error if this were wrong… Hmmm…
Can you try the following:
Delete the entry for APP\tasks\UsageStatsLoader from the DB table scheduled_tasks, e.g. by running:
DELETE FROM scheduled_tasks WHERE class_name = 'APP\tasks\UsageStatsLoader';
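Before deleting, it could also help to look at that row, e.g. to see when the task last ran – a sketch that only relies on the class_name column referenced above:
-- Inspect the scheduled task entry before removing it
SELECT * FROM scheduled_tasks WHERE class_name LIKE '%UsageStatsLoader%';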
If that does not tell us anything:
Can you revert the last patch, those two changes from this link above, and then try the steps here again?
Hi @bozana
I have rolled back the last patch (pkp/pkp-lib#9679 allow processing of the log files from the last month · pkp/pkp-lib@d0862af · GitHub) and after reprocessing the log file is successfully processed and archived. Statistics have been generated for both dates, 20240204 and 20240205. I will keep an eye on the log file processing over the next 1-2 days and update you.
I think, though, that if we remove the last patch we will again face problems processing files on the last date of the month, like last time.