Maybe @bozana could check this.
After upgrading to OJS 18.104.22.168, I get an error in the middle of running the usage stats task. I added some debugging in UsageStatsLoader.inc.php (ojs/UsageStatsLoader.inc.php at ojs-stable-3_1_0 · pkp/ojs · GitHub) to print out the
$articleFile object before the error:
[Tue Jan 30 13:59:34 2018] [error] [client 22.214.171.124] SubmissionFile Object\n(\n [_data] => Array\n (\n [fileId] => 50054\n [submissionLocale] => fi_FI\n [revision] => 1\n [assocType] => 521\n [assocId] => 16578\n [submissionId] => 53302\n [fileStage] => 10\n [originalFileName] => 53302-50053-1-CE.pdf\n [filetype] => application/pdf\n [genreId] => 457\n [fileSize] => 3963948\n [uploaderUserId] => 1\n [userGroupId] => 750\n [viewable] => \n [dateUploaded] => 2015-11-17 20:38:50\n [dateModified] => 2015-11-17 20:38:50\n [name] => Array\n (\n [fi_FI] => 53302-50054-PB\n )\n\n )\n\n [_hasLoadableAdapters] => \n [_metadataExtractionAdapters] => Array\n (\n )\n\n [_extractionAdaptersLoaded] => \n [_metadataInjectionAdapters] => Array\n (\n )\n\n [_injectionAdaptersLoaded] => \n)\n
[Tue Jan 30 13:59:34 2018] [error] [client 126.96.36.199] SubmissionFile Object\n(\n [_data] => Array\n (\n [fileId] => 84426\n [submissionLocale] => fi_FI\n [revision] => 1\n [assocType] => 521\n [assocId] => 29873\n [submissionId] => 68578\n [fileStage] => 10\n [originalFileName] => filename.pdf.htm\n [filetype] => text/html\n [fileSize] => 51997\n [uploaderUserId] => 1\n [userGroupId] => 1409\n [viewable] => \n [dateUploaded] => 2017-12-15 20:40:47\n [dateModified] => 2017-12-15 20:40:47\n [name] => Array\n (\n [fi_FI] => filename.pdf.htm\n )\n\n )\n\n [_hasLoadableAdapters] => \n [_metadataExtractionAdapters] => Array\n (\n )\n\n [_extractionAdaptersLoaded] => \n [_metadataInjectionAdapters] => Array\n (\n )\n\n [_injectionAdaptersLoaded] => \n)\n
[Tue Jan 30 13:59:34 2018] [error] [client 188.8.131.52] PHP Fatal error: Call to a member function getCategory() on null in /plugins/generic/usageStats/UsageStatsLoader.inc.php on line 79
Note that the first SubmissionFile object does not cause any problems, but the second one triggers the error. Is the missing genre_id the reason? I have 2000 rows in submission_files with genre_id set to NULL.
@bozana I think this has to do with the bug where genre_id was set to NULL if the object was edited in 3.0.2, right? Fix genre assignment for upgrades · Issue #2506 · pkp/pkp-lib · GitHub
Do you have a cool piece of SQL to fix those 2000 rows of files?
Hi @ajnyga, so the missing genre_id for submission_files is the problem. As far as I know, all files should have a genre_id (i.e. I don’t know how your files ended up with genre_id = NULL), but it is actually not a requirement in the DB table schema, so maybe @asmecher knows the cases where files could have genre_id = NULL? In that case I would have to consider that in the script.
What kind of files are missing the genre_id in your DB table? Are they all article full texts?
Maybe you can first investigate only the galleys, because they are publicly available and are what causes the problem here. First see whether any of them is a supp or artwork file, i.e. something like:
SELECT file_id FROM submission_files
WHERE genre_id IS NULL AND assoc_type = 521
AND (file_id IN (SELECT file_id FROM submission_supplementary_files)
OR file_id IN (SELECT file_id FROM submission_artwork_files))
Those that are not supp or artwork files are probably full texts then, right? Then you could use SQL something like this:
UPDATE submission_files sf, genres g, submissions s SET sf.genre_id = g.genre_id WHERE sf.genre_id IS NULL AND g.entry_key = 'SUBMISSION' AND g.context_id = s.context_id AND s.submission_id = sf.submission_id AND sf.assoc_type = 521 AND sf.file_id NOT IN (SELECT file_id FROM submission_supplementary_files) AND sf.file_id NOT IN (SELECT file_id FROM submission_artwork_files)
For those that are supp or artwork files, you could maybe choose an appropriate genre key instead of
I am not sure if this is related to the issue above, because in that issue the files should have the genre_id = 1 and not NULL.
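One detail in the queries above is easy to get wrong: in SQL a comparison like genre_id = NULL never matches, so the check has to be written as genre_id IS NULL. Here is a minimal sketch of that on an in-memory SQLite database, using Python's sqlite3 module. The three tables are cut down to the columns used in this thread, the context IDs and genre 900 are made up for illustration, and the UPDATE uses a correlated subquery because SQLite does not accept the MySQL multi-table UPDATE form. It is not a script to run against a real OJS database.

```python
import sqlite3

# Toy schema: only the columns mentioned in this thread, not the real OJS tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE submissions (submission_id INTEGER PRIMARY KEY, context_id INTEGER);
    CREATE TABLE genres (genre_id INTEGER PRIMARY KEY, context_id INTEGER, entry_key TEXT);
    CREATE TABLE submission_files (
        file_id INTEGER, submission_id INTEGER, genre_id INTEGER,
        assoc_type INTEGER, file_stage INTEGER
    );
    -- context_id values and genre 900 are hypothetical
    INSERT INTO submissions VALUES (53302, 1), (68578, 2);
    INSERT INTO genres VALUES (457, 1, 'SUBMISSION'), (900, 2, 'SUBMISSION');
    INSERT INTO submission_files VALUES
        (50054, 53302, 457, 521, 10),
        (84426, 68578, NULL, 521, 10);
""")


def count(where):
    """Count submission_files rows matching a WHERE clause."""
    sql = f"SELECT COUNT(*) FROM submission_files WHERE {where}"
    return conn.execute(sql).fetchone()[0]


equals_null = count("genre_id = NULL")  # a comparison with NULL is never true
is_null = count("genre_id IS NULL")     # this is the check that finds the rows

# Backfill the journal's SUBMISSION genre for galley files (assoc_type 521).
conn.execute("""
    UPDATE submission_files
    SET genre_id = (
        SELECT g.genre_id FROM genres g
        JOIN submissions s ON s.context_id = g.context_id
        WHERE s.submission_id = submission_files.submission_id
          AND g.entry_key = 'SUBMISSION'
    )
    WHERE genre_id IS NULL AND assoc_type = 521
""")

fixed_genre = conn.execute(
    "SELECT genre_id FROM submission_files WHERE file_id = 84426"
).fetchone()[0]
remaining = count("genre_id IS NULL")
```

On a real installation it would be safer to run the SELECT first, check which rows it returns, and back up the database before any UPDATE.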
Hm, it seems that most of them are reviewer reports. I will check which file was causing the error, and whether it is actually a single case of a galley file with a section_id null.
Is there a specific file_stage I should be looking at with these cases?
@bozana, I think this is due to a bad XML import. The import was maybe missing a value for the genre attribute in the revision tag.
If you have
<revision genre=""> in the import XML, apparently you end up with NULL values in the database…
But this is solved. I just checked the SUBMISSION genre_id for the journal and added those values to cases where genre_id was null and stage was 10. They all came from a single journal. We have only imported full texts so it was a safe bet. Thanks once again @bozana!
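For anyone who wants to check an import file for this before importing, here is a rough sketch in Python. It is not part of OJS. The sample XML is a simplified, made-up structure: only the revision element and its genre attribute come from this thread, while the wrapper elements and the filename attribute are assumptions. It lists every revision whose genre attribute is empty or missing.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical import fragment; real import files have more structure.
SAMPLE = """
<articles>
  <submission_file id="1">
    <revision number="1" genre="Article Text" filename="a.pdf"/>
  </submission_file>
  <submission_file id="2">
    <revision number="1" genre="" filename="filename.pdf.htm"/>
  </submission_file>
  <submission_file id="3">
    <revision number="1" filename="c.pdf"/>
  </submission_file>
</articles>
"""


def revisions_without_genre(xml_text):
    """Return the filename attribute of each revision with an empty or missing genre."""
    root = ET.fromstring(xml_text.strip())
    bad = []
    for element in root.iter():
        # Compare local names so the check also works on namespaced documents.
        if element.tag.rsplit("}", 1)[-1] != "revision":
            continue
        if not (element.get("genre") or "").strip():
            bad.append(element.get("filename"))
    return bad


missing = revisions_without_genre(SAMPLE)
```

With the sample above, the second and third revisions are reported, since an empty attribute and a missing attribute are treated the same way.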
Ah, yes, that could be. Now, I think, such an import would not be possible – there would be the appropriate import result message (that the genre could not be found)…
Thanks a lot @ajnyga!
Another error, this time on line 77 with the $articleFile->getGenreId() call. I am debugging this and will let you know what it is. Two log files were already processed successfully.
This is something different. @bozana, I would bet some money that you will get a lot of similar questions in the forum during the next few months.
The submission_file in this case is missing altogether.
What happens with the current code in a scenario where:
- A journal adds galley file A and publishes article. The file_id is saved in the usage stats log.
- Before the logs are processed the journal removes the galley file A and adds galley file B
- Scheduled tasks are run, the submission_file with the id mentioned in the logs can not be found => error on line 77, right?
edit: this scenario definitely needs fixing. I removed the references to the removed file and the log file got processed. But basically anyone could run into a similar problem.
Hmmm… I will then look into how to just write a warning to the scheduled task log file and continue with the processing in such a case…
Thanks a lot!
Thanks, all the log files have now been processed successfully. There was only one such case.
I created a GitHub Issue and provided a patch there: consider missing submission file in usage stats loader · Issue #3332 · pkp/pkp-lib · GitHub – As in other similar cases, I just break the switch statement i.e. continue with the next line in the log file…
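The idea of skipping such a log line instead of failing can be sketched roughly like this. This is a generic Python illustration, not the actual patch from the issue: process_entries, the dictionary lookup and file ID 99999 are hypothetical stand-ins for the loader, the DAO lookup and a deleted galley file.

```python
import logging

logger = logging.getLogger("usage_stats_sketch")


def process_entries(entries, files_by_id):
    """Count views per file, skipping entries whose file no longer exists."""
    counts = {}
    skipped = []
    for entry in entries:
        file_id = entry["file_id"]
        submission_file = files_by_id.get(file_id)
        if submission_file is None:
            # The file was deleted after the request was logged:
            # warn and move on to the next log line instead of crashing.
            logger.warning("Skipping log entry: submission file %s not found", file_id)
            skipped.append(file_id)
            continue
        counts[file_id] = counts.get(file_id, 0) + 1
    return counts, skipped


# File 50054 still exists; 99999 stands for a galley removed before processing.
files = {50054: {"genre_id": 457}}
entries = [{"file_id": 50054}, {"file_id": 99999}, {"file_id": 50054}]
counts, skipped = process_entries(entries, files)
```

The remaining entries are still counted, and the skipped IDs stay visible in the log so an administrator can check them later.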
I think reviewer reports legitimately don’t have a genre ID.
Public Knowledge Project Team
Yes, those did not turn out to be the issue here. There were actually two separate cases. The first one was due to a bad XML import and the second one is the scenario Bozana describes in the GitHub issue above.
Great, then the requirement for the published files to have a genre_id is correct…
Thanks a lot to both of you, @ajnyga and @asmecher!
So, if I have a similar problem, what should I do? Will this be fixed in the next OJS release, or do I need to edit the database?
If the problem is that the file (that was logged once) does not exist any more, then there is a fix in this GitHub Issue: consider missing submission file in usage stats loader · Issue #3332 · pkp/pkp-lib · GitHub.
The second problem here was that some public galley/supp files did not have a genre_id in the DB because of a faulty import. This should be fixed manually in the DB.
Hmm, I have checked the database, and it seems the error that we get after the upgrade has a different origin. I will open another topic for it.