Tool for cleaning up database?

When I run ‘php rebuildSearchIndex.php’, I get several messages like:

I/O Error: Couldn’t open file ‘/var/www/files/journals/9/articles/1193/6359660725f5d.pdf’: No such file or directory.

I don’t know why the tools is looking for those files, I assume it’s entries in the database. Is there a safe way to clean up the DB?

I’m using OJS 3.3.0.13

My Example error: (If you see errors in your queries while following instructions below, try to correct it accordingly)

php tools/rebuildSearchIndex.php Clearing index … done Indexing “Alberta Journal of Educational Research” … 3315 articles indexed Indexing “ARIEL: A Review of International English Literature” … I/O Error: Couldn’t open file ‘/data/production/journals/86/articles/35700/submission/proof/35700-49-90662-1-10-20170221.pdf’: No such file or directory.

Run the SQL to locate that record and any siblings:

– Find this specific file record (this is inspection query)
SELECT sf.submission_id, sf.file_id, sf.file_stage, sf.assoc_type, sf.created_at
FROM submission_files sf
WHERE sf.submission_id = 35700
AND (sf.file_id = 90662 OR sf.created_at LIKE ‘%90662%’);

– Find the galley pointing to it (if it’s a galley rec) (this is inspection query)
SELECT pg.galley_id, pg.label, pg.locale, pg.submission_file_id
FROM publication_galleys pg
WHERE pg.submission_file_id = 90662;

(this is inspection query)

SELECT * FROM publication_galleys WHERE submission_file_id = 90662;

Update/Change Resolution:

If You cannot restore the file (most common in older migrations)

using bash linux command to check, if you can find the file to put back that is great if not move to Clean up DB so the indexer won’t look for it:

FILES_DIR=“/data/production/journals”
test -f “$FILES_DIR/journals/86/articles/35700/submission/proof/35700-49-90662-1-10-20170221.pdf” || echo “MISSING”

Clean up DB so the indexer won’t look for it: (I practice first in Dev env)

  1. If it’s a galley: either upload a replacement through the UI (preferred) or temporarily remove the galley link:
-- Detach the galley (non-destructive; removes the broken pointer)
UPDATE publication_galleys SET submission_file_id = NULL WHERE submission_file_id = 90662;
-- (Optionally delete the galley if it’s truly orphaned)
DELETE FROM publication_galleys WHERE galley_id = <id>;

  1. If it’s a non-galley submission file and truly orphaned, you can soft-remove it by moving it to the “deleted” stage or fully delete it (take a backup first):
-- Mark as deleted (safer approach)
UPDATE submission_files
SET file_stage = 99  -- SUBMISSION_FILE_DEPENDENT or a non-indexed stage; varies by OJS version
WHERE file_id = 90662;

-- Or, if you are certain it's orphaned, delete:
DELETE FROM submission_files WHERE file_id = 90662;

  1. Rebuild the index:
php tools/rebuildSearchIndex.php

Done.