Hi,
We are running OJS 3.2.1.1 on RHEL 7.9 with MySQL 5.6.
We recently upgraded from OJS 2.8 and noticed that full-text indexing of PDFs is no longer working. For our migrated journals, older articles are still full-text searchable, but articles added since the migration to 3.2.1.1 are not.
This line is uncommented in our config.inc.php:
index[application/pdf] = "/usr/bin/pdftotext -enc UTF-8 -nopgbrk %s - | /usr/bin/tr '[:cntrl:]' ' '"
When I run the above command manually on the command line against a newly added article, the expected output is sent to stdout.
The apache user owns all files and directories in the OJS files directory and has rwx permissions on them.
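A minimal sketch for repeating that manual check under the apache account, in case the interactive shell behaves differently from the web server environment. It assumes the indexer substitutes the shell-quoted file path for %s in the configured command and runs the result through a shell; that is my understanding of how the helper is invoked, not something confirmed here. The function names are hypothetical and not part of OJS.

```python
import shlex
import subprocess

# Command template copied from the index[application/pdf] line in config.inc.php.
PDF_INDEX_TEMPLATE = (
    "/usr/bin/pdftotext -enc UTF-8 -nopgbrk %s - | /usr/bin/tr '[:cntrl:]' ' '"
)


def build_index_command(template: str, file_path: str) -> str:
    """Replace the %s placeholder with the shell-quoted file path.

    Assumption: this mirrors how the indexer fills in the template.
    """
    return template.replace("%s", shlex.quote(file_path))


def extract_text(template: str, file_path: str, timeout: int = 60):
    """Run the helper command and return (exit_code, stdout, stderr)."""
    command = build_index_command(template, file_path)
    result = subprocess.run(
        command,
        shell=True,            # the template contains a pipe, so a shell is needed
        capture_output=True,
        timeout=timeout,
    )
    stdout = result.stdout.decode("utf-8", errors="replace")
    stderr = result.stderr.decode("utf-8", errors="replace")
    return result.returncode, stdout, stderr
```

Calling extract_text(PDF_INDEX_TEMPLATE, path_to_pdf) from a script run as the apache user (for example with sudo -u apache) and checking for empty stdout or any stderr output would show whether the helper works in that context. Note that with a pipe the exit code is that of tr, so a pdftotext failure may only be visible in stderr.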
I ran the rebuildSearchIndex.php tool on the command line on a test instance, and it reported that it had successfully indexed 47 articles. However, when I checked the database, I saw that the number of table entries had dropped as follows:
submission_search_keyword_list dropped from 18532 to 1029 records
submission_search_object_keywords dropped from 74491 to 2144 records
submission_search_objects dropped from 416 to 370 records
It seems that only the article metadata is now being indexed.
The only thing I see in the logs is a PHP "Division by zero" warning whenever an editorial decision is recorded, but this does not stop us from pushing an article through the editorial workflow and publishing it as expected. I saw no errors in the log when I ran rebuildSearchIndex… The tool did output a couple of PHP notices about "Array to string conversion", and two PDFs were not found (the path was given as journals/1//articles/… instead of journals/1/articles/…, but this seems to be an unrelated issue).
Can anyone suggest where the problem might lie?
Many thanks.