We have the newest version of OJS installed and are trying to get the full-text search running. We have already taken the following steps:
enable section for pdftotext in config.inc.php
test pdftotext with galley file
run php tools/rebuildSearchIndex.php
clear cache with rm -rf cache/_db/*
also searched database for specific words that only appear in PDFs (no result)
Running rebuildSearchIndex.php gives us the following errors, but also results in a message that all articles were indexed:
PHP Warning: Declaration of PKPUsageEventPlugin::getEnabled() should be compatible with LazyLoadPlugin::getEnabled($contextId = NULL) in /srv/www/.../html/lib/pkp/plugins/generic/usageEvent/PKPUsageEventPlugin.inc.php on line 386
PHP Warning: Declaration of SubmissionFileDAO::fromRow($row) should be compatible with PKPSubmissionFileDAO::fromRow($row, $fileImplementation) in /srv/www/.../html/classes/article/SubmissionFileDAO.inc.php on line 23
Do you have any advice on what we could try next to get the search running?
One thing to check: I was initially using index[application/pdf] = "/usr/bin/pstotext -enc UTF-8 -nopgbrk %s - | /usr/bin/tr '[:cntrl:]' ' '"
to index the PDFs and was not getting any results. Once I switched to index[application/pdf] = "/usr/bin/pdftotext -enc UTF-8 -nopgbrk %s - | /usr/bin/tr '[:cntrl:]' ' '"
things looked much better.
Thanks,
Jim
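For anyone comparing the two index[application/pdf] lines above, here is a rough sketch of how one might test such a command outside OJS. It assumes that %s stands for the file path, as the config lines suggest, and that the path contains no quotes, ampersands, backslashes or pipe characters. The function name extracts_word is made up for this example and is not part of OJS.

```shell
#!/bin/sh
# Hypothetical helper (not part of OJS): run an index[...] command template
# with %s replaced by a file path and check whether the output has a word.
# Assumes a simple path without quotes, '&', '\' or '|' characters.
extracts_word() {
    template=$1   # the command string from config.inc.php
    file=$2       # path to a galley file
    word=$3       # a word that only appears in the full text

    # Replace the first %s with the single-quoted file path.
    cmd=$(printf '%s' "$template" | sed "s|%s|'$file'|")

    # Run the pipeline and search its output (case-insensitive, quiet).
    sh -c "$cmd" 2>/dev/null | grep -qi -- "$word"
}

# Example call (path and word are placeholders):
# extracts_word "/usr/bin/pdftotext -enc UTF-8 -nopgbrk %s - | /usr/bin/tr '[:cntrl:]' ' '" \
#     /path/to/galley.pdf someword && echo "word found"
```

If the function returns non-zero for a word you know is in the PDF, the problem is likely in the extraction command rather than in the search index itself.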
Thanks for your reply, @jbutler! We were using pdftotext from the beginning, but I agree it is confusing that pstotext is listed in config.inc.php as the first example of a PDF indexing tool, since there is a separate PostScript section below it.
So your indexing now works completely? On re-visiting our issue, it seems like
metadata is indexed from all articles
full-text seems only to be indexed from the most recent issue
May I ask what version of OJS you are using, and what version of database and poppler-tools (the package pdftotext usually is installed with)?
Hi, yes, indexing looks good now. It is interesting that it's only grabbing the full text from the most recent issue. Maybe a permissions issue…
Debian, currently running OJS 3.1.1.4 | poppler-utils 0.26.5-2 | poppler-data 0.4.7-1
Thanks
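To follow up on the permissions idea, a small diagnostic along these lines might help narrow down why only some issues get full text. It is only a sketch: the function name check_pdfs and the EXTRACT variable are made up for this example, and the default extractor is an assumption that should be adjusted to match your index[application/pdf] setting. Run it as the same user the web server or the rebuild script runs as, otherwise the readability check tells you little.

```shell
#!/bin/sh
# Hypothetical diagnostic (not part of OJS): report, for each PDF under a
# directory, whether it is readable and whether text can be extracted.
# EXTRACT is an assumed default; it is called as: $EXTRACT <file> -
EXTRACT=${EXTRACT:-"pdftotext -enc UTF-8 -nopgbrk"}

check_pdfs() {
    dir=$1
    find "$dir" -type f -name '*.pdf' | while IFS= read -r f; do
        if [ ! -r "$f" ]; then
            # The current user cannot read the file at all.
            echo "UNREADABLE $f"
        elif [ -z "$($EXTRACT "$f" - 2>/dev/null </dev/null | tr -d '[:space:][:cntrl:]')" ]; then
            # Extraction ran but produced nothing except whitespace.
            echo "EMPTY $f"
        else
            echo "OK $f"
        fi
    done
}

# Example call (placeholder path; use the directory that holds your galleys):
# check_pdfs /path/to/files_dir
```

Files reported as UNREADABLE point towards permissions, while EMPTY ones may be scanned images or PDFs the extractor cannot handle.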