Fatal error Invalid datetime format when rebuilding search index

Hi, Merry Christmas

I have update a system to OJS 3.3.0-13 and execute the tool for reindexing all the system.

php tools/rebuildSearchIndex.php

The tool indexed correctly the first journal but in the second one it throws an error

Indexando “Colombia forestal” … PHP Fatal error: Uncaught PDOException: SQLSTATE[22007]: Invalid datetime format: 1366 Incorrect string value: ‘\xF0\x9D\x91\x92\xF0\x9D…’ for column revistas.submission_search_keyword_list.keyword_text at row 1 in /var/www/html/lib/pkp/lib/vendor/doctrine/dbal/lib/Doctrine/DBAL/Driver/PDOStatement.php:115
Stack trace:
#0 /var/www/html/lib/pkp/lib/vendor/doctrine/dbal/lib/Doctrine/DBAL/Driver/PDOStatement.php(115): PDOStatement->execute(NULL)
#1 /var/www/html/lib/pkp/lib/vendor/laravel/framework/src/Illuminate/Database/Connection.php(489): Doctrine\DBAL\Driver\PDOStatement->execute()
#2 /var/www/html/lib/pkp/lib/vendor/laravel/framework/src/Illuminate/Database/Connection.php(664): Illuminate\Database\Connection->Illuminate\Database{closure}(‘INSERT INTO sub…’, Array)
#3 /var/www/html/lib/pkp/lib/vendor/laravel/framework/src/Illuminate/Database/Connection.php(631): Illuminate\Database\Connection->runQueryCallback(‘INSERT INTO sub…’, Array, Object(Closure))
#4 /var/www/html/lib/pkp/lib/vendor/laravel/framework/src/ in /var/www/html/lib/pkp/lib/vendor/laravel/framework/src/Illuminate/Database/Connection.php on line 671

Database and tables are Collation is utf8_general_ci and character_set_database is utf8

What could it be? Regards

Hello all,

I ran into the same error after upgraded from OJS 3.3.0-8 to OJS 3.3.0-14. Php version 7.3.33 Linux server. Running same script php tools/rebuildSearchIndex.php

The script successfully indexed the first 2 journals and ran into error on the 3rd one and stopped therefore the next 34 journals did not get a chance of indexing.

PHP Fatal error: Uncaught PDOException: SQLSTATE[22007]: Invalid datetime format: 1366 Incorrect string value: ‘\xF3\xBA\xAE\xA2-’ for column ojs.submission_search_keyword_list.keyword_text at row 1 in /var/www/html/ojs-3.3.0-14/lib/pkp/lib/vendor/doctrine/dbal/lib/Doctrine/DBAL/Driver/PDOStatement.php:119
Stack trace:
#0 /var/www/html/ojs-3.3.0-14/lib/pkp/lib/vendor/doctrine/dbal/lib/Doctrine/DBAL/Driver/PDOStatement.php(119): PDOStatement->execute(NULL)
#1 /var/www/html/ojs-3.3.0-14/lib/pkp/lib/vendor/laravel/framework/src/Illuminate/Database/Connection.php(489): Doctrine\DBAL\Driver\PDOStatement->execute()
#2 /var/www/html/ojs-3.3.0-14/lib/pkp/lib/vendor/laravel/framework/src/Illuminate/Database/Connection.php(664): Illuminate\Database\Connection->Illuminate\Database{closure}(‘INSERT INTO sub…’, Array)
#3 /var/www/html/ojs-3.3.0-14/lib/pkp/lib/vendor/laravel/framework/src/Illuminate/Database/Connection.php(631): Illuminate\Database\Connection->runQueryCallback(‘INSERT INTO sub…’, Array, Object(Closure))
#4 /var/www/html/ojs-3.3.0-14/lib/pkp/lib/vendor/laravel/framework/src/Illuminate/Database/Connection.php(496): Illuminate\Database\Connection->run(‘INSERT INTO sub…’, Array, Object(Closure))
#5 /var/www/html/ojs-3.3.0-14/lib/pkp/lib/vendor/laravel/framework/src/Illuminate/Database/Capsule/Manager.php(200): Illuminate\Database\Connection->affectingStatement(‘INSERT INTO sub…’, Array)
#6 /var/www/html/ojs-3.3.0-14/lib/pkp/classes/db/DAO.inc.php(140): Illuminate\Database\Capsule\Manager::__callStatic(‘affectingStatem…’, Array)
#7 /var/www/html/ojs-3.3.0-14/lib/pkp/classes/search/SubmissionSearchDAO.inc.php(38): DAO->update(‘INSERT INTO sub…’, Array, true, false)
#8 /var/www/html/ojs-3.3.0-14/lib/pkp/classes/search/SubmissionSearchDAO.inc.php(113): SubmissionSearchDAO->insertKeyword(‘???\xF3\xBA\xAE\xA2-’)
#9 /var/www/html/ojs-3.3.0-14/classes/search/ArticleSearchIndex.inc.php(306): SubmissionSearchDAO->insertObjectKeyword(‘2709459’, ‘???\xF3\xBA\xAE\xA2-’, 6370)
#10 /var/www/html/ojs-3.3.0-14/classes/se in /var/www/html/ojs-3.3.0-14/lib/pkp/lib/vendor/laravel/framework/src/Illuminate/Database/Connection.php on line 671

I appreciate any help or pointing.

Best,

Dung.

1 Like

Hi again,

I am also wondering if there is a way we can index each journal separately to avoid the problematic journal?

for example:

php tools/rebuildSearchIndex.php (can we specify what journals to index here)?

Thanks!

Dung.

A post was split to a new topic: Rebuilding search index - PDF characters

Hi @dung,

With apologies for the late response, the built-in (MySQL/PostgreSQL) search engine implementation doesn’t currently support re-indexing only certain journals. The Lucene search engine does, but I doubt you’re using that.

I’d suggest tweaking your text extraction tools in config.inc.php so that invalid characters are filtered.

Regards,
Alec Smecher
Public Knowledge Project Team

1 Like

Hello,

Thank you for the pointer, I will look into adjusting the extraction configuration.

Best,

Dung.

Hi Alec,

I made a modification in order to find the error when indexing “Colombia forestal”

Everything points to the indexing of a word from an xml file of an article, this is the extract of the XML file:

<fig id="f6">
						<label>Figura 6</label>
						<caption>
							<title>Comparación de los estimados de los modelos de mejor ajuste para la estimación de la biomasa aérea y subterránea de árboles individuales de <italic>Hevea brasiliensis</italic> en el departamento del Caquetá, Colombia, y los recomendados por <xref ref-type="bibr" rid="B5">Brahma <italic>et al.</italic> (2017)</xref> (Ec. 1: B a = exp (−3.31+0.95∗Ln dap 2 ∗at ) ∗1.02 y Ec. 5: 𝐵 𝑏 = 𝑒𝑥𝑝 (−2.64+0.60∗𝐿𝑛 𝑑𝑎𝑝 2 ∗𝑎𝑡 ) ∗1.04), de <xref ref-type="bibr" rid="B27">Monroy-Rivera y Návar-Cháidez (2004)</xref> (Ec. 2: 𝐵 𝑎 =−548.448+0.008684∗ 𝑑𝑎𝑝 2 ∗𝑎𝑡+27.48∗𝑎𝑡+6.949∗𝑑𝑎𝑝), de <xref ref-type="bibr" rid="B28">Moreno et al. (2005)</xref> (Ec. 3: 𝐵 𝑎 =0.0041∗ 𝑐𝑎𝑝 2.596 y Ec. 6: 𝐵 𝑏 =0.0022∗ 𝑐𝑎𝑝 2.357 ) y de <xref ref-type="bibr" rid="B37">Sone et al. (2014)</xref> (Ec. 4: 𝐵 𝑎 =0.144∗ 𝑑𝑎𝑝 2.40 y Ec. 7: 𝐵 𝑏 =0.0661∗ 𝑑𝑎𝑝 2.02 ). B<sub>
 <italic>a</italic>
</sub> : biomasa aérea (kg.árbol<sup>-1</sup>); B<sub>
 <italic>b</italic>
</sub> : biomasa subterránea (kg.árbol<sup>-1</sup>); DAP: diámetro a la altura del pecho (cm); AT: altura total (m); CAP: circunferencia del tronco a la altura del pecho (cm). Las líneas continúan representan el rango de biomasa donde se recomienda la aplicación de los modelos desarrollados.</title>
						</caption>
						<graphic xlink:href="0120-0739-cofo-25-02-57-gf6.png"/>
					</fig>

So when trying to add the word 𝑒𝑥𝑝 into the table submission_search_keyword_list the rebuilding index process stops with the error I posted initially. My database character set is utf8 and database and table collation is utf8_general_ci

Could you guide to a solution, please,

Thank you in advance

Hi @lcmartinezru,

I suspect you’ve run into an UTF8MB4 character, which cannot be represented in the utf8_general_ci that you’re using. See this StackOverflow discussion for a discussion of the issues involved.

To adapt your database to UTF8MB4 there are two steps:

  1. Convert the database contents. You can do this from PHPMyAdmin, apparently, or use mysqldump to generate a database dump and change the encodings there before reloading. (There will probably be lots of information on StackOverflow about this.)
  2. Configure OJS to use the new encoding; see this thread for details.

Regards,
Alec Smecher
Public Knowledge Project Team

Thank you Alec,

First I am trying to reindex adding this code to the file lib/pkp/classes/search/SubmissionSearchDAO.inc.php in order to skip words with those problematic symbols

EDIT: It worked, the re-indexing was completed.

Regards,

                if (PKPString::strlen($keyword) > self::MAX_KEYWORD_LENGTH) return null;
                if (preg_match("/[𝒂𝐵𝑏𝑐𝑑𝒆𝑬𝒇𝐠𝒊𝒍𝐿𝒎𝒏𝐧𝒐𝑝𝒑𝒓𝒔𝑡𝒖𝐯𝒙𝒛􀀀]/", $keyword)) {
                        return null;
                }```

This topic was automatically closed after 12 days. New replies are no longer allowed.