[MySQL 8.0.21] Upgrading OJS 2.4.8.1 to OJS 3.1.1.4 ERROR: Upgrade failed: DB: Cannot convert string X from utf8mb3 to latin1

Hi everyone, I am trying to upgrade an OJS 2.4.8.1 database to OJS 3.1.1.4. The original database is encoded as latin1 with the latin1_swedish_ci collation. I am running the upgrade on my laptop, which has Ubuntu 20.04 and MySQL 8.0.21, and I have noticed that when I restore the dump file in MySQL some characters are misencoded as ó, á, ñ, Ã, é.

I have noticed that MySQL 8 uses utf8mb4 as its default character set, but the database used was encoded as utf8. I have run the upgrade process about 5 times and it always finished with the following error:

[data: dbscripts/xml/upgrade/3.0.0_update.xml]
ERROR: Upgrade failed: DB: Cannot convert string '\xCE\x95\xCF\x80\xCE\xB9...' from utf8mb3 to latin1

Does anyone know how to correct this error? I have tried converting the latin1 data to utf8, and the upgrade process then finished successfully, but I have noticed that searches never return any results.

Thanks for your attention.

Hi @juancure,

See e.g. this thread: Strange Characters in abstracts since upgrading to OJS 3.2.1.1 (Bootstrap)

Regards,
Alec Smecher
Public Knowledge Project Team

Hi @asmecher, in the original config.inc.php the client_charset is utf-8 and the connection_charset and database_charset are both Off. I have kept the same values in the config.inc.php of OJS 3.1.1.4.

Hi @juancure,

In that case, the content in your old installation is double-encoded in MySQL, but because the character set configuration in OJS is not correct, the two problems cancel out and are not apparent when using OJS. However, when moving the data to your machine, which has UTF8MB4, the incorrectly-encoded data no longer comes through OK.
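
To illustrate the double encoding described above, here is a minimal Python sketch (not OJS or MySQL code). It assumes Python's cp1252 codec is a close enough stand-in for MySQL's latin1, and the function name and sample word are made up for illustration.

```python
def double_encode(text: str) -> str:
    """Simulate UTF-8 bytes being misread as latin1 (cp1252 here, as an assumption)."""
    # Step 1: the application sends UTF-8 bytes.
    utf8_bytes = text.encode("utf-8")
    # Step 2: the database treats each byte as a separate latin1 character.
    return utf8_bytes.decode("cp1252")


if __name__ == "__main__":
    original = "publicación"          # hypothetical sample value
    garbled = double_encode(original)
    print(garbled)                    # publicaciÃ³n
    # Reading the data back through the same wrong path undoes the damage,
    # which is why the problem stays hidden inside the old installation.
    print(garbled.encode("cp1252").decode("utf-8"))  # publicación
```

The garbled form matches the kind of sequences reported in the first post (ó, á, ñ, é), which is consistent with UTF-8 bytes having been stored in latin1 columns.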

I would recommend transcoding your database contents to have it correctly encoded as UTF8 before running the upgrade.
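
As a rough sketch of what transcoding means for a single value, the Python function below reverses one layer of double encoding and leaves anything else untouched. It is an illustration only, not an OJS tool: the function name is hypothetical, cp1252 is assumed as a stand-in for MySQL's latin1, and a real conversion would be done on the database itself, on a copy, after taking a backup.

```python
def repair_double_encoded(text: str) -> str:
    """Undo one layer of UTF-8-read-as-latin1 damage, if the text has it.

    Hypothetical helper; cp1252 approximates MySQL's latin1 (an assumption).
    """
    try:
        # Turn each character back into its original byte, then decode
        # the byte sequence as the UTF-8 it was meant to be.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text is not double-encoded (or not cleanly), so keep it as-is.
        return text


if __name__ == "__main__":
    for value in ["publicaciÃ³n", "publicación", "plain ascii"]:
        print(repr(value), "->", repr(repair_double_encoded(value)))
```

The fallback branch matters because a database often mixes correctly encoded and double-encoded rows, and a blind conversion would damage the correct ones. Bytes that cp1252 leaves undefined (for example 0x81) would also fall into this branch here, whereas MySQL's latin1 does map them, so this sketch can miss some damaged values.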

Regards,
Alec Smecher
Public Knowledge Project Team