Okay to Change Database Encoding (utf8 -> utf8mb4)?

After upgrading to the latest OJS (3.1.1.2), I also changed the database collation from utf8_unicode_ci to utf8mb4_unicode_ci using phpMyAdmin.

I did so using PMA under (OJS) Database → Operations → Collation. I also selected the two options that appear, “Change all tables collations” and “Change all tables columns collations”.

Are there any potential issues with having done so?

Also, I now have the following in config.inc.php:

client_charset = utf-8
connection_charset = utf8mb4
database_charset = utf8mb4

Is that correct?

Hi @wptx,

I haven’t used those settings myself, but generally speaking you can check by making sure that multibyte content displays properly on the website, and by confirming that the content also looks fine in a tool like phpMyAdmin (or the command-line mysql client). I generally also check that MySQL has the right understanding of content lengths, e.g. by running a LENGTH query on some rich content (containing both accented and unaccented characters). If your configuration is incorrect or inconsistent, the LENGTH query will report the wrong number.

(This seems like a convoluted way to test, but I’ve found that when you’re connected to a server over SSH, your console or SSH client may sometimes correct encodings for you rather than reporting that they aren’t as expected, so it can’t always be trusted to show you verbatim what is stored.)
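To illustrate why the length check is useful, here is a rough Python sketch (not OJS code) of the numbers you would expect to see. It relies on the fact that MySQL's LENGTH() counts bytes while CHAR_LENGTH() counts characters. The sample string and the function names are made up for illustration, and the "double-encoded" case only simulates one common kind of misconfiguration (UTF-8 bytes misread as Latin-1 and stored again); a real setup may fail differently.

```python
# Hypothetical sample: unaccented and accented characters plus a 4-byte emoji.
sample = "Résumé naïve café 😀"


def byte_length(text: str) -> int:
    """Byte count of the UTF-8 encoding; what LENGTH() should report
    for text stored correctly in a utf8mb4 column."""
    return len(text.encode("utf-8"))


def char_length(text: str) -> int:
    """Character count; what CHAR_LENGTH() should report."""
    return len(text)


def double_encoded(text: str) -> str:
    """Simulate a mismatched connection charset: the UTF-8 bytes are
    misread as Latin-1 characters, which then get stored as UTF-8 again."""
    return text.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    print("characters:", char_length(sample))
    print("bytes when stored correctly:", byte_length(sample))
    print("bytes when double-encoded:", byte_length(double_encoded(sample)))
```

If the stored value's LENGTH matches the correct byte count and CHAR_LENGTH matches the character count, the charset settings are probably consistent. An inflated byte count, as in the simulated double-encoded case, suggests that something along the way is misinterpreting the data.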

Regards,
Alec Smecher
Public Knowledge Project Team

LENGTH appears to return the correct value, at least for new or modified mixed content. I haven’t seen any issues with the display of existing content, so this seems to be a valid way to change the collation and specify the character set.

I was concerned when I encountered another error, but that appears to be unrelated to the database changes.
Thanks for the tips.