Can someone explain why this is happening and how to fix it? Currently I have to copy the text from the old installation and paste it into the new one, which is extremely time consuming. I have to do this for every journal that has French content.
I suspect you inadvertently changed your UTF-8 configuration when you migrated: either the database settings in your config.inc.php configuration file, or the character set in effect when you created the new database and moved the old content over (probably with the mysql and mysqldump tools). I wouldn’t suggest correcting this manually, because you would have to repeat the correction in a lot of places. Instead, I’d suggest checking your settings, finding out where the configuration mismatch occurred, and fixing it there.
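To illustrate the kind of mismatch described above, here is a minimal Python sketch. It assumes (this is not confirmed for this particular installation) that the text was written as UTF-8 bytes but is read back as if it were Latin-1. The function name `garble` and the sample string are made up for the illustration.

```python
def garble(text: str) -> str:
    """Simulate a charset mismatch: UTF-8 bytes misread as Latin-1.

    Assumption for illustration only: one side writes UTF-8, the other
    side interprets the same bytes as Latin-1.
    """
    stored_bytes = text.encode("utf-8")   # what was actually written
    return stored_bytes.decode("latin-1") # what a mismatched reader sees


sample = "résumé français"
print(garble(sample))                          # rÃ©sumÃ© franÃ§ais
print(sample.encode("utf-8").decode("utf-8"))  # résumé français (settings match)
```

Every accented character turns into two characters, which is the typical symptom of this class of problem. If the garbling on your site looks different, the mismatch is probably of another kind.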
Regards,
Alec Smecher
Public Knowledge Project Team
Hi @asmecher,
The journal in question was using OJS-2.3.7. Here is the relevant part of config.inc.php:
; Default locale
locale = en_US
; Client output/input character set
client_charset = utf-8
; Database connection character set
; Must be set to "Off" if not supported by the database server
; If enabled, must be the same character set as “client_charset”
; (although the actual name may differ slightly depending on the server)
connection_charset = Off
; Database storage character set
; Must be set to "Off" if not supported by the database server
database_charset = Off
; Enable character normalization to utf-8 (recommended)
; If disabled, strings will be passed through in their native encoding
; Note that client_charset and database collation must be set
; to "utf-8" for this to work, as characters are stored in utf-8
charset_normalization = Off
The same journal was then migrated to OJS-2.4.7.1.
Here is the same part of config.inc.php:
; Default locale
locale = en_US
; Client output/input character set
client_charset = utf-8
; Database connection character set
; Must be set to "Off" if not supported by the database server
; If enabled, must be the same character set as “client_charset”
; (although the actual name may differ slightly depending on the server)
connection_charset = utf8
; Database storage character set
; Must be set to "Off" if not supported by the database server
database_charset = utf8
; Enable character normalization to utf-8 (recommended)
; If disabled, strings will be passed through in their native encoding
; Note that client_charset and database collation must be set
; to "utf-8" for this to work, as characters are stored in utf-8
charset_normalization = utf-8
As you can see, in the old version client_charset = utf-8 was the only one defined. I went in and defined all the others.
Should I have left the others Off?
I would suggest using the same settings as your old installation to make sure your transfer went smoothly and no new problems were introduced. Your old installation was not properly configured for UTF-8 in the database, and in order to fix that you’ll likely have to generate a database dump, transcode it with something like iconv, and load it – but it’s best not to solve this at the same time as changing servers, because you’ll compound any problems that either process might introduce.
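To see exactly which settings differ between the two installations, a small comparison can help. The following is a rough Python sketch, not part of OJS: `parse_settings` and `diff_settings` are hypothetical helpers that read only simple `key = value` lines and ignore `;` comments, and the two strings reproduce the values quoted earlier in this thread.

```python
def parse_settings(text: str) -> dict:
    """Parse simple 'key = value' lines, skipping blanks and ';' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";") or line.startswith("["):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def diff_settings(old: dict, new: dict) -> dict:
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# Values copied from the two config.inc.php excerpts in this thread.
OLD_CONFIG = """
locale = en_US
client_charset = utf-8
connection_charset = Off
database_charset = Off
charset_normalization = Off
"""

NEW_CONFIG = """
locale = en_US
client_charset = utf-8
connection_charset = utf8
database_charset = utf8
charset_normalization = utf-8
"""

changed = diff_settings(parse_settings(OLD_CONFIG), parse_settings(NEW_CONFIG))
for key, (before, after) in changed.items():
    print(f"{key}: {before} -> {after}")
# charset_normalization: Off -> utf-8
# connection_charset: Off -> utf8
# database_charset: Off -> utf8
```

Setting the three changed values back to what the old installation used is the "same settings" approach suggested above.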
Regards,
Alec Smecher
Public Knowledge Project Team
charset_normalization should generally be Off. It used to be useful when some browsers and servers weren’t reliably using UTF-8, but not anymore. You can change this setting whenever you want without garbling your existing database. Only new content will be affected by the change.
Regards,
Alec Smecher
Public Knowledge Project Team
I had the same issue when I migrated to a newer version; however, changing the collation of the new database never resolved the problem. In addition, I found that utf8 was not working for me in many cases. For instance, the name “Šimončičová” did not display as expected, so I switched to “utf8mb4_unicode_ci” and manually edited the texts. I hope you have permanently resolved the issue!
If you already have content in your database that’s not properly UTF-8 encoded, you may need to transcode your database. You can do this by running a mysqldump through a transcoder like iconv and then loading the result back into MySQL.
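As a rough illustration of what the transcoding step does, here is a Python stand-in for the iconv step (not the iconv command itself). It assumes the damaged text is UTF-8 that was misread as Latin-1 once; whether that matches your data has to be checked on a copy of the dump first. The helper name `repair_double_encoded` is made up for this sketch. Note also that MySQL's latin1 is closer to Windows-1252 than to ISO-8859-1, so a real dump may need a different source encoding.

```python
def repair_double_encoded(text: str) -> str:
    """Undo one round of 'UTF-8 bytes misread as Latin-1'.

    Text that does not fit that pattern (plain ASCII, or text that is
    already correct) is returned unchanged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Build a damaged sample the same way the mismatch would produce it.
damaged = "Šimončičová".encode("utf-8").decode("latin-1")

print(repair_double_encoded(damaged))        # Šimončičová
print(repair_double_encoded("Šimončičová"))  # already correct, left as-is
print(repair_double_encoded("résumé"))       # already correct, left as-is
```

Always work on a copy of the dump and compare a few known records (French abstracts, author names) before loading anything back into MySQL.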
Regards,
Alec Smecher
Public Knowledge Project Team
Thanks – and I think I’ll also remove it from the master branch entirely. It’s not harmful, but it’s no longer useful; a few years ago when some platforms didn’t properly support UTF8, it helped.
Regards,
Alec Smecher
Public Knowledge Project Team