OJS doesn't display characters in UTF-8

Hi all,
I’m tired.
I moved OJS to a different server:
CentOS 6.6
Apache 2.2.15
MySQL 5.1.73
phpMyAdmin 4.0.10.10

I changed the encoding everywhere but still nothing: special characters for pl_PL are not displayed in articles. Where does OJS store the text of articles?
Why does the text in the OJS cache have a different encoding?
In phpMyAdmin it’s fine, in the MySQL console it’s fine, but in the cache it is broken.
ojs.uek.krakow.pl

Did you change the encoding in the move?

Did you clear the Data and Template Cache via the Site Admin menu? When you clear the cache, do the cache files actually get erased and re-written?

Yes, I cleared the cache and changed the encoding in the Apache config, PHP, the dump file, the OJS config, etc. I read the OJS forum and found nothing. A SELECT in the MySQL console looks good, and displaying the characters from a PHP file works, but when OJS reads from MySQL the encoding changes in the cache and the page is displayed wrongly through Apache.

If you’re actually changing the character encoding from your original server to your new one (for example latin1_swedish_ci to utf8_general_ci), this will be a fairly complex operation. Another user tried a similar migration in this thread (see Database error after upgrade to Mariadb v 10.2), but didn’t report back on the success (or not).

SELECT table_name, CCSA.character_set_name FROM information_schema.TABLES T, information_schema.COLLATION_CHARACTER_SET_APPLICABILITY CCSA WHERE CCSA.collation_name = T.table_collation AND T.table_schema = "ojs_db";

Everything is utf8 in character_set_name.

I cleared the cache and nothing changed.

In config.inc.php:

connection_charset = utf-8
database_charset = utf-8
charset_normalization = On

I’m not sure if I’m reading you correctly. Did you change the character encoding in the dumpfile from something else to utf8? Or was the character encoding on the previous server utf8 already?

On the previous server (10.0.20-MariaDB) the encoding is utf8mb4, the dump file is in utf8, and I set utf8 on my MySQL 5.1.73 for all 100 databases. Only OJS has this problem.

Hi @lucien,

The connection_charset and database_charset settings should be utf8, not utf-8. See if that helps.

Regards,
Alec Smecher
Public Knowledge Project Team

I tried utf8, utf-8, On, Off; I have been trying this for 8 hours. By the way, which table holds the article content?

Hi @lucien,

If you’re looking for article metadata (titles, abstracts), see the article_settings table.

If you’re checking things with the MySQL client or phpMyAdmin, you might want to double-check that your terminal/browser isn’t “helpfully” interpreting things that are actually stored incorrectly. If you can find an entry in a table that shows content with accents, check its storage, e.g. by comparing its LENGTH (the byte length in MySQL) against the old copy of the database. If the lengths don’t match, even if the contents look good, then something is wrong with your database character set configuration. This can happen during the dump/reload process; I’ve sometimes had to explicitly specify the dump character set when running mysqldump.

Regards,
Alec Smecher
Public Knowledge Project Team
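
As a rough illustration of the length comparison suggested above, here is a small Python sketch (not OJS code; the sample word and helper names are made up for illustration). It assumes that MySQL's LENGTH() counts bytes and CHAR_LENGTH() counts characters, and it uses Python's latin-1 codec as a stand-in for MySQL's latin1. It shows that a double-encoded value has different lengths from a correctly stored one, even when a forgiving client renders both the same way.

```python
def byte_length(text: str) -> int:
    """Roughly what MySQL LENGTH() reports for a utf8 column: a byte count."""
    return len(text.encode("utf-8"))


def char_length(text: str) -> int:
    """Roughly what MySQL CHAR_LENGTH() reports: a character count."""
    return len(text)


def double_encode(text: str) -> str:
    """Simulate UTF-8 bytes being misread as latin-1 on the way into the database."""
    return text.encode("utf-8").decode("latin-1")


correct = "Kraków"               # hypothetical sample value
broken = double_encode(correct)  # the same word after a mis-declared import

for label, value in (("correct", correct), ("double-encoded", broken)):
    print(label, repr(value), char_length(value), byte_length(value))
```

If the same row reports different lengths on the old and new servers, the stored bytes differ, whatever the client happens to display.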

The metadata is correct: in phpMyAdmin the text is correct, but in OJS it is NOT correct.

Hi @lucien,

If your settings are (exactly!)…

[i18n]
client_charset = utf-8
connection_charset = utf8
database_charset = utf8

…then your OJS character set configuration is correct. Have you turned on persistent connections? If so, turn that off – your old configuration might still affect your connection pool.

These settings should correspond to the same ones you used on your old OJS installation. If not, you may have to convert your database somehow.

Regards,
Alec Smecher
Public Knowledge Project Team
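
Since the values have to match exactly, a quick script can catch a stray "utf-8" where "utf8" is expected. The sketch below is only an illustration: `check_i18n` and `EXPECTED_I18N` are hypothetical names, and it assumes the relevant part of config.inc.php can be read as an INI file by Python's configparser, which may not hold for every real configuration file.

```python
import configparser

# Expected values, taken from the settings quoted in this thread.
EXPECTED_I18N = {
    "client_charset": "utf-8",
    "connection_charset": "utf8",
    "database_charset": "utf8",
}


def check_i18n(config_text: str) -> dict:
    """Return {setting: actual_value} for every [i18n] setting that differs."""
    parser = configparser.ConfigParser(
        comment_prefixes=(";",),  # OJS config comments start with ';'
        strict=False,             # tolerate repeated keys or sections
        interpolation=None,       # treat '%' in values literally
    )
    parser.read_string(config_text)
    problems = {}
    for key, expected in EXPECTED_I18N.items():
        actual = parser.get("i18n", key, fallback=None)
        if actual != expected:
            problems[key] = actual
    return problems


sample = """
[i18n]
; Default locale
locale = en_US
client_charset = utf-8
connection_charset = utf-8
database_charset = utf8
"""

print(check_i18n(sample))  # reports connection_charset as the mismatch
```

To check a real installation, read the file's contents and pass them to `check_i18n`; an empty dictionary means the three settings match.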

Hi @lucien,

Were you able to resolve this issue? If so, please let us know what solved it – it may be useful to future users.

Regards,
Alec Smecher
Public Knowledge Project Team

Hi @asmecher,
Thanks for the help, I resolved my problem.
This is the correct configuration:
[i18n]
client_charset = utf-8
connection_charset = utf8
database_charset = utf8

but with one more parameter:
; Default locale
locale = en_US
;locale = pl_PL

I commented out locale = pl_PL, uncommented locale = en_US, cleared everything in the cache, and restarted Apache.

Thanks for the help.
Regards,
Lucien

I vote for setting:

client_charset = utf-8
connection_charset = utf8
database_charset = utf8

as default for all OJS installations.

I’ve stumbled upon the same problem: strings are saved as gibberish in the DB, but the original installation treats them OK (as if they were UTF8; they are interpreted correctly despite being saved as junk). However, after migrating to another host, OJS displays this gibberish literally. All config parameters regarding i18n were set to Off originally. The DB was exported as UTF8 and imported as UTF8. Very frustrating stuff.

Hi @szmigieldesign,

I’ve set the defaults to UTF8 for future releases:

The previous defaults were written, believe it or not, before UTF-8 adoption was well established.

Regards,
Alec Smecher
Public Knowledge Project Team

Glad to see UTF-8 being used by default as it’s basically a standard now. I understand that OJS originates from times when it wasn’t - hence the blank default.

I still can’t figure out why my original install of OJS saves junk in the DB but interprets it as Polish characters, while after migration it displays the characters literally, i.e. the junk that was saved in the DB. Fortunately there wasn’t much information saved, so it won’t be painful or time-consuming to repopulate the data.
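
One plausible explanation, sketched here in Python under stated assumptions rather than confirmed for this particular install, is that the old connection was effectively latin1 in both directions. UTF-8 bytes sent by the application get stored as if each byte were a separate latin1 character, and the same mistake on the way out reverses the damage. A correctly configured utf8 connection on the new host returns the stored characters as they are, so the junk becomes visible. The function names are hypothetical, and Python's latin-1 codec stands in for MySQL's latin1.

```python
def write_through_latin1_connection(text: str) -> str:
    """The app sends UTF-8 bytes; a latin1 connection treats each byte as one character."""
    return text.encode("utf-8").decode("latin-1")


def read_through_latin1_connection(stored: str) -> str:
    """The same misconfiguration on the way out: the two errors cancel."""
    return stored.encode("latin-1").decode("utf-8")


def read_through_utf8_connection(stored: str) -> str:
    """A correctly configured connection returns exactly what is stored."""
    return stored


original = "Zażółć gęślą jaźń"  # sample Polish text
stored = write_through_latin1_connection(original)

old_host_view = read_through_latin1_connection(stored)
new_host_view = read_through_utf8_connection(stored)

print("stored:  ", stored)
print("old host:", old_host_view)
print("new host:", new_host_view)
```

Under this model the data was never stored correctly; the old host only hid the problem.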

Hi @szmigieldesign,

A few things to check:

  • Try dumping the database to a .sql file and editing it with a decent programmer’s editor. (I use vim.) This might help determine whether the encoding in the database is correct, or whether your terminal emulator or text editor is tidying it up for you before presenting but it’s actually incorrect in the database. (Sometimes I’ve run database dumps through iconv before reloading to correct encodings.)
  • Check to make sure that your “View Page Info” (available in most browsers) correctly reports UTF-8. Things like inadvertent template modifications that insert blank spaces at the head of the generated markup may cause the headers declaring UTF-8 content to be ignored.
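  • Try using SQL string-length functions to test whether accented characters are encoded properly in the database. For example, CHAR_LENGTH of a stored 'louée' should be 5, but if the value is stored incorrectly (double-encoded) it may report 6. Note that LENGTH counts bytes in MySQL, so it returns 6 for a correctly stored UTF-8 'louée'.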
  • Check the database collation/character set in MySQL to make sure it’s all set to UTF-8. (StackOverflow should have some good resources on this.)

Regards,
Alec Smecher
Public Knowledge Project Team
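
In the spirit of the iconv suggestion in the checklist above, the following Python sketch tries to undo one layer of double encoding in a string and leaves the string alone if it does not look double-encoded. `repair_double_encoding` is a hypothetical helper, not part of OJS. It assumes the damage came from UTF-8 bytes being read as latin-1. MySQL's latin1 is closer to Windows-1252, so a real dump may need a different codec.

```python
def repair_double_encoding(text: str) -> str:
    """Undo one UTF-8-read-as-latin-1 layer, or return the text unchanged."""
    try:
        # Turn the characters back into the original bytes, then decode
        # those bytes properly as UTF-8.
        return text.encode("latin-1").decode("utf-8")
    except UnicodeError:
        # Either the text contains characters outside latin-1, or the bytes
        # are not valid UTF-8: it was probably stored correctly already.
        return text


damaged = "louÃ©e"  # an accented French word after double encoding
print(len(damaged), len(repair_double_encoding(damaged)))
print(repair_double_encoding("Kraków"))  # already correct, left unchanged
```

Running such a repair over a copy of the dump, and comparing character counts before and after, is a low-risk way to see whether the stored data is affected.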

Thank you. I should keep that in mind if I encounter such problems in future. This time however, fixing junk characters will be faster than in-depth investigation and conversion as it’s just general journal descriptions that were entered.

What matters is that I’ve entered utf8 for the client, connection and database charsets and verified that all characters are saved OK now.

Hi, I believe I have a similar problem, but I can’t seem to find the cause. I have several journals on this site: https://revistacientifica.uep.edu.py/. In this one, Espacio Teologico, I can see the articles fine, like here: La alegría según Qohélet: Un estudio bíblico-teológico | Espacio Teologico. But in this journal, Anuario Académico, I can’t see a specific article; it just shows blank: Apologética bíblica como absoluta necesidad | Anuario Académico

Any ideas why that is, and how I can fix it?