OJS doesn't display characters in UTF-8

SELECT table_name, CCSA.character_set_name FROM information_schema.TABLES T, information_schema.COLLATION_CHARACTER_SET_APPLICABILITY CCSA WHERE CCSA.collation_name = T.table_collation AND T.table_schema = "ojs_db";

Everything shows utf8 in character_set_name.

I cleared the cache and it made no difference.

In config.inc.php:

connection_charset = utf-8
database_charset = utf-8
charset_normalization = On

I’m not sure if I’m reading you correctly. Did you change the character encoding in the dumpfile from something else to utf8? Or was the character encoding on the previous server utf8 already?

On the previous server (10.0.20-MariaDB) the encoding is utf8mb4, and the dumpfile is in utf8. On my MySQL 5.1.73 server I set utf8 for all 100 databases; only OJS has this problem.

Hi @lucien,

The connection_charset and database_charset settings should be utf8, not utf-8. See if that helps.

Regards,
Alec Smecher
Public Knowledge Project Team

I tried utf8 and utf-8, On and Off; I have spent 8 hours on this. By the way, which table holds the article content?

Hi @lucien,

If you’re looking for article metadata (titles, abstracts), see the article_settings table.

If you’re checking things with the MySQL client or phpMyAdmin, you might want to double-check that your terminal/browser isn’t “helpfully” interpreting things that are actually stored incorrectly. If you can find an entry in a table that shows content with accents, check its storage, e.g. by comparing its length (LENGTH or CHAR_LENGTH in MySQL) against the old copy of the database. If the lengths don’t match, even if the contents look good, then something is wrong with your database character set configuration. This can happen during the dump/reload process; I’ve sometimes had to explicitly specify the dump character set when running mysqldump.

Regards,
Alec Smecher
Public Knowledge Project Team
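
The length comparison suggested above can be illustrated outside the database. The following is a minimal Python sketch, not part of OJS: the helper names are made up for illustration, and it assumes the typical failure where UTF-8 bytes were misread as latin1 somewhere in the dump/reload process. It mimics what MySQL's CHAR_LENGTH() (characters) and LENGTH() (bytes) would report for a healthy value and a damaged one.

```python
def char_and_byte_length(text: str) -> tuple:
    """Return (character count, UTF-8 byte count) for a string."""
    return len(text), len(text.encode("utf-8"))


def double_encode(text: str) -> str:
    """Simulate the damage: UTF-8 bytes misread as latin1 characters.

    Assumption: this is the failure mode; other mis-conversions exist.
    """
    return text.encode("utf-8").decode("latin-1")


good = "louée"
bad = double_encode(good)  # the accented letter becomes two characters

print(char_and_byte_length(good))  # (5, 6)
print(char_and_byte_length(bad))   # (6, 8)
```

If the old and new copies of the same row report different lengths, the text may look similar in a forgiving client but is stored differently.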

The metadata is correct: in phpMyAdmin the text is correct, but in OJS it is NOT correct.

Hi @lucien,

If your settings are (exactly!)…

[i18n]
client_charset = utf-8
connection_charset = utf8
database_charset = utf8

…then your OJS character set configuration is correct. Have you turned on persistent connections? If so, turn that off – your old configuration might still affect your connection pool.

These settings should correspond to the same ones you used on your old OJS installation. If not, you may have to convert your database somehow.

Regards,
Alec Smecher
Public Knowledge Project Team
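
Since the difference between utf8 and utf-8 is easy to miss by eye, a small script can check the [i18n] values. This is only a sketch: check_i18n and SAMPLE are hypothetical names, the expected values are the ones quoted in this thread, and it assumes the file can be read as plain INI by Python's configparser, which may not hold for every real config.inc.php.

```python
import configparser

# Hypothetical excerpt; a real config.inc.php has many more sections.
SAMPLE = """\
[i18n]
; Default locale
locale = en_US
client_charset = utf-8
connection_charset = utf8
database_charset = utf8
"""

# Values recommended in this thread for a UTF-8 database.
EXPECTED = {
    "client_charset": "utf-8",
    "connection_charset": "utf8",
    "database_charset": "utf8",
}


def check_i18n(config_text: str) -> list:
    """Return a list of human-readable mismatches in the [i18n] section."""
    parser = configparser.ConfigParser(
        comment_prefixes=(";",),  # OJS config comments start with ';'
        interpolation=None,       # treat '%' in values literally
        strict=False,             # tolerate duplicate keys
    )
    parser.read_string(config_text)
    problems = []
    for key, wanted in EXPECTED.items():
        found = parser.get("i18n", key, fallback=None)
        if found != wanted:
            problems.append(f"{key}: expected {wanted!r}, found {found!r}")
    return problems


print(check_i18n(SAMPLE))  # []
```

An empty list means the three charset settings match the recommended values; it says nothing about how the data is actually stored in the database.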

Hi @lucien,

Were you able to resolve this issue? If so, please let us know what solved it – it may be useful to future users.

Regards,
Alec Smecher
Public Knowledge Project Team

Hi @asmecher,
Thanks for the help; I resolved my problem.
This is the correct configuration:
[i18n]
client_charset = utf-8
connection_charset = utf8
database_charset = utf8

but with one other parameter changed:
; Default locale
locale = en_US
;locale = pl_PL

I commented out locale = pl_PL, uncommented locale = en_US, cleared everything in the cache and restarted Apache.

Thanks for the help.
Regards,
Lucien


I vote for setting:

client_charset = utf-8
connection_charset = utf8
database_charset = utf8

as default for all OJS installations.

I’ve stumbled upon the same problem: strings are saved as gibberish in the DB, but the original installation treats them OK (they are interpreted correctly, as if they were UTF8, despite being saved as junk). However, after migrating to another host, OJS displays this gibberish literally. All config parameters regarding i18n were set to Off originally. The DB was exported as UTF8 and imported as UTF8. Very frustrating stuff.

Hi @szmigieldesign,

I’ve set the defaults to UTF8 for future releases:

The previous defaults were written, believe it or not, before UTF-8 adoption was well established.

Regards,
Alec Smecher
Public Knowledge Project Team

Glad to see UTF-8 being used by default as it’s basically a standard now. I understand that OJS originates from times when it wasn’t - hence the blank default.

I still can’t figure out why, on my original install, OJS saves junk in the DB but interprets it as Polish characters, while after migration it displays the junk that was saved in the DB literally. Fortunately there wasn’t much information saved, so it won’t be painful or time-consuming to repopulate the data.

Hi @szmigieldesign,

A few things to check:

  • Try dumping the database to a .sql file and editing it with a decent programmer’s editor. (I use vim.) This might help determine whether the encoding in the database is correct, or whether your terminal emulator or text editor is tidying it up for you before presenting but it’s actually incorrect in the database. (Sometimes I’ve run database dumps through iconv before reloading to correct encodings.)
  • Check to make sure that your “View Page Info” (available in most browsers) correctly reports UTF-8. Things like inadvertent template modifications that insert blank spaces at the head of the generated markup may cause the headers declaring UTF-8 content to be ignored.
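  • Try using SQL CHAR_LENGTH statements to test whether accented characters are encoded properly in the database. For example, CHAR_LENGTH('louée') should be 5, but if it’s stored incorrectly it may report 6. (LENGTH counts bytes rather than characters, so it reports 6 for the correctly stored UTF-8 string.)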
  • Check the database collation/character set in MySQL to make sure it’s all set to UTF-8. (StackOverflow should have some good resources on this.)

Regards,
Alec Smecher
Public Knowledge Project Team
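
As a companion to the iconv suggestion above, here is a hedged Python sketch of one possible repair. It is not an OJS tool: repair_double_encoding is a made-up name, and it only handles the case where UTF-8 bytes were read as latin1 and then stored again as UTF-8. MySQL's latin1 is closer to Windows-1252 than to ISO-8859-1, so real data may need a different codec; test on a copy first.

```python
def repair_double_encoding(text: str) -> str:
    """Undo one round of UTF-8-read-as-latin1 damage, if that is what happened.

    Returns the input unchanged when the round trip is not possible, which
    suggests the text was not damaged in this particular way.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


damaged = "louée".encode("utf-8").decode("latin-1")  # simulated junk
print(repair_double_encoding(damaged) == "louée")  # True
print(repair_double_encoding("louée") == "louée")  # True, left untouched
```

Correctly stored accented text fails the round trip and is returned as is, so the function can be applied to mixed data, but plain ASCII passes through either way and tells you nothing.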

Thank you. I should keep that in mind if I encounter such problems in future. This time however, fixing junk characters will be faster than in-depth investigation and conversion as it’s just general journal descriptions that were entered.

What matters is that I’ve entered utf8 for the client, connection and database charsets and verified that all characters are saved OK now.

Hi, I believe I have a similar issue, but I can’t seem to find the cause. I have several journals on this site: https://revistacientifica.uep.edu.py/. In one of them, Espacio Teologico, I can see the articles fine, for example “La alegría según Qohélet: Un estudio bíblico-teológico”. In another journal, Anuario Académico, one specific article shows only a blank page: “Apologética bíblica como absoluta necesidad”.

Any ideas why that is, and how I can fix it?

The blank page almost certainly indicates a PHP error. Check your server’s error log for details. If the error message describes a UTF-8 encoding problem, this may be the right place to discuss it. If the error message describes something else, try searching for another thread in the forum or creating a new topic specific to your error message.

I changed the collation to latin1_swedish_ci and used the following settings with my latin1 database. Now it works:

;;;;;;;;;;;;;;;;;;;;;;;;;
; Localization Settings ;
;;;;;;;;;;;;;;;;;;;;;;;;;

[i18n]

; Default locale
locale = en_US

; Client output/input character set
client_charset = utf-8
connection_charset = latin1
database_charset = latin1
charset_normalization = Off

Hi @nelsonf17,

It would be better at some point to transcode everything into UTF-8 – but as long as your settings are consistent for now, OJS will continue to work.

Regards,
Alec Smecher
Public Knowledge Project Team
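
For the eventual transcoding mentioned above, the core step is re-encoding the bytes. The sketch below shows only that step in Python; transcode_to_utf8 is a hypothetical helper, and it assumes the source really is latin1 (ISO-8859-1). A real dump file typically also carries character set declarations (for example in its CREATE TABLE statements) that would need to change, so this is not a complete migration procedure.

```python
def transcode_to_utf8(data: bytes, source_encoding: str = "latin-1") -> bytes:
    """Re-encode bytes from source_encoding into UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


latin1_bytes = b"lou\xe9e"  # the accented letter is one byte in latin1
utf8_bytes = transcode_to_utf8(latin1_bytes)
print(len(latin1_bytes), len(utf8_bytes))  # 5 6
```

After a conversion like this, connection_charset and database_charset would have to be switched to utf8 together, so the settings stay consistent with the data.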

Hi,

I have the same problem: “An unexpected error has occurred. Please reload the page and try again.” appears when I enter a special character.

My OJS is 3.3.0-10 LTS.

I tried applying this change: Pre-fill install form defaults with UTF8 · pkp/pkp-lib@547c627 · GitHub

classes/install/form/InstallForm.inc.php

It made no difference at all.

Best Regards,
Darryl