I’m not sure if I’m reading you correctly. Did you change the character encoding in the dumpfile from something else to utf8? Or was the character encoding on the previous server utf8 already?
If you’re looking for article metadata (titles, abstracts), see the article_settings table.
If you’re checking things with the MySQL client or phpMyAdmin, you might want to double-check that your terminal/browser isn’t “helpfully” interpreting things that are actually stored incorrectly. If you can find an entry in a table that shows content with accents, check its storage, e.g. by comparing its length (MySQL’s CHAR_LENGTH or LENGTH) against the same entry in the old copy of the database. If the lengths don’t match, even if the contents look good, then something is wrong with your database character set configuration. This can happen during the dump/reload process – I’ve sometimes had to explicitly specify the dump character set when running mysqldump.
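To illustrate why the length comparison works, here is a minimal Python sketch (not OJS code). It assumes the usual failure mode, where UTF-8 bytes are misread as Latin-1 and stored again as UTF-8; the helper name `describe` is made up for this example. The character count plays the role of MySQL's CHAR_LENGTH and the byte count the role of LENGTH.

```python
def describe(text: str) -> dict:
    """Return the character count and UTF-8 byte count of text."""
    return {"chars": len(text), "utf8_bytes": len(text.encode("utf-8"))}

good = "louée"
# Assumed fault: the UTF-8 bytes are misread as Latin-1, then stored as UTF-8 again.
bad = good.encode("utf-8").decode("latin-1")

good_stats = describe(good)
bad_stats = describe(bad)
print(good_stats)  # correctly stored copy
print(bad_stats)   # double-encoded copy: more characters and more bytes
```

Both counts grow for the damaged copy, so a mismatch between old and new databases shows up even when a forgiving client displays both entries identically.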
Regards,
Alec Smecher
Public Knowledge Project Team
…then your OJS character set configuration is correct. Have you turned on persistent connections? If so, turn that off – your old configuration might still affect your connection pool.
These settings should correspond to the same ones you used on your old OJS installation. If not, you may have to convert your database somehow.
Regards,
Alec Smecher
Public Knowledge Project Team
Hi @asmecher
Thanks for the help, I resolved my problem.
This is the correct configuration:
[i18n]
client_charset = utf-8
connection_charset = utf8
database_charset = utf8
but with one other parameter changed:
; Default locale
locale = en_US
;locale = pl_PL
I commented out locale = pl_PL, uncommented locale = en_US, cleared everything in the cache, and restarted Apache.
I’ve stumbled upon the same problem - strings are saved as gibberish in the DB, but the original installation treats them OK (as if they were UTF8, they are interpreted correctly despite being saved as junk). However, after migrating to another host, OJS displays this gibberish literally. All config parameters regarding i18n were set to Off originally. DB exported as UTF8 and imported as UTF8. Very frustrating stuff.
Glad to see UTF-8 being used by default as it’s basically a standard now. I understand that OJS originates from times when it wasn’t - hence the blank default.
I still can’t figure out why my original install of OJS saves junk in the db but interprets it as Polish characters, while after migration it displays the characters literally - the junk that was saved in the db. Fortunately there wasn’t much information saved, so it won’t be painful or time-consuming to repopulate the data.
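One plausible explanation, sketched in Python: the application sends UTF-8 bytes over a connection the server believes is Latin-1, so the text is stored double-encoded, and reading it back over the same kind of connection silently undoes the damage. This is only a simulation under that assumption. `store` and `fetch` are hypothetical helpers, not OJS or MySQL functions, and Python's `latin-1` codec stands in for MySQL's latin1, which is closer to cp1252.

```python
def store(text: str, connection_charset: str) -> bytes:
    """Simulate writing: the application sends UTF-8 bytes, the server decodes
    them as connection_charset and converts to UTF-8 for storage."""
    wire = text.encode("utf-8")                 # what the application really sends
    as_server_sees = wire.decode(connection_charset)
    return as_server_sees.encode("utf-8")       # bytes at rest in a utf8 column

def fetch(stored: bytes, connection_charset: str) -> str:
    """Simulate reading: the server converts stored UTF-8 to connection_charset,
    and the application treats the received bytes as UTF-8."""
    wire = stored.decode("utf-8").encode(connection_charset, errors="replace")
    return wire.decode("utf-8", errors="replace")

word = "Łódź"
stored_old = store(word, "latin-1")             # old host: connection charset not set to utf8

same_host = fetch(stored_old, "latin-1")        # old host reads it back intact
new_host = fetch(stored_old, "utf-8")           # new host with a utf8 connection shows junk
print(same_host == word, new_host == word)
```

Under this assumption the data was never stored correctly; the old host just applied the same mistake in both directions, which is why the junk only became visible after the move.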
Try dumping the database to a .sql file and editing it with a decent programmer’s editor. (I use vim.) This might help determine whether the encoding in the database is correct, or whether your terminal emulator or text editor is tidying it up for you before presenting but it’s actually incorrect in the database. (Sometimes I’ve run database dumps through iconv before reloading to correct encodings.)
Check to make sure that your “View Page Info” (available in most browsers) correctly reports UTF-8. Things like inadvertent template modifications that insert blank spaces at the head of the generated markup may cause the headers declaring UTF-8 content to be ignored.
Try using SQL length functions to test whether accented characters are encoded properly in the database. For example, CHAR_LENGTH('louée') should be 5 (LENGTH counts bytes, so it reports 6 for correct UTF-8); if the text is stored double-encoded, CHAR_LENGTH may report 6 instead.
Check the database collation/character set in MySQL to make sure it’s all set to UTF-8. (StackOverflow should have some good resources on this.)
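The checks above can be approximated outside the database as well. The following Python sketch tests whether raw bytes (for example, from a dump file) are valid UTF-8, and applies a rough heuristic to spot and undo double-encoded text. The function names are invented for this example, Python's `latin-1` codec is an assumed stand-in for the wrongly applied charset, and the heuristic can misfire on legitimate text, so treat it as a diagnostic aid rather than a conversion tool.

```python
def is_valid_utf8(data: bytes) -> bool:
    """True if the raw bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

def looks_double_encoded(text: str) -> bool:
    """Heuristic: non-ASCII text that survives a Latin-1 -> UTF-8 round trip
    was probably UTF-8 misread as Latin-1 at some point."""
    if text.isascii():
        return False
    try:
        text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False
    return True

def repair(text: str) -> str:
    """Undo one layer of double encoding if the heuristic fires; otherwise
    return the text unchanged."""
    if looks_double_encoded(text):
        return text.encode("latin-1").decode("utf-8")
    return text

damaged = "louée".encode("utf-8").decode("latin-1")
print(repair(damaged))
```

Note that a double-encoded dump is still valid UTF-8, so `is_valid_utf8` alone will not catch it; that is why the second check exists.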
Regards,
Alec Smecher
Public Knowledge Project Team
Thank you. I should keep that in mind if I encounter such problems in future. This time however, fixing junk characters will be faster than in-depth investigation and conversion as it’s just general journal descriptions that were entered.
What matters is that I’ve entered utf8 for the client, connection and database charsets and verified that all characters are saved OK now.
The blank page almost certainly indicates a PHP error. Check your server’s error log for details. If the error message describes a UTF-8 encoding problem, this may be the right place to discuss it. If the error message describes something else, try searching for another thread in the forum or creating a new topic specific to your error message.