Describe the issue or problem
After the upgrade of OJS-3.4.0 to OJS-3.5.0, we noticed how some form fields with user input contained nothing but illegible characters. This seemed limited to the review_form_responses table. In the OJS website, this produced following output when viewing such a review response:
In the SQL dump, this appeared to be a series of null characters:
(260,15,46,'string','\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\.....')
When I first noticed this, I’d been able to restore the correct values from a pre-migration dump, after which the display in the OJS website was completely fixed.
After that, for a subsequent upgrade to OJS-3.5.0-3, I have dumped again, and restored that dump.
Yesterday I bumped on a fresh case, containing two reviews submitted after the dump containing my earlier fix. Fortunately, I’ve been able to retrieve the correct values from an intermediate SQL dump, and a snapshot of the db files, so the damage has been restored.
Yet, I want to find the cause and avoid this at all cost in the future. I think I it has to do with dumping / restoring dumps, given following observations:
- the intermediary dump (after the first fix but before the next stated corruption) didn’t contain any
\0characters at all, including one new review that got corrupted in the latest dump - the latest dump (after the latest stated corruption) did contain exactly the same fields consisting entirely of
\0characters, in exactly the same table as in the first corrupt dump, as well as 2 new corrupted review responses
Since the intermediary SQL dump did contain correct values for a review that got mangled in the latest dump, I’m suspecting that the problem doesn’t lie with character encoding when OJS is storing the review forms in the review_form_responses table, but rather with the process of creating and/or restoring SQL dumps.
I realize this is probably a broader MySQL character encoding issue, but since the db is administered within an OJS context, I hope to get some remediation here. I’ll provide details of the character encoding settings under “Additional information” below.
Steps I took leading up to the issue
- create a SQL dump from the db
- import that SQL dump into the db
I haven’t checked where exactly the corruption occurs, unfortunately, but will keep a close look in the future. After my latest fix, the dump remains ok, so perhaps it’s step 2 that’s causing the issue.
I’m creating the dump with following command:
mysqldump --all-databases -u $OJS_DB_USER -p$OJS_DB_PASSWORD $OJS_DB_NAME > ojs_dump.sql
I’m importing the dump with following command:
mysql -u $OJS_DB_USER -p$OJS_DB_PASSWORD $OJS_DB_NAME < ojs-dump.sql
Could it be the case that these commands need an explicit encoding parameter, such as `--default-character-set=utf8mb4`?
What application are you using?
OJS 3.5.0-3
Additional information
Character encoding settings in the docker-compose.yml file:
ojsdb:
image: mariadb:10
container_name: ojsdb
command: --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci
Character encoding settings in the config/config.inc.php file:
[i18n]
client_charset=utf-8
database_charset=Off
connection_charset=utf8mb4
locale=en
[database]
driver=mysqli
connection_charset=utf8mb4
collation=utf8mb4_unicode_ci
Character encoding settings in the SQL dump file:
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `ojs` /*!40100 DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci */;
--
-- Table structure for table `review_form_responses`
--
DROP TABLE IF EXISTS `review_form_responses`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `review_form_responses` (
`review_form_response_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`review_form_element_id` bigint(20) NOT NULL,
`review_id` bigint(20) NOT NULL,
`response_type` varchar(6) DEFAULT NULL,
`response_value` text DEFAULT NULL,
PRIMARY KEY (`review_form_response_id`),
KEY `review_form_responses_unique` (`review_form_element_id`,`review_id`),
KEY `review_form_responses_review_id` (`review_id`),
CONSTRAINT `review_form_responses_review_form_element_id_foreign` FOREIGN KEY (`review_form_element_id`) REFERENCES `review_form_elements` (`review_form_element_id`) ON DELETE CASCADE,
CONSTRAINT `review_form_responses_review_id_foreign` FOREIGN KEY (`review_id`) REFERENCES `review_assignments` (`review_id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=3778 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
/*!40101 SET character_set_client = @saved_cs_client */;
