STX (Start of Text) characters in the database cause errors in OAI-PMH XML

I’m using OJS 3.3.0-15.

Some indexing platforms have contacted us because they run into problems when trying to harvest our content via OAI-PMH.

We have noticed that when metadata is copied into OJS, it sometimes carries along characters that are hard to identify.

They seem to be Start of Text (STX) characters, but we can only see them in the OJS forms. They do not show up when we query the database directly.

How can we identify these characters in the database so that we can delete them in batch?

This is the result of querying the data above. No symbol is shown in the word narrativas:
+------------------------------------------------------------------------------------------------------+
| CONVERT(CAST(setting_value as BINARY) USING latin1) |
+------------------------------------------------------------------------------------------------------+
| The art of giving birth: audiovisual practices and collective narratives of pregnancy and childbirth |
| El arte de parir: prácticas audiovisuales y narrativas colectivas de la gestación y el parto |
| A arte de parir: práticas audiovisuais e narrativas coletivas da gravidez e do parto |
+------------------------------------------------------------------------------------------------------+

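+--------------------------+----------------------+
| @@character_set_database | @@collation_database |
+--------------------------+----------------------+
| utf8                     | utf8_general_ci      |
+--------------------------+----------------------+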

The collation of the tables is utf8_general_ci.

Thanks in advance, your help will be appreciated.

Hi @lcmartinezru,

It’s likely that your SQL client is hiding the details of these characters. I’d suggest using mysqldump to generate a dump of your database, then looking through it with a good text editor. I’m pretty sure you’ll find that the characters are present in the database, and are artifacts of whatever process you used to get content into your OJS in the first place (import/export, copy/paste, etc.).

Regards,
Alec Smecher
Public Knowledge Project Team
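
As a complement to inspecting the dump by eye, here is a minimal Python sketch that scans text for control characters that XML 1.0 does not allow, including STX (U+0002), and reports where they occur. It is only an illustration: the function names and the dump file name mentioned below are hypothetical, and it assumes the dump is UTF-8 and contains the control characters as raw bytes.

```python
import re

# C0 control characters other than tab, line feed and carriage return
# are not valid in XML 1.0. STX (Start of Text) is U+0002.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def find_control_chars(lines):
    """Yield (line_number, column, code_point) for each control character.

    `lines` is any iterable of strings, for example an open text file.
    Line numbers and columns are 1-based.
    """
    for line_number, line in enumerate(lines, start=1):
        for match in CONTROL_CHARS.finditer(line):
            code_point = f"U+{ord(match.group()):04X}"
            yield line_number, match.start() + 1, code_point


def strip_control_chars(text):
    """Return `text` with the disallowed control characters removed."""
    return CONTROL_CHARS.sub("", text)


# Illustrative sample: an STX character hidden before "narrativas".
sample = ["El arte de parir: prácticas audiovisuales y \x02narrativas colectivas"]
hits = list(find_control_chars(sample))
cleaned = strip_control_chars(sample[0])
```

To run it against a real dump, one could open the file (here a hypothetical `ojs_dump.sql`) with `open("ojs_dump.sql", encoding="utf-8", errors="replace")` and pass the file object to `find_control_chars`. Once the affected rows are known, the same character can be written in MySQL as `CHAR(2)`, which may help when building a cleanup query. Test any such query on a backup first.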
