Native XML Plugin UTF-8 Cyrillic characters problem (OJS 3.1.1.2)


#1

When I try to export an issue or articles with the Native XML plugin, I get unrecognized characters…

How can I fix this?

  <label>&#x41F;&#x435;&#x440;&#x435;&#x433;&#x43B;&#x44F;&#x43D;&#x443;&#x442;&#x438; &#x416;&#x443;&#x440;&#x43D;&#x430;&#x43B; &#x443; PDF</label>

#2

Hi @redukr,

Are you sure something is wrong? Why do you think so? Do you have problems importing it?
The characters above look OK as Unicode characters; you can verify them here: http://www.codetable.net/unicodecharacters

Regards, Primož


#3

Dear @primozs, today is not April 1st, so this is no day for such a joke. Do you really think this is a normal export?
The Latin alphabet is not exported the way Cyrillic is, and if you try to find some of these characters via the link you provided, you will receive "The character you requested was not found…"
Neither &#x41F nor #x41F nor x41F is found on your site.

And if I write: абсдефг, you will not see codes. You will see letters.


#4

Dear @redukr,

Sorry for trying to help you. Note that this is a free forum, and I am doing this in my free time, but not to get the kind of response I got from you, so I will not answer you anymore.
BTW, on the page http://www.codetable.net/unicodecharacters?page=5
you can see: П Cyrillic Capital Letter Pe &#1055; &#x41f; UppercaseLetter
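
A minimal sketch, assuming Python 3 is at hand, of how the two forms listed on that page relate: the decimal reference &#1055; and the hexadecimal reference &#x41f; name the same code point. The helper name `decode_ref` is made up for this illustration.

```python
import html


def decode_ref(ref: str) -> str:
    """Turn a character reference such as '&#x41f;' into the character it names."""
    return html.unescape(ref)


# Decimal 1055 and hexadecimal 41F are the same code point, U+041F.
print(decode_ref("&#1055;"), decode_ref("&#x41f;"))  # П П
print(ord("П"), hex(ord("П")))  # 1055 0x41f
```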

Regards, Primož


#5

@primozs I'm sorry if I disappointed you. Yes, I realize that this is a free forum, but I thought that if I have a problem, someone would help me find a solution. If you think I'm so stupid that I can't use Google to find out what “&#x435” (for example) means, that's bad.

But the question is this: the plugin's export does not work correctly, and how can it be fixed?

You are demonstrating discrimination based on “the use of Cyrillic rather than Latin script”.

With respect!


#6

Hi @redukr,

There are all kinds of people here with different skill levels, including many who have probably never seen XML entity encoding. Let’s keep to the task at hand and keep it respectful, please.

Regards,
Alec Smecher
Public Knowledge Project Team


#7

@asmecher that's what I am talking about. I have no skills in PHP/MySQL programming, but I know what Google is and how to search. I had already searched for what those characters mean, so I created this topic to get answers. @primozs told me what I had already learned from Google, so I was confused that he wrote a message about something that can be googled but didn't propose any kind of solution.


#8

Hi @redukr,

Essentially this is a valid encoding of the Cyrillic characters, even though it’s hard to read. Any toolset that supports XML will be able to parse and work with it correctly. To change this over to using the actual UTF-8 characters, a change like the one described here would need to be applied.
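
As a rough illustration of that point, here is a small sketch using Python's standard `xml.etree.ElementTree` on the `<label>` element from the first post. It only shows that a parser reads the escaped form as ordinary Cyrillic text; it is not how OJS itself processes the file.

```python
import xml.etree.ElementTree as ET

# The <label> element exactly as it appears in the export from post #1.
snippet = (
    "<label>&#x41F;&#x435;&#x440;&#x435;&#x433;&#x43B;&#x44F;&#x43D;"
    "&#x443;&#x442;&#x438; &#x416;&#x443;&#x440;&#x43D;&#x430;&#x43B; "
    "&#x443; PDF</label>"
)

label = ET.fromstring(snippet)
label_text = label.text
print(label_text)  # Переглянути Журнал у PDF

# Serializing as a Unicode string writes the characters directly.
as_unicode = ET.tostring(label, encoding="unicode")
print(as_unicode)  # <label>Переглянути Журнал у PDF</label>

# The default serialization is ASCII bytes, which escapes them again;
# parsing that output yields the same text, so both forms are equivalent.
as_ascii = ET.tostring(label)
assert ET.fromstring(as_ascii).text == label_text
```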

I’ve filed this as a feature request at https://github.com/pkp/pkp-lib/issues/4259, but as the current export is technically valid, we may not get to it soon.

In the meantime, you can use GNU recode to replace the escaped entities with their UTF-8 equivalents:

recode html < path/to/export.xml
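
If GNU recode is not available, a similar result could be sketched in Python. This is only an illustration under some assumptions: the export file is UTF-8 encoded, and the function names `decode_numeric_refs` and `convert_file` are invented here. It rewrites only numeric references to non-ASCII characters, so references to ASCII characters (such as `&#x26;`) and named entities like `&amp;` stay escaped and the XML stays well-formed.

```python
import re

# Numeric character references: hexadecimal (&#x41F;) or decimal (&#1055;).
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def decode_numeric_refs(text: str) -> str:
    """Replace numeric references to non-ASCII characters with the characters."""

    def replace(match: re.Match) -> str:
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Leave ASCII references alone: some of them (<, >, &, quotes)
        # must stay escaped for the XML to remain well-formed.
        if code_point < 0x80:
            return match.group(0)
        # Surrogates and out-of-range values cannot be written as UTF-8.
        if 0xD800 <= code_point <= 0xDFFF or code_point > 0x10FFFF:
            return match.group(0)
        return chr(code_point)

    return NUMERIC_REF.sub(replace, text)


def convert_file(src_path: str, dst_path: str) -> None:
    """Read an export (assumed to be UTF-8) and write a decoded copy."""
    with open(src_path, encoding="utf-8") as src:
        content = src.read()
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(decode_numeric_refs(content))


sample = "<label>&#x416;&#x443;&#x440;&#x43D;&#x430;&#x43B; &#x443; PDF</label>"
print(decode_numeric_refs(sample))  # <label>Журнал у PDF</label>
```

Calling `convert_file` with the path to the export and a path for the new file writes the decoded copy and leaves the original untouched.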

Regards,
Alec Smecher
Public Knowledge Project Team


#9

@asmecher thanks a lot!