Compare and auto translate keys from OJS to OCS

Hey there translators!
I’m trying to translate OCS into Croatian. On the wiki page I found a translation for OJS 2.2.1 (OJS2 na hrvatskom - Trac), which I would like to use as a starting point for the OCS 2.3.6 translation.

I know that the basic structure and many of the phrases are the same, so I would like to automate the translation of the shared phrases and then translate the rest by hand.
The idea is to create a script that reads the already translated OJS lang file (locale.xml), finds the matching strings in the OCS lang file (locale.xml) and substitutes the ones it finds. For example:

EN locale.xml (OCS) looks something like this:

<message key="common.copy">Copy</message>
<message key="common.preview">Preview</message>
<message key="common.activate">Activate</message>
<message key="common.deactivate">Deactivate</message>

The file I would like to take the translations from, locale.xml (Croatian, OJS), looks something like this:

<message key="common.copy">Kopiraj</message>
<message key="common.preview">Prethodni pregled</message>
<message key="common.activate">Aktiviraj</message>

I would like to create a bash script that searches through both files, finds the keys that appear in both (e.g. key="common.copy") and replaces the content in the first file (locale.xml, OCS) so that it looks like this:

<message key="common.copy">Kopiraj</message>
<message key="common.preview">Prethodni pregled</message>
<message key="common.activate">Aktiviraj</message>
<message key="common.deactivate">Deactivate</message>

In this case key="common.deactivate" is missing from the translated file locale.xml (OJS), so it is left unchanged after the script has gone through the files.
This way I would only have to translate by hand the keys that are unique to OCS, which would save me a lot of time.

Has anyone got any experience with this kind of editing or an idea how to do this?

I’m new to translating and I have only basic knowledge of XML, so I don’t know whether this is something commonly done and trivial or not. :smile:
One suggestion that I got is to use XSLT, but that would require reading through a lot of documentation that I’m not familiar with at all, so some starting suggestions would be of great help.

Thanks!

Ivo

Hi @ihaladin,

Tagging @mtub in case he’s got some suggestions.

Have you looked at the Translator plugin? It’s included in OCS in the “Generic Plugins” category. It’ll help to identify and fix mismatches between what OCS expects to find in a locale file and what it actually finds. It might help to clean out unwanted content from OJS locale files, though I haven’t tried it in that capacity before.

Regards,
Alec Smecher
Public Knowledge Project Team

Hi @ihaladin,

Good thought. But from what you’ve described, I am not so sure that you will be faster this way than by just comparing the files manually and updating them at the same time. I’d suggest doing this first:

  • create the missing files; this can be done with the Translator plugin if you have a running installation of OCS
  • in case you are building upon existing OCS locale files: remove unused keys (by using the Translator plugin or by comparing to the en_US locale of the ocs-dev-2_3 branch)

I don’t think there are many keys that are the same in OJS and OCS (which you would need in order to profit from building upon the OJS locale files). E.g., in admin.xml there are 7 (of 77) keys identical between OJS and OCS, in locale.xml it’s ~50 (of 706) keys. But there are a lot of keys that are similar in meaning but named slightly differently (e.g. referring to a journal in OJS and to a conference in OCS). You’d miss all these keys when you just look for identical key names.
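If you want to check the overlap on your own files, a count like the one above could be reproduced with a few lines of PHP. This is only a minimal sketch: the function names `messageKeys` and `sharedKeys` and the paths in the usage comment are made up for illustration, and it assumes both files store their strings as `<message key="...">` elements.

```php
<?php
// Sketch: count how many message keys two locale files have in common.
// Function names and the example paths are illustrative assumptions.

/** Return the 'key' attribute of every <message> element in an XML string. */
function messageKeys(string $xml): array
{
    $doc = new DOMDocument;
    $doc->loadXML($xml);
    $keys = [];
    foreach ($doc->getElementsByTagName('message') as $message) {
        $keys[] = $message->getAttribute('key');
    }
    return $keys;
}

/** Return the keys of the OCS file that also occur in the OJS file. */
function sharedKeys(string $ojsXml, string $ocsXml): array
{
    return array_values(array_intersect(messageKeys($ocsXml), messageKeys($ojsXml)));
}

// Example usage (placeholder paths):
// $shared = sharedKeys(
//     file_get_contents('path/to/ojs/locale.xml'),
//     file_get_contents('path/to/ocs/locale.xml')
// );
// printf("%d shared keys\n", count($shared));
```

Note that this only finds keys with exactly the same name; keys that differ only in a journal/conference naming would still have to be matched by eye.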

Keys that are meant to be identical could (should) also be in the shared library (pkp-lib, GitHub - pkp/pkp-lib: PKP Web Application Library), see branches ojs-dev-2_4 and ocs-dev-2_3.

Looking at the English OCS file, the Croatian OJS file and your new Croatian OCS file at the same time might be a bit confusing, but it’s probably the fastest way to a Croatian translation of OCS.

I am not saying it can’t be done; in fact, I am working on something similar for updating translations. But if you want to get to a complete OCS translation soon and without too many iterations, comparing the files directly might be the better option. I fear that you would end up investing time for little gain, since you may be overestimating how many keys are really identical between OJS and OCS.

If you have any questions or need some help with creating the initial files, please let me know and I’ll gladly help.

Hi @mtub,

Thanks for the suggestions. Since I had no better suggestions at that time, I did it the following way.
I haven’t posted here yet because of a busy schedule with my OCS site, but I managed to translate most of OCS into Croatian.
You can see the working site here: http://sabor.hsgi.org

A friend of mine has written a small PHP script that does just what I asked for (auto-translating keys with the same name). Then I had to go through the results manually to see whether there were any “journal”, “OJS” or other phrases that needed manual translation.
For the keys that weren’t identical in name, or that were missing, the script left the EN text in place, so that I would have a working site (without ##keyname## popping up). Now I’m translating the rest of the files as I stumble upon English titles in the Croatian version of the site while using OCS.
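For anyone who would rather get a list than stumble upon leftovers, the manual check could be partly automated. The following is only a rough sketch and not part of the script I used: the function names `loadMessages` and `reviewReport` are made up, the list of journal-specific terms is an assumption you would have to extend yourself, and a string that is identical in both files is only a candidate (some words are legitimately the same in both languages).

```php
<?php
// Sketch: list keys in a merged locale file that probably still need work.
// Function names and the term list are illustrative assumptions.

/** Map each message key to its text content. */
function loadMessages(string $xml): array
{
    $doc = new DOMDocument;
    $doc->loadXML($xml);
    $messages = [];
    foreach ($doc->getElementsByTagName('message') as $message) {
        $messages[$message->getAttribute('key')] = $message->textContent;
    }
    return $messages;
}

/**
 * Return two lists of keys: 'untranslated' (text identical to the English
 * file) and 'journalTerms' (changed text that contains one of the terms).
 */
function reviewReport(string $englishXml, string $mergedXml, array $journalTerms): array
{
    $english = loadMessages($englishXml);
    $report = ['untranslated' => [], 'journalTerms' => []];
    foreach (loadMessages($mergedXml) as $key => $text) {
        if (isset($english[$key]) && $english[$key] === $text) {
            $report['untranslated'][] = $key;
            continue;
        }
        foreach ($journalTerms as $term) {
            // stripos ignores case for ASCII letters only
            if (stripos($text, $term) !== false) {
                $report['journalTerms'][] = $key;
                break;
            }
        }
    }
    return $report;
}
```

It would be called with the contents of the English OCS file, the contents of the merged file and a term list such as `['journal', 'OJS']`, to which the Croatian equivalents could be added.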

I managed to translate:

  • most of the locale/hr_HR files
  • all the e-mail templates
  • all plugins that are bundled with the OCS installation
  • lib/pkp/locale was already translated to some extent

Speaking of translations, I couldn’t get custom theme-related translations in the /plugins/themes/mytheme/locale folder to work properly… I have some content related to the theme itself (some sidebar stuff) that I would like to have translated as well, and I don’t know where to put those translations.
Is it good practice to create a locale folder inside the theme (as is the practice in other plugins), or should I put those translations in the locale/hr_HR/locale.xml file or somewhere else?

I’m also still struggling to translate TinyMCE: it shows up in English in the Croatian version of the site, even though I put the lang files in the corresponding folder (/var/www/sabor/ocs/lib/pkp/lib/tinymce/jscripts/tiny_mce/langs).

In case anyone would like to do what I did with the OJS to OCS translations (not saying it’s the wisest or the quickest way to translate :slight_smile: ), here’s the PHP script I used:

    <?php
    // Compares two locale XML files: for every 'message' element in the
    // translated OJS file ($inHR), looks for an element with the same 'key'
    // attribute in the English OCS file ($inEN) and replaces its English text
    // with the translation. Keys without a match keep their English text.

    // Load the English OCS locale file (its structure is kept)
    $inEN = new DOMDocument;
    $inEN->load('OCS-EN/locale/en_US/locale.xml');
    // Load the Croatian OJS locale file (the source of the translations)
    $inHR = new DOMDocument;
    $inHR->load('OJS-HR/locale/hr_HR/locale.xml');
    // XPath over the translated file: collect all of its messages
    $xpath = new DOMXPath($inHR);
    $results = $xpath->query("//message");
    // XPath over the English file, used to look up matching keys
    $ensearch = new DOMXPath($inEN);
    // Loop over all translated messages
    foreach ($results as $item) {
        // Build the lookup query for the same key in the English file
        $query = "//message[@key='".$item->getAttribute('key')."']";
        $en = $ensearch->query($query);
        // If a match was found, length will be > 0
        if ($en->length) {
            $target = $en->item(0);
            // Replace the English text with the translation. A text node is
            // used instead of assigning nodeValue so that characters such as
            // '&' in the translation are escaped when the file is saved.
            while ($target->firstChild) {
                $target->removeChild($target->firstChild);
            }
            $target->appendChild($inEN->createTextNode($item->nodeValue));
        }
    }
    // Save the merged result as the new Croatian OCS locale file
    file_put_contents('OCS-HR/locale/hr_HR/locale.xml', $inEN->saveXML());

Best regards,

Ivo