Merge Fr_CA and FR_Fr

Hi,

The FR_Fr translation seems to be lacking community support and is always incomplete. The FR_CA translation, on the other hand, is well supported in OJS. Is there any specific reason why we should keep the FR_Fr translation as a separate locale? I believe that it would be more efficient to create a single French locale based on FR_CA. Linguistic differences at an academic level are marginal and I don’t believe that it would make much difference for the UI anyway. An alternative could be for FR_Fr to mirror FR_CA by default so that translators can focus on regionalisms.

Hi @pheckler,

There are differences between Canadian and France French (speaking of language, rather than just OJS translations), and while I’m not especially fluent in either variant, I think the issue is probably that choosing a single variant will lead the translation towards whichever variant receives the most volunteer translation contribution. This might cause confusion among users and conflict between translators. And as PKP is based primarily in Canada, it’s important to us to support Canadian French.

Fortunately we’ve recently started using Weblate for translation management (see Introducing Weblate: A new path for OJS & OMP translations) and this has made it much easier for groups to collaborate on translations. There will be some initial work required in refreshing a translation that hasn’t received as much maintenance, but after that it should be quite easy to prepare each new release with the bits and pieces that have changed or been added.

Regards,
Alec Smecher
Public Knowledge Project Team

Hi @asmecher,

Thank you for the response. When I have the time, I will try to contribute through Weblate to get the France locale up to date (I made a pull request directly on GitHub earlier, which I should probably close).

Hi @pheckler and @asmecher,

If I remember correctly, I suggested this kind of merge earlier (maybe in the Slack translation channel, but since I don’t have paid access, it seems I can’t see messages prior to a certain date…).

From what I could see when I became involved with OJS in ~2018, the fr_FR translation was mostly based on the fr_CA translation done a few years earlier (for OJS 2) by my colleagues at UQAM (Canada). At that time I did not see much difference between the two.

I, for my part, have been contributing to the fr_CA translation of OJS 3.x since then. I have done frequent updates and I am still actively involved. I must say that, when doing the fr_CA translation, I really try to stick to the original en_US source, choose more broadly accepted terms when possible/relevant, and rely on the localisation plugin for more in-depth localisation/regionalism needs (really not much, if any). I try (and I hope I have succeeded?.. :blush:) to make the updates with rigor and method, reusing the exact same vocabulary across the different OJS/PKP components.

Hence I totally agree when @pheckler says “I believe that it would be more efficient to create a single French locale based on FR_CA. Linguistic differences at an academic level are marginal and I don’t believe that it would make much difference for the UI anyway. An alternative could be for FR_Fr to mirror FR_CA by default so that translators can focus on regionalisms”. Maybe this is a situation similar to en_US, en_CA, en_UK, en_AU where only one English version seems to be needed…

Marie-Hélène V.
Université de Montréal

Hi all,

The fr_FR translation was forked from the fr_CA translation in July 2016 after a conversation much like this one – I’ll try to track it down for context. (@Marie-Helene, I hear you about getting locked out of our own conversation history! We’re planning to move to Mattermost to resolve this.)

Regards,
Alec Smecher
Public Knowledge Project Team

Hi,

This is exactly what I had in mind! That being said, I spent a few hours yesterday updating the FR_Fr locale in Weblate and there is quite a lot of work to be done. I have been focusing on missing/invalid strings and failed checks because I believe these to be the most urgent, but from what I have seen so far, an overall review of the translations will be needed to catch up with FR_CA. I have isolated three weaknesses in the current FR_Fr file:
1- Typographical rules, especially regarding spacing and French quotation marks (« »), are not systematically respected. Weblate can catch spacing errors, but I do not know if it can catch quotation marks.
2- Some translations have a loose relationship with the source. Reading @Marie-Helene’s post, I understand that they are most likely a legacy of OJS 2.
3- Terms are not consistent across the platform.

From what I have seen, FR_CA performs systematically better on all three aspects. Typography is impeccable, and translations are accurate (in relation to the source) and consistent. I have been through roughly a hundred strings so far and I would say that in at least 90% of cases, FR_Fr and FR_CA are (or should be) identical. The remainder comes down to inclusive writing and to a few terms where usage might differ, although I have not yet come across any proper regionalism. For example, FR_CA translates upload as téléverser; the term is accepted in France and is part of most dictionaries, but the more generic télécharger (which covers both download and upload) is more frequently used.

I am hoping to be able to bridge the gap in missing strings from FR_Fr soon (at least in OJS and the Library). Once this is done, it might be interesting to compare the two and assess whether maintaining both is really justified. In the meantime, @Marie-Helene, it seems that the Canadian translation team is quite organised and I would be interested in your input on some translation aspects (e.g. inclusive writing, for which support in FR_Fr is subpar to say the least).

Paul
École de droit de la Sorbonne

Bonjour @pheckler ,

Thank you for this detailed report. I’m glad you liked my translation!
Regarding “téléverser/télécharger”, yes, indeed, we usually use the more generic term “télécharger” for both upload and download. But since in OJS those 2 actions (upload a file; download a file) are sometimes available to the user (reviewer, copyeditor) within the same screen while serving different purposes, I needed to make sure that the distinction between the 2 options was clearly made. As for inclusive writing (which sometimes gives me headaches…), I will be pleased to discuss this further with you.

If we were to choose one and only one French translation, I could also consider making some modifications to my present translation to make it more “universal”, say. It remains to be seen where these changes would be needed though.

Marie-Hélène Vézina
Université de Montréal

Bonjour @Marie-Helene,

I had not thought about that at all! The distinction makes a lot of sense in that context and I think FR_Fr should stick to it as well.

I might open a dedicated post in the translation section for this since it is a separate topic.
Edit: the post is here :slightly_smiling_face:

Let’s see where we stand when FR_Fr is completed, but your translation as it stands looks fairly standard to me. I will make notes if I see any regionalisms in the Canadian strings as I go on :smile:

Paul

Hi all,

I am done with the rough translation of OJS components to FR_Fr (reviewing the failing checks for the email component was not necessarily fun :sweat_smile:). Once the translation has been pushed, I will run gettext concatenate on a few files and draw some stats from it to get a better picture of just how similar the two locales are.
In the meantime, I’ll get to work on the web library!

Hi @pheckler,

Huge thanks for all this work!

I’m not especially familiar with the gettext tools, so if you find them useful, please write the details up briefly here!

Thanks,
Alec Smecher
Public Knowledge Project Team

Hi,

I did some preliminary work with the gettext msgcat command, which basically merges two .po files into a third one. Identical strings are automatically merged in the new file, while strings that differ are both added with a #, fuzzy tag to flag the mismatch. I initially ran the tool on three files: admin.po, submission.po, and default.po, which together comprise 166 strings. I then went through every fuzzy string manually to check the actual differences. I identified three reasons for them:

  • inclusive writing,
  • incorrect translation (ranging anywhere from a spacing difference to a flat-out invalid translation), and
  • diverging translations (i.e. both strings are correct, but dissimilar).
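For anyone without the gettext tools at hand, the comparison step described above can be roughly approximated in Python. This is only a sketch and not what msgcat does internally: the tiny parser ignores comments, msgctxt and plural forms, and the keys example.upload, example.path and example.read are made up for illustration, as is the assignment of each translation to a locale.

```python
import re

# Escapes handled by this sketch; anything else is kept literally.
_ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}


def _unquote(token):
    """Strip the surrounding quotes of a PO string and decode common escapes."""
    body = token.strip()[1:-1]
    return re.sub(r"\\(.)", lambda m: _ESCAPES.get(m.group(1), m.group(1)), body)


def parse_po(text):
    """Return {msgid: msgstr} for simple entries (no msgctxt, no plurals)."""
    pairs = []    # each item is [msgid, msgstr]
    field = None  # index of the field that continuation lines extend
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            pairs.append([_unquote(line[len("msgid "):]), ""])
            field = 0
        elif line.startswith("msgstr ") and pairs:
            pairs[-1][1] = _unquote(line[len("msgstr "):])
            field = 1
        elif line.startswith('"') and field is not None:
            pairs[-1][field] += _unquote(line)  # multi-line string
        else:
            field = None  # comment, blank line, or unsupported keyword
    return {key: value for key, value in pairs if key}  # drop the header entry


def compare(po_a, po_b):
    """Split the keys shared by two catalogs into identical and differing."""
    a, b = parse_po(po_a), parse_po(po_b)
    shared = a.keys() & b.keys()
    identical = sorted(k for k in shared if a[k] == b[k])
    differing = sorted(k for k in shared if a[k] != b[k])
    return identical, differing


# Hypothetical sample data, not copied from the real locale files.
FR_CA = '''
msgid "example.upload"
msgstr "Téléverser"

msgid "example.path"
msgstr "Chemin"

msgid "example.read"
msgstr "Lire"
'''

FR_FR = '''
msgid "example.upload"
msgstr "Télécharger"

msgid "example.path"
msgstr "Chemin"

msgid "example.read"
msgstr "Lire"
'''

identical, differing = compare(FR_CA, FR_FR)
print(len(identical), "identical;", len(differing), "to review:", differing)
```

On real files (read with encoding="utf-8"), the differing list plays the same role as the fuzzy entries in the msgcat output: it is the set of keys that still has to be reviewed by hand.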

Here are the numbers for each of those three files:

admin.po (55 strings):

  • Identical: 25 strings (45%)
  • Fuzzy: 30 strings (55%), distributed as follows:
    • inclusive writing: 14 strings (47%)
    • incorrect translation: 7 strings (23%)
    • diverging translation: 9 strings (30%)

submission.po (69 strings):

  • Identical: 35 strings (51%)
  • Fuzzy: 34 strings (49%)
    • inclusive writing: 8 strings (24%)
    • incorrect translation: 19 strings (56%)
    • diverging translation: 7 strings (20%)

default.po (42 strings):

  • Identical: 20 strings (48%)
  • Fuzzy: 22 strings (52%)
    • inclusive writing: 11 strings (50%)
    • incorrect translation: 9 strings (40%)
    • diverging translation: 2 strings (10%)

Overall, half of all strings in these files are identical and merge automatically. I ran the command on a bigger file to confirm this (author.po, 103 strings) and got similar results: 54 identical strings (52%), 49 fuzzy (48%).

Incorrect strings account for an average of 40% of all fuzzy strings. Many of them involve trivial spacing issues, a few typos, or erroneous translations (a wrong translation of the word “galley” in FR_Fr accounts for almost half of all incorrect strings in submission.po). In all cases but one, the error is limited to one of the two strings and the other is correct, meaning the two should be identical once corrected. The only exception is admin.languages.supportedLocalesInstruction in admin.po (the FR_Fr translation is incorrect and FR_CA contains a typo).

Inclusive writing is another major source of divergence (37% of all fuzzy strings on average), but the main takeaway from the post dedicated to this issue in the translation section is that both locales can find common ground on this.

Actual diverging translations are quite limited both in number (23% of all fuzzy strings) and in substance. Most are trivial and limited to a single word (e.g. all diverging strings in admin.po boil down to these: ceci/cela, chemin/chemin d’accès, OJS/de OJS, PKP/de PKP, lire/voir, informations/renseignements). I identified what I believe to be one regionalism, intéressé à soumettre (CA) / intéressé pour soumettre (Fr), and I spotted one or two more while translating the email component, which is not included here. The few remaining strings are ones that, in my opinion, are better translated in one locale or the other, but both versions are correct and do not contain any Canadian regionalism (I don’t think they contain any French regionalism either, but a Canadian colleague would be better placed to confirm this).
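If anyone wants to re-derive the shares from the per-file counts listed above, here is a small Python sketch. It pools the three files (admin.po, submission.po, default.po) instead of averaging per-file percentages, and it leaves out author.po because its fuzzy strings were not broken down by cause. The pooled figures can therefore differ by a few points from the averages I quoted, depending on how the averaging is done.

```python
# Per-file counts copied from the lists above.
COUNTS = {
    "admin.po": {"identical": 25, "inclusive": 14, "incorrect": 7, "diverging": 9},
    "submission.po": {"identical": 35, "inclusive": 8, "incorrect": 19, "diverging": 7},
    "default.po": {"identical": 20, "inclusive": 11, "incorrect": 9, "diverging": 2},
}

CAUSES = ("inclusive", "incorrect", "diverging")


def pooled_shares(counts):
    """Pool all files, then express each cause as a share of the fuzzy strings."""
    total = sum(sum(per_file.values()) for per_file in counts.values())
    identical = sum(per_file["identical"] for per_file in counts.values())
    fuzzy = total - identical
    by_cause = {c: sum(per_file[c] for per_file in counts.values()) for c in CAUSES}
    return {
        "total": total,
        "identical": identical,
        "fuzzy": fuzzy,
        "share_of_fuzzy": {c: n / fuzzy for c, n in by_cause.items()},
    }


result = pooled_shares(COUNTS)
print(f"{result['identical']} identical / {result['fuzzy']} fuzzy "
      f"out of {result['total']} strings")
for cause, share in result["share_of_fuzzy"].items():
    print(f"{cause}: {share:.0%} of fuzzy strings")
```

Pooling gives more weight to the larger files, which is one reason the result may not match a plain average of the three per-file percentages.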

All of this seems to indicate that FR_CA and FR_Fr are substantially “mergeable” since virtually all differences have little to do with linguistics. However, the workload would be substantial. As things stand (and assuming those stats accurately represent the whole picture), merging would require a review of half of all strings. Even after eliminating all errors and inclusive writing issues, we would still need to go through 20% of all strings.

I have uploaded the raw concatenated files to GitHub if anyone wants to check them out.

Paul