Images lost from XML Galleys after upgrade to OJS 3.3.0.5 from 3.2.1.1

I have received reports from a journal manager about images missing from XML galleys.

Checking the forums I’ve found this post:

It didn’t provide a solution but I could use some pointers from it to check the database contents.

Below I'll share a snippet from the XML and the contents of the database in the 3.2.1.1 version and the new 3.3.0.5 version.

[XML]

<p>

<fig id="f1">
<label>Figura 1</label>
<caption>
<title>Una posta del Estado en siglo XIX en Sudamérica</title>
</caption>
<graphic xlink:href="2314-1549 -rhaa-55-02-35-gf1.jpg"/>
<attrib>Fuente: <xref ref-type="bibr" rid="B11">Page, 2007</xref>, p. 214</attrib>
</fig>
</p>

[OJS 3.2.1.1]

mysql> SELECT * FROM submission_files WHERE original_file_name = '2314-1549 -rhaa-55-02-35-gf1.jpg';

| file_id | revision | source_file_id | source_revision | submission_id | file_type  | file_size | original_file_name               | file_stage | viewable | date_uploaded       | date_modified       | assoc_id | genre_id | direct_sales_price | sales_type | uploader_user_id | assoc_type |
|   12883 |        1 |           NULL |            NULL |          3988 | image/jpeg |     12312 | 2314-1549 -rhaa-55-02-35-gf1.jpg |         17 |        0 | 2020-10-25 21:34:42 | 2020-10-25 21:34:42 |    12882 |      202 | NULL               | NULL       |             1555 |        515 |
1 row in set (0.00 sec)

mysql> select * from genre_settings where genre_id = 202;

| genre_id | locale | setting_name | setting_value | setting_type |
|      202 | es_ES  | name         | Imagen        | string       |

[OJS 3.3.0.5]

MariaDB [revistasuncu]> select * from submission_files where file_id = 12883;

| submission_file_id | source_submission_file_id | submission_id | file_stage | viewable | created_at          | updated_at          | assoc_id | genre_id | direct_sales_price | sales_type | uploader_user_id | assoc_type | file_id |
|              14894 |                     14893 |          3438 |         11 |        0 | 2021-02-25 13:37:47 | 2021-02-25 13:40:25 |     NULL |      133 | NULL               | NULL       |             1430 |       NULL |   12883 |
1 row in set (0.001 sec)

MariaDB [revistasuncu]> select * from submission_files where file_id = 12882;

| submission_file_id | source_submission_file_id | submission_id | file_stage | viewable | created_at          | updated_at          | assoc_id | genre_id | direct_sales_price | sales_type | uploader_user_id | assoc_type | file_id |
|              14893 |                      NULL |          3438 |          6 |        1 | 2021-02-25 13:37:47 | 2021-02-25 13:37:47 |     NULL |      133 | NULL               | NULL       |             1430 |       NULL |   12882 |
1 row in set (0.000 sec)

MariaDB [revistasuncu]> select * from genre_settings where genre_id =133;

| genre_id | locale | setting_name | setting_value      | setting_type |
|      133 | es_ES  | name         | Texto del Artículo | string       |

It seems that, somehow, during the upgrade the image file lost its link to the XML file (assoc_id and assoc_type are now NULL) and also its file type and genre, among other things…

I haven't tried to "fill in the blanks" directly in the database, in the hope of finding a better, more general solution.

Thanks in advance!

Hi @hilongo,

I suspect it’s related to this issue: Links to dependent files referenced in HTML/JATS XML galleys are broken · Issue #6801 · pkp/pkp-lib · GitHub

originalFileName was removed in 3.3 and the name property is used instead. Can you compare the values for the same images in the original_file_name column before the upgrade with those after the upgrade in the submission_file_settings table (column setting_value where setting_name = 'name')?


Hi @Vitaliy … Thanks for the answer!

Sure:

SELECT * FROM submission_files WHERE original_file_name = '2314-1549 -rhaa-55-02-35-gf1.jpg';

returns:

| file_id | revision | source_file_id | source_revision | submission_id | file_type  | file_size | original_file_name               | file_stage | viewable | date_uploaded       | date_modified       | assoc_id | genre_id | direct_sales_price | sales_type | uploader_user_id | assoc_type |
|   12883 |        1 |           NULL |            NULL |          3988 | image/jpeg |     12312 | 2314-1549 -rhaa-55-02-35-gf1.jpg |         17 |        0 | 2020-10-25 21:34:42 | 2020-10-25 21:34:42 |    12882 |      202 | NULL               | NULL       |             1555 |        515 |

and

select * from submission_file_settings where setting_value = '2314-1549 -rhaa-55-02-35-gf1.jpg';

returns:

| submission_file_id | locale | setting_name | setting_value                    | setting_type |
|              12883 | en_US  | name         | 2314-1549 -rhaa-55-02-35-gf1.jpg | string       |
|              12883 | es_ES  | name         | 2314-1549 -rhaa-55-02-35-gf1.jpg | string       |

The upgrade script manages file names correctly.

Just to clarify the problem: the images are not displayed with Lens Viewer correctly but they appear attached to the galley as they should, right?

Exactly …

I have created a new version of one of the published articles and edited the XML galley to check its attachments. The images are all there, and I can click them and they display fine (screenshot: Captura de pantalla_2021-05-21_11-26-20).

Does the issue appear if the image name doesn’t contain spaces?

Hi everyone,

This is most likely a "spaces in the file name" issue. The lens galley plugin will urlencode the file name before it searches for it in the XML, and if there are spaces in the file name the pattern will no longer match:
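To make the mismatch concrete, here is a minimal sketch in Python. It is only an illustration, not the plugin's code: the plugin is PHP and uses rawurlencode, and `quote(..., safe="")` is assumed here to be a close enough analogue for this file name. The variable names are made up for the example.

```python
from urllib.parse import quote

# Illustrative only: the real plugin is PHP and calls rawurlencode();
# quote(..., safe="") is used here as an approximate Python analogue.
xml_fragment = '<graphic xlink:href="2314-1549 -rhaa-55-02-35-gf1.jpg"/>'
stored_name = "2314-1549 -rhaa-55-02-35-gf1.jpg"

# Percent-encode the stored file name: the space becomes %20.
encoded_name = quote(stored_name, safe="")

# The XML still contains the literal space, so only the raw name is found.
raw_match = stored_name in xml_fragment
encoded_match = encoded_name in xml_fragment

print(encoded_name)
print("raw name found:", raw_match)
print("encoded name found:", encoded_match)
```

The encoded name contains `%20` where the XML has a literal space, so a search for the encoded form finds nothing even though the file is attached to the galley.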

I can go either way on this, but I believe the rationale here is that URLs (and web accessible resources like filenames) shouldn’t have spaces in them.

Cheers,
Jason


Agreed 100% @jnugent !

Now… I can advise the journal managers not to use spaces in web-accessible resources from now on, but how could I fix the already published content?

Hi @hilongo

Well, that's a bit tricky. The easiest thing to do is remove the use of rawurlencode in the line I mentioned above. However, this will mean that the pattern will not match any URLs in your XML files that are already URL encoded. You could also search your XML files for image links that contain spaces and replace those space characters with %20, which will then match the pattern again.
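The %20 replacement could be sketched roughly like this in Python. This is an assumption-laden example, not a tested migration: it assumes the image links are double-quoted `xlink:href` attributes as in the snippet earlier in the thread, and `encode_spaces_in_hrefs` is a hypothetical helper name. Run it on copies of the galley files, never on the originals.

```python
import re

# Assumption: links appear as double-quoted xlink:href="..." attributes,
# as in the JATS snippet quoted earlier in this thread.
HREF_PATTERN = re.compile(r'(xlink:href=")([^"]*)(")')


def encode_spaces_in_hrefs(xml_text: str) -> str:
    """Replace literal spaces inside xlink:href values with %20.

    Text outside of xlink:href attributes is left untouched.
    """
    def _fix(match: re.Match) -> str:
        value = match.group(2).replace(" ", "%20")
        return match.group(1) + value + match.group(3)

    return HREF_PATTERN.sub(_fix, xml_text)


sample = '<graphic xlink:href="2314-1549 -rhaa-55-02-35-gf1.jpg"/>'
fixed = encode_spaces_in_hrefs(sample)
print(fixed)
```

Note that this touches every `xlink:href` in the file, including external links, so it is worth reviewing a diff before re-uploading anything.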

Finally, and this might be the most work, you could search and replace in the XML files to remove the spaces (so, convert a filename like "2420 -something" to "2420-something") and then also remove the spaces in the original file name field in the database using MySQL's REPLACE() function, but that's potentially error-prone (back up your database first if you attempt it).
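For the XML side of that last option, a rough Python sketch might look like the following. It only rewrites the links and records which names changed; the matching database update is deliberately not shown, since it depends on your installation (per the queries earlier in this thread, 3.3 keeps the name in submission_file_settings where setting_name = 'name'). `strip_spaces_in_hrefs` is a hypothetical helper, and the double-quoted `xlink:href` form is an assumption.

```python
import re

# Assumption: links appear as double-quoted xlink:href="..." attributes.
HREF_PATTERN = re.compile(r'(xlink:href=")([^"]*)(")')


def strip_spaces_in_hrefs(xml_text):
    """Remove spaces from xlink:href values.

    Returns (new_xml, renames), where renames maps each changed
    old value to its new value, so the same renames can later be
    applied (carefully, after a backup) to the stored file names.
    """
    renames = {}

    def _fix(match):
        old = match.group(2)
        new = old.replace(" ", "")
        if new != old:
            renames[old] = new
        return match.group(1) + new + match.group(3)

    return HREF_PATTERN.sub(_fix, xml_text), renames


sample = '<graphic xlink:href="2314-1549 -rhaa-55-02-35-gf1.jpg"/>'
new_xml, renames = strip_spaces_in_hrefs(sample)
print(new_xml)
print(renames)
```

Keeping the old-to-new mapping makes it easier to check that every renamed link has exactly one matching file name in the database before changing anything there.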

Cheers,
Jason


Thanks to all that contributed. I finally fixed the problem following this procedure:


Hi @hilongo

Glad to hear it worked out!

Jason
