Duplicate Galley View Counts

Dear PKP staff and forum members,

I am currently running OPS 3.3.0.8 (PHP 8.1).
I have encountered an issue in the “preprint_details.tpl” template of our OPS system: the getTotalGalleyViews function double-counts galley views. The problem occurs when multiple galley_id values are associated with the same submission_file_id.

The getTotalGalleyViews function currently calculates galley views as follows:

function getTotalGalleyViews() {
    $application = Application::get();
    $publications = $this->getPublishedPublications();
    $views = 0;

    foreach ($publications as $publication) {
        foreach ((array) $publication->getData('galleys') as $galley) {
            $file = $galley->getFile();
            if (!$galley->getRemoteUrl() && $file) {
                $views = $views + $application->getPrimaryMetricByAssoc(ASSOC_TYPE_SUBMISSION_FILE, $file->getId());
            }
        }
    }
    return $views;
}

In this function, if the same submission_file_id is associated with multiple galley_id values, the file's views are added once for each galley_id, so the same views are counted more than once.

To prevent duplicate counting, I propose modifying the function to count views only once per submission_file_id. The following code includes a check for duplicate file IDs and counts each file’s views only once:

function getTotalGalleyViews() {
    $application = Application::get();
    $publications = $this->getPublishedPublications();
    $views = 0;
    $processedFiles = [];

    foreach ($publications as $publication) {
        foreach ((array) $publication->getData('galleys') as $galley) {
            $file = $galley->getFile();
            if (!$galley->getRemoteUrl() && $file) {
                $fileId = $file->getId();
                if (!in_array($fileId, $processedFiles)) {
                    $views += $application->getPrimaryMetricByAssoc(ASSOC_TYPE_SUBMISSION_FILE, $fileId);
                    $processedFiles[] = $fileId;
                } else {
                    // Debug log for duplicate file IDs
                    error_log("Duplicate file ID: " . $fileId);
                }
            }
        }
    }
    return $views;
}
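
As a self-contained illustration of the same idea outside OPS, the sketch below uses plain arrays in place of the galley and metrics objects. The function name sumViewsOncePerFile, the array shapes, and the view numbers are hypothetical and not part of OPS. It stores the file IDs already counted as array keys, so the duplicate check is an isset() lookup rather than an in_array() scan:

```php
<?php
// Hypothetical stand-in for OPS data: each galley row carries the ID of the
// submission file it points to. These are not the real OPS objects or API.
function sumViewsOncePerFile(array $galleys, array $viewsByFileId): int
{
    $views = 0;
    $seen = []; // file IDs already counted, stored as array keys

    foreach ($galleys as $galley) {
        $fileId = $galley['submission_file_id'];
        if ($fileId === null || isset($seen[$fileId])) {
            continue; // galley without a file, or file already counted
        }
        $seen[$fileId] = true;
        $views += $viewsByFileId[$fileId] ?? 0;
    }
    return $views;
}

// Two galleys share file 10; a third galley uses file 11.
$galleys = [
    ['galley_id' => 1, 'submission_file_id' => 10],
    ['galley_id' => 2, 'submission_file_id' => 10],
    ['galley_id' => 3, 'submission_file_id' => 11],
];
$viewsByFileId = [10 => 5, 11 => 7];

echo sumViewsOncePerFile($galleys, $viewsByFileId), PHP_EOL; // 12 (a naive sum gives 17)
```

With these made-up numbers, file 10 contributes its 5 views once instead of twice.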

Additionally, to identify all submission_file_id values that have multiple galley_id values, the following query can be used:

SELECT submission_file_id, COUNT(galley_id) AS galley_count
FROM publication_galleys
GROUP BY submission_file_id
HAVING COUNT(galley_id) > 1;

I would appreciate any feedback or suggestions for improvements on this fix. It would be great if this bug fix could be incorporated into the community version.

Thank you for your assistance.

Minoru Tanabe.

Additional question:
I would like to understand what happens when a preprint author creates a new version (by pressing the “Create New Version” button). There appear to be two possible scenarios: one where one submission_file_id is associated with one galley_id, and another where one submission_file_id is associated with multiple galley_id values. Is this the intended behavior of OPS, or is it a bug?

Thanks.

Dear PKP staff,

My previous question may have been complex and hard to understand, so this time I will ask a more straightforward one.
When an author, moderator, or administrator revises a published preprint (by pressing the “Create New Version” button), there seem to be cases where a record is added to the “submission_files” table for the new version and cases where it is not.
Could you please explain which operations lead to this difference? In both cases, records for the revised version are created in the “publications” and “publication_galleys” tables, as expected.

Best.

I have recently discovered that new records are not always created in the submission_files table during a revision. For example, when the author changes only the bibliographic information (metadata) and does not modify any preprint files (i.e., does not replace the uploaded PDF file), no new records are created in the submission_files table, and the old version’s records are reused.

Although this revision pattern is uncommon, I have observed several instances of it. In such cases, I have confirmed that the download counts shown on the preprint list screen are double-counted.
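
To make this scenario concrete, here is a minimal sketch in plain PHP. The rows are made-up examples shaped like publication_galleys records, and findSharedFileIds is a hypothetical helper, not an OPS function. It groups galleys by submission_file_id and keeps only the files referenced by more than one galley, which is the situation a metadata-only revision produces:

```php
<?php
// Hypothetical rows shaped like publication_galleys; all IDs are made up.
function findSharedFileIds(array $galleyRows): array
{
    $galleyIdsByFile = [];
    foreach ($galleyRows as $row) {
        $galleyIdsByFile[$row['submission_file_id']][] = $row['galley_id'];
    }
    // Keep only files referenced by more than one galley.
    return array_filter($galleyIdsByFile, fn(array $ids) => count($ids) > 1);
}

$rows = [
    // Version 1: the PDF is uploaded as file 10.
    ['publication_id' => 1, 'galley_id' => 1, 'submission_file_id' => 10],
    // Version 2: metadata-only revision, so the new galley reuses file 10.
    ['publication_id' => 2, 'galley_id' => 2, 'submission_file_id' => 10],
    // Version 3: the PDF is replaced, so a new file record 11 is created.
    ['publication_id' => 3, 'galley_id' => 3, 'submission_file_id' => 11],
];

print_r(findSharedFileIds($rows)); // file 10 is shared by galleys 1 and 2
```

Any file ID returned here would have its views added once per galley by a per-galley sum.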

While the correction code I provided earlier resolves the double-counting issue, I request that this bug be officially fixed in the OPS 3.3 series and that the fix also be applied to versions 3.4 and later.

Hi @Minoru_Tanabe

Let me investigate… and I will then come back to you…

Thanks!
Bozana

Hi @Minoru_Tanabe,

I have taken a look, and yes, you are right.
I opened a new GitHub issue; see “OPS: Duplicate Galley Views Countes” (Issue #10082 in pkp/pkp-lib). I will provide a fix there, and it should be available in the next release, 3.3.0.18.
The function is implemented differently in 3.4.x, because the underlying usage statistics code has been fully rewritten, so the problem should not appear there.

Thanks,
Bozana


Hi @bozana,

Sorry for my late reply. Thank you for your response.
I checked the changes. I confirmed that the code you modified produces the same results as the code I provided (though some variable names are different).
I also understand that the changes will be included in version 3.3.0.18, and that this issue won’t occur in version 3.4.x due to a different implementation.

Thank you for your assistance.
Minoru Tanabe