COUNTER stats after OJS 2.4.3 seem much bigger than they should be

I was pulling some reports today, and I noticed that the stats returned from COUNTER since I upgraded to 2.4.3 are 3 to 5 times bigger than they should be.

For example, total downloads across all journals in 2013 were 152974. In 2014, counting only from April onward (when I upgraded), they were 646938.

It is impossible that I have that much traffic. I checked Google Analytics, and the download count is even bigger than the page views of the site.

What is wrong in this case?

Hi @luizborges,

Do the numbers cover the same time period?

Thanks,
Bruno

No. I have the old metrics up to April 2014; after that I only have the new metrics, which seem inflated.
In the example I mentioned, the old metrics run from January to December 2013. The new metrics run from April to December 2014.
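Since the two periods differ in length (12 months versus 9), a like-for-like comparison needs monthly averages. A minimal sketch using only the totals quoted above; the month counts are derived from the stated date ranges:

```python
# Totals quoted in the thread; month counts follow from the stated ranges.
old_total, old_months = 152974, 12   # January to December 2013
new_total, new_months = 646938, 9    # April to December 2014

old_monthly = old_total / old_months
new_monthly = new_total / new_months
inflation = new_monthly / old_monthly

print(f"2013 average per month: {old_monthly:.0f}")
print(f"2014 average per month: {new_monthly:.0f}")
print(f"Ratio of monthly averages: {inflation:.2f}")
```

On a per-month basis the new figures come out at a bit more than 5.6 times the old ones, so the gap is larger than the raw totals suggest.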

@luizborges,

Are you using MySQL or PostgreSQL?

Regards,
Bruno

@beghelli, MySQL, but I don’t see how that could influence anything.

@luizborges,

We already have a known bug in stats processing with the acron plugin: the plugin was calling the stats processing more than once, because PostgreSQL doesn’t support the function we use to avoid duplicate calls.

But since you’re using MySQL, I am not sure what is happening. Let’s make sure that the stats processing is OK. Please do the following:

1 - go to your OJS files_dir folder, look under usageStats/archive, and pick any file there (note its full name, including the extension).

2 - run this query:

SELECT count(metric) FROM metrics WHERE load_id = "YOUR FILENAME HERE"

3 - save the result;

4 - turn off the acron plugin on the generic plugins administration page;

5 - move the file you picked from usageStats/archive to the usageStats/stage folder;

6 - run this PHP script:

php tools/runScheduledTasks.php plugins/generic/usageStats/scheduledTasksAutoStage.xml

7 - run the same SQL query and compare the results.
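The before/after comparison in the steps above can be sketched as follows. This is only an illustration: it uses an in-memory SQLite table as a hypothetical, simplified stand-in for the OJS `metrics` table (only `load_id` and `metric` come from the query above; `assoc_id` and the sample rows are made up). It also reports `SUM(metric)`, which is an addition to the original query, since a row count alone would not reveal inflated values within rows.

```python
import sqlite3

def load_stats(conn, load_id):
    """Return (row_count, metric_total) for the rows loaded from one log file."""
    row = conn.execute(
        "SELECT COUNT(metric), COALESCE(SUM(metric), 0) "
        "FROM metrics WHERE load_id = ?",
        (load_id,),
    ).fetchone()
    return row[0], row[1]

# Hypothetical, simplified stand-in for the real metrics table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (load_id TEXT, assoc_id INTEGER, metric INTEGER)")

sample = [("example.log", 1, 3), ("example.log", 2, 5)]
conn.executemany("INSERT INTO metrics VALUES (?, ?, ?)", sample)
before = load_stats(conn, "example.log")

# Simulate the failure mode being tested for: the same file loaded twice.
conn.executemany("INSERT INTO metrics VALUES (?, ?, ?)", sample)
after = load_stats(conn, "example.log")

print("before:", before, "after:", after)
print("duplicated" if after != before else "unchanged")
```

If reprocessing a file leaves both numbers unchanged, duplicate loading of that file is unlikely to be the cause of the inflation.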

Let us know the result and we will continue to debug this.

Thanks,
Bruno

My acron plugin is disabled; I’m using cron to run the same command as in step 6.

SQL result before the test with “usage_events_20150531.log”: 3076.
SQL result after the test: 3076.

The file “usage_events_20150531.log” is still in the stage folder. Is that normal? Shouldn’t it have been moved to the archive?

I do remember that just after the migration it didn’t quite work. So I followed the instructions on http://pkp.sfu.ca/wiki/index.php?title=PKP_Statistics_Framework#Processing_Log_Files to use cron and it seemed to work (now I think it might be doing something wrong).

@luizborges,

After processing, the file should be in the archive folder only. Can you check the permissions on the stats log files and folders? The user running the PHP process should have write access to them.
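A quick way to check this is sketched below. The subfolder names are the ones mentioned in this thread; the base path is an assumption and should point at usageStats inside your files_dir. Run it as the same user that runs the cron job, otherwise the result says nothing about the scheduled task.

```python
import os
import tempfile

# Subfolder names mentioned in this thread; adjust to your installation.
SUBFOLDERS = ["usageEventLogs", "stage", "processing", "archive"]

def unwritable_folders(base_dir, subfolders=SUBFOLDERS):
    """Return the subfolders that are missing or not writable by the current user."""
    problems = []
    for name in subfolders:
        path = os.path.join(base_dir, name)
        # Moving files in and out of a folder needs write and execute access.
        if not (os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)):
            problems.append(name)
    return problems

# Demo against a throwaway directory in which only two folders exist.
with tempfile.TemporaryDirectory() as base:
    for name in ("stage", "archive"):
        os.mkdir(os.path.join(base, name))
    demo_result = unwritable_folders(base)
    print(demo_result)
```

An empty list means every folder exists and is writable for the user who ran the check.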

Cheers,
Bruno

Isn’t there a safeguard that prevents it from running multiple times? I just checked and I have full permissions to run it, and the script doesn’t output any errors.

@luizborges,

Yes, there is. It is just not working properly with PostgreSQL. With MySQL it works fine.

Can you make sure that you have only one usage_events_20150531.log inside the usageStats folder? If you have two, can you delete one, move the other one to stage and run the task again?
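To look for stray copies, something like the sketch below would do. The folder layout in the demo is made up for illustration; point `find_copies` at your real usageStats folder instead.

```python
import os
import tempfile

def find_copies(base_dir, filename):
    """Return sorted paths (relative to base_dir) of every file named `filename`."""
    matches = []
    for root, _dirs, files in os.walk(base_dir):
        if filename in files:
            full_path = os.path.join(root, filename)
            matches.append(os.path.relpath(full_path, base_dir))
    return sorted(matches)

# Demo: the same log file present in two subfolders of a throwaway directory.
log_name = "usage_events_20150531.log"
with tempfile.TemporaryDirectory() as base:
    for folder in ("stage", "archive"):
        os.mkdir(os.path.join(base, folder))
        with open(os.path.join(base, folder, log_name), "w") as handle:
            handle.write("")
    copies = find_copies(base, log_name)
    print(copies)
```

More than one entry in the result means the file exists in several folders at once.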

I moved the file between tries, so there is only one copy in a folder at a time.
I tried running it with the file in the stage, processing, and usageEventLogs folders, with no changes (and no errors). Is there a way to run it in verbose mode? Or maybe I could run it with a different task file than scheduledTasksAutoStage.xml?

@beghelli, I checked today and the file was processed without errors overnight (the files were moved from stage to archive), and the count in the database stayed the same.
Any thoughts?

EDIT2: Some data for January 2015 from an established journal, which I got using the custom report:
Article file downloads: 12923
Article abstract page views: 4919
Issue file downloads: 423
Issue table of contents page views: 1562
Journal main page views: 1591
Google Analytics total page views for the journal (01/2015): 8965

Previous year’s PDF downloads (via COUNTER stats): 5484
Google Analytics total page views for the journal (01/2014): 12260

There is something wrong either with the previous COUNTER stats or with the new reporting system. In 2014 I had a ratio of about 2.24:1 of page views to downloads, and in 2015 it is 1:1.44. I can’t imagine how to account for this change…

Could it be that the previous COUNTER system didn’t count some downloads? Maybe direct links from Google, or something like that?
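For reference, the two ratios can be reproduced from the figures listed above. This sketch assumes the 2015 download figure is the "Article file downloads" line and the 2014 one is the COUNTER PDF download count:

```python
# Figures quoted above for one journal, January of each year.
ga_views_2014, downloads_2014 = 12260, 5484   # Google Analytics vs. old COUNTER stats
ga_views_2015, downloads_2015 = 8965, 12923   # Google Analytics vs. new article file downloads

ratio_2014 = ga_views_2014 / downloads_2014   # page views per download
ratio_2015 = downloads_2015 / ga_views_2015   # downloads per page view

print(f"2014: {ratio_2014:.2f} page views per download")
print(f"2015: {ratio_2015:.2f} downloads per page view")
```

Page views fell by roughly a quarter between the two months while reported downloads more than doubled, which is what flips the ratio.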