OJS3: Usage events logs are not generated properly

Greetings @bozana and @ctgraham,

I have updated my GitHub issue about the statistics plugin in OJS 3.0.1, but I am leaving some more info here.

First of all, the usage_events logs are not generated properly. So far, the only entries I can see there (in my case) are:

  1. My own clicks from the server.

  2. Clicks from my own PC in the same provider's subnet.

  3. Googlebot.

Example of a click from my phone:
myIpInSubnet - - "2017-01-27 21:51:06" http://e-medjournal.com/index.php/psp/article/view/15 200 "Mozilla/5.0 (Linux; Android 5.0; MyDeviceModel) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.91 Mobile Safari/537.36"
Another example:
mySubnetIp administrative 1 "2017-01-27 16:38:40" http://e-medjournal.com/index.php/psp/article/download/4/20/62 200 "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36"
What I cannot see are clicks directly on a galley file, like http://e-medjournal.com/index.php/psp/article/view/15/25
but maybe this is normal; I do not know exactly.

Compared with the Apache access.log, about 99% of the clicks on the article's HTML galley file are definitely missing. In the Apache log, a typical click on this article is recorded like this:
IP - - [27/Jan/2017 Time +0200] "GET /index.php/psp/article/view/15/25 HTTP/1.1" 200 26774 "https://www.facebook.com/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36"

I have also processed the Apache log for this day with a regex and transformed it into a usage_events log file. A typical log line looks like this:
IP - - "2017-01-27 Time" http://e-medjournal.com/index.php/psp/article/view/15/25 200 "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36"
I deleted all clicks from bots and from my own PC, as well as clicks that were not related to article views.
After successful processing I had about 200 new entries in the metrics table for that day.
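
A conversion like the one described above could be sketched in Python roughly as follows. This is only an illustration, not the exact regex used here. It assumes the standard Apache "combined" log format with a full timestamp such as `[27/Jan/2017:21:51:06 +0200]` (the lines quoted in this post are anonymized), and it assumes the output layout seen in the usage_events examples above. The names `to_usage_event`, `BASE_URL` and `BOT_HINTS` are hypothetical, and the bot filter is a deliberately crude placeholder.

```python
import re
from datetime import datetime

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
APACHE_LINE = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

BASE_URL = "http://e-medjournal.com"       # journal host as it appears in this thread
BOT_HINTS = ("bot", "crawler", "spider")   # assumption: naive user-agent bot filter


def to_usage_event(line, base_url=BASE_URL):
    """Return a usage_events-style line, or None if the line is skipped."""
    m = APACHE_LINE.match(line)
    if m is None:
        return None  # not a combined-format line
    if m["method"] != "GET" or "/article/" not in m["path"]:
        return None  # keep only article-related requests
    if any(hint in m["agent"].lower() for hint in BOT_HINTS):
        return None  # drop obvious bots
    # Assumption: the timestamp is kept in the server's local time.
    ts = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
    stamp = ts.strftime("%Y-%m-%d %H:%M:%S")
    return (
        f'{m["ip"]} {m["ident"]} {m["user"]} "{stamp}" '
        f'{base_url}{m["path"]} {m["status"]} "{m["agent"]}"'
    )
```

Feeding each line of access.log through `to_usage_event` and writing the non-None results to a file gives lines in the same shape as the example above. Filtering of one's own IP addresses would have to be added separately.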

Another problem was that all those entries were recorded with assoc_type 1048585 (what exactly does this mean?) and with an empty representation_id column. After I changed them to assoc_type 515 (galley file?), the clicks appeared in the front-end graph. The metrics table has about 200 rows for that day, but the front end shows 1800 clicks, which matches the number of processed lines in the log file for this article (so does one row in the metrics table not equal one click?). From the usage_events log that OJS generated for that day, I got only 5-10 entries associated with that article, and 10 clicks according to the statistics graph on the front end.

To summarize:

  1. I think the problem is in recording clicks on HTML galleys (maybe XML too) in the logs.
  2. It needs to be reviewed whether assoc_type is recorded correctly in the metrics table.
  3. Maybe someone could show me in a post how a click on an HTML galley file should be recorded in the usage_events log file (without sensitive info like the IP)?

Also, inspecting the metrics table mentioned above, I see that representation_id is missing for several articles where it should be set. After I corrected it manually, these views are displayed on the front end. It seems these were not abstract views but galley views.

Hi @Vitaliy

Thanks a lot for reporting!
I already answered in the GitHub issue, but maybe also here: URLs like http://e-medjournal.com/index.php/psp/article/view/15/25 (i.e. without the file ID at the end) should never be logged or counted. That page calls the file URL with the file ID, and it is that URL that should be logged. I will have to check whether the statistics calculation considers this kind of URL when reading from Apache logs; that would be wrong.
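
To make the distinction concrete, here is a rough Python sketch that sorts article URLs by how many numeric IDs follow `article/view` or `article/download`. It is not how OJS itself decides what to log. The mapping of one ID to an abstract page, two IDs to a galley page and three IDs to a file request is an assumption inferred only from the URLs quoted in this thread, and the name `classify_article_url` is hypothetical.

```python
import re
from urllib.parse import urlsplit

# Matches .../article/view/<id>[/<id>...] or .../article/download/<id>[/<id>...]
ARTICLE_PATH = re.compile(r'/article/(?:view|download)/(?P<ids>\d+(?:/\d+)*)/?$')


def classify_article_url(url):
    """Classify a URL by the number of numeric IDs after the operation."""
    path = urlsplit(url).path  # ignore any query string or fragment
    m = ARTICLE_PATH.search(path)
    if m is None:
        return "other"
    id_count = len(m["ids"].split("/"))
    if id_count == 1:
        return "abstract"      # e.g. article/view/15
    if id_count == 2:
        return "galley_page"   # e.g. article/view/15/25 (no file ID)
    return "file"              # e.g. article/download/4/20/62 (file ID last)
```

Under these assumptions, lines classified as `galley_page` are the ones that should not be counted, while `file` lines are the ones expected in the usage_events log.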
Do you use the OJS htmlArticleGalley plugin for HTML articles? It seems that the problem is with your HTML articles…
Also, the fact that galley access is logged as article view access points to the same problem that was fixed with these PRs:
pkp-lib ojs-stable-3_0_1: #2170
ojs ojs-stable-3_0_1: pkp/ojs#1181

Could you maybe send your HTML file and, if necessary, explain how this file is displayed? I will ask if I need something else…

Thanks!
Bozana