Error in processing log files

Hi @asmecher, this is the stack trace

ojs@ojs:/production/var/ojs01$ php tools/runScheduledTasks.php plugins/generic/usageStats/scheduledTasks.xml
<h1>DB Error: ERROR:  invalid input syntax for integer: "Special issue: IX ATIt Congress"</h1><h4>Stack Trace:</h4>
<strong>File:</strong> /production/var/ojs01/classes/issue/IssueDAO.inc.php line 64<br />
<strong>Function:</strong> DAO->retrieve("SELECT i.* FROM issues i WHERE issue_id = ?", "Special issue: IX ATIt Congress")<br />
<br/>
<strong>File:</strong> /production/var/ojs01/classes/issue/IssueDAO.inc.php line 26<br />
<strong>Function:</strong> IssueDAO->getIssueById("Special issue: IX ATIt Congress", Null, False)<br />
<br/>
<strong>File:</strong> (unknown) line (unknown)<br />
<strong>Function:</strong> IssueDAO->_cacheMiss(Object(GenericCache), "Special issue: IX ATIt Congress")<br />
<br/>
<strong>File:</strong> /production/var/ojs01/lib/pkp/classes/cache/GenericCache.inc.php line 63<br />
<strong>Function:</strong> call_user_func_array(Array(2), Array(2))<br />
<br/>
<strong>File:</strong> /production/var/ojs01/classes/issue/IssueDAO.inc.php line 91<br />
<strong>Function:</strong> GenericCache->get("Special issue: IX ATIt Congress")<br />
<br/>
<strong>File:</strong> /production/var/ojs01/classes/issue/IssueDAO.inc.php line 182<br />
<strong>Function:</strong> IssueDAO->getIssueByPubId("publisher-id", "Special issue: IX ATIt Congress", "3", True)<br />
<br/>
<strong>File:</strong> /production/var/ojs01/plugins/generic/usageStats/UsageStatsLoader.inc.php line 576<br />
<strong>Function:</strong> IssueDAO->getIssueByBestIssueId("Special issue: IX ATIt Congress", "3", True)<br />
<br/>
<strong>File:</strong> /production/var/ojs01/plugins/generic/usageStats/UsageStatsLoader.inc.php line 519<br />
<strong>Function:</strong> UsageStatsLoader->_getInternalIssueId("Special issue: IX ATIt Congress", Object(Journal))<br />
<br/>
<strong>File:</strong> /production/var/ojs01/plugins/generic/usageStats/UsageStatsLoader.inc.php line 181<br />
<strong>Function:</strong> UsageStatsLoader->_getAssocFromUrl("http://www.italian-journal-of-mammalogy.it/issue/view/Special is...", "/production/var/ojs-uploads/ojs01/usageStats/processing/usage_even...", 434)<br />
<br/>
<strong>File:</strong> /production/var/ojs01/lib/pkp/classes/task/FileLoader.inc.php line 131<br />
<strong>Function:</strong> UsageStatsLoader->processFile("/production/var/ojs-uploads/ojs01/usageStats/processing/usage_even...", Null)<br />
<br/>
<strong>File:</strong> /production/var/ojs01/plugins/generic/usageStats/UsageStatsLoader.inc.php line 124<br />
<strong>Function:</strong> FileLoader->executeActions()<br />
<br/>
<strong>File:</strong> /production/var/ojs01/lib/pkp/classes/scheduledTask/ScheduledTask.inc.php line 149<br />
<strong>Function:</strong> UsageStatsLoader->executeActions()<br />
<br/>
<strong>File:</strong> /production/var/ojs01/lib/pkp/classes/cliTool/ScheduledTaskTool.inc.php line 112<br />
<strong>Function:</strong> ScheduledTask->execute()<br />
<br/>
<strong>File:</strong> /production/var/ojs01/lib/pkp/classes/cliTool/ScheduledTaskTool.inc.php line 95<br />
<strong>Function:</strong> ScheduledTaskTool->executeTask("plugins.generic.usageStats.UsageStatsLoader", Array(0))<br />
<br/>
<strong>File:</strong> /production/var/ojs01/lib/pkp/classes/cliTool/ScheduledTaskTool.inc.php line 66<br />
<strong>Function:</strong> ScheduledTaskTool->parseTasks("plugins/generic/usageStats/scheduledTasks.xml")<br />
<br/>
<strong>File:</strong> /production/var/ojs01/tools/runScheduledTasks.php line 34<br />
<strong>Function:</strong> ScheduledTaskTool->execute()<br />
<br/>
ojs2: DB Error: ERROR:  invalid input syntax for integer: "Special issue: IX ATIt Congress"

Andrea

Hi @marchitelli,

Thanks, that helps. Could you try the patch at pkp/pkp-lib issue #578 (“Add missing int casts to IssueDAO”) on GitHub to see if it corrects the issue? If you can confirm, I’ll commit it for release in OJS 2.4.6-1.

Regards,
Alec Smecher
Public Knowledge Project Team
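
For readers following along: the stack trace shows an issue's custom identifier ("Special issue: IX ATIt Congress") being passed to a query on the integer issue_id column. The patch title suggests the fix is to cast the value to an integer first. The snippet below is only a minimal Python sketch of that kind of guard, not the actual PHP patch; the function name and the pub_ids mapping are hypothetical stand-ins for the database lookups.

```python
def resolve_issue_id(best_id, pub_ids):
    """Return the internal numeric issue ID for a URL identifier, or None.

    pub_ids is a hypothetical mapping of custom public identifiers to
    internal IDs; it stands in for a database lookup.
    """
    # First try the value as a custom public identifier.
    if best_id in pub_ids:
        return pub_ids[best_id]
    # Only fall back to a numeric lookup when the value is purely ASCII
    # digits, so free text never reaches an integer column.
    if best_id.isascii() and best_id.isdigit():
        return int(best_id)
    return None


print(resolve_issue_id("Special issue: IX ATIt Congress", {}))
print(resolve_issue_id("42", {}))
```

With this guard, a free-text identifier that matches no public ID yields no result instead of a database error.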

Hi @asmecher,
the patch causes blank pages on every page after login. This is the error log:

[Mon Jul 06 21:50:23 2015] [error] [client 130.186.18.97] ojs2 has produced an error\n  Message: WARNING: Cannot modify header information - headers already sent by (output started at /var/ojs/classes/issue/IssueDAO.inc.php:858)\n  In file: /var/ojs/lib/pkp/classes/core/PKPRequest.inc.php\n  At line: 87\n  Stacktrace: \n  Server info:\n   OS: Linux\n   PHP Version: 5.4.39-0+deb7u2\n   Apache Version: Apache/2.2.22 (Debian)\n   DB Driver: postgres\n   DB server version: PostgreSQL 9.1.12 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.7.2-5) 4.7.2, 64-bit

I tried to work around the error by running the process with the patch applied and then viewing the results without the patch.
I saw that all files were processed (they are now in the /ojs-uploads/leo/usageStats/archive directory), but the reports (e.g. the OJS usage statistics report at http://ojstest.test.it/index.php/jjournal/manager/statistics?statisticsYear=2015) contain no data.

andrea

Hi @marchitelli,

I don’t think the patch applied successfully. Did you apply it using the patch tool, or manually? If with the patch tool, what was its output?

Thanks,
Alec Smecher
Public Knowledge Project Team

You are right, @asmecher: I’ve reversed and re-applied the patch and now the import process works :blush:

However, the report downloaded from User > Journal Management > Stats & Reports > OJS usage statistics report is blank and contains no data.

Andrea

Hi @marchitelli,

I suspect you also have to patch this and then re-run your logs: Stats can't process Issues with unique identifiers · Issue #460 · pkp/pkp-lib · GitHub

Regards,
Alec Smecher
Public Knowledge Project Team

Hi @asmecher,
I have applied the patch from pkp/pkp-lib issue #460 (“Stats can't process Issues with unique identifiers”) and re-run the log processing.

My usage stats report is still blank.

Moreover, I found that in the new COUNTER report (the one generated by the new usageStats plugin) I have data only for some journals. Other journals are not listed at all.

I don’t know if this can help
Andrea

For the missing data, have you confirmed that your base_url(s) from config.inc.php match what is being captured in your old Apache log files? See: My published file views or statistics reports shows no data. What do I do?

Hi @ctgraham,
I found the problem (thank you for pointing out the base_url setting).

In my installation I have several journals: some with custom URLs and some under the URL of the site index.

In my config.inc.php I added base_url overrides only for the journals with custom URLs, and I got no data for a journal under the site index path (leo.cineca.it/index.php/jlis; the whole configuration is below).

Now I’ve added an override for jlis too and re-processed the logs, and I can see data for this journal as well.

This behavior seems strange to me, because I thought that overrides were only for custom URLs.

Is my reasoning correct?

Thanks,
Andrea


; Base URL override settings: Entries like the following examples can
; be used to override the base URLs used by OJS. If you want to use a
; proxy to rewrite URLs to OJS, configure your proxy’s URL here.
; Syntax: base_url[journal_path] = http://www.myUrl.com
; To override URLs that aren’t part of a particular journal, use a
; journal_path of “index”.
; Examples:
; base_url[index] = http://www.myUrl.com
; base_url[myJournal] = http://www.myUrl.com/myJournal
; base_url[myOtherJournal] = http://myOtherJournal.myUrl.com
base_url[index] = http://leo.cineca.it
base_url[agcm] = http://iar.agcm.it
base_url[cilea] = http://bollettino.cilea.it
base_url[caspur] = http://www.annualreport.caspur.it
base_url[ccp] = http://caspur-ciberpublishing.it
base_url[ebph] = http://ebph.it
base_url[formez] = http://www.amministrativamente.com
base_url[hystrix] = http://www.italian-journal-of-mammalogy.it
base_url[researchhpc2006] = http://researchhpc2006.cilea.it
base_url[ijph] = http://ijphjournal.it
base_url[jlis] = http://leo.cineca.it/index.php/jlis
base_url[symphonya] = http://symphonya.unimib.it
base_url[scires-it] = http://caspur-ciberpublishing.it/index.php/scires-it

What is your unqualified base_url from config.inc.php (without the [])?

This one, at the top of the file?

; The canonical URL to the OJS installation (excluding the trailing slash)
base_url = "http://leo.cineca.it"

Yes, with that configuration of base_url = http://leo.cineca.it and overrides for anything not hanging off of http://leo.cineca.it, I wouldn’t have expected a problem.

In adding the base_url[index] line, I personally would have entered:
base_url[index] = http://leo.cineca.it/index.php/index
but I don’t have an exactly parallel installation to use as a test for comparison, and your config might be correct.

But the only journal I had no data for was

base_url[jlis] = http://leo.cineca.it/index.php/jlis

which is under the same path as the index.

Now I am trying with another installation, where I have
base_url = "http://riviste.unimi.it"
and
base_url[index] = http://riviste.unimi.it/index.php/index
base_url[interfaces] = http://riviste.unimi.it/interfaces
but many other journals with standard URLs.

In this case, too, I find data only for the interfaces journal and no data for any of the other journals.
It seems a bit complicated to me to have to add every journal explicitly to the overrides when only one needs a custom URL.

Can you simplify this in some way?
Thanks
Andrea

What do you mean by “a lot of other journals with standard URLs”? Are they of the form: http://riviste.unimi.it/index.php/JOURNALNAME ?

Hi @asmecher,
the patch from pkp/pkp-lib issue #578 (“Add missing int casts to IssueDAO”) is OK, you can commit it, thanks!

I tracked down the problem with my reports thanks to @ctgraham; it seems related to the base_url overrides (see below).

Yes, e.g. http://riviste.unimi.it/index.php/DoctorVirtualis
I can’t get statistics for this journal but I can get data for http://riviste.unimi.it/interfaces (that is set up as an override):

base_url[index] = http://riviste.unimi.it/index.php/index 
base_url[interfaces] = http://riviste.unimi.it/interfaces

I can try to add an override for DoctorVirtualis and reload logs, if you think that can help to confirm the issue.

What is your restful_urls value? You seem to be mixing restful urls (http://riviste.unimi.it/interfaces) and urls with the index.php component (http://riviste.unimi.it/index.php/DoctorVirtualis). I bet that is confusing the Usage Stats process, but you are able to override the confusion by manually entering base_urls with the index.php component.

restful_urls = Off

It seems that the plugin gets confused as soon as at least one base_url override is present.
The journal with the override is the only one with working stats.

If I add even one override, I have to maintain an override for every journal.

I wonder how your journals with restful URLs are working with restful_urls = Off.

You can test your theory by turning off scheduled_tasks_report_error_only and examining the mail from (re)processing a logfile. I suspect the email will report the warning plugins.generic.usageStats.removeUrlError (“Could not remove base URL from…”) for each line relating to access within one of these journals.

This would come from:

A quick reading of the logic in the Core::_getBaseUrlAndPath() might point to the situation you are describing:

So, the processing log is full of rows similar to this one:

[2015-07-07 17:39:39] [Warning] The line number 2 from the file /ojs-uploads/riviste.unimi/usageStats/processing/usage_events_20150629.log contains an url that the system can't remove the base url from.

Line 2 of the log file is:

5.9.112.6 - - "2015-06-29 00:00:23" http://riviste.unimi.it/index.php/promoitals/article/download/433/621 200 "Mozilla/5.0 (Macintosh; Intel Mac OS X 1084) AppleWebKit/536.29.13 (KHTML like Gecko) Version/6.0.4 Safari/536.29.13"

This is a URL without an override.

If I add an override in config.inc.php for each base_url[journalname] (even if it is the same as the root base_url) and then re-run the log processing, the scheduled task log has no errors and the data are shown.
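
To check a log file against a set of overrides before re-processing, something like the following could help. It is a rough Python sketch that only mirrors the behaviour observed in this thread (a URL is resolved only when an explicit override prefixes it); it is not the usageStats plugin's own logic. The function names are made up for illustration, and the promoitals override value is an assumption that follows the base_url[jlis] pattern shown earlier.

```python
import re

# Layout of one usage event line as quoted in this thread:
# ip - - "timestamp" url status "user agent"
LINE_RE = re.compile(r'^(\S+) \S+ \S+ "([^"]*)" (\S+) (\d{3}) "(.*)"$')


def parse_usage_event(line):
    """Split one usage event log line into its fields, or return None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    ip, timestamp, url, status, user_agent = match.groups()
    return {"ip": ip, "timestamp": timestamp, "url": url,
            "status": int(status), "user_agent": user_agent}


def strip_base_url(url, overrides):
    """Return (journal_path, remainder) for the longest matching override.

    Returns None when no override prefixes the URL.
    """
    best = None
    for journal, base in overrides.items():
        base = base.rstrip("/")
        if url == base or url.startswith(base + "/"):
            if best is None or len(base) > len(best[1]):
                best = (journal, base)
    if best is None:
        return None
    journal, base = best
    return journal, url[len(base):]


line = ('5.9.112.6 - - "2015-06-29 00:00:23" '
        'http://riviste.unimi.it/index.php/promoitals/article/download/433/621 '
        '200 "Mozilla/5.0 (Macintosh; Intel Mac OS X 1084) AppleWebKit/536.29.13 '
        '(KHTML like Gecko) Version/6.0.4 Safari/536.29.13"')

# The two overrides from the riviste.unimi.it configuration above.
overrides = {
    "index": "http://riviste.unimi.it/index.php/index",
    "interfaces": "http://riviste.unimi.it/interfaces",
}
event = parse_usage_event(line)
print(strip_base_url(event["url"], overrides))  # no override matches this URL

# Assumed override value, following the base_url[jlis] pattern above.
overrides["promoitals"] = "http://riviste.unimi.it/index.php/promoitals"
print(strip_base_url(event["url"], overrides))
```

Running each line of a usage_events file through these two functions lists the journals that would still need an explicit override.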