This post summarizes an issue involving PDF.js and EZproxy, the OCLC software that libraries use to proxy resources. It’s somewhat more verbose than it needs to be, so that it includes all the likely search terms for this issue.
In this example I’ll use the OJS server http://mapress.com/ but the issue affects many for-pay installs of OJS when used through EZproxy. I’ll be using my institutional EZproxy install. For the example resource:
Clearly the second instance of host name in the URL isn’t being rewritten. When the javascript tries to access that non-rewritten URL it’s flagged as a cross-site security issue and fails hard with the message "PDF.js v1.0.907 (build: e9072ac) Message: Unexpected server response (0) while retrieving PDF "http://mapress.com/j/zt/article/viewFile/zootaxa.4150.4.1/7282"."
The short-term fix for EZproxy admins is for the EZproxy stanza for this resource to include the two lines to rewrite the javascript:
The DJ (Domain Javascript) and HJ (Host Javascript) implement the basic logic of “rewrite the host names in bitstreams identified as javascript as well as bitstreams identified as HTML.” They are very useful when navigation / redirects are done using javascript. Neither rewrites things that look like host names in the path or query parts of the URL.
So they cannot be used to fix this issue, alas.
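To illustrate why the host name inside the query string can be missed, here is a minimal sketch. It assumes the rewriter looks for the literal scheme-plus-host string, which is a guess about EZproxy's behaviour rather than something confirmed in this thread, and the viewer path `/path/to/plugin` is a made-up placeholder.

```javascript
// The PDF URL from the error message quoted above.
const pdfUrl = "http://mapress.com/j/zt/article/viewFile/zootaxa.4150.4.1/7282";

// Hypothetical viewer location; the real plugin path will differ.
const viewerBase = "http://mapress.com/path/to/plugin/pdf.js/web/viewer.html?file=";

// The PDF URL is percent-encoded before being appended as the file parameter.
const viewerUrl = viewerBase + encodeURIComponent(pdfUrl);

// Count how often a literal string occurs in the URL.
function countLiteral(url, needle) {
  return url.split(needle).length - 1;
}

// Only the leading host matches literally; the second copy is now
// "http%3A%2F%2Fmapress.com", so a literal scheme+host search misses it.
const literalHits = countLiteral(viewerUrl, "http://mapress.com");
const encodedHits = countLiteral(viewerUrl, "http%3A%2F%2Fmapress.com");

console.log(viewerUrl);
console.log(literalHits, encodedHits);
```

Under that assumption, the first host name would be rewritten and the encoded one left alone, which matches the symptom described above.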
I agree that this is not an OJS issue, but it is an issue for a number of OJS users and/or their clients.
We are also seeing EZProxy set the wrong URL for the iframe. Instead of fixing the absolute URL, we decided to use a relative URI: $relativeUrl = '/index.php/' . $journal->getPath() . '/article/download/' . $article->getBestArticleId() . '/' . $galley->getBestGalleyId()
This variable is calculated within the plugin class, and replaces the $pdfUrl in the iframe src attribute.
Would this cause any issues? @ctgraham or someone else who knows.
You may be able to do this for your specific install, but it can’t be generalized to other cases where restful URLs remove the index.php or journal path, etc.
In a general case, you would probably want to remove the base url from the fully formed URL. There is a core function to assist with this:
That said, if your install will always use the /index.php/ prefix to a journal path, your code should be fine.
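As a rough illustration of the general idea of dropping the host from a fully formed URL, here is a minimal JavaScript sketch. The helper `toHostRelative` is hypothetical and is not an OJS function; it strips only the scheme and host, so it does not reproduce what Core::removeBaseUrl does on the PHP side.

```javascript
// Hypothetical helper, not part of OJS: reduce an absolute URL to a
// host-relative one by keeping only the path, query string and fragment.
function toHostRelative(absoluteUrl) {
  const parsed = new URL(absoluteUrl);
  return parsed.pathname + parsed.search + parsed.hash;
}

// Example using the PDF URL quoted earlier in the thread.
const relativePdfUrl = toHostRelative(
  "http://mapress.com/j/zt/article/viewFile/zootaxa.4150.4.1/7282"
);
console.log(relativePdfUrl);
```

Because the result contains no host name at all, there is nothing left in it for a proxy to rewrite or miss.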
@ctgraham I just tried using Core::removeBaseUrl on the URL generated by the PKPPageRouter within the PDFJsViewer plugin.
I’ve noticed that the core function did not generate the “/index.php/” part of the URL, even though our installation uses it.
I’ll investigate further.
Hmmm… it looks like this is an intentional feature of this function:
This might be more complex than I would have hoped. If not Core::removeBaseUrl, I’m not familiar with a function in the code which will construct a relative URL.
I’m thinking there should be a way for PHP to detect whether “/index.php/” has been configured (or simply whether it is part of the URL), and maybe the core function can be modified to remove that part of the URI conditionally instead of always removing it.
Whether or not OJS’s “index.php” is part of the URL is configured with the restful_urls option in config.inc.php.
You can use Config::getVar('general', 'restful_urls') to check this setting. Because various other functions depend on Core::removeBaseUrl() as-is, we won’t be able to modify this function without either adding a new parameter or auditing all current usages.
We were able to fix our issue by carefully inspecting our EZProxy configuration and fixing bad configs.
There was a typo in a stanza that affected OJS proxying. After fixing that typo, the base URL was correctly generated by this plugin. No modification for the plugin was necessary.
Hi @radjr, replying on Sean’s behalf. Our fix involves instructing the subscriber institution to add stanzas to their EZproxy config that look something like this (based on the journal’s domain):
A temporary fix could include a find-and-replace of the encoded part in your stanza. If you do, you can provide the functioning stanza to OCLC for inclusion in their database. Here is an example of our makeshift fix - Iter: Gateway to the Middle Ages and Renaissance - OCLC Support
I know this thread is older, but I’d like to add to this conversation in 2021 because a journal subscriber opened a ticket last week about URLs breaking again due to EZProxy. We found that the merged solution in this thread was not enough to solve it. Our additional solution included modifications to the JavaScript in the pdfJsViewer/templates/display.tpl template file.
In order for this change to be enabled across the board, the display.tpl file had to be replaced in all our themes that used pdfJsViewer/templates/display.tpl. I’ve also commented on the closed issue, hoping to see if it can be reopened and this solution can be contributed. Thank you, take care and stay safe.
Best Regards,
Rachel
<script type="text/javascript">
	// Creating iframe's src in JS instead of Smarty so that EZProxy-using sites can find our domain in $pdfUrl and do their rewrites on it.
	$(document).ready(function() {ldelim}
		var urlBase = "{$pluginUrl}/pdf.js/web/viewer.html?file=";
		var pdfUrl = "{$pdfUrl}";
		var encodedPdfUrl = encodeURIComponent(pdfUrl);
		encodedPdfUrl = encodedPdfUrl.replace("https%3A%2F%2F", "https://").replace("http%3A%2F%2F", "http://");
		$("#pdfCanvasContainer > iframe").attr("src", urlBase + encodedPdfUrl);
	{rdelim});
</script>
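For readers who want to see what the template snippet produces, here is a standalone sketch of the same encode-then-restore step, outside Smarty and jQuery. The function name `buildViewerSrc` and the `/path/to/plugin` base are hypothetical and only serve the illustration.

```javascript
// Hypothetical standalone version of the template logic above.
// Encodes the PDF URL, then restores the literal scheme so the host name
// appears in plain "http://host" form again.
function buildViewerSrc(urlBase, pdfUrl) {
  const encoded = encodeURIComponent(pdfUrl)
    .replace("https%3A%2F%2F", "https://") // string patterns replace the first match only
    .replace("http%3A%2F%2F", "http://");
  return urlBase + encoded;
}

// Placeholder viewer base; in the template this comes from {$pluginUrl}.
const viewerSrc = buildViewerSrc(
  "/path/to/plugin/pdf.js/web/viewer.html?file=",
  "http://mapress.com/j/zt/article/viewFile/zootaxa.4150.4.1/7282"
);
console.log(viewerSrc);
```

Only the scheme separator is restored; the rest of the path stays percent-encoded, so the value is still passed to the viewer as a single file parameter.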
@wangra , in cases where a prior issue has already been closed/resolved, but where additional edge-case or use-cases exist which need to be addressed, please feel free to open a new GitHub issue and submit a PR against that issue.