Plugin for reporting pdf downloads

Somewhat similar to my previous question, I’m making a plugin that will report views of published articles to an external API.

The plugin needs to react the same way in the following situations:

  • The user downloads the pdf to their machine.
  • The user views the pdf in their browser, either via the pdf.js plugin or the browser’s built-in pdf viewer.
  • In the future, when we have HTML versions of our articles, we would like the plugin to also report to the API when a visitor views an HTML version.

What are the appropriate hooks to use here? I see the following hooks in pages/article/ArticleHandler.php but I’m not sure which are the correct ones to respond to:

  • ArticleHandler::download
  • ArticleHandler::view
  • ArticleHandler::view::galley
  • FileManager::downloadFinished

Any guidance would be much appreciated, thank you!

[Edit] Separately from reporting pdf downloads, we also need to report when visitors view the article abstract/landing page. I guess that is the ArticleHandler::view event?

After looking at the code a little bit more, my guess is that I need to respond to the following hooks:

  • ArticleHandler::view for reporting landing page views.
  • ArticleHandler::download for any kind of full-article access, whether that’s a pdf download or viewing an HTML galley.

But I’d appreciate confirmation!

Hi @mbtuckersimm,

At a glance, that looks correct to me!

Thanks,
Alec Smecher
Public Knowledge Project Team

Thanks @asmecher! I have a related question as well. In addition to recording actual downloads, we also need to record attempted downloads where the user clicks on the pdf link but is not authorized. It seems that (understandably) the ArticleHandler::download hook doesn’t fire in that situation. Do you have any suggestion for where to intervene in that case?

Hi @mbtuckersimm,

If you’re using the subscription toolset, have a look at the hooks in classes/issue/IssueAction.php – particularly IssueAction::subscribedUser. This is called after OJS’s internal subscription checks have been run, in order to allow a plugin to either inspect the result (your situation), or override it (e.g. in case it wants to check an external subscriber database).

Regards,
Alec Smecher
Public Knowledge Project Team

@asmecher Thank you.

The problem with the approach you recommend, as I understand it, is that the IssueAction::subscribedUser hook (and similarly the IssueAction::subscribedDomain hook) isn’t only run when the user clicks the download link. These hooks fire when somebody views an issue table of contents page or an article landing page in order to decide how to style the links to the pdf.

I do already have a plugin that attaches a callback to IssueAction::subscribedDomain to check an external API to see whether the IP address corresponds to a subscriber. But this is about recording actual pdf downloads and download attempts, which I only want to happen when the user has actually clicked the pdf link. I can’t record a download attempt when the user just visits the article landing page.

I guess the question is, if I want to intervene after the access decision is made, I need a way to check whether the user was actually attempting a download, or whether they were just viewing an article landing page or issue TOC page.

Does this make sense, or am I misunderstanding something?

Hi @mbtuckersimm,

At a glance (and my advice on this is not as well informed as if I were trawling through the code), you should be able to use those hooks, but also check within them to see which page and operation are being requested. You can do that from within the hook callback by calling something like (untested):

$request = Application::get()->getRequest();
$page = $request->getRequestedPage();
$op = $request->getRequestedOp();
if ("$page/$op" == 'article/view') {
    // This is an article view request
}

Regards,
Alec Smecher
Public Knowledge Project Team

@asmecher I’ve looked into this, and it seems that $page === 'article' and $op === 'view' for download requests as well as page views. That is, it seems that these values are not enough to distinguish page views from download attempts.

I’ve spent a while with xdebug stepping through ArticleHandler->view() and it seems that the main difference between the two cases is whether $this->galley is null (landing page view) or not (download request). But this value doesn’t get passed through to IssueAction->subscribedDomain(), and hence it’s not available in the callback I attached to the corresponding hook.

The only difference I can see between the two cases (in terms of information that’s available in the callback) is what’s returned by $request->getRequestedArgs(). For the page view case, this is an array containing just the submission id. For the download case, the array contains a submission id and a galley id, and ArticleHandler::initialize() populates $this->galley based on the presence or absence of the galley id.

So it seems possible to use this to distinguish page views from download requests. However, it feels pretty brittle: the arguments aren’t keyed or documented, so they can only be identified by position. It’s better than nothing, though, so I’ll probably go with this, because I don’t see any other hooks I can use to distinguish between refused and allowed downloads.
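
A minimal sketch of that check, written as a plain PHP function so it can be run outside OJS. The function name and the returned labels are hypothetical. The only assumption about OJS is the one described above: an article/view request carries just a submission id for a landing page, and a submission id followed by a galley id for a galley request. Inside a hook callback, the three inputs would come from $request->getRequestedPage(), $request->getRequestedOp() and $request->getRequestedArgs().

```php
<?php

/**
 * Classify a request using only values available from the request object.
 *
 * Hypothetical helper, not part of OJS. Returns:
 *   'landing' - article/view with a submission id only
 *   'galley'  - article/view with a submission id and a galley id
 *   'other'   - anything else
 */
function classifyArticleRequest(string $page, string $op, array $args): string
{
    // Per the observation above, both landing-page views and clicks on
    // the pdf link arrive as page "article", op "view".
    if ($page !== 'article' || $op !== 'view') {
        return 'other';
    }

    // The arguments are positional, so normalise the indices first.
    $args = array_values($args);
    if (count($args) === 0) {
        return 'other';
    }

    // Assumption: index 0 is the submission id, index 1 (if present and
    // non-empty) is the galley id.
    $hasGalleyId = isset($args[1]) && (string) $args[1] !== '';

    return $hasGalleyId ? 'galley' : 'landing';
}

// Example usage with made-up ids.
echo classifyArticleRequest('article', 'view', ['123']), "\n";
echo classifyArticleRequest('article', 'view', ['123', '456']), "\n";
```

Keeping the positional logic in one small function at least confines the brittleness to a single place if the URL layout changes in a later OJS version.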

Ultimately I think the issue here is that the redirect to the login page is buried inside the call to ArticleHandler->userCanViewGalley(). In an ideal world, that would not do a redirect, it would just return a boolean and the redirect would happen at the top level inside ArticleHandler->view(). Or, even better, download requests would go through a completely different method.

Well, anyway, this is getting into the weeds. I think I have a solution, but I’m just worried because it uses undocumented properties of the request.
