OJS 3 behind reverse proxy - how to achieve?

Hi everyone,

I am trying to run OJS 3 behind a reverse proxy. I terminate SSL at the reverse proxy and talk plain HTTP to the backend (quite a standard setup, I think).
Since config.inc.php offers a base_url setting, I thought this would be easy.

But it’s not. Unfortunately, the value of base_url is almost never used to “calculate” URLs. When I look at the source of the start page, none of the URLs for links, CSS or scripts use https. That renders the page useless, because browsers do not load script and CSS files over http into a page that was loaded over https. Even setting base_url[index] does not help here. Links get https as their scheme this way, but the scripts and stylesheets do not (these are all the URLs in the start page):

<link rel="stylesheet" href="https://MYSITE/ojs3/$$$call$$$/page/page/css?name=stylesheet" type="text/css" />
<a href="https://MYSITE/ojs3/index" class="is_img">
<img src="http://MYSITE/ojs3/templates/images/structure/logo.png" alt="Open Journal Systems" title="Open Journal Systems" width="180" height="90" />
<a href="https://MYSITE/ojs3/user/register">
<a href="https://MYSITE/ojs3/login">
<a href="https://MYSITE/ojs3/user/setLocale/de_DE?source=%2Fojs3%2F">
<a href="https://MYSITE/ojs3/user/setLocale/en_US?source=%2Fojs3%2F">
<a href="https://MYSITE/ojs3/about/aboutThisPublishingSystem">
<img alt="Open Journal Systems" src="http://MYSITE/ojs3/templates/images/ojs_brand.png">
<img alt="Public Knowledge Project" src="http://MYSITE/ojs3/lib/pkp/templates/images/pkp_brand.png">
<script src="http://MYSITE/ojs3/lib/pkp/lib/components/jquery/jquery.min.js" type="text/javascript">
<script src="http://MYSITE/ojs3/lib/pkp/lib/components/jquery-ui/jquery-ui.min.js" type="text/javascript">
<script src="http://MYSITE/ojs3/lib/pkp/js/lib/jquery/plugins/jquery.tag-it.js" type="text/javascript">
<script src="http://MYSITE/ojs3/plugins/themes/default/js/main.js" type="text/javascript">

Am I missing something?
What can I do?

Greetings
Hermann

Hi @hermann,

There are different kinds of base_url settings in config.inc.php – the main base_url setting, and then an optional set of base_url[...] settings specifying a “context” (the ... part). The context will be index for the site-wide areas of the site (outside any particular journal), and you’ll also need to specify an additional entry for each journal by path (see the administrator’s “Edit” form for editing a journal for where the path is specified).
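
As a rough sketch of what this could look like in the [general] section of config.inc.php: "myjournal" is a hypothetical journal path, and the index.php segment is an assumption that depends on whether your URLs are rewritten.

```ini
; config.inc.php, [general] section (sketch only)

; Main base URL of the installation
base_url = "https://MYSITE/ojs3"

; Site-wide context (pages outside any particular journal)
base_url[index] = "https://MYSITE/ojs3/index.php/index"

; One entry per journal, keyed by the journal's path.
; "myjournal" is a hypothetical path used for illustration.
base_url[myjournal] = "https://MYSITE/ojs3/index.php/myjournal"
```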

Regards,
Alec Smecher
Public Knowledge Project Team

Hi @asmecher

I did specify
base_url[index] = "https://MYSITE/ojs3/index.php/index"
This gives me correct links to “Login”, “Register” and so forth. It also makes the browser load the images (although with a warning about loading insecure content into a secure page).
But it does not get the CSS and JavaScript files loaded, because these would have to be fetched over insecure http, which browsers only allow for images and not for “active” content.

It looks like for “calculating” the URLs of those css- and javascript resources base_url and base_url[index] are not used!

Greetings
Hermann

Hi @hermann,

Did you specify an additional base_url[...] statement for your journal path?

Regards,
Alec Smecher
Public Knowledge Project Team

Hi @asmecher,

no I have not specified a base_url[...] for a journal as I have not yet configured any journal. And I can’t add one as the whole Site-Admin interface is not working (due to not loaded javascript-files).

I think the problem is this: PKPRequest::getBaseUrl() tries first to auto-detect the base URL.
That means in all cases where this function is used (e.g. in the TemplateManager to add javascript- and css-resources) the setting in config.inc.php gets ignored!

Why is it done this way? Wouldn’t it be easier/better to use the config setting?

Greetings
Hermann

Hi @hermann,

OJS’s primary means of operation is to try auto-detecting its environment and responding accordingly. That works best for most users, most of the time. The various base_url configuration directives were added over time to better support URL rewriting, command-line tools that need to know how to generate URLs, etc. As a result it might not be what we’d design from the ground up, but has worked well overall.

We host in a reverse-proxy environment too, and our proxy (which as far as I’m aware hasn’t required any nonstandard configuration) passes along certain headers (e.g. SCRIPT_NAME) according to the proxy environment, not the proxied environment. As a result, the script auto-detection should work OK even behind a reverse proxy.

If the hostname is the problem, it might be that your proxy isn’t providing the HTTP_X_FORWARDED_HOST header to OJS.

Regards,
Alec Smecher
Public Knowledge Project Team

Hi @asmecher,

thanks for the information. It’s good to hear others are using reverse-proxies with OJS as well. :)

In my case the problem is not the servername (HTTP_X_FORWARDED_HOST is provided correctly).
The problem is the protocol.
I am enforcing HTTPS in my proxy web-server.
But the proxy talks HTTP to the backend (OJS) server. So OJS sees HTTP as protocol in the requests and uses it to construct the value of baseUrl.
Are you using the same protocol on both ends?
If you use different protocols like I do, how are you solving this problem?

I see two possibilities:

  • change PKPRequest::getBaseUrl() and make $allowProtocolRelative = true the default.
  • register a callback function with the hook Request::getProtocol in order to force the protocol to https (but how and where?).
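
To illustrate the protocol problem described above, here is a minimal sketch of the common PHP idiom for detecting the request scheme. It is not the actual PKPRequest code: the function name detectScheme, the $trustForwardedProto flag and the backend.internal hostname are all made up for illustration.

```php
<?php
// Illustrative only: NOT the actual OJS/PKP implementation.

/**
 * Return the scheme a typical PHP application would detect.
 *
 * @param array $server              A $_SERVER-like array.
 * @param bool  $trustForwardedProto Honour X-Forwarded-Proto. Only sensible
 *                                   when nothing but a trusted proxy can
 *                                   reach the backend.
 */
function detectScheme(array $server, bool $trustForwardedProto = false): string
{
    // The usual check: the web server sets HTTPS=on for TLS connections.
    if (isset($server['HTTPS']) && strtolower($server['HTTPS']) === 'on') {
        return 'https';
    }
    // Optional: believe what the proxy reports about the original request.
    if ($trustForwardedProto
        && isset($server['HTTP_X_FORWARDED_PROTO'])
        && strtolower($server['HTTP_X_FORWARDED_PROTO']) === 'https') {
        return 'https';
    }
    return 'http';
}

// What the backend sees when the proxy terminates SSL and forwards over HTTP.
$behindProxy = [
    'HTTP_HOST'              => 'backend.internal', // hypothetical hostname
    'HTTP_X_FORWARDED_HOST'  => 'MYSITE',
    'HTTP_X_FORWARDED_PROTO' => 'https',
];

echo detectScheme($behindProxy), "\n";       // http
echo detectScheme($behindProxy, true), "\n"; // https
```

Without the forwarded header being honoured, the backend has no way to know that the browser originally used https, so every URL built from the detected scheme comes out as http.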

Greetings
Hermann

Hi @hermann,

I’m not sure I’ve hit that permutation before – but off the top of my head, I’d suggest trying to turn on the force_ssl setting in config.inc.php.

Regards,
Alec Smecher
Public Knowledge Project Team

Hi @asmecher,

turning on force_ssl does not do any good because then you get a redirect loop:
OJS sees an HTTP request (not knowing it’s coming from the reverse-proxy) and redirects the browser to the HTTPS location. So the browser makes a new request (using HTTPS as before) to the reverse-proxy resulting in a new HTTP request to the OJS backend which redirects to HTTPS again. And so on and so forth.

Greetings
Hermann

PS: For the time being and as a workaround I changed PKPRequest.inc.php:

@@ -154,7 +154,7 @@
         * @param $allowProtocolRelative boolean True iff protocol-relative URLs are allowed
         * @return string
         */
-       function getBaseUrl($allowProtocolRelative = false) {
+       function getBaseUrl($allowProtocolRelative = true) {
                $_this =& PKPRequest::_checkThis();
 
                $serverHost = $_this->getServerHost(false);

Have you considered adding self-signed SSL certificates to your OJS servers? Without SSL for the last mile, your network traffic is vulnerable between the proxy and your OJS server. Self-signed certificates would resolve that vulnerability, and resolve this issue as well.

Alternately, I’m curious of any debugging you discover on this, as I have very mixed feelings about the base_urls and auto detection of such.

Hi @ctgraham,

no, I haven’t considered using SSL from the reverse-proxy to the backend. It may solve this problem, but it adds more complexity, and unnecessarily so: the “last mile” is on a private network using addresses that are not routable on the Internet, and it goes through network switches that make heavy use of VLANs to keep things separate. So only servers run by admins I trust are in the “vicinity” of my servers. Apart from that, those switches will raise a great hue and cry if someone tries to do something nasty (e.g. ARP spoofing) inside that private network.

I am sorry but I do not understand what you mean with

What can I do to help?

Greetings
Hermann

I would encourage you to think not just in terms of trusting your admins, but also in terms of ensuring that even if an admin or the last mile network were compromised, it would be technologically impractical to sniff out a password on the open wire. Trust in your people and your facility is only half of the equation.

In terms of debugging the issue, it would be interesting to know the full breadth of information your server has about the request when it comes in. Which HTTP_* environment variables are present in the request? Is there anything that might be used to key in on this situation and handle it better?

Thanks @ctgraham,

you are right. I will think about it and will discuss it with my fellow admins.

The backend is seeing these “HTTP_” environment variables:

HTTP_HOST
HTTP_USER_AGENT
HTTP_ACCEPT
HTTP_ACCEPT_LANGUAGE
HTTP_ACCEPT_ENCODING
HTTP_REFERER
HTTP_COOKIE
HTTP_DNT
HTTP_UPGRADE_INSECURE_REQUESTS
HTTP_CACHE_CONTROL
HTTP_X_FORWARDED_FOR
HTTP_X_FORWARDED_HOST
HTTP_X_FORWARDED_SERVER
HTTP_CONNECTION

And of course I can “send” (almost) any variable to the backend by setting a request header on the reverse-proxy using
RequestHeader set ... .
E.g.
RequestHeader set HTTPS %{HTTPS}e
or
RequestHeader set X-Forwarded-Protocol "HTTPS"
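
For reference, a rough sketch of what the front-end (SSL-terminating) virtual host could look like with such a header. This is an untested outline, not a complete configuration: backend.internal is a hypothetical hostname, the certificate directives are omitted, and mod_ssl, mod_proxy, mod_proxy_http and mod_headers are assumed to be loaded. The header name X-Forwarded-Proto is used here because it is the one the later replies in this thread check for.

```apache
<VirtualHost *:443>
    ServerName MYSITE
    SSLEngine on
    # SSLCertificateFile / SSLCertificateKeyFile omitted

    # Tell the backend which scheme the browser used.
    # PHP sees this as $_SERVER['HTTP_X_FORWARDED_PROTO'].
    RequestHeader set X-Forwarded-Proto "https"

    # backend.internal is a hypothetical hostname
    ProxyPass        /ojs3/ http://backend.internal/ojs3/
    ProxyPassReverse /ojs3/ http://backend.internal/ojs3/
</VirtualHost>
```

One thing to keep in mind: request headers reach PHP with an HTTP_ prefix, so a header named HTTPS would show up as $_SERVER['HTTP_HTTPS'] rather than $_SERVER['HTTPS']. Setting the header alone therefore does not change what the application detects; something on the backend still has to translate it.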

Greetings
Hermann

First, try to set up HTTPS on your server (SSL end to end).
If this is not an option, you can use the following workaround, which works across multiple PHP apps.

  1. Configure the SSL terminator to insert additional headers when terminating SSL.
    We have an F5 and use the following iRule to inject the additional headers:

    when HTTP_REQUEST priority 10 {
        HTTP::header insert FRONT-END-HTTPS "on"
        HTTP::header insert X-Forwarded-Proto "https"
    }

  2. Configure the web server / php-fpm to prepend a PHP file (i.e. within a VirtualHost directive or php-fpm pool):
    php_admin_value auto_prepend_file /data/web/ssl-workaround/ssl.php

  3. ssl.php - adjust the php_sapi_name() check to match your setup

      <?php
    
         # (rwahyudi):
         # Workaround for applications that don't account for SSL being terminated somewhere else. This code is prepended via the PHP config.
    
         # error_log ( php_sapi_name() );
         if ( php_sapi_name() === 'apache2handler' )
         {
                 if (isset($_SERVER['HTTP_X_FORWARDED_PROTO']) && $_SERVER['HTTP_X_FORWARDED_PROTO'] == 'https')
                 {
                         $_SERVER['HTTPS'] = 'on';
                 }
         }
    

    ?>
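
If PHP runs under php-fpm rather than mod_php, the prepend setting from step 2 goes into the pool file instead. A minimal sketch, assuming the default pool name www (the location of the pool file varies by distribution):

```ini
; php-fpm pool file (sketch only)
[www]
; Same file as in step 2, using the pool syntax for PHP settings
php_admin_value[auto_prepend_file] = /data/web/ssl-workaround/ssl.php
```

Under php-fpm, php_sapi_name() returns 'fpm-fcgi' rather than 'apache2handler', so the check in ssl.php needs to be changed accordingly. The commented-out error_log line in ssl.php can be used to see which value your setup reports.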

This helped me out a lot. Have you ever found a solution besides this workaround?

This workaround seems to lead to relative links for OAI as well.

https://www.zchinr.org/index.php/zchinr/oai?verb=ListRecords&metadataPrefix=oai_dc

I’m not too sure whether this works well with indexes like the Bielefeld Academic Search Engine.

Hi @Rianto_Wahyud.

I’m working on setting up a reverse proxy (Traefik) and I ran into the same issue.

Although some workarounds work, I’d like to understand why it happens, and your comment here suggests you think it is an issue in the code:

I know it’s been a long time since you wrote it, but if you have time, I’d love to hear why it fails and (if it is an OJS issue) fix the code.

Potential solution is evaluated here:

Thanks, @marc!
This is a much better solution as it doesn’t change the code but only the web-server configuration.

We just tested this and I can confirm that it really works.

Just to make it clear (and easy to find): what we did was add this line to our Apache configuration:

SetEnvIf X-Forwarded-Proto "https" HTTPS=on

Which essentially means that Apache “lies” to OJS about the protocol if the proxy sets the X-Forwarded-Proto header to https for requests to the backend.
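
In context, the backend side could look roughly like the sketch below. It assumes mod_setenvif is loaded and that the proxy sends X-Forwarded-Proto; backend.internal is a hypothetical hostname. The second argument of SetEnvIf is a regular expression, so the anchored form here matches exactly "https".

```apache
<VirtualHost *:80>
    # hypothetical internal hostname
    ServerName backend.internal

    # Mark the request as HTTPS when the proxy reports that the
    # browser used https. PHP then sees $_SERVER['HTTPS'] = 'on'.
    SetEnvIf X-Forwarded-Proto "^https$" HTTPS=on
</VirtualHost>
```

This only makes sense when the backend is reachable exclusively through the proxy, since any client that can talk to the backend directly could send the same header.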

Thanks @hermann for your feedback.

I have been digging into this recently, trying to document how to set this up depending on your URL (domain, subdomain or folder), your protocol (https only, mixed protocols) or your infrastructure (direct, reverse-proxy and/or containers), and after one week I still don’t have rock-solid examples for all the cases.

Issue documented here:

I will keep digging…