OAI Timeout in OJS

We are running OJS 3.1.2.4. We use OJS’s OAI capabilities to get our content indexed in databases like PlumX, WorldCat, etc.
We have received notifications from them in the past week or so that our OAI fetching is timing out. Is there a server timeout limit that can be increased within OJS or is there any setting which we are not aware of?

https://journals.ansfoundation.org/index.php/jans/oai?verb=ListRecords&metadataPrefix=oai_dc

Hi @pcansf,

I suspect you’re hitting a slow query in your MySQL database (or PostgreSQL, if you’re using that). This could be e.g. due to a missing index. If you’re able to modify your DBMS’s configuration, I’d suggest turning on the MySQL slow query log (or equivalent, if you’re using PostgreSQL). Then check the log to see what query it has recorded; I suspect you’ll see something related to the OAI page.

Regards,
Alec Smecher
Public Knowledge Project Team

Thank you for the input. The following error shows up in the slow query log:
Duplicate entry ‘submissionKeyword-1048585-88’ for key ‘controlled_vocab_symbolic’

Hi @pcansf,

Hmm, I wonder if you’re missing an index on your controlled_vocabs table; that should not be a slow query. What do you get for…

SHOW INDEXES FROM controlled_vocabs;

Regards,
Alec Smecher
Public Knowledge Project Team

Hi @pcansf,

Looking at that error a little closer, it’s fine – the controlled vocabulary code uses index collisions to detect when an entry already exists and can be used. Do you see any slow queries showing up?
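To illustrate what "using index collisions to detect an existing entry" means, here is a minimal sketch in Python with an in-memory SQLite database. It is not OJS's actual PHP code; the column names (`symbolic`, `assoc_type`, `assoc_id`) are assumptions inferred from the shape of the error message, and the helper names are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory table with a unique index (assumed column names)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE controlled_vocabs ("
        "controlled_vocab_id INTEGER PRIMARY KEY, "
        "symbolic TEXT NOT NULL, "
        "assoc_type INTEGER NOT NULL, "
        "assoc_id INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE UNIQUE INDEX controlled_vocab_symbolic "
        "ON controlled_vocabs (symbolic, assoc_type, assoc_id)"
    )
    return conn


def get_or_create_vocab(conn, symbolic, assoc_type, assoc_id):
    """Try the insert first; on a unique-index collision, reuse the existing row."""
    try:
        cur = conn.execute(
            "INSERT INTO controlled_vocabs (symbolic, assoc_type, assoc_id) "
            "VALUES (?, ?, ?)",
            (symbolic, assoc_type, assoc_id),
        )
        conn.commit()
        return cur.lastrowid
    except sqlite3.IntegrityError:
        # The "duplicate entry" error is expected here: it means the row exists.
        row = conn.execute(
            "SELECT controlled_vocab_id FROM controlled_vocabs "
            "WHERE symbolic = ? AND assoc_type = ? AND assoc_id = ?",
            (symbolic, assoc_type, assoc_id),
        ).fetchone()
        return row[0]
```

Calling `get_or_create_vocab(conn, "submissionKeyword", 1048585, 88)` twice returns the same id both times; the second call triggers the duplicate-key error internally, which is why such an error appearing in a log is harmless on its own.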

What have you set your oai_max_records setting to in config.inc.php?

Regards,
Alec Smecher
Public Knowledge Project Team

The slow query log error was the one posted above: OAI Timeout in OJS

oai_max_records = 10000 in config.inc.php

Hi @pcansf,

That’s waaaaaaay too big a value for oai_max_records. I’d recommend keeping it to around 200. Too large a value will have exactly the effect you’re observing.

Regards,
Alec Smecher
Public Knowledge Project Team

We never changed it. I guess it’s the original value in config.inc.php. If we have more than 200 articles, would we have issues?

Hi @pcansf,

The default value is 100, so at some point it must have been changed. It doesn’t matter if you have more than that many articles, it just controls how many are delivered via OAI-PMH in a single batch. OAI-PMH supports delivery of large sets of content in multiple batches.
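As a rough illustration of how a harvester walks through those batches, here is a minimal Python sketch. It assumes standard OAI-PMH 2.0 behaviour: each ListRecords response may carry a `resumptionToken`, and an empty or absent token means the list is complete. The function names and the injectable `fetch` parameter are hypothetical, added only to make the sketch testable without a network.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def parse_batch(xml_text):
    """Return (header identifiers, resumption token or None) for one response."""
    root = ET.fromstring(xml_text)
    # Only header identifiers live in the OAI namespace; dc:identifier does not.
    identifiers = [el.text for el in root.iter(OAI_NS + "identifier")]
    token_el = root.find(".//" + OAI_NS + "resumptionToken")
    token = None
    if token_el is not None and token_el.text and token_el.text.strip():
        token = token_el.text.strip()
    return identifiers, token


def harvest(base_url, metadata_prefix="oai_dc", fetch=None):
    """Yield record identifiers across all batches by following resumption tokens."""
    if fetch is None:
        fetch = lambda url: urllib.request.urlopen(url, timeout=60).read()
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        xml_text = fetch(base_url + "?" + urllib.parse.urlencode(params))
        identifiers, token = parse_batch(xml_text)
        yield from identifiers
        if not token:
            break
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token}
```

With `oai_max_records` at 100, a journal with 450 records would be delivered in five requests of this kind rather than one very large one.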

Regards,
Alec Smecher
Public Knowledge Project Team

We just pushed the change, but it didn’t help.
https://journals.ansfoundation.org/index.php/jans/oai?verb=ListRecords&metadataPrefix=oai_dc&set=jans

Hi @pcansf,

Your server may have bogged down due to existing requests with the high limit, or possibly your database tables may be locked. If you’re able to restart your web server (Apache) and DBMS (MySQL) processes, that may clear up the problem. Otherwise, if you’re able to check your server’s load by process, that will help determine whether the request is database-bound, PHP-bound, or something else.

Regards,
Alec Smecher
Public Knowledge Project Team

Hi @asmecher. I took some help from the OCLC (WorldCat) support and they suggested a handy tool: http://validator.oaipmh.com

If we put in our base URL, hit “check now”, and then select ListRecords OAI_DC from the left side, it comes up with an “Invalid”.

Here is our base URL: https://journals.ansfoundation.org/index.php/jans/oai

Could it be an issue with the Dublin Core format?

Hi @pcansf,

I don’t see anything wrong with your DC ListRecords request, though it is quite slow to respond. It’s possible that they’re encountering a timeout depending on your server’s conditions. If that’s the case, further reducing the oai_max_records may help.
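One way to put a number on "quite slow" is to time a single ListRecords request before and after changing `oai_max_records`. The following is a minimal Python sketch; `time_request` and its `fetch` parameter are hypothetical names, and the 120-second timeout is an arbitrary assumption, not a value any particular harvester is known to use.

```python
import time
import urllib.request


def time_request(url, fetch=None, timeout=120):
    """Return (elapsed seconds, response size in bytes) for one request."""
    if fetch is None:
        fetch = lambda u: urllib.request.urlopen(u, timeout=timeout).read()
    start = time.perf_counter()
    body = fetch(url)
    elapsed = time.perf_counter() - start
    return elapsed, len(body)


if __name__ == "__main__":
    url = ("https://journals.ansfoundation.org/index.php/jans/oai"
           "?verb=ListRecords&metadataPrefix=oai_dc")
    seconds, size = time_request(url)
    print(f"{seconds:.1f} s for {size} bytes")
```

If the elapsed time scales roughly with the batch size, the per-record cost is the thing to investigate; if it stays high even for small batches, something else in the request is slow.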

Regards,
Alec Smecher
Public Knowledge Project Team

Reducing oai_max_records to just 15 worked, but with a lot of delays.


This does not look like a server issue since others have experienced the same:

Our servers are not lacking any resources.

Hi @pcansf,

That seems like unusually low performance to me. I would suggest investigating what in particular is limiting your server’s performance. This is more of a server management question than an OJS question, i.e. establishing whether it’s PHP-limited or database-limited; if it turns out to be database-limited, then determining next what the problem queries are (using e.g. MySQL’s slow query log). It may be that your system is missing e.g. an index or something, but you’ll need to work systematically to determine whether that is the case and what it would be.

Regards,
Alec Smecher
Public Knowledge Project Team

Thank you for your input, but we have explored all of the server management aspects, including slow queries, missing indexes, etc., and found nothing. All the results were posted earlier in the thread.
If there is anything else we need to check, please let me know. I will also post if our engineering team finds something but for now, their evaluation has cleared any server related issues.

Thank you.

Hi @pcansf,

I’m not suggesting that there’s something wrong with the server, just that investigation there needs to determine what is blocking the requests (probably either MySQL or PHP), and so on. That way we can track down the factor that’s causing it to respond slowly.

Regards,
Alec Smecher
Public Knowledge Project Team

Sure. Thank you for your time.