There have been several instances on my server when the load goes over 100. After researching this and carefully observing the MySQL process list while it happens, here is what I've discovered:
Whenever the Data and Template Caches are cleared, the first visit to the journal page triggers a batch of MySQL queries that usually takes a decent amount of time to complete (my OJS instance has over 6 million rows).
The problem isn't that these caching queries run at all, but that when multiple users visit the website while this process is underway, each visit spawns another instance of the cache-refreshing queries, flooding the database. In my experience this turns into a loop that keeps going for hours and ultimately takes down all my websites.
Can someone please share a permanent fix or patch for this? How do I disable this caching loop, or is there another solution?
In any system using caching, when you clear the caches, it’ll need to re-generate them. During that time period, the system will necessarily perform worse than if it had the caches already present.
A couple of thoughts…
Are you regularly clearing the data caches? This shouldn’t be necessary.
Have you investigated the database queries that are causing the system to experience performance problems? It’s possible that you have an underlying database performance issue, such as tables using mixed storage engines (InnoDB vs. MyISAM).
If performance problems persist, and you do need to clear the data cache for some reason, you might want to re-generate the caches while the system is temporarily offline to outside users. To do this, for example, you could temporarily introduce a .htaccess rule that blocks access to anybody but your IP address; then click around the site (or use a series of wget commands) to cause the most important cache files to be re-generated. Then open up access to all other users. This will give OJS a chance to generate the cache files before a lot of users rush in simultaneously.
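A minimal sketch of such a rule, for Apache 2.4 (the IP address is a placeholder; drop the snippet at the top of the site's .htaccess and remove it once the warm-up is done):

```apache
# Temporarily restrict the site to your own IP while caches rebuild.
# Replace 203.0.113.10 with your actual address.
<RequireAll>
    Require ip 203.0.113.10
</RequireAll>
```

With access restricted, a handful of `wget` requests against the journal's main pages (index, current issue, archives) will regenerate the most important cache files before the block is removed.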
Regards,
Alec Smecher
Public Knowledge Project Team
I have no idea why this is happening; we do not clear the data caches at all from our end (only I have admin access).
Never
All 12 journals that I manage use only the InnoDB engine (all tables); I manually ran a command to convert them before upgrading to 3.3.0-13.
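Something along these lines can confirm the conversion (a sketch; `ojs` stands in for the actual database name, and `metrics` is just an example table):

```sql
-- List any tables still on MyISAM (returns no rows if the conversion is complete)
SELECT table_name, engine
FROM information_schema.tables
WHERE table_schema = 'ojs'
  AND engine <> 'InnoDB';

-- Convert a straggler manually
ALTER TABLE metrics ENGINE = InnoDB;
```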
I have been dealing with this for a long time now, and I've had sleepless nights because of the downtime that started after the upgrade.
Any idea why this did not happen to us on OJS 2.4 but does on 3.3? Are there any particular plugins that force cache clearing?
This cache clearing takes place in every journal at least once every 24 hours or so; is there some plugin or page in the installation that triggers it?
I'm sharing some of the process list entries (Id, Command, Time, State, Info) captured while this happens:
```
135510  Prepare      0  starting             SELECT cve.* FROM controlled_vocab_entries cve WHERE cve.controlled_vocab_id = ? ORDER BY seq
135632  Execute      0  Sending data         SELECT * FROM author_settings WHERE author_id = ?
135923  Execute      1  closing tables       select locale from submissions as s where s.submission_id = ? limit 1
136844  Execute      0  Sending to client    SELECT * FROM author_settings WHERE author_id = ?
137087  Execute      0  Sending to client    SELECT * FROM author_settings WHERE author_id = ?
137565  Execute     12  Sending data         SELECT COUNT(*) AS row_count FROM (SELECT a.*, ug.show_title, s.locale, COALESCE(agl.setting_val
137889  Execute      0  closing tables       SELECT cve.* FROM controlled_vocab_entries cve WHERE cve.controlled_vocab_id = ? ORDER BY seq
138125  Execute      0  statistics           SELECT * FROM author_settings WHERE author_id = ?
138129  Execute      0  statistics           SELECT * FROM controlled_vocabs WHERE symbolic = ? AND assoc_type = ? AND assoc_id = ?
138139  Execute    147  Creating sort index  SELECT a.*, ug.show_title, s.locale, COALESCE(agl.setting_value, agpl.setting_value) AS author_g
138312  Execute      9  Sending data         select s.* from submissions as s left join publications as issue_p on issue_p.`submissio
138418  Execute      9  Sending data         select *, s.locale as submission_locale from authors as a left join publications as p
```
Can you share a fix that I can implement to resolve this issue? It would be appreciated.
Please let me know if you need any more info from my end.
Could you share the portion of the code/file that I can comment out to disable this caching process temporarily (perhaps to test whether the problem really lies in this process)?
```
138139  Execute  147  Creating sort index  SELECT a.*, ug.show_title, s.locale, COALESCE(agl.setting_value, agpl.setting_value) AS author_g
```
This is related to the author index, which is legacy code and is known to perform poorly. (It will be removed entirely starting with 3.4.0.) See this related issue:
If you see a lot of requests for /search/authors (e.g. from search engines) in your access log around the time that performance degrades, it’s likely related to this. There is a suggestion here on how to quickly disable it.
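As an illustration only (not the linked suggestion itself), a rewrite rule along these lines can turn those requests away at the web server level; it assumes Apache with mod_rewrite enabled and OJS served via index.php path info:

```apache
# Illustrative only: answer author-index requests with 410 Gone so
# crawlers stop hammering /search/authors. Adjust the pattern if your
# URL layout differs (e.g. RESTful URLs without index.php).
RewriteEngine On
RewriteRule search/authors - [G,L]
```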
Regards,
Alec Smecher
Public Knowledge Project Team
Appreciate your prompt replies. I tried those direct author search links, and they do eat up a lot of memory and produce the queries shown above. I've implemented the fix you shared and will observe whether it resolves the issue completely.
Additionally, I'd like to bring to your notice that the sitemap consumes just as much memory; I had disabled it on my instance (and wrote my own code to produce a sitemap). Something that could be worked on for OJS 3.4, if not already.
I will keep you updated on whether this was a permanent solution for my instances.
The sitemap is currently coded to generate a complete list of all submissions in a journal, and involves fetching a complete representation of each of the journal’s issues and submissions from the database. This will result in a lot of database queries, but each should be well indexed and thus quick, and as the results are iterated one at a time, it shouldn’t consume a lot of memory (beyond the DOM objects that represent the sitemap as it’s built).
The sitemap is intended for occasional fetching by legitimate search engines. While we could optimize it to reduce the number of queries involved (by coding a workaround to the object model), or introduce caching, both approaches have downsides that are keeping me from being too enthusiastic about them.
I’d suggest checking your access log to see whether it is being fetched enough to warrant changes, and if so, by whom. I suspect the author index is a much larger factor in the issues you’re encountering than the sitemap.
Regards,
Alec Smecher
Public Knowledge Project Team
I want to add that your previous solution worked and our journals are working amazing now. The load is back to how it used to be with 2.4.8. Thanks a lot for your help!
I understand; it was just a suggestion, though.
I'm no professional programmer, but this is the sitemap workaround I made to build a manual sitemap (it only takes a second to load), specific to my journals.
```php
// All published articles (status 3 = published)
$article_query = mysqli_query($sitemapdb, "SELECT * FROM publications WHERE status = '3'");
if (mysqli_num_rows($article_query) > 0) {
    while ($row = mysqli_fetch_assoc($article_query)) {
        // Cast to int: the ID is interpolated into SQL strings below
        $articleid = (int) $row['submission_id'];
        ?>
<url>
    <loc><?php echo $baseurl; ?>/index.php/<?php echo $tag; ?>/article/view/<?php echo $articleid; ?></loc>
</url>
<?php
        // Prefer a galley file uploaded by someone other than the admin (user ID 1)
        $double_check_query = mysqli_query($sitemapdb, "SELECT assoc_id AS checklink FROM submission_files WHERE submission_id = '" . $articleid . "' AND uploader_user_id != 1 LIMIT 1");
        if (mysqli_num_rows($double_check_query) > 0) {
            $get_dcheck = mysqli_fetch_assoc($double_check_query);
            ?>
<url>
    <loc><?php echo $baseurl; ?>/index.php/<?php echo $tag; ?>/article/view/<?php echo $articleid; ?>/<?php echo $get_dcheck['checklink']; ?></loc>
</url>
<?php
        } else {
            // Fall back to the lowest-numbered PDF galley (genre_id 1)
            $pdf_query = mysqli_query($sitemapdb, "SELECT MIN(assoc_id) AS thepdflink FROM submission_files WHERE submission_id = '" . $articleid . "' AND assoc_id != '' AND genre_id = 1");
            if (mysqli_num_rows($pdf_query) > 0) {
                $get_article = mysqli_fetch_assoc($pdf_query);
                ?>
<url>
    <loc><?php echo $baseurl; ?>/index.php/<?php echo $tag; ?>/article/view/<?php echo $articleid; ?>/<?php echo $get_article['thepdflink']; ?></loc>
</url>
<?php
            }
        }
    }
}
```
I was trying to build this section to get the latest PDF ID, but it does not seem to work with full accuracy. Any suggestion on what I could change here?
This is the downside of moving away from the standard object-fetching tools that you're finding slow: you'll need to replicate in pure SQL some of the features and functions that are currently written in PHP. In the process you'll lose important things like custom PDF and article IDs. I'd recommend against this approach.
Regards,
Alec Smecher
Public Knowledge Project Team
Just an update: so far we've been able to run for days without any issue at all.
However, there are times when the same queries start happening out of nowhere. This time they do not lock up the websites (like what happens when the data caches are cleared), but they create the same flood of queries seen before in the /search/authors case.
Also, I checked the Google Search Console Server Error (5xx) logs and found most of the errors to come from /search/authors/ URLs, but could not find anything else that was causing a slowdown.
Any idea what could be causing this? I'm not sure what additional info I could gather to share with you on what it might be linked to.
Sorry for the multiple replies; one more thing to note is that this does not happen on all journals on my server, just 2 or 3 out of 12.