[3.4.0-3]Large SQL queries slowing down site

After Updating to OJS 3.4 I’ve found some queries running when the site is rebuilding it’s cache that puts the DB under heavy load they’re all very similar to one I’ve pasted below:

I found it after our client reported the site timing out, saw the DB at near 100% utilisation with multiple of the pasted queries below. It seems to be partially called by bots scraping the site.

Now that the cache is rebuilt the queries are slowing down and the site is accessible so it’s not ongoing but I would appreciate some advice as to how I can narrow down the cause of this issue.

I’ve enabled all debug options and check the error.log and access.log but can’t see much that indicates what causes it.

For reference It’s a multi-journal site with lots and lots of pages so that may well be a factor in the cause of this issue.

Thanks in advance for any help you can provide!

select s.* from submissions as s left join publications as po on s.current_publication_id = po.publication_id where s.context_id in (47) and s.status in (3) and (exists (select * from submission_search_objects as sso inner join submission_search_object_keywords as ssok on sso.object_id = ssok.object_id inner join submission_search_keyword_list as sskl on sskl.keyword_id = ssok.keyword_id where sskl.keyword_text = LOWER(‘language’) and s.submission_id = sso.submission_id) or s.submission_id in (select p.submission_id from publications as p inner join publication_settings as ps on p.publication_id = ps.publication_id where ps.setting_name = ‘title’ and LOWER(ps.setting_value) LIKE CONCAT(‘%’, LOWER(‘language’), ‘%’)) or s.submission_id in (select p.submission_id from publications as p inner join authors as au on au.publication_id = p.publication_id inner join author_settings as aus on aus.author_id = au.author_id where aus.setting_name in (‘givenName’, ‘familyName’, ‘orcid’) and LOWER(aus.setting_value) LIKE CONCAT(‘%’, LOWER(‘language’), ‘%’)) or exists (select * from submission_search_objects as sso inner join submission_search_object_keywords as ssok on sso.object_id = ssok.object_id inner join submission_search_keyword_list as sskl on sskl.keyword_id = ssok.keyword_id where sskl.keyword_text = LOWER(‘dominance’) and s.submission_id = sso.submission_id) or s.submission_id in (select p.submission_id from publications as p inner join publication_settings as ps on p.publication_id = ps.publication_id where ps.setting_name = ‘title’ and LOWER(ps.setting_value) LIKE CONCAT(‘%’, LOWER(‘dominance’), ‘%’)) or s.submission_id in (select p.submission_id from publications as p inner join authors as au on au.publication_id = p.publication_id inner join author_settings as aus on aus.author_id = au.author_id where aus.setting_name in (‘givenName’, ‘familyName’, ‘orcid’) and LOWER(aus.setting_value) LIKE CONCAT(‘%’, LOWER(‘dominance’), ‘%’)) or exists (select * from submission_search_objects as sso inner join submission_search_object_keywords as ssok on sso.object_id = ssok.object_id inner join submission_search_keyword_list as sskl on sskl.keyword_id = ssok.keyword_id where sskl.keyword_text = LOWER(‘acculturation’) and s.submission_id = sso.submission_id) or s.submission_id in (select p.submission_id from publications as p inner join publication_settings as ps on p.publication_id = ps.publication_id where ps.setting_name = ‘title’ and LOWER(ps.setting_value) LIKE CONCAT(‘%’, LOWER(‘acculturation’), ‘%’)) or s.submission_id in (select p.submission_id from publications as p inner join authors as au on au.publication_id = p.publication_id inner join author_settings as aus on aus.author_id = au.author_id where aus.setting_name in (‘givenName’, ‘familyName’, ‘orcid’) and LOWER(aus.setting_value) LIKE CONCAT(‘%’, LOWER(‘acculturation’), ‘%’)) or exists (select * from submission_search_objects as sso inner join submission_search_object_keywords as ssok on sso.object_id = ssok.object_id inner join submission_search_keyword_list as sskl on sskl.keyword_id = ssok.keyword_id where sskl.keyword_text = LOWER(‘shift’) and s.submission_id = sso.submission_id) or s.submission_id in (select p.submission_id from publications as p inner join publication_settings as ps on p.publication_id = ps.publication_id where ps.setting_name = ‘title’ and LOWER(ps.setting_value) LIKE CONCAT(‘%’, LOWER(‘shift’), ‘%’)) or s.submission_id in (select p.submission_id from publications as p inner join authors as au on au.publication_id = p.publication_id inner join author_settings as aus on aus.author_id = au.author_id where aus.setting_name in (‘givenName’, ‘familyName’, ‘orcid’) and LOWER(aus.setting_value) LIKE CONCAT(‘%’, LOWER(‘shift’), ‘%’)) or exists (select * from submission_search_objects as sso inner join submission_search_object_keywords as ssok on sso.object_id = ssok.object_id inner join submission_search_keyword_list as sskl on sskl.keyword_id = ssok.keyword_id where sskl.keyword_text = LOWER(‘development’) and s.submission_id = sso.submission_id) or s.submission_id in (select p.submission_id from publications as p inner join publication_settings as ps on p.publication_id = ps.publication_id where ps.setting_name = ‘title’ and LOWER(ps.setting_value) LIKE CONCAT(‘%’, LOWER(‘development’), ‘%’)) or s.submission_id in (select p.submission_id from publications as p inner join authors as au on au.publication_id = p.publication_id inner join author_settings as aus on aus.author_id = au.author_id where aus.setting_name in (‘givenName’, ‘familyName’, ‘orcid’) and LOWER(aus.setting_value) LIKE CONCAT(‘%’, LOWER(‘development’), ‘%’)) or exists (select * from submission_search_objects as sso inner join submission_search_object_keywords as ssok on sso.object_id = ssok.object_id inner join submission_search_keyword_list as sskl on sskl.keyword_id = ssok.keyword_id where sskl.keyword_text = LOWER(‘lifespan’) and s.submission_id = sso.submission_id) or s.submission_id in (select p.submission_id from publications as p inner join publication_settings as ps on p.publication_id = ps.publication_id where ps.setting_name = ‘title’ and LOWER(ps.setting_value) LIKE CONCAT(‘%’, LOWER(‘lifespan’), ‘%’)) or s.submission_id in (select p.submission_id from publications as p inner join authors as au on au.publication_id = p.publication_id inner join author_settings as aus on aus.author_id = au.author_id where aus.setting_name in (‘givenName’, ‘familyName’, ‘orcid’) and LOWER(aus.setting_value) LIKE CONCAT(‘%’, LOWER(‘lifespan’), ‘%’)) or exists (select * from submission_search_objects as sso inner join submission_search_object_keywords as ssok on sso.object_id = ssok.object_id inner join submission_search_keyword_list as sskl on sskl.keyword_id = ssok.keyword_id where sskl.keyword_text = LOWER(‘complementarity’) and s.submission_id = sso.submission_id) or s.submission_id in (select p.submission_id from publications as p inner join publication_settings as ps on p.publication_id = ps.publication_id where ps.setting_name = ‘title’ and LOWER(ps.setting_value) LIKE CONCAT(‘%’, LOWER(‘complementarity’), ‘%’)) or s.submission_id in (select p.submission_id from publications as p inner join authors as au on au.publication_id = p.publication_id inner join author_settings as aus on aus.author_id = au.author_id where aus.setting_name in (‘givenName’, ‘familyName’, ‘orcid’) and LOWER(aus.setting_value) LIKE CONCAT(‘%’, LOWER(‘complementarity’), ‘%’)) or exists (select * from submission_search_objects as sso inner join submission_search_object_keywords as ssok on sso.object_id = ssok.object_id inner join submission_search_keyword_list as sskl on sskl.keyword_id = ssok.keyword_id where sskl.keyword_text = LOWER(‘principle’) and s.submission_id = sso.submission_id) or s.submission_id in (select p.submission_id from publications as p inner join publication_settings as ps on p.publication_id = ps.publication_id where ps.setting_name = ‘title’ and LOWER(ps.setting_value) LIKE CONCAT(‘%’, LOWER(‘principle’), ‘%’)) or s.submission_id in (select p.submission_id from publications as p inner join authors as au on au.publication_id = p.publication_id inner join author_settings as aus on aus.author_id = au.author_id where aus.setting_name in (‘givenName’, ‘familyName’, ‘orcid’) and LOWER(aus.setting_value) LIKE CONCAT(‘%’, LOWER(‘principle’), ‘%’)) or exists (select * from submission_search_objects as sso inner join submission_search_object_keywords as ssok on sso.object_id = ssok.object_id inner join submission_search_keyword_list as sskl on sskl.keyword_id = ssok.keyword_id where sskl.keyword_text = LOWER(‘polish-english’) and s.submission_id = sso.submission_id) or s.submission_id in (select p.submission_id from publications as p inner join publication_settings as ps on p.publication_id = ps.publication_id where ps.setting_name = ‘title’ and LOWER(ps.setting_value) LIKE CONCAT(‘%’, LOWER(‘polish-english’), ‘%’)) or s.submission_id in (select p.submission_id from publications as p inner join authors as au on au.publication_id = p.publication_id inner join author_settings as aus on aus.author_id = au.author_id where aus.setting_name in (‘givenName’, ‘familyName’, ‘orcid’) and LOWER(aus.setting_value) LIKE CONCAT(‘%’, LOWER(‘polish-english’), ‘%’)) or exists (select * from submission_search_objects as sso inner join submission_search_object_keywords as ssok on sso.object_id = ssok.object_id inner join submission_search_keyword_list as sskl on sskl.keyword_id = ssok.keyword_id where sskl.keyword_text = LOWER(‘bilinguals’) and s.submission_id = sso.submission_id) or s.submission_id in (select p.submission_id from publications as p inner join publication_settings as ps on p.publication_id = ps.publication_id where ps.setting_name = ‘title’ and LOWER(ps.setting_value) LIKE CONCAT(‘%’, LOWER(‘bilinguals’), ‘%’)) or s.submission_id in (select p.submission_id from publications as p inner join authors as au on au.publication_id = p.publication_id inner join author_settings as aus on aus.author_id = au.author_id where aus.setting_name in (‘givenName’, ‘familyName’, ‘orcid’) and LOWER(aus.setting_value) LIKE CONCAT(‘%’, LOWER(‘bilinguals’), ‘%’))) and s.submission_id not in (19200) order by (select COUNT(DISTINCT sskl.keyword_id) from submission_search_objects as sso inner join submission_search_object_keywords as ssok on ssok.object_id = sso.object_id inner join submission_search_keyword_list as sskl on sskl.keyword_id = ssok.keyword_id where (sskl.keyword_text = LOWER(‘language’) or sskl.keyword_text = LOWER(‘dominance’) or sskl.keyword_text = LOWER(‘acculturation’) or sskl.keyword_text = LOWER(‘shift’) or sskl.keyword_text = LOWER(‘development’) or sskl.keyword_text = LOWER(‘lifespan’) or sskl.keyword_text = LOWER(‘complementarity’) or sskl.keyword_text = LOWER(‘principle’) or sskl.keyword_text = LOWER(‘polish-english’) or sskl.keyword_text = LOWER(‘bilinguals’)) and s.submission_id = sso.submission_id) desc, (select COUNT(0) from submission_search_objects as sso inner join submission_search_object_keywords as ssok on ssok.object_id = sso.object_id inner join submission_search_keyword_list as sskl on sskl.keyword_id = ssok.keyword_id where (sskl.keyword_text = LOWER(‘language’) or sskl.keyword_text = LOWER(‘dominance’) or sskl.keyword_text = LOWER(‘acculturation’) or sskl.keyword_text = LOWER(‘shift’) or sskl.keyword_text = LOWER(‘development’) or sskl.keyword_text = LOWER(‘lifespan’) or sskl.keyword_text = LOWER(‘complementarity’) or sskl.keyword_text = LOWER(‘principle’) or sskl.keyword_text = LOWER(‘polish-english’) or sskl.keyword_text = LOWER(‘bilinguals’)) and s.submission_id = sso.submission_id) desc limit 10 offset 0

1 Like

Hi @ian-believe.digital,

Those look like searches performed (possibly by a 3rd-party search engine) against the article full-text index. This is potentially a large index and will test your database server.

You can probably confirm by looking through your logs for URLs containing some of the rarer keywords there – e.g. acculturation – if these are specified in the URL (as opposed to a POST form). This might tell you what bot is responsible, and what its crawling behaviour looks like.

Most (well-behaved) bots can be rate-limited using a robots.txt file – see e.g. Crawl-delay. Here is a bit of a reference.

With a large-enough set of articles, the built in search engine simply won’t perform very well; for large collections, I’d recommend looking into Lucene or similar.

Regards,
Alec Smecher
Public Knowledge Project Team

Thanks @asmecher!

I’ve done what I can in regards to bot rate-limiting etc.

I’ll look into incorporating Lucene as well…

1 Like

This topic was automatically closed after 11 days. New replies are no longer allowed.