Authors page doesn’t work any more due to sql query conditions

Hello everybody,

We use OJS 3.2.1.3 since 2 years with mysql 5.5 and PHP 7.3.
We host 10 journals.

Recently, we encountered a display problem with the authors’ page (/search/authors → searchAuthorIndex.tpl) for several journals which have lot of submissions and authors.
The page can’t stop running and never displayed.

I tested the SQL query that is generated and I found that the problem comes from the query conditions.

It’s a similar problem that I had already reported in october 2020 :
https://forum.pkp.sfu.ca/t/search-is-slow-and-not-always-working-after-upgrade-to-ojs-3-2-1-1/

The basic query is as follows (with the famous CAST condition in bold) and for some journals query runs for ever and never ends:

Query 1: SELECT a.*, ug.show_title, s.locale, COALESCE(agl.setting_value, agpl.setting_value) AS author_given, CASE WHEN agl.setting_value <> ‘’ THEN afl.setting_value ELSE afpl.setting_value END AS author_family FROM authors a JOIN user_groups ug ON (a.user_group_id = ug.user_group_id) JOIN publications p ON (p.publication_id = a.publication_id) JOIN submissions s ON (s.current_publication_id = p.publication_id) LEFT JOIN author_settings agl ON (a.author_id = agl.author_id AND agl.setting_name = ‘givenName’ AND agl.locale = ‘en_US’) LEFT JOIN author_settings agpl ON (a.author_id = agpl.author_id AND agpl.setting_name = ‘givenName’ AND agpl.locale = s.locale) LEFT JOIN author_settings afl ON (a.author_id = afl.author_id AND afl.setting_name = ‘familyName’ AND afl.locale = ‘en_US’) LEFT JOIN author_settings afpl ON (a.author_id = afpl.author_id AND afpl.setting_name = ‘familyName’ AND afpl.locale = s.locale) JOIN ( SELECT MIN(aa.author_id) as author_id, CONCAT( ac.setting_value, ’ ’ , COALESCE(asg0.setting_value, ‘’), ’ ', COALESCE(asf0.setting_value, ‘’), ’ ', COALESCE(SUBSTRING(asa0.setting_value FROM 1 FOR 255), ‘’), ’ ’ ) as names FROM authors aa JOIN publications pp ON (pp.publication_id = aa.publication_id) LEFT JOIN publication_settings ppss ON (ppss.publication_id = pp.publication_id) JOIN submissions ss ON (ss.submission_id = pp.submission_id AND ss.current_publication_id = pp.publication_id AND ss.status = 3) JOIN journals j ON (ss.context_id = j.journal_id) JOIN issues i ON (ppss.setting_name = ‘issueId’ AND ppss.setting_value = CAST(i.issue_id AS CHAR(20)) AND i.published = 1) LEFT JOIN author_settings ac ON (ac.author_id = aa.author_id AND ac.setting_name = ‘country’) LEFT JOIN author_settings asg0 ON (asg0.author_id = aa.author_id AND asg0.setting_name = ‘givenName’ AND asg0.locale = ‘en_US’) LEFT JOIN author_settings asf0 ON (asf0.author_id = aa.author_id AND asf0.setting_name = ‘familyName’ AND asf0.locale = ‘en_US’) LEFT JOIN author_settings asa0 ON (asa0.author_id = aa.author_id AND asa0.setting_name = ‘affiliation’ AND asa0.locale = ‘en_US’) WHERE j.enabled = 1 AND j.journal_id = 1 GROUP BY names ) as t1 ON (t1.author_id = a.author_id) ORDER BY author_family, author_given;

I added 3 conditions (in bold) to optimize the query to make it working again and running in an acceptable time:
Query 2: SELECT a.*, ug.show_title, s.locale, COALESCE(agl.setting_value, agpl.setting_value) AS author_given, CASE WHEN agl.setting_value <> ‘’ THEN afl.setting_value ELSE afpl.setting_value END AS author_family FROM authors a JOIN user_groups ug ON (a.user_group_id = ug.user_group_id) JOIN publications p ON (p.publication_id = a.publication_id) JOIN submissions s ON (s.current_publication_id = p.publication_id) LEFT JOIN author_settings agl ON (a.author_id = agl.author_id AND agl.setting_name = ‘givenName’ AND agl.locale = ‘en_US’) LEFT JOIN author_settings agpl ON (a.author_id = agpl.author_id AND agpl.setting_name = ‘givenName’ AND agpl.locale = s.locale) LEFT JOIN author_settings afl ON (a.author_id = afl.author_id AND afl.setting_name = ‘familyName’ AND afl.locale = ‘en_US’) LEFT JOIN author_settings afpl ON (a.author_id = afpl.author_id AND afpl.setting_name = ‘familyName’ AND afpl.locale = s.locale) JOIN ( SELECT MIN(aa.author_id) as author_id, CONCAT( ac.setting_value, ’ ’ , COALESCE(asg0.setting_value, ‘’), ’ ', COALESCE(asf0.setting_value, ‘’), ’ ', COALESCE(SUBSTRING(asa0.setting_value FROM 1 FOR 255), ‘’), ’ ’ ) as names FROM authors aa JOIN publications pp ON (pp.publication_id = aa.publication_id) LEFT JOIN publication_settings ppss ON (ppss.publication_id = pp.publication_id ) JOIN submissions ss ON (ss.submission_id = pp.submission_id AND ss.current_publication_id = pp.publication_id AND ss.status = 3) JOIN journals j ON (ss.context_id = j.journal_id) JOIN issues i ON (ppss.setting_name = ‘issueId’ AND ppss.setting_value = CAST(i.issue_id AS CHAR(20)) AND i.published = 1 AND ppss.locale =‘’ AND ss.context_id = i.journal_id) LEFT JOIN author_settings ac ON (ac.author_id = aa.author_id AND ac.setting_name = ‘country’ AND ac.locale = ‘’) LEFT JOIN author_settings asg0 ON (asg0.author_id = aa.author_id AND asg0.setting_name = ‘givenName’ AND asg0.locale = ‘en_US’) LEFT JOIN author_settings asf0 ON (asf0.author_id = aa.author_id AND asf0.setting_name = ‘familyName’ AND asf0.locale = ‘en_US’) LEFT JOIN author_settings asa0 ON (asa0.author_id = aa.author_id AND asa0.setting_name = ‘affiliation’ AND asa0.locale = ‘en_US’) WHERE j.enabled = 1 AND j.journal_id = 1 GROUP BY names ) as t1 ON (t1.author_id = a.author_id) ORDER BY author_family, author_given;

But where the query is the fastest is when I only replace in original query:
Query 3:
AND ppss.setting_value = CAST(i.issue_id AS CHAR(20))
with
AND CAST(ppss.setting_value AS UNSIGNED) = i.issue_id

I know you have to create a query which works with both DB Mysql and PostgreSQL and that PostgreSQL doesn’t like to implicitly convert between strings and numbers.

Could you plan to change the query to make it working again : either query 2 or either (idealy for us) query 3 ?

Thanks in advance.
Best regards.
Helene

Hello again,

As we are also testing OJS 3.3.0.14, I realized I put query of 3.3.0.14. The only difference with both queries in file classes/article/AuthorDAO.inc.php is the call of locale attribute in the select clause:
in OJS 3.2: p.locale
in OJS 3.3: s.locale

But problems of conditions remain the same.
With the original query in file classes/article/AuthorDAO.inc.php, query runs for ever and never ends for some of our journals.

  1. With adding those 3 conditions:
    AND ppss.locale =‘’
    AND ss.context_id = i.journal_id
    AND ac.locale = ‘’

  2. Or with replacing
    AND ppss.setting_value = CAST(i.issue_id AS CHAR(20))
    by
    AND CAST(ppss.setting_value AS UNSIGNED) = i.issue_id

query is working again

Could you consider to modify the query to add or modify conditions to optimize the query in OJS 3.2 and/or OJS 3.3 as @asmecher did before for a similar problem query join issue:

Thanks again.
Kind regards.
Helene

Hi @hcl,

The queries you’re struggling with are part of the author index, which is a long-standing part of OJS that has been slated for removal. The removal happened in OJS 3.4.0, and has been recently back-ported to 3.3.0 for release in 3.3.0-15. (See googleSearch report a lot of slowpages linking to author's search · Issue #7183 · pkp/pkp-lib · GitHub for details.)

If you have a particular need for this interface, please describe it a little. The issue linked above reviews some feedback and options, and in the end concludes that the list is not serving needs well, and is affecting users negatively due to poor performance. In my opinion, third-party services like ORCiD will meet the need for author listings more completely.

Thanks,
Alec Smecher
Public Knowledge Project Team

Hi @asmecher,

Thanks for your answer and your information on the future of the authors’ page.

We will update OJS 3.3 soon and we will have time to think about the new authors’ page until the release of OJS 3.4.

“Back-ported” in 3.3.0-15, does it mean insert again ? If so, is it possible to modify in the same time the query in file classes/article/AuthorDAO.inc.php with adding the 3 conditions to the query to optimze it and make it work again ? It would be better if the improve is in the original code than we have to patch it ourselves.

I can pull a request for this issue if you want ?

We are going to discuss with our team if we have particular need for the new interface of authors’ page in OJS 3.4.

Thanks in advance for your answer.
Kinds regards.
Helene

Hi @hcl,

What I mean by back-porting is that the removal of the author index in 3.4 has also been applied to 3.3, starting with 3.3.0-15. Do you need this interface for anything in particular? I suspect it’s just being indexed by search engines for legacy reasons, which causes the slow-down. That’s all we’ve heard reported so far.

Regards,
Alec Smecher
Public Knowledge Project Team

Hi @asmecher,

Thanks for your answer and sorry for the misunderstanding.

In our journals custom themes, there is a menu that points to the authors interface. Our journals liked to have a page that lists authors.

We take note of the removal of the authors interface from OJS 3.3.0-15. We will inform our journals.

Where could we deposit journals proposals and some of collection of journals needs for this new authors page?

Thanks again.
Helene

Hi @hcl,

Rather than replacing the authors index within OJS, we’re prioritizing better integration of ORCiD. If authors adopt ORCiD, it’ll mean listings of author work not just in the current journal or OJS installation but more broadly; that’s much more useful for researchers. (Google Scholar would be another alternative.)

If you’d like to write a little more about the journal use cases, I’d welcome it.

Thanks,
Alec Smecher
Public Knowledge Project Team

This topic was automatically closed after 10 days. New replies are no longer allowed.