Pre-conferred papers leak to Google Scholar before publication

Hi there,

I was wondering whether this problem is related to this harvester. I am a new manager of an OCS website, and we found that last year some abstracts/papers accidentally appeared in Google Scholar before they were conferred. We did not know it had happened until now. Could you please give us any advice on this? Or which tools should I enable or disable? Thanks in advance!

Hi @dianvincensia,

I’m not sure what you mean by “pre-conferred” papers. Can you elaborate, please? Are these papers published on your OCS site? Google Scholar crawls publicly available content, so if you are using PKP software, have made content publicly available, and Google crawls your site, there’s not much we can do about that, I’m afraid. Google Scholar does offer some guidelines for excluding content via robots.txt files: Google Scholar Help
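As a rough illustration of the robots.txt approach, here is a minimal Python sketch that checks whether a given rule set would tell a crawler to stay away from a URL. The rule, the `/files/` path, the example.org URLs, and the helper name `is_blocked` are all assumptions made up for the example, not OCS defaults. Keep in mind that robots.txt only asks crawlers not to fetch a path; it does not protect the files themselves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the /files/ path is an assumption,
# not something OCS ships with.
ROBOTS_TXT = """\
User-agent: *
Disallow: /files/
"""


def is_blocked(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules disallow `agent` from fetching `url`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.org/files/paper.pdf",
        "https://example.org/index.php/conf/index",
    ):
        print(url, "blocked" if is_blocked(ROBOTS_TXT, url) else "allowed")
```

To test the rules your site actually serves, you could fetch your own robots.txt and pass its text to the same function.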

-Roger
PKP Team

Hi @rcgillis

Thank you for your reply.

Sorry that the term was unclear. We set up a submission page where the authors of our conference submit their abstract/full paper before the conference begins. We do not know why these abstracts/full papers accidentally appeared in Google Scholar even though they had not yet been presented at our conference. Some of the authors noticed it and were afraid that it might be treated as plagiarism or double submission, since the papers appeared in Google Scholar before the presentation.

I have asked the previous manager of this OCS but he said he did not change anything except the timeline of the submission.

Could you please give me any advice on this?

Thanks anyway!

Hi @dianvincensia,

Thank you for clarifying. I’m not sure what exactly would lead to Google Scholar indexing your material, but my guess is that it was publicly exposed in some fashion (i.e. available on the web), and that the Google Scholar crawler picked it up as part of its indexing efforts, since you are using a recognized platform. That said, getting one’s material removed from Google Scholar’s index has proven to be difficult, and is unfortunately something we have little control over. This post has some good advice (although it’s for OJS and not OCS): How to remove a journal from Google (Scholar) - #12 by bernieh - and other community members may wish to weigh in with advice as well.

-Roger
PKP Team

Hi @dianvincensia,

Is it possible that your OCS installation’s files directory (the files_dir setting in config.inc.php) is exposed to the web directly? If so, that could explain Google indexing it – but more than that, it’s a potentially dangerous configuration because malicious users can upload PHP scripts and guess the URL to them in order to execute them on the server. Make sure that your files_dir is either outside the web root, or if you can’t do that, make sure it’s protected from direct access using e.g. a .htaccess file.
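One quick way to sanity-check this is to compare the two paths. The Python sketch below is only an illustration: the helper name `is_inside_web_root` and the example paths are hypothetical, and you would substitute the files_dir value from your own config.inc.php and your server's actual document root. It only compares filesystem paths, so it will not notice a web server alias or similar mapping that exposes a directory outside the web root.

```python
from pathlib import Path


def is_inside_web_root(files_dir: str, web_root: str) -> bool:
    """Return True if files_dir is the web root itself or lies beneath it."""
    # resolve() normalizes ".." segments and follows symlinks where they exist.
    files = Path(files_dir).resolve()
    root = Path(web_root).resolve()
    # Comparing against parents avoids false matches such as
    # /var/www/html_files being treated as inside /var/www/html.
    return files == root or root in files.parents


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    web_root = "/var/www/html"
    for candidate in ("/var/www/html/files", "/var/ocs_files"):
        exposed = is_inside_web_root(candidate, web_root)
        print(candidate, "-> inside web root" if exposed else "-> outside web root")
```

If the check reports that the directory is inside the web root, moving it out (and updating files_dir accordingly) or blocking direct access to it are the options described above.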

Regards,
Alec Smecher
Public Knowledge Project Team
