How to remove a journal from Google (Scholar)

We had a test journal that was accidentally indexed by Google Scholar. Now I’d like to remove the nonsense articles from
Google’s index. I was advised to use this removal tool:

https://www.google.com/webmasters/tools/removals?pli=1

When I try to use it, I get a message saying that the content is still available and that I should contact the site owner. The old article URLs return status code 302, and users are redirected to the OJS start page.

What can I do about this?

I’m aware that Google indexes non-public journals. But now that the content is in the index, I’d like to get rid of it.

Hi @CH_,

Only Google knows how to get content indexed or removed – I’m afraid we have no control over that. To prevent getting content indexed in the first place, ensure that you don’t check the “Enabled” box in the Site Administrator’s “Hosted Journals” interface for that journal. However, once the content is indexed, it’s up to Google to remove it.

Regards,
Alec Smecher
Public Knowledge Project Team

Thanks for your answer! You are right, Google is responsible for its index. But OJS as a CMS is responsible for delivering the right status code for a removed journal. As it is, it’s a 302, which signals to every crawler that the site exists. In this case that is no longer true, as the journal has been deleted.

And since Google only accepts removal suggestions if the site doesn’t exist anymore, I have a problem resulting from OJS. At least I do think that it results from an OJS configuration somewhere.

Sorry if my first post wasn’t clear enough about that.

PS: One of the URLs as an example:
https://journals.ub.uni-heidelberg.de/index.php/testinfo/article/view/16177/12537

PPS: You can see the server response here:
http://tools.seobook.com/server-header-checker/?page=single&url=https%3A%2F%2Fjournals.ub.uni-heidelberg.de%2Findex.php%2Ftestinfo%2Farticle%2Fview%2F16177%2F12537&useragent=3&typeProtocol=11
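The same check can be done locally instead of through a web tool. Below is a minimal Python sketch, not anything shipped with OJS: `fetch_status` sends a single HEAD request without following redirects, and `classify_status` gives a rough label for the result. The function names and the choice to treat only 404 and 410 as "gone" are assumptions made for illustration; Google does not publish the exact rules its removal tool applies.

```python
import http.client
from urllib.parse import urlsplit

# Assumption for illustration: these codes tell a crawler the page is gone.
GONE_CODES = {404, 410}
REDIRECT_CODES = {301, 302, 303, 307, 308}


def classify_status(status: int) -> str:
    """Return a rough label for how a crawler might read a status code."""
    if status in GONE_CODES:
        return "gone"
    if status in REDIRECT_CODES:
        return "redirect"
    if 200 <= status < 300:
        return "available"
    return "other"


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Send one HEAD request and return the raw status code.

    http.client never follows redirects on its own, so a 302 is
    reported as 302 rather than as the status of the redirect target.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()
```

Calling `classify_status(fetch_status(url))` with one of the old article URLs shows whether the server still answers with a redirect or already reports the page as gone.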

I do think a 403, 404, or 410 response would be more “honest” here, but that’s a pretty radical departure from how we’re handling nonexistent links so far.

On the other hand, Google Scholar is all about evaluating links for research association, and if they followed the 302 and read the metadata, they would see that there is no “content available”, so in part I think this is an issue on their end.

Edit: It looks like @asmecher has his eye on a 404: Respond to nonexistent monographs with a 404 · Issue #772 · pkp/pkp-lib · GitHub

Thanks, @ctgraham & @asmecher, I’ll follow this issue on GitHub!

The problem with GS is that they don’t follow the 302. I will inform GS about this.

Hi @CH_,

Agreed, I think Google Scholar should respond to 302s by removing the content, but we could definitely do better by returning 404s. As @ctgraham points out, I’ve got a modification nearly completed to do that – see Respond to nonexistent monographs with a 404 · Issue #772 · pkp/pkp-lib · GitHub for details. Watch for a patch there.

Regards,
Alec Smecher
Public Knowledge Project Team
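To make the difference between the two behaviours concrete, here is a small Python sketch. It is not OJS or pkp-lib code (those are written in PHP) and does not reproduce the patch from issue #772; the function names, the `/index.php/<journal>/...` path parsing, and the `/` redirect target are all hypothetical and only model the idea discussed above.

```python
from typing import Dict, Set, Tuple

# (status code, response headers)
Response = Tuple[int, Dict[str, str]]


def journal_slug(path: str) -> str:
    """Extract the journal slug from a path like /index.php/<journal>/..."""
    parts = [p for p in path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "index.php":
        return parts[1]
    return ""


def respond_with_redirect(path: str, journals: Set[str]) -> Response:
    """Behaviour described in the thread: unknown journals bounce to the start page."""
    if journal_slug(path) in journals:
        return 200, {}
    # Hypothetical start page location; a crawler only sees "moved temporarily".
    return 302, {"Location": "/"}


def respond_with_not_found(path: str, journals: Set[str]) -> Response:
    """Proposed behaviour: unknown journals get a 404, so crawlers can drop the URL."""
    if journal_slug(path) in journals:
        return 200, {}
    return 404, {}
```

With a deleted journal, the first function answers 302 plus a `Location` header, while the second answers 404, which is the signal a removal request needs.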

Short note for people with the same problem: It took almost exactly a year, but Google Scholar has now removed the test journal from its index.

Came across this query by serendipity, so posting this note just in case others encounter the same problem.

I’ve been monitoring our works in Google Scholar for the past year or so, and one of the issues that has arisen is that once Google Scholar has indexed something, it doesn’t revisit or reindex that item for another 6-9 months. No matter what you do, this rule holds. This is documented at: Google Scholar Help. And my observations over the past year confirm that this is indeed how Scholar behaves in practice.