How do I combat SPAM?

jmacgreg · April 18, 2018, 3:26pm

Hi @ambs, and apologies for not responding sooner! Here’s how we @ PKP go about getting lists. It requires access to MySQL though.

First, we look at the user accounts and determine if there’s some common identifying characteristic for the spam accounts. Quite often this is easy: there’s a bot out there that as part of the registration process puts “123456” in the phone and sometimes fax fields. So we get all usernames that match this:

SELECT * FROM users WHERE phone = '123456';

This returns a list of all users that have “123456” as the phone number. Review the list for possible false positives, and copy the problem usernames to names.txt.
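If you only need the usernames for names.txt, a narrower query can save some copying. This is just a sketch: it assumes your users table has `username` and `fax` columns, so check your schema first, since column names can differ between versions.

```sql
-- Assumption: `username` and `fax` exist as columns in your users table.
-- Returns only the usernames of accounts carrying the bot's marker
-- in either the phone or the fax field.
SELECT username
FROM users
WHERE phone = '123456'
   OR fax = '123456';
```

The result still needs the same manual review before anything goes into names.txt.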

You can do the same for other suspicious-looking data. For example, if you see multiple users registering with emails that contain the same weird suffix, such as “eamale.com” or “yandex.com”, you can do the following:

SELECT * FROM users WHERE email LIKE '%eamale%';

We’ve also seen a situation with another spambot where they fill in the phone field with an 11-digit string, like “84286848777”. You can select for this as well:

SELECT * FROM users WHERE phone REGEXP '^[0-9]{11}$';

You can also select all users registered in the system who don’t have roles assigned to them, which is close to what you are asking:

SELECT * FROM users WHERE user_id NOT IN (SELECT user_id FROM roles);

This returns all user information for users that have no role at all (author, reader, reviewer, etc.).
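One thing to be aware of: in SQL, `NOT IN` matches nothing if the subquery returns even a single NULL. That is probably not an issue for the roles table, but if the query unexpectedly comes back empty, a `NOT EXISTS` form avoids the problem. A sketch, using the same table and column names as the query above:

```sql
-- Same intent as the NOT IN query: users with no row in roles.
-- NOT EXISTS is not affected by NULL values in roles.user_id.
SELECT u.*
FROM users u
WHERE NOT EXISTS (
    SELECT 1
    FROM roles r
    WHERE r.user_id = u.user_id
);
```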

One caution about this process: look through the results carefully for false positives; otherwise, you could end up deleting a user who is not actually a spammer.
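To make that review a bit easier, it may help to pull only a few telling columns rather than everything. The sketch below assumes your users table has `username`, `email`, and `date_registered` columns; adjust the names to whatever your schema actually uses, and swap in whichever WHERE clause you are checking.

```sql
-- Assumption: `username`, `email`, and `date_registered` exist in users.
-- Lists candidate spam accounts, newest first, so bursts of bot
-- registrations on the same day are easy to spot.
SELECT user_id, username, email, phone, date_registered
FROM users
WHERE phone REGEXP '^[0-9]{11}$'
ORDER BY date_registered DESC;
```

Accounts that registered long ago, or that you recognize, are the ones to leave out of names.txt.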

Cheers,
James