Lessons Learned from Manually Assessing 8000 Blog Comments & Deleting Half of Them

You may have noticed that comments were removed from WPLift for the last week. The reason is that I have been cleaning up the site because of Panda 4.0. WPLift has been hit a little by it and our search engine traffic has dipped, so I have been auditing on-site factors to try to repair this before the next refresh comes around. Panda places more emphasis on on-site factors and authority, so if you have been hit by it, taking a look at your site and cleaning it up should help. I plan to expand on this once the refresh happens and I can see the results of my work, and I will share what worked (or what didn’t!).

One thing I thought could be a factor is the number of spammy comments we had on WPLift. Foolishly, I had just left our comments open and was letting Akismet handle them. While it’s very effective, it isn’t one hundred percent perfect.

The result?

Over 8000 comments, many of which were very spammy in nature – links to all sorts of dodgy websites, keyword-stuffed comments and many other sorts of comments which should never be on your blog. I didn’t spend much time checking comments, as I reasoned that the links were no-followed and they didn’t seem to be harming my site. Looking back now, that was stupid of me – when trying to rank on Google, linking out to dodgy sites and having a load of irrelevant content in the form of comments will wreck your keyword density, and I think it may be one of the reasons we got hit. Even if it’s not, it doesn’t look good to actual visitors – it makes the site look uncared for and neglected.

I did think about removing comments altogether like CopyBlogger has recently done, but decided against it – a lot of our comments are very useful and add to the blog post, whether that’s with corrections about the content of my posts (which I’m always happy to receive), additional links and information, or just the general feeling of a community around a blog.

So this week I set about cleaning up the comments on WPLift. Here is what I did.

How I Removed Thousands of Spam Comments

When you have over 8000 comments to go through and pick which ones are spam, it’s quite a daunting thought. To help ease the burden, I decided first of all to remove all pings and trackbacks, as I no longer wanted to accept them. I don’t think they are really useful anymore, and a lot of them come from spam sites, scrapers, aggregators etc. You can filter comments by pings in the admin, so I did that and then bulk deleted them all.
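For anyone who would rather script this step against an export than use the admin filter, here is a minimal sketch in Python. It assumes the comments have been exported as a list of dicts whose keys mirror the `wp_comments` columns, where pings carry a `comment_type` of `pingback` or `trackback`. The sample data and the `split_pings` helper are made up for illustration, and nothing here touches a live site.

```python
# Comment types WordPress uses for pings; regular comments use "" or "comment".
PING_TYPES = {"pingback", "trackback"}


def split_pings(comments):
    """Return (pings, regular) lists based on each comment's comment_type."""
    pings = [c for c in comments if c.get("comment_type") in PING_TYPES]
    regular = [c for c in comments if c.get("comment_type") not in PING_TYPES]
    return pings, regular


# Hypothetical export rows, for illustration only.
sample = [
    {"comment_ID": 1, "comment_type": "", "comment_content": "Thanks, useful post."},
    {"comment_ID": 2, "comment_type": "pingback", "comment_content": "[...] linked here [...]"},
    {"comment_ID": 3, "comment_type": "trackback", "comment_content": "Scraped excerpt"},
]

pings, regular = split_pings(sample)
ping_ids = [c["comment_ID"] for c in pings]        # candidates for bulk deletion
regular_ids = [c["comment_ID"] for c in regular]   # still need a manual look
print(ping_ids, regular_ids)
```

The split only separates the two groups. Deciding what to do with the pings is still up to you.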

Removing pings brought the total comments down to around 6000 which was still quite daunting but cheered me up a little.

To start removing the spam comments, I set the screen options for the comments page to show more comments per page. The maximum is 999, so I tried that, which gave me 6 pages of comments to go through.

I then went down the list ticking all the spam comments and tried to bulk delete them. That didn’t work – I received a WordPress error stating that the request was too large. I tried smaller amounts on screen at once but kept getting the error, so in the end I showed 100 comments at a time and went down the list clicking “Spam” next to any comment I wanted to remove. This was an arduous process, so I spread it out over 4 days – whenever I needed a break from working I cleaned out a few pages and kept a record of where I was up to.
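The “request too large” error is what you tend to get when too many IDs go into one request. If you script the clean-up, the usual workaround is to split the IDs into small batches, which is the same idea as showing 100 comments at a time. A minimal sketch in Python follows. `flagged_ids` is a hypothetical list of comment IDs, and the code only does the splitting, it does not talk to WordPress.

```python
def batches(ids, size=100):
    """Yield consecutive slices of ids, each at most `size` items long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# Hypothetical IDs standing in for comments already flagged by hand.
flagged_ids = list(range(1, 251))

chunks = list(batches(flagged_ids, size=100))
sizes = [len(chunk) for chunk in chunks]
print(sizes)  # [100, 100, 50]
```

Keeping a note of the last batch you finished gives you the same “where was I up to” record as the manual approach.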

Unfortunately I decided this was the best way – I could have tried some more automated process but I wanted full control over what comments stayed as I had a few criteria for comments that needed removing. After my manual removal I ended up with 4800 comments so I had to remove around 1200 by hand.

Let’s move on to what I removed.

Types of Comments I Removed

Here is a run-down of the types of comments that I removed and what you should be on the look-out for on your site.

Keyword Usernames

I deleted the comment if a user posted a comment with a name like “Affordable WordPress Development” or even more spammy things like “Auto Car Loan”, you know the sort – they are only made to try and rank for that search term, even though the Penguin update slapped this sort of link-building. Ironically I probably helped some of these sites by removing the links :)
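If you want a script to surface these names for review, a simple word list gets you most of the way. The sketch below is in Python. The `COMMERCIAL_WORDS` list and the `looks_like_keyword_name` helper are my own assumptions for illustration, so tune the words to your niche and treat a match as “look at this one”, not “delete automatically”.

```python
import re

# Hypothetical word list; a real one would be tuned to the spam you actually see.
COMMERCIAL_WORDS = {
    "affordable", "cheap", "buy", "loan", "loans",
    "development", "services", "seo", "insurance",
}


def looks_like_keyword_name(author_name):
    """Return True if the commenter name contains a commercial keyword."""
    words = re.findall(r"[a-z]+", author_name.lower())
    return any(word in COMMERCIAL_WORDS for word in words)


names = ["Affordable WordPress Development", "Auto Car Loan", "Max", "Oliver Dale"]
flagged = [name for name in names if looks_like_keyword_name(name)]
print(flagged)
```

A word list like this will produce false positives, which is one reason to keep the final decision manual.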

Fake Praise

There were a lot of comments with fake praise about the site which you could tell were automated, things like: “Excellent site you have here.. It’s hard to find good quality writing like yours these days. I really appreciate individuals like you! Take care!!” These were easy to spot, and the username linked to some irrelevant website. All deleted.

Reporting Non-Existent Errors

Similar to the fake praise comments, these reported issues with the site, such as the RSS feed not working or browser errors, but I could tell they were also auto-posted.
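I spotted these by eye, but auto-posted comments tend to reuse the same text, so one way to surface them in a script is to look for bodies that repeat. This is a different technique from my manual check, sketched here in Python under the assumption that you have the comment bodies as plain strings. The `normalize` and `repeated_bodies` helpers and the sample comments are made up for illustration.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase and drop punctuation so near-identical copies compare equal."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def repeated_bodies(bodies, min_copies=2):
    """Return the set of normalized bodies that appear at least min_copies times."""
    counts = Counter(normalize(body) for body in bodies)
    return {body for body, n in counts.items() if n >= min_copies}


# Hypothetical comment bodies, for illustration only.
sample = [
    "Excellent site you have here.. Take care!!",
    "excellent site you have here. Take care!",
    "Your RSS feed is not working in my browser.",
    "Thanks, the section on trackbacks was useful.",
]

repeats = repeated_bodies(sample)
print(repeats)
```

Anything that shows up in `repeats` is worth a look, although a genuine “Thanks!” can repeat too.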

Comments with Links

There were quite a few comments with links to various sites which had got through Akismet. Links to payday loans, pills and porn were all present – the worst kind of neighborhoods to link to. There were also a lot with signature links like you would see on a forum, and I removed these as well.
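A rough way to find link-heavy comments in a script is to count the URLs in each body. The Python sketch below does only that. The regular expression is deliberately simple, the `max_links` threshold is an assumption of mine, and it says nothing about whether a link is good or bad, so you would still need to read the results.

```python
import re

# Simple pattern for http/https URLs; not a full URL parser.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_links(text):
    """Return every http(s) URL found in the comment body."""
    return URL_PATTERN.findall(text)


def too_many_links(text, max_links=1):
    """Return True if the body contains more links than max_links."""
    return len(extract_links(text)) > max_links


# Hypothetical comment bodies, for illustration only.
spammy = "Nice post http://a.example and https://b.example/page"
helpful = "The docs are at https://example.com/docs"

print(too_many_links(spammy), too_many_links(helpful))
```

WordPress has a similar built-in setting under Settings > Discussion that holds a comment for moderation when it contains a given number of links.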

Add-In Links

A lot of people with their own products, plugins etc had dropped their link into the comments of reviews or roundups of other products. I assessed these on a case-by-case basis: if I thought they were a good resource which added to the post, I left them. Other people were more sneaky, adding comments to reviews saying the product mentioned was rubbish, a scam, or that support was bad, but looking at the user’s email or website I could see these were from competitors – deleted!

Short Comments & Bad Grammar

This was another one I had to think about. Short comments like “Nice Post” or “Great Roundup” may have come from genuine readers, but I thought they didn’t add anything to the site, and they could have been used to get a first comment approved so the commenter could then spam in further comments. I decided to go ahead and remove these, as well as ones with terrible grammar. The bad grammar comments were not malicious but, again, didn’t add anything to the site, as a lot of the time they were gibberish.
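Short comments are the easiest of these to find with a script, since a word count is enough. Here is a minimal Python sketch. The `min_words` threshold of 4 is an arbitrary assumption, and judging grammar is left to a human.

```python
def is_too_short(text, min_words=4):
    """Return True if the comment body has fewer than min_words words."""
    return len(text.split()) < min_words


# Hypothetical comment bodies, for illustration only.
bodies = [
    "Nice Post",
    "Great Roundup",
    "The second plugin in the list no longer works with the current release.",
]

short_ones = [body for body in bodies if is_too_short(body)]
print(short_ones)
```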


One thing I noticed which was interesting was that when I used Disqus on the site for a period on our old design, the number of spam comments dropped by a huge amount. I enjoyed that period, breezing through pages of comments with only a few spam ones was a nice break!

As I mentioned earlier, pings have been disabled now. They used to be a way for bloggers to post follow-up articles on their own site and let the original site and its commenters know about that content. The blogging world has changed a lot since then, and I think nowadays they are pretty much irrelevant for most sites.

So now that I have the comments cleaned up, my thoughts are going to turn to how to prevent this from happening in future. I don’t want to implement a Captcha, as I personally hate them and know that it will deter people from using the comments. I did see an interesting piece of code posted the other day, “How I Stopped WordPress Comment Spam”, which I’m going to look into. It looks like it will stop automated spam but won’t deter manual spammers. I could use the Jetpack plugin to handle comments, but I’m not keen on how it looks, and there is also the option to move back to Disqus or Livefyre, but again, I’m not a huge fan of those either.

Requiring people to log in with Facebook or Twitter is another option, but not everyone uses those, so it could also deter comments.

Whether or not this will have any effect on Google’s view of the site remains to be seen but I can only see it being a positive in their eyes.

What do you think? What steps do you use to prevent spam in your comments?



Oliver Dale is the founder of Kooc Media, An Internet Company based in Manchester, UK. I founded WPLift and ThemeFurnace, find out more on my Personal Blog. Thanks!


14 thoughts on “Lessons Learned from Manually Assessing 8000 Blog Comments & Deleting Half of Them”

  1. Oli, as someone without a blog, (but who intends to create one in the near future,) I had no idea there were the number and variety of comment/ping fraud techniques in use out there. Very enlightening. Oh, and if this comment is too shallow and vapid, feel free to delete it. I have to go back in your archives to see why you believe Disqus and Livefyre suck, since I use those channels on other sites.

    In any case, terrific post.

    ps: In “Add-in Links” paragraph, the “there” in first line should be “their.” Typo.

    • Thanks Max – corrected :)

      I don’t necessarily think Disqus and Livefyre suck, I just don’t like the look of them – they seem a bit bloated, and I prefer the simplicity of built-in WordPress HTML comments.
      As I mentioned Disqus did a very good job on this blog of preventing spam so if they work for you go for it!

  2. Sad to hear you got a minor slap, and glad you didn’t disable comments.
    I have had the same thoughts about removing comments, but I never thought about just disabling pings and trackbacks, and I’m not sure about your statement that they are hardly used anymore.
    I have stopped using Akismet to handle spam, after reading a post on trafficgenerationcafe.com about how you can be marked as a spammer without knowing it.
    I’m in doubt if I should put the link to the post, but it’s worth a read, so here is the link, you are free to remove it.
    She recommends G.A.S.P. anti spam plugin from the repo. I use Spam Free WordPress also from the repo, they sort all robot comments very well.
    Thanks for sharing!

    • Interesting read, ironically your comment was caught and flagged by Akismet – I had to manually approve it.

      I am definitely going to be looking into this issue more as Akismet seems less than ideal at the moment.

  3. Very interesting read. How do you know the decrease in traffic you saw was a result of the Google update (and not some other reason say)? I.e. How did you come to the conclusion that you’d been hit by Panda 4.0?

  4. I must admit to being quite happy with Akismet, although I hadn’t realized it could result in me being labeled a spammer, nor do I fully understand how that can work.

    I don’t get too many comments on my blog, I think the total is about 3000 in all for four years. I decided right at the start, though, to manually approve all comments. This means a little more work and, of course, there is a delay in the comments appearing, but it works far better than any other method. Disqus and so on? Not only bulky but off-site, and I’d rather have my comments here, on my site, where I can see and control them properly.

  5. One way to avoid all those spammy comments is to use another plugin like Captcha that would add a basic maths calculation (like 2 + 3 = ? ) field to your comments form.

    This way spammy robots won’t be able to post comments anymore. The only comments left in the spam queue will be from real humans who have successfully filled in the comment form but have been flagged by Akismet as spammers by mistake, like Govertz’s comment above. In this case, you’ll have to manually approve it. But that’s something that happens very rarely.

  6. It’s pretty strange how it’s possible that a new e-mail address is getting caught by Akismet… anyway Akismet is still better than getting a lot of spam :)

    I think you did something wrong: if your comment list becomes very long because Akismet wasn’t used, you can enable Akismet just for the spam check. In the comment list there is an option to re-check for spam. I also use that one for false positives, days after these false spam messages arrive.

    Anyway it’s not good for hosting performance to process too much spam comments at the same time. If you don’t like to use Akismet, you can filter some bots using these filters:
    The Bad Behavior plugin is also a great help. The best way to fight comment spam is to keep these bots out!

  7. Luckily the first Panda updates had been and gone by the time I started my website, but I would guess that 80% of comments I receive are spam.

    I have never authorized a comment with a keyword as the name & often (probably) even mark legitimate comments as spam if they don’t offer anything to the conversation.

    It’s a shame you have had a Panda knock as your website is one of the best in the WordPress niche.

  8. I run a site with a reasonable amount of traffic (about 8-10k views per day) but the comments are the lifeblood of the site. It is not uncommon for single posts to get over 500 comments and managing them could be a real problem.

    Have been through most of the solutions out there including dabbling with Disqus (big mistake)

    Absolutely agree that captchas, social logins and other similar solutions are not the way to go either, and I even get lots of useful comments on posts that are quite old.

    One solution I really liked was Anti Spam Bee, very good. Akismet was absolutely rubbish, far too many false positives.

    But in addition to the usual hacks like removing website from the comment field the one I am currently happy with is WP-Spam Shield. Also have the whitelist and auto blacklist plugins although not sure if the whitelist is 100% effective because I still get a handful of false positives.

    I also use a comment editing plugin that can sometimes fall foul of spam filters because if the commenter submits the edit very soon after the original it is classed as spam

    I think an effective whitelist plugin would be excellent, not found one that is 100% yet though

