Content banning is one of our AntiSpamRecommendations.
Use a blacklist to ban spammy content from your wiki. Good wiki software includes a blacklist file of regular expressions that edit content is matched against. If the contents of a page match any of these regular expressions after a user edits it, the edit is blocked and the user is shown a message. This method has proven to be more effective than IP banning at reducing spam.
Normally the same feature can be used to ban rude words, although in some software, the filter is only applied to external link URLs.
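As a rough illustration of the mechanism described above, here is a minimal sketch in Python. It is not taken from any particular wiki engine: the blacklist entries, the comment convention (lines starting with `#`), and the names `load_patterns` and `find_banned` are all assumptions made for the example. The `urls_only` flag imitates software that only filters external link URLs.

```python
import re

# Hypothetical blacklist contents: one regular expression per line.
# Treating '#' lines as comments is an assumption of this sketch.
BLACKLIST_TEXT = """\
# example entries only
cheap-pills\\.example
casino-[a-z]+\\.example
"""

# Rough pattern for picking external link URLs out of page text.
URL_RE = re.compile(r"https?://[^\s<>\]]+", re.IGNORECASE)


def load_patterns(text):
    """Compile every non-empty, non-comment line of the blacklist."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        patterns.append(re.compile(line, re.IGNORECASE))
    return patterns


def find_banned(page_text, patterns, urls_only=False):
    """Return the first blacklist expression that matches, or None.

    With urls_only=True, only external link URLs are checked, so plain
    words (rude or otherwise) in the page text are never blocked.
    """
    targets = URL_RE.findall(page_text) if urls_only else [page_text]
    for pattern in patterns:
        for target in targets:
            if pattern.search(target):
                return pattern.pattern
    return None


patterns = load_patterns(BLACKLIST_TEXT)
```

A save handler would call `find_banned` on the submitted page text and, if it returns an expression, reject the edit with a message instead of storing the page.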
The best thing is to automate this process so that your wiki is always protected against the latest spammers. Such automation may be available as a built-in feature of your wiki software. If it is not, you could try using a cron job.
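If your software has no built-in updater, a cron job could run a small script along the following lines. This is only a sketch under stated assumptions: the function names are invented for the example, the blacklist is assumed to be a plain text file with one regular expression per line, and the URL and destination path would have to be supplied for your own setup. The script checks that every expression compiles before swapping the file in, so a broken download does not replace a working blacklist.

```python
import os
import re
import tempfile
import urllib.request


def validate(text):
    """Return the non-comment lines; raise re.error if one fails to compile."""
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    for ln in lines:
        re.compile(ln)
    return lines


def install_blacklist(text, dest_path):
    """Validate text, then atomically replace dest_path with it.

    Returns the number of expressions installed.
    """
    patterns = validate(text)
    if not patterns:
        raise ValueError("refusing to install an empty blacklist")
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # Write to a temporary file in the same directory, then rename it,
    # so the wiki never reads a half-written blacklist.
    # Note: mkstemp creates the file readable by its owner only; adjust
    # permissions if the wiki runs as a different user.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".blacklist-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
        os.replace(tmp_path, dest_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return len(patterns)


def update_from_url(url, dest_path, timeout=30):
    """Download the shared blacklist from url (caller-supplied) and install it."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        text = response.read().decode("utf-8")
    return install_blacklist(text, dest_path)
```

A script ending with a call to `update_from_url` could then be scheduled from cron, for example once an hour. How often to run it is a judgment call that depends on how quickly the shared list changes.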
We recommend that administrators use the content banning features of their wiki software and set up automatic updates. We also recommend that all wiki developers provide this as a built-in feature of their software.
False positives could occur if a user wanted to talk about spammers and, in doing so, innocently tried to link to a spammer's website. In this case, there is no harm in preventing the edit, since we don't want to link to these sites, even within such discussions. It would be a mistake, however, to ban such a user from the wiki in these circumstances, and so we don't recommend using automatic IP banning with the blacklist.
Other false positives could occur if a mistake is made in defining a regular expression. Take care not to create an expression that matches legitimate edits.
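One way to catch such mistakes is to try a candidate expression against some known-good text before adding it to the blacklist. The sketch below is an assumption-laden illustration: the helper name `false_positives` and the sample sentences are made up for the example. It shows how a short, unanchored expression can match inside an innocent word, and how word boundaries narrow it.

```python
import re


def false_positives(pattern, legitimate_samples):
    """Return the legitimate samples that the candidate pattern would block."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [s for s in legitimate_samples if regex.search(s)]


# Hypothetical samples of text that must remain editable.
samples = [
    "The specialist recommended a new treatment",
    "Our meeting notes are on the front page",
]

too_broad = "cialis"        # also matches inside the word "specialist"
narrower = r"\bcialis\b"    # only matches the word on its own

blocked_by_broad = false_positives(too_broad, samples)
blocked_by_narrow = false_positives(narrower, samples)
```

Here `blocked_by_broad` contains the sentence about the specialist, while `blocked_by_narrow` is empty. The same kind of check helps with unescaped dots, since `.` in a regular expression matches any character rather than a literal full stop.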
All of this would be useful. But I guess it cannot be done with our current approach, which simply lets you grab a file filled with regular expressions. The file would become much larger and much more difficult to parse. Of course, another approach is something like a DNS-based blacklist. Although it would be ideal, this is way beyond my technical abilities and the limits of my hosting account. – Manni
This French chap seems to have a similar idea. He has laid out a proposed format (described in French) for storing tracking information about a blacklist entry. Ultimately I suppose this information would allow people to build a kind of trust network, where changes are propagated a lot like DNS, as you say. – Halz - 2005-04-25 13:21 UTC
That would increase the size of the blacklist a lot. Our list is already large. The simple format we use now is very easy to use and already implemented in several different wikis. More information would be good, but who is really going to use it? Some people may want to expire URLs that the spammers no longer spam for, but that requires us to keep updating records for URLs that are already in the database. We only see a small fraction of all spam at any one time, so doing that would be very hard. If we integrated WikiMinion's data in some way to keep the last-seen info fresh, it might be better. But eventually spammers will learn to hide from WikiMinion-protected sites, so it wouldn't be effective against all spammers for long. – Joe - 2005-04-25 19:26 UTC
Yeah, the mechanism is almost exactly the same, so I don't think we need a separate page detailing link banning, but then again… calling it 'content banning' in this case is a little misleading. – Halz - 2005-04-25 13:21 UTC