MediaWiki is the engine behind Wikipedia, the largest of all wikis, so it is reasonable to assume that it is powerful and scalable software. It uses PHP with a MySQL database for storage.
Homepage: http://www.mediawiki.org/
Even if you are only interested in having a small number of editable pages, remember that your wiki can have other pages, and these pages can be home to spam links. It's not enough to just clean spam from your main pages. Keep an eye on your 'Recent Changes'.
MediaWiki comes with a lot of default links to help you build a skeleton wiki community across many pages. But many people don't really make use of these pages, so spammers target them. We also recommend that you protect the pages linked from the sidebar and footer, such as 'About' and 'Community Portal', especially if you are not going to use them regularly. See MediaWikiDefaultPagesSpam.
MediaWiki allows some HTML tags, and the 'style' attribute. Some sneaky spammers use this to make their links invisible with CSS tricks. The links are not invisible in the edit box or in your diff displays, so keeping an eye on 'Recent Changes' allows you to spot this. If you want to prevent the trick, you can use the wgSpamRegex or wgSpamBlacklist variable to block the 'style' attribute. See CSSHiddenSpam for how.
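To illustrate the idea of blocking the 'style' attribute with a regular expression, here is a minimal sketch in Python. It is not MediaWiki code (MediaWiki is PHP and takes its pattern from its own configuration); the pattern and the function name `has_inline_style` are assumptions made up for this example, and CSSHiddenSpam remains the place to look for the actual setting.

```python
import re

# Hypothetical pattern: flags any tag that carries an inline style attribute.
# This only illustrates the matching idea, not MediaWiki's own configuration.
STYLE_ATTR = re.compile(r"<[^>]*\bstyle\s*=", re.IGNORECASE)


def has_inline_style(wikitext: str) -> bool:
    """Return True if any tag in the submitted text has a style attribute."""
    return STYLE_ATTR.search(wikitext) is not None
```

A save filter built on a check like this would reject an edit containing `<div style="display:none">` while leaving ordinary prose that merely mentions the word "style" alone. Blocking the attribute outright also stops legitimate styling, which is the trade-off of this approach.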
See MediaWiki Anti-spam_Features (That page has now been populated with some useful information)
ContentBanning in MediaWiki is done using a single regular expression in the variable $wgSpamBlacklist, but we recommend you also install the Spam Blacklist extension, which lets you maintain the list of regular expressions with one domain name per line (a more sensible, readable format). You can also keep the list as an actual page of your wiki. If you do this, you should make it a 'protected' page. You can allow other users to edit it by promoting trusted users to sysop status.
We also recommend getting blacklist updates (preferably automatic) from shared blacklists. The chongqed.org blacklist is available in a specially modified format for MediaWiki here: http://blacklist.chongqed.org/mediawiki/ By default the extension gets the metawiki blacklist maintained at the MediaWiki site. You may find that our list is just too big for your site (possibly due to memory limitations). If so, the extension fails and you will have no protection from it, even if you load two blacklists. If it appears not to be working, check your errors_log file in the MediaWiki root folder.
As of version 1.4.1, MediaWiki has ProxyBlocking. The idea is to prevent the use of open proxies. Many spammers use open proxies to obscure their identity and to avoid IP address bans. Open proxies let them access a wiki, and make edits, from many different IP addresses.
MediaWiki uses the rel=nofollow link attribute. By default, it is put on all external links, plus log and history pages. See NoIndexHistory. Note that putting it on all external links is a rather heavy-handed anti-spam tactic, which you may decide not to use (switch off the rel=nofollow option). It's good to see this as the installation default though. It means that lazy administrators who are not thinking about spam problems will tend to have this option enabled.
There are also a few CAPTCHA extensions available for MediaWiki, though CAPTCHAs are not a very good solution. They can be beaten and are annoying to regular users, though you can configure the CAPTCHAs to trigger only for unregistered or new users so that regular users are not affected.
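For readers unsure what the option changes in the rendered page, here is a small Python sketch of an external link with and without the attribute. It is an illustration only, not MediaWiki's rendering code; the function `render_external_link` and its `nofollow` flag are hypothetical names invented for this example.

```python
from html import escape


def render_external_link(url: str, label: str, nofollow: bool = True) -> str:
    """Render an external link, adding rel="nofollow" when the flag is on."""
    rel = ' rel="nofollow"' if nofollow else ""
    return f'<a href="{escape(url, quote=True)}"{rel}>{escape(label)}</a>'
```

With the flag on, search engines that honour the attribute give the linked site no ranking credit, which removes much of the incentive for link spam. With it off, every external link on the wiki passes credit, including any spam that slips through.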
See also the general AntiSpamRecommendations.
Ah good news. Someone who knows what they are talking about has just made some updates to that page – Halz - 2005-05-25 07:50 UTC
Some more serious anti-spam discussion there now, which is good to see – Halz - 2005-06-09 10:02 UTC
Any clue if the chongqed blacklist format works with MediaWiki? The format is very similar, but we also check for https?:\/\/([^\/]*\.)?, which makes our matches more specific. Using ours a spammer can post his links as long as the http:// is left off (which of course does them no good). I just wonder if somehow that could cause our list not to work for them. – Joe - 2005-04-30 09:40 UTC
Joe, according to the readme for the MediaWiki spam blacklist extension, internally a single giant regular expression is formed using the lines from the blacklist file as follows:
!http://[a-z0-9\-.]*(line 1|line 2|line 3|....)!Si
This makes it incompatible with the chongqed blacklist. – RichardP - 2005-04-30 17:18 UTC
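The incompatibility can be reproduced with a short Python sketch. It mirrors the pattern quoted from the readme, but it is not the extension's own code: the function name `build_blacklist_regex`, the skipping of blank lines, and the example domains are all assumptions made for illustration.

```python
import re


def build_blacklist_regex(lines):
    """Join blacklist lines into one alternation, following the readme's pattern."""
    # Assumption: blank lines are skipped; the extension's real parsing may differ.
    entries = [line.strip() for line in lines if line.strip()]
    pattern = r"http://[a-z0-9\-.]*(" + "|".join(entries) + ")"
    return re.compile(pattern, re.IGNORECASE)


# A MediaWiki-style list: one bare domain regex per line.
plain = build_blacklist_regex([r"spam-pills\.example", r"casino\.example"])

# A chongqed-style line already carries its own URL prefix, so the combined
# expression would need the scheme to appear twice before it could match.
prefixed = build_blacklist_regex([r"https?:\/\/([^\/]*\.)?casino\.example"])

spam_url = "see http://www.casino.example/win"
plain_hit = plain.search(spam_url) is not None
prefixed_hit = prefixed.search(spam_url) is not None
```

Here `plain_hit` is true and `prefixed_hit` is false: the prefixed list compiles without error but silently fails to match an ordinary spam URL, so the symptom is missing protection rather than a visible failure.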
It should be easy to create a MediaWiki compatible version then, all the URLs are stored in the DB without the http:// anyway. But the question is will anyone use it? MediaWiki is popular, but this is an extension so fewer people will use it and they already have blacklists. Obviously no one has tried it so far. – Joe - 2005-04-30 17:38 UTC
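One way to produce such a compatible version, sketched in Python, is to strip the chongqed prefix from each line. This is an assumption about how a conversion could work, not a description of how the MediaWiki list is actually generated (the comment above suggests building it directly from the database); `to_mediawiki_line` is a hypothetical helper name.

```python
# The URL prefix that chongqed blacklist lines carry, as quoted in the discussion.
CHONGQED_PREFIX = r"https?:\/\/([^\/]*\.)?"


def to_mediawiki_line(line: str) -> str:
    """Drop the chongqed URL prefix so only the bare domain regex remains."""
    line = line.strip()
    if line.startswith(CHONGQED_PREFIX):
        return line[len(CHONGQED_PREFIX):]
    return line  # already in the bare, one-domain-per-line form
```

For example, `to_mediawiki_line(r"https?:\/\/([^\/]*\.)?casino\.example")` yields `casino\.example`, which can then be appended to a MediaWiki-style list.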
… And it doesn't sound like many folks would want to use this extension. One giant regular expression doesn't sound like a very good idea to me. – Of course, it would be trivial to have a script that returns the known spamvertized domains as one big beast that will stress test your perl interpreter. – Manni - 2005-05-03 20:31
Good point, only they would be stressing their PHP interpreter. I guess they aren't thinking of extremely large blacklists. – Joe - 2005-05-03 18:57 UTC
New information. The $wgSpamBlacklist variable is in the default install, and follows this approach of sticking everything in one expression. Seems like a bad approach to me. I think the extension makes the list more useable, easier to maintain, but ends up joining everything into that variable again anyway. I wonder if there's an upper limit on the size of that expression. The metawiki blacklist has 1000+ entries. Pretty big, so maybe it works fine. – Halz - 2005-05-25 07:50 UTC
The SpamBlacklist extension readme says:
This extension uses a small "loader" file, to avoid loading all the code on every page view. This means that page view performance will not be affected even if you are not running a PHP bytecode cache such as Turck MMCache. Note that a bytecode cache is strongly recommended for any MediaWiki installation. The regex match itself generally adds an insignificant overhead to page saves, on the order of 100ms in our experience. However loading the spam file from disk or the database, and constructing the regex, may take a significant amount of time depending on your hardware. If you find that enabling this extension slows down saves excessively, try installing MemCached or another supported data caching solution. The SpamBlacklist extension will cache the constructed regex if such a system is present.
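The caching idea in the readme can be sketched in Python with an in-process cache. This is only a stand-in under stated assumptions: the real extension is PHP and relies on MemCached or another supported cache, and `compiled_blacklist` is a hypothetical name.

```python
import re
from functools import lru_cache


@lru_cache(maxsize=1)
def compiled_blacklist(entries: tuple) -> re.Pattern:
    """Build the combined regex once and reuse it while the entries are unchanged."""
    pattern = r"http://[a-z0-9\-.]*(" + "|".join(entries) + ")"
    return re.compile(pattern, re.IGNORECASE)
```

The entries are passed as a tuple so they can serve as the cache key. A second call with the same tuple returns the already constructed object, so the expensive join-and-compile step is paid once rather than on every page save.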
A MediaWiki version of our blacklist is again online and can be found here: http://blacklist.chongqed.org/mediawiki/ – Manni - 2006-01-13 10:09