SpamCatching Module
This is the OddMuse module that protects the chongqed wiki from spammers. Manni has made it available to all OddMuse users. More information can be found on its OddMuse module page.
Its main feature is to use our chongqed blacklist to prevent known spam from being put on your wiki. It captures all attempts and puts them on a CaughtSpam page. It also includes some other spam prevention features.
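As a rough illustration of the blacklist check described above, here is a minimal Python sketch. It is not the module's actual code (OddMuse modules are written in Perl). The pattern list, the `find_blacklisted_urls` and `try_save` helpers, and the `caught_spam` list are all hypothetical names standing in for the chongqed blacklist and the CaughtSpam page.

```python
import re

# Hypothetical blacklist entries; the real chongqed blacklist is not reproduced here.
BLACKLIST_PATTERNS = [r"cheap-pills\.example", r"casino-[a-z]+\.example"]

# Simple URL matcher, good enough for a sketch.
URL_RE = re.compile(r"https?://[^\s<>\"\]]+", re.IGNORECASE)


def find_blacklisted_urls(text, patterns=BLACKLIST_PATTERNS):
    """Return the URLs in text that match any blacklist pattern."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return [url for url in URL_RE.findall(text)
            if any(rx.search(url) for rx in compiled)]


def try_save(text, caught_spam):
    """Reject the edit and record the attempt if it contains blacklisted URLs."""
    hits = find_blacklisted_urls(text)
    if hits:
        # Stand-in for writing the attempt to the CaughtSpam page.
        caught_spam.append({"text": text, "urls": hits})
        return False  # the edit is not saved to the wiki
    return True
```

An edit with no blacklisted URLs passes through unchanged. Anything that matches is kept out of the wiki page and recorded for later review.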
Discussion
You should suggest that the admin create an AntiSpamDan (or whatever user name they choose) page explaining that it is an automated spam protection user.
It should be possible to disable the cookie. Some people are still scared of cookies, and some countries require a notice before a site uses a cookie.
- m: OK. That should be quite easy to do.
The warning you give to spammers says: "The spam will end up in the chongqed.org database." That would only happen if they reported it since we can't go out to every wiki that installs your module.
- m: Since we caught the spam, the spam already is in our db. Scaring the spammers a little was all I intended with that sentence.
Also, you should add some text to mention that CaughtSpam will be reviewed and what to do if you are not a spammer. Not sure what to say though. The admin can review the CaughtSpam, but it would be a pain to make those edits if someone did get caught accidentally. Since we also handle blocking on BannedContent, false positives may go up. We are careful with the BlackList but obviously have no control over that.
- m: Right. I could include another variable that can contain some extra text for this and other stuff.
Does this do anything with the cookie? It doesn't look like it, but I just wanted to be sure that using the cookie redirect is up to the wiki admin. You could look up the cookie on each new post (or even on a return visit, as we do) and automatically CaughtSpam it even if the URLs are not in the blacklist yet.
- m: No. The cookie is only set. That's it.
Also, if you do that or use the cookie somehow, you need to explain how to get things back to normal in case someone accidentally gets the cookie, as I have done a few times.
Also, you aren't including the crash IE extra save button, right? On other wikis it's more likely someone will be caught accidentally.
- m: Correct. That's why I removed it. It also isn't used here anymore. The cookie redirect should be enough I guess.
For sites that see a ton of spam being posted, the CaughtSpam page may get really long, and the data for each spammer could waste a lot of space. You should give info on how to clean it up.
- m: Correct. I'll add a note to the description and maybe I can make that more comfortable in a future version.
The CaughtSpam detail pages should be noindexed. For us, indexed spam is fine because we want to attract spammer idiots, but for other wikis indexing the spam would attract unwanted spammers.
- m: Good point. But I don't want to have two different versions (think maintenance). So I will either have to disable indexing here or enable it elsewhere. What do you suggest?
- j: What about a preference variable that defaults to noindex? Having noindex on ours wouldn't be a big deal though. We attract a good number of spammers anyway.
- m: Turns out that this isn't really a problem. The index page is set to INDEX, but the single spam pages get a NOINDEX anyway.
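The behaviour discussed in this thread (index page indexed, single spam pages not, with an optional preference) could be sketched roughly as follows in Python. This is not how the module is written: `robots_meta` and the `index_spam_pages` preference are hypothetical names, with the preference defaulting to noindex as j suggested.

```python
def robots_meta(page_kind, index_spam_pages=False):
    """Return a robots meta tag for a CaughtSpam page.

    page_kind: "index" for the overview page, "detail" for a single spam page.
    index_spam_pages: hypothetical preference; when False (the default),
    single spam pages are kept out of search engine indexes.
    """
    if page_kind == "detail" and not index_spam_pages:
        return '<meta name="robots" content="NOINDEX">'
    return '<meta name="robots" content="INDEX">'
```

A wiki like chongqed that wants its caught spam indexed would set the preference to true. Everyone else keeps the default.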
What happened to editing the CaughtSpam page automatically? That would be nicer, but not totally necessary. At most each diff would be one line, so it's not a big improvement. People will just have to get used to it as we have.
- m: I will look into this, but it's going to be a little trickier than what I currently do.
- j: Since only one line is ever going to change, it's less important than I originally thought, but it's annoying to hit diff on the page that just changed and get a revision that was edited months ago and has nothing to do with the spam. Probably more trouble than it's worth unless lots of users complain. Diffs really won't do much good on a wiki that gets a lot of spam attacks. Even here you often get two or three hits a day and would miss them if you only looked at the latest diff once a day. Maybe it makes sense to leave it as it is.
You say to include the ?action=viewcaughtspam;file=index, but does the admin have to turn on an option or something for the include first? I would think that is a security hole if not.
- m: I looked that up and the transclusion docs don't mention anything to that effect.
- j: That seems like a security problem to me. Could the include be used to insert the wiki password file or, worse, something outside the Oddmuse directory?
- m: No. Nothing to worry about. Transclusion only does an HTTP request. If you can transclude stuff, you can also get it directly into your browser. That's why the webserver shows up on the list of recent visitors.
It should be pretty easy to implement a shotgun filter if there aren't any existing modules for OddMuse that do that.
- m: A shotgun filter?
- j: Have you not read WikiSpam recently? See ShotGunSpam and AutoBan on that page. AutoBan is a good bit more complicated but we already have the cookie so something similar could be done using the cookie. You would have to differentiate blacklist caught spam and shotgun caught spam so we would know which need to be chongqed. The shotgun filter should catch much of what the blacklist doesn't.
- m: Yes, I read that. But I didn't connect 'shotgun filter' with 'shotgun spam'. Wouldn't Surge Protection help?
- j: I have always hated that name as a spam filter; it sounds like a type of spam instead of a way of preventing it. Spammers that Surge Protection would catch don't seem to be as much of a problem anymore. Anyway, wouldn't Surge Protection be more complex to implement? You would have to track IPs for a few minutes to identify spammers, and for a few hours once matched to prevent return spams. Though the cookie could be used to handle return visits. And I just realized that, according to our OddMuse page, Surge Protection is built in.
- j: Apparently the built-in Surge Protection is anti-leech, not anti-spam. It should be based on editing rather than page access. Its time interval is far too short and its number of accesses is too high to be useful against most modern spammers.
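To make the idea in this thread concrete, here is a minimal Python sketch of an edit-based surge check: count edits (not page views) per IP over a few minutes, and block a matched IP for a few hours. It is only an illustration under assumed numbers and is not OddMuse's built-in Surge Protection. The `EditSurgeGuard` class and its thresholds are hypothetical.

```python
from collections import defaultdict, deque


class EditSurgeGuard:
    """Track edits per IP and temporarily ban IPs that edit too fast."""

    def __init__(self, max_edits=5, window=300, ban_seconds=3 * 3600):
        self.max_edits = max_edits      # assumed limit of edits per window
        self.window = window            # tracking window in seconds (a few minutes)
        self.ban_seconds = ban_seconds  # ban length in seconds (a few hours)
        self._edits = defaultdict(deque)  # ip -> timestamps of recent edits
        self._banned_until = {}           # ip -> time at which the ban expires

    def allow_edit(self, ip, now):
        """Return True if the edit may proceed, False if it should be blocked."""
        if self._banned_until.get(ip, 0) > now:
            return False
        recent = self._edits[ip]
        # Drop edits that have fallen out of the tracking window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        recent.append(now)
        if len(recent) > self.max_edits:
            # Too many edits in the window: ban the IP for a while.
            self._banned_until[ip] = now + self.ban_seconds
            recent.clear()
            return False
        return True
```

Passing `now` in explicitly (for example `time.time()`) keeps the sketch easy to test. The spam cookie mentioned above could replace or supplement the IP as the key for return visits.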
– Joe - 2005-04-20 17:31 UTC
I didn't realize OddMuse had discussion pages (we don't need them, all our pages are discussion pages). Someone found a possible bug. It's in the BannedContent stuff, which we don't use except for the BackToTheFutureII guy, so you may have missed it. – Joe - 2005-04-20 20:32 UTC
Phew. Not really a bug. – There's an extension for the comments. I gave this one some thought in the past, but we seem to be doing fine without it, just like you said. – Manni - 2005-04-21 01:43