BugsAndSuggestions

This is the place to point out any little bugs or typos you find on the main chongqed.org website, or any little design tweaks you want to suggest. See TechnicalImprovements for more in-depth techy suggestions.

On all the individual spammer pages: It says "These are then responsible to keep all the information up to date". I think this is wrong, but I'm not sure. I think it should say "These are then responsible for keeping all the information up to date". – Halz - 3rd Nov 04

On all the wiki pages the chongqed.org link back to the main website is not very prominent (just on the right of the menu there), but then having the link to HomePage suggests that this is the main chongqed.org homepage (rather than just the wiki HomePage). I think the chongqed.org link should be a big fat one above. – Halz - 3rd Nov 04

Good start, Halz. I've changed the wiki config a bit to have a bold link to the homepage. And I went over that spammer page template and reworked the grammar as to your suggestions. Sounds better now. – Manni - 3rd Nov 04

That whole section of text needs help:

"A regular webpage can be edited by only a few people. These are then responsible for keeping all the information up to date. And of course they are responsible to bring all the information online in the first place."

Would be better as:

"A regular webpage can be edited by only a few people. They are responsible for keeping all the information up to date. And of course they are responsible for putting all the information online in the first place."

The HomePage link has always confused me. I sometimes click it when trying to go to the homepage and I have been here a few times before. How about calling it WikiHome instead?

Joe - 3rd Nov 04

Better now? – Manni

Yep. One more suggestion is to make the chongqed.org link red like it is on the chongqed.org main site. You don't have to move it to the left, that would be odd since that's not the WikiHome, but red makes the other one really noticeable. – Joe

OK. Done that too. The link really stands out now. While I was at it, I also installed a new extension that makes it very easy to leave a signature on discussion pages like this one. Details on the WikiNews page. – Manni - 2004-11-04 11:37


The text appearing after you submit a spammer is confusing. It says "You should now start to plant links to chongqed.org using the spammer's keywords. Here's an other example" …then the example is a link without keywords.

It should say something like "Please wait for your spam submission to be processed. We have to check (manually) that this is really a spammer. Assuming your submission is accepted, a new page describing your spammer will be added to chongqed.org. Then we are ready for you to create chongqing links, using the spammer's favourite keywords. If you can't wait to get involved, you could help to chongq existing spammers in the database! All incoming links will help in the fight against wiki spam" …or something like that. – Halz - 2005-03-03 13:26 UTC

Oh, that old text was pretty … err … old. And outdated. I changed this according to your suggestion. – Manni - 2005-03-03 15:23


Minor bug with the new textbox size Preferences option: If you go there and save changes without bothering to fill in these values (like the lazy bastard that I am), it makes really diddy edit textboxes. You need to make it default to some normal values if the submitted values were blank (or disallow submitting blanks). – Halz - 2005-04-04 10:48 UTC

Thanks, Halz. Should be fixed now. – Manni - 2005-04-04 12:57
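
The fix Halz asks for can be sketched as follows. This is only an illustration, not the code Manni changed: the function name and the fallback values of 20 rows and 80 columns are assumptions.

```python
# Hypothetical fallback values; the wiki's real defaults are not stated on this page.
DEFAULT_ROWS = 20
DEFAULT_COLS = 80


def textbox_size(rows_field: str, cols_field: str) -> tuple[int, int]:
    """Return (rows, cols), using the defaults for blank or unusable input."""

    def parse(value: str, default: int) -> int:
        try:
            number = int(value.strip())
        except ValueError:
            # Blank or non-numeric submission: fall back to the default.
            return default
        # Zero or negative sizes would also give an unusable textbox.
        return number if number > 0 else default

    return parse(rows_field, DEFAULT_ROWS), parse(cols_field, DEFAULT_COLS)
```

Saving the preferences form with both fields empty then yields the default size rather than a tiny textbox.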


Apparently you upgraded the wiki engine recently. Since then links automatically get a space appended after. So [[GhostTown]]s looks like GhostTown s instead of GhostTowns. And links at the end of sentences now have a space before the period. I assume they did this on purpose to make the nonexistent page question marks more visible as not punctuation: GhostTown? . instead of [[GhostTown]]?. But I don't like it. Anything you can do about that? – Joe - 2005-04-05 20:03 UTC

Yuck. I hadn't noticed that. Let me try it: GhostTowns.
OK. That looks evil. I'll see what I can do. – Manni - 2005-04-05 22:36

OK. So that was easy. I had changed oddmuse to use CGI::Pretty which produces a line-break after the </a> tag. I reverted to normal oddmuse and things seem to be ok again. – Manni - 2005-04-05 22:41


Some problems:

404:

http://spammers.chongqed.org/fioricet%20freeservers

http://spammers.chongqed.org/fioricet%206x

http://spammers.chongqed.org/cipro%20and%20bayer

Where did you find those? They aren't in the db and of course they should return a 404.

Google of course. Don't know why they would have disappeared, but they must have existed at some point. Just seems odd.

We should have a custom 404. Put this line in your .htaccess:

This won't do any good because those 404s are triggered when a keyword or spammer isn't found in the db. .htaccess files don't know that much about the db.

That does present a problem. Could still be useful for the main site, but so little content is there that it's not a major issue that it doesn't have a custom 404. Hardly anyone would ever see it.
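
Since the 404 depends on a database lookup, a custom error page for these URLs would have to come from the script itself rather than from .htaccess. A minimal sketch, assuming a plain dictionary stands in for the database and with a made-up function name (the real site code is not shown here):

```python
import html


def keyword_response(keyword, db):
    """Return (status, body) for a keyword page.

    `db` is a stand-in for the real database: a dict mapping a keyword
    to the list of spammer domains using it.
    """
    spammers = db.get(keyword)
    if not spammers:
        # The application knows the keyword is missing, so it can send
        # its own friendly 404 body together with the 404 status.
        body = (
            "<h1>404 Not Found</h1>"
            f"<p>No entry for {html.escape(keyword)} in the database.</p>"
        )
        return 404, body
    items = "".join(f"<li>{html.escape(s)}</li>" for s in spammers)
    return 200, f"<h1>{html.escape(keyword)}</h1><ul>{items}</ul>"
```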

To reduce the number of pages and maybe help our PR, we should noindex the text copy link pages: http://spammers.chongqed.org/wikispammer/autoinsurancemass.uni.cc&showlink=1

Strangely, those already had noindex,nofollow. Don't know why Google lists them anyway.

Probably got them indexed long ago and hasn't revisited since. Which is a good data point to add to our experiment, you can't rely on how often Google will reindex a large site totally.

Even if we don't encourage use of those pages, we should link to them from somewhere so search engines can find them. They are important. And to reduce the insane page size, you could create separate pages for each letter rather than everything on one page.

That's an idea, but it requires some programming. Don't know when I will have time for that.

No hurry. Our PageRank sucks so much right now it won't make much difference.
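
The per-letter split suggested above could look roughly like this. It is a sketch under assumptions: the function name and the "other" bucket for entries that do not start with a letter are made up for illustration.

```python
from collections import defaultdict


def group_by_first_letter(names):
    """Group names into one bucket per initial letter, case-insensitively."""
    pages = defaultdict(list)
    for name in sorted(names, key=str.lower):
        if not name:
            continue  # skip empty entries
        first = name[0].lower()
        # Digits and punctuation share one bucket (an assumption).
        key = first if first.isascii() and first.isalpha() else "other"
        pages[key].append(name)
    return dict(pages)
```

Each key would then become its own, much smaller page instead of one page listing everything.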

What happened to the old cut and paste chongqing pages? I don't find them linked from anywhere anymore.

I got rid of the links because nobody was using them and I wanted to reduce the internal links on chongqed.org. Not sure whether I should put them back.

Good point, but as long as they are noindex,nofollow it shouldn't hurt as far as Google is concerned.

Joe - 2005-04-16 00:25 UTC

I checked on how the CaughtSpam spam views were working now with this Google search. There is something weird going on. Do they expire eventually or did you change the numbering system at some point? That is an amazing number of missing caught spams. Result 2 and 3 are still cached and show what used to be there. They don't exist anymore on chongqed.org.

You said you couldn't put out a 404 on missing ones, but could you do a domain filter to limit access only to the webserver so it can serve the pages to the wiki. No need for anything else to access these directly. I can't figure out where Google is getting these URLs from.

Joe - 2005-04-22 08:29 UTC

At least one of the links that don't work as listed on Google does work:

Google has: http://wiki.chongqed.org/?action=viewcaughtspam;file=1103680691

That spam does exist because it is used on the CaughtSpam page as: http://chongqed.org/cgi-bin/view-spam.pl?mode=wiki&single=1103680691

Looks like some of the stuff you have done has helped so far. We have dropped from about 32,000 pages indexed by Google down to 31,900. It's a good start. We just have to wait till they notice all the 404s and noindexes.

Joe - 2005-04-22 08:52 UTC

Duh! The caught spams you are talking about are the ones that were served up (and caught) by the old script. Since I changed to the spam catching module, the caught spam is saved in a different directory and I purged the old directory. That's why they are missing. The module will still find them though. I now have deleted the old script and Google (and everyone else) will see a nice 404. – Manni - 2005-04-26 11:03

The wiki search results pages could use noindex. There is no reason they should be indexed and they look weird in search engines. Yahoo (I think) and Gigablast are listing them somehow. I don't know where they are getting the search from though. – Joe - 2005-04-27 22:57 UTC

Yuck. Must have been a referrer link that ended up being crawled or something like that. Fixed now. Thanks. – Manni - 2005-04-28 09:30


That seems like something normal people would expect, maybe you should post it as a patch for inclusion into OddMuse. – Joe - 2005-04-28 07:56 UTC

I'm not using the OddMuse defaults for the robots meta tag. But you're right, I should check what the default behavior is. – Manni - 2005-04-28 11:03


I would like to see a nofollow wiki syntax. It wouldn't be too useful for most people so it's probably not worth making it into actual OddMuse, but it would allow us to discuss and link to a page that we don't want to give PageRank help to. That way the spammers we are discussing will see our link in their referrers (reverse Referrer spam). How about [http://spammyurl.com ~ spammyurl] for nofollow links? – Joe - 2005-04-16 23:33 UTC
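
As a rough illustration of the proposed syntax (not an actual OddMuse extension, which would be written in Perl), the rule could be rendered like this. The regular expression and function name are assumptions based only on the example above.

```python
import html
import re

# Matches "[URL ~ label]"; ordinary "[URL label]" links are left untouched.
NOFOLLOW_LINK = re.compile(r"\[(https?://[^\s\]]+)\s+~\s+([^\]]+)\]")


def render_nofollow_links(text: str) -> str:
    """Turn [URL ~ label] into an anchor carrying rel="nofollow"."""

    def replace(match):
        url = html.escape(match.group(1), quote=True)
        label = html.escape(match.group(2).strip())
        return f'<a href="{url}" rel="nofollow">{label}</a>'

    # Only the matched links are escaped here; a real renderer would
    # escape the surrounding page text as well.
    return NOFOLLOW_LINK.sub(replace, text)
```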


What is this page all about? http://spammers.chongqed.org/info

Guess something went wrong there. It's not very human readable. I found it with a google search actually – Halz - 2005-06-01 10:24 UTC

That is very odd and not at all readable. It seems to be because there are so many entries for the same term. But why they are under the term info alone I don't know, that could be a mistake. – Joe - 2005-06-01 10:54 UTC

Ah yeah, it's when the same keyword (in this case 'info') is used by many spammers. These drug names like cialis are also popular keywords. So we just need some special logic to display the page differently in the case where there's more than say five different spammers using the keyword. – Halz - 2005-06-02 23:17 UTC
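
The special case Halz describes might be sketched like this. The threshold of five comes from the comment above; the function name and the returned structure are hypothetical.

```python
MAX_DETAILED = 5  # "more than say five different spammers"


def keyword_page(keyword, spammers):
    """Choose how to display a keyword page based on how many spammers use it."""
    unique = sorted(set(spammers))
    if len(unique) > MAX_DETAILED:
        # Popular keyword: show a count and a short sample rather than
        # a full entry for every spammer.
        return {
            "keyword": keyword,
            "mode": "summary",
            "count": len(unique),
            "spammers": unique[:MAX_DETAILED],
        }
    return {
        "keyword": keyword,
        "mode": "detailed",
        "count": len(unique),
        "spammers": unique,
    }
```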


Andi emailed me about a DokuWiki user who had a problem linking to Geocities. Currently Geocities is in the DB for the words below. I removed it since there are still many legitimate users.

adipex alprazolam ambien carisoprodol casino cialis fioricet hydrocodone levitra lortab phentermine poker prozac texas holdem tramadol ultram valium vicodin xanax

Joe - 2005-06-02 19:06 UTC

Maybe we need a "refresh the blacklist" button. That would have solved a lot of the >30 edits we got from one idiot spammer today. – Joe - 2005-06-09 06:46 UTC


Interesting little bug, which is maybe one to report to the oddmuse developers. Recently I logged in as 'halz', and it recognised me and gave me access to the PrivateForum etc… but I just noticed I should have logged in as 'Halz' (upper case H) because the links to my user page on RecentChanges were wrong for my previous edits (there is no 'halz' page). Not sure what the correct behaviour should be though, because with usemod you don't log in as such, you just fill in the preferences form. Maybe when I fill in 'halz' it should search the users and then say "welcome back Halz". – Halz - 2005-09-23 12:09 UTC

Oddmuse doesn't really have a login concept. It has user names so you can see who edited what. And it has passwords for editors and admins. But the user names and passwords aren't in any way connected. The right password will give you the correct credentials, no matter what user name you are using.

Guess this is a case of 'behaviour by design'. Sorry. – Manni - 2005-09-23 15:15 UTC

Usernames are not connected to passwords? Weird. So I can login with any username, but if my password is one of the editor passwords, then I can see the PrivateForum. Well you learn something new every day. I thought the whole point of the password was to stop someone making edits with your username. – Halz - 2005-09-23 15:37 UTC

Normally that is what you expect passwords to do, but OddMuse and UseMod don't really have actual user accounts. I really have no idea what the user passwords on them do. I have not seen anything. The editor and admin passwords work for any username you use. MediaWiki and DokuWiki are two that I know can have actual logins, several others seem to also. – Joe - 2005-09-23 19:09 UTC
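
The behaviour described in this thread can be modelled in a few lines. This is not Oddmuse's code (Oddmuse is written in Perl); it is a sketch of the idea that the password alone decides the access level, with made-up names.

```python
def access_level(username, password, editor_passwords, admin_passwords):
    """Return the access level granted by the password alone.

    `username` is deliberately unused: as described above, it only labels
    who made an edit and is not tied to any password.
    """
    if password in admin_passwords:
        return "admin"
    if password in editor_passwords:
        return "editor"
    return "reader"
```

So 'halz' and 'Halz' get the same access with the same password, but the edits are attributed to two different names.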


In blacklist.chongqed.org, the regex still permits writing a web page address without the http:// in front. For example, http://www.some_spam_page.org is blocked but www.some_spam_page.org is not, and some wikis actually turn that into a link. Could you modify your blacklist (regex) to prevent that? Thanks in advance. – patgadget 22 nov 2005

That could be a problem. Manni will have to look into the consequences of doing that or possibly provide an alternate version of the blacklist. I think the reason we did it that way was to allow users to discuss spammer sites as long as they didn't include the http so they would not be a link. – Joe - 2005-11-22 21:41 UTC

I guess we could simply provide an alternate version of the blacklist that will not include the "http://". I never thought about something like that because Oddmuse doesn't autolink to www.something.com. But you are right, other wiki engines do that. Take a look at this list and let us know whether it works for you: http://test.chongqed.org/ – Manni - 2005-11-23 09:52

What I did was modify the list on my own, changing every https?:\/\/ to (https?:\/\/)? so that a match can include the http in front or not. I will test your list at test.chongqed.org, thanks. Patgadget 23 nov 2005
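
Patgadget's change can be tried out as below. The blacklist entry is a made-up example using the domain from this thread; the real blacklist format may differ.

```python
import re


def make_scheme_optional(pattern: str) -> str:
    """Rewrite a blacklist regex so the leading http:// or https:// is optional."""
    return pattern.replace(r"https?:\/\/", r"(https?:\/\/)?")


# Hypothetical blacklist entry, modelled on the example in the discussion.
strict = r"https?:\/\/www\.some_spam_page\.org"
relaxed = make_scheme_optional(strict)

for candidate in ("http://www.some_spam_page.org", "www.some_spam_page.org"):
    print(
        candidate,
        "strict:", bool(re.search(strict, candidate)),
        "relaxed:", bool(re.search(relaxed, candidate)),
    )
```

The trade-off Joe mentions still applies: with the relaxed pattern, a spammer domain can no longer be named in a discussion just by leaving off the http://.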


Manni, I think we have a chongqing form problem. I entered two spammers with a bunch of keywords. For the first one, only the last of those keywords shows up in the DB. The second I watched more carefully. I tried entering:

dorank.com:pagerank main
dorank.com:improve pagerank default
dorank.com:PageRank
dorank.com:google rank
dorank.com:online pr
dorank.com:google pr
No pairs were entered into the DB, but the spammer domain seems to be in there as do the spammer's keywords. No pair comes up in a search, but the usual no match message does not come up either. Dorank was given the number 6802 on the entry confirmation page, its keywords were given 20423 through 20428. Resubmitting gives the same numbers but does not treat them as already being in the DB. Maybe the DB is having trouble.

The other spam I entered was for adultpersonalz.net, its keywords were:

personals webcam
adult personal ads
free personal ads
personal ads with photos
adult personals

Joe - 2005-11-24 05:31 UTC

I got that one fixed. Don't know how that piece of code got in there, but I found three lines that prevented adding spam to the database. I must have put it there a long time ago for some kind of debugging. Don't ask. – Manni - 2005-11-25 10:23


Are diffs on older pages broken with the new OddMuse? I have noticed it a few times recently on pages that clearly have a working revision history. The latest is PrivateGhostTownList: Halz's latest addition has no diff. A couple of days ago I also saw a diff for WikiForum that had mixed Manni's addition with Richard's minor revision. That is new too, I think. I guess that is good since spam can't be hidden as easily. Just seems odd. – Joe - 2005-12-05 20:21 UTC


Can't add subdirectories to DB anymore.

u-blog.net/kuku: rolex replica
toolia2.de/user/wreath7529: wreath
toolia2.de/user/replic8487: replica tiffany jewelry
toolia2.de/user/replic4005: replica coach bag
toolia2.de/user/replic4878: replica coach handbag
toolia2.de/user/replic2010: replica watch
toolia2.de/user/coachr5035: coach replica
toolia2.de/user/rolexr2738: rolex replica
toolia2.de/user/rolexr5296: rolex replica
toolia2.de/user/swissr4690: swiss rolex replica
toolia2.de/user/rolexd1586: rolex daytona replica
toolia2.de/user/replic8697: replica bag
toolia2.de/user/rolexd3582: rolex daytona replica
toolia2.de/user/replic1768: replica coach bag
toolia2.de/user/wreathaquo: wreath
20six.nl/bultr: buy ultram
20six.nl/bvali: buy valium
20six.nl/bviag: buy viagra
20six.nl/bxana: buy xanax
20six.nl/bxeni: buy xenical

Joe - 2006-03-20 01:12 UTC
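
For reference, entries in the "site/path: keyword" form used above could be split as follows. This is a guess at the format based only on the lines on this page, not the submission form's real parser. It splits on the first colon, so it would not cope with a port number in the site part.

```python
def parse_entry(line: str):
    """Split 'domain/path: keyword' into (domain, path, keyword).

    The path is an empty string when the entry is a bare domain.
    """
    target, separator, keyword = line.partition(":")
    target, keyword = target.strip(), keyword.strip()
    if not separator or not target or not keyword:
        raise ValueError(f"not a 'site: keyword' entry: {line!r}")
    domain, _, path = target.partition("/")
    return domain.lower(), path, keyword
```

A subdirectory entry such as u-blog.net/kuku then keeps its path separate from the domain, so both could be stored.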

Manni, there is something strange going on with caching of the pages in FF 1.5. I edited the forum, and then did not see my change until several refreshes and a forced refresh later. What is much worse though, is that if you go to edit the page again, the old version of the edit page is still there. Meaning you would be overwriting your own latest changes.

I also noticed that the spammers domain has a heading problem: "Could not open the menu file" on the main spammer index page.

And I am guessing the maintenance URL has moved again. Let me know where. And presuming I got the pw right, the stats page isn't letting me in.

Joe - 2006-05-19 22:01 UTC

Thanks, Joe. These should all be fixed now. File access rights and a strange .htaccess directive that should have never worked, but did on the old server. As to your pw on the stats page: should be the same as your wiki password. – Manni - 2006-05-20 14:05

I had case problems with both the login and password. I am not used to mixed case passwords and I use a variation of the scheme all in lowercase on other antispam sites so I often forget. I notice the stats from the old server are gone. Not a big deal, just wondering if you knew that. – Joe - 2006-05-20 13:19 UTC

Yes, I knew. Don't know how to merge the old and the new stats and I, too, figured it wasn't a big deal. – Manni - 2006-05-20 16:39


Suggestions:

Your list contains only spam links, but a list of keywords could also be very useful, not least because the list could be shortened. More entries on the list require more checking time. A list of typical, well-known spam and xxx words could increase safety and shorten the list. A lot of new links containing such words would be blocked too. Or does such a bad-word list already exist and I just didn't see it?

(I once sent you a list of shock sites. These were used by spammers who are pissed off at being blocked, or by trolls and vandals. They should also be covered by a wiki's protection.)

+

blogspot.com
is an example. I think it is a never-ending story: you can block 1000 of these links. But if you block only the keyword, you catch them all, and the list becomes much shorter. 2007.05.10

There are others that run keyword blocking lists. It may be more effective against future spam and result in a much smaller list, but it is far more likely to catch things that are not spam. See this site for one such list. That list is pretty good, but you must be careful that it does not contain words that may be used on your wiki. For example, if you ran a medical site, you couldn't block Tramadol, Ambien, Viagra, etc.

There is also Bad Behavior which detects spam by detecting characteristics of the spamming software. It is very effective and unlikely to have false positives, but of course not perfect and some spam makes it through. On a small WordPress blog I run it has blocked 650 spams in the last 7 days. Only about 10 made it through.

There need to be a lot of different approaches or spammers would be able to adapt too easily.

--Joe - 2007-05-13 21:43 UTC

This is true. I sort my list by subject, for example medicine, se*, or words which make no sense in the context of my wiki. The word se* is also a good example of possible errors because it is part of many longer words. I check all words manually before I put them in my blocklist. But it is a very good and strong tool to block spammers, so I use it, and I log all blocked actions to check whether someone was blocked without reason. And I use "Bad Behavior"; I am surprised how many bad accesses visit my wiki. Thanks for responding.

2007-05-14
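
The false-positive problem with short keywords can be reduced by matching whole words only. Below is a minimal sketch of that idea, not any of the lists or tools mentioned above; the function names are made up and the keywords are taken from the Geocities list earlier on this page.

```python
import re


def build_keyword_filter(keywords):
    """Compile one case-insensitive regex that matches whole keywords only."""
    if not keywords:
        raise ValueError("at least one keyword is required")
    # Longest first, so multi-word phrases win over shorter alternatives.
    ordered = sorted(keywords, key=len, reverse=True)
    alternation = "|".join(re.escape(word) for word in ordered)
    # \b keeps 'cialis' from matching inside 'specialist'.
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


def blocked_keywords(text, keyword_filter):
    """Return the distinct blocked keywords found in text, lowercased and sorted."""
    return sorted({m.group(0).lower() for m in keyword_filter.finditer(text)})
```

Logging the result of `blocked_keywords` for every rejected edit, as described above, makes it possible to review whether someone was blocked without reason.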