Opened 8 years ago

Closed 5 years ago

#12947 closed bug (fixed)

Trac won't accept new registrations (classifies as spam)

Reported by: humdinger Owned by: haiku-web
Priority: high Milestone:
Component: Website/Trac Version:
Keywords: Cc:
Blocked By: Blocking:
Platform: All

Description

More and more people on IRC mention that they are unable to register at Trac, because they are rejected as possible spam. This is very bad, as it deters bug reports and patches, and ultimately possible new developers.

I have no idea what's the problem, let alone the solution. I'll post a short discussion by PulkoMandy and Barrett, with Adrien's take on it:

[18:23] <Barrett> <gbl08ma> Barrett: I tried registering with two email addresses, first a Gmail one and after that didn't work, a @tny.im one (personal domain)
[18:23] <Barrett> <gbl08ma> both times I got an error message "Submission rejected as potential spam" because of failed captcha attempts and high spambayes probability
[18:34] <PulkoMandy> Barrett: not much can be done, users can be added manually but that won't fix things long-term
[18:35] <PulkoMandy> basically the "register" page submit feeds an empty string to the spam filter logic no matter what
[18:35] <PulkoMandy> and people keep marking things as "spam" for our bayesian filter in trac administration, so now everything empty is "spam"
[18:35] <PulkoMandy> and it's not possible to register anymore
[18:36] <PulkoMandy> the captcha would save it, but I think it's broken
[18:37] <Barrett> in past I suggested to use reCaptcha, but forgot how it ended BTW
[18:39] <PulkoMandy> that's what we use
[18:40] <PulkoMandy> possibly our API key is broken however, no idea how to test that

Change History (6)

comment:1 by gbl08ma, 8 years ago

I should mention I ended up being able to register by using a different email alias (basically, I used the same Gmail address I was planning to use but added a +haiku suffix, so the email became gbl08ma+haiku at the gmail domain dot com).

This only worked because, with that suffix appended, spambayes no longer considered my "submission" to have a high spam probability, so it went straight to the login page, bypassing the captcha. From there, I could log in and confirm my email. By the way, the confirmation email went to the spam folder. You may want to confirm SPF and DKIM are correctly set up, and use something like https://www.mail-tester.com/ to test.
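One way to check the SPF and DKIM verdicts is to look at the `Authentication-Results` header that many receiving mail servers add to a delivered message. The following is a minimal sketch under that assumption; the sample message, the `example.org` and `example.net` domains, and the `auth_results` helper are made up for illustration and are not taken from the actual confirmation email.

```python
import re
from email import message_from_string

# Hypothetical confirmation email as saved from a mailbox ("show original").
SAMPLE_MESSAGE = """\
Authentication-Results: mx.example.net;
 spf=pass smtp.mailfrom=trac@example.org;
 dkim=fail header.d=example.org
From: trac@example.org
Subject: Confirm your email address

Please confirm your address.
"""


def auth_results(raw_message):
    """Collect the spf/dkim/dmarc verdicts from Authentication-Results headers."""
    message = message_from_string(raw_message)
    results = {}
    for header in message.get_all("Authentication-Results", []):
        # Each verdict appears as "method=result", e.g. "spf=pass".
        for method, verdict in re.findall(r"\b(spf|dkim|dmarc)=(\w+)", header):
            results[method] = verdict
    return results


results = auth_results(SAMPLE_MESSAGE)
print(results)
```

Anything other than `pass` for `spf` or `dkim` would be a hint that the sending domain's DNS records or signing setup need attention.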

In short, I think there are two main issues to solve here:

  1. The reCaptcha must be fixed, possibly upgraded to the new "checkbox" version which is friendlier;
  2. SpamBayes should be configured to be a bit easier on the registration submissions, or disabled for the registration page entirely.

As far as I can understand, SpamBayes was made to analyse longer messages (emails) and thus is not well suited to analysing a couple of fields which contain nothing but user credentials. The situation becomes worse if it really has been fed bogus spam reports. If disabling SpamBayes results in too many bot submissions, then I believe enabling the captcha for everyone on the registration page is still better than having a broken captcha that only appears sometimes.

I agree this situation deters bug reports and new developers. Personally, I almost gave up on contributing after waiting many hours for someone to add me manually (which ended up not happening, as I managed to register as I explained). I can fully see less motivated people not even complaining on IRC or the mailing lists and just going away.

comment:2 by pulkomandy, 8 years ago

SpamBayes will analyze the *content* of the submission, not the HTTP headers. In the case of the registration page, the content is empty.

SpamBayes relies on careful training and a reasonable ratio of "spam" vs "ham" submissions. Our instance was trained with way too much spam and not enough ham, so now it rejects a lot more things than it should, including most empty messages. I have no idea why it sometimes works.
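To illustrate why a skewed training set hurts empty submissions, here is a minimal sketch of a plain naive Bayes classifier. It is not the SpamBayes algorithm, which combines token evidence differently; the `TinyNaiveBayes` class, the training messages and the smoothing are assumptions chosen for illustration only. The point is that when a submission has no tokens, the only evidence left in this simplified model is the spam/ham ratio of the training data.

```python
import math
from collections import Counter


class TinyNaiveBayes:
    """Simplified naive Bayes spam scorer (illustration only, not SpamBayes)."""

    def __init__(self):
        self.doc_counts = {"spam": 0, "ham": 0}
        self.token_counts = {"spam": Counter(), "ham": Counter()}

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.token_counts[label].update(text.lower().split())

    def spam_probability(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab = set(self.token_counts["spam"]) | set(self.token_counts["ham"])
        log_scores = {}
        for label in ("spam", "ham"):
            # Prior: share of training messages carrying this label.
            score = math.log(self.doc_counts[label] / total_docs)
            label_total = sum(self.token_counts[label].values())
            for token in text.lower().split():
                # Laplace smoothing so unseen tokens do not zero the score.
                count = self.token_counts[label][token] + 1
                score += math.log(count / (label_total + len(vocab) + 1))
            log_scores[label] = score
        # Convert the two log scores back into P(spam | text).
        highest = max(log_scores.values())
        spam = math.exp(log_scores["spam"] - highest)
        ham = math.exp(log_scores["ham"] - highest)
        return spam / (spam + ham)


classifier = TinyNaiveBayes()
for _ in range(9):
    classifier.train("cheap pills buy now", "spam")
classifier.train("patch for the app_server crash", "ham")

# An empty submission carries no tokens, so only the 9:1 prior remains.
empty_score = classifier.spam_probability("")
print(round(empty_score, 3))
```

In this toy model, retraining with a more balanced set of ham and spam moves the score for empty content back toward the middle, which is the effect a reset and re-train would aim for.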

We can reset the spambayes filter and re-train it, which would solve at least part of the problem. Or, we could disable it since we have several other filters running anyway (using shared services such as Akismet), and various blacklists.

But, the most important part is, we should fix the captcha. However, this is not something I can do from the Trac admin page, so we need someone from haiku-web to have a look at it. Maybe the trac spam filter plugin could be updated.

comment:3 by nielx, 8 years ago

I have changed the Captcha to the I am human captcha (the plugin had to be activated). I don't know how to verify this though. If somebody could do that, it would be great

comment:4 by pulkomandy, 8 years ago (in reply to: 3)

Replying to nielx:

I have changed the Captcha to the I am human captcha (the plugin had to be activated). I don't know how to verify this though. If somebody could do that, it would be great

You changed it to http://www.areyouahuman.com/ which is not a reCaptcha service. And, you did not set up API keys for that, so I think it will not work. API keys can be entered in the trac admin page just below the "captcha type" field.

The "are you human" checkbox is still from reCaptcha (http://www.google.com/recaptcha/intro/index.html) and we could use it with Trac reCaptcha plugin, if that worked (probably it became broken by changes on reCaptcha side over the years). So, either we fix/update the reCaptcha plugin, switch to a different service *with* new API keys, or switch to Trac internal captcha generator.

To test the changes, you can logout and then try to create an account with an address @yahoo.com. These are blacklisted and will always go to the captcha page.

I have switched to the internally generated captcha, which seems to work correctly. We'll have to stick with that until some admin updates the spam filter plugin to a version that supports reCaptcha v2 (https://trac.edgewall.org/browser/plugins/1.0/spam-filter/tracspamfilter/captcha/recaptcha2.py). That support was added 16 months ago.
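For anyone wanting to check a reCaptcha v2 key pair outside of Trac, the server-side check is a POST to Google's `siteverify` endpoint with the `secret` and `response` parameters, answered with JSON containing `success` and, on failure, `error-codes`. The sketch below is an assumption-laden illustration and not the plugin's code: `verify_recaptcha`, the injectable `fetch` parameter and `fake_fetch` are hypothetical names, and the canned reply only imitates what an invalid secret would produce, so the example runs offline.

```python
import json
import urllib.parse
import urllib.request

SITEVERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def verify_recaptcha(secret_key, response_token, fetch=None):
    """Return (ok, error_codes) for a reCAPTCHA v2 response token."""
    payload = urllib.parse.urlencode(
        {"secret": secret_key, "response": response_token}
    ).encode("ascii")

    if fetch is None:
        # Default: real HTTPS POST to the verification endpoint.
        def fetch(url, data):
            with urllib.request.urlopen(url, data=data, timeout=10) as reply:
                return reply.read().decode("utf-8")

    answer = json.loads(fetch(SITEVERIFY_URL, payload))
    return bool(answer.get("success")), answer.get("error-codes", [])


# Offline stand-in for the network call, so the sketch runs without real keys.
def fake_fetch(url, data):
    return json.dumps({"success": False, "error-codes": ["invalid-input-secret"]})


ok, errors = verify_recaptcha("not-a-real-secret", "dummy-token", fetch=fake_fetch)
print(ok, errors)
```

Calling `verify_recaptcha` without the `fetch` argument and with the real secret would contact Google; an `invalid-input-secret` error code in the reply would then point at a broken API key rather than at the plugin.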

Last edited 8 years ago by pulkomandy

comment:5 by lezsakdomi, 6 years ago

Maybe I was lucky, but I should mention that I could register (almost) successfully. But:

  • The captcha system (...) was buggy: it kept asking me to verify that I am human after I (I promise) entered the correct characters. This looped indefinitely. It did verify me successfully, but didn't redirect me to the next page. (A "Human verification failed" or similar error message appeared at the top of the page, and after a few iterations "Unable to verify captcha", because I was already verified.)
  • The email arrived in my spam folder (using Gmail).

comment:6 by waddlesplash, 5 years ago

Resolution: fixed
Status: new → closed

Emails appearing in Gmail spam folder should have been fixed by DKIM, and the indefinite loops should now be fixed also.
