SpamAssassin Bayes training via cPanel
Offer easy ways to train SpamAssassin's Bayes filter, both per user and globally.
http://wiki.apache.org/spamassassin/ReportingSpam
Report is not the same as Learn (http://spamassassin.apache.org/full/3.4.x/doc/sa-learn.html).
But both methods could be offered.
Ham and spam training could be offered via the cPanel GUI: upload or paste emails into a form to train the filter.
Or via IMAP folders (e.g. SA_HAM and SA_SPAM), which could be defined via the cPanel GUI, with a cron job or a button to read in these folders.
A third way could be two email addresses / email aliases that allow ham and spam to be forwarded to an address that trains the filter.
NonSSL http://forums.cpanel.net/f43/exim-spamassassin-enough-426381.html
SSL https://forums.cpanel.net/f43/exim-spamassassin-enough-426381.html
Scripts like this one could be implemented to submit hashes and train the local Bayes filter.
http://www.ruwenzori.net/code/teach-sa/
I definitely agree that training SpamAssassin's Bayesian filters could improve users' experiences. The toughest point, I think, is finding a way that is easy enough for even the most casual of users to understand and use. Remember, many email users may not even know that cPanel "exists". They're simply using an email account provided to them.
As a result, I'd love to see some use cases and thoughts discussed here along those lines.
On a personal level, I had at one point deployed a set of cron jobs on my cPanel & WHM server that ran sa-learn --ham and sa-learn --spam on designated "ham" and "spam" folders within each email account. Management therefore became as simple as moving mail between folders (which virtually every mail client, smartphone, etc. supports). It worked well, and only required that users understand that the folders exist and that moving mail into them "trains" the spam system in cases where it was wrong.
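For anyone who wants to try the folder approach, here is a minimal Python sketch of what such a cron job might look like. It assumes Maildir++ storage and uses hypothetical folder names (.learn-ham and .learn-spam); the function names are illustrative and not part of cPanel or SpamAssassin. The only sa-learn options used are --ham and --spam, as mentioned above.

```python
from pathlib import Path
import subprocess

# Hypothetical folder names; the thread suggests several candidates.
HAM_FOLDER = ".learn-ham"
SPAM_FOLDER = ".learn-spam"


def build_learn_commands(maildir, ham_folder=HAM_FOLDER, spam_folder=SPAM_FOLDER):
    """Return one sa-learn command per training folder that exists and holds mail."""
    commands = []
    for flag, folder in (("--ham", ham_folder), ("--spam", spam_folder)):
        # Assumption: Maildir++ layout, where read messages sit in <folder>/cur.
        cur = Path(maildir) / folder / "cur"
        if cur.is_dir() and any(cur.iterdir()):
            commands.append(["sa-learn", flag, str(cur)])
    return commands


def train(maildir):
    """Run sa-learn for every non-empty training folder under one maildir."""
    for cmd in build_learn_commands(maildir):
        subprocess.run(cmd, check=True)
```

A cron entry could call train() on each account's maildir as the owning user, so that sa-learn would typically update that user's own Bayes database rather than a shared one.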
Ideally, each client itself would support it (plugins for Thunderbird, integration into iPhone/Android mail clients, integration to webmail clients, etc). However, that's not something we can directly control nor necessarily have any influence on.
Forcing a user to log in and use a cPanel interface seems like a high cost of entry (especially when many users who would want to use this may have no concept of what cPanel is and are simply using an email account that the cPanel owner set up for them).
Setting up email addresses to forward and process spam/ham is definitely another alternative, although that too requires all users be cognizant of the feature.
Basically, I'd love to hear more from others on whether any of the proposed ideas have merit for their use cases and, if they wouldn't help, whether others have alternative recommendations for implementation.
I don't know if it's actually feasible (for example, I'm not sure it would be possible to pass the email's file name), but I recall Paper Lantern keeping a config bar at the top with logout and a few other options. Could ham/spam buttons go up there, so that they work with all 3 webmail clients?
The limitation there is that those buttons would likely be limited to simple links to a form to fill out. It may not be feasible for them to read across frames and access the headers or body of whatever message is open at the time in whichever webmail client is in use. They certainly could increase visibility if a "form submit" way of handling spam/ham were adopted. Further investigation would be needed to see whether anything beyond a link to a form is logistically possible.
This also excludes users who do not actually utilize the webmail client (smartphones, users with Thunderbird, Mac Mail, etc).
Thanks for the good comments.
The use case "not aware of cPanel" is valid. The use case "is only using POP3" is too.
I would go for the folder pair (sa_spam / sa_ham), precreated via IMAP SPECIAL-USE attributes as specified by RFC 6154, plus a cron job or any other trigger to train the Bayes filter. cPanel should then offer at least some info / reset functionality to unlearn.
Is it possible to create email forwarders that deliver to specific IMAP folders? That would be a nice feature too. That way, email addresses for spam / ham learning could use the same folders.
Most systems manage this with two methods: a pair of ham and spam IMAP folders, perhaps called something like report_as_ham and report_as_spam so users understand them better, and something like an email address to report spam to.
I don't see the point of a specific web interface for reporting: if the user has to log in to webmail anyway, they may as well use the IMAP folders.
A plugin for email clients would be a good solution too.
The database must not be a global one; there should be one per email address. Some users will report as ham/spam mail that is not ham/spam for others.
Agree with easy-to-understand folder names.
learn-spam and learn-ham are other nice names that would make sense to the end user.
I find customers that manually move email to the SPAM folder, thinking that it will help with filtering. Specific training folders make a lot of sense.
We're not asking for magic "This is Spam" and "This is not Spam" buttons. For a start, learning from the Inbox (for ham) and the Junk/Spam folder (for spam) should be fine. If someone moves something from Spam to the Inbox, it's ham. If someone moves something from the Inbox to the Spam folder, it's spam. The filter can learn that way for a start.
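One way to approximate "was this message moved?" without tracking IMAP events is to compare the folder a message sits in with the verdict SpamAssassin stamped on it at delivery. The sketch below is only an illustration of that idea: it assumes flagged mail is filed into the Spam folder automatically, that SpamAssassin's X-Spam-Flag header is present on flagged mail, and that the folder is passed in as the plain strings "inbox" or "spam". The function name training_label is hypothetical.

```python
from email import message_from_bytes


def training_label(raw_message, folder):
    """Guess whether a message was moved by the user and how to learn it.

    Returns "ham", "spam", or None when folder and verdict already agree.
    """
    msg = message_from_bytes(raw_message)
    # SpamAssassin adds "X-Spam-Flag: YES" to mail it classifies as spam.
    flagged = (msg.get("X-Spam-Flag") or "").strip().upper() == "YES"
    if folder == "inbox" and flagged:
        return "ham"   # flagged as spam, but the user put it in the Inbox
    if folder == "spam" and not flagged:
        return "spam"  # not flagged, but the user put it in Spam
    return None


# Example: an unflagged message found in the Spam folder is learned as spam.
label = training_label(b"Subject: cheap pills\n\nbuy now\n", "spam")
```

Messages where the folder and the verdict already agree return None, so the filter is only trained on the cases where the user corrected it.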
Most free email services out there (Gmail, Outlook, Zoho) have a spam folder where emails older than 30 days are removed, and moving emails in and out of that folder trains the spam filter.
Not having this simple and basic feature for email accounts, technology that's been around for more than 10 years already, means all cPanel clients provide a lower-quality service.
The competition has already fixed this; hopefully cPanel implements it soon.
There is work taking place in this area currently. I hope to have some new features that help with spam/ham management in v104. I'll make updates as this comes along.
Koree A. Smith
Product Owner
cPanel, LLC