RE: Anyone want to write an "intelligent" mail filter?

Rob:
Is there a SPAM filter that not only identifies SPAM but also deletes it
on the server, so that it never has to be downloaded?
I'm asking because I have a 56K connection, and SPAM consumes a big
part of my connection time (which I have to pay for).
A few days ago, I had to resort to changing my e-mail address from rforno to
rmforno because of the increasing SPAM.
Regards.
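
For reference, the server-side deletion asked about above can be sketched with POP3, assuming the mail server supports the optional TOP command (which fetches only the headers plus a few body lines). This is a rough illustration using Python's standard poplib module, not a tested filter: delete_spam_on_server and looks_like_spam are hypothetical names, and the host and credentials in the usage comment are placeholders.

```python
import poplib


def delete_spam_on_server(conn, looks_like_spam, body_lines=10):
    """Mark spam for deletion without retrieving whole messages.

    conn is a logged-in poplib.POP3 (or POP3_SSL) object.
    looks_like_spam is a hypothetical callable that receives the headers
    plus the first body_lines lines of a message as one string.
    """
    count, _size = conn.stat()
    deleted = []
    for number in range(1, count + 1):
        # TOP transfers only the headers and a short preview of the body.
        _resp, lines, _octets = conn.top(number, body_lines)
        preview = b"\n".join(lines).decode("latin-1")
        if looks_like_spam(preview):
            conn.dele(number)  # only marked here; removed when we QUIT
            deleted.append(number)
    return deleted


# Usage sketch (placeholder host and credentials):
# conn = poplib.POP3("mail.example.com")
# conn.user("name")
# conn.pass_("secret")
# delete_spam_on_server(conn, lambda text: "lagos" in text.lower())
# conn.quit()  # the deletions take effect at this point
```

On a slow dial-up line the saving comes from TOP: only the preview of each message crosses the wire, and anything flagged is removed when the session ends.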

----- Original Message -----
From: Robert Craig <rds at RapidEuphoria.com>
To: <EUforum at topica.com>
Sent: Wednesday, November 05, 2003 12:11 AM
Subject: Re: Anyone want to write an "intelligent" mail filter?


>
>
> Irv Mullins wrote:
> > Every day I get more annoying SPAM e-mails. Currently it's running about
> > 10 spams to every valid e-mail.
> >
> > I'm tired of wading thru them, and I'd rather not download them at all.
> > My e-mail client can filter the messages by sender or subject, but most
> > spams now are written to get around those filters.
> >
> > One thing I notice is that nearly 100% of the spams either contain the
> > word "lagos" or long strings of "dictionary" words to confuse the
> > filters:
> >
> > "indecisive constitute dakar summitry ajax beaver descendent withal
> > circumlocution asocial voluble inquire convolution replete hitler
> > commendation segregate cognition abstract eject disgustful"
> >
> > But very few or none of the more common shorter words that would likely
> > appear in a valid e-mail: "a, and, or, if, you, we, I, to, for, the,
> > this, that....."
> >
> > We should be able to come up with a routine which would analyze a given
> > text string and rank it according to its likelihood of being a
> > 'meaningful' message. Then use that routine in an e-mail client to rank
> > messages and only download from the server those which appear to be 'real'.
> >
> > Ideas?
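
One possible reading of the ranking routine proposed above, as a minimal Python sketch: score a text by the fraction of its words that are short, common English words. The word list is an assumption (seeded from the examples in the message, then padded with a few more), and the name meaningfulness is made up for illustration.

```python
import re

# Hypothetical list of common short words. The first dozen come from the
# message above; the rest are an assumption added for illustration.
COMMON_WORDS = frozenset(
    "a and or if you we i to for the this that is it of in on at be my".split()
)


def meaningfulness(text):
    """Return the fraction of words in text that are common short words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in COMMON_WORDS)
    return hits / len(words)


spam_like = ("indecisive constitute dakar summitry ajax beaver descendent "
             "withal circumlocution asocial voluble inquire convolution")
real_like = "I think we should meet on Friday if that is fine for you and the team."

print(meaningfulness(spam_like))  # no common words at all
print(meaningfulness(real_like))  # well over half are common words
```

A score near zero would suggest a "dictionary" string like the one quoted. Where to put the cut-off is a judgment call and would need tuning against real mail.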
>
> For the past few months I've been using the e-mail
> client in Netscape 7.1. It has a "Bayesian" spam filter
> that adapts to the streams of spam and normal mail
> that you receive. It works pretty well.
>
> It keeps track of all the words in your incoming e-mail,
> and notes how often each word appears
> in spam vs normal mail. For example, the word "Euphoria"
> might have appeared in 1 of my spam messages and
> 99 of my normal messages, so if it sees "Euphoria" in a
> message, that would indicate a 99% probability that this
> is a normal message. But it doesn't just look at one word.
> I believe it looks at the 20 or so words in each message
> with the most extreme probabilities. It uses a formula from
> Bayesian statistics to combine the probability indicated
> by each word into a single overall probability. e.g.
> if you had a word that indicated "90% likely to be spam"
> and another that said "95% likely to be spam", the result of
> combining those two words might be 97% (or something).
> It will move a message out of your inbox into a spam folder
> if the probability of it being spam is quite high,
> something like 99%. Obviously you want to keep false
> positives (real mail tagged as spam) to an absolute minimum.
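
The combining step described above can be sketched as follows. This assumes the usual naive-Bayes combining rule with equal priors, which may not be exactly what Netscape 7.1 does; the function names, the 0.01/0.99 clamp, and the default of 20 words are illustrative assumptions.

```python
from math import prod


def word_spam_probability(spam_count, ham_count):
    """Estimate P(spam | word) from how many spam and normal messages had it."""
    total = spam_count + ham_count
    if total == 0:
        return 0.5  # never seen before: treat the word as neutral
    p = spam_count / total
    # Clamp so that no single word can force a verdict of exactly 0 or 1.
    return min(max(p, 0.01), 0.99)


def combine(probabilities, top_n=20):
    """Combine per-word spam probabilities into one overall probability."""
    # Keep only the words whose probabilities are furthest from 0.5.
    extreme = sorted(probabilities, key=lambda p: abs(p - 0.5), reverse=True)[:top_n]
    spam = prod(extreme)
    ham = prod(1 - p for p in extreme)
    return spam / (spam + ham)


print(word_spam_probability(1, 99))  # "Euphoria": 1 spam, 99 normal messages
print(combine([0.90, 0.95]))         # the two-word example from the message
```

Under this particular rule the 90% and 95% words combine to 0.855 / (0.855 + 0.005), or about 99.4%, so two agreeing words reinforce each other more strongly than the rough 97% guess. A client would then move a message to the spam folder only when the combined value passes a high threshold such as 0.99.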
>
> In practice, over a long period of time,
> suppose I get 1000 messages of which 900 are spam. It will
> probably move about 800 of the spams and 1 or 2 of the
> non-spams into my spam folder.
>
> With each batch of incoming mail, I check the spam folder
> for non-spams, but usually I can quickly see from the
> subjects and senders that there aren't any non-spams, so I
> click a button to quickly delete all the spams in one
> operation.
>
> Whenever it tags a message incorrectly (usually spam
> that it missed), you can click a button to tell it so.
> This way it gradually learns and gets smarter.
>
> Also, e-mail from anyone in my address book is automatically
> considered non-spam, so the false positives are quite low.
>
> Being able to delete a whole bunch of spams in one
> operation saves time. It's also nice that it keeps
> my inbox largely clear of distracting spam clutter.
>
> Regards,
>     Rob Craig
>     Rapid Deployment Software
>     http://www.RapidEuphoria.com
>
