jonsjava Posted January 7, 2009

What could I add to this to make it more complete?

<?php
/**
 * Anti-Spam function. It takes the 344 most common spam words/phrases,
 * and compares input to the list. Returns array containing
 * number found, the actual text found, and a score for the input
 *
 * @param unknown_type $input : the data you want to check
 * @return array
 */
function stopSpam($input){
    $count = 0;
    $data = "'hidden' assets -online 3.28 4u accept credit cards act now! don't hesitate! additional income addresses on cd adipex advicer all natural amazing stuff apply online as seen on auto email removal avoid bankruptcy baccarrat be amazed be your own boss being a member big bucks bill 1618 billing address blackjack bllogspot booker brand new pager bulk email buy direct buying judgments byob cable converter call free call now calling creditors can't live without cancel at any time cannot be combined with any other offer car-rental-e-site car-rentals-e-site carisoprodol cash bonus cashcashcash casino casinos cell phone cancer scam cents on the dollar chatroom check or money order cialis click below click here link click to remove click to remove mailto collect child support compare rates compete for your business confidentially on all orders congratulations consolidate debt and credit coolcoolhu coolhu copy accurately copy dvds credit bureaus credit card offers credit-card-debt credit-report-4u cures baldness cwas cyclen cyclobenzaprine dating-e-site day-trading dear email dear friend dear somebody debt-consolidation debt-consolidation-consultant dig up dirt on friends direct email direct marketing discreetordering discusses search engine listings do it today don't delete drastically reduced duty-free dutyfree earn per week easy terms eliminate bad credit email harvest email marketing equityloans expect to earn fantastic deal fast viagra delivery financial freedom find out anything fioricet flowers-leading-site for free for instant access
for just $ free access free cell phone free consultation free dvd free grant money free hosting free installation free investment free leads free membership free money free offer free preview free priority mail free quote free sample free trial free website freenet freenet-shopping full refund gambling- get it now get paid get started now gift certificate great offer guarantee hair-loss have you been turned down? health-insurancedeals-4u hidden assets holdem holdempoker holdemsoftware holdemtexasturbowilson home employment homeequityloans homefinance hotel-dealse-site hotele-site hotelse-site human growth hormone if only it were that easy in accordance with laws incest increase sales increase traffic insurance insurance-quotesdeals-4u insurancedeals-4u investment decision it's effective join millions of americans jrcreations laser printer levitra limited time only long distance phone offer lose weight spam lower interest rates lower monthly payment lowest price luxury car macinstruct mail in order form marketing solutions mass email meet singles member stuff message contains disclaimer mlm money back money making month trial offer more internet traffic mortgage rates mortgage-4-u mortgagequotes multi level marketing name brand new customers only new domain extensions nigerian no age restrictions no catch no claim forms no cost no credit check no disappointment no experience no fees no gimmick no inventory no investment no medical exams no middleman no obligation no purchase necessary no questions asked no selling no strings attached not intended off shore offer expires offers coupon offers extra cash offers free (often stolen) passwords once in lifetime one hundred percent free one hundred percent guaranteed one time mailing online biz opportunity online biz opportunity online pharmacy online-gambling onlinegambling-4u only $ opportunity opt in order now order status orders shipped by priority mail ottawavalleyag outstanding values ownsthis palm-texas-holdem-game 
paxil penis pennies a day people just leave money laying around pharmacy phentermine please read poker-chip potential earnings poze print form signature print out and fax produced and sent out profits promise you ...! pure profit pussy real thing refinance home removal instructions remove in quotes remove subject removes wrinkles rental-car-e-site reply remove subject requires initial investment reserves the right reverses aging ringtones risk free roulette round the world s 1618 safeguard notice satisfaction guaranteed save $ save big money save up to score with babes search engine listings section 301 see for yourself sent in compliance serious cash serious only shemale shoes shopping spree sign up free today slot-machine social security number special promotion stainless steel stock alert stock pick stop snoring strong buy stuff on sale subject to credit supplies are limited take action now terms and conditions texas-holdem the best rates the following form they keep your money -- no refund! they're just giving it away this isn't junk this isn't spam thorcarlson top-e-site top-site tramadol trim-spa ultram university diplomas unlimited unsecured credit/debt urgent us dollars vacation offers valeofglamorganconservatives viagra viagra and other drugs vioxx wants credit card we hate spam we honor all weekend getaway what are you waiting for? while supplies last while you sleep who really wins? why pay more? 
will not believe your eyes winner winning work at home xanax you are a winner you have been selected your income zolus";
    $data = strtolower($data);
    $data_array = explode("\n", $data);
    foreach ($data_array as $value){
        if (stristr($input, $value)){
            $array['item'][] = $value;
            $count++;
        }
    }
    $total_spam_vars = count($data_array);
    $score = ($count / $total_spam_vars) * 1000;
    $array['total_found'] = $count;
    $array['score'] = $score;
    return $array;
}
$spam_score = stopSpam($input);
print_r($spam_score);

Link to comment https://forums.phpfreaks.com/topic/139858-feedback-on-a-function/
JonnoTheDev Posted January 7, 2009

I would use a text file for bad words and then read within the function. Makes it easier to add to rather than adding to the function code.
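The text-file suggestion above could look roughly like the sketch below. It is only an illustration: the helper name loadSpamPhrases() and the one-phrase-per-line file layout are assumptions, not something posted in the thread.

```php
<?php
// Hypothetical helper: loads one phrase per line from a plain-text file.
// The name loadSpamPhrases() and the file layout are assumptions for illustration.
function loadSpamPhrases($path)
{
    if (!is_readable($path)) {
        return array();
    }
    // FILE_IGNORE_NEW_LINES strips the line ending from each entry;
    // FILE_SKIP_EMPTY_LINES drops blank lines.
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return array();
    }
    // Trim stray whitespace (such as "\r" from Windows line endings),
    // lowercase everything, and drop entries that end up empty so the
    // matching loop never receives an empty needle.
    $phrases = array_map('strtolower', array_map('trim', $lines));
    return array_values(array_filter($phrases, 'strlen'));
}
```

The returned array could then stand in for the exploded $data string inside stopSpam(), so adding a phrase only means adding a line to the file.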
jonsjava (Author) Posted January 7, 2009

JonnoTheDev wrote: "I would use a text file for bad words and then read within the function. Makes it easier to add to rather than adding to the function code."

Good idea. I think I'll do that. Also, I was wondering if anybody has any other tricks they use to catch spammers.
GingerRobot Posted January 7, 2009

Wouldn't it make more sense to count the number of occurrences of each of the phrases? For example, data containing 10 occurrences of one of the phrases must surely be more likely to be spam than data containing just 1 occurrence?
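Counting occurrences, as suggested above, could be sketched along these lines. The function name countSpamOccurrences() and the shape of the returned array are assumptions made for the example; it uses PHP's substr_count(), which counts non-overlapping matches.

```php
<?php
// Hypothetical variant of the matching loop: counts every occurrence of
// each phrase instead of only noting that it appears at least once.
// countSpamOccurrences() is an illustrative name, not part of the original post.
function countSpamOccurrences($input, array $phrases)
{
    $haystack = strtolower($input);
    $hits = array();
    foreach ($phrases as $phrase) {
        $needle = strtolower(trim($phrase));
        if ($needle === '') {
            continue; // substr_count() rejects an empty needle
        }
        // Number of non-overlapping occurrences of this phrase in the input.
        $n = substr_count($haystack, $needle);
        if ($n > 0) {
            $hits[$needle] = $n;
        }
    }
    return array('items' => $hits, 'total_found' => array_sum($hits));
}
```

Whether repeated hits should count as much as hits on distinct phrases is a design choice; a score could, for instance, weight the number of distinct phrases more heavily than the raw total.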
jonsjava (Author) Posted January 7, 2009

GingerRobot wrote: "Wouldn't it make more sense to count the number of occurrences of each of the phrases? For example, data containing 10 occurrences of one of the phrases must surely be more likely to be spam than data containing just 1 occurrence?"

So, what you are saying is that if I had this input:

spam spam spam spam spam spam spam spam

I should give it more weight than:

this is not spam

assuming that the measured word is "spam". Well, that would work, but the phrases I have outlined are pretty much guaranteed to come from spammers, so a phrase showing up once is already a good indicator. The weight system is there to see whether the input has more than one "obvious spam" phrase in it. If it does, the input is weighed as more "loaded with spam" than input with just one phrase. This is to keep people from being blocked from saying stuff like "I received an e-mail from your site that said 'act now! don't hesitate!', and I just want you to know that I don't appreciate receiving spam". I want people to be able to fill out the contact form, and even quote possible spam, without being weighed so heavily that my spam filter blocks them from contacting me.
Mark Baker Posted January 7, 2009

jonsjava wrote: "but the phrases I have outlined are pretty much guaranteed to come from spammers"

Well, I've certainly both received and sent emails in the last week that contain more than one of these phrases, and that aren't spam:

"terms and conditions": negotiating a software license for use in a product
"billing address": e-mailed receipt for a purchase by credit card
"insurance": e-mailed confirmation that a renewal for my car insurance online had been received

Your weighting system would flag them as probable, but not definite, spam, while the dozen or so e-mails offering me cheap cia1is would probably be accepted purely because of a single character change.
jonsjava (Author) Posted January 7, 2009

Mark Baker wrote: "Well I've certainly both received and sent emails in the last week that contain more than one of these phrases, and that aren't spam. "terms and conditions", negotiating a software license for use in a product "billing address", e-mailed receipt for a purchase by credit card "insurance", e-mailed confirmation that a renewal for my car insurance online had been received Your weighting system would flag them as probable, but not definite spam, while the dozen or so e-mails offering me cheap cia1is would probably be accepted purely because of a single character change."

...and that's a question I should have posed before. Any ideas on how to convert numbers sitting between letters into the letters they stand in for? Well, you get the idea: change cia1is to cialis.
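One way to approach the question above is a fixed look-alike table applied before matching. The sketch below is only a starting point: the function name normalizeLookalikes() and the substitution table are assumptions, and the table would need tuning against real samples.

```php
<?php
// Hypothetical normalisation step. The substitution table is an assumption:
// 0->o, 1->l, 3->e, 4->a, 5->s, 7->t, @->a, $->s.
function normalizeLookalikes($text)
{
    // The three-argument form of strtr() maps each character of the second
    // argument to the character at the same position in the third.
    return strtr(strtolower($text), '013457@$', 'oleatsas');
}
```

With this table, normalizeLookalikes('cheap cia1is') gives 'cheap cialis'. The mapping is lossy, though: "1" could just as well stand for "i", so 'v1agra' becomes 'vlagra', and genuine numbers in the message are mangled too. Entries in the phrase list that contain digits (such as "4u" or "bill 1618") would only keep matching if the list were passed through the same function.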
GingerRobot Posted January 7, 2009

Well, you could decide which numbers are likely to represent which characters and perform a conversion. But there are so many different ways in which someone could modify the text slightly: misspellings, underscores, dashes, etc. A potential solution would be to use the Levenshtein distance, though it might be too slow.

I'd probably just use someone else's spam filter and let them do the hard work.
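The Levenshtein idea mentioned above might be sketched as follows, using PHP's built-in levenshtein(). The function name fuzzySpamWords(), the threshold of one edit, and the minimum word length of five are all assumptions picked for illustration.

```php
<?php
// Hypothetical fuzzy matcher: flags a word when it is within $maxDistance
// single-character edits of a known spam word. It only compares single
// words, not multi-word phrases.
function fuzzySpamWords($input, array $spamWords, $maxDistance = 1)
{
    // Split the input on anything that is not a letter or digit.
    $words = preg_split('/[^a-z0-9]+/', strtolower($input), -1, PREG_SPLIT_NO_EMPTY);
    $found = array();
    foreach ($words as $word) {
        foreach ($spamWords as $spam) {
            // Skip short spam words (one edit matches too many ordinary words)
            // and pairs whose lengths differ too much to be within range.
            if (strlen($spam) < 5 || abs(strlen($word) - strlen($spam)) > $maxDistance) {
                continue;
            }
            if (levenshtein($word, $spam) <= $maxDistance) {
                $found[$word] = $spam; // suspicious word => spam word it resembles
                break;
            }
        }
    }
    return $found;
}
```

Every input word is compared against every spam word, which is where the speed concern comes from. Because the sketch splits on punctuation, spaced-out forms such as "v-i-a-g-r-a" still slip through, which illustrates the point about how many ways the text can be modified.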
jonsjava (Author) Posted January 7, 2009

GingerRobot wrote: "I'd probably just use someone else's spam filter and let them do the hard work"

But... where's the fun in that?