mark107 Posted February 17, 2023

Hi all,

I wonder if any of you could help me with this. I'm storing the source emails in a database after fetching them over IMAP. I run a cron job every 24 hours to scan the spam folder for spam emails so the Bayes database gets updated.

Here is what I run in the cron job:

    /usr/local/cpanel/3rdparty/bin/sa-learn -p ~/.spamassassin/user_prefs --spam ~/mail/mydomain.com/myaccount/.spam/{cur,new}

Since I now store the emails in a database, can I run a PHP script from the cron job to update the Bayes database? Example:

    /usr/local/cpanel/3rdparty/bin/sa-learn -p ~/.spamassassin/user_prefs --spam ~/public_html/username/myfolder/update_bayes.php

Would that work to update the Bayes database so that SpamAssassin could learn from it?
gw1500se Posted February 17, 2023

Yes.
mark107 (Author) Posted February 17, 2023

    3 hours ago, gw1500se said:
    Yes.

Great. I have been told that I need to create a file containing the email headers and body for sa-learn to scan, so that it can then update the Bayes database?
gw1500se Posted February 17, 2023

Whatever you can do from the command line you can do with a cron job, either directly or as a script.
mark107 (Author) Posted February 17, 2023

Oh right, if I use this:

Update_bayes.php

    <?php
    ini_set('display_errors', '1');
    ini_set('display_startup_errors', '1');
    error_reporting(E_ALL);

    require_once('../Spamassassin/Client.php');
    require_once('../Spamassassin/Client/Exception.php');
    require_once('../Spamassassin/Client/Result.php');

    // $link (PDO connection), $unread, $isSpam and $email_content
    // are assumed to be defined elsewhere.
    // Select the header column, since $row['header'] is read below.
    $spam_mailbox = $link->prepare("SELECT header FROM `Spam` WHERE readtype = ?");
    $spam_mailbox->execute([$unread]);
    $row = $spam_mailbox->fetch(PDO::FETCH_ASSOC);
    $header = $row['header'];

    // Write the message to a file so it can be read back.
    $fp = fopen('/home/user/myfolder/spam.txt', 'w');
    fwrite($fp, $header);
    fclose($fp);

    $message = @file_get_contents('spam.txt');

    $params = array(
        "hostname" => "localhost",
        "port" => "783",
        "user" => "root");

    $sa = new Spamassassin\Client($params);

    if ($isSpam == 'Spam') {
        $sa->report($email_content);
    } else if ($isSpam == 'Not Spam') {
        $sa->revoke($email_content);
    }
    ?>

To run it in a cron job, for example:

    /usr/local/cpanel/3rdparty/bin/sa-learn -p ~/.spamassassin/user_prefs --spam ~/public_html/username/myfolder/update_bayes.php

Would it work to update the Bayes database if I run the PHP script from a cron job, so that sa-learn could scan my emails and learn from them?

But what if the server goes offline? Would sa-learn lose the information before the server gets back online?
gw1500se Posted February 18, 2023

Not sure what server you are talking about, but if you add some error checking to your script you can determine whether the update was successful or not. If it was unsuccessful, you can schedule a one-time rerun after some specified amount of time, or just wait for the next regularly scheduled cron run.
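The error checking suggested above could look something like the following. This is only a minimal sketch: the function name `train_with_retry` and the `RETRY_DELAY` variable are made up for illustration, and it assumes sa-learn exits with a non-zero status when training fails.

```shell
#!/bin/sh
# Sketch: run a training command, check its exit status, retry once on failure.
# SA_LEARN and RETRY_DELAY are illustrative defaults; override them as needed.
SA_LEARN="${SA_LEARN:-/usr/local/cpanel/3rdparty/bin/sa-learn}"
RETRY_DELAY="${RETRY_DELAY:-300}"   # seconds to wait before the single retry

train_with_retry() {
    # $1 = --spam or --ham, $2 = file or directory of messages
    if "$SA_LEARN" "$1" "$2"; then
        echo "trained: $2"
        return 0
    fi
    echo "first attempt failed, retrying in ${RETRY_DELAY}s" >&2
    sleep "$RETRY_DELAY"
    if "$SA_LEARN" "$1" "$2"; then
        echo "trained on retry: $2"
        return 0
    fi
    echo "training failed: $2" >&2
    return 1
}
```

If both attempts fail, the function returns non-zero, so the cron wrapper can log the failure and simply leave the messages in place for the next scheduled run.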
kicken Posted February 18, 2023

I think maybe you're doing something that is unnecessary. Assuming you're using this class, it seems that calling the report function will automatically train on the message, so there's no reason to run sa-learn on it again.

If you did want to use sa-learn, you cannot just pass it the path to your PHP script; sa-learn does not understand PHP. You would need to run your script first to generate a file sa-learn does understand, then run sa-learn with the path to that file.
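The two-step approach described above (run the PHP script to produce a message file, then hand that file to sa-learn) might be sketched as below. The function name `export_then_learn` is hypothetical, and the sketch assumes a PHP script that prints one raw message (headers plus body) to standard output, which the script posted earlier in the thread does not do as written.

```shell
#!/bin/sh
# Sketch: export one raw message with PHP, then train sa-learn on the file.
# SA_LEARN and PHP_BIN are illustrative defaults; override them as needed.
SA_LEARN="${SA_LEARN:-/usr/local/cpanel/3rdparty/bin/sa-learn}"
PHP_BIN="${PHP_BIN:-php}"

export_then_learn() {
    # $1 = PHP script that prints one raw message to stdout (assumed)
    # $2 = --spam or --ham
    msg_file=$(mktemp) || return 1          # unique file name per message
    if "$PHP_BIN" "$1" > "$msg_file" && [ -s "$msg_file" ]; then
        "$SA_LEARN" "$2" "$msg_file"
        rc=$?
    else
        echo "export produced no message" >&2
        rc=1
    fi
    rm -f "$msg_file"                        # the file is only needed for training
    return "$rc"
}
```

Because `mktemp` creates a fresh file for every call, a message reported as spam and a message reported as not spam never share a file name, and the temporary file is removed once sa-learn has read it.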
mark107 (Author) Posted February 18, 2023 (edited)

    3 hours ago, kicken said:
    I think maybe you're doing something that is unnecessary. Assuming you're using this class, then it seems like using the report function will automatically train the message so there's no reason to run sa-learn on it again. If you did want to use sa-learn, then you cannot just pass the path to your PHP script to it, sa-learn does not understand PHP. You would need to run your script first to generate a file sa-learn does understand, then run sa-learn with the path to that file.

What do you mean, what I'm doing is unnecessary?

Yes, I am using the class, but I can also run shell_exec, which works the same way as the class I use. It can be done like this:

    $output = shell_exec('/usr/local/cpanel/3rdparty/bin/sa-learn -p /home/username/.spamassassin/user_prefs --spam /home/username/public_html/test/new');
    echo "<pre>$output</pre>";

I have been using it and it works great, but there is a problem: I need to create a new file in order to run the command. If I report 2 emails as spam and then report one of those emails as not spam, how is sa-learn supposed to know which email I reported as not spam if I use the same filename that I created in the /home/username/public_html/test/new folder?

Edited February 18, 2023 by mark107
kicken Posted February 18, 2023

    1 hour ago, mark107 said:
    What do you mean what I'm doing that is unnecessary?

Because if I am understanding that class correctly, it already runs the message through the appropriate training when you call the report, revoke, or learn methods. Since the message gets run through the training at that point, there's no need to do it again using sa-learn later.

sa-learn is for a more passive setup where you just put messages into folders and have sa-learn periodically train on those folders. It sounds like you're doing active training on individual messages instead, which means there is no need for the passive training.

Of course, we don't know all the details of what you're trying to do or build, but based on what's been provided so far it sounds like you're trying to make it way more complicated than it needs to be. For example, it kind of sounds like you're:

1. Scanning a mailbox for messages.
2. Storing those messages in a DB and deleting them from the mailbox.
3. Pulling those messages back out of the DB.
4. Using sa-learn on them to train the filter.

If your PHP code that scans the mailbox is calling that report method, then you don't need steps 3 or 4 at all. If you're not using PHP to report the message and want to use sa-learn, then it'd be simpler to just do that before you delete the messages from the mailbox, i.e.:

1. Use sa-learn to train the filter.
2. Scan the mailbox for messages.
3. Store the messages in the DB and delete them from the mailbox.
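The simpler ordering in the second list (train first, then store and delete) could be sketched as a small wrapper. The function name `train_then_import` and the idea of a separate PHP import script are assumptions for illustration; the sketch also assumes sa-learn returns a non-zero status on failure.

```shell
#!/bin/sh
# Sketch: train on the spam folder first, and only then run the PHP script
# that copies messages into the database and deletes them from the mailbox.
SA_LEARN="${SA_LEARN:-/usr/local/cpanel/3rdparty/bin/sa-learn}"
PHP_BIN="${PHP_BIN:-php}"

train_then_import() {
    # $1 = mail folder to train on
    # $2 = PHP script that stores and deletes the messages (hypothetical)
    "$SA_LEARN" --spam "$1" || return 1   # skip the import if training failed
    "$PHP_BIN" "$2"
}
```

Stopping when training fails means the messages stay in the mailbox, so the next run gets another chance to learn from them before they are deleted.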
mark107 (Author) Posted February 18, 2023 (edited)

Yes, this is what I am doing right now: I move the emails to the spam folder when I mark them as spam. I know that it can easily be done by inserting the data into the Spam DB table and deleting the emails from the other DB table. I need to let sa-learn know that I have marked the emails as spam, and I know that I don't need to use sa-learn later when I report the emails as spam or revoke them as ham.

When I report emails as spam or not spam, I have two choices:

1. Create a file containing the headers and run sa-learn to scan these spam messages.
2. Call IMAP to move the emails to the junk folder and run sa-learn to scan these spam messages.

As far as I can see, the files have already been created in "/home/username/mail/mydomain.com/myaccount/.spam/", so there is no need to create the same file with the same content. That saves my disk space from being wasted.

It sounds to me like I would need to call IMAP with the UID to move the emails to the junk folder and run sa-learn later on to scan the emails automatically and train the filter. Is that correct?

Edited February 18, 2023 by mark107
kicken Posted February 18, 2023

If you want to use sa-learn to scan the spam / junk folders in your mail directory, then your PHP code should not be doing anything related to spam control beyond possibly moving the message to the spam folder. There's no need to deal with that SpamAssassin library and report/revoke the messages; sa-learn will process them whenever it next runs.

Otherwise, continue to use your SpamAssassin library's report/revoke functions as you process messages and forget about sa-learn.
mark107 (Author) Posted February 19, 2023 (edited)

Yes I do, but I will use sa-learn to scan the spam / junk folders in my mail directory later on. I will let the cron job do the work automatically.

It sounds to me like I would need to call IMAP to move the emails to the junk folder and let the cron job run sa-learn later to scan these spam messages in the mail directory. Is that correct?

Edited February 19, 2023 by mark107
kicken Posted February 19, 2023

    3 hours ago, mark107 said:
    is that correct

Yes. When you find spam, just move the message to the junk folder and let the cron job handle it from there.
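For the cron side of that setup, a small wrapper script along these lines is one possibility. It is a sketch only: the function name `learn_spam_folder` is made up, the paths reuse the ones from the first post, and the `cur` and `new` subfolders are taken from the original cron command.

```shell
#!/bin/sh
# Sketch: train sa-learn on the cur/ and new/ subfolders of the spam folder.
# SA_LEARN and PREFS default to the paths used in the original cron command.
SA_LEARN="${SA_LEARN:-/usr/local/cpanel/3rdparty/bin/sa-learn}"
PREFS="${PREFS:-$HOME/.spamassassin/user_prefs}"

learn_spam_folder() {
    # $1 = spam folder containing cur/ and new/
    for sub in cur new; do
        dir="$1/$sub"
        [ -d "$dir" ] || continue            # skip a subfolder that does not exist
        "$SA_LEARN" -p "$PREFS" --spam "$dir" || return 1
    done
}

# Example call for a cron-driven script (account path from the first post):
#   learn_spam_folder "$HOME/mail/mydomain.com/myaccount/.spam"
```

With this in place the PHP code only has to move messages into the spam folder over IMAP; the scheduled run picks them up without any per-message files being created.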