I recently noticed that my SpamAssassin Bayes filter did not seem to work, even though I run a daily script that uses sa-learn to learn ham and spam tokens. Looking at the headers of spam mail, I often saw something like:
X-Spam-Status: No, score=1.411 tagged_above=1 required=4.5 tests=[BAYES_00=-1.9, DKIM_ADSP_CUSTOM_MED=0.001, FORGED_YAHOO_RCVD=1.63, FREEMAIL_FROM=0.001, NML_ADSP_CUSTOM_MED=0.9, SPF_NEUTRAL=0.779] autolearn=no
As you can see, the Bayes test scored this spam as BAYES_00, as if the filter were completely untrained. How can this happen when I have thousands of learned hams and spams in my database?
Well, for me there was an easy explanation…
My sa-learn script runs as a cron job in the root context and learns into the database located at /root/.spamassassin/. But SpamAssassin is called in the context of amavis, which uses /var/lib/amavis/.spamassassin/.
You can check this easily:
sudo sa-learn -D --dump magic
[...]
Okt 2 13:10:32.216 [17743] dbg: config: using "/root/.spamassassin" for user state dir
Okt 2 13:10:32.216 [17743] dbg: bayes: tie-ing to DB file R/O /root/.spamassassin/bayes_toks
Okt 2 13:10:32.216 [17743] dbg: bayes: tie-ing to DB file R/O /root/.spamassassin/bayes_seen
Okt 2 13:10:32.217 [17743] dbg: bayes: found bayes db version 3
[...]
vs.
su amavis -c "sa-learn -D --dump magic"
[...]
Okt 2 14:26:22.931 [20051] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x3870488) implements 'learner_is_scan_available', priority 0
Okt 2 14:26:22.931 [20051] dbg: bayes: tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_toks
Okt 2 14:26:22.932 [20051] dbg: bayes: tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_seen
Okt 2 14:26:22.932 [20051] dbg: bayes: found bayes db version 3
[...]
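To compare the two contexts without reading the whole debug log, the relevant line can be filtered out. The following is only a minimal sketch: the function name bayes_db_path is made up for this post, and it assumes that the debug output contains a "tie-ing to DB file" line as shown above and that sa-learn writes its debug messages to stderr.

```shell
#!/bin/sh
# bayes_db_path: hypothetical helper, not part of SpamAssassin.
# Reads sa-learn debug output on stdin and prints the first bayes_toks
# file mentioned in a "tie-ing to DB file" line.
bayes_db_path() {
    sed -n 's/.*bayes: tie-ing to DB file [^ ]* \([^ ]*bayes_toks\).*/\1/p' | head -n 1
}

# Usage (assuming debug output goes to stderr, hence 2>&1):
#   sudo sa-learn -D --dump magic 2>&1 | bayes_db_path
#   su amavis -c "sa-learn -D --dump magic" 2>&1 | bayes_db_path
```

If the two commands print different paths, learning and scanning are using different databases.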
Okay, now that we know what is wrong, how do we fix it?
1. Export the learned data from the root context bayes database
sudo sa-learn --backup > /tmp/bayes-backup.txt
sudo chown amavis:amavis /tmp/bayes-backup.txt
2. Back up your amavis-related Bayes database (just in case something goes wrong and you have to roll back)
su amavis -c "cp -R /var/lib/amavis/.spamassassin /var/lib/amavis/.spamassassin_bkp152001"
3. Import the learned data to the amavis context bayes database
su amavis -c "sa-learn --restore /tmp/bayes-backup.txt"
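After the import it is worth checking that the amavis database really holds the learned messages. The following is a rough sketch, assuming that sa-learn --dump magic prints its counters on lines ending in "non-token data: nspam" and "non-token data: nham" with the count in the third column (that is the layout I remember, so compare it with your own output first). The helper name magic_count is made up.

```shell
#!/bin/sh
# magic_count: hypothetical helper, not part of SpamAssassin.
# $1 = counter name (nspam or nham); reads "sa-learn --dump magic" output on stdin
# and prints the third column of the matching "non-token data" line.
magic_count() {
    awk -v key="$1" '/non-token data:/ && $NF == key { print $3 }'
}

# Usage:
#   su amavis -c "sa-learn --dump magic" | magic_count nspam
#   su amavis -c "sa-learn --dump magic" | magic_count nham
```

The numbers should match what the same commands report for the root context before the move.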
4. Make sure sa-learn uses amavis context from now on
For this, edit your /etc/spamassassin/local.cf and add the following line:
[...]
bayes_path /var/lib/amavis/.spamassassin/bayes
[...]
Note that although it says “path”, the parameter contains a filename prefix (bayes).
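To make the prefix behaviour concrete: the debug output earlier shows SpamAssassin opening bayes_toks and bayes_seen, that is, the configured value with a suffix appended. Here is a small sketch that reads the setting from a config file and prints the file names it implies. bayes_files is a hypothetical helper, and it assumes a single bayes_path line in the file you pass in.

```shell
#!/bin/sh
# bayes_files: hypothetical helper, not part of SpamAssassin.
# $1 = path to a SpamAssassin config file such as /etc/spamassassin/local.cf
# Prints the database files implied by the bayes_path prefix.
bayes_files() {
    prefix=$(awk '$1 == "bayes_path" { print $2; exit }' "$1")
    # Fail if the file does not set bayes_path at all.
    [ -n "$prefix" ] || return 1
    for suffix in toks seen; do
        printf '%s_%s\n' "$prefix" "$suffix"
    done
}

# Usage:
#   bayes_files /etc/spamassassin/local.cf
```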
5. Check if sa-learn REALLY uses the amavis context database location
sudo sa-learn -D --dump magic
[...]
Okt 2 15:24:16.614 [21749] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x387c208) implements 'learner_is_scan_available', priority 0
Okt 2 15:24:16.615 [21749] dbg: bayes: tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_toks
Okt 2 15:24:16.615 [21749] dbg: bayes: tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_seen
Okt 2 15:24:16.616 [21749] dbg: bayes: found bayes db version 3
[...]
If you see this, you were successful. From now on, all Bayes-related SpamAssassin activity takes place in the amavis context.
Thanks!