WARNING: This is not a HOW-TO, it is a HOW-I-DID. Use the information on this page or not, and do things your own way.


1) Spamd configuration (3.0.2)

Spamd is started with the RedHat-style init script that ships with the SpamAssassin distribution, using the following options:

   OPTIONS="-d -x -m 5 --max-conn-per-child=5 --socketpath=/var/spool/spamd/spamd -u spamd"

Qmail's maximum number of incoming SMTP connections is set to 20, so I set the number of spamd children to 25.

I run spamd in unix-socket mode because, in my test over ten thousand mails, spamd was 7.8% faster over a unix socket than over TCP (see section 3).

The spamd user was created this way:

   mkdir /var/spool/spamd
   groupadd spamd
   useradd -g spamd -d /var/spool/spamd spamd
   chown spamd:spamd /var/spool/spamd

/etc/mail/spamassassin/local.cf

   # This is the right place to customize your installation of SpamAssassin.
   # See 'perldoc Mail::SpamAssassin::Conf' for details of what can be
   # tweaked.
   #
   ###########################################################################
   #
   # Some settings are same as default, but I like to see them...

   required_hits 6.5
   dns_available yes

   # Site-wide files
   use_bayes 1
   bayes_path /var/spool/spamd/bayes
   bayes_file_mode 0666
   bayes_min_ham_num 150
   bayes_min_spam_num 150

   bayes_auto_learn 1
   bayes_auto_learn_threshold_nonspam -0.5
   bayes_auto_learn_threshold_spam 11.2

   auto_whitelist_path /var/spool/spamd/whitelist
   auto_whitelist_file_mode 0666

   # DCC
   use_dcc 1
   dcc_path /usr/bin/dccproc

   # My score
   score NO_DNS_FOR_FROM 2.550
   
   # I trust my bayes database so I modified the score
   # Don't do this until you have a good database

   score BAYES_00 0 0 -1.901 -2.1
   score BAYES_05 0 0 -0.8 -0.37
   score BAYES_20 0 0 -0.6 -0.3
   score BAYES_40 0 0 -0.4 -0.2
   score BAYES_50 0 0 0.8 0.8
   score BAYES_60 0 0 3.1 3.1
   score BAYES_80 0 0 3.9 3.9
   score BAYES_95 0 0 5.4 5.4
   score BAYES_99 0 0 6.2 6.2

   score DCC_CHECK 0 1.806 0 3.5
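
A note on the four-value score lines above: SpamAssassin uses one of the four values depending on whether Bayes and network tests are in use (score sets 0 to 3). The following is only a small Python sketch of that selection. The helper name pick_score and the dictionary are made up for illustration and are not part of SpamAssassin.

```python
# Hypothetical helper, not SpamAssassin code: choose which of the four
# values on a "score" line applies to the current setup.
def pick_score(values, use_bayes, use_network):
    """values: the numbers from a score line, in score-set order 0..3."""
    if len(values) == 1:
        return values[0]  # a single value applies to every score set
    # score set 0: no Bayes, no network    1: network tests only
    # score set 2: Bayes only              3: Bayes and network tests
    score_set = (2 if use_bayes else 0) + (1 if use_network else 0)
    return values[score_set]

# Two of the score lines from the local.cf above.
scores = {
    "BAYES_99": [0, 0, 6.2, 6.2],
    "DCC_CHECK": [0, 1.806, 0, 3.5],
}

# With Bayes in use and network tests enabled, score set 3 applies,
# so the last value on each line is the one that counts.
bayes_99 = pick_score(scores["BAYES_99"], use_bayes=True, use_network=True)
dcc = pick_score(scores["DCC_CHECK"], use_bayes=True, use_network=True)
print(bayes_99, dcc)
```

This is why only the third and fourth values of the BAYES_* lines are set: the first two score sets are the ones where Bayes is not used.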


    

2) Sample of qmail-queue.log

Fri, 04 Feb 2005 12:55:32 CET:19447: +++ starting debugging for process 19447 (ppid=19445) by uid=81
Fri, 04 Feb 2005 12:55:32 CET:19447: w_c: message size 1256 bytes
Fri, 04 Feb 2005 12:55:32 CET:19447: w_c: elapsed time from start 0.514413 secs
Fri, 04 Feb 2005 12:55:32 CET:19447: return-path='marty_horn@wetworks.com.sg',
                                     recips='user-1@domain.com,user-2@domain.com'
Fri, 04 Feb 2005 12:55:32 CET:19447: from='"Marty Horn" ',
                                     subj='New product! Cialis soft tabs.',
                                     via SMTP from 219.241.47.201
Fri, 04 Feb 2005 12:55:32 CET:19447: s_p_d: we have multiple recipient, checking each of them
Fri, 04 Feb 2005 12:55:32 CET:19447: s_p_d: recipient 'user-1@domain.com',
                                     scanners 'sophie_scanner,spamassassin,perlscan_scanner'
Fri, 04 Feb 2005 12:55:32 CET:19447: sophie: finished scan in 0.023313 secs
Fri, 04 Feb 2005 12:55:34 CET:19447: SA: REPORT hits = 23.7/6.5
  1.3 SUBJECT_DRUG_GAP_C     Subject contains a gappy version of 'cialis'
  6.2 BAYES_99               BODY: Bayesian spam probability is 99 to 100%
                             [score: 1.0000]
  3.5 DCC_CHECK              Listed in DCC (http://rhyolite.com/anti-spam/dcc/)
  0.1 RCVD_IN_NJABL_DUL      RBL: NJABL: dialup sender did non-local SMTP
                             [219.241.47.201 listed in combined.njabl.org]
  2.0 RCVD_IN_SORBS_DUL      RBL: SORBS: sent directly from dynamic IP address
                             [219.241.47.201 listed in dnsbl.sorbs.net]
  1.0 URIBL_SBL              Contains an URL listed in the SBL blocklist
                             [URIs: gfdgfppp.com]
  3.2 URIBL_OB_SURBL         Contains an URL listed in the OB SURBL blocklist
                             [URIs: gfdgfppp.com]
  4.3 URIBL_SC_SURBL         Contains an URL listed in the SC SURBL blocklist
                             [URIs: gfdgfppp.com]
  0.4 URIBL_AB_SURBL         Contains an URL listed in the AB SURBL blocklist
                             [URIs: gfdgfppp.com]
  1.5 URIBL_WS_SURBL         Contains an URL listed in the WS SURBL blocklist
                             [URIs: gfdgfppp.com]
  0.2 DRUGS_ERECTILE         Refers to an erectile drug
Fri, 04 Feb 2005 12:55:34 CET:19447: SA: yup, this smells like SPAM - hits=23.7 - rejecting message...
Fri, 04 Feb 2005 12:55:34 CET:19447: SA: finished scan in 1.623569 secs - hits=23.7
Fri, 04 Feb 2005 12:55:34 CET:19447: r_e: QS-1.25st: We have reasons to believe this mail is SPAM
Fri, 04 Feb 2005 12:55:34 CET:19447: ------ Process 19447/19445 finished. Total of 2.179352 secs
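
The REPORT block in the log above can be cross-checked by adding up the per-rule scores and comparing the total with required_hits. A rough Python sketch, assuming the report text is pasted in as a string; the regular expression and the variable names are my own and not part of qmail-scanner or SpamAssassin.

```python
import re

# Report lines copied from the log above, continuation lines included.
report = """\
  1.3 SUBJECT_DRUG_GAP_C     Subject contains a gappy version of 'cialis'
  6.2 BAYES_99               BODY: Bayesian spam probability is 99 to 100%
                             [score: 1.0000]
  3.5 DCC_CHECK              Listed in DCC (http://rhyolite.com/anti-spam/dcc/)
  0.1 RCVD_IN_NJABL_DUL      RBL: NJABL: dialup sender did non-local SMTP
                             [219.241.47.201 listed in combined.njabl.org]
  2.0 RCVD_IN_SORBS_DUL      RBL: SORBS: sent directly from dynamic IP address
                             [219.241.47.201 listed in dnsbl.sorbs.net]
  1.0 URIBL_SBL              Contains an URL listed in the SBL blocklist
                             [URIs: gfdgfppp.com]
  3.2 URIBL_OB_SURBL         Contains an URL listed in the OB SURBL blocklist
                             [URIs: gfdgfppp.com]
  4.3 URIBL_SC_SURBL         Contains an URL listed in the SC SURBL blocklist
                             [URIs: gfdgfppp.com]
  0.4 URIBL_AB_SURBL         Contains an URL listed in the AB SURBL blocklist
                             [URIs: gfdgfppp.com]
  1.5 URIBL_WS_SURBL         Contains an URL listed in the WS SURBL blocklist
                             [URIs: gfdgfppp.com]
  0.2 DRUGS_ERECTILE         Refers to an erectile drug
"""
REQUIRED_HITS = 6.5  # required_hits from local.cf

# A rule line starts with a score followed by the rule name; the bracketed
# continuation lines do not match and are skipped.
rule_line = re.compile(r"^\s*(-?\d+\.\d+)\s+([A-Z][A-Z0-9_]*)\b")

hits = {}
for line in report.splitlines():
    m = rule_line.match(line)
    if m:
        hits[m.group(2)] = float(m.group(1))

total = round(sum(hits.values()), 1)
is_spam = total >= REQUIRED_HITS
print(total, is_spam)
```

The eleven rule scores add up to the 23.7 shown on the "SA: REPORT hits" line, well above the 6.5 threshold.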
    

3) Spamassassin: tcp-server vs. unix-socket

Test done on a dedicated mailhub:
HW: Pentium 4 2.4 GHz, 1 GB RAM, SCSI hard disk (Adaptec 29160 controller).
SW: RedHat 7.3, kernel 2.4.26, perl 5.6.1, SpamAssassin 2.63.

   Spamassassin TCP-SERVER mode

   Average: 2.0614
   Median:  1.1995
   Std_dev: 3.2359

   Spamassassin UNIX-SOCKET mode (7.8% faster)

   Average: 1.9124
   Median:  1.0033
   Std_dev: 2.7514
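
For reference, the 7.8% figure follows from the two averages when the difference is taken relative to the unix-socket time. A small Python sketch of the arithmetic, assuming the averages are scan times in seconds per message (the tables do not state the unit):

```python
# Averages from the two runs above (unit assumed to be seconds).
tcp_avg = 2.0614
unix_avg = 1.9124

# Difference relative to the unix-socket time: this gives the quoted figure.
faster_pct = (tcp_avg - unix_avg) / unix_avg * 100

# Difference relative to the TCP time: the time saved per message.
saved_pct = (tcp_avg - unix_avg) / tcp_avg * 100

print(round(faster_pct, 1), round(saved_pct, 1))
```

Measured against the TCP average instead, the saving comes out a little lower, at about 7.2%.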
    


Salvatore Toribio

20050213