Maximizing PHP’s Mail Throughput: To Fork Or Not To Fork?



Mattias Geniar, September 30, 2010



Since I’m experimenting with a high-performance SMTP cluster in my lab, I needed to find out which method of sending mails via PHP is the most effective, to get the most out of my setup. This allows me to pinpoint bottlenecks and find the best way to deliver e-mails.

First, let me clarify the set-up I’m now using, so you can better understand the following benchmarks.

HAProxy + SMTP hosts

I have a HAProxy server, which balances the load between 3 SMTP servers that reside behind it. All SMTP connections are therefore handled by the HAProxy, which finds the best available host, and forwards the SMTP request for further processing. For all the tests I’ve done, all traffic was being passed through this load balancer. The only difference was in getting the mails to that balancer: directly, or via the local server? By forking each mail action or by sending them sequentially? By directly accessing the load balancer, or by routing them via localhost first?

This is the small PHP script I used to send the mail. I’ve used the same script for all my tests, to keep it objective.

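<?php
require_once 'Mail.php'; // PEAR::Mail
/* Placeholder message; set the host to "localhost" or the load balancer's hostname */
$recipients = 'recipient@example.com';
$headers = array('From' => 'sender@example.com', 'To' => $recipients, 'Subject' => 'Benchmark');
$mailmsg = 'Benchmark test message';
$smtp = Mail::factory('smtp', array('host' => 'localhost'));
$smtp->send($recipients, $headers, $mailmsg);
/* Mail was sent, if this was a forked child process, end it here */
if (!defined('DONT_EXIT'))
    exit();
?>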

Each test sent 1,000 mails per method.
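A minimal sketch of how such timings can be gathered, assuming a hypothetical `benchmarkBatch()` helper and a `$sendOne` callable that stands in for including send-mail.php (neither is part of the original scripts). Note that it only measures the time spent inside PHP. When mails go through a local relay, the time the queue needs to drain has to be measured separately.

```php
<?php
/**
 * Time a batch of sends. $sendOne is a hypothetical callable that delivers
 * a single mail; it stands in for including send-mail.php.
 * Returns the elapsed seconds and the resulting mails-per-second rate.
 */
function benchmarkBatch(callable $sendOne, int $count): array
{
    $start = microtime(true);           // wall-clock time as float seconds
    for ($i = 0; $i < $count; $i++) {
        $sendOne($i);
    }
    $elapsed = microtime(true) - $start;

    return [
        'seconds'    => $elapsed,
        'per_second' => $elapsed > 0 ? $count / $elapsed : INF,
    ];
}

// Usage with a stand-in sender that only sleeps for 1 ms per "mail"
$result = benchmarkBatch(function (int $i): void { usleep(1000); }, 50);
printf("%d mails in %.3f s (%.1f mails/s)\n", 50, $result['seconds'], $result['per_second']);
```

Swapping the stand-in closure for a real send call keeps the measurement code identical across all test variants.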

First test: sending the mails sequentially (one at a time).

I’ve split this into two variants:

  • Sending the mails directly to the Load Balancer
  • Sending them to the localhost relay server (Postfix)

I modified the script above (send-mail.php), to change the SMTP-host to either “localhost” (for local delivery), or the load balancer’s hostname. This is how I looped them.

send-mails-not-forked.php

<?php
define('MAILS_TO_SEND', 1000); // How many mails to send, in total?
define('DONT_EXIT', true);     // Needed to keep the include from exit()-ing

for ($i = 0; $i < MAILS_TO_SEND; $i++) {
    include("send-mail.php");
    echo "."; // Some output, to show something via terminal
}

echo "\n";
?>

OK, enough talk. Let’s get benchmarking.

Via localhost relay server

This local server was set up as a relay SMTP host, which forwarded all received requests to the load balancer.

PHP > Local Relay > Load Balancer

This turned out to be very effective. The local server accepted the mails very fast, and placed them into the local queue to process them one by one. By placing them in that local queue first, PHP was able to “send” those messages quickly – since the receiving mailserver (localhost) accepted the message, PHP considered it “delivered” and proceeded to the next one.

When PHP was done sending all mails to the local queue, the Postfix queue manager stepped in to process them and deliver them to the load balancer. From PHP script to load balancer, it took 56 seconds to send 1,000 mails.

Directly talking to the load balancer

Using PEAR::Mail functions, the PHP script communicated directly with the load balancer to deliver the mails. This meant the mail never passed through the local relay server.

PHP > Load Balancer

I had assumed that avoiding the local mail queue would make a noticeable difference, but as it turns out, it doesn’t. Sending all mails sequentially and directly to the load balancer took a total of 58 seconds for 1,000 mails. That’s in fact slightly slower than sending them via a localhost relay.

Here are the results plotted out.

Sequential delivery: via localhost vs. direct communication

The graph above shows the result of sending 1,000 e-mails this way. The first bar consists of 2 items:

  • Green: timing to deliver all mails via the PHP script to the localhost relay: 6 seconds
  • Red: timing to deliver all mails from the local queue, to the load balancer: 50 seconds

The second bar shows how long it took if PHP sent the requests directly to the load balancer. By routing the mails via localhost first, it took 56 seconds, as opposed to directly talking to the load balancer, which took 58 seconds. The difference is hardly noticeable.

If you were limited to these two choices, I would opt for the second one: having PHP talk directly to the load balancer. It avoids placing a (small) load on the local mail service, and “offloads” that task directly to the other systems.

Second test: forking the actual sending of the mail

The problem with sending mails sequentially, is that PHP is forced to wait until the send() function is completed, before it can continue. This means you’re effectively only sending mails one by one, instead of sending more in parallel. This is where forking comes in.

By forking the PHP process, that is, copying the process into a new, unique and independent process, we can avoid having to wait for each send() call to complete, and continue with spawning more child processes. This causes multiple SMTP connections to be made, which should be faster than sending them one by one.
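To make the mechanics concrete, here is a minimal sketch of a single fork, assuming the pcntl extension is available (CLI on a POSIX system). The `usleep()` call is only a stand-in for a blocking send() call. It shows the three possible return values of `pcntl_fork()`: -1 on failure, 0 in the child, and the child’s process ID in the parent.

```php
<?php
/* Requires the pcntl extension (CLI only, POSIX systems). */
$pid = pcntl_fork();

if ($pid === -1) {
    // Fork failed (e.g. process limit reached); nothing was spawned
    fwrite(STDERR, "Could not fork\n");
    exit(1);
} elseif ($pid === 0) {
    // Child: do the slow work here, then exit so it never runs the parent's code
    usleep(100000); // stand-in for a blocking send() call
    exit(0);
}

// Parent: continues immediately, $pid holds the child's process ID
echo "Spawned child {$pid}, parent keeps going.\n";

// Reap the child so it does not linger as a zombie process
pcntl_waitpid($pid, $status);
echo "Child exited with code " . pcntl_wexitstatus($status) . ".\n";
```

The explicit `exit(0)` in the child matters: without it, the child would carry on executing the parent’s remaining code.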

This is the script I used to fork the processes. I built in a “limiter” that allows only 128 simultaneous child processes, as spawning too many (uncontrolled) processes could otherwise crash your server. This would, of course, need further fine-tuning if you were to use it in production.

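<?php
        define('MAILS_TO_SEND', 1000);      // How many mails to send, in total?
        define('MAX_CHILD_PROCESSES', 128); // Limit on simultaneous child processes

        $arrChildPids = array();

        /* Block until every child process in the list has ended */
        function waitForForkedChildren($arrChildPids) {
                foreach ($arrChildPids as $key => $value) {
                        pcntl_waitpid($arrChildPids[$key], $status, WUNTRACED);
                }

                return true;
        }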

        /* Loop our mailscript several times */
        echo "Start spawning all child processes.\n";
        for ($i = 0; $i < MAILS_TO_SEND; $i++) {
                /* Fork the process, so it runs independently.
                   In the parent, pcntl_fork() returns the child's Process ID;
                   in the child, it returns 0.

                   This allows us to know if this process is the parent, or the child
                */
                $arrChildPids[$i] = pcntl_fork();


                if (!$arrChildPids[$i]) {
                        // This gets executed as the "child" process
                        include("send-mail.php");
                }

                /* This keeps output flowing
                   Simple way of seeing it's still running, 
                   and prevents SSH timeouts */
                echo ".";

                /* Simple "garbage" collection. Don't have more than MAX_CHILD_PROCESSES simultaneous children */
                if ($i != 0 && $i % MAX_CHILD_PROCESSES == 0) {
                        echo "\nRecycling children after ". MAX_CHILD_PROCESSES ." spawned children.\n";
                        waitForForkedChildren($arrChildPids);
                        echo "Done. Resuming sending now.\n";

                        /* Clean up our array of Child PIDs */
                        $arrChildPids = array();
                }
        }
        echo "\nAll children spawned.\n\n";

        /* Check to see if all our child processes ended already */
        echo "Waiting for all spawned child processes to end.\n";
        waitForForkedChildren($arrChildPids);
        echo "Done.\n";
?>
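One possible direction for that fine-tuning, sketched here as an assumption rather than as the method used in the benchmark: instead of waiting for a whole batch of children to end, reap one child with `pcntl_wait()` as soon as the limit is reached and reuse its slot. The helper name `runWithWindow()` and the stand-in job are hypothetical.

```php
<?php
/**
 * Run $job($i) in a forked child for each $i in [0, $total), keeping at most
 * $maxChildren (assumed >= 1) children alive at once.
 * Returns the number of children reaped.
 */
function runWithWindow(callable $job, int $total, int $maxChildren): int
{
    $running = 0;
    $reaped  = 0;

    for ($i = 0; $i < $total; $i++) {
        // Window full: block until any one child ends, then reuse its slot
        if ($running >= $maxChildren && pcntl_wait($status) > 0) {
            $running--;
            $reaped++;
        }

        $pid = pcntl_fork();
        if ($pid === -1) {
            fwrite(STDERR, "Fork failed at job {$i}\n");
            break;
        }
        if ($pid === 0) {
            $job($i);   // child does the work...
            exit(0);    // ...and must never fall back into the loop
        }
        $running++;
    }

    // Drain whatever is still running
    while ($running > 0 && pcntl_wait($status) > 0) {
        $running--;
        $reaped++;
    }

    return $reaped;
}

// Usage: 20 stand-in jobs, at most 4 at a time
$done = runWithWindow(function (int $i): void { usleep(50000); }, 20, 4);
echo "Reaped {$done} children.\n";
```

Compared with batch recycling, this keeps the number of open SMTP connections close to the limit at all times, at the cost of slightly more bookkeeping.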

Via localhost relay server

PHP > Local Relay > Load Balancer (forked)

Sending them in parallel (forking processes) caused a considerably larger load on the sending webserver. As it turns out, it’s slightly slower to send the mails to the localhost relay for queueing, but Postfix manages to send them on more efficiently, as it receives multiple mails at the same time.

In total, forking the processes and using the localhost relay took 52 seconds to send 1,000 mails.

Directly talking to the load balancer

PHP > Load Balancer (forked)

If we let PHP talk directly to the load balancer, without passing via localhost, it took a total of 43 seconds to send these 1,000 mails. In a graph, it would look like this.

Forked delivery: via localhost vs. direct communication
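To compare the four measurements on one scale, the totals can be converted to mails per second. The sketch below uses only the timings reported in this post. The `throughput()` helper is a hypothetical name introduced for illustration.

```php
<?php
/* Total seconds to send 1,000 mails, as measured in the four tests */
$results = [
    'sequential, via localhost relay' => 56,
    'sequential, direct'              => 58,
    'forked, via localhost relay'     => 52,
    'forked, direct'                  => 43,
];

/** Mails per second for a batch of $mails sent in $seconds. */
function throughput(int $mails, float $seconds): float
{
    return $mails / $seconds;
}

foreach ($results as $method => $seconds) {
    printf("%-33s %2d s  %5.1f mails/s\n", $method, $seconds, throughput(1000, $seconds));
}

/* Time saved by forked direct delivery compared to sequential direct delivery */
$saving = (58 - 43) / 58 * 100;
printf("Forked direct delivery took %.0f%% less time than sequential direct.\n", $saving);
```

That works out to roughly 17.9, 17.2, 19.2 and 23.3 mails per second respectively, with forked direct delivery taking about 26% less time than sequential direct delivery.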

Conclusion: Fork or use local relays


PHP mail benchmark: conclusion

In conclusion there are a few things still left to mention.

  • Forking causes a serious increase in server-load for PHP
  • Using local relay adds another possibly failing service

If you have CPU power to spare, I would seriously consider forking the actual sending of the mail, so it is processed in a child process. Let that process handle the mailing, so your script can continue doing something else. If you don’t have the extra CPU power, or your host doesn’t allow forking, consider setting up a local relay server that simply accepts all mails, and sends them out to your load balancer as fast as resources allow. It will speed up your PHP script, but will cause additional load on your local SMTP service.


