PHP forking
Since I’m experimenting with a high-performance SMTP cluster in my lab, I needed to find out which method of sending mail from PHP is the most effective, so I can get the most out of my setup. This allows me to pinpoint bottlenecks and find the best way to deliver e-mails.
First, let me clarify the set-up I’m now using, so you can better understand the following benchmarks.
I have a HAProxy server, which balances the load between 3 SMTP servers that reside behind it. All SMTP connections are therefore handled by HAProxy, which finds the best available host and forwards the SMTP request for further processing. In all the tests I’ve done, all traffic passed through this load balancer. The only differences were in how the mails got to that balancer: sent sequentially, or by forking each mail action? Delivered directly to the load balancer, or routed via a relay on localhost first?
This is the small PHP script I used to send the mail. I’ve used the same script for all my tests, to keep it objective.
send($recipients, $headers, $mailmsg);
/* Mail was sent, if this was a forked child process, end it here */
if (!defined('DONT_EXIT'))
exit();
?>
The tests were all done for sending 1.000 mails, via each method.
First test: sending the mails sequentially (one at a time).
I’ve split this up in 2 ways:
- Sending the mails directly to the Load Balancer
- Sending them to the localhost relay server (Postfix)
I modified the script above (send-mail.php), to change the SMTP-host to either “localhost” (for local delivery), or the load balancer’s hostname. This is how I looped them.
send-mails-not-forked.php
<?php
define('MAILS_TO_SEND', 1000); // How many mails to send, in total?
define('DONT_EXIT', true);     // Needed to keep the include from exit()-ing

for ($i = 0; $i < MAILS_TO_SEND; $i++) {
    include("send-mail.php");
    echo "."; // Some output, to show something via terminal
}
echo "\n";
?>
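Timings like the ones below can be taken with a small wall-clock wrapper. This is only a minimal sketch, not the harness used for these tests: timeRun() is a hypothetical helper name, and the usleep() call merely stands in for the sending loop.

```php
<?php
/**
 * Hypothetical helper: run $job once and return the elapsed
 * wall-clock time in seconds.
 */
function timeRun(callable $job): float
{
    $start = microtime(true);   // current Unix time as a float, in seconds
    $job();
    return microtime(true) - $start;
}

// Stand-in workload; replace the closure body with the sending loop.
$elapsed = timeRun(function (): void {
    for ($i = 0; $i < 5; $i++) {
        usleep(10000);          // pretend each "mail" takes 10 ms
    }
});

printf("Sent in %.2f seconds\n", $elapsed);
```

Keep in mind that, for the relay tests, a wrapper like this only captures the time PHP needs to hand the mails to localhost. The time the local queue needs to drain has to be measured on the mail server side.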
OK, enough talk. Let’s get benchmarking.
Via localhost relay server
This local server was set up as a relay SMTP host, which forwarded all received requests to the load balancer.
This turned out to be very effective. The local server accepted the mails very fast, and placed them into the local queue to process them one by one. By placing them in that local queue first, PHP was able to “send” those messages quickly – since the receiving mailserver (localhost) accepted the message, PHP considered it “delivered” and proceeded to the next one.
Once PHP had handed all mails to the local queue, the Postfix queue manager stepped in to process them and deliver them to the load balancer. From PHP script to load balancer, sending 1.000 mails took 56 seconds.
Directly talking to the load balancer
Using PEAR::Mail functions, the PHP script communicated directly with the load balancer to deliver the mails. This meant the mail never passed through the local relay server.
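Switching between the two routes comes down to the SMTP host that is handed to PEAR::Mail. The sketch below shows one way to do that, under a few assumptions: smtpParams() is a hypothetical helper, lb.example.test is a placeholder for the load balancer’s hostname, and port 25 is assumed for both routes.

```php
<?php
/**
 * Hypothetical helper: build the parameter array for PEAR's
 * Mail::factory('smtp', ...) for one of the two delivery routes.
 * 'lb.example.test' is a placeholder, not the real balancer.
 */
function smtpParams(string $route): array
{
    $hosts = [
        'relay'  => 'localhost',        // local Postfix relay
        'direct' => 'lb.example.test',  // load balancer (placeholder)
    ];
    if (!isset($hosts[$route])) {
        throw new InvalidArgumentException("Unknown route: $route");
    }
    return ['host' => $hosts[$route], 'port' => 25]; // port 25 is assumed
}

$params = smtpParams('direct');
echo $params['host'], ':', $params['port'], "\n";

// With PEAR Mail installed, the array would be passed on like this:
// require_once 'Mail.php';
// $mail = Mail::factory('smtp', $params);
// $mail->send($recipients, $headers, $mailmsg);
```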
By avoiding the local mail queue, I had assumed it would cause a noticeable difference, but as it turns out, it doesn’t. By sequentially sending all mails directly to this load balancer, it took a total of 58 seconds for 1.000 mails. That’s in fact slightly slower than sending them via a localhost relay.
Here are the results plotted out.
The graph above shows the result of sending 1.000 e-mails this way. The first bar consists of 2 items:
- Green: timing to deliver all mails via the PHP script to the localhost relay: 6 seconds
- Red: timing to deliver all mails from the local queue, to the load balancer: 50 seconds
The second bar shows how long it took if PHP sent the requests directly to the load balancer. By routing the mails via localhost first, it took 56 seconds, as opposed to directly talking to the load balancer, which took 58 seconds. The difference is hardly noticeable.
If you were limited by these choices, I would opt for the second one: having PHP talk directly to the load balancer. It avoids placing a (small) load on the local mailservice, and “offloads” that task directly to the other systems.
Second test: forking the actual sending of the mail
The problem with sending mails sequentially is that PHP is forced to wait until the send() function has completed before it can continue. This means you’re effectively sending mails one by one, instead of sending several in parallel. This is where forking comes in.
By forking the PHP process (actually copying the process into a new, unique and independent process) we can avoid having to wait for each send() call to complete, and continue spawning more child processes. This opens multiple SMTP connections at once, which should be faster than sending the mails one by one.
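Stripped of everything mail-related, the fork-and-wait pattern could look roughly like this. It is a minimal sketch that assumes the pcntl extension is available on the command line; runInChildren() is a hypothetical helper name, and usleep() stands in for a slow send() call.

```php
<?php
/**
 * Hypothetical helper: fork $count children, run $task in each one,
 * and return how many of them exited with status 0.
 */
function runInChildren(int $count, callable $task): int
{
    $pids = [];
    for ($i = 0; $i < $count; $i++) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            break;              // fork failed; stop spawning
        }
        if ($pid === 0) {
            $task($i);          // child: do the work ...
            exit(0);            // ... and never fall back into the loop
        }
        $pids[] = $pid;         // parent: remember the child's PID
    }

    $ok = 0;
    foreach ($pids as $pid) {
        pcntl_waitpid($pid, $status);   // block until this child has ended
        if (pcntl_wifexited($status) && pcntl_wexitstatus($status) === 0) {
            $ok++;
        }
    }
    return $ok;
}

if (function_exists('pcntl_fork')) {
    $done = runInChildren(4, function (int $i): void {
        usleep(100000);         // stand-in for a slow send() call
    });
    echo "$done children finished cleanly\n";
} else {
    echo "pcntl extension not available\n";
}
```

Because the children sleep concurrently, the whole run should take roughly as long as one sleep rather than four. The explicit exit(0) in the child matters: without it, each child would continue the loop and start forking children of its own.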
This is the script I used to fork the processes. I built in a “limiter” that allows only 128 simultaneous child processes, as spawning too many (uncontrolled) processes could otherwise crash your server. This would, of course, need further fine-tuning if you were to use it in real life.
<?php
define('MAILS_TO_SEND', 1000);       // How many mails to send, in total?
define('MAX_CHILD_PROCESSES', 128);  // Limit on simultaneous child processes

$arrChildPids = array();

/* Wait until every child process in the given list has ended */
function waitForForkedChildren($arrChildPids) {
    foreach ($arrChildPids as $key => $value) {
        pcntl_waitpid($arrChildPids[$key], $status, WUNTRACED);
    }
    return true;
}

/* Loop our mailscript several times */
echo "Start spawning all child processes.\n";
for ($i = 0; $i < MAILS_TO_SEND; $i++) {
    /* Fork the process, so it runs independently.
       The parent process has the Process ID in this variable,
       whereas the child process does not. This allows us to know
       if this process is the parent, or the child */
    $arrChildPids[$i] = pcntl_fork();
    if (!$arrChildPids[$i]) {
        // This gets executed as the "child" process
        include("send-mail.php");
    }

    /* This keeps output flowing.
       Simple way of seeing it's still running, and prevents SSH timeouts */
    echo ".";

    /* Simple "garbage" collection.
       Don't have more than MAX_CHILD_PROCESSES simultaneous children */
    if ($i != 0 && $i % MAX_CHILD_PROCESSES == 0) {
        echo "\nRecycling children after ". MAX_CHILD_PROCESSES ." spawned children.\n";
        waitForForkedChildren($arrChildPids);
        echo "Done. Resuming sending now.\n";

        /* Clean up our array of Child PIDs */
        $arrChildPids = array();
    }
}
echo "\nAll children spawned.\n\n";

/* Check to see if all our child processes ended already */
echo "Waiting for all spawned child processes to end.\n";
waitForForkedChildren($arrChildPids);
echo "Done.\n";
?>
Via localhost relay server
Sending them in parallel (forking processes) caused a considerably larger load on the sending webserver. As it turns out, handing the mails to the localhost relay for queueing is slightly slower this way, but Postfix manages to send them on more efficiently because it receives multiple mails at the same time.
In total, sending these mails by forking processes and using the localhost relay took 52 seconds for 1.000 mails.
Directly talking to the load balancer
If we let PHP talk directly to the load balancer, without passing via localhost, it took a total of 43 seconds to send these 1.000 mails. In a graph, it would look like this.
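The four totals can also be compared as throughput figures. The short script below is only arithmetic on the timings reported in this post; the array labels are my own shorthand for the four tests.

```php
<?php
// Total seconds to deliver 1,000 mails, as measured in the tests above.
$results = [
    'sequential, via localhost relay' => 56,
    'sequential, direct to balancer'  => 58,
    'forked, via localhost relay'     => 52,
    'forked, direct to balancer'      => 43,
];
$mails = 1000;

$rates = [];
foreach ($results as $method => $seconds) {
    $rates[$method] = round($mails / $seconds, 1);
    printf("%-33s %2d s  %5.1f mails/s\n", $method, $seconds, $rates[$method]);
}

// Gain of the fastest method over the slowest one.
$speedup = round(max($results) / min($results), 2);
printf("Fastest vs. slowest: %.2fx\n", $speedup);
```

In these terms, throughput goes from roughly 17 mails per second for the sequential runs to roughly 23 for the forked, direct run, which makes the fastest method about 1.35 times as fast as the slowest.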
Conclusion: Fork or use local relays
In conclusion there are a few things still left to mention.
- Forking causes a serious increase in server-load for PHP
- Using local relay adds another possibly failing service
If you have CPU power left to spare, I would seriously consider forking your actual sending of the mail, to have it processed in a child process. Let that process handle the mailing, so your script can continue doing something else. If you don’t have the extra CPU power, or your host doesn’t allow forking, consider setting up a local relay server that simply accepts all mails, and sends them out to your load balancer as fast as resources allow it. It will speed up your PHP script, but will cause additional load on your local SMTP service.