The (unexpected?) workload associated with migrating to HHVM



Mattias Geniar, December 30, 2014


The Wikimedia Foundation, the non-profit behind top-10 website Wikipedia, published an interesting article on how they moved from vanilla PHP to HHVM, Facebook’s alternative PHP runtime.

HipHop Virtual Machine, or HHVM, reduces the median page-saving time for editors from about 7.5 seconds to 2.5 seconds, and the mean page-saving time from about 6 to 3 seconds.

Those are impressive numbers, and they match the HHVM performance benchmarks I ran a few months earlier. HHVM is faster than PHP because it uses an entirely different engine: instead of interpreting PHP code, it compiles it to machine code at runtime with a just-in-time (JIT) compiler.

But besides the obvious 2x to 3x improvement in page-saving times, they also saw CPU usage drop by an impressive 40 percentage points on their app servers.

The CPU load on our app servers has dropped drastically, from about 50% to 10%. Our TechOps team member Giuseppe Lavagetto reports that we have already been able to slash our planned purchases for new MediaWiki application servers substantially, compared to what would have been necessary without HHVM.
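To put the quoted figures side by side, here is a minimal Python sketch of the arithmetic. The numbers come straight from the Wikimedia quotes above; the function and variable names are my own, purely for illustration.

```python
def speedup(before: float, after: float) -> float:
    """How many times faster 'after' is compared to 'before'."""
    return before / after


def drop(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return the drop in percentage points and as a relative percentage."""
    points = before_pct - after_pct
    relative = points * 100 / before_pct
    return points, relative


# Page-saving times in seconds, as quoted by Wikimedia.
median_speedup = speedup(7.5, 2.5)
mean_speedup = speedup(6.0, 3.0)

# CPU load on the app servers, in percent, as quoted by Wikimedia.
cpu_points, cpu_relative = drop(50.0, 10.0)

print(f"median page save: {median_speedup:.1f}x faster")
print(f"mean page save:   {mean_speedup:.1f}x faster")
print(f"CPU load: -{cpu_points:.0f} percentage points ({cpu_relative:.0f}% relative)")
```

Going from about 50% to about 10% CPU load is a drop of 40 percentage points, which is an 80% reduction relative to where they started.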

Too bad for the hardware vendors. Very impressive for the Wikimedia Foundation. But what surprised me most was the workload associated with moving from PHP to HHVM.

Overall, HHVM’s compatibility with PHP is quite good, but with a code base as large as ours, there were of course edge cases. Fixing these small issues required a substantial amount of work.

HHVM does its best to act as a drop-in replacement for PHP, but some features had to be dropped for performance reasons, which may require code rewrites. The “substantial amount of work” to move from PHP to HHVM came down to 6 months of work, which is a lot.

It isn’t just code, but also deployment tools, Puppet modules, monitoring and more. Changing the engine of a car requires more than just removing the old one and installing a new one. The team needs to learn the HHVM internals, and that knowledge then has to be shared within the organisation.

I must admit I was surprised to see the transition took 6 months. Granted, this included porting the custom PHP extensions (written in C) that were created to speed up PHP processing of Wikipedia, and the effort ended up contributing 20k lines of code back to HHVM. For most transitions, this kind of work won’t be necessary: we don’t all run applications as complex as MediaWiki. But it does go to show that HHVM isn’t just an “install and forget” part of the stack.

I hope to see more HHVM in 2015, but the switch shouldn’t be taken lightly. A move like this will never be as simple as removing the PHP packages and installing the HHVM ones. There’s a lot more to changing software stacks.

But the results speak for themselves. For Wikimedia, I’m sure the 6-month development effort is worth it in terms of cost savings for their hosting, hardware and maintenance (with 10% CPU utilisation, I would assume part of the current hardware can be shut down?). And now that they’re on HHVM, they can use features unique to it, such as asynchronous functions.


