I recently had a chance to speak with an engineer at the company that runs WordPress.com, the blogging platform that hosts almost 10 million blogs and serves over a billion pageviews a month. I used to use the precursor to the WordPress software for my personal blog, and continued to use WordPress as the blogging platform evolved and improved over the years. I’ve also managed a small WordPress multi-user install, serving about 300 blogs and 300,000 monthly pageviews at Harvard Law School.
What is remarkable about WordPress.com is that, despite being absolutely gargantuan, its technological underpinnings are very similar to the open-source WordPress code that is available for anyone to use. Unlike Twitter, with its well-known performance woes, or Facebook, with its huge interconnected network of users, WordPress, with its blog-centric approach, scales nearly linearly. With the exception of a few global features, like single sign-on, blogs are relatively self-contained. The WordPress crew wisely chose to move complex operations like search out of the core (using Lucene), so that as usage grows, scaling is as easy as throwing more computing power and storage at the problem.
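To make the "self-contained blogs" point a bit more concrete, here is a minimal Python sketch of how requests for a blog could be routed to one of several database servers based purely on the blog's ID. The shard names and the simple modulo scheme are my own assumptions for illustration; I don't know how WordPress.com actually assigns blogs to servers.

```python
from typing import Sequence

# Hypothetical shard names; a real setup would hold connection details here.
SHARDS = ("db0", "db1", "db2", "db3")


def shard_for_blog(blog_id: int, shards: Sequence[str] = SHARDS) -> str:
    """Return the shard that would hold every table belonging to one blog.

    Because a blog's posts and comments never need to be joined against
    another blog's data, the blog ID alone is enough to pick a server.
    """
    if blog_id < 0:
        raise ValueError("blog_id must be non-negative")
    return shards[blog_id % len(shards)]


# Usage: two different blogs can land on different servers.
print(shard_for_blog(5))
print(shard_for_blog(8))
```

The appeal of a scheme like this is that adding capacity mostly means adding shards, which is roughly what "scales nearly linearly" looks like in practice. A plain modulo does force blogs to move when the shard count changes, so a real system would likely use a lookup table or something similar instead.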
At the database level, they run stock MySQL, and not even the latest version, because their code doesn’t require anything more complex than simple SELECTs, INSERTs, and JOINs. Rather than attempting to optimize every layer with hyper-efficient C code, they instead cache content aggressively using the powerful Varnish reverse proxy.
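The caching idea is simple enough to sketch in a few lines of Python. This is only an in-process stand-in for what a reverse proxy like Varnish does at the HTTP layer, not Varnish configuration, and the `PageCache` class, its `render` callback, and the 60-second lifetime are all hypothetical names and values I made up for the example.

```python
import time
from typing import Callable, Dict, Tuple


class PageCache:
    """Serve a stored copy of a page until it is older than ttl_seconds."""

    def __init__(self, render: Callable[[str], str],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._render = render          # the slow path: PHP + MySQL in real life
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so the cache is easy to test
        self._store: Dict[str, Tuple[float, str]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1             # fresh copy: the backend is never touched
            return entry[1]
        self.misses += 1               # missing or stale: render and remember
        body = self._render(url)
        self._store[url] = (now, body)
        return body


def slow_render(url: str) -> str:
    # Hypothetical backend; imagine database queries happening here.
    return "<html>" + url + "</html>"


cache = PageCache(slow_render)
cache.get("/hello-world/")
cache.get("/hello-world/")             # second request is answered from the cache
print(cache.hits, cache.misses)
```

Since most blog pages are read far more often than they change, even a short lifetime like this means the database sees only a small fraction of the traffic.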
As the WordPress platform has gotten more complex and plugins more sophisticated, I’ve had less need to actually delve into the code to customize my blog to my liking. Taking a look at WordPress 3.0, I see that things have evolved significantly from earlier versions. The developers have wisely focused on beefing up the plugin API over the last several revisions. Because of this, WordPress.com coders are free to spend time developing cool new features and improving functionality using the same modular plugin and theming architecture as standard WordPress. This in turn means that development is non-stop, and developers actually push out updates to the main WordPress.com code on a daily basis.
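For anyone who hasn't poked at the plugin API, the core idea is a registry of named hooks that plugins attach callbacks to. Here is a rough Python sketch of that pattern. The method names `add_filter` and `apply_filters` and the default priority of 10 mirror WordPress's PHP functions, but the `Hooks` class itself is my own simplified stand-in, not WordPress code.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List, Tuple


class Hooks:
    """A tiny filter registry: plugins register callbacks, core runs them."""

    def __init__(self) -> None:
        self._filters: DefaultDict[str, List[Tuple[int, Callable[[Any], Any]]]] = (
            defaultdict(list)
        )

    def add_filter(self, name: str, callback: Callable[[Any], Any],
                   priority: int = 10) -> None:
        # Lower priority numbers run first.
        self._filters[name].append((priority, callback))

    def apply_filters(self, name: str, value: Any) -> Any:
        # sorted() is stable, so callbacks with equal priority keep
        # the order in which they were registered.
        for _, callback in sorted(self._filters[name], key=lambda pc: pc[0]):
            value = callback(value)
        return value


hooks = Hooks()

# Two hypothetical "plugins" that each tweak a post title.
hooks.add_filter("the_title", str.upper, priority=20)
hooks.add_filter("the_title", lambda title: title.strip(), priority=5)

# "Core" code only needs to know the hook name, not who is listening.
print(hooks.apply_filters("the_title", "  hello world  "))
```

The point of the design is that core never has to change when a new feature arrives: a plugin just registers against an existing hook, which is what lets the same architecture serve both a personal blog and WordPress.com.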
The WordPress.com approach is not appropriate for all, or even most, large-scale web applications. But it is instructive. Rather than spending huge amounts of time re-writing and hyper-optimizing, the WordPress crew focused on incrementally improving their core product, implementing common-sense technologies to simplify their traffic management, and building a solid foundation for continuous platform improvement. As a result, WordPress.com has grown to a top-10 web property in terms of traffic while keeping a staff of only 50 globally distributed employees.
All that, and the core product, the WordPress platform, continues to be free software overseen by a non-profit foundation and open to anyone. Pretty neat and, if you ask me, not a bad way to run a business.