Whether you’re a novice website owner or an expert backend web developer, you’ve most likely faced the dilemma of choosing the best web server for your website. In your search, you’ve probably heard about Apache and Nginx more than any other web server, which isn’t surprising: they are among the most widely used web servers in the world thanks to their flexibility, power, and robustness, so choosing one over the other is no easy task.
Apache and Nginx share a lot of similar traits, which means that if you plan to build websites that won’t need to handle too many requests or won’t experience much user traffic, these two web servers can be almost interchangeable. However, it’s more than likely that you’re expecting your user base to grow quickly, which makes the differences between Apache and Nginx more apparent. Each one has its own strengths and weaknesses, and you’ll need to pick the one that fits your type of website and expected user base. This article discusses Apache and Nginx in more detail, highlighting their strengths and weaknesses in various areas of web development.
A Quick Primer
Let’s start with a quick introduction to each web server, along with an overview of its functionality and characteristics, to make the two easier to compare and contrast.
What is Apache?
The Apache HTTP Server is the most widely used web server software in the whole world, serving around half of the websites available online. Starting out as early as 1995 and quickly rising to become the dominant server for HTTP, it became the go-to web server software for websites big and small.
Apache grew out of the NCSA HTTPd server written by Robert McCool. Development under the Apache name began in 1995, and the server has since been continuously developed, maintained, and updated by an open community of developers under the guidance of the Apache Software Foundation. Being released under the Apache License, the Apache HTTP Server remains free to use, distribute, and modify.
One of the most important features of Apache is the ability to expand its functionality using custom, often open-source modules. Because of the huge number of websites running on Apache, as well as the large open source community working together to fix bugs and add functionality, Apache enjoys extensive documentation, widespread cross platform support, and up-to-date features.
What is Nginx?
As early as 1999, a problem was identified: a large number of clients connecting to a single web server could exhaust all of its available resources. This became known as the C10K problem, as it was about handling 10,000 clients simultaneously trying to connect over a Gigabit Ethernet connection.
Even though servers today are much more powerful than they were decades ago, client demand has also increased quite dramatically. Nowadays the Internet has become a common commodity, almost like oxygen. Devices also keep growing in power and resolution, which means that the amount of data expected in a typical website keeps increasing sharply. People are no longer content with plain HTML websites with a table layout; they want dynamic content, high-resolution background images, typography, and videos. With Internet users requesting huge streams of data in the form of high-definition images, videos, sounds, and so on from different websites, there are two ways to handle the increasing client demand:
1. Purchase more servers in order to handle the increased traffic
2. Switch to event-driven I/O
Number 1 may not be viable, as the costs incurred may be too high especially for startups. Number 2 is what led to the creation of Nginx.
In 2002, Igor Sysoev started the development of Nginx, hoping to improve on the web server architecture of the time with a more efficient way of handling connections. Traditional thread-based models dedicate one thread to each client, which overloads the server when too many clients connect at the same time. This is where event-based models come in. Instead of parking a thread on every connection, an event-based server lets a small number of workers do other work until the operating system signals that a connection is ready to be read from or written to. This helps servers consume less memory and fewer CPU threads.
One of the biggest differences between Apache and Nginx comes from their architectures. In a nutshell, Apache has a process-driven architecture while Nginx has an event-driven architecture, but let’s take a closer look and see how their particular type of architecture either makes or breaks them.
Because Apache is process-driven, it dedicates a process (or, depending on its configuration, a thread) to every connection it handles. This can result in heavy memory use, especially under heavy user traffic. Fortunately, along with the huge number of available modules that can be compiled statically into the server or loaded dynamically at runtime, Apache takes customizability a step further with a special class of modules called Multi-Processing Modules (MPMs). MPMs, unlike regular modules, can access the operating system without using Apache Portable Runtime (APR) libraries. MPMs are capable of implementing different server architectures at the system level, therefore allowing one to fine-tune the server for a particular infrastructure.
That said, it’s interesting to note that Apache did not always have this flexibility; it was only after the introduction of MPMs that Apache was able to come close to the speed of Nginx. When the Apache 2.2 series was found to deliver static pages significantly more slowly than Nginx, the Event MPM was born, using several processes, each running several threads, together with an asynchronous event-based loop.
Unfortunately, even with the Event MPM, Apache still slows down considerably when faced with heavy user traffic. Because heavier user traffic equates to having additional connections, based on Apache’s architecture, new processes and threads will be spawned and therefore more memory is consumed. When the system runs out of memory, the server eventually starts using the hard disk as swap, which causes a sharp decline in performance. Eventually Apache will reach a certain limit of processes as specified by the system administrator, and therefore stop any new connections from being made.
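The memory ceiling described above can be estimated with simple arithmetic. The following is a minimal sketch of a common rule of thumb for picking the process limit (in Apache 2.4 the relevant directive is MaxRequestWorkers); the function name and every number in it are hypothetical and only there for illustration, so measure your own server before relying on any of them.

```python
def max_workers(total_ram_mb: int, reserved_mb: int, per_worker_mb: int) -> int:
    """Largest number of workers that fits in RAM without spilling into swap.

    total_ram_mb   -- physical memory on the machine
    reserved_mb    -- memory kept aside for the OS, database, caches, etc.
    per_worker_mb  -- average resident memory of one Apache worker
    """
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    # Integer division: a fraction of a worker is of no use.
    return max(0, (total_ram_mb - reserved_mb) // per_worker_mb)


# Hypothetical figures: a 4 GB server, 1 GB reserved for everything else,
# and roughly 30 MB per worker with an embedded interpreter.
print(max_workers(4096, 1024, 30))  # prints 102
```

With these made-up figures, the 103rd simultaneous connection would either wait in a queue or push the machine into swap, which is exactly the trade-off the administrator is making when setting the limit.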
As we’ve talked about before, Nginx was designed as a solution to the C10K problem, which Apache’s architecture still struggles with. Unlike Apache, Nginx doesn’t create a new process or thread for each connection. Instead, the administrator decides how many worker processes the main Nginx process should create. Each worker process is single-threaded and can handle thousands of simultaneous connections.
Nginx is non-blocking, event-driven, and asynchronous. Non-blocking means that a worker never sits idle waiting on one slow connection; if a socket isn’t ready, the worker moves on to another one. Event-driven means that the flow of the program is determined by events reported by the operating system, such as a new connection arriving or data becoming ready to read. Asynchronous means that a worker can have many requests in progress at once, without having to finish one before starting the next.
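To make those three terms concrete, here is a minimal sketch of a single-threaded event loop written with Python’s standard selectors module. It is not how Nginx is implemented (Nginx is written in C and is far more sophisticated); it only illustrates the idea of one worker serving several connections at once. The function name run_event_loop and the stop_after parameter are made up for this demo.

```python
import selectors
import socket


def run_event_loop(listener: socket.socket, stop_after: int) -> int:
    """Echo data back to many clients from ONE thread.

    Returns once `stop_after` clients have disconnected (a stop rule that
    exists only so this demo terminates; a real server loops forever).
    """
    sel = selectors.DefaultSelector()  # epoll, kqueue or select, depending on the OS
    listener.setblocking(False)        # non-blocking: never wait on a single socket
    sel.register(listener, selectors.EVENT_READ)
    closed = 0
    while closed < stop_after:
        # Event-driven: sleep until the OS reports that some socket is ready.
        for key, _mask in sel.select():
            sock = key.fileobj
            if sock is listener:
                # A new client arrived; start watching it alongside the others.
                conn, _addr = listener.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
            else:
                data = sock.recv(4096)
                if data:
                    # Fine for tiny demo payloads; a real server would buffer
                    # writes and wait for EVENT_WRITE instead.
                    sock.sendall(data)
                else:
                    # Empty read means the client closed the connection.
                    sel.unregister(sock)
                    sock.close()
                    closed += 1
    sel.close()
    return closed
```

To try it, bind a listening socket to a local port, call run_event_loop(listener, 3) in one terminal or thread, and connect three clients at the same time. All three are served by the same thread, and none of them has to wait for another to disconnect.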
Handling Static and Dynamic Content
The ways Apache and Nginx handle static and dynamic content are surprisingly different, which is why you’ll need to decide whether your site is going to be predominantly static, dynamic, or a mix of both in order to determine which web server software is best for your site.
Since Apache dedicates a process or thread to every connection, it uses quite a lot of memory when serving static pages while being rather slow in doing so. Apache can also serve dynamic content simply by embedding an interpreter for a particular language (mod_php, for example) into each of its workers. This allows seamless execution of dynamic content within the web server itself without needing external components.
Apache makes it easier to handle dynamic content because there’s no need to purchase and install additional software; modules can be dynamically loaded and unloaded as necessary.
Nginx uses very little memory when serving static pages, and it serves them considerably faster than Apache does. However, Nginx does not process dynamic content natively. Requests for dynamic content must be passed on to an external processor, which means that Nginx has to communicate with it over HTTP, FastCGI, or another protocol it supports. What further complicates matters is that every call to the processor uses an additional connection. The benefit of lacking native support for dynamic content, however, is that the overhead of the dynamic interpreter is only incurred when serving dynamic content. Serving static content doesn’t require any communication with the interpreter.
Documentation, compatibility, and support from the community often come with maturity. Apache, being the more popular of the two, naturally has more extensive documentation, compatibility, and support. However, this isn’t to say that Nginx has inadequate support. In fact, because of its rapidly growing user base, the documentation has been quickly filled out, with more and more third-party support coming in as well.
Why Not Both?
As it turns out, one can actually use Nginx with Apache by placing Nginx as a reverse proxy in front of Apache. This’ll allow Nginx to handle requests from all the clients, therefore preventing Apache from consuming all the memory creating new processes and threads.
Nginx will handle static content, but for dynamic content, Nginx will proxy the request to Apache. In a sense, you’d be using Nginx as a front-line server, letting it handle all the requests that it can handle natively and passing the ones that it can’t to its older sibling, Apache.
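The division of labor can be sketched in a few lines of Python using only the standard library. This is a toy stand-in for the front-line server, not an Nginx configuration: the make_front_end helper, the in-memory dictionary of static files, and the back-end URL are all assumptions made for the example.

```python
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler


def make_front_end(static_files: dict, backend_url: str):
    """Build a request handler that plays the role of the front-line server.

    static_files -- maps a request path to the bytes to serve (hypothetical
                    stand-in for files on disk)
    backend_url  -- base URL of the back-end server, e.g. where Apache listens
    """

    class FrontEnd(BaseHTTPRequestHandler):
        def do_GET(self):
            body = static_files.get(self.path)
            if body is not None:
                # Static content: answer directly, the back end is never contacted.
                status = 200
            else:
                # Anything else: forward the request and relay the answer.
                try:
                    with urllib.request.urlopen(backend_url + self.path, timeout=5) as resp:
                        status, body = resp.status, resp.read()
                except urllib.error.HTTPError as err:
                    # The back end answered, but with an error status; pass it on.
                    status, body = err.code, err.read()
                except urllib.error.URLError:
                    # The back end could not be reached at all.
                    status, body = 502, b"Bad Gateway\n"
            self.send_response(status)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the demo quiet

    return FrontEnd
```

Passing the returned class to http.server.ThreadingHTTPServer gives a small front end that answers static paths itself and relays everything else. In a real deployment Nginx fills this role far more efficiently, but the routing decision is the same one.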
Nginx and Apache are both great web servers. However, as the user traffic on your site increases, their differences become more apparent, making it even more important for you to choose carefully between the two. At the moment, there’s no clear winner between Apache and Nginx, so choose the one that best fits the site you’re making and the users it serves.