Healthcare.gov, the US Government’s health insurance exchange website for states that didn’t provide their own, was supposed to handle about 50,000 to 60,000 simultaneous users. Unless you have been living under a rock, you know what a complete disaster the website has been. But it didn’t (and doesn’t) have to be that way.
In most big cities you can stand in front of one Starbucks and hit another Starbucks with a rock if you have a decent arm. Why are there so many Starbucks? Well, imagine 500 people showing up at a single Starbucks at the same time for their morning coffee. You’d have a lot of unhappy people. Instead, Starbucks handles this by having multiple locations. The more people in an area, the more Starbucks locations to balance the load.
Healthcare.gov is both a website and a web application. The website provides information about finding insurance, example rates, etc. The web application lets you create a user account and apply for insurance. People aren’t having a problem with the website. It’s the web application that’s not handling the load. Why? Let’s look at a simple example of a web application connected to a database. When the user enters the domain name, the domain’s DNS server (Domain Name System server) directs the user’s browser to the application running on a server somewhere. That application then connects to a database running on the same computer or perhaps a different computer.
For an application that’s not going to have much traffic, this is fine. This is like a single Starbucks location. A typical Starbucks serves 500 drinks a day but a single location can’t handle 500 people showing up all at once. A single copy (referred to as an “instance”) of a typical web application can handle 300 or so simultaneous users. But healthcare.gov needs to support 50,000 or more simultaneous users. To do that, like Starbucks, it needs more locations. In the case of a web application, that really means more instances of both the application and the database.
First, we need to determine how many instances we need. Let’s say the web app is really well-written so a single instance can handle 500 simultaneous users. That means to handle 50,000 users at the same time, we will need 100 instances of the app. Those 50,000 users need to be spread out over those 100 app instances. That’s where a load balancer comes in. A load balancer is an application that can direct traffic to several application instances. And depending on the amount of traffic, you might need more than one load balancer. Fortunately, a single domain can be pointed at multiple load balancers.
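To make the idea concrete, here is a minimal sketch of one common way a load balancer can spread users out: round-robin, where each new user simply goes to the next instance in line. Real load balancers are separate pieces of software with health checks and smarter strategies, and nothing here reflects how healthcare.gov actually does it. The class name `RoundRobinBalancer` and the instance names like `app-001` are made up for illustration.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hands out application instances in a fixed, rotating order."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        # cycle() repeats the list forever: first, second, ..., last, first, ...
        self._rotation = cycle(list(instances))

    def pick(self):
        """Return the instance that should serve the next user."""
        return next(self._rotation)


# 100 hypothetical instance names, matching the example in the text.
instances = [f"app-{n:03d}" for n in range(1, 101)]
balancer = RoundRobinBalancer(instances)

# Send 50,000 simulated users through the balancer and count per-instance load.
load = Counter(balancer.pick() for _ in range(50_000))
print(min(load.values()), max(load.values()))  # prints: 500 500
```

With 50,000 users and 100 instances, every instance ends up with exactly 500 users, which is the per-instance capacity we assumed.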
In the diagram above, you see there can be any number of load balancers directing traffic to any number of application instances. So the more traffic your app has, the more load balancers you may need in addition to more application instances. These application instances need to connect to a database to store user accounts, insurance rates and other information. But if we connect them all to a single database, we will have the same problem with which we started. A single database can’t handle 50,000 users at the same time.
We need to use more than one database. That means spreading out the data across many databases. How you decide what data goes in which database depends on a lot of different factors. If 26 database servers would be enough, then for the sake of simplicity let’s say there was a separate database for each letter of the alphabet. If your user name begins with “A”, your record goes into the first database. If it begins with “B”, it goes into the second database, and so on.
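The alphabet rule is simple enough to sketch in a few lines. This is only an illustration of the idea, assuming user names start with an English letter; the database names such as `users_db_a` and the function `shard_for` are hypothetical.

```python
import string

# One hypothetical database name per letter of the alphabet (26 in total).
SHARDS = {letter: f"users_db_{letter}" for letter in string.ascii_lowercase}


def shard_for(user_name):
    """Return the name of the database that holds this user's record."""
    first = user_name.strip()[:1].lower()
    if first not in SHARDS:
        # Empty names, digits, and non-English letters have no home in
        # this simplified scheme.
        raise ValueError(f"no database for user name {user_name!r}")
    return SHARDS[first]


print(shard_for("Alice"))  # prints: users_db_a
print(shard_for("bob"))    # prints: users_db_b
```

One caveat with this rule is that names are not spread evenly across letters, so some databases would be much busier than others. That is one of the factors that goes into choosing how to split the data in practice.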
Like distributing traffic to the application instances, we need something that will distribute the traffic between the app instances and our 26 databases. This is where a Database Proxy application comes in. It’s like a load balancer but with some additional logic to help find the right database to connect to. And just like a load balancer, if you have a lot of traffic, you may need more than one Database Proxy application.
As you can see, making an app scale up to 50,000 users does require some additional software in the form of load balancers for the applications and databases, as well as, of course, the additional hardware upon which all of this will run. After all, no single computer has enough horsepower to handle 50,000 users at the same time. So many of the load balancers, application instances, database proxies and database servers are going to run on separate computers. But this isn’t rocket science. It’s computer science. 🙂
Seriously, it just takes some planning and testing to find out how much traffic each instance can handle. But once you know these numbers, it’s basic division to figure out how many instances you need. If you don’t have a lot of traffic at first, you can start with a few instances and then add more as your traffic grows. Once you have the right architecture in place, scaling the app to handle 50,000 users mostly requires just having more computing power to run all the extra instances.
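Here is that basic division as a small sketch. The only wrinkle is rounding up, because you can’t run a fraction of an instance. The function name `instances_needed` is made up, and the per-instance numbers are the illustrative ones used in this post, not measurements.

```python
def instances_needed(simultaneous_users, users_per_instance):
    """Return how many instances are needed, rounding up to a whole instance."""
    if users_per_instance <= 0:
        raise ValueError("users_per_instance must be positive")
    # Integer ceiling division: adding (divisor - 1) before dividing rounds up.
    return (simultaneous_users + users_per_instance - 1) // users_per_instance


print(instances_needed(50_000, 500))  # prints: 100
print(instances_needed(50_000, 300))  # prints: 167
```

The second call shows why testing matters: if each instance really handles 300 users instead of 500, the same 50,000 users need 167 instances rather than 100.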
At Xojo, Inc., we make Xojo, an application development tool that makes it faster and easier to develop web applications than with any of the traditional web technologies. Our users have to deal with application scaling just like this if they have a lot of traffic. It can be done. Each year, users come to our annual conference to learn about techniques like this and more. Our next one will be this coming March in Las Vegas.
Once the developers behind healthcare.gov put the right architecture in place, it will handle the load required. It’s just a shame they didn’t do this from the get-go. According to the healthcare.gov progress report, the site simply did not have enough application and database instances (among many other things), and had the wrong architecture as well. But it gets worse. Amazingly, the ACA Testing Bulletin, which comes from the House of Representatives Oversight Committee, shows that the day before the launch, the web app could handle only 1,100 users. The bulletin also shows that in the first days after the site launched, they were planning to stress test to see if the web app could handle 10,000 simultaneous users. No, you don’t have to read that again. That’s AFTER the site had already launched! And their target was 50,000 users. Wow.
Do yourself a favor. If you think your traffic is going to grow, be prepared to scale your site and web app from the beginning. You will save yourself a lot of headaches and angry users. Then you can sit back and enjoy your Grande, Iced, Sugar-Free, Vanilla Latte With Soy Milk.