There’s obviously a lot we don’t know about it, so this might be complete BS. But from what they’ve said, the system as designed and built would have handled the expected load with no problems. Assuming that’s the case, the problem is not with the design but with the initial requirements.
It sounds great to build software to handle any unexpected circumstances, but it isn’t economically justified. For plain content delivery, an elastic platform is pretty simple (and for all I know, that part of the healthcare system might be on such a platform). But the problem with the system is in the registration process, which is much more complex to architect. And this is still speculation - maybe they did develop that on an elastic platform and did a poor job of it or didn’t have the hardware to expand to.
I fully support any criticism of their usage estimates. How anyone could have expected only 50-60K concurrent users at the peak is beyond me. But if those are your marching orders and your system handles that level with no problem, I have a hard time criticizing your design.
I am an employee of CGI. I didn’t work on healthcare.gov, but I am part of the same project (we also do medicare.gov, mymedicare.gov, cms.gov, etc.) and I have worked with and spoken to several of the developers and architects on the project. We’re under a gag order so I can’t really talk about anything that isn’t already in the news. Not trying to be unhelpful, I just don’t want to accidentally say anything that I shouldn’t.
As to the complaints about user estimates, they estimated more than two times the maximum number of concurrent users we have ever had on medicare.gov (30,000). Estimates of how many people the exchanges are expected to cover are hard to find, but I was able to find one that said 24,000,000 people (likely out of date; that estimate was from 2010). Since Medicare has nearly 50,000,000 beneficiaries, expecting double the load of medicare.gov seems perfectly reasonable, and that is likely where the load will balance out over time.
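As a back-of-the-envelope sketch (the figures are the rough ones above, and 60,000 stands in for the “double medicare.gov” requirement; nothing here is official):

```python
# Rough back-of-the-envelope check; all figures are the approximate ones from this post.
medicare_beneficiaries = 50_000_000   # ~50M Medicare beneficiaries
medicare_peak_concurrent = 30_000     # highest concurrent load seen on medicare.gov
exchange_population = 24_000_000      # 2010-era estimate of people the exchanges might cover

# If exchange users behaved like medicare.gov users, the proportional peak would be:
proportional_peak = medicare_peak_concurrent * exchange_population / medicare_beneficiaries
print(f"proportional peak: ~{proportional_peak:,.0f} concurrent users")   # ~14,400

# A requirement of 60,000+ concurrent users is roughly 4x that, i.e. built-in headroom.
print(f"headroom vs. a 60,000 requirement: ~{60_000 / proportional_peak:.1f}x")
```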
Secondly, in my opinion at least, the root cause of all of these issues is the fact that the requirements changed so damned much. In July I spoke to one of the architects and he complained that the project requirements were changing daily. Three months before the public release of the software, when all development should have been completed and in serious testing, the requirements weren’t even finalized. That’s no way to develop a dependable piece of bug-free software with dozens of integration points and a tight schedule.
Finally, while I can’t talk about how it is hosted, suffice it to say that even in 2011 Amazon’s EC2 wasn’t unknown to the architects. I’m not sure if they ended up using it for anything, but I really wish every armchair architect would stop flaunting EC2 as though we had no idea it actually existed and decided to host the servers in the devs’ closets.
I know this is meant in jest, but it points out a misunderstanding that many people have. You can’t just throw more hardware at an application like this and fix an overloaded site.
Simplifying this a bit: if I have an application that does nothing but present content, then adding hardware is pretty simple. When a visitor comes to the site, they can be routed to any available server. It doesn’t matter where the content comes from because it’s the same everywhere.
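A toy sketch of why the stateless case is easy (server names invented for illustration): any box can answer any request, so the load balancer just rotates through whatever is available.

```python
from itertools import cycle

# Toy example: a pool of identical, stateless content servers (names made up).
# Because no server holds per-user data, any request can go to any box,
# and adding capacity is just a matter of adding names to the list.
content_servers = cycle(["web-01", "web-02", "web-03"])

def route_content_request(path: str) -> str:
    """Round-robin routing: it doesn't matter which server serves the page."""
    return next(content_servers)

for path in ["/plans", "/faq", "/glossary"]:
    print(path, "->", route_content_request(path))
```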
If I have an application that registers users or otherwise saves and retrieves data for a specific user, that approach doesn’t work. If I keep all the data in a single location, it will become overloaded at some point. If I keep all the data in one repository that is replicated across many locations, the traffic needed to keep each repository current is overwhelming.
The typical approach to this is to partition the data so each repository has a specific set of data. The partitioning can be done in a variety of ways: maybe alphabetical (A-C on server 1, D-F on server 2, etc.), or geographical (more common in a scenario like this). However it is partitioned, the app must be designed so it knows where to get the data. This takes some work but it’s pretty common and not rocket science.
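Something like this minimal sketch (the shard names and the alphabetical split are made up for illustration): the application computes which shard owns a user’s data before it reads or writes anything.

```python
# Hypothetical static partition map: each data store owns a fixed slice of the
# key space, split alphabetically by last name as in the A-C / D-F example above.
PARTITIONS = [
    ("A", "C", "db-shard-1"),
    ("D", "F", "db-shard-2"),
    ("G", "Z", "db-shard-3"),
]

def shard_for(last_name: str) -> str:
    """Return the data store that owns this user's records."""
    initial = last_name[0].upper()
    for low, high, server in PARTITIONS:
        if low <= initial <= high:
            return server
    raise ValueError(f"no shard covers initial {initial!r}")

print(shard_for("Edwards"))   # db-shard-2
print(shard_for("Ramirez"))   # db-shard-3
```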
However, even this doesn’t give you the ability to dynamically scale up. If Server 1 stores data for California and it reaches capacity, I can’t just split it into two servers for Northern and Southern California until I move half the data into the new server and change my routing rules to point to the new locations. Developing a system with a data store that can dynamically scale isn’t an insurmountable problem, but it’s complicated enough that you don’t do it unless you know you will need it.
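To make the California example concrete, here is a purely illustrative sketch; the server names, ZIP-code split, and steps are invented, but they show where the work actually is.

```python
# Illustrative only: splitting an overloaded geographic shard in two.
# The new hardware is the easy part; steps 2 and 3 have to happen while
# the application is still taking traffic, which is what makes it hard.

routing = {"CA": "db-ca"}                                   # step 0: all of California on one server
ca_rows = [("90210", "record..."), ("95814", "record...")]  # user records keyed by ZIP code

def split_california():
    north, south = [], []

    # 1. Stand up a new server ("just add hardware" only buys you this step).
    # 2. Physically move roughly half the data onto it (slow, and the data
    #    keeps changing while you copy it).
    for zip_code, record in ca_rows:
        # Hypothetical split: ZIP codes below 93600 go south, the rest go north.
        (south if zip_code < "93600" else north).append((zip_code, record))

    # 3. Change the routing rules so the app looks in the right place.
    routing.pop("CA")
    routing["CA-south"] = "db-ca-south"
    routing["CA-north"] = "db-ca-north"
    # Until steps 2 and 3 both finish, requests can still land on the old,
    # overloaded server -- which is why this has to be designed in up front.

split_california()
print(routing)
```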
And really, we don’t want the government spending Google-levels of money on their hardware and software.
My tax return doesn’t go to a single location; its destination is based on where I live. The hardware should be set up like a database, with physical entities broken down into manageable groups.
If major companies can work successfully online without shifting the planet’s axis then I don’t see why the US government can’t do it also.
If only there were some way to coordinate activities between the groups of people. I know it’s an untested concept, but perhaps a person or a group of people could somehow oversee the activities of the groups and plan accordingly. Manage, if you will. Perhaps we could call these novel employees “managers” and the group “management”?
I’m sure the current system does that. I’m addressing the idea that it should be able to easily scale dynamically. To use your tax return example, if the post office got swamped by too many returns, it couldn’t just open another post office. The return envelopes already have the original address, so there’s a lot of work needed to reroute half the mail.
Yes this is all possible. But it has to be planned for up front. Assuming it really does work at the expected user load, the system design and management isn’t where the problem is.
Yes, they can and do open up temporary post offices. I’ve seen them do it at large events. I’ve also seen phone companies bring in portable towers to handle the phone traffic of large events.
Yes, it has to be planned up front. The people who didn’t read the healthcare bill they passed are the same people running it. It’s a great number of people who were hired (voted in) based on popularity rather than any actual skills deciding how things are done. What you end up with is this:
Comments I’ve read from people who were on the project said this was part of the problem: way too many managers and not enough people who understood how to design/build something like this.
Google spends a billion-plus dollars on infrastructure every quarter just to keep up with new demand. How much do you think they’d spend to start from scratch with a user base of 20,000,000?
Pretty odd question. They would spend enough to accommodate the user base. Of course, that’s assuming they know what they’re doing. Which apparently they do.
Just so we’re clear, you have no objection to the government spending a gigantic sum of your money so everyone can log on to healthcare.gov during the first week of operation?
Just so we are clear, the government spent a large sum of money getting the word out that you could log on to healthcare.gov on Oct 1st. It could hardly be a surprise that many did. If they couldn’t handle the volume, why tell everyone it was available? How about staggering enrollment by the first digit of the last four of your SSN?
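A staggering scheme like that is trivial to specify; a purely hypothetical sketch (the dates and digit-to-day mapping are invented):

```python
from datetime import date, timedelta

# Purely hypothetical schedule: spread initial enrollment over ten days
# by the first digit of the last four of the SSN.
OPEN_DATE = date(2013, 10, 1)

def enrollment_start(ssn_last4: str) -> date:
    """Earliest day this person is invited to enroll."""
    return OPEN_DATE + timedelta(days=int(ssn_last4[0]))

print(enrollment_start("0123"))  # 2013-10-01: digit 0 goes first
print(enrollment_start("4821"))  # 2013-10-05: digit 4 waits four days
```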
Exactly this. Building a system to handle a spike at the beginning just isn’t worth it. But at least make a plan on how to handle the spike that you know is coming.
Still, it isn’t surprising to me that it worked out like this. You have the hard-core proponents with rose-colored glasses who are excited about this finally coming to fruition and might overlook the potential problems. You have the people deep in the weeds racing against the clock to finish their own tasks. You have people scared to raise a concern about what is a very political subject. And you have opponents who wouldn’t really mind if the rollout isn’t smooth.
Well Loach, think of the money that idea would have cost us. I’m going to go with NONE. That would have cost us nothing.
How much time did it take to think that up? Now imagine how many politicians there are who put together the program. They all make over $100,000 a year. Each one of them has a staff which certainly costs us over $100,000 a year. 535 members of Congress, +1 for this dinner party of stupidity. We paid over $100 million for a board of directors hired not for any job skills but for their own popularity. Almost none of them have any business or managerial skills, and it shows.
The only amazing thing here is the people making excuses for this lot of rocket surgeons as if it’s the Super Bowl and we’re supposed to cheer for one of the teams.
I don’t think I’ve seen anyone here making excuses for the people in charge of rolling this out. I’m defending the architecture and design of a system that is supposed to support 60K users and supposedly does just that. You get no argument from me that the rollout of the system was poorly planned and executed (i.e., not planned at all). Application rollout is separate from design, and is unfortunately often left as an afterthought.
For actual figures, the project was estimated at $93.7 million and has come in at $634 million (per Slashdot).
I don’t really get the States-Federal dynamic, but I don’t see why the States’ Exchanges were built by each State choosing its own provider: would it not be simpler to issue a boilerplate Exchange website from one provider that each State could theme as it wished, but with exactly the same functionality as the rest?
The source for the main website is up on GitHub for those coders who are interested. The last commit is from three months ago; and for something running on Ruby that is apparently not a CMS, it is entitled CMSGov.
Mr. McAfee, interviewed by Fox, not for his exploits in Belize nor yet for his much-loved anti-malware, got rather excited:
*For starters, McAfee said the way it is set up makes it possible for fake websites to be set up to fool people into thinking they’re signing up for Obamacare.*

*“It’s seriously bad,” McAfee said. “Somebody made a grave error, not in designing the program but in simply implementing the web aspect of it. I mean, for example, anybody can put up a web page and claim to be a broker for this system. There is no central place where I can go and say, ‘OK, here are all the legitimate brokers, the examiners for all of the states, and pick and choose one.’”*

*“Here’s the problem: It’s not something software can solve,” McAfee continued. “I mean, what idiot put this system out there and did not create a central depository? There should be one website, run by the government; you go to that website and then you can click on all of the agencies. This is insane. So, I will predict that the loss of income for the millions of Americans who are going to lose their identities — I mean, you can imagine some retired lady in Utah, who has $75,000 in the bank, saving her whole life, having it wiped out in one day because she signed up for Obamacare. And believe me, this is going to happen millions of times. This is a hacker’s wet dream. I mean I cannot believe that they did this.”*
That’s a question for the administration. This thread is about the vendors who developed the hardware and software.
ETA: That said, I don’t know why they couldn’t have done that either, but I also don’t know why people were logging on October 1st to buy policies that don’t go into effect until January.