My experience with testing data collection is limited to maybe 10 people entering valid and invalid data, exercising the data entry screens, trying to trigger the validations that should pop up, and so on.
I don’t know off-hand how many people were trying to use healthcare.gov at any one time, maybe in the high 10,000s to low 100,000s?
How does a company usually test for volume like that? Do they do it with actual people, is there a way to test it programmatically, or do they just hypothesize, based on this, that, and the other thing (that the proper amount of hardware has been dedicated to the project, for instance), that if the basic functionality works there shouldn’t be a big problem at higher volume?
You use software to simulate high user load. Just write a bot that goes through the signup process (or whatever you’re trying to test) and set it loose, with however many you need running in parallel.
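Here’s a rough sketch of that idea in Python, using the requests library and a thread pool. The URL and form fields are made up, so treat it as an illustration of the approach rather than a recipe:

```python
import concurrent.futures
import requests

BASE_URL = "https://signup.example.com"   # hypothetical site under test
NUM_USERS = 500                           # simulated simultaneous users

def simulate_signup(user_id):
    """One bot walking through a (hypothetical) signup flow."""
    session = requests.Session()
    # Load the signup page, then submit made-up form data.
    session.get(f"{BASE_URL}/signup", timeout=30)
    response = session.post(
        f"{BASE_URL}/signup",
        data={"username": f"loadtest{user_id}", "zip": "12345"},
        timeout=30,
    )
    return response.status_code

# Run the bots in parallel and count how many requests succeeded.
with concurrent.futures.ThreadPoolExecutor(max_workers=NUM_USERS) as pool:
    results = list(pool.map(simulate_signup, range(NUM_USERS)))

print(f"{results.count(200)} of {NUM_USERS} simulated signups returned 200 OK")
```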
Stress, scale, and performance testing is typically done through automation. Ten people signing in simultaneously might be enough testing for your Justin Bieber fan website, but it’s not good enough for a service that might get millions of clients.
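In practice you usually reach for a purpose-built tool rather than rolling your own bot. As one example (not a tool mentioned above, just an open-source option), a Locust script for a hypothetical signup flow looks roughly like this:

```python
# locustfile.py
from locust import HttpUser, task, between

class SignupUser(HttpUser):
    # Each simulated user waits 1-5 seconds between actions,
    # which looks more like a real person than a tight loop.
    wait_time = between(1, 5)

    @task
    def sign_up(self):
        # Hypothetical paths and form fields for the site under test.
        self.client.get("/signup")
        self.client.post("/signup", data={"username": "loadtest", "zip": "12345"})

# Run it headless with something like:
#   locust -f locustfile.py --headless -u 10000 -r 500 --host https://signup.example.com
# where -u is the number of simulated users and -r is how many to spawn per second.
```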
It’s definitely not good enough to just look at your data center and figure you’ve got enough resources to handle the load.
LoadRunner and TPF are two programs that simulate load, and there are surely many others. However, it was reported in the news that the launch date for the site was rushed, which didn’t leave them with much time for testing.
You can also rent virtual cloud-based machines to do your load testing. For example, you can spin up a fleet of Amazon Web Services instances and use them to simulate hundreds or thousands of users hitting your site at once.
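If you go that route, spinning up the load generators can itself be scripted. A rough sketch with the AWS boto3 SDK (placeholder AMI ID, and it assumes an image that already has the Locust script from above baked into it) might look like:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Shell script each instance runs at boot: install the tooling and start
# hammering the (hypothetical) site under test.
user_data = """#!/bin/bash
pip install locust
locust -f /opt/loadtest/locustfile.py --headless -u 2000 -r 100 \
    --host https://signup.example.com
"""

# Spin up 20 load-generator machines; together they simulate ~40,000 users.
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI with Python and the script preinstalled
    InstanceType="c5.large",
    MinCount=20,
    MaxCount=20,
    UserData=user_data,
)
```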