Careful about the ideal being the enemy of the attainable. Yes, obviously, if we could randomly assign the members of 500 twin pairs as either the space-living or ground-living twin, that would answer lots of questions about the effects of microgravity, radiation, etc. on the human body. That is not going to happen, so we have to take what we can get.
In this case, it has always been possible to look at the changes in an astronaut between before and after living in space. It is also easy to look at differences between people who live in space and people who spend the same amount of time not living in space. What has not been done before is looking at these changes against the same genetic background: identical twins.
The identical twins are matched for genetics, age, upbringing, and tons of other things. They also differ from each other in many ways. One of the ways they differ is in what they were doing for the time period in question; one lived in space and the other didn’t. It is extremely reasonable to conclude that living in space is responsible for some or all of the physical changes observed.
So, is the study flawed? Yes, absolutely; all studies are flawed. The question is, despite the flaws, does it provide useful information? In this case, the answer is yes, it does. Maybe not because it is large enough to draw generalized conclusions, but because it is the first of its kind and controls for things that previous studies have not been able to control for.
ETA plurality:
As somebody who works with twin data frequently, this is what I would say:
A member of a twin pair. (1 person)
A pair of twins. (2 people)
That family has twins. (2 people)
This is the twin data; it comes from MZ, DZ, and opposite-sex DZ twins. (many people)
One of the twins from that family is scheduled for an interview. (1 person)
Please spend several pages talking about the appropriateness of data, datum, twin, and twins.