Can we uniquely identify people with a subset of their DNA?

Is there a way to collect select DNA info from a person, such that it is useful for uniquely identifying people, but contains only a very limited subset of a person’s DNA, and so no other information about that person (like susceptibility to heart disease, schizophrenia, etc.)?

No.

Every cell contains the entire DNA coding for your genetic makeup.

I know that.

Maybe I should have phrased it:
“Can we store select DNA info from a person, such that it is useful for uniquely identifying people, but contains only a very limited subset of a person’s DNA, and so no other information about that person (like susceptibility to heart disease, schizophrenia, etc.)?”

That is, collect as much DNA info as you want, but store only identifying info, and discard the rest.

The less DNA you have, the greater the chance of a mismatch. Remember, even having all 23 pairs of chromosomes doesn’t guarantee that you won’t make a mistake; rather, at that point, the probability of a mismatch is so low as to be virtually zero.

So that’s the question, then: How high a probability of a mismatch is acceptable?

I guess the question is, how does the probability of a mismatch vary with the amount of DNA info you have?

I can’t imagine they actually store the entire genome of every person whose DNA is collected. Perhaps they store the samples or something.

The DNA matching tests use gel electrophoresis techniques. Basically, DNA segments are cut up and drawn through the gel by electric current. Different people have their DNA cut up into different-sized chunks, and so this process produces a unique pattern of lines when stained and viewed under UV light. (Google gel electrophoresis for more info.)

So basically, this pattern is useless for actually determining useful information about the person’s genes. Gel electrophoresis is the most common (AFAIK only) process used in identifying and matching suspects with crime-scene evidence. Any national database containing only this information would have no other info regarding any given individual sample.

Perhaps the original samples are stored somewhere and could be used for further tests, but that would likely be a more local thing than any sort of national database.

So the “identification and no other use” criterion might be easily met by techniques and systems already in use.

This is already how it works. DNA matches are done based on STR (Short Tandem Repeat) sequences at multiple loci (points on a DNA strand). If a sample from a suspect has this repeating sequence in a certain spot on chromosome N and it matches the sequence in the same spot on chromosome N of the evidentiary sample, then that is one point of identity, but not conclusive. With enough points of identity, generally 13 IIRC, you can say that the odds of two people having the same repeating sequence at the same locus on the same chromosome, plus all the other repeating sequences at their respective loci, are extremely low. A brief intro to DNA matching for non-Scientists has this to say about modern forensic DNA tests.

So, by design, the tests only check the designated loci for the sequences that exist there. It is not routine, to my knowledge, to test the DNA samples for other information, such as a sequence which would indicate certain medical conditions.
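To put a rough number on “extremely improbable”, here is a minimal sketch. It assumes the loci are statistically independent and that each locus genotype is shared by about 1 in 10 people; that figure is purely illustrative, not real population data. Under those assumptions, the chance that an unrelated person matches at every locus is the product of the per-locus frequencies.

```python
from math import prod

def random_match_probability(genotype_freqs):
    """Product rule: chance an unrelated person matches at every locus.

    Assumes the loci are independent; the frequencies passed in are
    illustrative inputs, not measured population values.
    """
    return prod(genotype_freqs)

# Hypothetical: every locus genotype is shared by 10% of the population.
for n_loci in (1, 4, 8, 13):
    p = random_match_probability([0.1] * n_loci)
    print(f"{n_loci:2d} loci: about 1 in {1 / p:,.0f}")
```

Each extra locus multiplies the odds against a coincidental match, which is why a small number of loci can already be enough for identification.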

This parallels fingerprint testing where “points of identity”, the most unusual characteristics of a fingerprint, are compared and the rest of the pattern is ignored (PDF). Enough matches to points of identity and you have a positive fingerprint match.

Enjoy,
Steven

Mtgman has covered it pretty well, but I figured that being a forensic scientist in training I might have something to add … maybe … :)

The strategy that I would propose here would be to take the sample, and then immediately profile it, recording what proprietary kit was used to obtain the profile. “Real-time” PCR allows you to quantify the DNA as it’s being replicated, so you would be able to see immediately if the DNA extraction and amplification processes worked; if they didn’t, then you’d do it again, hoping that “second time’s a charm”. Once you’ve obtained your profile - which is just a computer file with a set of peaks that are labelled with identifiers telling the analysts what variation of the STR you have - they could then destroy the remaining DNA sample.
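As a rough illustration of what “variation of the STR” means: an STR allele is usually named by how many times the short motif repeats back to back. The sketch below counts the longest run of a motif in a sequence. The function name and the example fragment are made up for illustration, and real profiling kits infer the repeat count from fragment length rather than by reading the sequence like this.

```python
def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must be non-empty")
    best = 0
    step = len(motif)
    for start in range(len(sequence)):
        count = 0
        pos = start
        # Keep stepping forward while another copy of the motif follows.
        while sequence.startswith(motif, pos):
            count += 1
            pos += step
        best = max(best, count)
    return best

# Hypothetical fragment: flanking bases around six copies of "GATA".
fragment = "CCTA" + "GATA" * 6 + "TTGC"
print(longest_tandem_run(fragment, "GATA"))  # 6
```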

In the US, 13 STR locations of the genome are profiled, plus one sex-specific locus. In other countries, only 8 STR locations are used. So the record would contain your name, probably DOB and such, your sex, and your variations on the STRs. No identifying information other than that.
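A record like that might look roughly as follows. This is only a sketch: the class and field names are invented, the person and allele numbers are made up, and the locus names are just examples of commonly used STR markers.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ProfileRecord:
    """Hypothetical database entry: identity details plus STR repeat counts."""
    name: str
    date_of_birth: str   # ISO date string, e.g. "1980-01-31"
    sex: str             # from the sex marker: "F" or "M"
    # locus name -> the two repeat counts (one per chromosome copy)
    str_alleles: dict = field(default_factory=dict)

record = ProfileRecord(
    name="Jane Doe",
    date_of_birth="1980-01-31",
    sex="F",
    str_alleles={"TH01": (6, 9), "D3S1358": (15, 17), "vWA": (16, 16)},
)
print(record.sex, len(record.str_alleles), "loci stored")
```

Nothing in such a record describes the rest of the genome; it holds only a handful of repeat counts.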

If this was done to, hell, everyone, I wouldn’t mind. The state can’t clone you from the DNA they replicate during the PCR process, as it’s only a small portion of your genome and doesn’t do anything. They can’t identify anything like diseases or hair colour by your profile.

How I imagine the database would work - DNA from a crime scene is profiled, the profile is searched against the database of all profiles. This will produce, ideally, one match. You then would use this information to obtain a warrant for that person’s DNA again - so you could reconfirm that the match is true. It’d be very similar to how we use fingerprint databases now. This extra step would be very important if the database contained records only from people who had been charged with or convicted of a crime, as just going into court and saying “well we found his/her DNA profile on this database of charged/convicted criminals…” is, as far as I’m aware, not allowed as it prejudices the jury against the defendant (by revealing that s/he has had a past criminal history).
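The search step could be sketched like this, assuming profiles are stored as a mapping from locus name to a pair of repeat counts. The data layout, record IDs, and allele values are all hypothetical; a real system would also have to handle partial or mixed profiles.

```python
def normalise(profile):
    """Sort each allele pair so (9, 6) and (6, 9) compare equal."""
    return {locus: tuple(sorted(pair)) for locus, pair in profile.items()}

def find_matches(scene_profile, database):
    """Return IDs whose stored profile agrees at every locus in the scene profile."""
    scene = normalise(scene_profile)
    hits = []
    for person_id, stored in database.items():
        stored = normalise(stored)
        if all(stored.get(locus) == pair for locus, pair in scene.items()):
            hits.append(person_id)
    return hits

# Made-up database of two stored profiles.
database = {
    "record-001": {"TH01": (6, 9), "vWA": (16, 18)},
    "record-002": {"TH01": (7, 9), "vWA": (16, 18)},
}
scene = {"TH01": (9, 6), "vWA": (16, 18)}
print(find_matches(scene, database))  # ['record-001']
```

A hit here would only be a lead; the fresh sample taken under warrant is what confirms it.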


On a practical note, forensics is looking to cash in on the phenomenon of “microarrays”, which may expand our DNA analysis capabilities significantly, as they can be used to probe up to 400,000 locations in the genome. (They’re currently being used to get a “picture” of the active genes in cancer cells, with the hope that information about the pathology of the disease can be obtained.) These microarrays, if their potential was fully realised, COULD tell us information such as hair colour (some forensic scientists have already determined a single base mutation that may either cause red hair or be highly associated with it). This would be ideal because these microarray chips could be used at a crime scene to develop a ‘picture’ of the suspect based upon this profile. However, this is a long way off at the moment, and so long as the law regarding DNA databases mandated that no trait-revealing portions of the genome besides a sex marker are stored, the database could still be useful - and not infringe anyone’s civil rights - even after the microarray technology comes to fruition.


Also, Mtgman, we don’t use “points” for fingerprint identification anymore. I think it was 1973 that the International Association for Identification determined there was “no scientific basis” for their use. The main advantage here is that we can use more advanced methods of comparison (ridgeology - defining and matching the shape of the outline of the ridge - and poreology - matching up the locations of the pore voids in the fingerprints), and that we can make matches when our scene sample is only a partial fingerprint, which may not be large enough to get 12 “points” as they are normally defined, but can still be unique enough to allow individualisation.

About the only thing we use “points” for is when we demonstrate our findings in court. It’s easier for a layperson to look at a chart and see 12 points that match up, rather than carefully scrutinising the print to compare ridge concavity/convexity and precise pore arrangements.

Interesting PDF file, though!

Thanks for the info, Caiata, and the correction on predominant current fingerprinting techniques. I’ve researched how DNA testing is done more recently than the last time I looked up fingerprinting, so the state of the art must have passed me by (story of my life). I must have biased my search for fingerprinting technique cites by throwing in keywords specific to the technique I remembered (points, identity) versus just looking for the most common modern method. Thus I got a document which shows a method of using points of identity and was blissfully ignorant of the fact that the technique has fallen out of common use.

Enjoy,
Steven

No worries - the PDF file you found wasn’t really related to police work/forensics anyway, it looked more like stuff for fingerprint-keyed security systems, and for all I know they might find it a lot easier to compare points in that industry. :)