opinion

Xuhua Xia is a professor in the Department of Biology, University of Ottawa, specializing in genetics, genomics, bioinformatics and molecular evolution.

Putting names to Canada's nameless through DNA forensics and a national DNA-based databank would represent a major advance in missing-persons cases and in identifying remains. The construction of a useful DNA database requires substantial resources and broad support. There is a reason that forensic scientists have tried to sell the story involving the Russian Romanovs many times in an effort to woo public support. The creation of such databases in the U.S., the U.K., New Zealand and France represented major joint efforts involving the public, government and scientists.

Running a successful system would involve some basic requirements.

First of all, the DNA databank needs to have two types of DNA data. The first type is a catalog of the genetic diversity of the Canadian population (genetic diversity is typically characterized by the relatively fast-evolving, maternally inherited mitochondrial genome and the paternally inherited Y-chromosome). If the mitochondrial DNA sequences in a large collection from various groups of aboriginal people are all identical, indicating that all are descended from a recent ancestral mother (i.e., the contemporaries of this "Eve" did not leave any descendants that survive to this day), then sequencing the mitochondrial DNA of a nameless aboriginal person will not help put a name to the nameless, since his or her mitochondrial DNA will match that of all aboriginal people.

Similarly, if all males are descended from a recent ancestral father (i.e., the descendants of other men of his time did not survive to this day), then again the Y-chromosome will contribute little to putting a name to the nameless, because the Y-chromosome DNA sequence of the nameless will match that of all males.

Thus, a DNA database of genetic diversity is essential for us to decide what DNA data to collect and whether DNA forensics will be useful at all in a particular population for putting a name to the nameless. It will tell us which genetic markers are variable and useful for DNA forensics.
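To make the point concrete, here is a minimal sketch of how a diversity catalog could be used to judge a marker. It assumes each person in a sample has been assigned a haplotype label, and it computes the chance that two people drawn at random from that sample carry the same haplotype. The function name and the sample labels are hypothetical, chosen only for illustration.

```python
from collections import Counter


def random_match_probability(haplotypes):
    """Chance that two people drawn at random (with replacement)
    from the sample share the same haplotype.

    This is the sum of squared haplotype frequencies.
    """
    if not haplotypes:
        raise ValueError("need at least one haplotype")
    counts = Counter(haplotypes)
    n = len(haplotypes)
    return sum((c / n) ** 2 for c in counts.values())


# Hypothetical samples: everyone shares one haplotype vs. everyone differs.
identical = ["H1"] * 8
varied = ["H1", "H2", "H3", "H4", "H5", "H6", "H7", "H8"]

rmp_identical = random_match_probability(identical)  # 1.0: marker cannot discriminate
rmp_varied = random_match_probability(varied)        # 0.125: marker is informative
```

A value of 1.0 corresponds to the "Eve" scenario above: every sequence matches every other, so the marker says nothing about identity. The closer the value is to zero, the more useful the marker is for forensics.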

A half-baked database will provide only fuzzy or even misleading guidance. However, a database of genetic diversity in the form of DNA profiles would be quite useful in basic and health-related research (e.g., to address questions such as what disease genes are shared by how many Canadians, whether such genes are increasing in frequency and leading to an overload of our medical system, etc.). If kept public, such a database will grow and mature by itself.

The second type of data in the DNA database would come from close relatives of the missing person. The DNA from these close relatives must have some unique features expected to be shared among the close relatives (including the missing person) but not shared among the general public. High genetic diversity in the population means little effort is needed to find discriminating DNA evidence. Low genetic diversity means that we need to use many loci (in the case of DNA fingerprinting) or sequence very long stretches of DNA to find discriminating DNA evidence. In this latter case, a half-hearted effort will fail to put a name to the nameless, and the value of the DNA testing is nil.
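The trade-off between diversity and effort can be sketched with some simple arithmetic. The sketch below assumes, as a simplification, that loci are independent and that each has the same probability of matching between two unrelated people, so the combined match probability is that per-locus probability raised to the number of loci. The function name, the per-locus values and the one-in-a-million target are all illustrative assumptions, not figures from any real databank.

```python
def loci_needed(per_locus_match_prob, target):
    """Smallest number of independent loci whose combined chance of a
    coincidental match falls to the target or below.

    Simplifying assumption: every locus has the same match probability
    and the loci are independent, so probabilities multiply.
    """
    if not 0 < per_locus_match_prob < 1:
        raise ValueError("per-locus match probability must be between 0 and 1")
    if not 0 < target < 1:
        raise ValueError("target must be between 0 and 1")
    combined = 1.0
    loci = 0
    while combined > target:
        combined *= per_locus_match_prob
        loci += 1
    return loci


# Hypothetical populations, with a one-in-a-million target.
high_diversity = loci_needed(0.05, 1e-6)  # 5 loci
low_diversity = loci_needed(0.5, 1e-6)    # 20 loci
```

Under these assumed numbers, the low-diversity population needs four times as many loci to reach the same level of confidence, which is why stopping short in such a population leaves the test with little value.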

One difficulty with identifying aboriginal women is that they are quite closely related when compared to other human populations, even to North Asians. In the extreme case, suppose we have 26 genetically identical people named A, B, C, …, Z, with A and B being brothers and C and D being brothers. Also suppose that B and D were dead and reported missing by their respective brothers A and C. Also assume that we are in an ideal world of forensics, so that the remains of two bodies were found and DNA samples obtained. Will we now be able to put name B to one body and D to the other by DNA forensics alone? The answer is no. For the same reason, when one of two identical twins commits a crime, it is almost impossible to pinpoint which twin was at the crime scene based on DNA forensics alone. Of course, by "almost impossible" I mean that if we screen the entire genome of the two twins (about 3 billion bases) in various cell lineages, we might actually find a few somatic mutations that discriminate the two twins. However, such an effort would be prohibitively expensive. A DNA database will allow us to identify population-specific genetic markers for forensic purposes and facilitate cost-benefit analysis.

The new databank, set to begin running in 2017, will tell us how genetically diverse the aboriginal people are, what genetic markers can be used in DNA forensics, and whether DNA testing will get us any closer to identifying the missing. If we have profiled aboriginal people in different places in Canada and derived a set of variable genetic markers, then we should be able to put names to the nameless quickly and efficiently. The new databank, if made public, would also help us understand aboriginal ancestry, their genetic diseases, and their migration patterns from time immemorial.
