I have finally made public a summary that marks the completion of the original “independent” J2 Y-DNA project (http://j2-ydnaproject.org) and the beginning of the new J M172 project. If you are already in the FTDNA J2 group, you are automatically in the new project. The only differences you might see are that the results tables have been extensively rearranged (and will continue to be rearranged over the coming weeks) and that the new project has several admins.
The new focus of the FTDNA M172 group is collaborative research among all haplogroup J2 researchers.
The primary aim of the “j2-ydnaproject.org” J2 project was to answer the question: “Can extended haplotypes reliably predict which clade a haplotype belongs to?”
The final answer to this question is: yes and no.
The original question can be interpreted in two different ways, and the overall conclusion is different for each interpretation.
A) The first interpretation of the question is: can we use extended haplotypes to produce an accurate phylogenetic tree for haplogroup J2? The answer to this question is no, we cannot.
B) The second interpretation of the question is: can we use extended haplotypes to predict whether a haplotype belongs to a previously defined haplogroup subclade? The answer to this question is yes, we probably can (but with varying accuracy). Of course, determining whether a haplotype fits within a predefined clade requires extensive knowledge of the known haplotype values for that specific subclade. Since haplotypes are unable to produce an accurate phylogenetic tree, we must rely on SNPs to define the phylogenetic tree.
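As a rough illustration of interpretation B, the sketch below ranks predefined subclades by how far a haplotype is from each subclade's modal values. It is not the method used in the project: the clade names, marker values, and the functions `genetic_distance` and `predict_clade` are all hypothetical, and the distance is a simple stepwise count that assumes every haplotype lists the same markers in the same order.

```python
from typing import Dict, List, Tuple

# Hypothetical modal haplotypes (STR repeat counts at the same ordered markers).
# These values are invented for illustration and are NOT real J2 modals.
MODALS: Dict[str, List[int]] = {
    "J2a-example": [12, 23, 15, 10, 13, 17],
    "J2b-example": [12, 24, 15, 10, 11, 15],
}


def genetic_distance(a: List[int], b: List[int]) -> int:
    """Stepwise distance: sum of absolute repeat-count differences per marker."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same markers")
    return sum(abs(x - y) for x, y in zip(a, b))


def predict_clade(haplotype: List[int],
                  modals: Dict[str, List[int]]) -> List[Tuple[str, int]]:
    """Return (clade, distance) pairs, closest predefined clade first."""
    ranked = sorted(
        (genetic_distance(haplotype, modal), name)
        for name, modal in modals.items()
    )
    return [(name, dist) for dist, name in ranked]


if __name__ == "__main__":
    sample = [12, 23, 15, 10, 13, 16]  # hypothetical test haplotype
    for clade, dist in predict_clade(sample, MODALS):
        print(clade, dist)
```

A ranking like this only says which predefined clade a haplotype most resembles. It says nothing about how the clades themselves are related, which is why the tree itself still has to come from SNPs.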
Basically, haplotypes are very good at clustering very closely related haplotypes, and very good at telling very distantly related haplotypes apart (e.g. J2a vs. J2b), but they are not particularly good at telling us exactly how different haplotype clusters are related to each other.
It is very easy to pick out haplotype clusters that are very different from what the ancestral modal might have been (an obvious example is J2a-L25), but much harder to distinguish between haplotypes that are closer to the original modal yet actually sit in different sub-branches of the tree.
This conclusion is supported by three main forms of evidence (the first two are the most definitive):
A) The haplogroup J2 trees produced from SNPs and from haplotypes are incongruent (i.e. they are not the same). Since December last year, the new SNPs on the Geno 2.0 chip have provided results that give far greater resolution to the J2 tree. Geno 2.0 testing has shown us how clusters are related in ways that we would never have guessed from haplotype data.
B) If a phylogenetic tree for haplogroup J2 is made using character data and “bootstrapped”, most of the sub-branches have probability values well below 10% (for a branch to be regarded as “accurate” it needs a probability value of at least 75%, preferably about 95%). The only tree divisions that bootstrapping indicates are reliable are J2a vs. J2b, and some very tight haplotype clusters (which probably have a common ancestor within a genealogical time scale). These results will in time be described more fully (and exactly what “bootstrapping” is will be explained). The consensus tree constructed from the bootstrapping results is shown below (see website).
C) For several years I have tried to devise a method to produce an accurate phylogenetic tree from STR data, by trying to identify which markers are more diagnostic. Despite repeatedly refining my methodology, I have failed to create a method that produces phylogenetically accurate results. It is uncertain whether this is because it is impossible to create accurate phylogenies using STRs, or whether the methodology still requires further refinement. It is now apparent that, if it is possible to devise a method that will produce accurate phylogenies from STRs, the only way to validate such a methodology is to test it against a phylogeny defined using SNPs.
Over time, I will document the methodology that was developed.
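To give a feel for the bootstrapping mentioned in point B, here is a minimal sketch of the general idea: resample the markers with replacement, rebuild a tree each time, and count how often a given group of haplotypes comes out as a clade. It is an assumption-laden toy, not the analysis behind the consensus tree: it swaps in simple average-linkage (UPGMA-style) clustering on stepwise distances in place of a character-based tree method, and the haplotype names, values, and function names are all hypothetical.

```python
import random
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def distance(a: List[int], b: List[int], cols: List[int]) -> int:
    """Stepwise distance over the (possibly repeated) marker columns in cols."""
    return sum(abs(a[c] - b[c]) for c in cols)


def upgma_clades(haps: Dict[str, List[int]], cols: List[int]) -> Set[FrozenSet[str]]:
    """Average-linkage clustering; returns every clade formed while merging."""
    clusters = [frozenset([name]) for name in haps]
    clades: Set[FrozenSet[str]] = set()

    def avg(c1: FrozenSet[str], c2: FrozenSet[str]) -> float:
        total = sum(distance(haps[x], haps[y], cols) for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest average distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: avg(clusters[p[0]], clusters[p[1]]))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        clades.add(merged)
    return clades


def bootstrap_support(haps: Dict[str, List[int]], group: List[str],
                      replicates: int = 200, seed: int = 0) -> float:
    """Fraction of bootstrap replicates in which `group` appears as a clade."""
    rng = random.Random(seed)
    n_markers = len(next(iter(haps.values())))
    target = frozenset(group)
    hits = 0
    for _ in range(replicates):
        # Resample marker columns with replacement.
        cols = [rng.randrange(n_markers) for _ in range(n_markers)]
        if target in upgma_clades(haps, cols):
            hits += 1
    return hits / replicates


if __name__ == "__main__":
    # Invented haplotypes: two tight, very distinct pairs.
    toy = {
        "A1": [10, 10, 10, 10], "A2": [10, 10, 10, 11],
        "B1": [20, 20, 20, 20], "B2": [20, 20, 20, 21],
    }
    print(bootstrap_support(toy, ["A1", "A2"]))
```

In this toy data the two pairs differ at every marker, so the grouping survives any resampling. With real J2 haplotypes, a branch that depends on only one or two markers disappears in many replicates, which is what a low support value reflects.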