Czech Y-DNA: Haplogroup R1a

R1a is the most numerous haplogroup in the Czech Y-DNA Database; it accounts for one third of all our records. This year the number of our R1a profiles exceeded one hundred, and the number of profiles with 25 markers identified is close to fifty. This allows us to study, using the Median Joining Network method, how the relations between individual profiles depend on the number of markers used. We consider this very important, because most of the Czech records are "short". Discovering possible relations and regularities that hold for both long and short profiles helps us to reveal the inner structure of the haplogroup and to assign profiles to groups of the same character (and above all of the same origin) even when only a lower number of markers is known. The R1a haplogroup, or more precisely R1a1, has in fact no official inner structure, unlike the R1b group, which the SNP mutations discovered so far divide into about 20 subgroups.
Our lack of knowledge of the inner structure of R1a stems mainly from the fact that the group is poorly represented in the world databases, so official research is not under the same demand as it is, for instance, for the R1b group mentioned above. The distribution of R1a in Europe is uneven: it dominates in Eastern European states and is therefore often connected with so-called Slavic ethnicity and origin. This view is naive at best. The recent example of the 4,600-year-old haplotype from the Eulau burial ground shows that at least part of the European R1a occurred in Central Europe at a time at least twice as old as the rise of the Slavs. It is also well known that R1a is even more strongly represented among the inhabitants of Central Asian countries than in the countries of Eastern Europe.

The first picture shows the Czech and Slovak haplogroup R1a profiles at a resolution of 17 markers.
Profiles from Ysearch have a country-of-origin code prefixed to their ID. The Network program truncates the ID to 6 characters. Such an ID is, with some exceptions, still unambiguous.

The profiles from our databases are marked light green. Dark green marks the Ysearch profiles that should have Czech ancestors, and yellow marks the other Ysearch profiles with the same markers identified. Profiles from archaeological research are marked brown:
LC-R1a is the profile detected in the 3,000-year-old Lichtenstein Cave, and the profile from the Eulau burial ground is 4,600 years old.
The missing markers of these two prehistoric profiles are filled in with modal values. Other colors will be discussed later.

Weights reflecting the different mutation rates of the individual markers were used to compute the graph (for the first time on Genebáze). Many laboratories worldwide work on determining the mutation rates of markers. Slowly mutating markers are given a higher weight than quickly mutating ones. Graphs in which all markers have the same weight are usually very complex, because the quickly mutating markers introduce noise and false affinities. An example shows the same data processed with identical mutation rates:

The graph is well arranged; in many of its parts we can see clearly definable partial branches.
In the following text we will observe whether these groups are also visible in graphs with higher resolution. If so, we will try to define them more closely.
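The idea of weighting markers by mutation rate can be illustrated with a minimal sketch. It is not the Network program's own weighting scheme: the rates below are hypothetical illustrative numbers, and the function names `marker_weights` and `weighted_distance` are assumptions introduced only for this example.

```python
# Hypothetical relative mutation rates (mutations per 1000 generations).
# Illustrative values only, not measured rates.
RATES = {"DYS393": 1, "DYS390": 2, "DYS458": 8}


def marker_weights(rates):
    """Weight each marker inversely to its rate; the fastest marker gets weight 1."""
    fastest = max(rates.values())
    return {marker: fastest / rate for marker, rate in rates.items()}


def weighted_distance(a, b, weights):
    """Sum of weighted repeat-count differences over markers known in both profiles."""
    shared = a.keys() & b.keys() & weights.keys()
    return sum(weights[m] * abs(a[m] - b[m]) for m in shared)


# Two hypothetical profiles (repeat counts per marker).
p1 = {"DYS393": 13, "DYS390": 25, "DYS458": 15}
p2 = {"DYS393": 13, "DYS390": 24, "DYS458": 17}

weights = marker_weights(RATES)
equal = {marker: 1 for marker in RATES}

print(weighted_distance(p1, p2, weights))  # slow markers dominate
print(weighted_distance(p1, p2, equal))    # every marker counts the same
```

With these numbers the two-step difference on the fast marker DYS458 contributes less than the one-step difference on the slower DYS390, which is the effect the weighting is meant to achieve.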

Processed 7.4.2009, 183 records, haplogroup R1a, 17 markers.
DYS19, DYS390, DYS391, DYS392, DYS393, DYS385a, DYS385b, DYS437, DYS438, DYS439, DYS389-1, DYS389-2, DYS456, DYS458, GATA-H4, DYS635 and DYS448.
In the next part we created the same graphs for higher numbers of markers, namely for 25 and for 56 markers.
Both graphs are again arranged to show particular groups.
Processed 7.4.2009, 436 records, haplogroup R1a, 25 markers.
DYS19, DYS390, DYS391, DYS392, DYS393, DYS385a, DYS385b, DYS437, DYS438, DYS439, DYS389-1, DYS389-2, DYS456, DYS458, GATA-H4, DYS448, DYS446, DYS444, DYS388, DYS426, DYS481, DYS449, DYS447, DYS459a and DYS459b.
Processed 7.4.2009, 396 records, haplogroup R1a, 56 markers.
DYS19, DYS390, DYS391, DYS392, DYS393, DYS385a, DYS385b, DYS437, DYS438, DYS439, DYS389-1, DYS389-2, DYS456, DYS458, GATA-H4, DYS448, DYS446, DYS444, DYS388, DYS426, DYS481, DYS449, DYS447, DYS459a, DYS459b, DYS454, DYS455, DYS460, YCAIIa, YCAIIb, DYS607, DYS576, DYS570, DYS442, DYS531, DYS578, DYS395A1a, DYS395A1b, DYS590, DYS537, DYS641, DYS472, DYS406S1, DYS511, DYS413a, DYS413b, DYS557, DYS594, DYS436, DYS490, DYS534, DYS450, DYS520, DYS617, DYS568, DYS487, DYS572, DYS640, DYS492 and DYS565.

In the graph for 56 markers several groups of haplotypes are highlighted which we presume could be located even in graphs of lower resolution. With some exceptions, we focus on the profiles from our databases.

The profile marked red is a theoretical modal profile of all R1a1 from Ysearch up to the year 2007.

We tried to connect group V with a group of haplotypes already known abroad, which are characterized by the marker value DYS388=10. The next picture shows a detail of this group.

This detail shows the point that defines the group of profiles with the marker DYS388 equal to 10. An attempt to mark all these profiles on graphs with a lower number of markers failed: the profiles did not form any clearly visible groups. The only exception is the profiles marked here. Group V as defined here is therefore just a part of the group of profiles defined by the marker DYS388.

Group N is of great importance to us, for it seems that one third of Czech R1a haplotypes belong to it. We do not yet know any objective criterion for defining it, except perhaps that, together with group P, it forms areas in the graphs with higher values of the marker DYS481, namely 25 and higher.
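The two marker-based criteria mentioned so far (DYS388 equal to 10, and DYS481 equal to 25 or higher) can be checked mechanically. The following is only a sketch: the function name `tag_by_markers` and the tag strings are assumptions for illustration, and a tag is a hint rather than a group assignment, since DYS481 of 25 or higher is shared by groups N and P and does not tell them apart.

```python
def tag_by_markers(profile):
    """Return the marker-based tags that apply to a profile.

    `profile` maps marker names to repeat counts. Markers that were
    not typed are simply absent, so no tag is produced for them.
    """
    tags = set()
    if profile.get("DYS388") == 10:
        tags.add("DYS388=10")
    dys481 = profile.get("DYS481")
    if dys481 is not None and dys481 >= 25:
        tags.add("DYS481>=25")
    return tags


# Hypothetical profiles, reduced to the markers of interest.
print(tag_by_markers({"DYS388": 10, "DYS481": 23}))
print(tag_by_markers({"DYS388": 12, "DYS481": 26}))
print(tag_by_markers({"DYS19": 16, "DYS390": 25}))  # neither marker typed
```

Neither DYS388 nor DYS481 is among the 17 markers of the lowest-resolution graph, so profiles typed only on that set receive no tag from this check.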

The letter E does not mark any group; it is only an attempt to mark the probable position, on graphs with a higher number of markers, where the haplotype from Eulau could be located. Here the letter E marks the profile closest to those which are in turn the closest to the Eulau profile on the graph of 17 markers.
Like V, group S is well known from earlier research on the R1a haplogroup. It comprises haplotypes with the marker values YCAIIa,b equal to 19,21. It has been known for many years that these haplotypes are found only in Britain and Norway.
Group F could correspond to the third most numerous group of Czech and Slovak profiles in the graph for 17 markers.
Details of the groups S and F:

Now we will look at how the groups marked above appear on graphs with lower resolution.
As stated in the introduction, finding some rules or features would help us classify into groups even those haplotypes for which only a small number of markers is known so far.
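Comparing long and short profiles means looking at a long profile only through the markers of the lower-resolution set. A minimal sketch of that restriction follows; the marker list is the 17-marker set given in this study, while the helper names `restrict` and `same_at_resolution` and the profile values are hypothetical.

```python
# The 17 markers listed for the lowest-resolution graph in this study.
MARKERS_17 = [
    "DYS19", "DYS390", "DYS391", "DYS392", "DYS393", "DYS385a", "DYS385b",
    "DYS437", "DYS438", "DYS439", "DYS389-1", "DYS389-2", "DYS456",
    "DYS458", "GATA-H4", "DYS635", "DYS448",
]


def restrict(profile, markers):
    """Keep only the listed markers that the profile was actually typed for."""
    return {m: profile[m] for m in markers if m in profile}


def same_at_resolution(a, b, markers):
    """True if both profiles look identical when reduced to the given marker set."""
    return restrict(a, markers) == restrict(b, markers)


# Hypothetical profiles that differ only on DYS481, a marker outside the 17-marker set.
long_a = {"DYS19": 16, "DYS390": 25, "DYS481": 23}
long_b = {"DYS19": 16, "DYS390": 25, "DYS481": 26}

print(same_at_resolution(long_a, long_b, MARKERS_17))                # identical at 17 markers
print(same_at_resolution(long_a, long_b, MARKERS_17 + ["DYS481"]))   # differ once DYS481 is added
```

This is why two profiles can sit together on a low-resolution graph and apart on a high-resolution one: the difference lies in markers the shorter set does not contain.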

The graph with a resolution of 25 markers:
In this graph groups N and P form large, clearly definable separate parts. The reason is again that, unlike the rest of the graph, they have values of the marker DYS481 equal to 25 and higher.
Group F contains only two profiles from the graph of 56 markers.

Group V is also separate here. If we assigned to it the other profiles with DYS388=10, they would lie in the middle of the graph, in an area that is not well arranged.

In this graph group S occupies several neighboring branches without an obvious relation, because the markers defining this group were not used for this graph. Even so, it can be seen that (with some exceptions) the profiles of this group "keep close together".
Detail of the groups V and S:

Now let's look at how our groups appear on the graph of 17 markers.

Even here group N forms an isolated branch, clearly separated from the rest of the graph. However, it contains only two profiles from the graph of 56 markers. Three other profiles that belong to this group in the graph of 56 markers are also present, but they lie in entirely different parts of the graph.

Similarly, group V contains just two profiles visible in the graph of 56 markers. This group, too, is clearly separated from the other parts of the graph.

Group P has no direct connection to the graph of 56 markers. Here it contains many profiles that belong to this group in the graph of 25 markers as well.

Group F also contains two profiles visible in the graph of 56 markers, and it is clearly definable here, too.

Group S, however, has an unclear position here and cannot be defined graphically. From this it can be concluded that some of the markers missing here, but used in the graph of 25 markers, are connected with the definition of this group.

Graphs with low resolution do not provide any certainty about the classification of a profile into a specific subgroup.
We can be more confident only when the studied profile is close to a profile whose position in a graph with higher resolution we know.
It is still necessary to obtain a higher number of profiles. We also need to be aware of how few profiles were used to create these graphs. The more profiles are determined, the greater the changes in the graphs will surely be. As an example we can take the branches of the graph visible in the upper part of the picture. The profiles from our database there, whose IDs begin with O, have been known for only a few months or weeks. Before that there were no Czech or Slovak profiles in this part of the graph.

The many cases in which a profile lies in one group in a graph with higher resolution and somewhere completely different in a graph with lower resolution make it clear that these graphs hold information about a further inner structure of these groups. Any conclusions, however, would be premature for now.
Until the SNP mutations that precisely divide the R1a subgroups are known, there is no choice but to collect further new profiles and monitor the behavior of the graphs.

The following map shows the countries from which the profiles of the individual groups originate.
The letter E does not mark a group but the profiles close to the profile from Eulau, up to and including a genetic distance of 3.
The map covers only the profiles used in the graphs of this study.
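The selection "up to and including a genetic distance of 3" can be sketched as follows. The study does not state how the distance was computed; the sketch assumes the common stepwise count, that is, the sum of absolute differences in repeat counts over the markers both profiles share. The function names and all marker values are hypothetical, and the reference profile below is not the actual Eulau haplotype.

```python
def genetic_distance(a, b):
    """Stepwise distance: total repeat-count difference over shared markers (assumed definition)."""
    shared = a.keys() & b.keys()
    return sum(abs(a[m] - b[m]) for m in shared)


def within_distance(reference, profiles, max_distance=3):
    """IDs of profiles whose distance to the reference is at most max_distance."""
    return sorted(
        pid for pid, profile in profiles.items()
        if genetic_distance(reference, profile) <= max_distance
    )


# Hypothetical reference and candidate profiles.
reference = {"DYS19": 16, "DYS390": 25, "DYS391": 11}
profiles = {
    "A": {"DYS19": 16, "DYS390": 25, "DYS391": 11},  # distance 0
    "B": {"DYS19": 16, "DYS390": 24, "DYS391": 10},  # distance 2
    "C": {"DYS19": 15, "DYS390": 23, "DYS391": 10},  # distance 4
}

print(within_distance(reference, profiles))
```

Profiles with fewer shared markers have fewer chances to differ, so distances are comparable only between profiles typed on the same marker set.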

                  Ych files:   17.ych 25.ych 56.ych
© Genebáze