11

Although chromosome 19 is only the 19th largest autosome, it contains 1440 protein-coding genes, the second-highest number of any human chromosome.

For comparison: if one naively assumed the same density of protein-coding genes as on chr1, which has the highest absolute number of protein-coding genes (2109), the expected number for chr19 would be ~540 protein-coding genes.
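The naive expectation above is just a density scaling, and a minimal sketch of it follows. The gene counts are the ones quoted in the question; the chromosome lengths (roughly 249 Mb for chr1 and 59 Mb for chr19) and the function name `expected_gene_count` are my own assumptions for illustration. The exact result depends on which lengths are used (for example, whether assembly gaps are counted), so it need not reproduce the ~540 figure exactly.

```python
def expected_gene_count(ref_genes: int, ref_length_mb: float, target_length_mb: float) -> float:
    """Genes expected on a target chromosome if it had the reference's gene density."""
    if ref_length_mb <= 0 or target_length_mb < 0:
        raise ValueError("chromosome lengths must be positive")
    density = ref_genes / ref_length_mb  # genes per megabase on the reference
    return density * target_length_mb


# Gene counts as quoted in the question.
CHR1_GENES = 2109
CHR19_GENES = 1440

# ASSUMPTION: approximate chromosome lengths in megabases, not taken from the question.
CHR1_LENGTH_MB = 249.0
CHR19_LENGTH_MB = 59.0

expected = expected_gene_count(CHR1_GENES, CHR1_LENGTH_MB, CHR19_LENGTH_MB)
fold = CHR19_GENES / expected  # how many times denser chr19 is than the naive expectation

print(f"expected on chr19 at chr1 density: {expected:.0f} genes")
print(f"observed / expected: {fold:.1f}x")
```

Whatever lengths are plugged in, the point is the ratio: chr19 carries well over twice as many protein-coding genes as chr1's density would predict.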

The x-axis of this graph highlights the increased density of protein-coding genes on chromosome 19.

David
tsttst
  • I don't think there is any answer to "why". – WYSIWYG Jul 18 '16 at 11:19
  • Within any natural population there are going to be outliers. The distribution graph you show has a long tail, which is to be expected. Look for graphs on the distribution of height, for example, and you'll see that 90% of the values are within a fairly small range, but the 5% on either side stretch out quite far, as there are extreme outliers on both ends. – MattDMo Jul 18 '16 at 13:31
  • As extreme outliers in natural populations are often somehow special, and as an n of over 1,000 genes strongly argues against that many independent chance events, I was curious whether there is something special about chr 19 or its genes that could favor such a comparatively high gene density. – tsttst Jul 18 '16 at 14:02
  • Well, a very rough analogy-type answer is that random mutation is not a very efficient way to make a genome, just as randomly tapping the keyboard is not a very efficient way to write computer code; you'll end up with variation in the efficiency of scripts. – rg255 Jul 20 '16 at 07:16
  • Paralogous genes normally lie on different chromosomes, which shows that chromosomal location is not strictly determined by the de novo mutation / formation of a gene on a given chromosome. – tsttst Jul 21 '16 at 03:08
  • I think that OP is correct that there is something to explain here (i.e., the standard null models for gene distribution across the genome would not predict such an outlier), but I think that nobody knows the answer. – Daniel Weissman Jul 21 '16 at 16:15
  • @WYSIWYG I think "why" in this case translates to "what events led to". – James Jul 27 '16 at 04:25

2 Answers

11

This Nature paper from 2004, by Jane Grimwood et al., goes at least a long way towards answering the OP's question. In short: there were inordinately many duplications, especially during an event 30-40 million years ago, as well as during a much more recent event. These duplications are, uncharacteristically, predominantly intra-chromosomal rather than inter-chromosomal. Also, chromosome 19 contains a lot of immunoglobulin-like paralogues: a type of gene for which it is clearly evolutionarily adaptive to undergo rapid duplication followed by random mutation, as they play a role in adapting to potential antigens.

WYSIWYG
Yuri Robbers
3

It's interesting that not only the leader, 19, but also 16 and 17 follow a similar trend. Perhaps their size gives the best weight/length proportion to ensure safe replication? Then what would have to be explained is 18, so far to the left. That could be the case if 18 is newer, resulting from the split of a larger chromosome or the fusion of two smaller ones, and has not yet had time to accumulate a greater number of genes favoured by the advantages of "positional co-evolution of genes" (if such a thing exists).

Rodrigo