Human genome
- The human genome is the complete set of nucleic acid sequences for humans (Homo sapiens), encoded as DNA within the 23 chromosome pairs in cell nuclei and in a small DNA molecule found within individual mitochondria. It includes both protein-coding DNA genes and noncoding DNA.
- Sequence difference : between human individuals (~0.1%); between humans and chimpanzees (~4%).
- MicroRNA functions as a post-transcriptional regulator of gene expression.
- Completeness of the human genome sequence : Although the human genome has been completely sequenced for all practical purposes, there are still hundreds of gaps in the sequence. A recent study noted more than 160 euchromatic gaps, of which 50 were closed. However, there are still numerous gaps in the heterochromatic parts of the genome, which are much harder to sequence because of numerous repeats and other intractable sequence features.
The following table summarizes the physical organization and gene content of the human reference genome, with links to the original analysis, as published in the Ensembl database at the European Bioinformatics Institute (EBI) and Wellcome Trust Sanger Institute.
Chromosome | Length (mm) | Base pairs | Variations | Confirmed proteins | Putative proteins | Pseudogenes | miRNA | rRNA | snRNA | snoRNA | Misc ncRNA | Links | Centromere position (Mbp) | Cumulative (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 85 | 249,250,621 | 4,401,091 | 2,012 | 31 | 1,130 | 134 | 66 | 221 | 145 | 106 | EBI | 125.0 | 7.9 |
2 | 83 | 243,199,373 | 4,607,702 | 1,203 | 50 | 948 | 115 | 40 | 161 | 117 | 93 | EBI | 93.3 | 16.2 |
3 | 67 | 198,022,430 | 3,894,345 | 1,040 | 25 | 719 | 99 | 29 | 138 | 87 | 77 | EBI | 91.0 | 23.0 |
4 | 65 | 191,154,276 | 3,673,892 | 718 | 39 | 698 | 92 | 24 | 120 | 56 | 71 | EBI | 50.4 | 29.6 |
5 | 62 | 180,915,260 | 3,436,667 | 849 | 24 | 676 | 83 | 25 | 106 | 61 | 68 | EBI | 48.4 | 35.8 |
6 | 58 | 171,115,067 | 3,360,890 | 1,002 | 39 | 731 | 81 | 26 | 111 | 73 | 67 | EBI | 61.0 | 41.6 |
7 | 54 | 159,138,663 | 3,045,992 | 866 | 34 | 803 | 90 | 24 | 90 | 76 | 70 | EBI | 59.9 | 47.1 |
8 | 50 | 146,364,022 | 2,890,692 | 659 | 39 | 568 | 80 | 28 | 86 | 52 | 42 | EBI | 45.6 | 52.0 |
9 | 48 | 141,213,431 | 2,581,827 | 785 | 15 | 714 | 69 | 19 | 66 | 51 | 55 | EBI | 49.0 | 56.3 |
10 | 46 | 135,534,747 | 2,609,802 | 745 | 18 | 500 | 64 | 32 | 87 | 56 | 56 | EBI | 40.2 | 60.9 |
11 | 46 | 135,006,516 | 2,607,254 | 1,258 | 48 | 775 | 63 | 24 | 74 | 76 | 53 | EBI | 53.7 | 65.4 |
12 | 45 | 133,851,895 | 2,482,194 | 1,003 | 47 | 582 | 72 | 27 | 106 | 62 | 69 | EBI | 35.8 | 70.0 |
13 | 39 | 115,169,878 | 1,814,242 | 318 | 8 | 323 | 42 | 16 | 45 | 34 | 36 | EBI | 17.9 | 73.4 |
14 | 36 | 107,349,540 | 1,712,799 | 601 | 50 | 472 | 92 | 10 | 65 | 97 | 46 | EBI | 17.6 | 76.4 |
15 | 35 | 102,531,392 | 1,577,346 | 562 | 43 | 473 | 78 | 13 | 63 | 136 | 39 | EBI | 19.0 | 79.3 |
16 | 31 | 90,354,753 | 1,747,136 | 805 | 65 | 429 | 52 | 32 | 53 | 58 | 34 | EBI | 36.6 | 82.0 |
17 | 28 | 81,195,210 | 1,491,841 | 1,158 | 44 | 300 | 61 | 15 | 80 | 71 | 46 | EBI | 24.0 | 84.8 |
18 | 27 | 78,077,248 | 1,448,602 | 268 | 20 | 59 | 32 | 13 | 51 | 36 | 25 | EBI | 17.2 | 87.4 |
19 | 20 | 59,128,983 | 1,171,356 | 1,399 | 26 | 181 | 110 | 13 | 29 | 31 | 15 | EBI | 26.5 | 89.3 |
20 | 21 | 63,025,520 | 1,206,753 | 533 | 13 | 213 | 57 | 15 | 46 | 37 | 34 | EBI | 27.5 | 91.4 |
21 | 16 | 48,129,895 | 787,784 | 225 | 8 | 150 | 16 | 5 | 21 | 19 | 8 | EBI | 13.2 | 92.6 |
22 | 17 | 51,304,566 | 745,778 | 431 | 21 | 308 | 31 | 5 | 23 | 23 | 23 | EBI | 14.7 | 93.8 |
X | 53 | 155,270,560 | 2,174,952 | 815 | 23 | 780 | 128 | 22 | 85 | 64 | 52 | EBI | 60.6 | 99.1 |
Y | 20 | 59,373,566 | 286,812 | 45 | 8 | 327 | 15 | 7 | 17 | 3 | 2 | EBI | 12.5 | 100.0 |
mtDNA | 0.0054 | 16,569 | 929 | 13 | 0 | 0 | 0 | 2 | 0 | 0 | 22 | EBI | N/A | 100.0 |
total | | 3,095,693,981 | | 19,313 | 738 | 12,859 | | | | | | | | |
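
The totals in the last row can be cross-checked against the per-chromosome rows. The following is a minimal sketch, assuming the base-pair, confirmed-protein, and putative-protein values are copied from the table as-is; the names `ROWS`, `total_bp`, `total_confirmed`, and `total_putative` are illustrative and not part of any Ensembl tool.

```python
# (base pairs, confirmed proteins, putative proteins) per row,
# copied by hand from the table above. All names here are illustrative.
ROWS = {
    "1": (249_250_621, 2_012, 31),
    "2": (243_199_373, 1_203, 50),
    "3": (198_022_430, 1_040, 25),
    "4": (191_154_276, 718, 39),
    "5": (180_915_260, 849, 24),
    "6": (171_115_067, 1_002, 39),
    "7": (159_138_663, 866, 34),
    "8": (146_364_022, 659, 39),
    "9": (141_213_431, 785, 15),
    "10": (135_534_747, 745, 18),
    "11": (135_006_516, 1_258, 48),
    "12": (133_851_895, 1_003, 47),
    "13": (115_169_878, 318, 8),
    "14": (107_349_540, 601, 50),
    "15": (102_531_392, 562, 43),
    "16": (90_354_753, 805, 65),
    "17": (81_195_210, 1_158, 44),
    "18": (78_077_248, 268, 20),
    "19": (59_128_983, 1_399, 26),
    "20": (63_025_520, 533, 13),
    "21": (48_129_895, 225, 8),
    "22": (51_304_566, 431, 21),
    "X": (155_270_560, 815, 23),
    "Y": (59_373_566, 45, 8),
    "mtDNA": (16_569, 13, 0),
}

# Sum each column over all rows, including the mitochondrial genome.
total_bp = sum(bp for bp, _, _ in ROWS.values())
total_confirmed = sum(confirmed for _, confirmed, _ in ROWS.values())
total_putative = sum(putative for _, _, putative in ROWS.values())

print(f"{total_bp:,} bp, {total_confirmed:,} confirmed, {total_putative:,} putative")
```

With these inputs the three sums agree with the base-pair, confirmed-protein, and putative-protein entries of the total row, which suggests that row includes mtDNA.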
- Coding sequences : Protein-coding sequences represent the most widely studied and best understood component of the human genome. About 20,000 human proteins have been annotated in databases such as UniProt.
- Protein-coding capacity per chromosome : Protein-coding genes are distributed unevenly across the chromosomes, ranging from a few dozen to more than 2,000 per chromosome, with an especially high gene density within chromosomes 19, 11, and 1.
- Non-coding DNA : Noncoding DNA (ncDNA) is defined as all of the DNA sequences within a genome that are not found within protein-coding exons, and so are never represented within the amino acid sequence of expressed proteins. By this definition, more than 98% of the human genome is composed of ncDNA.
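
The uneven distribution of protein-coding genes noted above can be made concrete by dividing a chromosome's confirmed-protein count by its length. This is a rough sketch under two assumptions: "gene density" is taken to mean confirmed proteins per million base pairs (Mbp), and only a hand-picked subset of rows from the table is used. The helper `density_per_mbp` and the dictionary `SAMPLE` are hypothetical names chosen for illustration.

```python
def density_per_mbp(base_pairs: int, genes: int) -> float:
    """Return the number of genes per million base pairs."""
    return genes / (base_pairs / 1_000_000)


# (base pairs, confirmed proteins) for a subset of chromosomes,
# copied from the table; the choice of subset is arbitrary.
SAMPLE = {
    "1": (249_250_621, 2_012),
    "11": (135_006_516, 1_258),
    "13": (115_169_878, 318),
    "17": (81_195_210, 1_158),
    "18": (78_077_248, 268),
    "19": (59_128_983, 1_399),
    "Y": (59_373_566, 45),
}

# Rank the sampled chromosomes from most to least gene-dense.
ranking = sorted(SAMPLE, key=lambda chrom: density_per_mbp(*SAMPLE[chrom]), reverse=True)

for chrom in ranking:
    print(f"chr{chrom}: {density_per_mbp(*SAMPLE[chrom]):.1f} confirmed proteins per Mbp")
```

Within this subset, chromosome 19 ranks first at roughly 24 confirmed proteins per Mbp and chromosome Y last at under 1. Chromosome 1 has the largest absolute count but a lower density than chromosomes 19, 17, and 11 by this particular measure.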
https://en.wikipedia.org/wiki/Human_genome
http://web.ornl.gov/sci/techresources/Human_Genome/project/info.shtml