
Genomic Libraries

Keywords: genomic libraries    Source: Internet

Genomic DNA libraries

Size of some genomes and chromosomes:

Comparative Sequence Sizes	(Bases)
Yeast chromosome 3	350 Thousand
Escherichia coli (bacterium) genome	4.6 Million
Largest yeast chromosome now mapped	5.8 Million
Entire yeast genome (completed 5/96)	15 Million
Smallest human chromosome (Y)	50 Million
Largest human chromosome (1)	250 Million
Entire human genome	3 Billion

The human genome contains approximately 20,000 protein-coding genes (earlier estimates ran to 50,000 or more) within about 3 billion base pairs of DNA, distributed among 23 pairs of chromosomes.

Fragmentation of genomic DNA for library construction

Restriction endonuclease digestion

A six-cutter (e.g. EcoRI) will cut on average once every 4^6 = 4096 bp (about 4.1 kb). Complete digestion of human DNA with this type of enzyme will result in on the order of 1 x 10^6 unique fragments (3 x 10^9 / 4096 is about 7.3 x 10^5).
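
The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: it assumes a random sequence with equal base frequencies, and the function names are made up for this example.

```python
# Expected cutting frequency of a restriction enzyme, assuming random DNA
# with equal frequencies of A, C, G and T (an idealization).

HUMAN_GENOME_BP = 3e9  # genome size used in this document


def expected_site_spacing(site_length: int) -> int:
    """Average distance (bp) between recognition sites of the given length."""
    return 4 ** site_length


def expected_fragment_count(genome_bp: float, site_length: int) -> float:
    """Approximate number of fragments after complete digestion."""
    return genome_bp / expected_site_spacing(site_length)


spacing = expected_site_spacing(6)
fragments = expected_fragment_count(HUMAN_GENOME_BP, 6)
print(f"six-cutter spacing: {spacing} bp")
print(f"fragments from the human genome: {fragments:.2e}")
```

Real genomes are not random, so actual fragment sizes vary widely around this average.
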
What is the probability of finding a clone within a given library?

The exact probability of having any given DNA sequence in the library can be calculated from the equation

N = ln(1 - P) / ln(1 - f)

P is the desired probability

f is the fractional proportion of the genome in a single recombinant

N is the necessary number of recombinants

For example, how large a library (i.e. how many clones) would you need in order to have a 99% probability of finding a desired sequence represented in a library created by digestion with a 6-cutter?

N = ln(1 - 0.99) / ln(1 - (4096 / (3 x 10^9)))

N = 3.37 x 10^6 clones
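
The calculation can be reproduced with a short Python sketch. The function name `clones_needed` is hypothetical; the formula and the input values are the ones given above.

```python
import math


def clones_needed(probability: float, insert_bp: float, genome_bp: float) -> int:
    """Number of recombinants N = ln(1 - P) / ln(1 - f), rounded up."""
    f = insert_bp / genome_bp  # fraction of the genome in one recombinant
    # log1p(-f) computes ln(1 - f) accurately for very small f
    return math.ceil(math.log(1 - probability) / math.log1p(-f))


n = clones_needed(0.99, 4096, 3e9)
print(f"clones needed for P = 0.99: {n:.3g}")
```

Because f is tiny, ln(1 - f) is almost exactly -f, so N is close to -ln(1 - P) / f: about 4.6 genome equivalents of clones for P = 0.99.
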

Thus, from this type of analysis we can see that we need a technology which will allow us to achieve the following:

Stable insertion of relatively large DNA fragments into our cloning vector
High efficiency of insertion and the ability to handle large numbers of clones

For example, when plating E. coli colonies on a 3" petri plate, the maximum practical density to allow isolation of individual colonies is about 100-200 colonies per plate.
If we were to try to plate our library of 3.37 x 10^6 clones in this way, we would need about 22,500 plates (at roughly 150 colonies per plate).
Not only that, but such large DNA fragments are not well tolerated in typical E. coli cloning vectors such as pBR322.
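
As a quick check of the plate count, here is a minimal sketch. The function name is illustrative, and 150 colonies per plate is taken as the midpoint of the 100-200 range quoted above.

```python
def plates_needed(clones: int, colonies_per_plate: int) -> int:
    """Plates required to spread the library, rounding up to whole plates."""
    return -(-clones // colonies_per_plate)  # integer ceiling division


LIBRARY_SIZE = 3_370_000  # 3.37 x 10^6 clones, from the calculation above

for density in (100, 150, 200):
    print(density, "colonies/plate ->", plates_needed(LIBRARY_SIZE, density), "plates")
```
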

Bacteriophage lambda vectors are commonly used for construction of genomic libraries

Bacteriophage lambda (λ) is an E. coli phage whose icosahedral head contains the viral genome.

During replication, the phage DNA is produced in a concatemeric form, which is cleaved by appropriate endonucleases to allow packaging of a single genome within the phage capsid.
It was found that internal regions of the phage genome, which are not essential to phage replication, could be removed and replaced with DNA of interest.
This hybrid DNA could be packaged efficiently and still form infective phage.

The advantages of this type of system vs plasmids like pBR322 are:

The phage genome is able to package efficiently with DNA inserts as large as 20 kb.
The packaged phage are highly infectious and infect E. coli at a much higher efficiency than plasmid transformation methods.
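
To see what the larger insert size buys, the library equation can be rearranged to P = 1 - (1 - f)^N, which gives the probability that a library of N clones contains a given sequence. The sketch below compares 4096 bp and 20 kb inserts for a library of one million clones. The library size is an arbitrary illustrative value and the function name is hypothetical.

```python
import math


def coverage_probability(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """P = 1 - (1 - f)^N, the library-size equation solved for P."""
    f = insert_bp / genome_bp
    return 1.0 - math.exp(n_clones * math.log1p(-f))


GENOME_BP = 3e9
N_CLONES = 1_000_000  # illustrative library size, not from the text

p_six_cutter = coverage_probability(N_CLONES, 4096, GENOME_BP)
p_lambda = coverage_probability(N_CLONES, 20_000, GENOME_BP)
print(f"4096 bp inserts: P = {p_six_cutter:.3f}")
print(f"20 kb inserts:   P = {p_lambda:.4f}")
```
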

Incomplete Digestion of Genomic DNA will allow identification of sequence overlaps

Complete digestion with an endonuclease will result in a library containing no overlapping fragments.

However, incomplete digestion will result in a library containing overlapping fragments.

Thus, the sequence information obtained from one clone will allow the isolation of clones containing neighboring (overlapping) sequence information.
This can allow large contiguous stretches of sequence information to be obtained ("chromosome walking").
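
The difference between complete and incomplete digestion, and the idea of walking from one clone to the next, can be illustrated with a toy model. Everything here is an assumption made for the example: the 100 bp "chromosome", the site positions, the 70 bp insert limit and the function names. Fragments are represented as (start, end) coordinates.

```python
from itertools import combinations


def complete_digest(length, sites):
    """Every site is cut: fragments run between adjacent cut positions."""
    cuts = [0] + sorted(sites) + [length]
    return list(zip(cuts[:-1], cuts[1:]))


def partial_digest(length, sites, max_insert):
    """Some sites escape cutting: any pair of cut positions can bound a
    fragment, as long as the fragment still fits in the vector."""
    cuts = [0] + sorted(sites) + [length]
    return [(a, b) for a, b in combinations(cuts, 2) if b - a <= max_insert]


def overlaps(a, b):
    """True if two (start, end) fragments share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def walk(start, clones):
    """Greedy walk: repeatedly move to the overlapping clone that extends
    furthest to the right, until no clone extends the current one."""
    path = [start]
    current = start
    while True:
        candidates = [c for c in clones if overlaps(current, c) and c[1] > current[1]]
        if not candidates:
            return path
        current = max(candidates, key=lambda c: c[1])
        path.append(current)


complete = complete_digest(100, [30, 60])
partial = partial_digest(100, [30, 60], max_insert=70)
print("complete:", complete)
print("partial: ", partial)
print("walk in complete library:", walk((0, 30), complete))
print("walk in partial library: ", walk((0, 30), partial))
```

In the complete-digest library the walk stops immediately because no clone overlaps the starting fragment, whereas the partial-digest library provides overlapping clones that carry the walk across the whole region.
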

Probing libraries

Once a library (cDNA or genomic) has been constructed we want to be able to identify clones which contain DNA of interest.

For example, from protein sequence information we can deduce possible stretches of the corresponding DNA sequence (there will however be ambiguity due to the degeneracy of codons).
If we can synthesize an oligonucleotide complementary to our DNA sequence of interest, we can use it to hybridize specifically to the appropriate clone in our library (i.e. to probe our library).
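
A minimal in-silico analogue of probing is to look for the probe's complementary sequence in each clone. The sketch below assumes perfect matches only, and the clone names and sequences are invented for illustration. Real hybridization tolerates some mismatches depending on stringency.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence of the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_hits(probe: str, clones: dict) -> list:
    """Names of clones whose insert can base-pair perfectly with the probe.

    Each insert is stored as one strand. The probe can pair with that strand
    (its reverse complement occurs in the insert) or with the opposite strand
    (the probe sequence itself occurs in the insert).
    """
    target = reverse_complement(probe)
    return [name for name, insert in clones.items()
            if target in insert or probe in insert]


# Hypothetical library: two toy inserts
library = {
    "clone_a": "GGGATGTGGAAACCC",
    "clone_b": "TTTTTTTTTTTTTTT",
}
probe = reverse_complement("ATGTGGAAA")  # oligo complementary to the target
print(probe, probe_hits(probe, library))
```
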

In standard methodologies the oligonucleotide is phosphorylated at the 5' end using radiolabeled [γ-32P]ATP and T4 polynucleotide kinase.

The probe is then incubated with individual phage plaques which have been fixed onto nitrocellulose and their DNA denatured by treatment with base.
If the plaque contains DNA complementary to the probe sequence, the probe will hybridize.
If the nitrocellulose (containing many individual plaques) is exposed to x-ray film, only those plaques with hybridized probe will show up (as dark spots).

Note that it is important to keep track of the orientation of the nitrocellulose in relation to the x-ray film (usually radioactive ink is used to mark the nitrocellulose orientation).

False positives

If we are designing DNA probes from protein sequence information we will have possible ambiguity in our deduced DNA sequence used for the design of the probe.

Usually 14-24mer oligonucleotides are used as probes. At three bases per codon, a 14-24mer probe means we need a stretch of 5-8 amino acids in the polypeptide.
Given the choice, the best amino acid sequences to look for in a polypeptide are those with low codon degeneracy (see above).
Thus, we would look for a short stretch of polypeptide sequence, ideally containing Met or Trp (1 codon each), with the remaining amino acids being Phe, Tyr, His, Gln, Asn, Lys, Asp, Glu or Cys (2 codons each).
Regions including Leu, Arg or Ser are to be avoided (6 codons each).
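
The selection rule above can be expressed as a small scoring function: multiply the number of codons for each amino acid in a candidate stretch, then pick the stretch with the lowest product. The codon counts are those of the standard genetic code. The example peptide and the function names are hypothetical.

```python
import math

# Number of codons per amino acid in the standard genetic code
CODON_COUNTS = {
    "M": 1, "W": 1,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2, "K": 2, "D": 2, "E": 2, "C": 2,
    "I": 3,
    "V": 4, "P": 4, "T": 4, "A": 4, "G": 4,
    "L": 6, "R": 6, "S": 6,
}


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that could encode the peptide."""
    return math.prod(CODON_COUNTS[aa] for aa in peptide)


def best_window(protein: str, window: int):
    """Start index and sequence of the least degenerate stretch."""
    starts = range(len(protein) - window + 1)
    best = min(starts, key=lambda i: degeneracy(protein[i:i + window]))
    return best, protein[best:best + window]


protein = "LSRMWFYHKLA"  # made-up sequence for illustration
start, stretch = best_window(protein, 5)
print(start, stretch, degeneracy(stretch))
print("LRSLR", degeneracy("LRSLR"))
```
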

During oligonucleotide synthesis multiple bases will be incorporated at ambiguous positions.

Thus our probe will actually be a mixture of oligonucleotides.
The higher the degeneracy, the greater the possibility of "false positives", i.e. clones which hybridize but are unrelated to the actual sequence we want.
Positive clones are sequenced and the deduced amino acid sequence is compared to our polypeptide sequence information to identify correct clones.
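
To make the idea of a mixed probe concrete, the following sketch enumerates every oligonucleotide in the pool for a short peptide, using the standard genetic code. The peptides and function name are illustrative only. The sequences are written as the coding strand, and the oligonucleotide actually synthesized may be the complement, depending on which strand is to be detected.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to the list of codons that encode it
CODONS = {}
for bases, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS.setdefault(aa, []).append("".join(bases))


def probe_mixture(peptide: str) -> list:
    """All coding-strand sequences that could encode the peptide."""
    choices = [CODONS[aa] for aa in peptide]
    return ["".join(codons) for codons in product(*choices)]


print(probe_mixture("MWF"))
print(len(probe_mixture("MWFYH")), "oligos for MWFYH")
print(len(probe_mixture("LRS")), "oligos for LRS")
```

The pool size is the product of the per-residue codon counts, so a single Leu, Arg or Ser multiplies the number of oligonucleotides in the mixture by six.
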
