Lecture notes

Gene regulation is significantly more complex in eukaryotes than in prokaryotes for a number of reasons:

1)  Large Genome

First, the genome being regulated is significantly larger. The E. coli genome consists of a single, circular chromosome containing 4.6 Mb. This genome encodes approximately 4,300 proteins. In comparison, the genome within a human cell is distributed over 23 pairs of chromosomes ranging in size from about 50 to 250 Mb. Approximately 20,000 protein-coding genes are present within the 3000 Mb of human DNA. It would be very difficult for a DNA-binding protein to recognize a unique site in this vast array of DNA sequences. Consequently, more elaborate mechanisms are required to achieve specificity.
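The specificity problem can be made concrete with a rough calculation. The sketch below assumes, purely for illustration, a random genome with equal base frequencies, so that one fixed sequence of n base pairs is expected to turn up by chance about G / 4^n times in a genome of G base pairs. The function name and the 12-bp site length are assumptions and do not come from these notes.

```python
def expected_site_occurrences(genome_bp: float, site_length: int) -> float:
    """Expected chance matches to one fixed sequence of `site_length` bp
    in a random genome of `genome_bp` bp (single strand, equal base
    frequencies; a deliberate simplification)."""
    return genome_bp / 4 ** site_length


E_COLI_BP = 4.6e6  # genome size quoted in the text
HUMAN_BP = 3.0e9   # genome size quoted in the text
SITE_LENGTH = 12   # hypothetical binding-site length, chosen for illustration

for name, size in [("E. coli", E_COLI_BP), ("human", HUMAN_BP)]:
    print(name, round(expected_site_occurrences(size, SITE_LENGTH), 2))
```

Under these assumptions a 12-bp site is expected less than once by chance in the E. coli genome but well over a hundred times in the human genome, which is one way to see why a single short recognition sequence is not enough in eukaryotes.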

2) Complex Genome

Another source of complexity in eukaryotic gene regulation is the many different cell types present in most eukaryotes. Liver and pancreatic cells, for example, differ dramatically in the genes that are highly expressed.

3) Widely spread genes- No well-defined operons

Moreover, eukaryotic genes are not generally organized into operons. Instead, genes that encode proteins for steps within a given pathway are often spread widely across the genome.

4) Compact Genome

The DNA in eukaryotic cells is extensively folded and packed into the protein-DNA complex called chromatin. Histones are an important part of this complex: they form the structures known as nucleosomes and also contribute significantly to gene regulatory mechanisms.

5) Uncoupled transcription and Translation

Finally, transcription and translation are uncoupled in eukaryotes, eliminating some potential gene-regulatory mechanisms.

Mechanisms of regulation of gene expression in Eukaryotes

1) Chromatin Remodeling

Chromatin structure provides an important level of control of gene transcription. Large regions of chromatin are transcriptionally inactive while others are either active or potentially active. With few exceptions, each cell contains the same complement of genes (antibody-producing cells are a notable exception). The development of specialized organs, tissues, and cells and their function in the intact organism depend upon the differential expression of genes. Some of this differential expression is achieved by having different regions of chromatin available for transcription in cells from various tissues. For example, the DNA containing the β-globin gene cluster is in “active” chromatin in the reticulocyte but in “inactive” chromatin in muscle cells.

Formation and disruption of nucleosome structure

The presence of nucleosomes and of complexes of histones and DNA certainly provides a barrier against the ready association of transcription factors with specific DNA regions. The dynamics of the formation and disruption of nucleosome structure are therefore an important part of eukaryotic gene regulation and the processes involved are as follows-

i) Histone acetylation and deacetylation are important determinants of gene activity. Acetylation occurs on lysine residues in the amino-terminal tails of histone molecules (Figure-1). This modification reduces the positive charge of these tails and decreases the binding affinity of histones for the negatively charged DNA. Accordingly, the acetylation of histones can disrupt nucleosomal structure and allow readier access of transcription factors to their cognate regulatory DNA elements. Different proteins with specific acetylase and deacetylase activities are associated with various components of the transcription apparatus.


Histone acetylation

Figure-1- Acetylation of lysine residues in the amino-terminal tails of histones. The positive charge is removed after acetylation.

Thus, histone acetylation can activate transcription through a combination of three mechanisms: by reducing the affinity of the histones for DNA, by recruiting other components of the transcriptional machinery, and by initiating the active remodeling of the chromatin structure (Figure-2).

Histone acetylation and chromatin remodeling

Figure-2- Acetylation of histones leads to disruption of nucleosomal structure and access of transcription machinery for transcription of required genes

ii) Modification of DNA- The modification of DNA provides another mechanism, in addition to packaging with histones, for inhibiting inappropriate gene expression in specific cell types. Methylation of deoxycytidine residues (Figure-3) in DNA may effect gross changes in chromatin so as to preclude its active transcription. Acute demethylation of deoxycytidine residues in a specific region of the tyrosine aminotransferase gene, in response to glucocorticoid hormones, has been associated with an increased rate of transcription of the gene. However, it is not possible to generalize that methylated DNA is transcriptionally inactive, that all inactive chromatin is methylated, or that active DNA is not methylated.

DNA methylation

Figure-3- Methylation of deoxycytidine residues in DNA can preclude its active transcription.

iii) DNA binding proteins- The interactions between DNA-binding proteins such as CAP and RNA polymerase can activate transcription in prokaryotic cells. Such protein-protein interactions play a dominant role in eukaryotic gene regulation. In contrast with those of prokaryotic transcription, few eukaryotic transcription factors have any effect on transcription on their own. Instead, each factor recruits other proteins to build up large complexes that interact with the transcriptional machinery to activate or repress transcription.

A major advantage of this mode of regulation is that a given regulatory protein can have different effects, depending on what other proteins are present in the same cell. This phenomenon, called combinatorial control, is crucial to multicellular organisms that have many different cell types.

The binding of specific transcription factors to certain DNA elements may result in disruption of nucleosomal structure. Many eukaryotic genes have multiple protein-binding DNA elements. The serial binding of transcription factors to these elements may either directly disrupt the structure of the nucleosome or prevent its re-formation. These reactions result in chromatin-level structural changes that in the end increase DNA accessibility to other factors and the transcription machinery.

2) Enhancers and Repressors- Enhancer elements are DNA sequences that have no promoter activity of their own but greatly increase the activities of many promoters in eukaryotes. Enhancers function by serving as binding sites for specific regulatory proteins. An enhancer is effective only in the specific cell types in which appropriate regulatory proteins are expressed. In many cases, these DNA-binding proteins influence transcription initiation by perturbing the local chromatin structure to expose a gene or its regulatory sites rather than by direct interactions with RNA polymerase.

Enhancer elements can exert their positive influence on transcription even when separated by thousands of base pairs from a promoter; they work when oriented in either direction; and they can work upstream (5′) or downstream (3′) from the promoter. Enhancers are promiscuous; they can stimulate any promoter in the vicinity and may act on more than one promoter.

The elements that decrease or repress the expression of specific genes have also been identified. Silencers are control regions of DNA that, like enhancers, may be located thousands of base pairs away from the gene they control. However, when transcription factors bind to them, expression of the gene they control is repressed.

Tissue-specific gene expression is mediated by enhancers or enhancer-like elements. Many genes are now recognized to harbor enhancer or activator elements in various locations relative to their coding regions. In addition to being able to enhance gene transcription, some of these enhancer elements clearly possess the ability to do so in a tissue-specific manner. Thus, the enhancer element associated with the immunoglobulin genes between the J and C regions enhances the expression of those genes preferentially in lymphoid cells.

3) Locus control regions and Insulators- Some regions are controlled by complex DNA elements called locus control regions (LCRs). An LCR, with its associated bound proteins, controls the expression of a cluster of genes. The best-defined LCR regulates expression of the globin gene family over a large region of DNA.

Another mechanism is provided by insulators. These DNA elements, also in association with one or more proteins, prevent an enhancer from acting on a promoter located on the other side of the insulator.

4) Gene Amplification- One way to increase the amount of a gene product is to increase the number of gene copies available for transcription. Among the repetitive DNA sequences are hundreds of copies of ribosomal RNA genes and tRNA genes. These genes preexist repetitively in the genomic material of the gametes and thus are transmitted in high copy numbers from generation to generation.

During early development of metazoans, there is an abrupt increase in the need for specific molecules such as ribosomal RNA and the messenger RNA molecules for proteins that make up such structures as the eggshell. Such requirements are fulfilled by amplification of specific genes. These amplified genes (Figure-4), presumably generated by a process of repeated initiations during DNA synthesis, provide multiple sites for gene transcription.

Gene amplification

Figure-4- Gene amplification increases the copy number of a gene and hence the amount of gene product.

In some cases, a several thousand-fold increase in the copy number of specific genes can be achieved over a period of time involving increasing doses of selective drugs. It has been demonstrated in patients receiving methotrexate for cancer that malignant cells can develop drug resistance by increasing the number of genes for dihydrofolate reductase, the target of methotrexate.

5) Gene Rearrangement- Gene rearrangement is observed during immunoglobulin synthesis. Immunoglobulins are composed of two polypeptides, heavy (about 50 kDa) and light (about 25 kDa) chains. The mRNAs encoding these two protein subunits are encoded by gene sequences that are subjected to extensive DNA sequence-coding changes. These DNA coding changes are needed for generating the required recognition diversity central to appropriate immune function.

IgG heavy and light chain mRNAs are encoded by several different segments that are tandemly repeated in the germ line. Thus, for example, the IgG light chain is composed of variable (VL), joining (JL), and constant (CL) domains or segments. For particular subsets of IgG light chains, there are roughly 250-300 tandemly repeated VL gene coding segments, five tandemly arranged JL coding sequences, and roughly ten CL gene coding segments. All of these multiple, distinct coding regions are located in the same region of the same chromosome (Figure-5). By having multiple VL, JL, and CL segments to choose from, an immune cell has a greater repertoire of sequences to work with to develop both immunologic flexibility and specificity.

However, a given functional IgG light chain transcription unit contains only the coding sequences for a single protein. Thus, before a particular IgG light chain can be expressed, single VL, JL, and CL coding sequences must be recombined to generate a single, contiguous transcription unit excluding the multiple nonutilized segments (ie, the other approximately 300 unused VL segments, the other four unused JL segments, and the other nine unused CL segments). This deletion of unused genetic information is accomplished by selective DNA recombination that removes the unwanted coding DNA while retaining the required coding sequences: one VL, one JL, and one CL sequence. (VL sequences are subjected to additional point mutagenesis to generate even more variability—hence the name.) The newly recombined sequences thus form a single transcription unit that is competent for RNA polymerase II-mediated transcription.
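The size of the repertoire produced by choosing one VL, one JL, and one CL segment can be estimated by simple multiplication. The sketch below uses the approximate segment counts quoted above (about 300 VL, 5 JL, and 10 CL); the function name is hypothetical, and the calculation deliberately ignores the additional variability from point mutagenesis of VL sequences.

```python
def light_chain_combinations(n_v: int, n_j: int, n_c: int) -> int:
    """Number of distinct V-J-C combinations when one segment of each
    kind is selected independently."""
    return n_v * n_j * n_c


# Approximate segment counts taken from the text above.
print(light_chain_combinations(n_v=300, n_j=5, n_c=10))  # 15000
```

Even this simple count shows how a few hundred germ-line segments can yield many thousands of distinct light chain transcription units.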

Gene rearrangement

Figure-5- Immunoglobulin light chain mRNA formed by transcription of rearranged genes.

6) Alternative RNA Processing

Eukaryotic cells also employ alternative RNA processing to control gene expression. This can result when alternative promoters, intron-exon splice sites, or polyadenylation sites are used. Occasionally, heterogeneity within a cell results, but more commonly the same primary transcript is processed differently in different tissues.

Alternative polyadenylation sites in the immunoglobulin M (IgM) heavy chain (μ) primary transcript result in mRNAs that are either 2700 bases long (μm) or 2400 bases long (μs). This results in a different carboxyl-terminal region of the encoded proteins, such that the μm protein remains attached to the membrane of the B lymphocyte and the μs immunoglobulin is secreted.

Alternative splicing and processing result in the formation of seven unique α-tropomyosin mRNAs in seven different tissues (Figure-6).

alternative splicing


Figure-6- The presence or absence of an extra exon can alter the structure and hence the function of a protein.

7) Class switching- In this process one gene is switched off and a closely related gene takes over its function.

For example, during intrauterine life embryonic Hb is the first Hb to be formed. It consists of two zeta (ζ) and two epsilon (ε) chains. By about the end of the first trimester, embryonic Hb is replaced by HbF, which consists of two α and two γ chains (α2γ2). After birth, HbF is replaced by the adult types HbA1 (97%) and HbA2 (3%). Thus the genes for one class of Hb are switched off and those for another class are switched on.

Gene switching is also observed in the formation of immunoglobulins. IgM is formed during the primary immune response, while IgG is formed during the secondary immune response.

8) mRNA stability- Although most mRNAs in mammalian cells are very stable (half-lives measured in hours), some turn over very rapidly (half-lives of 10–30 minutes). In certain instances, mRNA stability is subject to regulation. This has important implications since there is usually a direct relationship between mRNA amount and the translation of that mRNA into its cognate protein. Changes in the stability of a specific mRNA can therefore have major effects on biologic processes.

The stability of mRNA can be influenced by hormones and certain other effectors.

The ends of mRNA molecules are involved in mRNA stability. The 5′ cap structure in eukaryotic mRNA prevents attack by 5′ exonucleases, and the poly(A) tail prohibits the action of 3′ exonucleases.
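The effect of half-life on mRNA abundance can be illustrated with a simple decay calculation. The sketch below assumes first-order (exponential) decay once synthesis has stopped, which is a common simplification rather than something stated in these notes; the function name and the two example half-lives (20 minutes and 6 hours) are illustrative choices within the ranges mentioned above.

```python
def fraction_remaining(elapsed_min: float, half_life_min: float) -> float:
    """Fraction of an mRNA pool left after `elapsed_min` minutes,
    assuming first-order decay and no new synthesis."""
    return 0.5 ** (elapsed_min / half_life_min)


# One hour after transcription stops:
unstable = fraction_remaining(60, half_life_min=20)    # rapidly turning-over mRNA
stable = fraction_remaining(60, half_life_min=360)     # mRNA with a 6-hour half-life
print(f"unstable: {unstable:.3f}, stable: {stable:.3f}")
```

After one hour only one eighth of the unstable message is left, whereas most of the stable message is still available for translation, so a change in half-life alone can shift protein output substantially.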

9) Specific motifs of regulatory proteins- Certain DNA-binding proteins with specific motifs bind particular regions of DNA to influence the rate of transcription. The specificity involved in the control of transcription requires that regulatory proteins bind with high affinity to the correct region of DNA. Three unique motifs (the helix-turn-helix, the zinc finger, and the leucine zipper) account for many of these specific protein-DNA interactions. The motifs found in these proteins are unique; their presence in a protein of unknown function suggests that the protein may bind to DNA. The protein-DNA interactions are maintained by hydrogen bonds and van der Waals forces.


Prokaryotes must use substances and synthesize macromolecules just fast enough to meet their needs. The genes for metabolizing enzymes are expressed only in the presence of nutrients.  If the enzymes are not needed, genes are turned off. This allows for conservation of cell resources. Controlling gene expression is one method of regulating metabolism.

Bacteria such as E. coli usually rely on glucose as their source of carbon and energy. However, when glucose is scarce, E. coli can use lactose as its carbon source even though this disaccharide does not lie on any major metabolic pathway. An essential enzyme in the metabolism of lactose is β-galactosidase, which hydrolyzes lactose into galactose and glucose (Figure-1).

Beta galactosidase action

Figure-1- Action of Beta galactosidase on lactose

An E. coli cell growing on a carbon source such as glucose or glycerol contains fewer than 10 molecules of β-galactosidase. In contrast, the same cell contains several thousand molecules of the enzyme when grown on lactose. The presence of lactose in the culture medium induces a large increase in the amount of β-galactosidase by eliciting the synthesis of new enzyme molecules rather than by activating a preexisting but inactive precursor. A crucial clue to the mechanism of gene regulation is that two other proteins are synthesized in concert with β-galactosidase, namely galactoside permease and thiogalactoside transacetylase. The permease is required for the transport of lactose across the bacterial cell membrane. The transacetylase is not essential for lactose metabolism but appears to play a role in the detoxification of compounds that also may be transported by the permease. Thus, the expression levels of a set of enzymes that all contribute to the adaptation to a given change in the environment change together. Such a coordinated unit of gene expression is called an operon.


The parallel regulation of β-galactosidase, the permease, and the transacetylase suggested that the expression of genes encoding these enzymes is controlled by a common mechanism. François Jacob and Jacques Monod proposed the operon model (Figure-2) to account for this parallel regulation as well as the results of other genetic experiments. The genetic elements of the model are a regulator gene, a regulatory DNA sequence called an operator site, and a set of structural genes.

The regulator gene encodes a repressor protein that binds to the operator site. The binding of the repressor to the operator prevents transcription of the structural genes. The operator and its associated structural genes constitute the operon. For the lactose (lac) operon, the i gene encodes the repressor, o is the operator site, and the z, y, and a genes are the structural genes for β-galactosidase, the permease, and the transacetylase, respectively (Figure-2). The operon also contains a promoter site (denoted by p), which directs the RNA polymerase to the correct transcription initiation site. The z, y, and a genes are transcribed to give a single mRNA molecule that encodes all three proteins. An mRNA molecule encoding more than one protein is known as a polygenic or polycistronic transcript.

lac operon

Figure-2- (A) The general structure of an operon as conceived by Jacob and Monod. (B) The structure of the lactose operon. In addition to the promoter (p) in the Operon, a second promoter is present in front of the regulator gene (i) to drive the synthesis of the regulator.

Regulation of lac operon

(a) Negative Control- Repression

How does the lac repressor inhibit the expression of the lac operon?

The lac repressor can exist as a dimer of 37-kd subunits, and two dimers often come together to form a tetramer. In the absence of lactose, the repressor binds very tightly and rapidly to the operator. When the lac repressor is bound to DNA, it prevents bound RNA polymerase from locally unwinding the DNA to expose the bases that will act as the template for the synthesis of the RNA strand (Figure-3). Thus, very little β-galactosidase, permease, or transacetylase is produced.

Lac operon  Off

Figure-3- In the absence of lactose, the lac operon is off.

(b) Double negative control-  Derepression

How does the presence of lactose trigger expression from the lac operon?

Interestingly, lactose itself does not have this effect; rather, allolactose, a combination of galactose and glucose with a β-1,6 rather than a β-1,4 linkage, does. Allolactose is thus referred to as the inducer of the lac operon. Allolactose is a side product of the β-galactosidase reaction produced at low levels by the few molecules of β-galactosidase that are present before induction.

Structure of Allo lactose


Structure of IPTG


Figure-4- Structures of (a) allolactose and (b) isopropylthiogalactoside (IPTG)

A lactose analog that is capable of inducing the lac operon while not itself serving as a substrate for β-galactosidase is an example of a gratuitous inducer. An example is isopropylthiogalactoside (IPTG) -Figure-4-(b).

IPTG is useful in the laboratory as a tool for inducing gene expression. The addition of lactose or of a gratuitous inducer such as IPTG to bacteria growing on a poorly utilized carbon source (such as succinate) results in prompt induction of the lac operon enzymes.

When the lac repressor is bound to the inducer, the repressor’s affinity for operator DNA is greatly reduced. Inducer binding leads to local conformational changes so that the repressor's DNA-binding domains cannot easily contact DNA simultaneously, leading to a dramatic reduction in DNA-binding affinity and the release of DNA by the lac repressor. With the operator site unoccupied, RNA polymerase can then transcribe the other lac genes and the bacterium produces the proteins necessary for the efficient utilization of lactose (Figure-5).

Lac operon On

Figure-5- In the presence of lactose (the actual inducer is allolactose), the lac repressor changes conformation. The inactive form is unable to bind to the operator, so the genes are turned on.

In such a manner, an inducer derepresses the lac operon and allows transcription of the structural genes for β-galactosidase, galactoside permease, and thiogalactoside transacetylase.

Repressible and inducible systems are both examples of negative control of a pathway: an active repressor protein shuts off the pathway. Positive control, in contrast, requires that an activator molecule switch on transcription.

(c) Positive control

There are also DNA-binding proteins that stimulate transcription. One particularly well-studied example is the catabolite activator protein (CAP), which is also known as the cAMP receptor protein (CRP). When bound to cAMP, CAP, which is also a sequence-specific DNA-binding protein, stimulates the transcription of lactose-catabolizing genes. Within the lac operon, CAP binds to an inverted repeat that is centered near position -61 relative to the start site for transcription (Figure-6).

The CAP-cAMP complex stimulates the initiation of transcription by approximately a factor of 50. A major factor in this stimulation is the recruitment of RNA polymerase to promoters to which CAP is bound. Studies have been undertaken to localize the surfaces on CAP and on the α subunit of RNA polymerase that participate in these interactions.

These energetically favorable protein-protein contacts increase the likelihood that transcription will be initiated at sites to which the CAP-cAMP complex is bound. Thus, in regard to the lac operon, gene expression is maximal when the binding of allolactose relieves the inhibition by the lac repressor, and the CAP-cAMP complex stimulates the binding of RNA polymerase. The E. coli genome contains many CAP-binding sites in positions appropriate for interactions with RNA polymerase.

Thus, an increase in the cAMP level inside an E. coli bacterium results in the formation of CAP-cAMP complexes that bind to many promoters and stimulate the transcription of genes encoding a variety of catabolic enzymes.

When grown on glucose, E. coli have a very low level of catabolic enzymes such as β-galactosidase. Clearly, it would be wasteful to synthesize these enzymes when glucose is abundant. The inhibitory effect of glucose, called catabolite repression, is due to the ability of glucose to lower the intracellular concentration of cyclic AMP. By an independent mechanism, the bacterium accumulates cAMP only when it is starved for a source of carbon. In the presence of glucose, or of glycerol in concentrations sufficient for growth, the bacteria will lack sufficient cAMP to bind to CAP because the glucose inhibits adenylyl cyclase, the enzyme that converts ATP to cAMP. Thus, in the presence of glucose or glycerol, cAMP-saturated CAP is lacking, so that the DNA-dependent RNA polymerase cannot initiate transcription of the lac operon efficiently (Figure-7).

Thus, the CAP-cAMP regulator is acting as a positive regulator because its presence is required for gene expression. The lac operon is therefore controlled by two distinct DNA binding factors; one that acts positively (cAMP-CRP complex) and one that acts negatively (LacI repressor). Maximal activity of the lac operon occurs when glucose levels are low (high cAMP with CAP activation) and lactose is present (LacI is prevented from binding to the operator) (Figure-7).

CAP-cAMP complex

Figure-6- The CAP binding site on DNA is adjacent to the position at which RNA polymerase binds.

Role of CAP-cAMP complex

Figure-7- Catabolite repression and the role of CRP-c AMP complex

Constitutive Expression and continuous repression

When the lacI gene has been mutated so that its product, LacI, is not capable of binding to operator DNA, the organism will exhibit constitutive expression of the lac operon. Conversely, an organism with a lacI mutation that produces a LacI protein unable to bind the inducer will remain repressed even in the presence of the inducer molecule, because the inducer cannot bind to the repressor at the operator locus to derepress the operon. Similarly, bacteria harboring mutations in their lac operator locus such that the operator sequence will not bind a normal repressor molecule constitutively express the lac operon genes.

Thus the repression/derepression and induction of lac operon can be summarized as follows-

1) In the absence of lactose- The lac operon remains repressed because the lac repressor is bound at the operator site (negative control).

2) In the presence of lactose only- The lac operon is derepressed, the structural genes are transcribed, and the lactose-metabolizing enzymes are synthesized (double negative control).

3) In the presence of both glucose and lactose- The operator site is vacant because allolactose is bound to the lac repressor (double negative control), but the CAP-cAMP complex is not formed, so RNA polymerase cannot initiate transcription of the structural genes efficiently (absence of positive regulation) and the lac operon is expressed only at a very low level. Thus glucose is consumed first; as it is exhausted, the cAMP level rises, the CAP-cAMP complex is formed, the lac operon is expressed, and the lactose-metabolizing enzymes are synthesized.

4) In the presence of glucose only- The cAMP level is low and the CAP-cAMP complex is not formed, so there is no positive control. In addition, the lac repressor is bound to the operator site, so RNA polymerase cannot transcribe the genes (negative control). Thus the lac operon remains in the repressed state.
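The summary above amounts to a small truth table with two inputs, glucose and lactose. The sketch below is a deliberately simplified Boolean model: it treats the repressor and the CAP-cAMP complex as simple on/off switches and ignores basal, low-level transcription. The function name and dictionary keys are hypothetical.

```python
from itertools import product


def lac_operon(glucose: bool, lactose: bool) -> dict:
    """Boolean sketch of lac operon control.

    - The operator is free only when lactose (via allolactose)
      inactivates the repressor.
    - The CAP-cAMP complex forms only when glucose is absent,
      because glucose keeps cAMP low.
    - Efficient transcription needs both conditions.
    """
    operator_free = lactose
    cap_camp_bound = not glucose
    return {
        "operator_free": operator_free,
        "cap_camp_bound": cap_camp_bound,
        "efficient_transcription": operator_free and cap_camp_bound,
    }


# Print the state for all four combinations of glucose and lactose.
for glucose, lactose in product([False, True], repeat=2):
    state = lac_operon(glucose, lactose)
    print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> {state}")
```

Only the combination of lactose present and glucose absent satisfies both the negative and the positive control, which matches the statement that maximal activity requires low glucose and available lactose.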



Gene expression is the combined process of the transcription of a gene into mRNA, the processing of that mRNA, and its translation into protein (for protein-encoding genes).

Significance of gene Expression

Regulated expression of genes is required for adaptation, differentiation, and development.

1) Adaptation

Organisms adapt to environmental changes by altering gene expression.

a) Bacteria are highly versatile and responsive organisms: the rate of synthesis of some proteins in bacteria may vary more than 1000-fold in response to the supply of nutrients or to environmental challenges.

b) Cells of multicellular organisms also respond to varying conditions. Such cells exposed to hormones and growth factors change substantially in shape, growth rate, and other characteristics.

2) Tissue specific differentiation and development

The genetic information present in each somatic cell of a metazoan organism is practically identical. For example, cells from muscle and nerve tissue show strikingly different morphologies and other properties, yet they contain exactly the same DNA. These diverse properties are the result of differences in gene expression.

The exceptions in the genetic information are found in those few cells that have amplified or rearranged genes in order to perform specialized cellular functions.

A comparison of the gene-expression patterns of cells from the pancreas, which secretes digestive enzymes, and the liver, the site of lipid transport and energy transduction, reveals marked differences in the genes that are highly expressed, a difference consistent with the physiological roles of these tissues.

Expression of the genetic information must be regulated during ontogeny and differentiation of the organism and its cellular components.

Mammalian cells possess about 1000 times more genetic information than does the bacterium Escherichia coli. Much of this additional genetic information is probably involved in regulation of gene expression during the differentiation of tissues and biologic processes in the multicellular organism and in ensuring that the organism can respond to complex environmental challenges.

How is gene expression controlled?

Gene activity is controlled first and foremost at the level of transcription. Much of this control is achieved through the interplay between proteins that bind to specific DNA sequences and their DNA binding sites. This can have a positive or negative effect on transcription. Transcription control can result in tissue-specific gene expression.

In addition to transcription level controls, gene expression can also be modulated by gene amplification, gene rearrangement, post transcriptional modifications, and RNA stabilization.

Types of gene regulation

There are only two types of gene regulation: positive regulation and negative regulation.

A) Positive regulation
When the expression of genetic information is quantitatively increased by the presence of a specific regulatory element, regulation is said to be positive. The element or molecule mediating positive regulation is a positive regulator or activator.

B) Negative regulation
When the expression of genetic information is diminished by the presence of a specific regulatory element, regulation is said to be negative. The element or molecule mediating negative regulation is said to be a negative regulator or repressor.

A double negative has the effect of acting as a positive. Thus, an effector that inhibits the function of a negative regulator will bring about a positive regulation. Many regulated systems that appear to be induced are in fact derepressed at the molecular level.

Types of responses

The extent or amount of gene expression in response to an inducing signal is observed in three types of temporal responses-

1) Type A response

Type A response is characterized by an increased extent of gene expression that is dependent upon the continued presence of the inducing signal. When the inducing signal is removed, the amount of gene expression diminishes to its basal level, but the amount repeatedly increases in response to the reappearance of the specific signal.


This type of response is commonly observed in prokaryotes in response to sudden changes of the intracellular concentration of a nutrient. It is also observed in many higher organisms after exposure to inducers such as hormones, nutrients, or growth factors (Figure-1).

2) Type B response

Type B response exhibits an increased amount of gene expression that is transient even in the continued presence of the regulatory signal. After the regulatory signal has terminated and the cell has been allowed to recover, a second transient response to a subsequent regulatory signal may be observed.


This phenomenon characterizes the action of many pharmacologic agents, but it is also a feature of many naturally occurring processes.

This type of response commonly occurs during development of an organism, when only the transient appearance of a specific gene product is required although the signal persists.
Type A response

 Figure-1- Showing type A response. The response is observed only in the presence of a signal.

Type B response

 Figure-2- showing type B response. The signal persists but the response is transient.

3) Type C response

The type C response pattern exhibits, in response to the regulatory signal, an increased extent of gene expression that persists indefinitely even after termination of the signal. The signal acts as a trigger in this pattern. Once expression of the gene is initiated in the cell, it cannot be terminated even in the daughter cells; it is therefore an irreversible and inherited alteration.


This type of response typically occurs during the development of differentiated function in a tissue or organ.
Type C response

 Figure-3- showing type C response. The response is signal independent. Response persists even in the absence of a signal.
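The three temporal patterns can be compared with a small simulation. This is only a sketch: the function names, the 0/1 encoding of signal and expression, and the `duration` parameter are assumptions made for illustration and are not part of the notes.

```python
def type_a(signal):
    """Type A: expression is raised only while the signal is present."""
    return [1 if s else 0 for s in signal]


def type_b(signal, duration=2):
    """Type B: expression is transient even if the signal persists.

    `duration` (an assumed parameter) is how many time steps the
    response lasts after each onset of the signal.
    """
    response, elapsed = [], 0
    for s in signal:
        # Count steps since the signal appeared; reset once it is removed.
        elapsed = elapsed + 1 if s else 0
        response.append(1 if 0 < elapsed <= duration else 0)
    return response


def type_c(signal):
    """Type C: the signal acts as a trigger; expression then persists."""
    response, switched_on = [], False
    for s in signal:
        switched_on = switched_on or bool(s)
        response.append(1 if switched_on else 0)
    return response


# Hypothetical signal: present for four steps, absent, then present again.
signal = [0, 1, 1, 1, 1, 0, 0, 1, 1, 0]
print("signal:", signal)
print("type A:", type_a(signal))
print("type B:", type_b(signal))
print("type C:", type_c(signal))
```

With this input, type A tracks the signal, type B responds only briefly after each onset, and type C stays on after the first exposure.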


The pathway of protein synthesis is called translation because the language of the nucleotide sequence on mRNA is translated into the language of an amino acid sequence. Translation requires a genetic code, through which the information contained in a nucleic acid sequence is expressed to produce a specific sequence of amino acids.

  • The letters A, G, T and C correspond to the nucleotides found in DNA (in mRNA, U takes the place of T). They are organized into codons.
  • The collection of codons is called the genetic code.
  • For 20 amino acids there must be at least 20 codons.
  • Each codon must have 3 nucleotides so that every amino acid can be specified by a codon of its own:
  • 1 Nucleotide-   4 combinations
  • 2 Nucleotides- 16 combinations
  • 3 Nucleotides- 64 combinations (the shortest length that can cover 20 amino acids).
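The counting argument in the list above can be checked directly. The sketch below enumerates every possible "word" of length 1, 2 and 3 over the four RNA bases; the function name `count_words` is an assumption made for illustration.

```python
from itertools import product

BASES = "ACGU"  # the four bases as they appear in mRNA


def count_words(length):
    """Return how many distinct base sequences of the given length exist."""
    return len(list(product(BASES, repeat=length)))


for length in (1, 2, 3):
    n = count_words(length)
    verdict = "enough" if n >= 20 else "too few"
    print(f"{length} nucleotide(s): {n} combinations ({verdict} for 20 amino acids)")
```

Only the triplet gives at least 20 combinations, which is why a codon needs three nucleotides.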

Genetic code

  • Genetic code is a dictionary that gives the correspondence between a sequence of nucleotides and a sequence of amino acids.
  • Words in the dictionary are in the form of codons
  • Each codon is a triplet of nucleotides
  • There are 64 codons in total, and three of these are nonsense codons (Figure-1)
  • The remaining 61 codons code for the 20 amino acids.

 Genetic code

Figure-1- Genetic code is a dictionary that gives the correspondence between a sequence of nucleotides and a sequence of amino acids.
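The dictionary can be written out in a few lines. This sketch builds the standard genetic code from the usual compact notation (bases in the order U, C, A, G; one-letter amino acid symbols; `*` for a stop codon). The variable names are assumptions made for illustration.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols for the 64 codons UUU, UUC, UUA, UUG, UCU, ...
# in U, C, A, G order at each position; "*" marks a stop (nonsense) codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in GENETIC_CODE.items() if aa == "*")
sense_codons = [c for c, aa in GENETIC_CODE.items() if aa != "*"]
amino_acids = set(GENETIC_CODE.values()) - {"*"}

print("total codons:", len(GENETIC_CODE))
print("stop codons:", stop_codons)
print("sense codons:", len(sense_codons))
print("amino acids encoded:", len(amino_acids))
print("UUU ->", GENETIC_CODE["UUU"], "(phenylalanine)")
```

The counts agree with the list above: 64 codons, 3 of them stop codons, and 61 codons for 20 amino acids.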

Genetic Code-Characteristics (Table-1)

1) Specificity

  • Genetic code is specific (unambiguous)
  • A specific codon always codes for the same amino acid.
  • e.g. UUU codes for phenylalanine; it cannot code for any other amino acid.

2) Universal

  • The genetic code is the same in almost all living organisms.
  • The exception to universality is found in mitochondrial codons:
  • In mitochondria AUA codes for methionine and UGA for tryptophan, instead of isoleucine and a termination codon, respectively, as in the cytoplasmic protein-synthesizing machinery.
  • AGA and AGG code for arginine in the cytoplasm, but in mitochondria they are termination codons.
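The mitochondrial exceptions listed above can be expressed as a small set of overrides on the standard table. This is a sketch for the four codons named in these notes (they correspond to human and other vertebrate mitochondria); the names `MITO_CODE` and `changed` are assumptions made for illustration.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Start from the cytoplasmic (standard) code and override the exceptions.
MITO_CODE = dict(GENETIC_CODE)
MITO_CODE.update({
    "AUA": "M",  # methionine instead of isoleucine
    "UGA": "W",  # tryptophan instead of stop
    "AGA": "*",  # stop instead of arginine
    "AGG": "*",  # stop instead of arginine
})

# Map each reassigned codon to (cytoplasmic meaning, mitochondrial meaning).
changed = {
    codon: (GENETIC_CODE[codon], MITO_CODE[codon])
    for codon in GENETIC_CODE
    if GENETIC_CODE[codon] != MITO_CODE[codon]
}
for codon, (cyto, mito) in sorted(changed.items()):
    print(f"{codon}: cytoplasm {cyto} -> mitochondria {mito}")
```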

3) Redundant

  • Genetic code is redundant, also called degenerate.
  • Each codon corresponds to a single amino acid, but a single amino acid can have multiple codons.
  • Except for tryptophan and methionine, each amino acid has multiple codons.
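Degeneracy can be seen by listing the codons for each amino acid. The sketch below rebuilds the standard table and uses a helper named `codons_for` (an assumed name) to look up all codons for a given one-letter amino acid symbol.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid):
    """Return every codon that specifies the given amino acid."""
    return [c for c, aa in GENETIC_CODE.items() if aa == amino_acid]


# Amino acids that are specified by exactly one codon.
single_codon = sorted(
    aa for aa in set(AMINO_ACIDS) - {"*"} if len(codons_for(aa)) == 1
)

print("single-codon amino acids:", single_codon)  # M = Met, W = Trp
print("leucine (L):", codons_for("L"))
print("proline (P):", codons_for("P"))
```

Only methionine (M) and tryptophan (W) have a single codon; leucine, for example, has six.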

4) Non Overlapping and Non Punctuated

  • All codons are independent sets of 3 bases.
  • There is no overlapping.
  • The code is read from a fixed starting point as a continuous sequence of bases, taken three at a time.
  • The starting point is extremely important because it sets the reading frame.

5) Non Sense Codons

  • There are 3 codons out of 64 in the genetic code which do not encode any amino acid.
  • These are called termination codons, stop codons or nonsense codons.
  • The stop codons are UAA, UAG, and UGA.
  • They encode no amino acid.
  • When the ribosome reaches one of them, it pauses and falls off the mRNA.

6) Initiator codon

  • AUG is the initiator codon in the majority of proteins
  • In a few cases GUG may be the initiator codon
  • Methionine is specified by just one codon, AUG (tryptophan, with UGG, is the only other amino acid with a single codon).
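The characteristics above (a fixed starting point, non-overlapping triplets, no punctuation, and termination at a stop codon) can be put together in a short translation routine. This is a minimal sketch: the function name `translate` and the example mRNA are hypothetical, and the routine simply starts at the first AUG it finds.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA (5'->3') from the first AUG to the first stop codon."""
    start = mrna.find("AUG")  # the initiator codon fixes the reading frame
    if start == -1:
        return ""
    protein = []
    # Read non-overlapping triplets with no gaps between them.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = GENETIC_CODE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: the chain is terminated
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical message: GG | AUG UUU CCA UGG | UAA | CC
mrna = "GGAUGUUUCCAUGGUAACC"
print(translate(mrna))
```

Here the two bases before AUG and the two after UAA are not translated, and the peptide is Met-Phe-Pro-Trp.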

 Table-1- Characteristics of Genetic code (Summary)

S.No. | Feature | Details
1. | Specific/ Unambiguous | Given a specific codon, only a single amino acid is indicated.
2. | Universal | In all living organisms the genetic code is the same (except mitochondrial codons)
3. | Redundant/ Degenerate | Multiple codons can code for the same amino acid
4. | Non Overlapping | The reading of the genetic code during the process of protein synthesis does not involve any overlap of codons
5. | Non Punctuated | Once the reading is commenced at a specific codon, there is no punctuation between codons, and the message is read in a continuing sequence of nucleotide triplets until a translation stop codon is reached.


Wobbling phenomenon

  • The rules of base pairing are relaxed at the third position of the codon, so that a base can pair with more than one complementary base (Table-2)
  • Some tRNA anticodons have inosine at the first (5′) position, which pairs with the third base of the codon.
  • Inosine can pair with U, C, or A. This means that we don’t need 61 different tRNA molecules; only about half as many are required (Table-2)
  • The first two bases of the codon in mRNA (5′-3′) pair traditionally with the 3rd and 2nd bases of the anticodon in tRNA (5′-3′)
  • Nontraditional base pairing is observed between the third base of the codon and the 1st base of the anticodon.
  • The reduced specificity between the third base of the codon and the complementary nucleotide in the anticodon is responsible for wobbling (Table-2)
  • Proline has 4 codons (5′-3′: CCU, CCC, CCA, CCG)
  • The first three codons can be recognized by a single tRNA having inosine at the first position of its anticodon (5′-IGG-3′)

 Table-2- Showing traditional and nontraditional base pairing between codon and anticodon. 

tRNA anticodon (first base) | mRNA codon (third base) | Base pairing
C | G | Traditional
A | U | Traditional
U | A | Traditional
U | G | Nontraditional
G | C | Traditional
G | U | Nontraditional
I | U | Nontraditional
I | C | Nontraditional
I | A | Nontraditional
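Table-2 can be turned into a lookup that lists the codons a given anticodon is able to read. This is a sketch only: the function name `codons_read_by` is an assumption, and the pairing rules are exactly those in Table-2.

```python
# Standard (traditional) pairing for the first two codon positions.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Table-2: first anticodon base -> codon third bases it can pair with.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}


def codons_read_by(anticodon):
    """Return the codons (5'->3') read by an anticodon written 5'->3'.

    The strands are antiparallel: the 3rd anticodon base pairs with the
    1st codon base, the 2nd with the 2nd, and the 1st anticodon base
    (the wobble position) pairs with the 3rd codon base.
    """
    first, second, third = anticodon
    return [PAIR[third] + PAIR[second] + w for w in WOBBLE[first]]


print("IGG reads:", codons_read_by("IGG"))  # three of the four proline codons
print("CAU reads:", codons_read_by("CAU"))  # the methionine codon
```

The anticodon IGG reads CCU, CCC and CCA, so a second tRNA is still needed for the fourth proline codon, CCG.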

Clinical Significance- Mutations can be well explained using the genetic code.

A) Point Mutations 

1) Silent- Single nucleotide change, A to G (Figure-2). The same amino acid is incorporated, so the mutation goes unnoticed.

Point mutations

Figure-2- Silent mutation: the same amino acid is incorporated due to degeneracy of the genetic code

2) Missense- Single nucleotide change, A to C. A different amino acid is incorporated (Figure-3), which may result in loss of functional capacity of the protein.

Missense mutations

Figure-3- Single nucleotide change leads to incorporation of different amino acid with the resultant synthesis of faulty protein.

3) Nonsense- Single nucleotide change from C to T generates a stop codon (represented in mRNA by UAG) (Figure-4). The chain is terminated prematurely, which may be incompatible with life.


Non sense mutations

Figure-4- Single nucleotide change (C to T) leads to generation of a stop codon (UAG), causing premature termination of the growing peptide chain.

B) Frame shift mutations- Insertion or removal of a base or bases can alter the reading frame, with the resultant incorporation of different amino acids (Figure-5)

Frame shift mutations

Figure-5- Insertion or removal of nucleotides can alter the reading frame  with the resultant incorporation of different amino acids.
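The mutation types described above can be told apart by comparing the amino acid encoded before and after the change. The sketch below uses hypothetical codons chosen to match the base changes named in the text (A to G, A to C, and C to T, which appears as C to U in mRNA); the function names are assumptions made for illustration.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_point_mutation(codon, mutant_codon):
    """Classify a single-base change within one sense codon."""
    before, after = GENETIC_CODE[codon], GENETIC_CODE[mutant_codon]
    if after == before:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"


def read_frame(mrna):
    """Translate every complete triplet from the first base onward."""
    return "".join(GENETIC_CODE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3))


print(classify_point_mutation("GAA", "GAG"))  # A to G: Glu stays Glu
print(classify_point_mutation("GAA", "GAC"))  # A to C: Glu becomes Asp
print(classify_point_mutation("CAG", "UAG"))  # C to U: Gln becomes stop

# Frame shift: inserting one base changes every codon downstream of it.
normal = "AUGGAACAGUUU"
shifted = normal[:3] + "C" + normal[3:]
print(read_frame(normal), "->", read_frame(shifted))
```

In the frame shift example the peptide changes from MEQF to MRTV: every amino acid after the insertion is different, even though only one base was added.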



A primary transcript is a linear copy of a transcriptional unit, the segment of DNA between specific initiation and termination sequences. The primary transcripts of both prokaryotic and eukaryotic tRNAs and rRNAs are post-transcriptionally modified by removing extra nucleotides. tRNAs are then further modified to acquire the special features unique to tRNAs. In fact, many of the tRNA transcription units contain more than one molecule. Thus, in prokaryotes the processing of these rRNA and tRNA precursor molecules is required for the generation of the mature functional molecules.

In prokaryotic organisms, the primary transcripts of mRNA-encoding genes begin to serve as translation templates even before their transcription has been completed. This is because the site of transcription is not compartmentalized into a nucleus as it is in eukaryotic organisms. Thus, transcription and translation are coupled in prokaryotic cells. Consequently, prokaryotic mRNAs are subjected to little processing prior to carrying out their intended function in protein synthesis.

Nearly all eukaryotic RNA primary transcripts undergo extensive processing between the time they are synthesized and the time at which they serve their ultimate function, whether it be as mRNA or as a component of the translation machinery such as rRNA, 5S RNA, or tRNA or RNA processing machinery, snRNAs. Processing occurs primarily within the nucleus and includes nucleolytic cleavage to smaller molecules and coupled nucleolytic and ligation reactions (splicing of exons). However, the processes of transcription, RNA processing, and even RNA transport from the nucleus are highly coordinated.

Some of the processes involved in the post transcriptional modifications of primary transcript  of major RNAs are as follows-

A) Ribosomal RNA

In mammalian cells, three of the rRNA molecules (28S, 18S and 5.8S) are transcribed as part of a single large precursor molecule called pre-ribosomal RNA.

The precursor is subsequently processed in the nucleolus to provide the RNA component for the ribosome subunits found in the cytoplasm.

The 23S, 16S, and 5S ribosomal RNAs of prokaryotes are produced from a single RNA precursor molecule (Figure-1), as are the 28S, 18S and 5.8S rRNAs of eukaryotes. Eukaryotic 5S rRNA is synthesized by RNA polymerase III and modified separately.

 Modifications of r RNA

Figure-1- Showing the formation of mature r RNA from pre ribosomal RNA. Cleavage and trimming are the mechanisms involved, similar modifications are observed in the processing  of eukaryotic r RNA.

The pre ribosomal RNAs are cleaved by ribonucleases to yield intermediate-sized pieces of r RNAs, which are further trimmed to produce the required r RNA species.

The proteins destined to become components of ribosome associate with the r RNA precursor prior to and during the post transcriptional modifications.

The rRNA genes are located in the nucleoli of mammalian cells. Hundreds of copies of these genes are present in every cell. This large number of genes is required to synthesize sufficient copies of each type of rRNA to form the 10^7 ribosomes required for each cell replication. Whereas a single mRNA molecule may be copied into 10^5 protein molecules, providing a large amplification, the rRNAs are end products. This lack of amplification requires both a large number of genes and a high transcription rate, typically synchronized with cell growth rate.

B) Transfer RNA

The tRNA molecules serve as adapter molecules for the translation of mRNA into protein sequences. Both eukaryotic and prokaryotic transfer RNAs are made from longer precursor molecules that must be modified. The basic mechanisms involved are as follows-

1) Splicing- An intron must be removed from the anticodon loop (Figure-2)

2) Trimming- The sequences at both the 5′ and 3′ ends of the molecule are trimmed (Figure-2).

3) Base modifications

The tRNAs contain many modifications of the standard bases A, U, G, and C, including methylation, reduction, deamination, and rearranged glycosidic bonds. Further modifications of the tRNA molecules include nucleotide alkylations.

4) CCA attachment (Figure-2)

The attachment of the characteristic CpCpAOH terminal at the 3′ end of the molecule by the enzyme nucleotidyl transferase is the most important modification.

The 3′ OH of the A ribose is the point of attachment for the specific amino acid that is to enter into the polymerization reaction of protein synthesis.

The methylation of mammalian tRNA precursors probably occurs in the nucleus, whereas the cleavage and attachment of CpCpAOH are cytoplasmic functions.

Enzymes within the cytoplasm of mammalian cells are required for the attachment of amino acids to the CpCpAOH residues.

 Modifications of t RNA

Figure-2-The extra nucleotides at both 5′ and 3′ ends of t RNA are removed, an intron from the anticodon arm is removed, bases are modified (not shown here) and CCA arm is attached to form the mature functional t RNA.

C) Eukaryotic m RNA

The RNA molecule synthesized by RNA polymerase II (the Primary transcript) contains the sequences that are found in cytosolic m RNA. The collection of all the precursor molecules for m RNA is known as heterogeneous nuclear RNA(hn RNA). The primary transcripts are extensively modified in the nucleus after transcription. These modifications include-

a) 5′ Capping

Mammalian mRNA molecules contain a 7-methylguanosine cap structure at their 5′ terminal. The cap structure is added to the 5′ end of the newly transcribed mRNA precursor in the nucleus prior to transport of the mRNA molecule to the cytoplasm. The 5′ cap of the RNA transcript is required both for efficient translation initiation and for protection of the 5′ end of mRNA from attack by 5′→3′ exonucleases.

Eukaryotic m RNAs lacking the cap are not efficiently translated.

The addition of the guanosine triphosphate part of the cap (Figure-3) is catalyzed by the nuclear enzyme guanylyl transferase. Methylation of the terminal guanine occurs in the cytoplasm and is catalyzed by guanine-7-methyl transferase.

S-Adenosyl methionine is the methyl group donor. Additional methylation steps may occur.

The secondary methylations  of mRNA molecules, those on the 2′-hydroxy and the N6 of adenylyl residues, occur after the mRNA molecule has appeared in the cytoplasm.

 5'Cap of m RNA

Figure-3-showing the attachment of 7 methyl guanosine triphosphate at the 5′ end of the primary transcript by a special 5′-5′ linkage. Additionally methylation can take place at the 2′ OH group of ribose residues of the first two nucleotides.

b) Addition of poly A tail

Poly(A) tails are added to the 3′ end of mRNA molecules in a posttranscriptional processing step. The mRNA is first cleaved about 20 nucleotides downstream from an AAUAAA recognition sequence (Figure-4). Another enzyme, poly(A) polymerase, adds a poly(A) tail which is subsequently extended to as many as 200 A residues. The poly(A) tail appears to protect the 3′ end of mRNA from 3′→5′ exonuclease attack. The mRNAs for histones and some interferons lack a poly(A) tail.

 Polyadenylation of m RNA

Figure-4- Showing the process of polyadenylation of primary transcript.

 After the m RNA enters the cytosol, the poly A tail is gradually shortened.
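The two steps of polyadenylation (cleavage downstream of the recognition sequence, then addition of A residues) can be sketched on a string. Everything here is illustrative: the function name, the example transcript, and the fixed values of 20 nucleotides and 200 residues, which the text gives only as approximate figures.

```python
def polyadenylate(transcript, signal="AAUAAA", offset=20, tail_length=200):
    """Cleave `offset` nt downstream of the signal and add a poly(A) tail.

    Returns the transcript unchanged if the recognition sequence is absent.
    """
    position = transcript.find(signal)
    if position == -1:
        return transcript
    cleavage_site = position + len(signal) + offset
    return transcript[:cleavage_site] + "A" * tail_length


# Hypothetical primary transcript: 15 nt, the signal, then 40 nt downstream.
transcript = "GCC" * 5 + "AAUAAA" + "CU" * 20
mature = polyadenylate(transcript)

print("primary transcript length:", len(transcript))
print("length after cleavage and tailing:", len(mature))
print("ends with 200 A residues:", mature.endswith("A" * 200))
```

Of the 40 downstream nucleotides, only the first 20 are kept, and the tail is attached at the new 3′ end.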

c) Removal of introns  (Splicing)

Introns, or intervening sequences, are the RNA sequences which do not code for proteins. These introns are removed from the primary transcript in the nucleus, exons (coding sequences) are ligated to form the mRNA molecule, and the mRNA molecule is transported to the cytoplasm.

The steps of splicing are as follows-

i) Role of small nuclear RNA (sn RNA)  and Spliceosome

The molecular machine that accomplishes the task of splicing is known as the spliceosome. Spliceosomes consist of the primary transcript, five small nuclear RNAs (U1, U2, U4, U5, and U6) and more than 60 proteins. Collectively, these form small nuclear ribonucleoprotein complexes (snRNPs), sometimes called “snurps” (Figure-5). Snurps are thought to position the RNA segments for the necessary splicing reactions. They facilitate the splicing of exon segments by forming base pairs with the consensus sequence at each end of the intron. Although the sequences of nucleotides in the introns of the various eukaryotic transcripts, and even those within a single transcript, are quite heterogeneous, there are reasonably conserved sequences at each of the two exon-intron (splice) junctions and at the branch site, which is located 20–40 nucleotides upstream from the 3′ splice site.

Spliceosome assembly

Figure-5- Showing spliceosome assembly at the splice site.  

ii) Mechanism of excision of introns

The binding of snRNPs brings the sequences of the neighboring exons into the correct alignment for splicing. The 2′-OH group of an adenosine (A) residue in the intron (known as the branch site) attacks and forms a phosphodiester bond with the phosphate at the 5′ end of the intron. The newly freed 3′-OH of the upstream exon 1 then forms a phosphodiester bond with the 5′ end of the downstream exon 2. The excised intron is released as a “lariat” structure, which is degraded (Figure-6).

Splicing 2

Figure-6- showing the process of splicing.

After removal of all the introns, the mature mRNA molecules leave the nucleus and pass into the cytosol through pores in the nuclear membrane.
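The net result of splicing (introns removed, exons joined in order) can be sketched as a string operation. This example does not model the spliceosome or the lariat intermediate. The sequences and the function name are hypothetical, and the check that an intron begins with GU and ends with AG is an added assumption based on the commonly conserved splice-junction dinucleotides, which these notes do not spell out.

```python
def splice(pre_mrna, introns):
    """Remove introns given as (start, end) index pairs and join the exons.

    Indices are 0-based and half-open, as in Python slicing.
    """
    exons, previous_end = [], 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Assumed rule: introns start with GU (5' site) and end with AG (3' site).
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"no GU...AG boundaries at {start}-{end}")
        exons.append(pre_mrna[previous_end:start])
        previous_end = end
    exons.append(pre_mrna[previous_end:])
    return "".join(exons)


# Hypothetical pre-mRNA: exon 1, one intron, exon 2.
exon1, intron, exon2 = "AUGGCC", "GUAAGUCCUAACUUUCAG", "UUUUAA"
pre_mrna = exon1 + intron + exon2
intron_start = len(exon1)
intron_end = intron_start + len(intron)

mature = splice(pre_mrna, [(intron_start, intron_end)])
print(mature)
```

The mature message is exon 1 followed directly by exon 2, with the intron removed.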

Clinical significance

1) Antibodies against snRNPs

In systemic lupus erythematosus (SLE), an autoimmune disease, antibodies are produced against host proteins, including snRNPs.

2) Mutations at the splice site

Mutations at the splice site can lead to improper splicing and the production of aberrant proteins. For example, some cases of beta thalassemia result from incorrect splicing of beta globin mRNA due to a mutation at the splice site.

Biological significance

Alternative Splicing

Alternative patterns of RNA splicing are used for the synthesis of tissue-specific proteins.

The pre-m RNA molecules from some genes can be spliced in two or more alternative ways in different tissues. This produces multiple variations of the m RNA and thus diverse set of proteins can be synthesized from a given set of genes (Figure-7).

For example- Tissue specific tropomyosins are produced  from the same primary transcript by alternative splicing. Alternative splicing and processing results in the formation of seven unique α -tropomyosin mRNAs in seven different tissues.

 Alternative splicing

Figure-7- Showing the mechanism of alternative splicing.

 Similarly, the use of alternative termination-cleavage-polyadenylation sites also results in mRNA variability (Figure-8). Alternative polyadenylation sites in the immunoglobulin heavy chain primary transcript result in mRNAs that are either 2700 bases long (m) or 2400 bases long (s). This results in a different carboxyl terminal region of the encoded proteins such that the m protein remains attached to the membrane of the B lymphocyte and the s immunoglobulin is secreted.

 Alternative poly A site

Figure-8- Showing the mechanism of alternative poly A sites selection to produce variability in mRNA and thus different proteins can be synthesized from a given set of genes.


The general process of transcription can be applied to both prokaryotic cells and eukaryotic cells. The basic biochemistry for each is the same; however, the specific mechanisms and regulation of transcription differ between prokaryotes and eukaryotes. 

Transcription of eukaryotic genes is a far more complicated process than transcription in prokaryotes. The important points of difference are as follows-

1) Location

  • In prokaryotes (bacteria), transcription occurs in the cytoplasm (Figure-1-B)
  • Translation of the mRNA into proteins also occurs in the cytoplasm (Figure-1-B)
  • In eukaryotes, transcription occurs in the cell’s nucleus. mRNA then moves to the cytoplasm for translation (Figure-1-A)

 Eukaryotic transcription

 Prokaryotic transcription

Figure-1- (A)- Showing eukaryotic transcription, (B)- Showing prokaryotic transcription

2) Genome size

  • The genome size is much larger in eukaryotes.
  • Greater specificity is needed for the transcription of eukaryotic genes.

3) Chromatin structure

  • DNA in prokaryotes is much more accessible to RNA polymerase than DNA in eukaryotes.
  • Eukaryotic DNA is wrapped around proteins called histones to form structures called nucleosomes (Figure-2)
  • Eukaryotic DNA is packed to form chromatin (Figure-2).
  • While RNA polymerase interacts directly with prokaryotic DNA, other proteins mediate the interaction between RNA polymerase and DNA in eukaryotes.

 Chromatin structure

Figure-2- showing chromatin structure.

4)  RNA polymerases

  • There are three distinct classes of RNA polymerases in eukaryotic cells. All are large enzymes with multiple subunits. Each class of RNA polymerase recognizes particular types of genes.
  • RNA polymerase I- Synthesizes the precursor of the large ribosomal RNAs (28S, 18S and 5.8S).
  • RNA polymerase II – Synthesizes the precursors of messenger RNA and small nuclear RNAs(snRNAs).
  • RNA polymerase III- Synthesizes small RNA, including t RNAs, small 5S RNA and some snRNAs.

 5) Promoter regions

  • Eukaryotic promoters are more complex.
  • Two types of sequence elements are promoter-proximal and distal regulatory elements.
  • There are two promoter-proximal elements. One of these defines where transcription is to commence along the DNA, and the other contributes to the mechanisms that control how frequently this event is to occur (Figure-3)
  • Most mammalian genes have a TATA box that is usually located 25–30 bp upstream from the transcription start site.
  • The consensus sequence for a TATA box is TATAAA, though numerous variations have been characterized. 
  • Sequences farther upstream from the start site determine how frequently the transcription event occurs.
  • Typical of these DNA elements are the GC and CAAT boxes, (Figure-3) so named because of the DNA sequences involved.
  • Each of these boxes binds a specific protein.
  • Distal regulatory elements enhance or decrease the rate of transcription.
  • They include the enhancer/ silencer regions and other regulatory elements.(Figure-3)

 Eukaryotic promoters

Figure-3- Showing that a gene can be divided into its coding and regulatory regions, as defined by the transcription start site (arrow; +1). The coding region contains the DNA sequence that is transcribed into mRNA, which is ultimately translated into protein. The regulatory region consists of two classes of elements. One class is responsible for ensuring basal expression. These elements generally have two components. The proximal component, generally the TATA box, or the Inr or DPE elements, directs RNA polymerase II to the correct site (fidelity). In TATA-less promoters, an initiator (Inr) element that spans the initiation site (+1) may direct the polymerase to this site. Another component, the upstream elements, specifies the frequency of initiation. Among the best studied of these is the CAAT box, but several other elements may be used in various genes. The distal regulatory elements consist of enhancers and repressors and other regulatory regions. Enhancers and repressors enhance or repress expression and mediate the response to various signals, including hormones, heat shock, heavy metals, and chemicals.

6) Promoter identification

  • In contrast to the situation in prokaryotes, eukaryotic RNA polymerases alone are not able to discriminate between promoter sequences and other regions of DNA
  • The TATA box is bound by 34 kDa TATA binding protein (TBP), which in turn binds several other proteins called TBP-associated factors (TAFs).
  • This complex of TBP and TAFs is referred to as TFIID (Figure-4)

Basal transcription complex

Figure-4- The eukaryotic basal transcription complex. Formation of the basal transcription complex begins when TFIID binds to the TATA box. It directs the assembly of several other components by protein-DNA and protein-protein interactions. The entire complex spans DNA from position -30 to +30 relative to the initiation site.

  • Binding of TFIID to the TATA box sequence is thought to represent the first step in the formation of the transcription complex on the promoter.(Figure-4)
  • Another set of proteins—coactivators—help regulate the rate of transcription initiation by interacting with transcription activators that bind to upstream DNA elements (Figure-5)

7) Enhancers and Repressors

  • A third class of sequence elements can either increase or decrease the rate of transcription initiation of eukaryotic genes (Figure-4)
  •  These elements are called either enhancers or repressors (or silencers), depending on which effect they have.
  • They have been found in a variety of locations both upstream and downstream of the transcription start site and even within the transcribed portions of some genes.
  • In contrast to proximal and upstream promoter elements, enhancers and silencers can exert their effects when located hundreds or even thousands of bases away from transcription units located on the same chromosome.
  • Hormone response elements (for steroids, T3, retinoic acid, peptides, etc) act as—or in conjunction with—enhancers or silencers

Eukaryotic initiation complex

Figure-5- Showing promoter identification and formation of the basal transcription complex. The basal transcription complex is assembled on the promoter after the TBP subunit of TFIID is bound to the TATA box. Several TAFs (coactivators) are associated with TBP. TAFs, since they are required for the action of activators, are often called coactivators. There are thus three classes of transcription factors involved in the regulation of class II genes: basal factors, coactivators, and activator-repressors.

8) Termination of transcription

  • The signals for the termination of transcription by eukaryotic RNA polymerase II are very poorly understood.

9) Processing of primary transcript

  • mRNA produced as a result of transcription undergoes little or no modification in prokaryotic cells. Eukaryotic cells modify mRNA by RNA splicing, 5′ end capping, and addition of a poly(A) tail.
  • Most eukaryotic RNAs are synthesized as precursors that contain excess sequences which are removed prior to the generation of mature, functional RNA.

 To be continued….


RNA Synthesis

The synthesis of an RNA molecule from DNA is called Transcription.

All eukaryotic cells have five major classes of RNA: ribosomal RNA (rRNA), messenger RNA (mRNA), transfer RNA (tRNA), small nuclear RNA (snRNA), and microRNA (miRNA). The first three are involved in protein synthesis, while the small RNAs are involved in mRNA splicing and gene regulation.

Similarities between Replication and Transcription

The processes of DNA and RNA synthesis are similar in that they involve

(1) the general steps of initiation, elongation, and termination with 5′ to 3′ polarity;

(2) large, multicomponent initiation complexes; and

(3) adherence to Watson-Crick base-pairing rules.

Differences between Replication and Transcription

(1) Ribonucleotides are used in RNA synthesis rather than deoxy ribonucleotides;

(2) U replaces T as the complementary base pair for A in RNA;

(3) A primer is not involved in RNA synthesis;

(4) Only a portion of the genome is transcribed or copied into RNA, whereas the entire genome must be copied during DNA replication; and

(5) There is no highly efficient proofreading function during RNA transcription.

Template strand

  • The sequence of ribonucleotides in an RNA molecule is complementary to the sequence of deoxy ribonucleotides in one strand of the double-stranded DNA molecule.
  • The strand that is transcribed or copied into an RNA molecule is referred to as the template strand of the DNA.
  • The other DNA strand, the non-template strand, is frequently referred to as the coding strand of that gene.
  •  It is called this because, with the exception of T for U changes, it corresponds exactly to the sequence of the RNA primary transcript, which encodes the (protein) product of the gene.
  •  In the case of a double-stranded DNA molecule containing many genes, the template strand for each gene will not necessarily be the same strand of the DNA double helix.
  • Thus, a given strand of a double-stranded DNA molecule will serve as the template strand for some genes and the coding strand of other genes.
  • The information in the template strand is read out in the 3′ to 5′ direction.
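The relationship between the template strand, the coding strand and the RNA can be verified with a short sketch. The sequence and the function names are hypothetical; the template is written 3′ to 5′, the direction in which it is read, so the RNA comes out 5′ to 3′.

```python
# Base-pairing rules: DNA template base -> complementary RNA or DNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Read the template strand 3'->5' and return the RNA written 5'->3'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def coding_strand(template_3_to_5):
    """Return the coding (non-template) strand, written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5)


template = "TACCGGATT"  # hypothetical template strand, written 3'->5'
rna = transcribe(template)
coding = coding_strand(template)

print("template (3'->5'):", template)
print("coding   (5'->3'):", coding)
print("RNA      (5'->3'):", rna)
print("RNA equals coding strand with U for T:", rna == coding.replace("T", "U"))
```

The RNA is complementary to the template strand and, apart from U in place of T, identical to the coding strand.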

Transcription unit

  • DNA-dependent RNA polymerase is the enzyme responsible for the polymerization of ribonucleotides into a sequence complementary to the template strand of the gene.
  • The enzyme attaches at a specific site—the promoter—on the template strand.
  • This is followed by initiation of RNA synthesis at the starting point, and the process continues until a termination sequence is reached (Figure-3).
  • A transcription unit is defined as that region of DNA that includes the signals for transcription initiation, elongation, and termination.

Primary transcript

  • The RNA product, which is synthesized in the 5′ to 3′ direction, is the primary transcript.
  • In prokaryotes, this can represent the product of several contiguous genes
  •  In mammalian cells, it usually represents the product of a single gene
  •  The 5′ terminals of the primary RNA transcript and the mature cytoplasmic RNA are identical.
  • The starting point of transcription corresponds to the 5′ nucleotide of the mRNA.
  • This is designated position +1, as is the corresponding nucleotide in the DNA (Figure-2.3)
  • The numbers increase as the sequence proceeds downstream.
  • The nucleotide in the promoter adjacent to the transcription initiation site is designated -1,
  •  These negative numbers increase as the sequence proceeds upstream, away from the initiation site.
  • This provides a conventional way of defining the location of regulatory elements in the promoter.
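The numbering convention described above skips zero: the start site is +1 and the nucleotide just upstream of it is -1. A small helper makes this explicit. The function name and the use of 0-based string indices are assumptions made for illustration.

```python
def position(index, start_site_index):
    """Convert a 0-based sequence index to its position relative to +1.

    Downstream positions are +1, +2, ...; upstream positions are -1, -2, ...
    There is no position 0.
    """
    offset = index - start_site_index
    return offset + 1 if offset >= 0 else offset


start_site_index = 10  # hypothetical index of the first transcribed nucleotide
for index in (0, 8, 9, 10, 11, 12):
    print(f"index {index:2d} -> position {position(index, start_site_index):+d}")
```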

 Bacterial DNA-Dependent RNA Polymerase- (Figure-1)

  • The DNA-dependent RNA polymerase (RNAP) of the bacterium Escherichia coli exists as an approximately 400 kDa core complex consisting of-
  • two identical α subunits,
  • similar but not identical β and β′ subunits, and
  • an ω subunit.
  • β is thought to be the catalytic subunit.
  • RNAP, a metalloenzyme, also contains two zinc ions.
  • The core RNA polymerase associates with a specific protein factor (the sigma, σ, factor) that helps the core enzyme recognize and bind to the specific deoxynucleotide sequence of the promoter region to form the preinitiation complex (PIC) (Figure-2)
  • Bacteria contain multiple σ factors, each of which acts as a regulatory protein.

 structure of bacterial RNA polymerase

Figure-1- showing the subunits of bacterial RNA polymerase

Mammalian cells possess three distinct nuclear DNA-Dependent RNA Polymerases

  • RNA polymerase I is for the synthesis of r RNA
  • RNA polymerase II is for the synthesis of m RNA and miRNA
  • RNA polymerase III is for the synthesis of tRNA/5S rRNA, snRNA

Steps in RNA synthesis (Prokaryotic transcription)

The process of transcription of a typical gene of E. coli can be divided into three phases-

i) Initiation

ii) Elongation

iii) Termination

i) Initiation

  • Initiation of transcription involves the binding of the RNA polymerase holoenzyme to the promoter region on the DNA to form a preinitiation complex, or PIC(Figure-2)


 RNA Polymerase

Figure-2- showing the promoter identification by sigma subunit of RNA Polymerase.

  • The characteristic “consensus” nucleotide sequences of the prokaryotic promoter region are highly conserved.

Structure of bacterial promoter region

  • Pribnow box

  This is a stretch of 6 nucleotides ( 5′- TATAAT-3′) centered about 8-10 nucleotides to the left of the transcription start site.(Figure-3)

  •  -35 Sequence

 A second consensus nucleotide sequence ( 5′- TTGACA-3′), is centered about 35 bases to the left  of the transcription start site.(Figure-3)

 Structure of bacterial promoter

Figure-3-Bacterial promoters share two regions of highly conserved nucleotide sequence. These regions are located 35 and 10 bp upstream (in the 5′ direction of the coding strand) from the start site of transcription, which is indicated as +1. By convention, all nucleotides upstream of the transcription initiation site (at +1) are numbered in a negative sense and are referred to as 5′-flanking sequences. Also by convention, the DNA regulatory sequence elements (TATA box, etc) are described in the 5′ to 3′ direction and as being on the coding strand. These elements function only in double-stranded DNA.
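Locating the two conserved elements can be sketched as a search for the best match to each consensus sequence on the coding strand. This is only an illustration: the promoter sequence, the 17-base spacer and the function names are hypothetical, and real promoters usually differ from the consensus at one or more positions, which is why mismatches are counted instead of requiring an exact match.

```python
def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def best_match(sequence, consensus):
    """Return (mismatch count, start index) of the closest match."""
    k = len(consensus)
    return min(
        (mismatches(sequence[i:i + k], consensus), i)
        for i in range(len(sequence) - k + 1)
    )


# Hypothetical coding strand, 5'->3': -35 element, spacer, Pribnow box, start.
spacer = "GC" * 8 + "G"
promoter = "GG" + "TTGACA" + spacer + "TATAAT" + "GCGCGC" + "A" + "GGCC"

print("-35 sequence (TTGACA):", best_match(promoter, "TTGACA"))
print("Pribnow box (TATAAT): ", best_match(promoter, "TATAAT"))

# A variant promoter with one base changed in the Pribnow box.
variant = promoter.replace("TATAAT", "TATGAT")
print("variant Pribnow box:  ", best_match(variant, "TATAAT"))
```

Both elements are found with zero mismatches in the first sequence, and the variant is still located at the same index with a single mismatch.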

  • Binding of RNA polymerase (RNAP) to the promoter region is followed by a conformational change of the RNAP, and the first nucleotide (almost always a purine) then associates with the initiation site on the β subunit of the enzyme.
  • In the presence of the appropriate nucleotide, RNAP catalyzes the formation of a phosphodiester bond, and the nascent chain is now attached to the polymerization site on the β subunit of RNAP.
  •  In both prokaryotes and eukaryotes, a purine ribonucleotide is usually the first to be polymerized into the RNA molecule.
  • After 10–20 nucleotides have been polymerized, RNAP undergoes a second conformational change leading to promoter clearance.
  •  Once this transition occurs, RNAP physically moves away from the promoter, transcribing down the transcription unit, leading to the next phase of the process, elongation.

ii) Elongation

  • As the elongation complex containing the core RNA polymerase progresses along the DNA molecule, DNA unwinding must occur in order to provide access for the appropriate base pairing to the nucleotides of the template strand.
  • The extent of this transcription bubble (i.e., DNA unwinding) is constant throughout transcription and is about 20 base pairs per polymerase molecule (Figure-4).
  • RNA polymerase has associated with it an “unwindase” activity that opens the DNA helix.
  • Topoisomerase both precedes and follows the progressing RNAP to prevent the formation of superhelical complexes.
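
The base-pairing step of elongation can be sketched in a few lines. This is only an illustration of the pairing rule; the function name `transcribe` and the convention of writing the template 3′→5′ are assumptions made for the example.

```python
# Pairing of template DNA bases with incoming ribonucleotides
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Build the RNA (5'->3') from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())

template = "TACGGT"          # template strand, 3'->5'
coding = "ATGCCA"            # coding strand, 5'->3'
rna = transcribe(template)
print(rna)                   # AUGCCA
```

The transcript has the same sequence as the coding strand with U in place of T, which is why promoter elements are conventionally written on the coding strand.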

iii) Termination

Termination of the synthesis of the RNA molecule in bacteria is of two types-

a) Rho (ρ) Dependent termination (Figure-4)

  • The termination process is signaled by a sequence in the template strand of the DNA molecule—a signal that is recognized by a termination protein, the rho (ρ) factor.
  • Rho is an ATP-dependent RNA-stimulated helicase that disrupts the nascent RNA-DNA complex.


 Prokaryotic transcription

Figure-4- showing the process of transcription. The promoter region is identified by the sigma factor of RNA polymerase. The strand unwinding continues with the elongation phase, base pairing rule is followed till there is signal for the termination of transcription. Rho dependent termination is shown in the figure.

b) Rho independent termination

  • This process requires intrachain self-complementary sequences in the newly formed primary transcript, so that it can form a stable hairpin that slows the progress of the RNA polymerase and causes it to pause temporarily.
  • Near the stem of the hairpin there is a sequence rich in G and C. This stabilizes the secondary structure of the hairpin (Figure-5).

Hair pin structure

Figure-5- showing the hairpin structure formed in the primary transcript due to the presence of self-complementary base sequences

  • Beyond the hairpin, the RNA transcript contains a string of Us (Figure-5); the bonding of Us to the corresponding As is weak. This facilitates the dissociation of the primary transcript from DNA (Figure-6).


 Rho independent termination

Figure-6- showing the process of transcription. A, B and C showing initiation, elongation and termination respectively.
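
The two features of a rho-independent terminator described above (a self-complementary stem followed by a run of Us) can be looked for in a transcript. This is a rough sketch under stated assumptions: the stem length, loop sizes and minimum U-run are arbitrary illustrative defaults, and `find_terminators` is a hypothetical helper, not an established tool.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Reverse complement of an RNA string (A-U and G-C pairing only)."""
    return "".join(RNA_PAIR[b] for b in reversed(rna))

def find_terminators(rna, stem=5, loop_sizes=range(3, 9), min_us=4):
    """Return (stem start, loop length) for each hairpin followed by a U-run.

    stem, loop_sizes and min_us are assumed values chosen for illustration.
    """
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - 2 * stem):
        left = rna[i:i + stem]
        for loop in loop_sizes:
            j = i + stem + loop                       # start of the right arm
            right = rna[j:j + stem]
            tail = rna[j + stem:j + stem + min_us]    # bases just after the hairpin
            if right == reverse_complement(left) and tail == "U" * min_us:
                hits.append((i, loop))
    return hits

# GC-rich stem (GCCGC ... GCGGC) with a 4-base loop, followed by six Us
transcript = "CC" + "GCCGC" + "UUCG" + "GCGGC" + "UUUUUU"
print(find_terminators(transcript))
```

The sketch ignores G-U pairs and imperfect stems, so it will miss many real terminators.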

After termination of synthesis of the RNA molecule, the enzyme separates from the DNA template. With the assistance of another sigma (σ) factor, the core enzyme then recognizes a promoter at which the synthesis of a new RNA molecule commences.


DNA in the living cell is subjected to many chemical alterations. If the genetic information encoded in the DNA is to remain uncorrupted, any chemical changes must be corrected. A failure to repair DNA produces a mutation.

Agents that Damage DNA

  • Radiation
    • Ionizing radiation such as gamma rays and x-rays
    • Ultraviolet rays, especially the UV-C rays (~260 nm) that are absorbed strongly by DNA but also the longer-wavelength UV-B that penetrates the ozone shield
  • Highly reactive oxygen radicals produced during normal cellular respiration as well as by other biochemical pathways
  • Chemicals in the environment
    • Many hydrocarbons, including some found in cigarette smoke
    • Some plant and microbial products, e.g. the aflatoxin produced in moldy peanuts
  • Chemicals used in chemotherapy, especially chemotherapy of cancers

Types of DNA damage

DNA in cells suffers a wide range of damage:

  • Purine bases are lost by spontaneous fission of the base-sugar link
  • Cytosines, and occasionally adenines, spontaneously deaminate to produce uracil and hypoxanthine respectively
  • Many chemicals, for example alkylating agents, form adducts with DNA bases
  • Reactive oxygen species in the cell attack purine and pyrimidine rings
  • Errors in DNA replication result in incorporation of a mismatched base
  • Ionizing radiation causes single- or double-strand breaks
  • Errors in replication or recombination leave strand breaks in DNA
  • Crosslinks: covalent linkages can be formed between bases on the same DNA strand (“intrastrand”) or on the opposite strand (“interstrand”). Ultraviolet light causes adjacent thymines to form a stable chemical dimer

All these lesions must be repaired if the cell is to survive. The importance of effective DNA repair systems is highlighted by the severe diseases affecting people with deficient repair systems.

DNA Repair

To cope with all these forms of damage, cells must be capable of several different types of DNA repair. DNA repair seldom involves simply undoing the change that caused the damage. Almost always a stretch of DNA containing the damaged nucleotide(s) is excised and the gap is filled by resynthesis.

DNA repair can be grouped into two major functional categories:

A) Direct Damage reversal

B) Excision of DNA damage

A) Direct Damage Reversal

Direct reversal of DNA damage is by far the simplest repair mechanism. It involves a single polypeptide chain with enzymatic properties, which binds to the damage and restores the DNA to its normal state in a single reaction step. The major polypeptides involved in this pathway are:

i) DNA photolyases, the enzymes responsible for removing cyclobutane pyrimidine dimers from DNA in a light-dependent process called photoreactivation (Figure-1).

Direct dna repair

 Figure-1- Showing the mechanism of direct reversal of DNA damage by photolyases

ii) O6-methylguanine-DNA methyltransferase I and II (MGMT), also called DNA-alkyltransferases, remove the modified bases like O6-alkylguanine and O4-alkylthymine.

The photolyase protein is not found in all living cells. However, the DNA-alkyltransferases are widespread in nature.

Some of the drugs used in cancer chemotherapy also damage DNA by alkylating it. Some of the methyl groups can be removed by the MGMT enzyme; however, each enzyme molecule can do this only once, so the removal of each methyl group requires another molecule of enzyme.

B) Excision of DNA damage – includes

i) Base excision repair (BER),

ii) Nucleotide excision repair (NER),

iii) Mismatch repair (MMR) and

iv) Strand break repairs.

In these reactions a nucleotide segment containing base damage, double-helix distortion or mispaired bases is replaced by the normal nucleotide sequence in a new DNA polymerase synthesis process. All of these pathways have been characterized in both bacterial and eukaryotic organisms.

i) Base excision repair (BER)- Figure-2

BER is initiated by DNA glycosylases, which catalyze the hydrolysis of the N-glycosidic bonds linking particular types of chemically altered bases to the deoxyribose-phosphate backbone. Thus, DNA damage is excised as free bases, generating sites of base loss called apurinic or apyrimidinic (AP) sites. AP sites also arise by depurination or depyrimidination of DNA through spontaneous hydrolysis of N-glycosidic bonds. The AP sites are substrates for AP endonucleases (Figure-2). These enzymes produce incisions in duplex DNA by hydrolyzing a phosphodiester bond immediately 5′ or 3′ to each AP site. The remaining deoxyribose-phosphate residue is then removed from the DNA by a specific exonuclease called deoxyribophosphodiesterase (dRpase). Finally, DNA polymerase and a ligase catalyze the incorporation of the correct deoxyribonucleotide into the repaired site, restoring correct base pairing (Figure-2).

Base Excision Repair 

 Figure-2- showing the mechanism of the base excision repair system.
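
The logic of BER can be illustrated for one common lesion, uracil arising from deaminated cytosine. This is a simplified sketch: the function `base_excision_repair` and the representation of the duplex as two aligned strings are assumptions for the example, and the glycosylase, AP endonuclease, polymerase and ligase steps are collapsed into a single substitution.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def base_excision_repair(damaged, partner):
    """Repair uracils in `damaged` using the intact partner strand.

    `damaged` is written 5'->3' and `partner` 3'->5', so index i pairs with index i.
    """
    if len(damaged) != len(partner):
        raise ValueError("strands must be the same length")
    repaired = []
    for base, opposite in zip(damaged, partner):
        if base == "U":
            # Glycosylase removes the uracil (AP site); the gap is then
            # filled with the nucleotide complementary to the intact strand.
            base = PAIR[opposite]
        repaired.append(base)
    return "".join(repaired)

# A cytosine opposite G has deaminated to uracil at position 2
print(base_excision_repair("ATUGA", "TAGCT"))   # ATCGA
```

The intact complementary strand supplies the information needed to restore the original base, which is the common principle behind all the excision pathways.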

ii) Nucleotide excision repair (NER)

Several types of agents generate bulky base adducts in DNA, leading to a significant distortion of the DNA helix. The most widely studied of these DNA damaging agents is UV radiation, responsible for thymine dimers, which produce a bend of ~30° in the DNA. Some chemical agents form DNA cross-links, which are particularly hazardous. These cross-links produce conformational distortions in DNA; they are substrates for DNA endonucleases that make an incision in DNA, several nucleotides to each side of the damage, generating a potential oligonucleotide fragment. Subsequent helicase reactions promote the excision of this fragment. The resulting gap is filled by DNA polymerase synthesis and covalently sealed by DNA ligase. These sequential enzymatic reactions, initiated by a specific endonuclease that recognizes the DNA distortion, are called NER (Nucleotide excision repair) (Figure-3).

Nucleotide excision repair system

 Figure-3- showing the mechanism of nucleotide excision repair

NER is a much more complex biochemical process than BER, especially in eukaryotic cells. Several gene products are required in a multiple step process, during which the ordered assembly of DNA proteins provides an enzymatic complex that discriminates damaged from undamaged DNA.

In Escherichia coli, three specific proteins, called UvrA, UvrB and UvrC, are involved in lesion recognition and endonuclease incision. The excised fragment is released by the action of UvrD helicase, generating a gap that is then filled by repair synthesis (Figure-4).


 Nucleotide excision repair

Figure-4- showing the mechanism of nucleotide excision repair in E. coli

Transcription-Coupled NER

Nucleotide-excision repair proceeds most rapidly

  • in cells whose genes are being actively transcribed
  • on the DNA strand that is serving as the template for transcription.

If RNA polymerase II, tracking along the template (antisense) strand, encounters a damaged base, it can recruit other proteins, e.g., the CSA and CSB proteins, to make a quick fix before it moves on to complete transcription of the gene.

iii) Mismatch repair (MMR)

Mismatch repair corrects errors made when DNA is copied. For example, a C could be inserted opposite an A, or the polymerase could slip or stutter and insert two to five extra unpaired bases. Specific proteins scan the newly synthesized DNA, using adenine methylation within a GATC sequence as the point of reference (Figure -5). The template strand is methylated, and the newly synthesized strand is not. This difference allows the repair enzymes to identify the strand that contains the errant nucleotide which requires replacement. If a mismatch or small loop is found, a GATC endonuclease cuts the strand bearing the mutation at a site corresponding to the GATC. An exonuclease then digests this strand from the GATC through the mutation, thus removing the faulty DNA. This can occur from either end if the defect is bracketed by two GATC sites. This defect is then filled in by normal cellular enzymes according to base pairing rules (Figure-5).

Mismatch repair system

 Figure-5- Showing the mechanism of  mismatch repair system
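
The strand-discrimination idea can be sketched as follows. This is a toy model, not the real enzymology: the methylated template is simply treated as correct, the names `find_mismatches`, `gatc_sites` and `mismatch_repair` are hypothetical, and the two strands are written so that index i of one pairs with index i of the other.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def find_mismatches(template, new_strand):
    """Positions where the new strand is not complementary to the template."""
    return [i for i, (t, n) in enumerate(zip(template, new_strand)) if PAIR[t] != n]

def gatc_sites(strand):
    """Start positions of GATC sequences (the methylation reference points)."""
    return [i for i in range(len(strand) - 3) if strand[i:i + 4] == "GATC"]

def mismatch_repair(template, new_strand):
    """Excise the new strand from the nearest GATC through each mismatch, then refill."""
    sites = gatc_sites(template)
    repaired = list(new_strand)
    if not sites:
        return new_strand          # no reference point: leave the strand as is
    for m in find_mismatches(template, new_strand):
        nearest = min(sites, key=lambda s: abs(s - m))
        lo, hi = min(nearest, m), max(nearest, m)
        for k in range(lo, hi + 1):
            repaired[k] = PAIR[template[k]]   # resynthesis by base-pairing rules
    return "".join(repaired)

template = "GATCAAGCTT"      # methylated parental strand, 5'->3'
new = "CTAGTTAGAA"           # daughter strand, 3'->5', with an A opposite G at index 6
print(find_mismatches(template, new))    # [6]
print(mismatch_repair(template, new))    # CTAGTTCGAA
```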

In E. coli, three proteins (MutS, MutL, and MutH) are required for recognition of the mutation and nicking of the strand (Figure-6). Other cellular enzymes, including ligase, polymerase, and single-strand binding proteins, remove and replace the strand. The process is somewhat more complicated in mammalian cells, as about six proteins are involved in the first steps.

Mismatch repair system in E.coli

 Figure-6–showing mechanism of mismatch repair system in E. coli

Faulty mismatch repair has been linked to hereditary nonpolyposis colon cancer (HNPCC), one of the most common inherited cancers.

iv) Repairing Strand Breaks

Ionizing radiation and certain chemicals can produce both single-strand breaks (SSBs) and double-strand breaks (DSBs) in the DNA backbone.

a) Single-Strand Breaks (SSBs)

Breaks in a single strand of the DNA molecule are repaired using the same enzyme systems that are used in Base-Excision Repair (BER).

b) Double-Strand Break Repair

There are two mechanisms by which the cell attempts to repair a complete break in a DNA molecule:

1) Direct joining of the broken ends. This requires proteins that recognize and bind to the exposed ends and bring them together for ligating. This type of joining is also called Nonhomologous End-Joining (NHEJ). A protein called Ku is essential for NHEJ.(Figure-7)

Errors in direct joining may be a cause of the various translocations that are associated with cancers. Examples:

  • Burkitt’s lymphoma
  • Philadelphia chromosome in chronic myelogenous leukemia (CML)
  • B-cell leukemia

2) Homologous Recombination. Here the broken ends are repaired using the information on the intact (Figure-7):

  • sister chromatid, or
  • homologous chromosome, or
  • same chromosome, if there are duplicate copies of the gene on the chromosome oriented in opposite directions (head-to-head or back-to-back).

Two of the proteins used in homologous recombination are encoded by the genes BRCA1 and BRCA2. Inherited mutations in these genes predispose women to breast and ovarian cancers.

Double strand repairs


 Figure-7-Showing mechanism of double strand break repair

Meiosis also involves DSBs

Recombination between homologous chromosomes in meiosis I also involves the formation of DSBs and their repair. Meiosis I with the alignment of homologous sequences provides a mechanism for repairing damaged DNA.

Diseases associated with defective DNA repair system

Some examples include:

  • Ataxia telangiectasia
  • Bloom syndrome
  • Cockayne’s syndrome
  • Progeria (Hutchinson-Gilford Progeria syndrome)
  • Rothmund-Thomson syndrome
  • Trichothiodystrophy
  • Werner syndrome
  • Xeroderma pigmentosum
  • Hereditary non polyposis colon cancer. 

DNA Replication

  • Basis for inheritance
  •  Fundamental process occurring in all cells for copying DNA to transfer the genetic information to daughter cells
  • Each cell must replicate its DNA before division.

Salient features of DNA replication

1) Semi conservative

  • Parental strands are not degraded
  • Base pairing allows each strand to serve as a template for a new strand
  • New  duplex is 1/2 parent template & 1/2 new DNA

2) Semi discontinuous- Leading & Lagging strands

  • Leading strand – continuous synthesis
  • Lagging strand – Okazaki fragments joined by ligases

 3) Energy of Replication- The nucleotides arrive as nucleoside triphosphates, which carry their own energy source for bonding

  • Each incoming unit consists of a DNA base and a sugar carrying three phosphates (P-P-P)
  • P-P-P = energy for bonding
  • DNA polymerase III adds the nucleotide in its monophosphate form and releases pyrophosphate; the pyrophosphate is then broken down, and the energy released is used for polymerization.

4) Primer is needed

  • DNA polymerase can only add nucleotides to the 3′ end of a growing DNA strand, so it needs a “starter” nucleotide to make a bond
  • The primer serves as a starter sequence for DNA polymerase III
  • The RNA primer is synthesized by primase
  • The RNA primer has a free 3′-OH group to which the first nucleotide is bound
  • Only one RNA primer is required for the leading strand
  • The number of RNA primers for the lagging strand depends on the number of Okazaki fragments

 5)  Template reading and direction of polymerization         

  •  Each DNA strand serves as a template to guide the synthesis of complementary strand.
  • Template is read in the 3′-5′ direction while polymerization takes place in the 5′-3′ direction.
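
The direction rule above can be made concrete with a short sketch. The function name `synthesize_daughter` is an illustrative assumption; the code simply reads the template from its 3′ end and emits the new strand 5′→3′.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def synthesize_daughter(template_5_to_3):
    """Return the complementary strand, written 5'->3'.

    The template is read 3'->5' (hence `reversed`), while the new strand
    grows 5'->3' by adding each complementary nucleotide to its 3' end.
    """
    return "".join(PAIR[base] for base in reversed(template_5_to_3.upper()))

parent = "ATGC"
daughter = synthesize_daughter(parent)
print(daughter)                        # GCAT
print(synthesize_daughter(daughter))   # ATGC, the original sequence again
```

Copying the daughter strand regenerates the parental sequence, which is what allows each strand of the duplex to serve as a template in semiconservative replication.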

 DNA Replication (Prokaryotes) Steps-

  1.  Identification of the origins of replication
  2. Unwinding (denaturation) of dsDNA to provide a ssDNA template
  3. Formation of the replication fork
  4. Initiation of DNA synthesis and elongation
  5. Primer removal and ligation of the newly synthesized DNA segments

Components of Replication

  • DNA polymerases: deoxynucleotide polymerization
  • Helicase: processive unwinding of DNA
  • Topoisomerases: relieve torsional strain that results from helicase-induced unwinding
  • RNA primase: initiates synthesis of RNA primers
  • Single-strand binding proteins: prevent premature reannealing of dsDNA
  • DNA ligase: seals the single-strand nick between the nascent chain and Okazaki fragments on the lagging strand

   Identification of Origin of Replication

  • At the origin of replication (ori), there is an association of sequence-specific dsDNA-binding  proteins with a series of direct repeat DNA sequences.
  • In E. coli, the oriC is bound by the protein DnaA
  • A complex is formed consisting of 150–250 bp of DNA and multimers of the DNA-binding protein. This leads to the local denaturation and unwinding of an adjacent A+T-rich region of DNA.

 Unwinding of double stranded DNA to provide single stranded template

  • The interaction of proteins with ori defines the start site of replication and provides a short  region of ssDNA essential for initiation of synthesis of the nascent DNA strand.
  • DNA Helicase allows for processive unwinding of DNA.
  • Single-stranded DNA-binding proteins (SSBs) stabilize this complex.
  • Replication occurs in both directions along the length of DNA and both strands are replicated simultaneously.
  • This replication process generates “replication bubbles”
  • Further unwinding of DNA creates replication fork

 Formation of Replication Fork–  A replication fork consists of four components that form in the following sequence:

  • DNA helicase unwinds a short segment of the parental duplex DNA
  • A primase initiates synthesis of an RNA molecule that is essential for priming DNA synthesis;
  • DNA polymerase initiates nascent, daughter strand synthesis; and
  • SSBs bind to ssDNA and prevent premature reannealing of ssDNA to dsDNA.


 Replication fork

Figure-1 -showing the components and the processes involved at the replication fork.

The DNA Polymerase Complex

A number of different DNA polymerase molecules engage in DNA replication. These share three important properties:

(1) Chain elongation,

(2) Processivity, and

(3) Proofreading.

  • Chain elongation accounts for the rate (in nucleotides per second) at which polymerization occurs.
  • Processivity is an expression of the number of nucleotides added to the nascent chain before the polymerase disengages from the template.
  • The proofreading function identifies copying errors and corrects them.
  • In E. coli, polymerase III (pol III) functions at the replication fork. Of all polymerases, it catalyzes the highest rate of chain elongation and is the most processive.
  • Polymerase II (pol II) is mostly involved in proofreading and DNA repair.
  • Polymerase I (pol I) completes chain synthesis between Okazaki fragments on the lagging strand.

 Differences between DNA Polymerase I, II and III

| Features | DNA pol I | DNA pol II | DNA pol III |
|---|---|---|---|
| Polymerization rate | Low | Low | High |
| Processivity | Low | Low | High |
| Proofreading | 3′-5′ and 5′-3′ exonuclease activities | 3′-5′ exonuclease activity | 3′-5′ exonuclease activity |
| Primer removal | Best | Nil | Nil |
| Strand synthesis | Lagging strand | No role | Both strands |
| DNA repair | Active | Active | No role |








 DNA Topoisomerases

  • Supercoils are relieved by topoisomerases

Two types:

  • Topoisomerase I acts by making a transient single cut in the backbone of the DNA, enabling the strands to swivel around each other to remove the build-up of twists
  • Topoisomerase II (DNA gyrase) acts by introducing double-stranded breaks, enabling one double-stranded DNA to pass through another, thereby removing knots and entanglements that can form within and between DNA molecules.

Initiation of replication

  • The polymerase III holoenzyme binds to template DNA as part of a multiprotein complex
  • DNA polymerases only synthesize DNA in the 5′ to 3′ direction.
  • Because the DNA strands are antiparallel, the polymerase functions asymmetrically.
  • On the leading (forward) strand, the DNA is synthesized continuously.
  • On the lagging (retrograde) strand, the DNA is synthesized in short (1–5 kb) fragments, the so-called Okazaki fragments.
  • Primer-The priming process involves the nucleophilic attack by the 3′-hydroxyl group of the RNA primer on the phosphate of the first entering deoxynucleoside triphosphate with the splitting off of pyrophosphate.

Elongation of DNA Synthesis

  • Selection of the proper deoxyribonucleotide whose terminal 3′-hydroxyl group is to be attacked is dependent upon proper base pairing with the other strand of the DNA molecule according to the rules proposed originally by Watson and Crick
  • When an adenine deoxyribonucleoside monophosphoryl moiety is in the template position, a thymidine triphosphate will enter and its phosphate will be attacked by the 3′-hydroxyl group of the deoxyribonucleoside monophosphoryl most recently added to the polymer.
  • By this stepwise process, the template dictates which deoxyribonucleoside triphosphate is complementary and by hydrogen bonding holds it in place while the 3′-hydroxyl group of the growing strand attacks and incorporates the new nucleotide into the polymer.

Primer removal and Nick sealing

  • Primers are removed by DNA polymerase I, which replaces the ribonucleotides with deoxyribonucleotides
  • Nicks are sealed by DNA ligase
  • The lagging strand requires multiple primers, while the leading strand requires a single primer.
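
The contrast in primer requirements can be estimated with simple arithmetic. This is a back-of-the-envelope sketch: the 100,000 bp stretch is an arbitrary illustrative length, the 1 to 5 kb fragment sizes are those quoted above, and `primers_needed` is a hypothetical helper that assumes one primer per Okazaki fragment.

```python
import math

def primers_needed(lagging_length_bp, fragment_bp):
    """One RNA primer per Okazaki fragment on the lagging strand."""
    return math.ceil(lagging_length_bp / fragment_bp)

stretch = 100_000   # assumed length of DNA copied by one fork, in bp
print(primers_needed(stretch, 1_000))   # 100 primers with 1 kb fragments
print(primers_needed(stretch, 5_000))   # 20 primers with 5 kb fragments
# The leading strand over the same stretch needs a single primer.
```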

Proof reading and Editing

DNA polymerase I

  • Proofreads & corrects typos
  • Repairs mismatched bases
  • Removes abnormal bases
  • Repairs damage
  • Reduces error rate from 1 in 10,000 to 1 in 100 million bases

DNA Polymerase III

  • Proofreads & corrects typos
  • Repairs mismatched bases
  • Removes abnormal bases
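
To see what the quoted error rates mean in practice, the expected number of errors per round of replication can be worked out. This sketch uses the 4.6 Mb E. coli genome size given earlier in these notes; `expected_errors` is an illustrative helper and assumes errors occur independently at a uniform rate.

```python
def expected_errors(bases_copied, bases_per_error):
    """Expected number of errors when one error occurs per `bases_per_error` bases."""
    return bases_copied / bases_per_error

genome = 4_600_000   # E. coli genome, in bp

print(expected_errors(genome, 10_000))        # 460.0 errors without proofreading
print(expected_errors(genome, 100_000_000))   # 0.046 errors with proofreading
```

Under these assumptions, proofreading takes the genome from hundreds of errors per replication to fewer than one error in every twenty replications.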

Termination of replication

  • DNA replication terminates when replication forks reach specific “termination sites”.
  • The two replication forks meet each other on the opposite end of the parental  circular DNA .
  • This process is completed in about 30 minutes, a replication rate of 3 × 10⁵ bp/min in prokaryotes.

Replication in Eukaryotes

1) Origin of Replication -Eukaryotes

  • Functionally similar autonomously replicating sequences (ARS) or replicators have been identified in yeast cells.
  • The ARS contains a somewhat degenerate 11-bp sequence called the origin replication element (ORE).
  • The ORE binds a set of proteins, analogous to the DnaA protein of E. coli, which is collectively called the origin recognition complex (ORC).
  • The ORE is located adjacent to an approximately 80-bp A+T-rich sequence that is easy to unwind. This is called the DNA unwinding element (DUE).
  • Multiple origins of replication are used to duplicate the entire length of the genome (Figure-2)

 origin of replication

Figure-2- Showing multiple origins of replication in eukaryotes

2) Eukaryotic DNA polymerases- There are 5 DNA Polymerases, each with a specific function

 Comparison between Prokaryotic and Eukaryotic DNA polymerases

| E. coli | Mammalian | Function |
|---|---|---|
| I | Alpha | Gap filling and synthesis of lagging strand |
| II | Epsilon | DNA proofreading and repair |
|  | Beta | DNA repair |
|  | Gamma | Mitochondrial DNA synthesis |
| III | Delta | Processive, leading strand synthesis |

3) Mammalian DNA polymerase Alpha is mainly responsible for the synthesis of primer.

4) Primer removal takes place by RNase H

5) The entire mammalian genome replicates in approximately 9 hours.

6) Replication takes place during “S” phase of cell cycle

7) Reconstitution of Chromatin Structure- Chromatin structure must be re-formed after replication. Newly replicated DNA is rapidly assembled into nucleosomes, and the preexisting and newly assembled histone octamers are randomly distributed to each arm of the replication fork.

8) The ends of chromosome (Telomeres) are replicated by Telomerase

Inhibitors of DNA replication-

  • Bacterial DNA Gyrase (Type II Topoisomerase)- Inhibited by Novobiocin and Nalidixic acid.
  • Ciprofloxacin interferes with DNA breakage and rejoining process
  • Mammalian  topoisomerases – inhibited by Etoposide and Adriamycin, used as anticancer drugs.
  • Nucleoside analogues 6- mercaptopurine, 5-FluoroUracil and Cytosine Arabinoside also inhibit replication and are used as anticancer drugs.

RNA (Ribonucleic acid )

Ribonucleic acid (RNA) is a polymer of purine and pyrimidine ribonucleotides linked together by 3′,5′-phosphodiester bridges analogous to those in DNA (Figure-1).

RNA- Structure.

Figure-1- Showing an RNA fragment, the ribonucleotides are linked together by 3′-5′ phosphodiester linkages.  Ribose is the principal sugar in RNA

 Differences between RNA and DNA

Although sharing many features with DNA, RNA possesses several specific differences:

| No. | RNA | DNA |
|---|---|---|
| 1 | Mainly single stranded; forms double-stranded (hairpin) structures only where self-complementary sequences are present | Double stranded (except for certain viral DNAs, which are single stranded) |
| 2 | Ribose is the main sugar | The sugar moiety is deoxyribose |
| 3 | Pyrimidine components differ; thymine is never found (except in tRNA) | Thymine is always present but uracil is never found |
| 4 | Being single stranded, it does not follow Chargaff’s rule | Follows Chargaff’s rule: the total purine content in double-stranded DNA is always equal to the pyrimidine content |
| 5 | Easily destroyed by alkali to cyclic diesters of mononucleotides | Resists alkali action due to the absence of an OH group at the 2′ position |
| 6 | A relatively labile molecule that undergoes easy and spontaneous degradation | A stable molecule; spontaneous degradation is very slow, so genetic information can be stored for years without any change |
| 7 | Mainly cytoplasmic, but also present in the nucleus (primary transcript and small nuclear RNA) | Mainly found in the nucleus; extranuclear DNA is found in mitochondria, plasmids, etc. |
| 8 | Size is variable; the base content varies from 100 to 5000 | Millions of base pairs, depending upon the organism |
| 9 | Various types exist (mRNA, rRNA, tRNA, snRNA, siRNA, miRNA and hnRNA), each performing different and specific functions | Always of one type; performs the function of storage and transfer of genetic information |
| 10 | No variable physiological forms are found; the different types of RNA do not change their forms | Variable forms of DNA exist (A to E and Z) |
| 11 | Synthesized from DNA; cannot form DNA (except by the action of reverse transcriptase) and cannot duplicate (except in certain viruses where it is the genomic material) | Can form DNA by replication and RNA by transcription |
| 12 | Many copies of RNA are present per cell | A single copy of DNA is present per cell |
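
The Chargaff's rule contrast in point 4 can be checked by counting bases. This is a small sketch: the helper names `purine_pyrimidine_counts` and `duplex` are assumptions for the example, and the duplex is represented by concatenating a strand with its complement.

```python
from collections import Counter

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def purine_pyrimidine_counts(seq):
    """Return (purines, pyrimidines) for a DNA or RNA sequence."""
    counts = Counter(seq.upper())
    purines = counts["A"] + counts["G"]
    pyrimidines = counts["C"] + counts["T"] + counts["U"]
    return purines, pyrimidines

def duplex(strand):
    """All bases of a double-stranded DNA: the strand plus its complement."""
    return strand + "".join(PAIR[b] for b in reversed(strand))

print(purine_pyrimidine_counts(duplex("AAGGA")))   # (5, 5): purines equal pyrimidines
print(purine_pyrimidine_counts("AAGGA"))           # (5, 0): a single strand need not balance
```

In a duplex every purine is paired with a pyrimidine, so the totals must match; a single-stranded RNA has no such constraint.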

Types of RNA

In all prokaryotic and eukaryotic organisms, three main classes of RNA molecules exist-

1)      Messenger RNA (m- RNA)

2)      Transfer RNA (t- RNA)

3)      Ribosomal RNA (r- RNA)

The others are:

  • Small nuclear RNA (SnRNA)
  • Micro RNA(mi RNA)
  • Small interfering RNA(Si RNA) and
  • Heterogeneous nuclear RNA (hnRNA).

1) Messenger RNA (m-RNA)

  • Comprises only 5% of the RNA in the cell
  • Most heterogeneous in size and base sequence
  • All members of the class function as messengers carrying the information in a gene to the protein synthesizing machinery

 Structural Characteristics of m-RNA

  • The 5′ terminal end is capped by a 7-methylguanosine triphosphate cap.
  • The cap is involved in the recognition of mRNA by the translating machinery
  • It stabilizes mRNA by protecting it from 5′ exonucleases
  • The 3′ end of most mRNAs has a polymer of adenylate residues (20-250), the poly A tail
  • The tail prevents attack by 3′ exonucleases. The mRNAs for histones and interferons do not contain poly A tails
  • At both the 5′ and 3′ ends there are non-coding sequences (NCS), which are not translated
  • The region lying between the 5′ and 3′ non-coding sequences is called the coding region. This region encodes the synthesis of a protein.
  • The m- RNA molecules are formed with the help of DNA template during the process of transcription.
  • The sequence of nucleotides in m RNA is complementary  to the sequence of nucleotides on template DNA.
  • The  sequence carried on m -RNA is read in the form of codons.
  • A codon is made up of 3 nucleotides
  • The m-RNA is formed after processing of heterogeneous nuclear RNA
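
Reading the coding region as codons can be sketched as below. The sketch assumes the standard start codon (AUG) and stop codons (UAA, UAG, UGA), which are not listed in these notes; `read_codons` is a hypothetical helper, and it simply takes the first AUG it finds as the start of the coding region.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}   # standard genetic code (assumed here)

def read_codons(mrna):
    """Split the coding region into triplets, from the first AUG to the first stop."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return []                     # no start codon: nothing to read
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        codons.append(codon)
        if codon in STOP_CODONS:
            break
    return codons

# "GG" at each end stands in for the 5' and 3' non-coding sequences
print(read_codons("GGAUGGCUUAAGG"))   # ['AUG', 'GCU', 'UAA']
```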

2) Heterogeneous nuclear RNA (hnRNA)

  • In mammalian nuclei,  hnRNA is the immediate product of gene transcription
  • The nuclear product is heterogeneous in size (Variable) and is very large.
  • Molecular weight may be more than 10⁷, while the molecular weight of mRNA is less than 2 × 10⁶
  • 75 % of hnRNA is degraded in the nucleus, only 25% is processed to mature m RNA

3) Transfer RNA (t- RNA)

  • Transfer RNA are the smallest of three major species of RNA molecules
  • They have 74-95 nucleotide residues
  • They are synthesized by the nuclear processing of a precursor molecule
  • They transfer the amino acids from cytoplasm to the protein synthesizing machinery, hence the name t RNA.
  • They are easily soluble, hence called “soluble RNA” or sRNA
  • They are also called Adapter molecules, since they act as adapters for the translation of the sequence of nucleotides of the m RNA in to specific amino acids
  • There are at least 20 species of t RNA one corresponding to each of the 20 amino acids required for protein synthesis.

 Structural characteristics of t- RNA

A) Primary structure- The nucleotide sequence of all tRNA molecules allows extensive intrastrand complementarity that generates a secondary structure.

B) Secondary structure– Each single t- RNA shows extensive internal base pairing and acquires a clover leaf like structure. The structure is stabilized by hydrogen bonding between the bases and is a consistent feature.

Clover leaf structure (Figure-2(a))

All t-RNA contain 5 main arms or loops which are as follows-

a)      Acceptor arm

b)      Anticodon arm

c)      DHU arm

d)      TΨC arm

e)      Extra arm

a) Acceptor arm

  • The acceptor arm is at 3’ end
  • It has 7 base pairs
  • The 3′ end terminates in the unpaired sequence cytosine-cytosine-adenine (CCA)
  • The 3′-OH group of the terminal adenosine binds the carboxyl group of the amino acid
  • A tRNA bound to its amino acid is called an aminoacyl-tRNA
  • The CCA sequence is attached post-transcriptionally

b) Anticodon arm

  • Lies at the opposite end of acceptor arm
  • 5 base pairs long
  • Recognizes the triplet codon present in the m RNA
  • Base sequence of anticodon arm is complementary to the base sequence of m RNA codon.
  • Due to this complementarity, it can bind specifically to the mRNA by hydrogen bonds.
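
Codon-anticodon complementarity can be written out directly. This is a minimal sketch that uses strict Watson-Crick pairing only and ignores any relaxed pairing at the third codon position; `anticodon_for` is an illustrative name.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_for(codon):
    """Return the anticodon (5'->3') that pairs with an mRNA codon (5'->3').

    The two sequences pair in antiparallel orientation, so the anticodon is
    the reverse complement of the codon.
    """
    return "".join(RNA_PAIR[base] for base in reversed(codon.upper()))

print(anticodon_for("AUG"))   # CAU
```

Because pairing is antiparallel, the first base of the codon pairs with the third base of the anticodon.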

c) DHU arm

  • It has 3-4 base pairs
  • Serves as the recognition site for the enzyme (amino acyl t RNA synthetase) that adds the amino acid to the acceptor arm.

d) TΨC arm

  • This arm is opposite to DHU arm
  • It is so named because it contains pseudouridine (Ψ)
  • It is involved in the binding of t RNA to the ribosomes

e) Extra arm or Variable arm

  • About 75% of tRNA molecules possess a short extra arm
  • If 3-5 base pairs are present, the tRNA is said to belong to class 1. The majority of tRNAs belong to class 1.
  • tRNAs belonging to class 2 have a long extra arm, 13-21 base pairs in length.

 C) Tertiary structure of t- RNA- Figure-2(b)

  •  The L shaped tertiary structure is formed by further folding of the clover leaf due to hydrogen bonds between T and D arms.
  • The base-paired double helical stems are arranged into two double helical columns, continuous and perpendicular to one another.


Secondary structure of t- RNA   


Tertiary structure of t- RNA



Figure-2-(a) Showing secondary (clover leaf ) structure of t-RNA. The carboxyl group of amino acid is attached to 3’OH group of Adenine nucleotide of the acceptor arm. The anticodon arm base pairs with the codon present on the m- RNA. (b) Tertiary structure of t RNA is formed by further folding of secondary structure of t RNA.

4) Ribosomal RNA (rRNA)

The mammalian ribosome contains two major nucleoprotein subunits: a larger one with a molecular weight of 2.8 × 10^6 (60S) and a smaller one with a molecular weight of 1.4 × 10^6 (40S) (Figure-3).

  •  The 60S subunit contains a 5S ribosomal RNA (rRNA), a 5.8S rRNA, and a 28S rRNA; there are also probably more than 50 specific polypeptides.
  • The 40S subunit is smaller and contains a single 18S rRNA and approximately 30 distinct polypeptide chains.
  • All of the ribosomal RNA molecules except the 5S rRNA are processed from a single 45S precursor RNA molecule in the nucleolus. The 5S rRNA is transcribed independently.

Functions of the ribosomal RNA

The functions of the ribosomal RNA molecules in the ribosomal particle are not fully understood, but they are necessary for ribosomal assembly and seem to play key roles in the binding of mRNA to ribosomes and its translation. Recent studies suggest that an rRNA component performs the peptidyl transferase activity and thus is an enzyme (a ribozyme).

 5) Small RNAs

  •  Most of these molecules are complexed with proteins to form ribonucleoproteins and are distributed in the nucleus, in the cytoplasm, or in both.
  • They range in size from 20 to 300 nucleotides and are present in 100,000–1,000,000 copies per cell.

Small Nuclear RNAs (snRNAs)

  •  snRNAs, a subset of the small RNAs, are significantly involved in mRNA processing and gene regulation
  • Of the several snRNAs, U1, U2, U4, U5, and U6 are involved in intron removal and the processing of hnRNA into mRNA
  • The U7 snRNA is involved in production of the correct 3′ ends of histone mRNA—which lacks a poly(A) tail.


Figure-3- showing structure of mammalian ribosomes- Ribosomal RNA are the structural  components of ribosomes.

 Micro RNAs, miRNAs, and Small Interfering RNAs, siRNAs

  • These two classes of RNAs represent a subset of small RNAs; both play important roles in gene regulation.
  • miRNAs and siRNAs cause inhibition of gene expression by decreasing specific protein production albeit apparently via distinct mechanisms

 Micro RNAs (miRNAs)

  •  miRNAs are typically 21–25 nucleotides in length and are generated by nucleolytic processing of the products of distinct genes/transcription units
  • The small processed mature miRNAs typically hybridize, via the formation of imperfect RNA-RNA duplexes within the 3′-untranslated regions of specific target mRNAs, leading via unknown mechanisms to translation arrest.


 mi RNA

Figure-4- Showing the mechanism of action of micro RNAs. Micro RNAs, short non-coding RNAs present in all living organisms, have been shown to regulate the expression of at least half of all human genes. These single-stranded RNAs exert their regulatory action by binding messenger RNAs and preventing their translation into proteins.

Small Interfering RNAs (siRNAs)

  •  siRNAs are derived by the specific nucleolytic cleavage of larger, double-stranded RNAs to again form small 21–25 nucleotide-long products.
  • These short siRNAs usually form perfect RNA-RNA hybrids with their distinct targets potentially anywhere within the length of the mRNA where the complementary sequence exists.
  • Formation of such RNA-RNA duplexes between siRNA and mRNA results in reduced specific protein production because the siRNA-mRNA complexes are degraded by dedicated nucleolytic machinery.

 Significance of mi RNAs and si RNAs

  •  Both miRNAs and siRNAs represent exciting new potential targets for therapeutic drug development in humans.
  • In addition, siRNAs are frequently used to decrease or “knock-down” specific protein levels in experimental procedures in the laboratory, an extremely useful and powerful alternative to gene-knockout technology.




  • DNA – a polymer of deoxy ribonucleotides
  • found in chromosomes, mitochondria and chloroplasts
  • carries the genetic information.

DNA structure

a) Primary structure (Figure-1)

  • Represents the linear sequence of deoxy ribonucleotides linked together by 3′-5′ phosphodiester linkages
  • The informational content of DNA resides in the sequence in which monomers—purine and pyrimidine deoxy ribonucleotides—are ordered
  • The polymer as depicted possesses a polarity; one end has a 5′-hydroxyl or phosphate terminal while the other has a 3′-phosphate or hydroxyl terminal.
  • Traditionally, a DNA sequence is drawn from 5’ to 3’ end.


Primary structure of DNA

Figure-1- Showing structure of a linear single stranded DNA fragment, individual nucleotides are linked together by 3′-5’phosphodiester linkage. DNA strand has a polarity (5′ Free end and a 3′ free or unattached end). In a single-stranded DNA, sequence is written in the 5′ to 3′ direction.

b) Secondary structure (Figure- 2)

The secondary structure of DNA is the double-helical structure as proposed by Watson, Crick, and Wilkins

  • The two strands of the double-helical molecule, each of which possesses a polarity, are antiparallel; ie, one strand runs in the 5′ to 3′ direction and the other in the 3′ to 5′ direction.
  • Sugar-phosphate chains wrap around the periphery.
  • Bases (A,T, C and G) occupy the core, forming complementary A · T and G · C Watson-Crick base pairs.
  • The DNA double helix is held together mainly by hydrogen bonds.
  • There are two hydrogen bonds in each A:T pair and three hydrogen bonds in each G:C pair.
  • The bases in DNA are planar and have a tendency to “stack”.
  • Major stacking forces: hydrophobic interactions and van der Waals forces.
  • This common form of DNA is said to be right-handed because as one looks down the double helix, the base residues form a spiral in a clockwise direction.
  • The double-helical structure obeys Chargaff's rule, which states that in DNA molecules the concentration of deoxyadenosine (A) nucleotides equals that of thymidine (T) nucleotides (A = T), while the concentration of deoxyguanosine (G) nucleotides equals that of deoxycytidine (C) nucleotides (G = C).
  • In the double-stranded DNA molecules, the genetic information resides in the sequence of nucleotides on one strand, the template strand. This is the strand of DNA that is copied during ribonucleic acid (RNA) synthesis. It is sometimes referred to as the noncoding strand. The opposite strand is considered the coding strand because it matches the sequence of the RNA transcript (but containing uracil in place of thymine) that encodes the protein.
  • There are Grooves in the DNA Molecule- a major groove and a minor groove winding along the molecule parallel to the phosphodiester backbones. In these grooves, proteins can interact specifically with exposed atoms of the nucleotides (via specific hydrophobic and ionic interactions) thereby recognizing and binding to specific nucleotide sequences without disrupting the base pairing of the double-helical DNA molecule.
  • Double-stranded DNA exists in at least six forms (A–E and Z). The B form is usually found under physiologic conditions (low salt, high degree of hydration).
  • A single turn of B-DNA about the axis of the molecule contains ten base pairs. The distance spanned by one turn of B-DNA is 3.4 nm (34 Å). The width (helical diameter) of the double helix in B-DNA is 2 nm (20 Å).
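
Several of the points above (antiparallel complementary strands, Chargaff's rule, and the relation between the template strand, the coding strand and the RNA transcript) can be checked with a short Python sketch. The sequence and the function names are hypothetical, chosen only for illustration.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the antiparallel partner of a DNA strand, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(strand.upper()))


def base_counts(strands):
    """Count A, T, G and C over all strands passed in."""
    counts = {"A": 0, "T": 0, "G": 0, "C": 0}
    for strand in strands:
        for base in strand.upper():
            counts[base] += 1
    return counts


def transcribe(template: str) -> str:
    """Return the RNA (5'->3') copied from a template strand given 5'->3'."""
    return reverse_complement(template).replace("T", "U")


coding = "ATGGTTGAGTAA"                 # hypothetical coding strand, 5'->3'
template = reverse_complement(coding)   # the strand that is actually copied
counts = base_counts([coding, template])

print(template)
print(counts["A"] == counts["T"], counts["G"] == counts["C"])  # Chargaff's rule
print(transcribe(template))  # same as the coding strand, with U in place of T
```

Note that Chargaff's rule holds for the duplex as a whole; a single strand taken alone need not contain equal amounts of A and T or of G and C.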


 Secondary structure of DNA

Figure-2-  showing a  diagrammatic representation of the Watson and Crick model of the double-helical structure of the B form of DNA. The horizontal arrow indicates the width of the double helix (20 Å), and the vertical arrow indicates the distance spanned by one complete turn of the double helix (34 Å). The major and minor grooves are depicted. Hydrogen bonds between A/T and G/C bases indicated by short horizontal lines.

 Structural forms of DNA

| Property | A-DNA | B-DNA | Z-DNA |
|---|---|---|---|
| Helix handedness | Right | Right | Left |
| Base pairs per turn | 11 | 10.4 | 12 |
| Rise per base pair along axis | 0.23 nm | 0.34 nm | 0.38 nm |
| Pitch | 2.46 nm | 3.40 nm | 4.56 nm |
| Diameter | 2.55 nm | 2.37 nm | 1.84 nm |
| Conformation of glycosidic bond | anti | anti | Alternating anti and syn |
| Major groove | Present | Present | Absent |
| Minor groove | Present | Present | Deep cleft |

 c) Tertiary structure

In eukaryotic cells, DNA is folded into chromatin. Chromatin consists of very long double-stranded DNA molecules and a nearly equal mass of rather small basic proteins termed histones as well as a smaller amount of nonhistone proteins (most of which are acidic and larger than histones) and a small quantity of RNA. The double-stranded DNA helix in each chromosome has a length that is thousands of times the diameter of the cell nucleus. One purpose of the molecules that comprise chromatin, particularly the histones, is to condense the DNA.
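
The statement that chromosomal DNA is thousands of times longer than the nucleus is wide can be checked with simple arithmetic. The sketch below uses the B-DNA geometry given earlier (3.4 nm per turn of ten base pairs) and human chromosome sizes of roughly 50 to 250 Mb. The nuclear diameter of 10 µm is an assumed, typical value and is not taken from these notes.

```python
RISE_PER_BP_NM = 3.4 / 10      # B-DNA: 3.4 nm per turn of ten base pairs
NUCLEUS_DIAMETER_UM = 10.0     # ASSUMPTION: typical value, not from the notes


def contour_length_cm(base_pairs: float) -> float:
    """Length of fully extended B-DNA, in centimetres."""
    return base_pairs * RISE_PER_BP_NM * 1e-7   # 1 nm = 1e-7 cm


def nuclear_diameters(base_pairs: float) -> float:
    """How many nuclear diameters the extended DNA would span."""
    length_um = base_pairs * RISE_PER_BP_NM * 1e-3   # 1 nm = 1e-3 micrometre
    return length_um / NUCLEUS_DIAMETER_UM


for size_mb in (50, 250):
    bp = size_mb * 1e6
    print(f"{size_mb} Mb: {contour_length_cm(bp):.1f} cm, "
          f"about {nuclear_diameters(bp):,.0f} nuclear diameters")
```

Under these assumptions a single chromosome's DNA would stretch over a few centimetres, which is why extensive packaging by histones is needed.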

Levels of organization of DNA (Figure-3)

  • Nucleosomes are composed of DNA wound around a collection of histone molecules.
  • The disk-like nucleosome structure has a 10-nm diameter and a height of 5 nm. The 10-nm fibril consists of nucleosomes arranged with their edges separated by a small distance (30 bp of DNA) with their flat faces parallel with the fibril axis.
  • The 10-nm fibril is probably further supercoiled with six or seven nucleosomes per turn to form the 30-nm chromatin fiber.
  • In interphase chromosomes, chromatin fibers appear to be organized into 30,000–100,000 bp loops or domains anchored in a scaffolding (or supporting matrix) within the nucleus.
  • At metaphase, mammalian chromosomes possess a twofold symmetry, with the identical duplicated sister chromatids connected at a centromere, the relative position of which is characteristic for a given chromosome (Figure-3)

Tertiary structure of DNA

Figure-3- Showing the different levels of organization of DNA structure.

Functions of DNA

The genetic information stored in the nucleotide sequence of DNA serves two purposes.

  • It is the source of information for the synthesis of all protein molecules of the cell and organism.
  • It provides the information inherited by daughter cells or offspring.

Both of these functions require that the DNA molecule serve as a template: in the first case for the transcription of the information into RNA, and in the second case for the replication of the information into daughter DNA molecules.




A mutation is a permanent change in the nucleotide sequence of a gene. Mutations may be gross, so that a large area of a chromosome is changed, or subtle, with a change in one or a few nucleotides.

Causes of Mutations

1) Spontaneous

Spontaneous mutations on the molecular level include:

  • Tautomerism – A base is changed by the repositioning of a hydrogen atom.
  • Depurination – Loss of a purine base (A or G).
  • Deamination – Changes a normal base to an atypical base: C → U (which can be corrected by DNA repair mechanisms), spontaneous deamination of 5-methylcytosine to thymine (irreparable), or A → HX (hypoxanthine).
  • Transition – A purine changes to another purine, or a pyrimidine to a pyrimidine.
  • Transversion – A purine becomes a pyrimidine, or vice versa.

2) Induced by Mutagens

Induced mutations on the molecular level can be caused by:

  • Chemicals
    • Nitroso compounds
    • Hydroxylamine NH2OH
    • Base analogs
    • Simple chemicals (e.g. acids)
    • Alkylating agents (e.g. N-ethyl-N-nitrosourea (ENU)). These agents can mutate both replicating and non-replicating DNA. In contrast, a base analog can only mutate the DNA when the analog is incorporated during DNA replication.
    • Methylating agents  
    • Polycyclic aromatic hydrocarbons e.g. benzopyrenes
    • DNA intercalating agents (e.g. ethidium bromide)
    • DNA cross linker (e.g. platinum)
    • Oxidative damage caused by oxygen radicals
  • Radiation
    • Ultraviolet radiation (nonionizing radiation) – excites electrons to a higher energy level. DNA absorbs ultraviolet light. Two nucleotide bases in DNA, cytosine and thymine, are most vulnerable to excitation that can change base-pairing properties. UV light can induce adjacent thymine bases in a DNA strand to bond covalently with each other, forming a bulky dimer.
    • Ionizing radiation
  •  Biological- Viruses

DNA has so-called hotspots, where mutations occur up to 100 times more frequently than the normal mutation rate. A hotspot can be at an unusual base, e.g., 5-methylcytosine.

Classification of mutations-

Structurally, mutations can be classified as:

A) Point mutations, often caused by chemicals or malfunction of DNA replication, and include single nucleotide changes-

  • Substitutions exchange a single nucleotide for another. Most common is the transition, which exchanges a purine for a purine (A ↔ G) or a pyrimidine for a pyrimidine (C ↔ T). A transition can be caused by nitrous acid, base mispairing, or mutagenic base analogs. Less common is the transversion, which exchanges a purine for a pyrimidine or a pyrimidine for a purine (C/T ↔ A/G). An example of a transversion is adenine (A) being converted into cytosine (C).

  • Insertions add one or more extra nucleotides into the DNA. They are usually caused by transposable elements or errors during replication of repeating elements (e.g. AT repeats). Insertions in the coding region of a gene may alter splicing of the mRNA (splice site mutation), or cause a shift in the reading frame (frameshift), both of which can significantly alter the gene product.

  • Deletions remove one or more nucleotides from the DNA. Like insertions, these mutations can alter the reading frame of the gene.
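
The distinction between transitions and transversions follows directly from the chemical class of the two bases involved, and can be sketched as below. The function name `classify_substitution` is hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def classify_substitution(original: str, mutated: str) -> str:
    """Label a single-base substitution as a transition or a transversion."""
    original, mutated = original.upper(), mutated.upper()
    valid = PURINES | PYRIMIDINES
    if original not in valid or mutated not in valid:
        raise ValueError("bases must be one of A, G, C, T or U")
    if original == mutated:
        return "no change"
    # Same chemical class on both sides means a transition.
    same_class = ({original, mutated} <= PURINES
                  or {original, mutated} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # purine to purine
print(classify_substitution("C", "T"))  # pyrimidine to pyrimidine
print(classify_substitution("A", "C"))  # purine to pyrimidine
```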

B ) Large-scale mutations in chromosomal structure, including:

a)     Amplifications (or gene duplications) leading to multiple copies of the affected chromosomal regions, increasing the dosage of the genes located within them.

b)    Deletions of large chromosomal regions, leading to loss of the genes within those regions.

c)     Mutations whose effect is to juxtapose previously separate pieces of DNA, potentially bringing together separate genes to form functionally distinct fusion genes (e.g. bcr-abl). These include:

  • Chromosomal translocations: interchange of genetic parts from nonhomologous chromosomes.

  • Interstitial deletions: an intra-chromosomal deletion that removes a segment of DNA from a single chromosome, thereby apposing previously distant genes.

  • Chromosomal inversions: reversing the orientation of a chromosomal segment.

Effects of mutations

Although the initial change may not occur in the template strand of the double-stranded DNA molecule for that gene, after replication, daughter DNA molecules with mutations in the template strand will segregate and appear in the population of organisms.

If the nucleotide sequence of the gene containing the mutation is transcribed into an RNA molecule, then the RNA molecule will possess a complementary base change at the corresponding locus. (See figure)

Single-base changes in the mRNA molecules may have one of several effects when translated into protein:

(1) There may be no detectable effect because of the degeneracy of the code; such mutations are often referred to as silent mutations. This would be more likely if the changed base in the mRNA molecule were to be at the third nucleotide of a codon. Because of wobble, the translation of a codon is least sensitive to a change at the third position. For example, valine has four codons (GUU, GUC, GUA, and GUG); a change in the third nucleotide still leads to the incorporation of the same amino acid, so there is no effect on the functional capacity of the protein.

(2) A missense effect will occur when a different amino acid is incorporated at the corresponding site in the protein molecule. This mistaken amino acid—or missense, depending upon its location in the specific protein—might be acceptable, partially acceptable, or unacceptable to the function of that protein molecule. From a careful examination of the genetic code, one can conclude that most single-base changes would result in the replacement of one amino acid by another with rather similar functional groups. This is an effective mechanism to avoid drastic change in the physical properties of a protein molecule. If an acceptable missense effect occurs, the resulting protein molecule may not be distinguishable from the normal one. A partially acceptable missense will result in a protein molecule with partial but abnormal function. If an unacceptable missense effect occurs, then the protein molecule will not be capable of functioning in its assigned role.

a) Acceptable Missense mutations- The sequencing of a large number of hemoglobin mRNAs and genes from many individuals has shown that the codon for valine at position 67 of the beta chain of hemoglobin is not identical in all persons who possess a normally functional beta chain of hemoglobin. The codon changes by point mutation from GUU (valine) to GAU (aspartic acid) in Hb Bristol. Similarly, in Hb Sydney the codon changes from GUU to GCU (alanine). Both Hb Bristol and Hb Sydney are normal Hb variants with normal oxygen carrying capacity. Thus these are acceptable mutations. Hemoglobin Hikari has been found in at least two families of Japanese people. This hemoglobin has asparagine substituted for lysine at position 61 in the beta chain. The corresponding transversion might be either AAA or AAG changed to either AAU or AAC. The replacement of the specific lysine with asparagine apparently does not alter the normal function of the beta chain in these individuals.

b) Partially acceptable Missense mutations

A partially acceptable missense mutation is best exemplified by hemoglobin S, which is found in sickle cell anemia. Here glutamic acid, the normal amino acid in position 6 of the beta chain, has been replaced by valine. The corresponding single nucleotide change within the codon would be GAA or GAG of glutamic acid to GUA or GUG of valine. Clearly, this missense mutation hinders normal function and results in sickle cell anemia when the mutant gene is present in the homozygous state. The glutamate-to-valine change may be considered to be partially acceptable because hemoglobin S does bind and release oxygen, although abnormally.

c) Unacceptable Missense Mutations

For example, the hemoglobin M mutations generate molecules that allow the Fe2+ of the heme moiety to be oxidized to Fe3+, producing methemoglobin. Here the single nucleotide change alters the properties of the protein to such an extent that it becomes nonfunctional. Hb M results from a histidine to tyrosine substitution.

The distal histidine of the alpha chain of globin is replaced by tyrosine. The codon CAU is changed to UAU, with the resultant incorporation of tyrosine and formation of methemoglobin. Methemoglobin cannot transport oxygen.

(3) A nonsense codon may appear that would then result in the premature termination of a peptide chain and the production of only a fragment of the intended protein molecule. The probability is high that a prematurely terminated protein molecule or peptide fragment will not function in its assigned role. For example, the codon UAC for tyrosine may be mutated to UAA or UAG, both of which are stop codons. Some forms of beta thalassemia are caused by nonsense mutations.

In certain conditions, a mutational event may change a stop codon into a sense codon (for example, UAA to CAA). Translation then continues past the normal end of the message, elongating the protein to produce “run-on polypeptides”. The resultant protein is functionally abnormal.
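
The effects of single-base changes listed above (silent, missense, nonsense, and loss of a stop codon) can be reproduced by comparing the amino acids encoded before and after the change. The sketch below builds the standard genetic code table and classifies the codon changes used as examples in this section. The function name and the wording of the labels are hypothetical.

```python
BASES = "UCAG"
# Standard genetic code in UCAG order, one-letter amino acids; '*' is a stop codon.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_point_mutation(normal: str, mutant: str) -> str:
    """Classify a codon change by its effect on the encoded amino acid."""
    before, after = CODON_TABLE[normal], CODON_TABLE[mutant]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    if before == "*":
        return "stop codon lost (run-on polypeptide)"
    return f"missense ({before} -> {after})"


examples = [
    ("GUU", "GUC"),  # valine, third-position change
    ("GAG", "GUG"),  # glutamic acid to valine, as in hemoglobin S
    ("CAU", "UAU"),  # histidine to tyrosine, as in hemoglobin M
    ("UAC", "UAA"),  # tyrosine codon to stop codon
    ("UAA", "CAA"),  # stop codon to glutamine codon
]
for normal, mutant in examples:
    print(normal, "->", mutant, ":", classify_point_mutation(normal, mutant))
```

Whether a missense change is acceptable, partially acceptable or unacceptable cannot be read from the codon table; it depends on the position and role of the residue in the protein.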

Frame shift Mutations

A frameshift mutation is caused by the insertion or deletion of a number of nucleotides that is not a multiple of three. Because codons are read as triplets, the insertion or deletion disrupts the reading frame, or the grouping of the codons, resulting in a completely different translation from the original. The earlier in the sequence the deletion or insertion occurs, the more altered the protein produced is.

If three nucleotides or a multiple of three are deleted from a coding region, the corresponding mRNA when translated will provide a protein lacking the corresponding number of amino acids. Because the reading frame is a triplet, the reading phase will not be disturbed for those codons distal to the deletion.
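
The difference between deleting one nucleotide and deleting three can be seen by translating a short message before and after each deletion. The mRNA below is a made-up example, and the helper names are hypothetical.

```python
BASES = "UCAG"
# Standard genetic code in UCAG order; '*' is a stop codon.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def delete(mrna: str, start: int, length: int) -> str:
    """Remove `length` nucleotides beginning at 0-based index `start`."""
    return mrna[:start] + mrna[start + length:]


normal = "AUGGCUGAAUUUGGUAAACUGUAA"   # hypothetical message: AUG GCU GAA UUU ...

print(translate(normal))                 # original protein
print(translate(delete(normal, 9, 3)))   # triplet deletion: one residue missing
print(translate(delete(normal, 9, 1)))   # single deletion: frame shifted
```

After the triplet deletion, every codon downstream is read as before. After the single-nucleotide deletion, every downstream codon changes and the original stop codon is no longer read in frame.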

Triplet deletion

A triplet deletion removes exactly one amino acid from the polypeptide. The most common mutation in cystic fibrosis is Delta F508, i.e. deletion of amino acid number 508 (a phenylalanine, F).

Trinucleotide expansion

The commonest inherited cause of mental retardation is a syndrome originally known as Martin-Bell syndrome. Patients are most usually male, have a characteristic elongated face and numerous other abnormalities including greatly enlarged testes. In 1969 the name of the syndrome was changed to the fragile X syndrome. The mutation was tracked down to a trinucleotide expansion in the gene now named FMR1 (Fragile site with Mental Retardation).  A number of diseases have now been ascribed to trinucleotide expansions. These include Huntington’s disease and Myotonic dystrophy.

Gene deletions

Alpha thalassemia is an example of gene deletion. The clinical manifestations depend on the number of genes deleted.

Consequences of Mutations

Harmful mutations

Changes in DNA caused by mutation can cause errors in protein sequence, creating partially or completely non-functional proteins. To function correctly, each cell depends on thousands of proteins to function in the right places at the right times. When a mutation alters a protein that plays a critical role in the body, a medical condition can result. A condition caused by mutations in one or more genes is called a genetic disorder. However, only a small percentage of mutations cause genetic disorders; most have no impact on health. For example, some mutations alter a gene’s DNA base sequence but don’t change the function of the protein made by the gene.

If a mutation is present in a germ cell, it can give rise to offspring that carries the mutation in all of its cells. This is the case in hereditary diseases. On the other hand, a mutation can occur in a somatic cell of an organism. Such mutations will be present in all descendants of this cell, and certain mutations can cause the cell to become malignant, and thus cause cancer.

Often, gene mutations that could cause a genetic disorder are repaired by the DNA repair system of the cell. Each cell has a number of pathways through which enzymes recognise and repair mistakes in DNA. Because DNA can be damaged or mutated in many ways, the process of DNA repair is an important way in which the body protects itself from disease.

Beneficial mutations

A very small percentage of all mutations actually have a positive effect. These mutations lead to new versions of proteins that help an organism and its future generations better adapt to changes in their environment. For example, a specific 32 base pair deletion in human CCR5 (CCR5-Δ32) confers HIV resistance to homozygotes and delays AIDS onset in heterozygotes. The CCR5 mutation is more common in those of European descent.

Summary of Mutations



The pathway of protein synthesis is called Translation because the ‘language’ of the nucleotide sequence on the mRNA is translated into the language of the amino acid sequence. The m RNA is translated from its 5’end to its 3’end, producing a protein synthesized from its amino terminal end to its carboxyl terminal end.

Prokaryotic Translation

 Components required for Translation-

  • Amino acids
  • Transfer RNA
  • Messenger RNA
  • Aminoacyl tRNA synthetase
  • Functionally competent ribosomes
  • Protein factors
  • ATP and GTP as a source of energy


Ribosomes are large complexes of protein and rRNA. They consist of two subunits (Figure-1), one large (heavy) and one small (light), whose relative sizes are generally given in terms of their sedimentation coefficients, or S (Svedberg) values. Because S values are determined by shape as well as by molecular mass, their numerical values are not strictly additive. The prokaryotic 50S and 30S ribosomal subunits together form a ribosome with an S value of 70, not 80.

Figure-2- Three sites are created upon Ribosomal assembly

The ribosome has two main binding sites for tRNA molecules, the A and P sites, each of which extends over both subunits. Together they cover two neighboring codons. (See Figure-2)

During translation, the A site binds an incoming aminoacyl tRNA as directed by the codon currently occupying the site. This codon specifies the next amino acid to be added to the growing peptide chain.

The P site codon is occupied by the peptidyl tRNA. This tRNA carries the chain of amino acids that has already been synthesized.

There is also an E site, which is occupied by the empty tRNA that is about to exit the ribosome (Figure-2).

 Transfer RNA

At least one specific type of tRNA is required per amino acid. Two sites on the tRNA are important: the amino acid attachment site and the anticodon site. Because of their ability to carry a specific amino acid and to recognize the codons for that amino acid, tRNAs are called adapter molecules (Figure-3).

Figure-3- Secondary structure of transfer RNA (tRNA)

Amino acids are activated before incorporation and this activation is brought about by amino acyl t RNA synthetase in the presence of ATP. There is at least one amino acyl t RNA synthetase per amino acid. The carboxyl group of the amino acid is esterified to the 3’hydroxyl group of the t RNA.

Steps of Protein Synthesis

The process of protein synthesis is divided into 3 stages

i) Initiation

ii) Elongation

iii) Termination

i)  Initiation-Initiation of the protein synthesis involves the assembly of the components of the translation system before the peptide bond formation occurs. These components include

  • Two ribosomal subunits
  • mRNA
  • Aminoacyl tRNA specified by the first codon in the message
  • GTP
  • Initiation factors that facilitate the assembly of this initiation complex. In prokaryotes three initiation factors are known (IF-1, IF-2 and IF-3), while in eukaryotes there are at least 9, designated eIF to indicate their eukaryotic origin.

Ribosomal assembly and formation of Initiation complex

The small ribosomal subunit binds to Initiation Factor 3 (IF3). The small subunit/IF3 complex then binds to the mRNA. Specifically, it binds to a purine-rich sequence (consensus AGGAGG) known as the Shine-Dalgarno sequence, which lies a short distance upstream of the start codon in most prokaryotic mRNAs. (Figure 4).

Meanwhile, the fmet tRNA binds to Initiation Factor 2 (IF2), which promotes binding of the tRNA to the start codon. (Figure 5)

Figure-4- Showing shine dalgarno sequence

Figure-5- Formation of complex between Initiator tRNA and Initiator factor-2

The small subunit/IF3 complex scans along the mRNA until it encounters the start codon. The tRNA/IF2 complex also binds to the start codon. This complex of the small ribosomal subunit, IF3, initiator tRNA, and IF2 is called the initiation complex. (Figure-6)
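
The two landmarks used during prokaryotic initiation, the Shine-Dalgarno sequence and the start codon, can be located in a message with a simple string search. This sketch assumes an exact AGGAGG match and takes the first AUG downstream of it as the start codon; real messages vary in both the sequence and its spacing from the start codon. The mRNA and the function name are hypothetical.

```python
SHINE_DALGARNO = "AGGAGG"
START_CODON = "AUG"


def locate_start(mrna: str):
    """Return (sd_index, start_index) as 0-based positions, or None.

    The start codon is taken to be the first AUG downstream of the
    Shine-Dalgarno sequence; the spacing between the two is ignored here.
    """
    sd_index = mrna.find(SHINE_DALGARNO)
    if sd_index == -1:
        return None
    start_index = mrna.find(START_CODON, sd_index + len(SHINE_DALGARNO))
    if start_index == -1:
        return None
    return sd_index, start_index


mrna = "GGCAUAGGAGGUUUAACAUGCCUGAAUAA"   # hypothetical prokaryotic mRNA
print(locate_start(mrna))
```

In this made-up message the codon after the start codon is CCU, a proline codon, so the second tRNA to enter would carry proline.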


Figure-6- Formation of initiation complex

At this point, the large ribosomal subunit joins in. A molecule of GTP is hydrolyzed, and the initiation factors are released. The ribosomal complex is now ready for protein synthesis.(Figure-7)

Figure-7 Complete ribosomal assembly

When the ribosome is assembled, two tRNA binding sites are created; these are designated ‘P’ and ‘A’ (P stands for Peptidyl, A stands for Aminoacyl). The initiator tRNA is in the P site, and the A site will be filled by the tRNA with the anticodon that is complementary to the codon next to the start. (In this case, it is the tRNA that binds proline.) Figure-8


Figure-8- Incorporation of first amino acid (Formylated methionine)

 ii) Elongation

 When the second tRNA base pairs with the appropriate codon in the mRNA, an enzyme activity called peptidyl transferase catalyzes the formation of a peptide bond between the two amino acids present (while breaking the bond between fMet and its tRNA). This activity is intrinsic to the 23S rRNA found in the large subunit. Since the rRNA catalyzes this process, it is referred to as a ribozyme. At this point, the whole ribosome shifts over one codon. This shift requires several elongation factors (not shown) and energy from the hydrolysis of GTP. The result of the shift is that the uncharged tRNA that was in the P site is ejected, and the tRNA that was in the A site is now in the P site. The A site is free to accept the tRNA molecule with the appropriate anticodon for the next codon in the mRNA. (Figure 9)

Figure-9- Elongation process involves incorporation of amino acids in the growing peptide chain as per the information on the messenger RNA.

The next tRNA base pairs with the next codon, and peptidyl transferase catalyzes the formation of a peptide bond between the new amino acid and the growing peptide chain. Once again, the ribosome shifts over, so that the uncharged tRNA is expelled, and the tRNA with the peptide chain occupies the P site. (This is why this site is called the ‘Peptidyl’ site: after the shift, it contains the tRNA with the growing peptide chain. The other site will accept a tRNA with an amino acid, hence the name ‘Aminoacyl’ site.) The process of shifting and peptide bond formation continues over and over until a termination codon is encountered. (Figure-10)


 Figure-10- The peptide chain elongation process terminates when the stop codon is encountered on the m RNA.

The elongation process is fairly rapid, with prokaryotic ribosomes able to add 15 amino acids to the growing polypeptide every second. The process is also relatively error-free. Only one mistake is made every 10,000 amino acids. For large proteins of 1000 amino acids, that would mean one wrong amino acid in every 10 polypeptides.
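
The figures quoted above can be turned into a short calculation. The sketch assumes that errors occur independently at each position, which is a simplification; the function names are hypothetical.

```python
ERROR_RATE = 1 / 10_000   # one wrong amino acid per 10,000 added (from the text)
RATE_AA_PER_S = 15        # prokaryotic elongation rate (from the text)


def expected_errors(length: int) -> float:
    """Average number of wrong residues in a chain of the given length."""
    return length * ERROR_RATE


def fraction_with_error(length: int) -> float:
    """Probability that a chain contains at least one wrong residue,
    assuming independent errors at each position."""
    return 1 - (1 - ERROR_RATE) ** length


def synthesis_time_s(length: int) -> float:
    """Seconds needed to elongate a chain of the given length."""
    return length / RATE_AA_PER_S


n = 1000
print(f"expected errors per chain: {expected_errors(n):.2f}")
print(f"chains with at least one error: {fraction_with_error(n):.1%}")
print(f"synthesis time: {synthesis_time_s(n):.0f} s")
```

For a protein of 1000 amino acids this gives an average of 0.1 errors per chain, so a little under one chain in ten carries a mistake, and synthesis takes roughly a minute.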

  iii) Termination

 When a termination codon enters the A site, translation halts. This is because there is no tRNA with an anticodon that is complementary to any of the stop codons. The release factor causes the translation complex to fall apart, and cleaves the polypeptide from the final tRNA. (Figure-11)

Figure-11- The release factors help in releasing the newly formed chain from the translation machinery

The polypeptide product is now free to function in the cell. The mRNA molecule is now available to be translated again. Very often, more than one ribosome will translate a single mRNA at the same time. One ribosome will initiate translation, and after it moves down the mRNA a bit, another ribosome will initiate, then another, and so on. The structure consisting of multiple ribosomes translating a single mRNA molecule is called a polysome. Eventually, the mRNA is degraded, and translation of that particular message will cease.
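
The whole cycle of initiation, elongation and termination can be summarised as reading the message codon by codon from the start codon to the first stop codon. The sketch below does only that; it does not model initiation factors, elongation factors or release factors, and it simply takes the first AUG in the message as the start codon. The mRNA and the function name are hypothetical.

```python
BASES = "UCAG"
# Standard genetic code in UCAG order; '*' is a stop codon.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_message(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon.

    Returns the peptide in one-letter code, amino terminus first.
    In prokaryotes the first residue (M) is formylmethionine.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]        # codon presented in the A site
        residue = CODON_TABLE[codon]
        if residue == "*":           # stop codon: no tRNA matches, chain released
            break
        peptide.append(residue)      # peptide bond formed, ribosome shifts one codon
    return "".join(peptide)


mrna = "GGCAUAGGAGGUUUAACAUGCCUGAAUAA"   # hypothetical prokaryotic mRNA
print(translate_message(mrna))
```

Because the message is read 5′ to 3′, the first amino acid added is the amino terminus of the protein and the last one added is the carboxyl terminus.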

 Eukaryotic Translation

Eukaryotic translation is very similar overall to prokaryotic translation. There are a few notable differences, which include the following:

  • Eukaryotic mRNAs do not contain a Shine-Dalgarno sequence. Instead, ribosomal subunits recognize and bind to the 5′ cap of eukaryotic mRNAs. In other words, the 5′ cap takes the place of the Shine-Dalgarno sequence.
  • Eukaryotes do not use formyl methionine as the first amino acid in every polypeptide; ordinary methionine is used. Eukaryotes do have a specific initiator tRNA, however.
  • Eukaryotic translation involves many more protein factors than prokaryotic translation (for example, eukaryotic initiation involves at least 10 factors, instead of the 3 in prokaryotes).

Inhibitors of protein synthesis

  • The tetracyclines (tetracycline, doxycycline, demeclocycline, minocycline, etc.) block bacterial translation by binding reversibly to the 30S subunit and distorting it in such a way that the anticodons of the charged tRNAs cannot align properly with the codons of the mRNA.

  • Puromycin structurally resembles aminoacyl-tRNA. It is incorporated into the growing peptide chain and thereby blocks further elongation.

  • Chloramphenicol inhibits prokaryotic peptidyl transferase.

  • Clindamycin and erythromycin bind irreversibly to a site on the 50S subunit of the bacterial ribosome and thus inhibit translocation.

  • Diphtheria toxin inactivates the eukaryotic elongation factor eEF-2 and thus prevents translocation.

Post Translational Modifications

The newly synthesized protein is modified to become functionally active. The various post-translational modifications are as follows:

1) Trimming- Trimming removes excess amino acids.

2) Covalent Modifications

a) Phosphorylation

b) Glycosylation

c) Hydroxylation

d) Gamma carboxylation

e) Isoprenylation

f) Methylation

g) Acetylation

h) Ubiquitination (marking for protein degradation)

  • Phosphorylation may activate or inactivate a protein, e.g. glycogen phosphorylase becomes active while glycogen synthase becomes inactive on phosphorylation.
  • Glycosylation targets a protein to become part of the plasma membrane or of lysosomes, or to be secreted from the cell.
  • Hydroxylation, as seen in collagen, is required for acquiring the three-dimensional structure and for imparting strength.
  • Gamma carboxylation of glutamic acid residues of prothrombin takes place in the presence of vitamin K.
  • Methylation or acetylation of histones regulates gene expression.
  • Proteins that are defective or destined for turnover are marked for destruction by attachment of the small protein ubiquitin. Proteins marked in this way are degraded by a cellular component known as the proteasome.

 3)  Subunit Aggregation- Individual polypeptide subunits assemble to form the functional protein. Examples are immunoglobulins, hemoglobin and the maturation of collagen. Failure of post-translational modification impairs the functional capacity of the protein.

