Welcome to Biology
-Biology is the science of life
Welcome to Biochemistry
-Biochemistry is the study of the chemical aspects of biology
Welcome to the Protein World
-Proteins are the workhorse molecules and major players of life
Where are we?
About Myself
Xiangjun (Frank) Liu
Ph.D. in Molecular Biology; M.S. in Biochemistry and EECS
Tel: 62792997  Email: frankliu@tsinghua.edu.cn
About the TAs
Xinmiao Fu, 62789386, fuxinmiao00@mails.tsinghua.edu.cn
Mingjie Liu, 62772241, liumingjie99@mails.tsinghua.edu.cn
About My Classes
20% Quiz: taken at the beginning of the 3rd class
30% Paper: due 12/01 (a hard copy plus a soft copy by email, both to the TA)
50% Final Exam
Q&A sessions: this, next, and the last Thursday, 3-5 pm, Qiangzhai (强斋) 253
http://166.111.16.113 or the campus network
Proteins are polymers (多聚体) of amino acids.
-20 amino acids (氨基酸) -> millions of proteins with different properties and
activities.
Protein structures are studied at primary, secondary, tertiary, and quaternary
levels.
-α helix, β sheets, globular, complexes, denaturation and folding.
Proteins have widely diverse forms and functions.
-enzymes (酶), hormones (激素, 荷尔蒙), antibodies (抗体), transporters (转运蛋白),
muscle (肌肉), lens protein of eyes (眼睛晶状体), spider webs (蜘蛛网),
rhinoceros horn (犀牛角), antibiotics (抗生素), mushroom poisons (蘑菇毒素).
0. General Introduction
Chapter 5 Amino Acids, Peptides, and Proteins
Fireflies emit light in a reaction catalyzed by luciferase (荧光素酶), using ATP
A few examples of protein diversity…
Erythrocytes (红细胞) contain a large amount of
hemoglobin (血红蛋白), the oxygen-transporting protein.
The protein keratin (角蛋白) is the chief structural
component of hair, scales, horn, wool, nails, and feathers.
1. All natural proteins were found to be built
from a repertoire of 20 standard α-amino acids
The earliest studies of proteins focused on the free
amino acids derived from these proteins.
-The first amino acid (asparagine) was discovered in
1806 in asparagus (a green vegetable).
-The last (threonine) was not identified until 1938!
-All the amino acids were given trivial (common)
names, e.g., glutamate from wheat gluten (sticky) and
tyrosine from cheese ("tyros" in Greek).
1.1 The 20 α-amino acids share common structural
features.
-Each has a carboxyl group and an amino group
(in proline, a secondary amino, or imino, group) bonded
to the same carbon atom, designated the α-carbon.
In protein chemistry, we use Greek-letter nomenclature.
To find all Greek letters, go to:
http://www.astro.uiuc.edu/~kaler/sow/greek.html
-Each has a different side chain (or R group,
R = "Remainder of the molecule").
The α-carbons of 19 of them are asymmetric (chiral), so each of
these has two enantiomers. Glycine is not chiral.
RS system revisited (Chapter 3, p. 60):
The priorities of some common substituents are
-OCH3 > -OH > -NH2 > -COOH > -CHO > -CH2OH > -CH3 > -H
The chiral atom is viewed with the group of lowest priority pointing away from
the viewer. If the priority of the other three groups decreases in clockwise order,
the configuration is R (Latin rectus, "right"); if in counterclockwise order, the
configuration is S (Latin sinister, "left").
The horizontal bonds project out of the plane of the paper,
the vertical bonds behind it.
The two enantiomers of each amino acid defined by
the α-carbon are designated the D- and L-forms (named after
dextrorotatory 右旋 and levorotatory 左旋 glyceraldehyde).
Lined up by similarity: chiral carbon to chiral carbon, COO to CHO
The D- and L-forms of amino acids are named in
reference to the absolute configuration of D- and L-
glyceraldehyde (whose structure was originally
assumed and later confirmed by X-ray crystallography).
About amino acid chirality:
-Normally the α-carbon is the only chiral center,
with a few exceptions (e.g., threonine and isoleucine).
-There are ways to determine the D- and L-forms of
amino acids.
-Only the L-amino acids have been found in proteins
(D-isomers have been found only in small peptides of
bacterial cell walls and in some peptide antibiotics).
-The correlation of structure (configuration) with
optical rotation is very complex, and no simple rule has
been found to date (i.e., the D- and L- designations say
nothing about the direction of optical rotation!).
1.2 Each amino acid is given a three-letter abbreviation
and a one-letter symbol. These are usually the first three
letters and the first letter of the name. Where that would be
ambiguous, an alternative is used. These must be memorized.
1.3 All proteins in all species (from bacteria to humans)
are constructed from the same set of 20 amino acids.
-All proteins, no matter how different they are in
structure and function, are made from the 20
standard amino acids.
-This fundamental alphabet of the protein language
is at least two billion years old.
List of Amino Acids and Their Abbreviations
Amino acid       Three-letter code   One-letter code
Nonpolar (hydrophobic)
glycine          Gly                 G
alanine          Ala                 A
valine           Val                 V
leucine          Leu                 L
isoleucine       Ile                 I
methionine       Met                 M
phenylalanine    Phe                 F
tryptophan       Trp                 W
proline          Pro                 P
Polar (hydrophilic)
serine           Ser                 S
threonine        Thr                 T
cysteine         Cys                 C
tyrosine         Tyr                 Y
asparagine       Asn                 N
glutamine        Gln                 Q
Electrically charged (negative and hydrophilic)
aspartic acid    Asp                 D
glutamic acid    Glu                 E
Electrically charged (positive and hydrophilic)
lysine           Lys                 K
arginine         Arg                 R
histidine        His                 H
Color key on the slide: yellow amino acids contain sulfur; blue amino acids can be phosphorylated.
2. The 20 amino acids are usually grouped according
to the properties (mainly polarity) of their R groups
2.1 Six amino acids have nonpolar, aliphatic
(hydrophobic) R groups.
-They are Gly, Ala, Val, Leu, Ile, and Met.
-Gly has a hydrogen as its R group, giving
minimal steric hindrance.
-In protein structure, Gly offers the most
flexibility!
-Ala, Val, Leu, Ile, and Met have hydrocarbon R
groups (Met also contains a thioether sulfur) and are
often involved in hydrophobic interactions.
Gly, G, 甘氨酸
Ala, A, 丙氨酸
Val, V, 缬氨酸
Leu, L, 亮氨酸
Ile, I, 异亮氨酸
Met, M, 蛋氨酸 (甲硫氨酸)
2.2 Ser, Thr, Asn, Gln, Cys, and Pro have polar,
uncharged R groups.
-The R groups are more hydrophilic, due to the presence
of hydroxyl groups, sulfur atoms, or amide groups.
-The -SH groups of two Cys residues in a protein can be oxidized to
form a covalent disulfide bond.
-Cys often participates in hydrophobic interactions.
-Pro has an imino group instead of an amino group,
forming a five-membered ring that is rigid in
conformation.
-Pro is often found in the bends of folded protein chains
and is often present on the surface of proteins. It offers the
least flexibility.
Ser, S, 丝氨酸
Thr, T, 苏氨酸
Cys, C, 半胱氨酸
Pro, P, 脯氨酸
Asn, N, 天冬酰胺
Gln, Q, 谷氨酰胺
Cystine (胱氨酸) is a dimer of cysteine (半胱氨酸)
2.3 Phe, Tyr, and Trp have aromatic (芳香族) R
groups
-Phe and Tyr both have benzene rings.
-Trp has an indole ring.
-All three participate in hydrophobic interactions.
-The -OH group in Tyr is an important functional
group in proteins (phosphorylation, hydrogen bonding,
etc.).
-Trp and Tyr (and, to a much lesser extent, Phe) are
responsible for the light absorption of proteins at 280 nm,
a property used as a measure of protein concentration.
Phe, F, 苯丙氨酸; Tyr, Y, 酪氨酸; Trp, W, 色氨酸
Lambert-Beer law:
A = log(I0/I) = εcl
ε: extinction coefficient;
c: concentration;
l: optical path length
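As a rough illustration of how the Lambert-Beer law is used to estimate protein concentration from absorbance at 280 nm, the sketch below solves A = εcl for c. The function names and the extinction coefficient in the example are assumptions made for illustration, not values from the lecture.

```python
import math


def absorbance_from_intensities(i0, i):
    """A = log10(I0 / I): incident intensity over transmitted intensity."""
    return math.log10(i0 / i)


def concentration_from_absorbance(absorbance, epsilon, path_length_cm=1.0):
    """Solve A = epsilon * c * l for c (mol/L).

    epsilon is the molar extinction coefficient in M^-1 cm^-1.
    """
    if epsilon <= 0 or path_length_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_length_cm)


# Hypothetical protein with an assumed epsilon(280 nm) of 50,000 M^-1 cm^-1,
# measured in a 1 cm cuvette.
c = concentration_from_absorbance(0.5, 50_000.0)
print(f"estimated concentration: {c:.2e} M")
```

The estimate is only as good as the extinction coefficient, which depends on the Trp and Tyr content of the particular protein.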
2.4 Positively charged (basic) and negatively
charged (acidic) R groups
-The most hydrophilic R groups are those that are either positively
or negatively charged.
-Asp and Glu have a carboxyl group in their R groups. They carry a
net negative charge at pH 7.0 and are thus usually called
aspartate and glutamate (the conjugate base names) instead of
aspartic acid and glutamic acid (the un-ionized forms).
-Arg, Lys, and His have positively charged R groups at pH
7.0.
-Their R groups contain guanidino (胍基), amino, and
imidazole (咪唑基) groups, respectively.
-Near pH 7.0, the side chain of His can be positively charged or
uncharged, depending on the local environment.
Lys, K, 赖氨酸; Arg, R, 精氨酸; His, H, 组氨酸
Asp, D, 天冬氨酸; Glu, E, 谷氨酸
2.5 Hydropathy index values reflect the tendency of an amino
acid to seek a hydrophilic environment ("-" values) or a
hydrophobic environment ("+" values).
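To make the sign convention concrete, here is a minimal sketch that averages hydropathy index values over a sequence. The lecture does not say which scale it uses; the widely used Kyte-Doolittle values are assumed here, and the function name is illustrative.

```python
# Kyte-Doolittle hydropathy values (assumed scale; one-letter codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Average hydropathy index of a peptide.

    Positive values suggest a preference for a hydrophobic environment,
    negative values for a hydrophilic one.
    """
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


# Made-up peptides for illustration only.
print(mean_hydropathy("LIVAL"))   # hydrophobic stretch: positive
print(mean_hydropathy("KDERK"))   # charged stretch: negative
```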
3. Nonstandard amino acids are found in certain proteins,
generally as a result of post-translational modifications.
-These modifications are made after the standard amino acids
have been incorporated into proteins.
-4-Hydroxyproline and 5-hydroxylysine are found in collagen (胶原).
-γ-Carboxyglutamate is found in the blood-clotting protein
prothrombin (凝血酶原, an enzyme precursor).
-Desmosine (锁链素) is a covalent cross-link made from four
Lys side chains in elastin (弹性蛋白).
-Selenocysteine (硒代半胱氨酸, sulfur replaced by selenium) is found in
many enzymes (it has been recognized as the 21st amino
acid in ribosome-mediated protein synthesis!).
-Many additional nonstandard amino acids are found in cells
but not in proteins (e.g., ornithine 鸟氨酸 and citrulline 瓜氨酸,
intermediates in amino acid metabolism).
4. The amino acids ionize in aqueous solutions.
-Crystalline amino acids (crystallized from neutral aqueous
solutions) have melting points much higher than
those of other organic molecules of similar size.
-The amino acids ionize to various states depending
on the pH.
-The amino acids with neutral side chains exist
predominantly as dipolar ions, known as
zwitterions (German for "hybrid ions").
-Amino acids can act as both acids and bases. The
zwitterionic forms of amino acids are ampholytes.
Amino acids can be diprotic or triprotic acids.
4.1 Amino acids, being both weak acids and weak bases,
have characteristic titration curves and pKa values
4.2 Monoamino monocarboxylic α-amino acids (e.g.,
Gly, Ser, Phe, which have no ionizable R groups) all have
similar two-stage titration curves.
-The first stage reflects the deprotonation of the
α-COOH group (pK1).
-The second stage reflects the deprotonation of the
α-NH3+ group (pK2).
-The pKa of the α-COOH group is more than 2.0
units lower than that of acetic acid (pKa 4.76);
that is, it is a stronger (though still weak) acid.
-These amino acids have two buffering
regions.
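The two-stage titration described above can be sketched with the Henderson-Hasselbalch relation. This is a minimal illustration, not the lecture's own method: the function names are hypothetical, and the glycine pK values (2.34 and 9.60) are commonly tabulated textbook values assumed here.

```python
def fraction_deprotonated(ph, pka):
    """Henderson-Hasselbalch: fraction of a group in its conjugate-base form."""
    return 1.0 / (1.0 + 10 ** (pka - ph))


def net_charge_two_groups(ph, pk1, pk2):
    """Net charge of an amino acid with no ionizable R group.

    The alpha-COOH contributes 0 when protonated and -1 when deprotonated;
    the alpha-NH3+ contributes +1 when protonated and 0 when deprotonated.
    """
    carboxyl = -fraction_deprotonated(ph, pk1)
    amino = 1.0 - fraction_deprotonated(ph, pk2)
    return carboxyl + amino


# Glycine, with assumed pK1 = 2.34 and pK2 = 9.60.
for ph in (1.0, 2.34, 5.97, 9.60, 12.0):
    print(f"pH {ph:5.2f}: net charge {net_charge_two_groups(ph, 2.34, 9.60):+.3f}")
```

The net charge is about +0.5 at pH = pK1 and about -0.5 at pH = pK2, the midpoints of the two buffering regions.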
4.3 Acidic and basic amino acids have three-stage
titration curves. The additional stage is for the
ionizable group on the side chain (pKR).
4.4 There is a specific pH (designated pI) at which an
amino acid carries equal positive and negative charge.
-An amino acid does not move in an electric field at
its pI, called the isoelectric point.
-The pI of monoamino monocarboxylic amino
acids reflects a state in which the α-COOH group
is fully deprotonated but the α-NH3+ group has
not yet started deprotonating
pI = (pK1+pK2)/2
4.5 The pI of an acidic amino acid reflects a
state in which the α-COOH is fully deprotonated but
the side chain -COOH and the α-NH3+ group have not
yet started deprotonating
pI = (pK1+pKR)/2
4.6 The pI of a basic amino acid reflects a state
in which the α-COOH is fully deprotonated and only the
more acidic of the two basic groups has lost its proton
(the imidazole side chain in His; the α-NH3+ group in Lys and Arg)
pI = (pKR+pK2)/2
4.7 Amino acids are positively charged at pH values
below their pI and negatively charged at pH values
above their pI.
An acidic amino acid: pI = (pK1+pKR)/2
A basic amino acid: pI = (pKR+pK2)/2
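The three pI formulas above can be collected into one small helper. This is a sketch under stated assumptions: the function name and the `side_chain` labels are hypothetical, and the pK values in the examples are commonly tabulated textbook values rather than figures from the lecture.

```python
def isoelectric_point(pk1, pk2, pkr=None, side_chain="none"):
    """pI as the mean of the two pKa values that flank the neutral species.

    pk1: alpha-COOH, pk2: alpha-NH3+, pkr: ionizable side chain (if any).
    side_chain is "none", "acidic", or "basic".
    """
    if side_chain == "none":
        return (pk1 + pk2) / 2
    if pkr is None:
        raise ValueError("pkr is required for acidic or basic amino acids")
    if side_chain == "acidic":
        return (pk1 + pkr) / 2
    if side_chain == "basic":
        return (pkr + pk2) / 2
    raise ValueError("side_chain must be 'none', 'acidic', or 'basic'")


# Assumed textbook pK values (pK1, pK2, pKR).
print("Gly", isoelectric_point(2.34, 9.60))
print("Asp", isoelectric_point(1.88, 9.60, 3.65, "acidic"))
print("Lys", isoelectric_point(2.18, 8.95, 10.53, "basic"))
print("His", isoelectric_point(1.82, 9.17, 6.00, "basic"))
```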
5. Amino acids covalently join one another to
form peptides
5.1 The α-carboxyl group of one amino acid joins with
the α-amino group of another amino acid by a peptide
bond (actually an amide bond)
5.1.1 This is a condensation reaction in which a
water molecule is eliminated.
5.1.2 ΔG of the condensation reaction is about +5
kcal/mol, so it cannot occur spontaneously (an
endergonic reaction).
5.1.3 The condensation reaction can occur
repeatedly to form oligopeptides (fewer than 50 aa),
polypeptides (between 50 and 100 aa), and proteins (longer).
5.2 The peptide chain is directional.
5.2.1 An amino acid unit in a peptide
chain is called a residue.
5.2.2 The end having a free α-amino
group is called the amino-terminal or N-terminal end.
5.2.3 The end having a free α-carboxyl
group is called the carboxyl-terminal or C-terminal end.
5.2.4 By convention, the N-terminus is
taken as the beginning of the peptide chain and
is written at the left (C-terminus at the right).
Biosynthesis starts from the N-terminus.
5.2.5 The peptide chain consists of a
regularly repeating main chain (or backbone)
and the variable side chains of the residues.
5.2.6 Amino acid residues have an order,
or sequence, in a peptide.
5.2.7 The 20 amino acids are analogous to
the 26 letters of the English alphabet; the number of
different peptides that can be made from them is practically unlimited.
5.3 The size of a peptide can be described by its
total number of residues (e.g.,a pentapeptide,an
octapeptide) or relative molecular mass
(molecular weight).
5.3.1 The mean molecular weight of an
amino acid residue in a peptide is ~110 daltons.
5.3.2 Most natural polypeptide chains
contain between 50 and 2000 amino acid
residues and thus have relative molecular masses
between 5500 and 220,000 daltons (5.5
kDa to 220 kDa).
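The arithmetic behind these figures is a single multiplication by the mean residue mass. A minimal sketch follows, with hypothetical function names; it gives only a rough estimate, since real residue masses vary with composition.

```python
MEAN_RESIDUE_MASS_DA = 110  # approximate mean mass of one residue, in daltons


def estimate_mass_da(n_residues):
    """Approximate molecular mass of a polypeptide from its residue count."""
    return n_residues * MEAN_RESIDUE_MASS_DA


def estimate_residue_count(mass_da):
    """Approximate residue count from a measured molecular mass."""
    return round(mass_da / MEAN_RESIDUE_MASS_DA)


for n in (50, 2000):
    print(f"{n} residues ~ {estimate_mass_da(n) / 1000:.1f} kDa")
```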
5.4 Each peptide has a characteristic
titration curve and an isoelectric point (pI).
5.4.1 The titration curve of a peptide
reflects the collective behavior of all its
acid-base groups.
5.4.2 A peptide does not move in
an electric field at its pI (determined by IEF,
isoelectric focusing).
5.4.3 In the 2D gels used in proteomics, one
dimension is IEF separation by pI; the other is
SDS-PAGE separation by molecular weight.
This tetrapeptide has one free α-amino group, one free
α-carboxyl group, plus two ionizable R groups.
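One way to picture the "collective behavior" of a peptide's acid-base groups is to sum the Henderson-Hasselbalch charge of every ionizable group and search for the pH where the total is zero. The sketch below is illustrative only: the pKa values are free-amino-acid textbook values used as stand-ins (real pKa values shift inside a peptide), and the sequence AEGK is a hypothetical tetrapeptide with two ionizable R groups, not necessarily the one on the slide.

```python
# Assumed pKa values (free amino acid values used as approximations).
POSITIVE_SIDE_CHAINS = {"K": 10.53, "R": 12.48, "H": 6.00}
NEGATIVE_SIDE_CHAINS = {"D": 3.65, "E": 4.25, "C": 8.18, "Y": 10.07}
N_TERM_PKA = 9.60
C_TERM_PKA = 2.34


def net_charge(sequence, ph):
    """Sum of the fractional charges of all ionizable groups at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))    # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # deprotonated C-terminus
    for aa in sequence.upper():
        if aa in POSITIVE_SIDE_CHAINS:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_SIDE_CHAINS[aa]))
        elif aa in NEGATIVE_SIDE_CHAINS:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_SIDE_CHAINS[aa] - ph))
    return charge


def peptide_pi(sequence, lo=0.0, hi=14.0, tol=1e-4):
    """Bisection search for the pH at which the net charge is zero.

    Works because the net charge decreases monotonically with pH.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(sequence, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(f"estimated pI of AEGK: {peptide_pi('AEGK'):.2f}")
```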
6. Many short peptides have important biological
activities
6.1 Some short peptides, such as neuropeptides, act as
neurotransmitters, neurohormones, and neuromodulators.
6.1.1 These peptides are secreted (分泌) by
neurons.
6.1.2 The LHRH-like (luteinizing hormone-releasing
hormone, 促黄体激素释放激素) decapeptide (10 residues)
acts as a neurotransmitter in frog sympathetic ganglia
(交感神经节).
6.1.3 Thyrotropin-releasing factor (3 residues) is
formed in the hypothalamus (下丘脑) and stimulates the
release of thyrotropin (促甲状腺素) from the anterior
pituitary gland (前垂体腺).
An artificial sweetener better known as
aspartame or NutraSweet
6.1.4 Oxytocin (垂体后叶催产素) (9 residues)
is secreted by the posterior pituitary and
stimulates uterine contraction (子宫收缩).
6.1.5 The opioid peptides (鸦片样活性肽)
(mainly the enkephalins 脑啡肽,
endorphins 内啡肽, and dynorphins 强啡肽) have
been implicated in the control of pain, responses
to stress, and other functions.
6.1.6 Some drugs, like morphine (吗啡) and
heroin (海洛因), produce their addictive effects by
binding to opioid peptide receptors!
6.2 Some short peptides act as antibiotics
6.2.1 Gramicidin (短杆菌肽) A (15 residues)
is a well-studied peptide antibiotic (from Bacillus
brevis). Its structure has been determined.
6.2.2 It contains alternating L- and D-amino
acid residues.
6.2.3 It is not synthesized on ribosomes!
6.2.4 Gramicidin S (10 residues, cyclic) is
another antibiotic, also from Bacillus brevis.
6.2.5 Peptide antibiotics have also been
found in frog skin, neutrophils (嗜中性细胞),
and insects.
6.3 Many short peptides are used as defensive
poisons.
6.3.1 α-Amanitin (鹅膏蕈碱) (8 residues,
cyclic), found in mushrooms, is an extremely toxic
peptide (inhibiting RNA polymerases II and III
at picomolar levels!)
6.3.2 Very toxic short peptides are also
found in snake and spider venoms.
6.4 Many vertebrate hormones are small polypeptides.
6.4.1 Insulin (胰岛素) (51 residues) is produced by
the pancreas and acts to lower the blood glucose level after
food intake.
6.4.2 Glucagon (胰高血糖素) (29 residues) is also
produced by the pancreas and acts to increase the
blood glucose level.
6.4.3 Corticotropin (促肾上腺皮质素) (or
adrenocorticotropin, 39 residues) is produced by the
anterior pituitary gland and stimulates the growth of the
adrenal cortex and the secretion of corticosteroids
(皮质类固醇激素).
6.4.4 Vasopressin (加压素) (9 residues) stimulates
the reabsorption of water in the distal tubules of the
kidney (patients with diabetes insipidus are deficient in
vasopressin).
6.5 Many such bioactive peptides are present in
exceedingly small amounts (and are thus difficult to
discover!) and act at very low concentrations.
Bioactive short peptides can be selected from
random peptide libraries (made by
chemical synthesis, combinatorial chemistry, or
phage display). Very little is known about the
receptors of these bioactive peptides. They should
be good potential drug targets.
7. Proteins in general
7.1 While "protein" and "polypeptide" are often
used interchangeably, polypeptides generally have
a MW of less than 10,000, and proteins may have
multiple polypeptide subunits.
7.2 Proteins have
characteristic amino acid
compositions
7.3 Some proteins contain chemical groups other
than amino acids, which are called prosthetic groups;
these proteins are referred to as conjugated proteins,
while the amino acid part alone is called the apoprotein.
7.4 Protein structure is studied at different levels.
More details will be discussed in Chapter 6.
8. Proteins may be separated from each other and
thus purified by chromatography
8.1 Ion exchange chromatography
8.1.1 Each of a protein's building-block
amino acids has a different pI value.
Therefore, each amino acid has a different
net charge at a given pH, and the same holds
for different proteins.
8.1.2 The variously charged proteins bind to
charged synthetic resins with various
affinities.
8.1.2.1 When the resin is positively charged,
negatively charged proteins (or other anions)
will bind,and vice versa.
8.1.2.2 Proteins having the same charge as
the resin will not bind.
8.1.2.3 The positively charged resin is called an
anion-exchange resin, while the negatively
charged one is called a cation-exchange resin
(e.g., sulfonated polystyrene).
8.1.2.4 The resin (serving as the stationary
phase) is usually packed in a column
(providing the mechanical support and stable
fluid flow).
8.1.3 The bound proteins can be eluted by running
a pH or salt gradient (serving as the mobile phase).
8.1.3.1 Proteins are eluted in order of
their binding affinity (the more strongly bound ones being
eluted later).
8.1.3.2 This way of separating proteins (or other
charged biomolecules) is called ion-exchange
chromatography.
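The binding rule in 8.1.2 can be written as a small decision function: a protein is net negative above its pI and net positive below it, so the buffer pH decides which resin it binds. The function name and the example pI values below are hypothetical, and real binding also depends on salt concentration and charge distribution.

```python
def resin_that_binds(protein_pi, buffer_ph):
    """Return the type of ion-exchange resin a protein should bind, or None.

    Above its pI a protein is net negative and binds the positively charged
    (anion-exchange) resin; below its pI it is net positive and binds the
    negatively charged (cation-exchange) resin.
    """
    if buffer_ph > protein_pi:
        return "anion-exchange"
    if buffer_ph < protein_pi:
        return "cation-exchange"
    return None  # no net charge at pH == pI, so no binding is expected


# Hypothetical proteins in a pH 7.0 buffer.
for name, pi_value in (("acidic protein", 5.0), ("basic protein", 9.5)):
    print(name, "->", resin_that_binds(pi_value, 7.0))
```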
8.2 Chromatography is a method of separating
substances by allowing them to partition between two
phases, one mobile, one stationary (differences in
charge, size, hydrophobic interactions, and specific
interactions can be exploited for separation
by chromatography).
A. J. P. Martin and R. L. M. Synge won the Nobel Prize
in Chemistry in 1952 for inventing partition chromatography.
8.3 The mobile phase can percolate through the
column at low pressure or high pressure.
8.3.1 To operate under high pressure,
specially designed resins and apparatus (the
pumps and the plumbing system) are needed.
8.3.2 Using high pressure allows better
separation in a much shorter period of time,thus
named High Performance Liquid
Chromatography (HPLC).
Figure panels: column chromatography; cation-exchange column; gel filtration column; affinity chromatography
Protein purification normally begins with
cheap steps,and then more expensive ones for
smaller sample sizes.
Activity versus specific activity
9. Proteins can be analyzed by pI and electrophoresis
9.1 IEF (isoelectric focusing) separates proteins by pI
9.2 SDS-PAGE separates proteins by MW
9.3 Combining IEF and SDS-PAGE (2D gel
electrophoresis) permits the resolution of complex
mixtures of proteins and is widely used in proteomics.
10. The amino acid sequence of short
polypeptide chains can be determined by
chemical methods.
10.1 The amino acid sequence of a peptide chain is
the identity and linking order of its amino acid
residues. No other property so clearly
distinguishes one peptide from another.
10.2 Sanger worked out the first amino acid
sequence of a protein (bovine insulin) in 1953.
He accomplished this by using
1-fluoro-2,4-dinitrobenzene (1-氟-2,4-二硝基苯) to
react with the N-terminal residues of
cleaved short peptides. 100 g of insulin
was consumed over ten years to
determine the sequence. The peptide
chains were cut into 150 fragments of
different lengths. He was awarded the
Nobel Prize in Chemistry in 1958 for this
breakthrough.
10.3 The amino acid sequence of a short peptide can be
efficiently determined by Edman degradation (埃德曼降解).
10.3.1 The uncharged terminal amino group
reacts with phenylisothiocyanate (苯异硫氰酸盐, 异硫氰酸)
to form a phenylthiocarbamyl (苯氨基硫代甲酰基)
peptide.
10.3.2 The N-terminal amino acid residue is
liberated as a cyclic phenylthiohydantoin (PTH)
derivative under mildly acidic conditions, leaving the rest
of the peptide chain intact.
10.3.3 The PTH derivative (and thus the amino acid
residue) can be identified by chromatographic methods.
10.3.4 The newly exposed N-terminal amino acid
residue can be identified by repeating the above
procedure.
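The cycle structure of Edman degradation (label the N-terminal residue, release it, identify it, repeat on the shortened chain) can be mimicked with simple string bookkeeping. This sketch models only the order in which residues are read, not the chemistry; the function name and the example sequence are hypothetical.

```python
def edman_cycles(peptide, n_cycles=None):
    """Yield (cycle number, released residue, remaining peptide).

    Each cycle removes exactly one residue from the N-terminal (left) end,
    leaving the rest of the chain intact for the next cycle.
    """
    remaining = peptide
    total = len(peptide) if n_cycles is None else min(n_cycles, len(peptide))
    for cycle in range(1, total + 1):
        released, remaining = remaining[0], remaining[1:]
        yield cycle, released, remaining


# Hypothetical pentapeptide, written N-terminus first.
for cycle, residue, rest in edman_cycles("GIVEQ"):
    print(f"cycle {cycle}: released {residue}, remaining {rest or '(none)'}")
```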
10.4 The N-terminal amino acid sequence of a
polypeptide chain can be easily obtained by using a fully
automated sequenator.
10.4.1 The machine is designed based on the
Edman degradation method.
10.4.2 The peptide is covalently linked to glass
beads through its carboxyl terminus. One cycle of the
Edman degradation is carried out in less than 2 hours.
10.4.3 Usually about 50 residues from the N-terminus can
be routinely determined by the sequenator, permitting
the complete sequencing of insulin in a day or two.
10.4.4 Less than a microgram (picomole quantities)
of the peptide is needed for such sequence determination.
11. Large proteins are cleaved into short peptides and
then sequenced (the "divide and conquer" strategy!)
11.1 Disulfide bonds (二硫键), if present, need to be
broken first.
11.1.1 The PTH-cysteines would not be released
if connected by disulfide bonds.
11.1.2 The disulfide bonds can be reduced by
dithiothreitol (DTT, 二硫苏糖醇) or
β-mercaptoethanol (巯基乙醇), and then alkylated with
iodoacetate (碘乙酸) to prevent re-formation of the
disulfide bonds. Adding iodoacetate in SDS-PAGE
can prevent cross-linking of subunits by disulfide
bonds.
11.2 Polypeptide chains are cleaved into short fragments
by chemical or enzymatic methods and then sequenced
by the Edman method.
11.2.1 Cyanogen bromide (CNBr) cleaves
polypeptides on the carboxyl side of methionine residues.
11.2.2 A set of proteases cleave peptide chains
adjacent to specific amino acid residues: trypsin,
on the carboxyl side of Arg (精氨酸) and
Lys (赖氨酸); chymotrypsin (胰凝乳蛋白酶, 糜蛋白酶),
on the carboxyl side of Phe (苯丙氨酸), Tyr (酪氨酸),
and Trp (色氨酸).
11.2.3 The fragments produced need to be
separated (purified) by chromatographic (色谱分析)
or electrophoretic (电泳) methods before they can be
sequenced.
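The cleavage rules in 11.2.1 and 11.2.2 all have the same shape: cut on the carboxyl side of certain residues. A minimal sketch of that rule follows. It is deliberately simplified (for example, it ignores the reduced cleavage seen when the next residue is proline), and the function name, constants, and test sequence are hypothetical.

```python
# Residues after which each reagent cuts (carboxyl side), per the rules above.
TRYPSIN = "KR"
CHYMOTRYPSIN = "FYW"
CNBR = "M"


def cleave_after(sequence, residues):
    """Split a sequence on the carboxyl side of every listed residue."""
    fragments, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in residues:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal fragment
    return fragments


peptide = "AKGFMRW"  # made-up sequence for illustration
print("trypsin:     ", cleave_after(peptide, TRYPSIN))
print("chymotrypsin:", cleave_after(peptide, CHYMOTRYPSIN))
print("CNBr:        ", cleave_after(peptide, CNBR))
```

Because the three reagents cut at different positions, their fragment sets overlap, which is what allows the fragments to be ordered.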
11.3 The polypeptide has to be cleaved by at least
two sets of reagents to establish the order of the short
peptides in the polypeptide.
11.3.1 A second set of short peptides
overlapping (重叠) the first set is needed to put
the short peptides in the correct order.
11.3.2 If the second set fails to provide
appropriate overlapping sequences, a third or
even further cleavage is needed.
11.4 The positions of disulfide bonds, if present,
need to be located.
This can be accomplished by comparing
patterns of peptide fragments on electrophoresis
gels with and without breaking the disulfide
bonds.
12. The amino acid sequences of many proteins
are currently deduced from their gene or cDNA
(互补 DNA) sequences
12.1 The amino acid sequence of a protein is
encoded by its corresponding gene. Every
three bases constitute a codon of the genetic code, which is
translated into a specific amino acid in the
polypeptide chain.
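Table of the Standard Genetic Code
(rows: first base; columns: second base; Ter = termination codon)
    T             C             A             G
T   TTT Phe (F)   TCT Ser (S)   TAT Tyr (Y)   TGT Cys (C)
    TTC Phe (F)   TCC Ser (S)   TAC Tyr (Y)   TGC Cys (C)
    TTA Leu (L)   TCA Ser (S)   TAA Ter       TGA Ter
    TTG Leu (L)   TCG Ser (S)   TAG Ter       TGG Trp (W)
C   CTT Leu (L)   CCT Pro (P)   CAT His (H)   CGT Arg (R)
    CTC Leu (L)   CCC Pro (P)   CAC His (H)   CGC Arg (R)
    CTA Leu (L)   CCA Pro (P)   CAA Gln (Q)   CGA Arg (R)
    CTG Leu (L)   CCG Pro (P)   CAG Gln (Q)   CGG Arg (R)
A   ATT Ile (I)   ACT Thr (T)   AAT Asn (N)   AGT Ser (S)
    ATC Ile (I)   ACC Thr (T)   AAC Asn (N)   AGC Ser (S)
    ATA Ile (I)   ACA Thr (T)   AAA Lys (K)   AGA Arg (R)
    ATG Met (M)   ACG Thr (T)   AAG Lys (K)   AGG Arg (R)
G   GTT Val (V)   GCT Ala (A)   GAT Asp (D)   GGT Gly (G)
    GTC Val (V)   GCC Ala (A)   GAC Asp (D)   GGC Gly (G)
    GTA Val (V)   GCA Ala (A)   GAA Glu (E)   GGA Gly (G)
    GTG Val (V)   GCG Ala (A)   GAG Glu (E)   GGG Gly (G)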
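Deducing a protein sequence from DNA amounts to a table lookup, three bases at a time. The sketch below builds the standard genetic code in the same T, C, A, G order as the table and translates a coding-strand sequence until the first termination codon. The function name and the example sequences are illustrative assumptions.

```python
from itertools import product

BASES = "TCAG"
# One-letter amino acids for the 64 codons in TCAG order; "*" marks Ter.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding-strand DNA sequence, stopping at the first Ter codon.

    Reading starts at position 0; trailing bases that do not complete a
    codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# A single base change in the second codon turns Glu (E) into Val (V).
print(translate("ATGGAGTAA"))
print(translate("ATGGTGTAA"))
```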
12.2 Genes encoding specific proteins are routinely
isolated (cloned) in molecular biology laboratories.
12.2.1 Sequencing DNA is much easier (faster
and more accurate) than sequencing a polypeptide.
12.2.2 Amino acid sequences of proteins are
mostly deduced from their DNA sequences nowadays!
12.2.3 The partial amino acid sequence of a
protein can be used to isolate its gene.
12.2.4 Disulfide bonds cannot be deduced from
DNA sequences and have to be determined directly.
12.2.5 Proteins can be studied much more efficiently
when their genes are available! New techniques for
large-scale processing are being developed (e.g., mass
spectrometry [质谱]).
A mass spectrometer precisely measures the ratio of
the mass to the electric charge of a particle
The mass of a particle can be calculated from the
peaks of the spectrum
Tandem MS (MS/MS) can give sequence information
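As a sketch of how a mass can be calculated from the peaks, assume positive-ion electrospray, where a peptide of neutral mass M carrying z extra protons appears at m/z = (M + z × 1.007276)/z. That ionization model, the function names, and the example mass are assumptions for illustration; the lecture does not specify them.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in daltons


def mz(neutral_mass, charge):
    """m/z of a species carrying `charge` extra protons."""
    return (neutral_mass + charge * PROTON_MASS_DA) / charge


def neutral_mass_from_mz(mz_value, charge):
    """Invert mz(): recover the neutral mass from one peak and its charge."""
    return charge * (mz_value - PROTON_MASS_DA)


def charge_from_adjacent_peaks(mz_high, mz_low):
    """Estimate z for the higher-m/z peak, given the neighboring (z + 1) peak.

    From mz_high - H = M/z and mz_low - H = M/(z + 1) it follows that
    z = (mz_low - H) / (mz_high - mz_low).
    """
    return (mz_low - PROTON_MASS_DA) / (mz_high - mz_low)


# Hypothetical peptide of neutral mass 5733.0 Da seen at charges 4+ and 5+.
peak_4, peak_5 = mz(5733.0, 4), mz(5733.0, 5)
z = round(charge_from_adjacent_peaks(peak_4, peak_5))
print(z, neutral_mass_from_mz(peak_4, z))
```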
13. The function of a protein depends on its amino acid sequence
13.1 Each separate type of protein has a unique
amino acid sequence.
Each of the about 4289 different proteins in
an E. coli cell, and each of the ~40,000 in a human being,
has a different amino acid sequence.
13.2 Many human genetic diseases have been
traced to the deficiency of a single enzyme or
protein. The deficiency of many such enzymes is
found to be caused by a single nucleotide
polymorphism (SNP) resulting in a change of a single
amino acid residue, indicating that protein functions
are determined by their amino acid sequences.
SNP and Pharmacogenomics
Sickle cell anemia:
hemoglobin β6, GAG (Glu) -> GTG (Val)
Anemia (Mediterranean anemia, β-thalassemia, or Cooley's anemia):
underproduction or absence of hemoglobin β
SNP as Marker in Disease Gene Discovery
Linkage disequilibrium; association study; re-sequencing
13.3 Proteins that have similar functions but come from
different species are found to be very similar in amino
acid sequence (cytochrome c, myoglobin, etc.);
sequence homology serves as a basis for phylogenetic trees.
Conservation and variation of cytochrome c sequences:
27 invariant residues (yellow); conservative substitutions are in blue,
and nonconservative or variable residues are unshaded
13.4 A newly revealed amino acid sequence (of
well-studied or unknown proteins) is usually
compared with a large bank of stored
sequences.
13.4.1 Thousands of sequences have been
revealed and stored in computerized databases
(e.g., Swiss-Prot, PIR, TrEMBL).
13.4.2 Sequence similarity usually reveals
functional relatedness.
13.4.3 Evolution can be studied quantitatively
at the molecular level: phylogenetic trees are made
by using the number of residues that differ.
13.4.4 Highly conserved residues usually play
important roles in protein structure and/or function.
13.4.5 Homologous proteins generally share the same
three-dimensional structure (the
sequence-structure-function paradigm, unless the structure is flexible
and non-unique).
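Counting the residues that differ between aligned sequences, as in 13.4.3, is the simplest distance measure on which a phylogenetic tree can be built. The sketch below assumes the sequences are already aligned to equal length; the species labels and sequence fragments are invented for illustration and are not real cytochrome c data.

```python
def count_differences(seq_a, seq_b):
    """Number of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def distance_matrix(sequences):
    """Pairwise difference counts for a dict of name -> aligned sequence."""
    names = list(sequences)
    return {
        (a, b): count_differences(sequences[a], sequences[b])
        for a in names
        for b in names
    }


# Invented aligned fragments (hypothetical, for illustration only).
fragments = {
    "species_1": "GDVEKGKKIF",
    "species_2": "GDVEKGKKIF",
    "species_3": "GDIEKGKKIF",
    "species_4": "GSAEKGATLF",
}
matrix = distance_matrix(fragments)
print(matrix[("species_1", "species_3")], matrix[("species_1", "species_4")])
```

Fewer differences suggest a more recent common ancestor, so species_3 would be placed closer to species_1 than species_4 is.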
14. Peptides can be synthesized chemically
14.1 Peptides of up to 150 residues can be synthesized
by automated solid-phase methods, invented mainly by
R. Bruce Merrifield (who won the 1984 Nobel Prize in
Chemistry for this accomplishment).
14.1.1 Amino acids are added stepwise to a
growing peptide chain that is linked to an insoluble
matrix, such as polystyrene (聚苯乙烯) beads.
14.1.2 A major advantage is that the desired
product at each stage is bound to the insoluble beads, while
other chemicals are easily filtered and washed away.
14.1.3 The synthesis starts by attaching the
C-terminal amino acid to the insoluble beads through its
α-carboxyl group. This is the reverse of the direction of
biosynthesis.
14.1.4 The α-amino group of the next amino acid
to be added is protected and its carboxyl group activated.
The amino group is protected by the t-butyloxycarbonyl
group (t-Boc) and deprotected by CF3COOH. The
carboxyl group is activated by dicyclohexylcarbodiimide
(DCC).
14.1.5 The peptide bond is formed by the free
α-amino group (deprotected) of the resin-bound C-terminal
residue attacking the DCC-activated α-carboxyl group of
the free amino acid in solution.
14.1.6 After washing away unreacted free amino
acids and other reagents, steps 14.1.4 and 14.1.5 are
repeated. The synthesized peptide is eventually cleaved
from the resin by adding HF.
14.2 The efficiency of this solid-phase synthesis is much
lower than that of biosynthesis in living organisms.
14.2.1 Synthesizing a 100-residue peptide
takes about 4 days on a fully automated
machine, with reasonable yield.
14.2.2 The same peptide would be synthesized
with exquisite fidelity in about 5 seconds in a bacterial
cell!
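Why yield limits the length of synthetic peptides becomes clear once the per-step yield is compounded over every coupling. The sketch below assumes each coupling succeeds independently with the same yield; the per-step yields used are hypothetical, since the lecture only says "reasonable yield". The time comparison uses the 4 days and 5 seconds quoted above.

```python
def overall_yield(step_yield, n_residues):
    """Fraction of chains that are full length after stepwise synthesis.

    A peptide of n residues needs n - 1 coupling steps; each step is assumed
    to succeed with the same, independent probability `step_yield`.
    """
    if not 0.0 <= step_yield <= 1.0:
        raise ValueError("step_yield must be between 0 and 1")
    return step_yield ** (n_residues - 1)


# Hypothetical per-step yields for a 100-residue peptide.
for y in (0.96, 0.99, 0.998):
    print(f"step yield {y:.1%}: overall {overall_yield(y, 100):.1%}")

# Speed comparison from the figures above: 4 days versus about 5 seconds.
speed_ratio = (4 * 24 * 3600) / 5
print(f"the bacterial cell is roughly {speed_ratio:,.0f} times faster")
```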
14.3 Peptides with natural activity have been
synthesized chemically.
14.3.1 Complete bovine insulin was first
synthesized, and shown to be identical to natural
insulin, in China in 1965!
14.3.2 Merrifield also synthesized interferon
(155 aa) and ribonuclease (124 aa).
We will discuss protein structure in the next chapter