Query: UNIPROT:P06889 (Mol): 630,302 document(s) hit in 31,850,051 MEDLINE articles (0.00 seconds)
Protein sequences encoded in three complete bacterial genomes, those of Haemophilus influenzae, Mycoplasma genitalium and Synechocystis sp., and the first available archaeal genome sequence, that of Methanococcus jannaschii, were analysed using the BLAST2 algorithm and methods for amino acid motif detection. Between 75% and 90% of the predicted proteins encoded in each of the bacterial genomes and 73% of the M. jannaschii proteins showed significant sequence similarity to proteins from other species. The fraction of bacterial and archaeal proteins containing regions conserved over long phylogenetic distances is nearly the same and close to 70%. Functions of 70-85% of the bacterial proteins and about 70% of the archaeal proteins were predicted with varying precision. This contrasts with the previous report that more than half of the archaeal proteins have no homologues and shows that, with more sensitive methods and detailed analysis of conserved motifs, archaeal genomes become as amenable to meaningful interpretation by computer as bacterial genomes. The analysis of conserved motifs resulted in the prediction of a number of previously undetected functions of bacterial and archaeal proteins and in the identification of novel protein families. In spite of the generally high conservation of protein sequences, orthologues of 25% or less of the M. jannaschii genes were detected in each individual completely sequenced genome, supporting the uniqueness of archaea as a distinct domain of life. About 53% of the M. jannaschii proteins belong to families of paralogues, a fraction similar to that in bacteria with larger genomes, such as Synechocystis sp. and Escherichia coli, but higher than that in H. influenzae, which has approximately the same number of genes as M. jannaschii. Certain groups of proteins, e.g. molecular chaperones and DNA repair enzymes, thought to be ubiquitous and represented in the minimal gene set derived by bacterial genome comparison, are missing in M. jannaschii, indicating massive non-orthologous displacement of genes responsible for essential functions.
An unexpectedly large fraction of the M. jannaschii gene products, 44%, shows significantly higher similarity to bacterial than to eukaryotic proteins, compared with 13% that have eukaryotic proteins as their closest homologues (the rest of the proteins show approximately the same level of similarity to bacterial and eukaryotic homologues or have no homologues). Proteins involved in translation, transcription, replication and protein secretion are most closely related to eukaryotic proteins, whereas metabolic enzymes, metabolite uptake systems, enzymes for cell wall biosynthesis and many uncharacterized proteins appear to be 'bacterial'. A similar prevalence of proteins of apparent bacterial origin was observed among the currently available sequences from the distantly related archaeal genus, Sulfolobus. It is likely that the evolution of archaea included at least one major merger between ancestral cells from the bacterial lineage and the lineage leading to the eukaryotic nucleocytoplasm.
Mol Microbiol 1997 Aug
PMID:Comparison of archaeal and bacterial genomes: computer analysis of protein sequences predicts novel functions and suggests a chimeric origin for the archaea. 1074 90
Two paralogous, site-specific invertible loci, designated hsd1 and hsd2 (host specificity determinant), have been identified in the Mycoplasma pulmonis genome. They encode putative type I restriction and modification (R-M) systems with maximum sequence homology to the type IC family, which includes EcoR124II and EcoDXXI. Each locus encodes an endonuclease subunit (HsdR), a methylase subunit (HsdM) and two DNA specificity subunits (HsdS). The gene organization at each locus is such that hsdR and hsdM are flanked by two hsdS genes. Within each locus, one of the hsdS genes, hsdR and hsdM, is encoded in tandem by the same DNA strand, while the second hsdS gene is encoded by the complementary strand but without overlap with the other three hsd genes. The hsdR and hsdM sequences of one locus are almost identical to their counterparts in the other. The four hsdS genes (two per locus) are highly homologous at their 5' ends and also share sequence similarities in the 3' ends of their corresponding coding regions. Owing to the disposition of and sequence similarities among the hsdS genes, they form inverted repeats at each locus. Analysis by polymerase chain reaction (PCR) has shown that both loci behave as site-specific DNA invertible elements with multiple inversion sites, termed 'vipareetus', occurring within the hsdS genes. The inversions lead to a reassortment of hsdS sequences, generating an array of recombinant genes that probably encode S subunits possessing alternative DNA-binding specificities. Sequence information obtained from the analysis of hsd2 transcripts by 5' RACE (rapid amplification of cDNA ends) indicates that inversion induces the transcription of alternative hsdS genes by the relocation of coding sequences downstream of a promoter and ribosome-binding site (RBS) situated at one end of each locus.
Mol Microbiol 1997 Oct
PMID:The hsd loci of Mycoplasma pulmonis: organization, rearrangements and expression of genes. 938 94
To test the hypotheses that eubacterial genomes leave evolutionarily stable structures and that the variety of genome size is brought about through genome doubling during evolution, the genome structures of Haemophilus influenzae, Mycoplasma genitalium, Escherichia coli, and Bacillus subtilis were compared using the DNA sequences of the entire genome or substantial portions of genome. In these comparisons, the locations of orthologous genes were examined among different genomes. Using orthologous genes for the comparisons guaranteed that differences revealed in physical location would reflect changes in genome structure after speciation. We found that dynamic rearrangements have so frequently occurred in eubacterial genomes as to break operon structures during evolution, even after the relatively recent divergence between E. coli and H. influenzae. Interestingly, in such eubacterial genomes of high plasticity, we could find several highly conservative regions with the longest conserved region comprising the S10, spc, and alpha operons. This suggests that such exceptional conservative regions have undergone strong structural constraints during evolution.
J Mol Evol 1997
PMID:Genome plasticity as a paradigm of eubacteria evolution. 939 6
We have conducted genome sequence analyses of seven prokaryotic microorganisms for which completely sequenced genomes are available (Escherichia coli, Haemophilus influenzae, Helicobacter pylori, Bacillus subtilis, Mycoplasma genitalium, Synechocystis PCC6803 and Methanococcus jannaschii). We report the distribution of encoded known and putative polytopic cytoplasmic membrane transport proteins within these genomes. Transport systems for each organism were classified according to (1) putative membrane topology, (2) protein family, (3) bioenergetics, and (4) substrate specificities. The overall transport capabilities of each organism were thereby estimated. Probable function was assigned to greater than 90% of the putative transport proteins identified. The results show the following: (1) Numbers of transport systems in eubacteria are approximately proportional to genome size and correspond to 9.7 to 10.8% of the total encoded genes except for H. pylori (5.4%), Synechocystis (4.7%) and M. jannaschii (3.5%) which exhibit substantially lower proportions. (2) The distribution of topological types is similar in all seven organisms. (3) Transport systems belonging to 67 families were identified within the genomes of these organisms, and about half of these families are also found in eukaryotes. (4) 12% of these families are found exclusively in Gram-negative bacteria, but none is found exclusively in Gram-positive bacteria, cyanobacteria or archaea. (5) Two superfamilies, the ATP-binding cassette (ABC) and major facilitator (MF) superfamilies account for nearly 50% of all transporters in each organism, but the relative representation of these two transporter types varies over a tenfold range, depending on the organism. (6) Secondary, pmf-dependent carriers are 1.5 to threefold more prevalent than primary ATP-dependent carriers in E. coli, H. influenzae, H. pylori and B. subtilis while primary carriers are about twofold more prevalent in M. genitalium and Synechocystis. M. jannaschii exhibits a slight preference for secondary carriers.
(7) Bioenergetics of transport generally correlate with the primary forms of energy generated via available metabolic pathways but ecological niche and substrate availability may also be determining factors. (8) All organisms display a similar range of transport specificities with quantitative differences presumably reflective of disparate ecological niches. (9) M. jannaschii and Synechocystis have a two to threefold increased proportion of transporters for inorganic ions with a concomitant decrease in transporters for organic compounds. (10) 6 to 18% of all transporters in these bacteria probably function as drug export systems showing that these systems are prevalent in non-pathogenic as well as pathogenic organisms. (11) All seven prokaryotes examined encode proteins homologous to known channel proteins, but none of the channel types identified occurs in all of these organisms. (12) The phosphoenolpyruvate:sugar phosphotransferase system is prevalent in the large genome organisms, E. coli and B. subtilis, and is present in the small genome organisms, H. influenzae and M. genitalium, but is totally lacking in H. pylori, Synechocystis and M. jannaschii. Details of the information summarized in this article are available on our web sites, and this information will be periodically updated and corrected as new sequence and biochemical data become available.
J Mol Biol 1998 Apr 03
PMID:Microbial genome analyses: global comparisons of transport capabilities based on phylogenies, bioenergetics and substrate specificities. 953 81
The trigger factor is associated with bacterial ribosomes and catalyzes proline-limited protein folding reactions. Its folding activity is very high and conserved in evolution, as shown for the homologous enzymes from Escherichia coli and Mycoplasma genitalium. The folding protein substrate (a variant of ribonuclease T1) binds with high affinity to the trigger factors, and permanently unfolded proteins are strong, competitive inhibitors. We used this inhibition to characterize the substrate binding sites of the trigger factors. Unfolded alpha-lactalbumin binds very tightly and inhibits the trigger factor from M. genitalium with a KI value of 50 nM. The binding of inhibitory proteins is independent of proline residues, as shown for unfolded tendamistat, which binds to the trigger factor with equal affinity in the presence and in the absence of its three proline residues. The good inhibition by a non-folding variant of ribonuclease T1 that lacks Pro39 showed that this proline, at which the catalysis of folding occurs, is dispensable for substrate binding. The trigger factors cannot catalyze prolyl isomerization when proteins are partially folded already. They preferentially recognize unstructured protein chains, which bind with high affinity to a site distinct from the catalytic prolyl isomerase center in the FKBP domain.
J Mol Biol 1998 Apr 03
PMID:Recognition of protein substrates by the prolyl isomerase trigger factor is independent of proline residues. 953 90
We compare the frequency distribution of gene family sizes in the complete genomes of six bacteria (Escherichia coli, Haemophilus influenzae, Helicobacter pylori, Mycoplasma genitalium, Mycoplasma pneumoniae, and Synechocystis sp. PCC6803), two Archaea (Methanococcus jannaschii and Methanobacterium thermoautotrophicum), one eukaryote (Saccharomyces cerevisiae), the vaccinia virus, and the bacteriophage T4. The sizes of the gene families versus their frequencies show power-law distributions that tend to become flatter (have a larger exponent) as the number of genes in the genome increases. Power-law distributions generally occur as the limit distribution of a multiplicative stochastic process with a boundary constraint. We discuss various models that can account for a multiplicative process determining the sizes of gene families in the genome. In particular, we argue that, in order to explain the observed distributions, gene families have to behave in a coherent fashion within the genome; i.e., the probabilities of duplications of genes within a gene family are not independent of each other. Likewise, the probabilities of deletions of genes within a gene family are not independent of each other.
Mol Biol Evol 1998 May
PMID:The frequency distribution of gene family sizes in complete genomes. 958 Sep 88
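The power-law relationship described in this abstract (family frequency falling off as a power of family size) can be illustrated with a minimal sketch. It assumes a plain least-squares line fit in log-log space, which is a common first approximation and not necessarily the method the authors used. The function name `fit_power_law` and the toy counts are hypothetical, chosen only for illustration.

```python
import math

def fit_power_law(sizes, freqs):
    """Fit log(freq) = slope * log(size) + intercept by ordinary least squares.

    Returns (slope, intercept). For a power law freq = c * size**slope,
    the slope is the exponent and exp(intercept) is c.
    """
    if len(sizes) != len(freqs) or len(sizes) < 2:
        raise ValueError("need at least two (size, freq) pairs of equal length")
    xs = [math.log(s) for s in sizes]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy data (not from the paper): frequencies follow 1000 * size**-2 exactly.
toy_sizes = [1, 2, 4, 5, 10]
toy_freqs = [1000.0, 250.0, 62.5, 40.0, 10.0]
slope, intercept = fit_power_law(toy_sizes, toy_freqs)
print(f"estimated exponent: {slope:.3f}")
```

On real gene family counts, a log-log regression is sensitive to sparsely populated large families, so the estimate should be read as a rough guide only.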
Genome sequences are available for increasing numbers of organisms. The proteomes (protein complement expressed by the genome) of many such organisms are being studied with two-dimensional (2D) gel electrophoresis. Here we have investigated the application of short N-terminal and C-terminal sequence tags to the identification of proteins separated on 2D gels. The theoretical N and C termini of 15,519 proteins, representing all SWISS-PROT entries for the organisms Mycoplasma genitalium, Bacillus subtilis, Escherichia coli, Saccharomyces cerevisiae and human, were analysed. Sequence tags were found to be surprisingly specific, with N-terminal tags of four amino acid residues found to be unique for between 43% and 83% of proteins, and C-terminal tags of four amino acid residues unique for between 74% and 97% of proteins, depending on the species studied. Sequence tags of five amino acid residues were found to be even more specific. To utilise this specificity of sequence tags for protein identification, we created a world-wide web-accessible protein identification program, TagIdent (http://www.expasy.ch/www/tools.html), which matches sequence tags of up to six amino acid residues as well as estimated protein pI and mass against proteins in the SWISS-PROT database. We demonstrate the utility of this identification approach with sequence tags generated from 91 different E. coli proteins purified by 2D gel electrophoresis. Fifty-one proteins were unambiguously identified by virtue of their sequence tags and estimated pI and mass, and a further 11 proteins identified when sequence tags were combined with protein amino acid composition data. We conclude that the TagIdent identification approach is best suited to the identification of proteins from prokaryotes whose complete genome sequences are available. The approach is less well suited to proteins from eukaryotes, as many eukaryotic proteins are not amenable to sequencing via Edman degradation, and tag protein identification cannot be unambiguous unless an organism's complete sequence is available.
J Mol Biol 1998 May 08
PMID:Protein identification with N and C-terminal sequence tags in proteome projects. 960 Aug 41
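The tag-specificity measurement in this abstract (the share of proteins whose short N- or C-terminal tag is unique within a proteome) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the TagIdent implementation: the function name `unique_tag_fraction` and the four toy sequences are hypothetical, and pI, mass and signal-peptide processing are ignored.

```python
from collections import Counter

def unique_tag_fraction(sequences, tag_len=4, terminus="N"):
    """Fraction of sequences whose terminal tag occurs exactly once in the set.

    terminus="N" uses the first tag_len residues, "C" the last tag_len.
    Sequences shorter than tag_len are skipped.
    """
    if terminus not in ("N", "C"):
        raise ValueError("terminus must be 'N' or 'C'")
    tags = [
        seq[:tag_len] if terminus == "N" else seq[-tag_len:]
        for seq in sequences
        if len(seq) >= tag_len
    ]
    if not tags:
        return 0.0
    counts = Counter(tags)
    return sum(1 for tag in tags if counts[tag] == 1) / len(tags)

# Toy proteome (hypothetical sequences, not SWISS-PROT entries).
toy_proteome = ["MKTAYIAKQR", "MKTALLVGGS", "MSEQNNTEMT", "MAHHWGYGKH"]
print(unique_tag_fraction(toy_proteome, 4, "N"))
print(unique_tag_fraction(toy_proteome, 4, "C"))
```

In this toy set two sequences share the N-terminal tag MKTA, so N-terminal tags are less specific than C-terminal ones, which mirrors the direction of the difference reported in the abstract.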
Mycoplasma fermentans was reported as a common contaminant of cell cultures, and was shown to either induce or suppress several immunological functions. A strain of M. fermentans was recently isolated from a mouse T-lymphoma cell line, which differs from other M. fermentans strains by its growth characteristics and was designated (in the authors' records) as strain 609. Using the differential display technique (DD), a differentially expressed gene that was identified as the M. fermentans 609 ftsZ gene was isolated. Comparison of the nucleotide sequence of the M. fermentans 609 ftsZ gene to other ftsZ genes showed a 98% homology with Mycoplasma fermentans strain K7 and approximately 50% homology with Mycoplasma pulmonis and Mycoplasma genitalium. Comparison of the putative amino acid sequences of the FtsZ proteins showed similar homology. A polymerase chain reaction (PCR) assay to detect the presence of this ftsZ gene was established; it is a fast and convenient assay to detect infection of cells by the M. fermentans species. This work demonstrates that: (i) DD can be used as a useful technique to identify and isolate mycoplasmal genes from infected cells; and (ii) the ftsZ gene can be a useful marker to distinguish between different species of mycoplasma.
Mol Cell Probes 1998 Apr
PMID:The ftsZ gene as a tool for detection of Mycoplasma fermentans. 963 43
The DNA repair genes uvrC from Mycoplasma bovis and Mycoplasma agalactiae type strains were cloned and their nucleotide sequences were established. These sequences were used to design polymerase chain reaction (PCR) primer pairs for M. bovis and M. agalactiae. Each primer pair amplified a 1.6 kb fragment of the uvrC gene in the respective species. The specificity of the primer pairs for the two species was demonstrated through the lack of cross-amplifications in heterologous PCR reactions and in reactions using DNA from other mycoplasma species. Subsequent restriction enzyme analysis of the amplified uvrC gene segments from type and field strains of M. bovis and M. agalactiae showed that the uvrC genes are well conserved in both species but differ significantly between the two species. The diagnostic PCR assay enabled unambiguous identification of M. bovis and M. agalactiae strains isolated from geographically diverse places, even in cases where 16S rRNA gene sequence analysis was unable to discriminate between the two species.
Mol Cell Probes 1998 Jun
PMID:Species identification of Mycoplasma bovis and Mycoplasma agalactiae based on the uvrC genes by PCR. 966 78
Homology search techniques based on the iterative PSI-BLAST method in combination with various filters for low sequence complexity are applied to assign folds to all Mycoplasma genitalium proteins. The resulting procedure (implemented as a web server) is able to predict at least one domain in 37% of these proteins automatically, with an estimated accuracy higher than 98%. Taking structural features such as coiled coil or transmembrane regions aside, folds can be assigned to more than half of the globular proteins in a bacterium just by iterative sequence comparison.
J Mol Biol 1998 Jul 17
PMID:Homology-based fold predictions for Mycoplasma genitalium proteins. 966 39