Query: UNIPROT:P06889 (Mol)
630,302 document(s) hit in 31,850,051 MEDLINE articles (0.00 seconds)

Homology modelling is currently the only theoretical tool that can successfully predict protein 3D structure. As 3D structure is conserved within sequence families, homology modelling allows 3D structure to be predicted for 20% of SWISSPROT. 20% of the proteins in PDB are remote homologues to another PDB protein. Threading techniques attempt to predict such remote homologues based on sequence information. Here, a new threading method is presented. First, for a list of PDB proteins, 3D structure was projected onto 1D strings of secondary structure and relative solvent accessibility. Then, secondary structure and accessibility were predicted by neural network systems (PHD). Finally, the predicted and observed 1D strings were aligned by dynamic programming. The resulting alignment was used to detect remote 3D homologues. Four results stand out. Firstly, even for an optimal prediction (assignment based on known structure), only about half the hits that ranked above a given threshold were correctly identified as remote homologues; only about 25% of the first hits were correct. Secondly, real predictions (PHD) were not much worse: about 20% of the first hits were correct. Thirdly, a simple filtering procedure improved prediction performance to about 30% correct first hits. The correct hit ranked among the first three for more than 23 out of 46 cases. Fourthly, the combination of the 1D threading and sequence alignments markedly improved the performance of the threading method TOPITS for some selected cases.
Proc Int Conf Intell Syst Mol Biol 1995
PMID:TOPITS: threading one-dimensional predictions into three-dimensional structures. 758 54
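
The alignment step described in the abstract above (predicted versus observed 1D strings aligned by dynamic programming) can be illustrated with a minimal sketch. This is a plain global alignment (Needleman-Wunsch style) over secondary-structure strings. The function name align_1d and the scoring values (match +1, mismatch -1, gap -1) are assumptions for illustration only; the abstract does not give the scoring scheme TOPITS actually uses.

```python
def align_1d(predicted, observed, match=1, mismatch=-1, gap=-1):
    """Globally align two 1D structure strings (e.g. H/E/L states).

    Scoring values are illustrative assumptions, not the TOPITS parameters.
    Returns (score, aligned_predicted, aligned_observed).
    """
    n, m = len(predicted), len(observed)
    # score[i][j] = best score for predicted[:i] aligned with observed[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if predicted[i - 1] == observed[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + s,  # align the two states
                score[i - 1][j] + gap,    # gap in the observed string
                score[i][j - 1] + gap,    # gap in the predicted string
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    a, b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if predicted[i - 1] == observed[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                a.append(predicted[i - 1])
                b.append(observed[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            a.append(predicted[i - 1])
            b.append("-")
            i -= 1
        else:
            a.append("-")
            b.append(observed[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(a)), "".join(reversed(b))


if __name__ == "__main__":
    print(align_1d("HHEE", "HHLEE"))
```

In a threading setting, the predicted string of the query would be aligned against the observed string of every library structure and the hits ranked by score; a realistic scheme would also score solvent accessibility, which this sketch leaves out.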

When starved for nitrogen, MATa/MAT alpha cells of the budding yeast Saccharomyces cerevisiae undergo a dimorphic transition to pseudohyphal growth. A visual genetic screen, called PHD (pseudohyphal determinant), for S. cerevisiae pseudohyphal growth mutants was developed. The PHD screen was used to identify seven S. cerevisiae genes that when overexpressed in MATa/MAT alpha cells growing on nitrogen starvation medium cause precocious and unusually vigorous pseudohyphal growth. PHD1, a gene whose overexpression induced invasive pseudohyphal growth on a nutritionally rich medium, was characterized. PHD1 maps to chromosome XI and is predicted to encode a 366-amino-acid protein. PHD1 has a SWI4- and MBP1-like DNA binding motif that is 73% identical over 100 amino acids to a region of Aspergillus nidulans StuA. StuA regulates two pseudohyphal growth-like cell divisions during conidiophore morphogenesis. Epitope-tagged PHD1 was localized to the nucleus by indirect immunofluorescence. These facts suggest that PHD1 may function as a transcriptional regulatory protein. Overexpression of PHD1 in wild-type haploid strains does not induce pseudohyphal growth. Interestingly, PHD1 overexpression enhances pseudohyphal growth in a haploid strain that has the diploid polar budding pattern because of a mutation in the BUD4 gene. In addition, wild-type diploid strains lacking PHD1 undergo pseudohyphal growth when starved for nitrogen. The possible functions of PHD1 in pseudohyphal growth and the uses of the PHD screen to identify morphogenetic regulatory genes from heterologous organisms are discussed.
Mol Cell Biol 1994 Mar
PMID:Induction of pseudohyphal growth by overexpression of PHD1, a Saccharomyces cerevisiae gene related to transcriptional regulators of fungal development. 811 41

Secondary structure prediction recently has surpassed the 70% level of average accuracy, evaluated on the single residue states helix, strand and loop (Q3). But the ultimate goal is reliable prediction of tertiary (three-dimensional, 3D) structure, not 100% single residue accuracy for secondary structure. A comparison of pairs of structurally homologous proteins with divergent sequences reveals that considerable variation in the position and length of secondary structure segments can be accommodated within the same 3D fold. It is therefore sufficient to predict the approximate location of helix, strand, turn and loop segments, provided they are compatible with the formation of 3D structure. Accordingly, we define here a measure of segment overlap (Sov) that is somewhat insensitive to small variations in secondary structure assignments. The new segment overlap measure ranges from an ignorance level of 37% (random protein pairs) via a current level of 72% for a prediction method based on sequence profile input to neural networks (PHD) to an average 90% level for homologous protein pairs. We conclude that the highest scores one can reasonably expect for secondary structure prediction are a single residue accuracy of Q3 > 85% and a fractional segment overlap of Sov > 90%.
J Mol Biol 1994 Jan 07
PMID:Redefining the goals of protein secondary structure prediction. 828 37
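
The single-residue accuracy Q3 mentioned in the abstract above is the percentage of residues whose three-state assignment (helix, strand, loop) is predicted correctly. A minimal sketch follows, together with a helper that splits a state string into segments, the unit on which a segment-overlap measure such as Sov operates. The function names q3 and segments and the H/E/L letter coding are assumptions for illustration; the Sov formula itself is not reproduced here because the abstract does not state it.

```python
from itertools import groupby


def q3(predicted, observed):
    """Percentage of positions where predicted and observed states agree."""
    if len(predicted) != len(observed):
        raise ValueError("strings must have the same length")
    if not observed:
        raise ValueError("strings must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def segments(states):
    """Split a state string into (state, start, end) runs, end exclusive."""
    result = []
    position = 0
    for state, run in groupby(states):
        length = len(list(run))
        result.append((state, position, position + length))
        position += length
    return result


if __name__ == "__main__":
    print(q3("HHEL", "HHLL"))
    print(segments("HHHEEL"))
```

Two predictions with the same Q3 can differ greatly in how well their segments match the observed ones, which is the motivation the abstract gives for a segment-based measure.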

The far-ultraviolet circular dichroism spectrum of the alpha beta-tubulin dimer analyzed by six different methods indicates an average content of approximately 33% alpha helix, 21% beta sheet, and 45% other secondary structure. Deconvolution of Fourier transform infrared spectra indicates 24% sheet, 37% (maximum) helix, and 38% (minimum) other structure. Separate alignments of 75 alpha-tubulin, 106 beta-tubulin, and 14 gamma-tubulin sequences and 12 sequences of the bacterial cell division protein FtsZ have been employed to predict their secondary structures with the multiple-sequence method PHD [Rost, B., & Sander, C. (1993a) J. Mol. Biol. 232, 584-599]. The predicted secondary structures average 33% alpha helix, 24% beta sheet, and 43% loop for the alpha beta dimer. The predictions have been compared with sites of limited proteolysis by 12 proteases at the surfaces of the heterodimer and taxol-induced microtubules [de Pereda, J. M., & Andreu, J. M. (1996) Biochemistry 35, 14184-14202]. From 24 experimentally determined nicking sites, 18 are at predicted loops or at the extremes of secondary structure elements. Proteolysis zone A (including acetylable Lys40 and probably Lys60 in alpha-tubulin and Gly93 in beta-tubulin) and proteolysis zone B (extending between residues 167 and 183 in both chains) are accessible in microtubules. Proteolysis zone C, between residues 278 and 295, becomes partially occluded in microtubules. The alpha-tubulin nicking site Arg339-Ser340 is at a loop following a predicted alpha helix in proteolysis zone D. This site is protected in taxol microtubules; however, a new tryptic site appears which is probably located at the N-terminal end of the same helix. Zone D also contains beta-tubulin Cys354, which is accessible in microtubules. Proteolysis zone E includes the C-terminal hypervariable loops (10-20 residues) of each tubulin chain. 
These follow the two larger predicted helical zones (residues 372-395 and 405-432 in beta-tubulin), which also are the longer conserved part of the alpha- and beta-tubulin sequences. Through combination of this with other biochemical information, a set of surface and distance constraints is proposed for the folding of beta-tubulin. The FtsZ sequences are only 10-18% identical to the tubulin sequences. However, the predicted secondary structures show two clearly similar (85-87 and 51-78%) regions, at tubulin positions 95-175 and 305-350, corresponding to FtsZ 65-135 and 255-300, respectively. The first region is flanked by tubulin proteolysis zones A and B. It consists of a predicted loop1-helix-loop2-sheet-loop3-helix-loop4-sheet fold, which contains the motif (KR)GXXXXG (loop1), and the tubulin-FtsZ signature G-box motif (SAG)GGTG(SAT)G (loop3). A simple working model envisages loop1 and loop3 together at the nucleotide binding site, while loops 2 and 4 are at the surface of the protein, in agreement with proteolytic and antigenic accessibility results in tubulin. The model is compatible with studies of tubulin and FtsZ mutants. It is proposed that this region constitutes a common structural and evolutionary nucleus of tubulins and FtsZ which is different from typical GTPases.
...
PMID:Tubulin secondary structure analysis, limited proteolysis sites, and homology to FtsZ. 891 5

In protein fold recognition, a probe amino acid sequence is compared to a library of representative folds of known structure to identify a structural homolog. In cases where the probe and its homolog have clear sequence similarity, traditional residue substitution matrices have been used to predict the structural similarity. In cases where the probe is sequentially distant from its homolog, we have developed a (7 x 3 x 2 x 7 x 3) 3D-1D substitution matrix (called H3P2), calculated from a database of 119 structural pairs. Members of each pair share a similar fold, but have sequence identity less than 30%. Each probe sequence position is defined by one of seven residue classes and three secondary structure classes. Each homologous fold position is defined by one of seven residue classes, three secondary structure classes, and two burial classes. Thus the matrix is five-dimensional and contains 7 x 3 x 2 x 7 x 3 = 882 elements or 3D-1D scores. The first step in assigning a probe sequence to its homologous fold is the prediction of the three-state (helix, strand, coil) secondary structure of the probe; here we use the profile based neural network prediction of secondary structure (PHD) program. Then a dynamic programming algorithm uses the H3P2 matrix to align the probe sequence with structures in a representative fold library. To test the effectiveness of the H3P2 matrix a challenging, fold class diverse, and cross-validated benchmark assessment is used to compare the H3P2 matrix to the GONNET, PAM250, BLOSUM62 and a secondary structure only substitution matrix. For distantly related sequences the H3P2 matrix detects more homologous structures at higher reliabilities than do these other substitution matrices, based on sensitivity versus specificity plots (or SENS-SPEC plots). The added efficacy of the H3P2 matrix arises from its information on the statistical preferences for various sequence-structure environment combinations from very distantly related proteins. 
It introduces the predicted secondary structure information from a sequence into fold recognition in a statistical way that normalizes the inherent correlations between residue type, secondary structure and solvent accessibility.
J Mol Biol 1997 Apr 11
PMID:A 3D-1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence. 913 28
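
The size of the five-dimensional matrix described in the abstract above (7 x 3 x 2 x 7 x 3 = 882 scores) can be checked with a short sketch, which also shows one way to address a single score by its five class indices. The axis order used here (fold residue class, fold secondary structure, fold burial, probe residue class, probe secondary structure) and the names SHAPE and flat_index are assumptions for illustration; the abstract gives the dimensions but not a storage layout.

```python
from math import prod

# Assumed axis order: fold residue class (7), fold secondary structure (3),
# fold burial class (2), probe residue class (7), probe secondary structure (3).
SHAPE = (7, 3, 2, 7, 3)


def flat_index(index, shape=SHAPE):
    """Map a five-part class index to a row-major position in a flat list."""
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same number of axes")
    position = 0
    for i, size in zip(index, shape):
        if not 0 <= i < size:
            raise IndexError("class index out of range")
        position = position * size + i
    return position


if __name__ == "__main__":
    print(prod(SHAPE))                   # number of 3D-1D scores
    print(flat_index((6, 2, 1, 6, 2)))   # last element of the matrix
```

During alignment, each pairing of a probe position with a fold position would look up one such score, with the probe's secondary structure class taken from the prediction step.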

The accuracy of secondary structure prediction methods has been improved significantly by the use of aligned protein sequences. The PHD method and the NNSSP method reach 71 to 72% of sustained overall three-state accuracy when multiple sequence alignments are used with neural networks and nearest-neighbor algorithms, respectively. We introduce a variant of the nearest-neighbor approach that can achieve similar accuracy using a single sequence as the query input. We compute the 50 best non-intersecting local alignments of the query sequence with each sequence from a set of proteins with known 3D structures. Each position of the query sequence is aligned with the database amino acids in alpha-helical, beta-strand or coil states. The predicted type of secondary structure is selected as the type of aligned position with the maximal total score. On the dataset of 124 non-membrane non-homologous proteins, used earlier as a benchmark for secondary structure predictions, our method reaches an overall three-state accuracy of 71.2%. The performance accuracy is verified by an additional test on 461 non-homologous proteins giving an accuracy of 71.0%. The main strength of the method is the high level of prediction accuracy for proteins without any known homolog. Using multiple sequence alignments as input the method has a prediction accuracy of 73.5%. Prediction of secondary structure by the SSPAL method is available via Baylor College of Medicine World Wide Web server.
J Mol Biol 1997 Apr 25
PMID:Protein secondary structure prediction using local alignments. 914 39
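
The selection rule in the abstract above (each query position takes the state with the maximal total score among the database residues aligned to it) can be sketched as a per-position vote. The input format, the function name predict_states, and the coil default for positions covered by no alignment are assumptions for illustration; the local alignments themselves and their scores are taken as given rather than computed.

```python
from collections import defaultdict


def predict_states(query_length, aligned_hits, default="C"):
    """Pick, for each query position, the state with the highest summed score.

    aligned_hits is an iterable of (query_position, state, score) tuples,
    one per database residue aligned to that query position.
    Positions with no aligned residue receive the default state (assumed coil).
    """
    totals = [defaultdict(float) for _ in range(query_length)]
    for position, state, score in aligned_hits:
        totals[position][state] += score
    return "".join(
        max(t, key=t.get) if t else default
        for t in totals
    )


if __name__ == "__main__":
    hits = [(0, "H", 2.0), (0, "E", 1.5), (0, "E", 1.0), (1, "C", 0.5)]
    print(predict_states(3, hits))
```

Here position 0 is assigned strand because the two strand hits together outscore the single helix hit, which is the summing behaviour the abstract describes.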

The three-dimensional modelling of proteins is a useful tool to fill the gap between the number of sequenced proteins and the number of experimentally known 3D structures. However, when the degree of homology between the protein and the available 3D templates is low, model building becomes a difficult task and the reliability of the results depends critically on the correctness of the sequence alignment. For this reason, we have undertaken the modelling of human cytochrome P450 1A2 starting by a careful analysis of several sequence alignment strategies (multiple sequence alignments and the TOPITS threading technique). The best results were obtained using TOPITS followed by a manual refinement to avoid unlikely gaps. Because TOPITS uses secondary structure predictions, several methods that are available for this purpose (Levin, Gibrat, DPM, NnPredict, PHD, SOPM and NNSP) have also been evaluated on cytochromes P450 with known 3D structures. More reliable predictions on alpha-helices have been obtained with PHD, which is the method implemented in TOPITS. Thus, a 3D model for human cytochrome P450 1A2 has been built using the known crystal coordinates of P450 BM3 as the template. The model was refined using molecular mechanics computations. The model obtained shows a consistent location of the substrate recognition segments previously postulated for the CYP2 family members. The interaction of caffeine and a carcinogenic aromatic amine (MeIQ), which are characteristic P450 1A2 substrates, has been investigated. The substrates were solvated taking into account their molecular electrostatic potential distributions. The docking of the solvated substrates in the active site of the model was explored with the AUTODOCK programme, followed by molecular mechanics optimisation of the most interesting complexes. 
Stable complexes were obtained that could explain the oxidation of the considered substrates by cytochrome P450 1A2 and could offer an insight into the role played by water molecules.
J Comput Aided Mol Des 1997 Jul
PMID:Three-dimensional modelling of human cytochrome P450 1A2 and its interaction with caffeine and MeIQ. 933 5

The feasibility of predicting the global fold of small proteins by incorporating predicted secondary and tertiary restraints into ab initio folding simulations has been demonstrated on a test set comprised of 20 non-homologous proteins, of which one was a blind prediction of target 42 in the recent CASP2 contest. These proteins contain from 37 to 100 residues and represent all secondary structural classes and a representative variety of global topologies. Secondary structure restraints are provided by the PHD secondary structure prediction algorithm that incorporates multiple sequence information. Predicted tertiary restraints are derived from multiple sequence alignments via a two-step process. First, seed side-chain contacts are identified from correlated mutation analysis, and then a threading-based algorithm is used to expand the number of these seed contacts. A lattice-based reduced protein model and a folding algorithm designed to incorporate these predicted restraints are described. Depending upon fold complexity, it is possible to assemble native-like topologies whose coordinate root-mean-square deviation from native is between 3.0 A and 6.5 A. The requisite level of accuracy in side-chain contact map prediction can be roughly 25% on average, provided that about 60% of the contact predictions are correct within +/-1 residue and 95% of the predictions are correct within +/-4 residues. Precision in tertiary contact prediction is more critical than absolute accuracy. Furthermore, only a subset of the tertiary contacts, on the order of 25% of the total, is sufficient for successful topology assembly. Overall, this study suggests that the use of restraints derived from multiple sequence alignments combined with a fold assembly algorithm holds considerable promise for the prediction of the global topology of small proteins.
J Mol Biol 1998 Mar 27
PMID:Fold assembly of small proteins using monte carlo simulations driven by restraints derived from multiple sequence alignments. 951 47

Using a recently developed protein folding algorithm, a prediction of the tertiary structure of the KIX domain of the CREB binding protein is described. The method incorporates predicted secondary and tertiary restraints derived from multiple sequence alignments in a reduced protein model whose conformational space is explored by Monte Carlo dynamics. Secondary structure restraints are provided by the PHD secondary structure prediction algorithm that was modified for the presence of predicted U-turns, i.e., regions where the chain reverses global direction. Tertiary restraints are obtained via a two-step process: First, seed side-chain contacts are identified from a correlated mutation analysis, and then, a threading-based algorithm expands the number of these seed contacts. Blind predictions indicate that the KIX domain is a putative three-helix bundle, although the chirality of the bundle could not be uniquely determined. The expected root-mean-square deviation for the correct chirality of the KIX domain is between 5.0 and 6.2 A. This is to be compared with the estimate of 12.9 A that would be expected by a random prediction, using the model of F. Cohen and M. Sternberg (J. Mol. Biol. 138:321-333, 1980).
...
PMID:Tertiary structure prediction of the KIX domain of CBP using Monte Carlo simulations driven by restraints derived from multiple sequence alignments. 951 44

Wolf-Hirschhorn syndrome (WHS) is a malformation syndrome associated with a hemizygous deletion of the distal short arm of chromosome 4 (4p16.3). The smallest region of overlap between WHS patients, the WHS critical region, has been confined to 165 kb, of which the complete sequence is known. We have identified and studied a 90 kb gene, designated as WHSC1, mapping to the 165 kb WHS critical region. This 25 exon gene is expressed ubiquitously in early development and undergoes complex alternative splicing and differential polyadenylation. It encodes a 136 kDa protein containing four domains present in other developmental proteins: a PWWP domain, an HMG box, a SET domain also found in the Drosophila dysmorphy gene ash-encoded protein, and a PHD-type zinc finger. It is expressed preferentially in rapidly growing embryonic tissues, in a pattern corresponding to affected organs in WHS patients. The nature of the protein motifs, the expression pattern and its mapping to the critical region led us to propose WHSC1 as a good candidate gene to be responsible for many of the phenotypic features of WHS. Finally, as a serendipitous finding, of the t(4;14) (p16.3;q32.3) translocations recently described in multiple myelomas, at least three breakpoints merge the IgH and WHSC1 genes, potentially causing fusion proteins replacing WHSC1 exons 1-4 by the IgH 5'-VDJ moiety.
Hum Mol Genet 1998 Jul
PMID:WHSC1, a 90 kb SET domain-containing gene, expressed in early development and homologous to a Drosophila dysmorphy gene maps in the Wolf-Hirschhorn syndrome critical region and is fused to IgH in t(4;14) multiple myeloma. 961 63

