Interactions between biomolecules, such as proteins, DNA, or metabolites, are essential to their function. In this talk I will present our recent work towards predicting such interactions and learning about the determinants of their specificity. I will focus mostly on three selected examples:
(i) We combined a sequence-alignment-based coevolution approach with a statistical framework using Expectation-Maximization. This allows us to predict protein interactions, and the residues involved in them, purely from the sequences of the proteins involved. Polyketide synthases are used as an example application.
(ii) Plant sesquiterpene synthases display an enormous variation in product specificity, which neither sequence similarity nor phylogeny describes well. An approach combining homology modelling with machine learning enables the prediction of product specificity. This in turn makes it possible to mine genome datasets for sequences of interest, and it indicates residues important for product specificity.
(iii) Crossovers represent the reciprocal exchange of genetic information between homologous nonsister chromatids. We developed a machine learning approach to predict crossovers in various plant genomes based on descriptors such as sequence motifs and DNA shape. By examining the features relevant for prediction, our approach provides insight into the determinants of meiotic recombination in various species.