Lecture given by Xinshuai Zhang, Ph.D., University of Illinois.
With advances in genome sequencing technologies, we now have access to a large collection of protein sequences that nature continues to evolve. This abundance of protein sequences represents an unprecedented opportunity for the biomedical and biotechnological communities. Such data can be exploited to provide targets for chemotherapeutic agents and to enable advances in biocatalysis and biofuels. However, realizing this potential is confounded by the fact that reliable functions have been assigned to only a small and diminishing fraction of the sequences in the protein databases. To address this issue, the Enzyme Function Initiative (EFI), a large-scale, multi-institutional collaborative project, was launched, and an integrated “genomic enzymology” strategy was established. In particular, the “genomic enzymology” web tools for generating Sequence Similarity Networks (SSNs) and Genome Neighborhood Networks (GNNs) were developed. SSNs are used to visualize sequence-function relationships in protein families and to segregate a family into isofunctional clusters; GNNs enable a user to retrieve, display, and interrogate the genome contexts of members of isofunctional SSN clusters so that the enzyme components of metabolic pathways can be identified. The experimentally determined ligand specificity of a solute binding protein (SBP) identifies the first reactant in a metabolic pathway. In bacteria, the transporter genes are often co-located with the genes encoding the enzymes responsible for catabolism of the transported SBP ligand. Together, the ligand specificities of SBPs and the combined analysis of SSNs and GNNs facilitate the large-scale prediction of enzymatic activities and metabolic pathways.
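The way an SSN segregates a family into clusters can be illustrated with a minimal sketch. It assumes pairwise similarity scores are already available (for example, from an all-by-all sequence comparison) and treats each cluster as a connected component of the graph whose edges are the pairs scoring at or above a chosen threshold. The function name `ssn_clusters`, the score values, and the sequence labels are hypothetical; this is not the implementation used by the EFI web tools.

```python
from collections import defaultdict


def ssn_clusters(scores, threshold):
    """Return clusters (sorted lists of sequence IDs), largest first.

    scores: dict mapping (id_a, id_b) pairs to a similarity score (hypothetical input).
    An edge is drawn when score >= threshold; clusters are connected components.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving; registers unseen nodes.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in scores.items():
        root_a, root_b = find(a), find(b)  # both nodes are registered even without an edge
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(set)
    for node in parent:
        clusters[find(node)].add(node)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: (-len(c), c))


# Hypothetical scores for five sequences.
example_scores = {
    ("A", "B"): 120,
    ("B", "C"): 95,
    ("A", "C"): 40,
    ("C", "D"): 30,
    ("D", "E"): 110,
}

print(ssn_clusters(example_scores, 90))   # looser threshold: fewer, larger clusters
print(ssn_clusters(example_scores, 100))  # stricter threshold: clusters split apart
```

Raising the threshold splits clusters apart, which is the knob one would tune, under this simplified model, until each cluster plausibly corresponds to a single function.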
Guided by four TRAP SBPs that bind D/L-erythronate, we assigned novel ATP-dependent four-carbon acid sugar kinase functions to members of the Domain of Unknown Function 1537 protein family (PF07005); we also identified the related catabolic pathways that degrade D/L-threonate and D-erythronate. In addition, informed by the ligand specificities of three ABC SBPs for D-apiose, a ubiquitous branched-chain pentose in the rhamnogalacturonan-II (RG-II) of plant cell walls, we delineated four catabolic pathways for D-apiose/D-apionate: one non-oxidative transketolase pathway and three oxidative pathways. The non-oxidative transketolase pathway can also be utilized by organisms from the human gut microbiome. Significantly, the pathways involve several unusual enzymatic transformations. For example, members of the functionally diverse RuBisCO superfamily (PF00016) catalyze decarboxylation or transcarboxylation reactions, and members of the PGDH_C family (PF16896) catalyze the oxidative isomerization of D-apionate. The integrated “genomic enzymology” strategy, as demonstrated here, is a powerful approach for assigning enzymatic functions and elucidating metabolic pathways.