TOUCAN: a framework for fungal biosynthetic gene cluster discovery.

Author(s): Almeida H, Palys S, Tsang A, Diallo AB

Fungal secondary metabolites (SMs) are an important source of numerous bioactive compounds largely applied in the pharmaceutical industry, as in the production of antibiotics and anticancer medications. The discovery of novel fungal SMs can potentially bene...

Article GUID: 33575642

Machine learning for biomedical literature triage.

Author(s): Almeida H, Meurs MJ, Kosseim L, Butler G, Tsang A

PLoS One. 2014;9(12):e115892

Article GUID: 25551575

mycoCLAP, the database for characterized lignocellulose-active proteins of fungal origin: resource and text mining curation support.

Author(s): Strasser K, McDonnell E, Nyaga C, Wu M, Wu S, Almeida H, Meurs MJ, Kosseim L, Powlowski J, Butler G, Tsang A

Database (Oxford). 2015;2015:

Article GUID: 25754864


Title: TOUCAN: a framework for fungal biosynthetic gene cluster discovery.
Authors: Almeida H, Palys S, Tsang A, Diallo AB
Link: https://www.ncbi.nlm.nih.gov/pubmed/33575642
DOI: 10.1093/nargab/lqaa098
Category: NAR Genom Bioinform
PMID: 33575642
Dept Affiliation: GENOMICS
1 Département d'Informatique, UQAM, Montréal, QC, H2X 3Y7, Canada.
2 Centre for Structural and Functional Genomics, Concordia University, Montréal, QC, H4B 1R6, Canada.

Description:

NAR Genom Bioinform. 2020 Dec; 2(4):lqaa098

Abstract

Fungal secondary metabolites (SMs) are an important source of numerous bioactive compounds largely applied in the pharmaceutical industry, as in the production of antibiotics and anticancer medications. The discovery of novel fungal SMs can potentially benefit human health. Identifying biosynthetic gene clusters (BGCs) involved in the biosynthesis of SMs can be a costly and complex task, especially due to the genomic diversity of fungal BGCs. Previous studies on fungal BGC discovery present limited scope and can restrict the discovery of new BGCs. In this work, we introduce TOUCAN, a supervised learning framework for fungal BGC discovery. Unlike previous methods, TOUCAN is capable of predicting BGCs on amino acid sequences, facilitating its use on newly sequenced and not yet curated data. It relies on three main pillars: rigorous selection of datasets by BGC experts; combination of functional, evolutionary and compositional features coupled with outperforming classifiers; and robust post-processing methods. TOUCAN best-performing model yields 0.982 F-measure on BGC regions in the Aspergillus niger genome. Overall results show that TOUCAN outperforms previous approaches. TOUCAN focuses on fungal BGCs but can be easily adapted to expand its scope to process other species or include new features.
