GeneLens: A Python Package Implementing Monte Carlo Machine Learning and Network Analysis Methods for Biomarker Discovery and Gene Functional Annotation
- Authors: Osmak G.Z.1,2, Pisklova M.V.1,2
- Affiliations:
- Chazov National Medical Research Center for Cardiology
- Pirogov Russian National Research Medical University
- Issue: Vol 59, No 5 (2025)
- Pages: 845-854
- Section: Bioinformatics
- URL: https://genescells.com/0026-8984/article/view/696392
- DOI: https://doi.org/10.31857/S0026898425050096
- ID: 696392
Abstract
We present GeneLens, a Python package for the comprehensive analysis of differentially expressed genes and for biomarker discovery. The package consists of two core modules: FSelector, which identifies biomarkers using Monte Carlo simulations of L1-regularized models, and NetAnalyzer, which predicts the functions of selected gene sets from the topology of their protein-protein interaction networks. FSelector includes: (1) automated gene selection through iterative bootstrap sampling; (2) calculation of gene significance weights that account for both the ROC-AUC performance of the models and the number of simulations in which each gene is selected; (3) adaptive thresholding to reduce the feature space. NetAnalyzer performs pathway enrichment analysis that integrates the significance weights from FSelector. Distributed as a pip-installable package, GeneLens provides standardized algorithms for applying machine learning and network analysis methods in differential gene expression studies, along with automated model hyperparameter tuning and visualization tools.
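The three FSelector steps named above can be illustrated with a minimal sketch. It is not the GeneLens API: the function names `monte_carlo_l1_weights` and `select_by_threshold`, the weight formula (the sum of out-of-bag ROC-AUC values over the runs that kept a gene, divided by the number of runs), the fixed-fraction threshold rule, and the synthetic data are all assumptions made for illustration. The sketch uses scikit-learn's L1-penalized logistic regression as one possible L1-regularized model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score


def monte_carlo_l1_weights(X, y, n_iter=100, C=0.1, seed=0):
    """Hypothetical helper: AUC-weighted selection frequency per feature."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    auc_sum = np.zeros(n_features)            # sum of AUCs over runs that kept the feature
    counts = np.zeros(n_features, dtype=int)  # number of runs that kept the feature
    n_valid = 0
    for _ in range(n_iter):
        # Bootstrap: draw samples with replacement, evaluate on the out-of-bag rest.
        train = rng.integers(0, n_samples, size=n_samples)
        oob = np.setdiff1d(np.arange(n_samples), train)
        if len(np.unique(y[train])) < 2 or len(np.unique(y[oob])) < 2:
            continue  # ROC-AUC is undefined unless both classes are present
        model = LogisticRegression(penalty="l1", solver="liblinear", C=C)
        model.fit(X[train], y[train])
        auc = roc_auc_score(y[oob], model.decision_function(X[oob]))
        kept = np.flatnonzero(model.coef_[0])  # features with non-zero coefficients
        counts[kept] += 1
        auc_sum[kept] += auc
        n_valid += 1
    weights = auc_sum / max(n_valid, 1)
    return weights, counts


def select_by_threshold(weights, fraction=0.5):
    """Keep features whose weight is at least `fraction` of the top weight (assumed rule)."""
    if weights.max() <= 0:
        return np.array([], dtype=int)
    return np.flatnonzero(weights >= fraction * weights.max())


# Synthetic "expression matrix": 120 samples, 50 genes; only genes 0-2 carry signal.
rng = np.random.default_rng(42)
y = np.repeat([0, 1], 60)
X = rng.normal(size=(120, 50))
X[:, :3] += 2.0 * y[:, None]

weights, counts = monte_carlo_l1_weights(X, y)
selected = select_by_threshold(weights)
print("top genes by weight:", np.argsort(weights)[::-1][:5])
print("genes above threshold:", selected)
```

Under this weighting, a gene that survives L1 shrinkage in most bootstrap runs of well-performing models receives a weight close to the mean AUC, while a gene selected only occasionally receives a weight close to zero. The package's actual weighting and thresholding rules may differ.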
About the authors
G. Osmak
Chazov National Medical Research Center for Cardiology; Pirogov Russian National Research Medical University
Email: german.osmak@gmail.com
Moscow, 121552 Russia; Moscow, 117997 Russia
M. Pisklova
Chazov National Medical Research Center for Cardiology; Pirogov Russian National Research Medical University
Moscow, 121552 Russia; Moscow, 117997 Russia