BICOB-2023: Papers with Abstracts

Abstract. Epithelial-to-Mesenchymal Transition (EMT) plays a key role in epithelial cancer. The state trajectory of its underlying Gene Regulatory Network (GRN) includes three fixed-point attractors characterizing the epithelial, senescent, and mesenchymal cell phenotypes, which imply specific cell-to-cell and cell-to-tissue interactions. The interplay between the GRN driving EMT and the one regulating the Mammalian Cell Cycle (MCC) influences cancer-related cell growth and proliferation. We describe the characteristics of the network that arises from interconnecting the gene regulatory networks associated with EMT and MCC. Our purpose is twofold: first, to elucidate the dynamical properties of cancer-related gene regulatory networks; second, to propose a computational methodology for addressing the interconnection of networks related to cancer. Our approach is based on the feedback-based interconnection of networks described in discrete Boolean terms.
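To make the notion of a fixed-point attractor concrete, the following minimal sketch enumerates the fixed points of a small Boolean network under synchronous update. The three-gene network, its update rules, and the function names `step` and `fixed_points` are hypothetical illustrations; they are not the EMT or MCC networks from the paper.

```python
from itertools import product

# Hypothetical three-gene toy network (NOT the EMT or MCC network).
# A and B repress each other; C is on only when A is on and B is off.
def step(state):
    a, b, c = state
    return (not b, not a, a and not b)

def fixed_points(update, n_genes):
    """Return every state that the synchronous update maps to itself."""
    states = product([False, True], repeat=n_genes)
    return [s for s in states if update(s) == s]

if __name__ == "__main__":
    # Each fixed point plays the role of a stable phenotype in this toy model.
    for s in fixed_points(step, 3):
        print(s)
```

Exhaustive enumeration visits all 2^n states, so it is only practical for small networks; it is used here because it is easy to verify by hand.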
Abstract. A multi-omics dataset combining clinical features with the discovery of biomarkers could contribute significantly to the timely identification of mortality risk and the development of personalized therapies for a wide range of diseases, including cancer and stroke. In addition, new advances in “omics” technologies open up many possibilities for researchers to find disease biomarkers through system-level analysis. Because the integrative analysis of multi-omics data is challenging due to biological complexity, machine learning methods, especially those based on tensor decomposition (TD-based), are becoming more popular. It is therefore important to identify future research directions and opportunities in biomarker discovery using tensor decompositions in multi-omics datasets by integrating literature reviews. This article systematically reviews the research trends from 2015 to 2022. Several themes are discussed, including the challenges and problems of developing and applying tensor decompositions, application areas for biomarker discovery in “omics” datasets, proposed methodologies, key evaluation criteria used to decide whether new methods are effective, and the limitations and shortcomings of this field, which call for further research and development. This review helps researchers who are interested in this field understand what research has already been done and where potential areas for future research might lie.
Abstract. We explore here the systems-based regulatory mechanisms that determine human blood pressure patterns, in the context of the reported negative association between hypertension and COVID-19 disease. We are particularly interested in the key role played by angiotensin converting enzyme 2 (ACE2), one of the first identified receptors that enable the entry of the SARS-CoV-2 virus into a cell. Taking into account the two main systems involved in the regulation of blood pressure, that is, the Renin-Angiotensin system and the Kallikrein-Kinin system, we follow a bottom-up systems biology modeling approach to build the discrete Boolean model of the gene regulatory network that underlies both the typical hypertensive phenotype and the hypotensive/normotensive phenotype. These phenotypes correspond to the dynamic attractors of the regulatory network, which is modeled on the basis of publicly available experimental information. Our model recovers the observed phenotypes and shows the key role played by the inflammatory response in the emergence of hypertension.
Abstract. Protein functions are strongly related to their 3D structure. Therefore, it is crucial to identify their structure to understand how they behave. Studies have shown that many proteins, called transmembrane (TM) proteins, span a biological membrane, and many of them adopt an alpha-helical shape. Unlike current contact prediction methods, which use inductive learning to predict inter-helical residue contacts in transmembrane proteins, we adopt a transductive learning approach. Transductive learning can be very useful when the test set is much bigger than the training set, which is usually the case in amino acid residue contact prediction. We test this approach on a set of transmembrane protein sequences to identify helix-helix residue contacts, compare the transductive and inductive approaches, and identify the conditions and limitations under which the transductive SVM (TSVM) outperforms the inductive SVM. In addition, we investigate the performance degradation of the traditional TSVM and explore the solutions proposed in the literature. Moreover, we propose an early-stop technique that can outperform the state-of-the-art TSVM and produce a more accurate prediction.
Abstract. Classifying proteins into families is an important task when studying newly discovered proteins. If we can identify the family a protein belongs to, we can predict its features without knowing its exact structure. However, this grouping process is challenging. We propose a two-stage algorithm that classifies proteins into families by combining a dimensionality reduction technique using a variational autoencoder with learned fingerprint representations using a Convolutional Neural Network (CNN). Our models use fewer parameters than existing methods but perform better: our variational autoencoder achieves 94% accuracy in reconstructing the most common amino acid in a sequence alignment, and the neural network achieves 98-100% accuracy in classifying protein families. We developed a software framework to access our algorithms. All code and data are publicly available at
Abstract. We investigate the task of disease clustering with the functional annotations of disease genes from the Gene Ontology, using the biological process aspect. As an unsupervised machine learning step, the clustering task groups similar diseases into communities based on their closeness to one another, measured through the functional annotations of their associated genes. Studies of the similarity, relationship, or clustering of human diseases that use the functional information associated with disease genes are limited. This work builds on and benefits from advances in gene-disease association studies and in the functional annotation of human disease genes in the Gene Ontology. We validated the experimental results by comparing the intra-cluster and inter-cluster disease similarity with the semantic similarity of the diseases in the is-a hierarchy of both the MeSH and DO disease ontologies. The experimental results are highly encouraging and show that the functional profiles built from the biological process annotations of disease genes can be relied on for the study of disease clustering and similarity.
Abstract. Several programs are available in bioinformatics for DNA sequence assembly. Assembly is typically an extremely time-consuming endeavor, as DNA sequences can be extensive and intricate. Velvet was created to combine short and long read sequencing data into larger genomic sequences. The most recent version of Velvet uses OpenMP parallel programming to support multiple threads. We present a new version of Velvet that uses OpenACC directives to take advantage of parallel processing on graphics processing units (GPUs). Our tests demonstrate that this extension of Velvet allows for faster performance and efficient memory use.
Abstract. The epigenetic landscape concept initially proposed by Conrad H. Waddington has become a powerful tool to quantitatively address constraints underlying cell differentiation and morphogenesis. In theoretical and experimental terms, this has been enabled by grounding gene regulatory network models on experimental data. Such models have, in turn, led to epigenetic landscape models that entail functional and structural constraints of cell differentiation and morphogenetic dynamics, and thus to an understanding of development from a systems-based perspective. Therefore, it is mainly in the context of the study of development that the epigenetic landscape has been anchored as a conceptual support. On the other hand, given the recent understanding of gene control by epigenomic modifications and the capacity to profile these modifications using high-throughput molecular techniques, the notion of epigenetics has been mainly related to non-genetic heritable modifications of the genome. This approach, which until now has not been based on a systems-based dynamical treatment, has given proximal epigenomic modifications a central role in understanding development, and it has left the dynamic view of the epigenetic landscape aside. In this paper we aim to establish a conceptual link between both conceptualizations of epigenetic regulation.
Abstract. We illustrate here a systems-based computational analysis technique intended for uncovering dynamic properties of gene regulatory networks described in discrete Boolean terms. The technique uses the algebraic Semi-Tensor Product (STP), with the purpose of exploring the regulatory consequences of genetic mutations in the modification of cell phenotypes. It derives from the state-based reachability analysis of dynamic systems via the design of open-loop perturbative control schemes. As a case study, we choose the discrete Boolean network that describes in qualitative formal terms the transcriptional gene regulatory network underlying the Epithelial-to-Mesenchymal transition in the context of epithelial cancer. We are particularly interested in qualitatively understanding the systems-level consequences of mutations in specific genes that regulate the phenotypic transition. More specifically, we are interested in bringing to light potential preventive therapeutic interventions that promote the Mesenchymal-to-Epithelial transition.
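As a hedged illustration of how the semi-tensor product expresses Boolean logic algebraically, the sketch below implements the standard definition A ⋉ B = (A ⊗ I_{t/n})(B ⊗ I_{t/p}), with t the least common multiple of the column count n of A and the row count p of B. Truth values are encoded as the vectors [1, 0]^T (true) and [0, 1]^T (false). The helper names `stp`, `logical_and`, and `logical_not` are assumptions made for this example and do not come from the paper.

```python
from math import gcd

import numpy as np

def stp(a, b):
    """Semi-tensor product of two 2-D integer arrays."""
    n, p = a.shape[1], b.shape[0]
    t = n * p // gcd(n, p)  # least common multiple of n and p
    left = np.kron(a, np.eye(t // n, dtype=int))
    right = np.kron(b, np.eye(t // p, dtype=int))
    return left @ right

# Vector encoding of Boolean values.
TRUE = np.array([[1], [0]])
FALSE = np.array([[0], [1]])

# Structure matrices: each column is the output for one input combination.
M_NOT = np.array([[0, 1],
                  [1, 0]])
M_AND = np.array([[1, 0, 0, 0],
                  [0, 1, 1, 1]])

def logical_not(x):
    return stp(M_NOT, x)

def logical_and(x, y):
    return stp(stp(M_AND, x), y)

if __name__ == "__main__":
    print(logical_and(TRUE, FALSE).ravel())
    print(logical_not(FALSE).ravel())
```

In this encoding, fixing one argument to `FALSE` (a crude stand-in for a loss-of-function mutation) makes `logical_and` return `FALSE` for every value of the other argument, which is the kind of consequence a structure-matrix analysis makes explicit.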
Abstract. The environment and the exposures individuals accumulate throughout their lifetime can have diverse effects on their health. This paper discusses the application of association analysis to determine relationships between carcinogenesis and the human exposome. Human exposome data from the World Health Organization was analyzed to determine associations between human exposure and breast cancer. The discovered associations outline specific factors that may be associated with the prevention or causation of breast cancer. We discovered an association between biomarkers in specific biospecimens and breast cancer. Xanthophylls, measured in two different biospecimens, were found to be associated with breast cancer in American patients. The associations discovered may be of use in future cancer studies. This research is particularly interesting because of xanthophylls’ relationship to retinol, which inhibits oncogenesis. Providing support and data for such associations will encourage more research on the exposome’s effect on breast cancer and other conditions.
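For readers unfamiliar with association analysis, the sketch below computes the three usual rule metrics (support, confidence, and lift) on a tiny made-up dataset. The records, the item names `exposure_A`, `exposure_B`, and `outcome`, and the function names are invented for illustration only; they are not taken from the exposome data or the results described in the abstract.

```python
from fractions import Fraction

# Made-up records: each set lists the items observed for one subject.
records = [
    {"exposure_A", "outcome"},
    {"exposure_A", "outcome"},
    {"exposure_A"},
    {"exposure_B", "outcome"},
    {"exposure_B"},
]

def support(itemset, data):
    """Fraction of records that contain every item in itemset."""
    itemset = set(itemset)
    return Fraction(sum(itemset <= record for record in data), len(data))

def confidence(antecedent, consequent, data):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, data) / support(antecedent, data)

def lift(antecedent, consequent, data):
    """Confidence relative to the baseline support of the consequent."""
    return confidence(antecedent, consequent, data) / support(consequent, data)

if __name__ == "__main__":
    rule = ({"exposure_A"}, {"outcome"})
    print(support(rule[0] | rule[1], records))
    print(confidence(*rule, records))
    print(lift(*rule, records))
```

A lift above 1 indicates that the antecedent and consequent co-occur more often than independence would predict; it does not by itself establish causation.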