Exploratory Transcriptomic Analysis of Colorectal Cancer: Identification of Highly Variable Genes and Co-expression Patterns

Authors

Keywords:

Colorectal Neoplasms, Gene Expression Profiling, Biological Variability, Transcriptome, Data Analysis Pipeline

Abstract

Background: Gene expression variability represents an essential dimension of transcriptomic complexity, reflecting biological heterogeneity and regulatory diversity across tumors. Characterizing such variability may reveal candidate biomarkers and co-regulated gene modules relevant to colorectal cancer biology.

Purpose: The study aimed to develop a reproducible computational framework for identifying and visualizing highly variable genes within colorectal cancer transcriptomic data, providing a foundation for exploratory analysis and hypothesis generation.

Methods: A modular Python-based pipeline was constructed to process microarray data derived from colon cancer patients included in the GSE39582 cohort. Data interrogation was performed in October 2025. Following preprocessing, probe-to-gene annotation, and log-transformation, gene-wise variance was calculated. The top 0.1% of genes ranked by variance were selected as highly variable genes. Visualization included z-score–normalized heatmaps, boxplots, and correlation matrices to illustrate heterogeneity and co-expression patterns.

Results: Analysis revealed a small subset of genes exhibiting markedly heterogeneous expression profiles across the colorectal cancer cohort. Variability patterns suggested the existence of co-regulated gene modules and potential subtype-associated transcriptional programs. Genes previously linked to colorectal tumorigenesis, such as OLFM4, MS4A12, and CEACAM7, were among the most variable, supporting the biological relevance of variance-based selection.

Conclusions: The developed pipeline provides a transparent and reproducible framework for rapid exploration of transcriptomic variability in colorectal cancer. Its simplicity and adaptability make it suitable for integration into diverse analytical workflows and for educational or exploratory research applications.
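The variance-based selection step described in the Methods can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`select_highly_variable`, `zscore_rows`), the use of sample variance (`ddof=1`), and the synthetic data standing in for the GSE39582 matrix are all assumptions made for illustration. The sketch assumes a genes-by-samples matrix that has already been annotated and log-transformed.

```python
import numpy as np


def select_highly_variable(expr, gene_names, top_fraction=0.001):
    """Return the names and rows of the top `top_fraction` genes by variance.

    expr: 2-D array, genes x samples, assumed already log-transformed.
    The sample variance (ddof=1) is an assumption; the abstract does not
    say which estimator was used.
    """
    variances = expr.var(axis=1, ddof=1)
    n_top = max(1, int(round(top_fraction * expr.shape[0])))
    order = np.argsort(variances)[::-1][:n_top]  # indices, highest variance first
    return [gene_names[i] for i in order], expr[order]


def zscore_rows(expr):
    """Standardize each gene (row) to mean 0 and unit sample standard deviation."""
    mean = expr.mean(axis=1, keepdims=True)
    std = expr.std(axis=1, ddof=1, keepdims=True)
    return (expr - mean) / std


# Synthetic stand-in for a log-transformed expression matrix (hypothetical data).
rng = np.random.default_rng(0)
n_genes, n_samples = 2000, 50
log_expr = rng.normal(8.0, 0.3, size=(n_genes, n_samples))
# Make two genes much more heterogeneous than the rest.
log_expr[10] = rng.normal(8.0, 3.0, size=n_samples)
log_expr[20] = rng.normal(8.0, 3.0, size=n_samples)
gene_names = [f"gene_{i}" for i in range(n_genes)]

hvg_names, hvg_expr = select_highly_variable(log_expr, gene_names)
hvg_z = zscore_rows(hvg_expr)      # input for a z-score heatmap
hvg_corr = np.corrcoef(hvg_expr)   # gene-gene Pearson correlation matrix
```

With 2000 genes, the 0.1% threshold keeps two genes. The z-scored matrix and the correlation matrix are the kind of inputs a heatmap or co-expression plot would take; the plotting itself is omitted here.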

Published

12.12.2025

How to Cite

1. MAZHARI AM. Exploratory Transcriptomic Analysis of Colorectal Cancer: Identification of Highly Variable Genes and Co-expression Patterns. Appl Med Inform [Internet]. 2025 Dec. 12 [cited 2026 Jan. 6];47(4). Available from: https://ami.info.umfcluj.ro/index.php/AMI/article/view/1217

Section

Articles