Exploring diagnostic gene markers and immune infiltration in idiopathic pulmonary fibrosis
Highlight box
Key findings
• This study identified 486 differentially expressed genes in idiopathic pulmonary fibrosis (IPF) using the GSE10667 dataset.
• A weighted gene co-expression network analysis and protein-protein interaction network revealed five key hub genes (COL1A1, COL3A1, COL1A2, POSTN, and MMP2).
• MMP2 was identified as a potential diagnostic biomarker for IPF, and was found to be significantly associated with immune cell infiltration and disease progression.
• The Mendelian randomization analysis confirmed a causal relationship between MMP2 and the risk of IPF.
What is known and what is new?
• IPF is a progressive and fatal lung disease with limited diagnostic and therapeutic options. Current biomarkers are insufficient for early detection and prognosis.
• This study integrated bioinformatics approaches to identify novel genetic markers and elucidate the role of MMP2 in IPF pathogenesis and immune infiltration, extending the current understanding of IPF. Our findings suggest that MMP2 may serve as a promising diagnostic and prognostic biomarker, potentially improving early detection and treatment strategies.
What is the implication, and what should change now?
• MMP2 was identified as a key gene in IPF, and could serve as a diagnostic marker and therapeutic target in IPF.
• Clinicians should consider incorporating MMP2 testing into IPF diagnostic protocols to enable earlier intervention and improve patient outcomes.
• Future research should focus on validating the clinical utility of MMP2 and exploring its role in therapeutic strategies targeting IPF.
Introduction
Idiopathic pulmonary fibrosis (IPF) is an interstitial lung disease (ILD) of undetermined cause that results in progressive worsening of breathing problems and pulmonary function (1). Damage to the alveoli impairs ventilation and reduces lung volume, which leads to poor outcomes; IPF is generally fatal within 2–3 years of diagnosis (2,3). Currently, it is estimated that three million people are affected by IPF worldwide, and this figure is expected to double by 2030 (4). Apart from lung transplantation, there are no effective therapeutic approaches for IPF. Thus, the discovery of an efficient biomarker is important to identify patients with a poor prognosis at an early stage and to guide their treatment.
Gene expression microarrays have been extensively applied to investigate the mechanisms of disease and therapy. Weighted gene co-expression network analysis (WGCNA) has been used to identify relationships between co-expressed genes and the clinical traits of specimens (5). Further research is needed to clarify the molecular mechanisms of IPF and to identify biomarkers for its diagnostic and therapeutic assessment. Although WGCNA, Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway-based enrichment methods are well established, no previous study appears to have used WGCNA in combination with Mendelian randomization (MR) to identify IPF core genes.
In this study, we re-analyzed GSE10667 gene expression patterns and identified differentially expressed genes (DEGs) using the “limma” package of R. WGCNA was used to build the gene co-expression network, and GO and KEGG analyses of the critical modules were conducted. The results were verified by a nomogram and receiver operating characteristic (ROC) curve analysis. Cell-type Identification by Estimating Relative Subsets of RNA Transcripts (CIBERSORT) was also used to analyze the relationship between IPF and the DEGs. We then investigated the correlation between MMP2 and the risk of IPF by MR. Overall, this study sought to identify novel IPF biomarkers and to develop a novel approach for finding genetic or even therapeutic targets. We present this article in accordance with the TRIPOD reporting checklist (available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1293/rc).
Methods
Microarray data analysis
The GSE10667 dataset was downloaded from the Gene Expression Omnibus (GEO) database (6,7). The GSE10667 dataset comprised 46 pulmonary specimens, including 31 from IPF patients and 15 from healthy individuals, together with clinical data. For network construction, we implemented the WGCNA approach using the corresponding R package (version 1.72), which enables the identification of highly co-expressed gene clusters through soft-thresholding of correlation matrices (5,8). The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
Identification of DEGs
We performed differential gene expression analysis using the “limma” R package (version 3.54.0) (9). Linear models were fitted to identify differentially expressed messenger RNAs (mRNAs) between IPF and control specimens. DEGs were selected based on the following stringent criteria: adjusted P value <0.05 (to ensure statistical significance after multiple testing correction) and absolute log2 fold change >1 (corresponding to a two-fold difference in expression levels). For data visualization, we employed the “pheatmap” (version 1.0.12) and “ggplot2” (version 3.4.0) packages in R (version 4.2.2) (10,11). The heatmaps were generated using complete linkage hierarchical clustering with Euclidean distance as the distance metric, displaying standardized z-scores of gene expression values.
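The selection rule described above can be illustrated with a minimal Python sketch. It is not the limma workflow used in the study; it only shows how an adjusted P value cutoff and a log2 fold change cutoff combine. The Benjamini-Hochberg adjustment is an assumption (the text does not name the correction method), and the gene names and statistics in `toy_stats` are hypothetical.

```python
from typing import Dict, List, Tuple


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that the
    # adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(
    stats: Dict[str, Tuple[float, float]],
    adj_p_cutoff: float = 0.05,
    lfc_cutoff: float = 1.0,
) -> Tuple[List[str], List[str]]:
    """Split genes into up- and downregulated DEGs.

    stats maps gene -> (log2 fold change, raw P value).
    """
    genes = list(stats)
    adj = benjamini_hochberg([stats[g][1] for g in genes])
    up, down = [], []
    for gene, q in zip(genes, adj):
        lfc = stats[gene][0]
        if q < adj_p_cutoff and abs(lfc) > lfc_cutoff:
            (up if lfc > 0 else down).append(gene)
    return up, down


# Hypothetical values, for illustration only.
toy_stats = {
    "GENE_A": (2.1, 0.0001),
    "GENE_B": (-1.6, 0.0004),
    "GENE_C": (0.4, 0.0002),   # significant, but fold change too small
    "GENE_D": (1.8, 0.2),      # large fold change, but not significant
}
up_genes, down_genes = select_degs(toy_stats)
print(up_genes, down_genes)
```

Both conditions must hold at the same time, so a gene with a large fold change but a non-significant adjusted P value is excluded, as is a significant gene with a small fold change.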
Co-expression module construction and hub gene identification
The mRNAs were analyzed by WGCNA to discover the genes that were most highly correlated with IPF. Based on the calculated gene significance (GS) and module membership (MM), genes with GS >0.4, MM >0.5, and P<0.05 were selected as the key module genes, and the hub genes that had the highest correlations with IPF were screened. The genes shared by the DEGs and the relevant modules were extracted for further analysis. These analyses were performed using the WGCNA package (v1.72) in the R statistical environment (v4.2.2). All analytical parameters and custom scripts have been archived for reference. Through this analytical pipeline, we identified key gene modules and their hub genes that are closely associated with IPF pathogenesis.
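As a rough illustration of the GS and MM screening step, the sketch below computes both quantities as absolute Pearson correlations and applies the thresholds given above. It is a simplification under stated assumptions: the module eigengene is passed in as a ready-made vector (WGCNA derives it from the module's expression matrix), the P<0.05 filter is omitted, and all sample values and gene names are hypothetical.

```python
from math import sqrt
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def hub_candidates(
    expr: Dict[str, List[float]],
    trait: List[float],
    eigengene: List[float],
    gs_cutoff: float = 0.4,
    mm_cutoff: float = 0.5,
) -> List[str]:
    """Keep genes whose |GS| and |MM| both exceed the cutoffs."""
    selected = []
    for gene, values in expr.items():
        gs = abs(pearson(values, trait))      # gene significance
        mm = abs(pearson(values, eigengene))  # module membership
        if gs > gs_cutoff and mm > mm_cutoff:
            selected.append(gene)
    return selected


# Hypothetical data: three control samples followed by three IPF samples.
toy_trait = [0, 0, 0, 1, 1, 1]
toy_eigengene = [-1.0, -0.8, -0.9, 0.9, 1.1, 0.7]
toy_expr = {
    "GENE_X": [1.0, 1.2, 0.9, 3.0, 3.2, 2.8],  # tracks the trait closely
    "GENE_Y": [2.0, 1.0, 2.0, 1.0, 2.0, 1.0],  # weakly related to the trait
}
print(hub_candidates(toy_expr, toy_trait, toy_eigengene))
```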
Identifying critical candidate genes and the GO/KEGG analysis
Using a Venn diagram (12), the genes shared between the DEGs and the key WGCNA module were identified as the key genes associated with IPF. GO and KEGG enrichment analyses were then conducted using the “clusterProfiler” package in R (13). KEGG is a database resource for the systematic analysis of gene functions (14). The GO analysis describes significant gene functions (P<0.001) in terms of biological processes (BPs), cellular components (CCs), and molecular functions (MFs). Together, these analyses were used to elucidate the molecular processes and fundamental pathways in which the genes are involved.
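Over-representation analysis of the kind performed for GO and KEGG terms is commonly based on a hypergeometric test. The sketch below shows that calculation for a single term; it is an illustration rather than the clusterProfiler implementation, and the parameter names and example counts are hypothetical.

```python
from math import comb


def enrichment_p(overlap: int, list_size: int, term_size: int, background: int) -> float:
    """One-sided hypergeometric P value, P(X >= overlap).

    background: number of genes in the background set
    term_size:  number of background genes annotated to the term
    list_size:  number of genes in the candidate list
    overlap:    candidate genes that are annotated to the term
    """
    upper = min(list_size, term_size)
    total = comb(background, list_size)
    tail = sum(
        comb(term_size, i) * comb(background - term_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 background genes, 3 annotated to the term,
# 2 candidate genes, both of which carry the annotation.
print(enrichment_p(overlap=2, list_size=2, term_size=3, background=10))
```

In practice the P values for all tested terms would then be corrected for multiple testing before a significance threshold is applied.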
Protein-protein interaction (PPI) network of hub genes
The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) (v11.5) (15) was applied to predict and visualize the PPI networks. Cytoscape (v3.9.1) was used for PPI network filtering and mapping (16). The maximal clique centrality method in Cytoscape was used to identify the top 10 nodes in the PPI network. A nomogram was created to predict IPF from the key genes (17). For verification, ROC curve analysis was conducted, and the area under the curve (AUC) of the ROC curve was computed using the “pROC” package (v1.18.0) in R (18). An AUC in the range 0.8≤ AUC <0.9 was taken to indicate good diagnostic accuracy. Based on this, we finally chose the best node gene for future studies.
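The AUC used for verification can be read as the probability that a randomly chosen IPF sample has a higher marker value than a randomly chosen control, with ties counted as one half. A minimal Python sketch of that definition follows; it is not the pROC implementation, and the expression values are hypothetical.

```python
from typing import Sequence


def roc_auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """AUC as the fraction of case/control pairs ranked correctly."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical expression values of one marker gene.
ipf_values = [7.9, 8.4, 6.8, 9.1]
control_values = [6.1, 7.0, 6.5]
print(roc_auc(ipf_values, control_values))
```

A value of 0.5 corresponds to a marker with no discriminative ability, and 1.0 to perfect separation of the two groups.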
Immune cell analysis
The raw transcriptomic data of the GSE10667 dataset (platform: GPL570) were downloaded from the GEO database. Preprocessing was performed using R (limma package), including background correction, quantile normalization, and batch-effect adjustment via the ComBat algorithm. Immune cell fractions were deconvoluted using CIBERSORTx (https://cibersortx.stanford.edu/, version 19) (19) with the signature matrix (22 immune cell subtypes, e.g., CD8+ T cells, M2 macrophages, and regulatory T cells). Only samples with a significance threshold of P<0.05 (1,000 permutations) were retained for downstream analysis. To identify IPF-associated immune cell subsets, least absolute shrinkage and selection operator (LASSO) regression was applied (glmnet package, 10-fold cross-validation), followed by multivariate logistic regression adjusted for clinical covariates.
MR
In this study, we examined the causal relationship between MMP2 and IPF by two-sample MR, using single-nucleotide polymorphisms (SNPs) as instrumental variables (IVs). The core genetic information was derived from publicly available genome-wide association study (GWAS) data. We chose MMP2 (MONA) and IPF for the MR. The MMP2 (MONA) data are available from https://gwas.mrcieu.ac.uk/datasets/?gwas_id__icontains=&year__iexact=&trait__icontains=MONA&consortium__icontains=finn-b-IPF, and the IPF data are available from https://gwas.mrcieu.ac.uk/datasets/?gwas_id__icontains=&year__iexact=&trait__icontains=idiopathtic+pulmonary+fibrosis&consortium__icontains=ebi-a-GCST90018120; both were accessed through the MR software package. The relationship between the core genes and IPF was assessed by inverse variance weighting (IVW) and MR Egger regression (20,21).
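For orientation, the two estimators named above can be sketched from per-SNP summary statistics. The code below is a simplified, fixed-effect illustration and not the software used in the study: IVW is a weighted regression of SNP-outcome effects on SNP-exposure effects through the origin, and MR Egger adds an intercept whose departure from zero points to directional pleiotropy. The function names and all effect sizes are hypothetical.

```python
from math import exp, sqrt
from typing import List, Tuple


def ivw_estimate(beta_exp: List[float], beta_out: List[float],
                 se_out: List[float]) -> Tuple[float, float]:
    """Fixed-effect IVW causal estimate (log odds scale) and its standard error."""
    num = sum(bx * by / (s * s) for bx, by, s in zip(beta_exp, beta_out, se_out))
    den = sum(bx * bx / (s * s) for bx, s in zip(beta_exp, se_out))
    return num / den, sqrt(1.0 / den)


def mr_egger(beta_exp: List[float], beta_out: List[float],
             se_out: List[float]) -> Tuple[float, float]:
    """Weighted regression with an intercept; returns (slope, intercept)."""
    # Orient every SNP so that its effect on the exposure is positive.
    xs, ys = [], []
    for bx, by in zip(beta_exp, beta_out):
        sign = -1.0 if bx < 0 else 1.0
        xs.append(sign * bx)
        ys.append(sign * by)
    ws = [1.0 / (s * s) for s in se_out]
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical summary statistics for three SNPs.
bx = [0.1, 0.2, 0.4]      # SNP-exposure effects
by = [0.05, 0.1, 0.2]     # SNP-outcome effects (log odds)
se = [0.01, 0.02, 0.02]   # standard errors of the SNP-outcome effects
beta, beta_se = ivw_estimate(bx, by, se)
print("IVW odds ratio:", exp(beta))
print("Egger slope and intercept:", mr_egger(bx, by, se))
```

Exponentiating the IVW estimate gives an odds ratio per unit increase in the exposure, which is the scale on which MR results are usually reported.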
Results
DEG screening
The IPF data set (GSE10667) was downloaded from the GEO database. In total, 486 DEGs were identified, of which 244 were downregulated and 242 were upregulated (table available at https://cdn.amegroups.cn/static/public/jtd-2025-1293-1.xlsx) (P<0.001 and |log2 fold change| >1). In Figure 1A, the volcano plot of the DEGs displays the downregulated genes in green and the upregulated genes in red. Figure 1B provides a heat map of the top 50 DEGs in the IPF and normal samples.
WGCNA
The IPF dataset was preprocessed to obtain a total of 19,630 gene expression values. The 4,000 genes with the highest mean values were then chosen to evaluate the characteristics of the model (Figure 2A). Highly correlated genes in the co-expression modules were identified. A soft-threshold power of 9 (scale-free R2=0.8) was set to ensure a scale-free network (Figure 2B). To determine the correlation between the potential gene modules and IPF, a WGCNA was used to analyze all of the candidate genes, and 14 modules were identified, which are shown in colors including black, blue, brown, cyan, green, light-cyan, midnight-blue, salmon, tan, yellow, green-yellow, purple, and magenta in Figure 2C,2D. The results showed a positive correlation between the potential gene modules and IPF. The relationship between IPF and the green module was the most significant (r=0.72, P<0.001). The green module may therefore serve as a critical module for the GS and MM analysis (Figure 2D). There was a statistically significant linear relationship between the genes in the green module and the clinical phenotypes (Cor =0.61, P<0.001) (Figure 2E) (table available at https://cdn.amegroups.cn/static/public/jtd-2025-1293-2.xlsx).
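The effect of the soft-threshold power can be seen in a small Python sketch. It assumes an unsigned network, in which the adjacency between two genes is the absolute correlation raised to the chosen power; the text does not state the network type, and the correlation matrix below is hypothetical. The diagonal is set to zero here so that a gene's connectivity does not include itself.

```python
from typing import List


def soft_threshold_adjacency(corr: List[List[float]], power: int = 9) -> List[List[float]]:
    """Unsigned adjacency: a_ij = |cor_ij| ** power, with a zero diagonal."""
    n = len(corr)
    return [
        [0.0 if i == j else abs(corr[i][j]) ** power for j in range(n)]
        for i in range(n)
    ]


def connectivity(adjacency: List[List[float]]) -> List[float]:
    """Sum of each gene's connection strengths to all other genes."""
    return [sum(row) for row in adjacency]


# Hypothetical correlations among three genes.
toy_corr = [
    [1.0, 0.9, 0.5],
    [0.9, 1.0, -0.5],
    [0.5, -0.5, 1.0],
]
toy_adj = soft_threshold_adjacency(toy_corr, power=9)
print(toy_adj[0][1], toy_adj[0][2])
print(connectivity(toy_adj))
```

Raising correlations to a high power leaves strong correlations comparatively intact (0.9 becomes about 0.39 at a power of 9) while pushing moderate ones toward zero (0.5 becomes about 0.002), which is what moves the network toward a scale-free topology.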
GO/KEGG analyses
Ultimately, 281 overlapping genes shared by the WGCNA module and the DEGs were selected as the candidate core genes for IPF using the Venn diagram (Figure 3A). GO and KEGG analyses were performed to investigate the possible effects of these 281 genes. The GO results showed that the core genes were associated with the extracellular matrix (ECM), outer envelope, endoplasmic reticulum (ER) lumen, presynaptic activity, and matrix metalloproteinase (MMP) activity (table available at https://cdn.amegroups.cn/static/public/jtd-2025-1293-3.xlsx). The 281 genes also appeared to play an important role in the formation of the ECM, facilitating fibroblast differentiation and migration. These findings suggest that these genes are associated with cellular aging, ER stress, and the activation of MMPs (Figure 3B). In the KEGG analysis, the genes were enriched in the PI3K-PKB (Akt) signaling pathway, ECM-receptor interaction, the Wnt signaling pathway, the AGE-RAGE signaling pathway, and proteoglycans in cancer (Table S1) (Figure 3C).
Hub gene identification and validation
Using the STRING database, we built a core gene interaction network. We visualized the top 10 key node genes in this module (i.e., COL3A1, COL1A1, COL1A2, POSTN, MMP2, COL5A2, FBN1, THBS2, SPP1, and ACAN) using Cytoscape software (Figure 4A). The deeper the color, the higher the score and the more important the gene (Figure 4B). ROC curve analysis of the top five nodes indicated that they had good diagnostic value for IPF. The AUC values of COL3A1, COL1A1, COL1A2, POSTN, and MMP2 were 0.912, 0.886, 0.920, 0.890, and 0.830, respectively (Figure 4C). A nomogram was established to predict the risk of IPF. Based on these findings and the established pro-fibrotic roles of COL1A1 and MMP2, we hypothesize that COL1A1 and MMP2 may be associated with disease progression in IPF, and thus may be strong clinical biomarkers (Figure 4D). Calibration curves (Figure 4E) were used to assess the precision of the nomogram, in which MMP2 (MONA) had a higher risk score. MMP2 belongs to the MMP gene family; however, it differs from the rest of the MMP family in that it can be activated extracellularly or intracellularly by proteases (22). MMP2 plays an important role in the degradation of the ECM and in the promotion of tissue remodeling, angiogenesis, and repair. Due to these features, MMP2 can be measured in the bloodstream, which makes it more practical and predictable in clinical use. Thus, MMP2 was selected for further study.
Immune cell infiltration in IPF
There were significant differences in plasma cells, monocytes, and M2 macrophages between the IPF patients and healthy controls. In the IPF group, the infiltration rates of the plasma cells and M2 macrophages were greater than that of the monocytes (P<0.001) (Figure 5A,5B). The IPF group had more macrophages; a population of resident macrophages persists in the lungs during inflammatory and traumatic events, while monocyte-derived populations are recruited to help with repair. When the injury subsides, the number of these monocyte-derived cells is generally reduced through apoptosis (23). The results revealed that M2 macrophage infiltration was positively correlated with monocytes, and the pathogenic mechanisms of IPF also involve interactions between other immune cells. This suggests that IPF is closely related to the infiltration of immune cells (Figure 5C).
The KEGG and GO enrichment results suggested that the core genes were key to the formation of the ECM, which could facilitate the growth of fibroblasts. Immune cells may transport matrix components to the tissues, and macrophages may not only produce collagen-degrading enzymes, but may also take up collagen into lysosomes via receptors. MMP2 plays an important role in the degradation of the ECM, as well as in the promotion of tissue remodeling, angiogenesis, and repair. Increased ECM degradation by MMP2 was significantly associated with reduced plasma cell numbers (P=0.007) and elevated monocyte counts (P=0.01) (all P<0.001, Figure 5D). These findings suggest that MMP2, as a key component of IPF, may be a useful biological marker.
MMP2 (MONA) in IPF
The SNPs used as instruments for MMP2 were not weak instruments (table available at https://cdn.amegroups.cn/static/public/jtd-2025-1293-4.xlsx). The causal effect estimates of each of the genetic variants on IPF are presented in Figure 6A,6B. The IVW analysis revealed an association between MMP2 and IPF [odds ratio (OR) =1.001183, P<0.001]. Moreover, the MR Egger regression results were statistically significant (OR =1.004, P=0.01). The funnel plot shows approximately symmetric causal estimates (Figure 6C), and horizontal pleiotropy was not observed in the MR Egger regression test results (P=0.41), suggesting that the causal estimate was not biased by pleiotropy. In the leave-one-out analysis, each SNP was removed in turn and MR was repeated for the remaining SNPs (Figure 6D). The results indicated that the MR estimates were robust.
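The leave-one-out check described above can be sketched as follows. This is a simplified illustration under the same assumptions as a fixed-effect IVW estimate, not the software used in the study, and the summary statistics are hypothetical. If dropping any single SNP leaves the estimate essentially unchanged, no single variant is driving the result.

```python
from typing import List


def ivw(beta_exp: List[float], beta_out: List[float], se_out: List[float]) -> float:
    """Fixed-effect inverse-variance weighted estimate (log odds scale)."""
    num = sum(bx * by / (s * s) for bx, by, s in zip(beta_exp, beta_out, se_out))
    den = sum(bx * bx / (s * s) for bx, s in zip(beta_exp, se_out))
    return num / den


def leave_one_out(beta_exp: List[float], beta_out: List[float],
                  se_out: List[float]) -> List[float]:
    """Recompute the IVW estimate with each SNP removed in turn."""
    estimates = []
    for dropped in range(len(beta_exp)):
        keep = [i for i in range(len(beta_exp)) if i != dropped]
        estimates.append(ivw(
            [beta_exp[i] for i in keep],
            [beta_out[i] for i in keep],
            [se_out[i] for i in keep],
        ))
    return estimates


# Hypothetical summary statistics; the fourth SNP is a deliberate outlier.
loo_bx = [0.1, 0.2, 0.4, 0.3]
loo_by = [0.05, 0.1, 0.2, 0.45]
loo_se = [0.01, 0.02, 0.02, 0.03]
print("all SNPs:", ivw(loo_bx, loo_by, loo_se))
print("leave-one-out:", leave_one_out(loo_bx, loo_by, loo_se))
```

In this toy example the estimate returns to 0.5 only when the outlying fourth SNP is removed, which is the pattern a leave-one-out plot is designed to expose.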
Discussion
IPF is a progressive, refractory, and highly fatal pulmonary disease involving complicated cellular and signaling pathways. The alveolar epithelial cell (AEC) is associated with metabolic impairment, aging, epithelial activity, and repair impairment. Genetic variability is associated with pathogenesis; in the GWAS, the second strongest risk locus for IPF was the desmoplakin gene (DSP). Reduced DSP expression can lower cell-cell adhesion and may damage the integrity of the alveolar structure (24). The epigenetic processes of IPF include nucleosome remodeling, DNA methylation, histone modification, and microRNA-mediated gene expression. At present, there is no valid biomarker to help diagnose IPF at an early stage. However, diagnostic biomarkers could be used to detect the pathology of IPF, investigate its complicated regulatory relations, and search for better diagnostic and therapeutic targets.
In this study, we used an integrated bioinformatic approach to investigate possible core genes that might be involved in the development of IPF and immune infiltration. First, we obtained 486 DEGs from an IPF data set. Second, a WGCNA of the key components of IPF was conducted, and a Venn diagram was used to select the node genes. Third, GO and KEGG analyses were conducted to clarify the key functions of the key genes. Next, we screened the five key genes (COL1A1, COL3A1, COL1A2, POSTN, and MMP2) via the PPI network. In the nomogram, COL1A1 and MMP2 were essential for predicting IPF. The COL1A1 gene plays an important role in the formation of collagen and the ECM, as well as in the process of tissue fibrosis. The MMP2 gene plays an important role in the degradation of the ECM, as well as in tissue remodeling, angiogenesis, and repair. The characteristics of MMP2 enable its detection in the bloodstream, which makes it more reliable and predictable in the clinic. Thus, MMP2 was selected for further study.
The results of the GO analysis suggested that the core genes were involved in the organization of the ECM, and were closely related to the outer envelope structure, the ER lumen, the synapse, and MMP activity. A characteristic of fibrotic diseases is the activity of transforming growth factor-β (TGF-β) (25). The upregulation of TGF-β stimulates the deposition of the ECM, and its role in IPF has been well established (25). Pirfenidone is known to inhibit the synthesis of TGF-β and has a notable effect on IPF. In addition, the dysregulated fibrotic cells in IPF express specific genes and cell program markers, including MMPs and markers of epithelial-mesenchymal transition (EMT) (26-28). Consistent with this, the KEGG analysis highlighted the PI3K-PKB (Akt) signaling pathway, ECM-receptor interaction, the Wnt signaling pathway, the AGE-RAGE signaling pathway, and proteoglycans in cancer.
Previous research has shown that Epstein-Barr virus (EBV), cytomegalovirus (CMV), human herpesvirus (HHV)-7, and HHV-8 are associated with an increased risk of IPF (29). A study has shown that persistent DNA damage can trigger a chain of events resulting in cellular senescence, such as the activation of ataxia telangiectasia mutated (ATM), nuclear factor kappa-B (NF-κB), p53, and phosphoinositide 3-kinase (PI3K)/Akt signaling, which can lead to the expression of fibrosis-related genes (30).
Alveolar type 2 (AT2) epithelial cells, critical for pulmonary surfactant production and alveolar homeostasis, demonstrate senescence and reduced abundance in advanced IPF (31). AT2 progenitor cells are usually maintained by Wnt signaling from nearby fibroblasts; Wnt signaling induces interleukin (IL)-1β expression in AT2 cells via TGF-β, which results in fibrosis (25). Persistent Wnt signaling in IPF has been shown to have a particular pro-fibrotic action in the presence of Wnt-responsive fibroblasts (22). The key genes in this research were found to be associated with the development of IPF and could serve as a basis for further diagnostic and therapeutic approaches.
IPF was originally regarded as an inflammatory disorder. Both innate and adaptive immune responses are associated with the process of healing and fibrosis, and may contribute to IPF. Based on the CIBERSORTx data, we discovered significant differences between the IPF patients and healthy controls, particularly in terms of plasma cells, monocytes, and M2 macrophages. In the IPF group, the infiltration rates of the plasma cells and M2 macrophages were greater than those of the monocytes. A group of resident pulmonary macrophages persists during inflammation and injury, while monocyte-derived populations are recruited; once the injury has abated, the number of monocyte-derived cells is generally reduced through apoptosis (23,32). To explain these findings, investigators have proposed that during lung injury, monocytes rapidly differentiate into monocyte-derived alveolar macrophages (Mo-AMs). In fibrotic lungs, both tissue-resident alveolar macrophages (TR-AMs) and Mo-AMs undergo profibrotic M2-like polarization. Critically, simultaneous non-selective depletion of both macrophage populations may paradoxically augment fibrosis by triggering compensatory monocyte recruitment (32). In contrast to T helper 1 (Th1) cells, T helper 2 (Th2) cells are implicated in the pro-fibrotic pathology of IPF. These cells secrete IL-4, IL-5, and IL-13, which drive M2 macrophage polarization; activated M2 macrophages subsequently promote fibrogenesis through dual mechanisms: suppression of inflammatory responses and dysregulated tissue repair (33,34). Meanwhile, monocytes may differentiate into various macrophage phenotypes depending on the cellular microenvironment (23,32). The activation of TGF-β may facilitate the development of fibrosis, which suggests that macrophages may be associated with the dysfunction of the alveolar epithelium in IPF.
Further, macrophages generate a variety of MMPs, such as MMP3, MMP7, and MMP8, which have been shown to facilitate the development of fibrosis in a bleomycin-induced mouse model (30). We discovered that COL1A1 and MMP2 are the key genes of IPF, and that MMP2 differs from the majority of other members of the MMP family.
MMP2 plays an important role in the degradation of the ECM, as well as in the promotion of tissue remodeling, angiogenesis, and repair. In the context of pulmonary arterial hypertension, MMP2 can modulate immune responses and vascular changes, playing a significant role in the pathophysiology of the disease (35). In the present study, the number of monocytes increased as MMP2 increased. Thus, it appears that the infiltration of immune cells is directly related to MMP2 in IPF. In IPF, the excessive accumulation of the ECM leads to scar formation and functional impairment in lung tissue (36). When pulmonary tissue is damaged, monocytes are recruited to the lungs, where they differentiate into macrophages and other phagocytes, which facilitates pulmonary fibrosis by a variety of mechanisms. A higher number of monocytes in the blood has been linked to the development of IPF, hospitalization, and mortality (32,37). The specific gene expression of monocytes is significantly associated with pathways related to inflammation and fibrosis (38). The number of classical monocytes increases in IPF patients, and this increase has been linked to disease progression and poor prognosis (39). Additionally, monocytes in IPF patients may contribute to disease progression by modulating processes such as cell chemotaxis, adhesion, and migration (40). The quantity of monocytes is not only related to the progression of IPF but may also be correlated with the severity of other lung diseases. For example, an increase in monocytes has also been shown to be associated with disease severity and functional decline in chronic obstructive pulmonary disease patients (41). According to one report, the circulating levels of immunoglobulin G (IgG) and immunoglobulin A (IgA) in IPF patients are related to reduced pulmonary function (29). Elevated monocyte counts have been proposed as negative prognostic biomarkers for early mortality and rapid progression in IPF (23,42).
In clinical practice, beyond monitoring radiological imaging and pulmonary function, an increase in monocyte counts in peripheral blood or monocyte/macrophage numbers in bronchoalveolar lavage fluid (BALF) can serve as an indicator of disease severity, guiding the timely initiation of antifibrotic therapy. Furthermore, despite good treatment compliance, a subset of patients continues to progress clinically, functionally, or radiologically. Consequently, researchers continue to seek useful predictors of the clinical course of IPF during antifibrotic therapy, such as physiological models or quantitative radiographic patterns (43,44). A previous study has shown that only one-third of long-term IPF survivors tolerate and continue antifibrotic therapy without interruption for at least three years (45). Thus, monitoring blood monocyte levels or MMP2 during treatment may enhance patient confidence through observed improvements in these biomarkers, potentially increasing tolerance to antifibrotic therapy. Additionally, the pathogenesis and progression of IPF involve a complex interplay between core genetic drivers and immune cells. This interaction drives disease progression through multiple cellular pathways.
This study had a number of limitations. First, this study only used one dataset, as microarray data for IPF are limited. Second, combining more IPF-related data sets would make our findings more compelling. Third, more biological experiments are needed to validate the specific mechanisms of these core genes. Thus, a number of related studies are required to validate our findings.
Conclusions
MMP2 may be related to the activation of the immune system in the course of pulmonary damage, and thus may serve as a potential indicator to aid the diagnosis of IPF.
Our results indicate that MMP2 is involved in the occurrence and development of IPF and is a potential biomarker for IPF.
Acknowledgments
None.
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1293/rc
Peer Review File: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1293/prf
Funding: None.
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1293/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
1. Richeldi L, Collard HR, Jones MG. Idiopathic pulmonary fibrosis. Lancet 2017;389:1941-52.
2. King TE Jr, Albera C, Bradford WZ, et al. All-cause mortality rate in patients with idiopathic pulmonary fibrosis. Implications for the design and execution of clinical trials. Am J Respir Crit Care Med 2014;189:825-31.
3. Cai M, Zhu M, Ban C, et al. Clinical features and outcomes of 210 patients with idiopathic pulmonary fibrosis. Chin Med J (Engl) 2014;127:1868-73.
4. Hutchinson J, Fogarty A, Hubbard R, et al. Global incidence and mortality of idiopathic pulmonary fibrosis: a systematic review. Eur Respir J 2015;46:795-806.
5. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 2008;9:559.
6. Barrett T, Troup DB, Wilhite SE, et al. NCBI GEO: mining tens of millions of expression profiles--database and tools update. Nucleic Acids Res 2007;35:D760-5.
7. Rosas IO, Richards TJ, Konishi K, et al. MMP1 and MMP7 as potential peripheral blood biomarkers in idiopathic pulmonary fibrosis. PLoS Med 2008;5:e93.
8. Zhang B, Horvath S. A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol 2005;4:Article17.
9. Ritchie ME, Phipson B, Wu D, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 2015;43:e47.
10. Kolde R. pheatmap: Pretty Heatmaps. 2019. Available online: https://cran.r-project.org/web/packages/pheatmap/index.html
11. Villanueva AM, Chen ZJ. ggplot2: Elegant Graphics for Data Analysis (2nd ed.). Measurement: Interdisciplinary Research and Perspectives 2019;17:160-7.
12. Chen H. Generate high-resolution Venn and Euler plots. 2018. Available online: https://CRAN.R-project.org/package=VennDiagram
13. Wu T, Hu E, Xu S, et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. Innovation (Camb) 2021;2:100141.
14. Kanehisa M, Furumichi M, Sato Y, et al. KEGG for taxonomy-based analysis of pathways and genomes. Nucleic Acids Res 2023;51:D587-92.
15. Szklarczyk D, Gable AL, Lyon D, et al. STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res 2019;47:D607-13.
16. Shannon P, Markiel A, Ozier O, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 2003;13:2498-504.
17. Harrell F Jr. rms: Regression modeling strategies. 2020. Available online: https://cran.r-project.org/src/contrib/Archive/rms/rms_6.1-0.tar.gz
18. Robin X, Turck N, Hainard A, et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 2011;12:77.
19. Newman AM, Liu CL, Green MR, et al. Robust enumeration of cell subsets from tissue expression profiles. Nat Methods 2015;12:453-7.
20. Emdin CA, Khera AV, Kathiresan S. Mendelian Randomization. JAMA 2017;318:1925-6.
21. Dudbridge F. Polygenic Mendelian Randomization. Cold Spring Harb Perspect Med 2021;11:a039586.
22. Michalski JE, Schwartz DA. Genetic Risk Factors for Idiopathic Pulmonary Fibrosis: Insights into Immunopathogenesis. J Inflamm Res 2020;13:1305-18.
23. Kreuter M, Lee JS, Tzouvelekis A, et al. Monocyte Count as a Prognostic Biomarker in Patients with Idiopathic Pulmonary Fibrosis. Am J Respir Crit Care Med 2021;204:74-81.
24. Leavy OC, Ma SF, Molyneaux PL, et al. Proportion of Idiopathic Pulmonary Fibrosis Risk Explained by Known Common Genetic Loci in European Populations. Am J Respir Crit Care Med 2021;203:775-8.
25. Frangogiannis N. Transforming growth factor-β in tissue fibrosis. J Exp Med 2020;217:e20190103.
26. Kobayashi Y, Tata A, Konkimalla A, et al. Persistence of a regeneration-associated, transitional alveolar epithelial cell state in pulmonary fibrosis. Nat Cell Biol 2020;22:934-46.
27. Stella GM, D'Agnano V, Piloni D, et al. The oncogenic landscape of the idiopathic pulmonary fibrosis: a narrative review. Transl Lung Cancer Res 2022;11:472-96.
28. Lettieri S, Bertuccio FR, Del Frate L, et al. The Plastic Interplay between Lung Regeneration Phenomena and Fibrotic Evolution: Current Challenges and Novel Therapeutic Perspectives. Int J Mol Sci 2023;25:547.
29. Sheng G, Chen P, Wei Y, et al. Viral Infection Increases the Risk of Idiopathic Pulmonary Fibrosis: A Meta-Analysis. Chest 2020;157:1175-87.
30. Craig VJ, Quintero PA, Fyfe SE, et al. Profibrotic activities for matrix metalloproteinase-8 during bleomycin-mediated lung injury. J Immunol 2013;190:4283-96.
31. Yao C, Guan X, Carraro G, et al. Senescence of Alveolar Type 2 Cells Drives Progressive Pulmonary Fibrosis. Am J Respir Crit Care Med 2021;203:707-17. Erratum in: Am J Respir Crit Care Med 2021;204:113.
32. Ge Z, Chen Y, Ma L, et al. Macrophage polarization and its impact on idiopathic pulmonary fibrosis. Front Immunol 2024;15:1444964.
33. Deng L, Huang T, Zhang L. T cells in idiopathic pulmonary fibrosis: crucial but controversial. Cell Death Discov 2023;9:62.
34. Furuie H, Yamasaki H, Suga M, et al. Altered accessory cell function of alveolar macrophages: a possible mechanism for induction of Th2 secretory profile in idiopathic pulmonary fibrosis. Eur Respir J 1997;10:787-94.
35. Xu J, Miao S, Wu T, et al. CXCR7 promotes pulmonary vascular remodeling via targeting p38/MMP2 pathway in pulmonary arterial hypertension. J Thorac Dis 2024;16:2460-71.
36. Dabaghi M, Singer R, Noble A, et al. Influence of lung extracellular matrix from non-IPF and IPF donors on primary human lung fibroblast biology. Biomater Sci 2025;13:1721-41.
37. Beisang DJ, Smith K, Yang L, et al. Single-cell RNA sequencing reveals that lung mesenchymal progenitor cells in IPF exhibit pathological features early in their differentiation trajectory. Sci Rep 2020;10:11162.
38. Poole JA, Schwab A, Thiele GM, et al. Unique transcriptomic profile of peripheral blood monocytes in rheumatoid arthritis-associated interstitial lung disease. Rheumatology (Oxford) 2024;keae572.
39. Unterman A, Zhao AY, Neumark N, et al. Single-Cell Profiling Reveals Immune Aberrations in Progressive Idiopathic Pulmonary Fibrosis. Am J Respir Crit Care Med 2024;210:484-96.
40. Perrot CY, Karampitsakos T, Unterman A, et al. Mast-cell expressed membrane protein-1 is expressed in classical monocytes and alveolar macrophages in idiopathic pulmonary fibrosis and regulates cell chemotaxis, adhesion, and migration in a TGFβ-dependent manner. Am J Physiol Cell Physiol 2024;326:C964-77.
41. Ryu MH, Yun JH, Kim K, et al. Computational deconvolution of cell type-specific gene expression in COPD and IPF lungs reveals disease severity associations. BMC Genomics 2024;25:1192.
42. Karampitsakos T, Torrisi S, Antoniou K, et al. Increased monocyte count and red cell distribution width as prognostic biomarkers in patients with Idiopathic Pulmonary Fibrosis. Respir Res 2021;22:140.
43. Zhang H, Li X, Zhang X, et al. Quantitative CT analysis of idiopathic pulmonary fibrosis and correlation with lung function study. BMC Pulm Med 2024;24:437.
44. Lee JH, Jang JH, Jang HJ, et al. New prognostic scoring system for mortality in idiopathic pulmonary fibrosis by modifying the gender, age, and physiology model with desaturation during the six-minute walk test. Front Med (Lausanne) 2023;10:1052129.
45. Cocconcelli E, Bernardinello N, Cameli P, et al. Prevalence and Predictors of Response to Antifibrotics in Long-Term Survivors with Idiopathic Pulmonary Fibrosis. Lung 2025;203:35.
(English Language Editor: L. Huleatt)

