Australian Journal of Chemistry
An international journal for chemical science
RESEARCH ARTICLE (Open Access)

Implementation of network embedding strategy on proteome datasets from multi-source cancers to demonstrate marker proteins of cancers

Dezhi Sun https://orcid.org/0000-0002-0069-452X A B # , Ruzhen Chen A # , Shuaikang Ma C , Yuqi Zhang A C and Dong Li https://orcid.org/0000-0002-8680-0468 A *

A State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China.

B Department of Pharmaceutical Sciences, Beijing Institute of Radiation Medicine, Beijing 100850, China.

C College of Life Sciences, Hebei University, Baoding 071002, China.

* Correspondence to: lidong.bprc@foxmail.com
# These authors contributed equally to this paper

Handling Editor: Mibel Aguilar

Australian Journal of Chemistry 76(8) 437-447 https://doi.org/10.1071/CH22176
Submitted: 10 August 2022  Accepted: 22 November 2022   Published: 19 January 2023

© 2023 The Author(s) (or their employer(s)). Published by CSIRO Publishing. This is an open access article distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND)

Abstract

The rapid production of high-throughput cancer omics data provides valuable resources for studying the pathogenesis, prognosis prediction and treatment strategies of cancers. However, the sheer scale of the data poses great challenges for analysis. We therefore applied a representation learning method to the joint analysis of biomedical network and omics data. From the protein expression profiles of patients with early-stage hepatocellular carcinoma, 15-dimensional embedding vectors were obtained for 101 samples. Unsupervised learning was then used to cluster the embedded vectors, and the resulting clusters were consistent with those obtained from the original data, indicating that the spatial distribution of the embedded vectors preserves the similarity between samples. New pan-cancer subtypes were obtained by jointly embedding pan-cancer proteomic expression profiles and pathway network data. A total of 944 proteins, such as KIF2C, AURKA, ATP1B1, BDH1 and C6ORF106, and 143 biological pathways or processes, such as the p53 signaling pathway, nucleotide synthesis, immune diseases, metabolism, and cholesterol synthesis and transportation, were found to be significantly related to these subtypes. These results show that the representation learning system developed here can connect omics data seamlessly with the pathway network. Our method is expected to help mine the biological knowledge contained in omics data and to provide a new perspective for further explanation of molecular mechanisms.

Keywords: biological pathway, network embedding, pan-cancer analysis, proteomics, representation learning.
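
The consistency check described in the abstract, in which the clustering of the embedded vectors is compared with the clustering of the original data, can be illustrated with a small sketch. The abstract does not state which agreement measure was used, so the adjusted Rand index below is an assumption chosen for illustration, and the two label lists are made-up placeholders rather than data from the study.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same samples.

    Illustrative agreement measure only; the article's abstract does not
    say which measure the authors used.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same samples")
    n = len(labels_a)
    if n < 2:
        # With fewer than two samples there are no pairs to disagree on.
        return 1.0

    # Contingency counts: samples in cluster i of A and cluster j of B.
    pair_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    # Number of sample pairs placed together in both, in A, and in B.
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in a_counts.values())
    sum_b = sum(comb(c, 2) for c in b_counts.values())
    total = comb(n, 2)

    expected = sum_a * sum_b / total   # agreement expected by chance
    max_index = (sum_a + sum_b) / 2    # largest attainable agreement
    if max_index == expected:
        # Degenerate case: both labelings are all-one-cluster or all-singletons.
        return 1.0
    return (sum_pairs - expected) / (max_index - expected)


# Hypothetical cluster labels for eight samples (not from the study).
labels_original = [0, 0, 0, 1, 1, 1, 2, 2]   # clustering of the original data
labels_embedded = [1, 1, 1, 0, 0, 0, 2, 2]   # clustering of the embedded vectors

ari = adjusted_rand_index(labels_original, labels_embedded)
print(ari)
```

A value near 1 means the two clusterings group the samples the same way regardless of how the clusters are numbered, while a value near 0 means the agreement is no better than chance.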

