Resource
Open access
Published: 02 July 2024

Pan-cancer profiling of tumor-infiltrating natural killer cells through transcriptional reference mapping

Nature Immunology (2024)


Abstract

The functional diversity of natural killer (NK) cell repertoires stems from differentiation, homeostatic receptor–ligand interactions and adaptive-like responses to viral infections. In the present study, we generated a single-cell transcriptional reference map of healthy human blood- and tissue-derived NK cells, with temporal resolution and fate-specific expression of gene-regulatory networks defining NK cell differentiation. Transfer learning facilitated incorporation of tumor-infiltrating NK cell transcriptomes (39 datasets, 7 solid tumors, 427 patients) into the reference map to analyze tumor microenvironment (TME)-induced perturbations. Of the six functionally distinct NK cell states identified, a dysfunctional stressed CD56^bright state susceptible to TME-induced immunosuppression and a cytotoxic TME-resistant effector CD56^dim state were commonly enriched across tumor types, the ratio of which was predictive of patient outcome in malignant melanoma and osteosarcoma. This resource may inform the design of new NK cell therapies and can be extended through transfer learning to interrogate new datasets from experimental perturbations or disease conditions.

Main

NK cells are innate lymphocytes that play a vital role in the immune response through their ability to directly kill transformed and virus-infected cells and to orchestrate the early phase of the adaptive immune response¹. NK cells are commonly divided into two functionally distinct subsets: CD56^bright and CD56^dim NK cells^2,3. However, this is an oversimplified view of the repertoire. Mass cytometry profiling of NK cell repertoires at the single-cell level revealed an extensive phenotypic diversity comprising up to 100,000 unique subsets in healthy individuals⁴. Much of this diversity is based on combinatorial expression of stochastically expressed, germline-encoded activating and inhibitory receptors that bind to human leukocyte antigen (HLA) class I and tune NK cell function in a process termed NK cell education^5,6. Another layer of diversity reflects the continuous differentiation through well-defined intermediate phenotypes from the naive CD56^bright NK cells through CD62L⁺NKG2A⁺KIR⁻CD57⁻CD56^dim NK cells to terminally differentiated, adaptive CD62L⁻NKG2C⁺CD57⁺KIR⁺CD56^dim NK cells, associated with past infection with cytomegalovirus (CMV)^7,8,9,10. Given the increasing interest in harnessing the cytolytic potential of NK cells in cell therapy against cancer, it is of fundamental importance to understand the molecular programs and gene-regulatory circuits driving NK cell differentiation and the underlying functional diversification of the human NK cell repertoire.

Utilizing single-cell RNA sequencing (scRNA-seq), Crinier et al. discovered organ-specific signatures in human spleen NK cells and two major transcriptional clusters in blood-derived NK cells (PB-NK), corresponding to CD56^dim (NK1) and CD56^bright (NK2) NK cell subsets². Bulk RNA and chromatin immunoprecipitation sequencing identified dominant transcription factor (TF) axes defining CD56^bright (TCF1-LEF-MYC) and CD56^dim (PRDM1) phenotypic subsets, respectively¹¹. Later research reported additional diversity with unique transcriptional clusters, including interleukin (IL)-2- and type I interferon (IFN)-responding NK cell subsets¹² and an intermediate CD56^dimGzmK⁺ stage, potentially bridging CD56^bright and CD56^dim NK cells¹³. A comprehensive analysis unveiled a role for Bcl11b in driving NK cell differentiation toward the adaptive state, reciprocally suppressing early TFs such as RUNX2 and ZBTB16 (ref. ¹⁴). Combining gene expression analysis, chromatin accessibility and lineage tracing via mitochondrial DNA mutations, Rückert et al. revealed clonal expansions and a distinct inflammatory memory signature in adaptive NK cells¹⁵. Using a pan-cancer, single-cell atlas approach, Tang et al.¹⁶ identified a tumor-enriched dysfunctional CD56^dimCD16^hi NK cell population interacting with LAMP3⁺ dendritic cells in the TME. Hence, scRNA-seq and bulk RNA-seq usage have defined major transcriptional regulatory hubs during NK cell differentiation and identified a persistent memory state in human innate immunity. However, it remains unclear how the regulatory gene circuits that operate under homeostasis in healthy tissues are affected by cellular and/or soluble cues in the TME, resulting in perturbed functional states within tumor-infiltrating NK (TiNK) cells.

In the present study, we established a single-cell transcriptional reference map that resolves gene expression trends and dominating TF–target interactions during NK cell differentiation in blood and normal tissues. Reference mapping enables the analysis of cellular differences and gene programs in diseases and various conditions by contextualizing new datasets within a healthy transcriptional reference, facilitating the identification of new states not found in the literature¹⁷. We utilized our NK cell reference map, compiled from 44,640 PB-NK cells (12 donors) and 27,732 tissue-resident NK (TrNK) cells (136 donors), to query the regulons and functional states, as defined through gene expression signatures, of TiNK cells derived from 427 patients with 7 distinct solid tumors (38,982 TiNK cells). We found that TrNK and TiNK cells have a clear tissue-residency signature but still share the dominant regulons of blood CD56^bright and CD56^dim NK cells. Of the six functional states identified in our pan-cancer atlas and confirmed in a spatial transcriptomics dataset, a dysfunctional stressed CD56^bright state susceptible to TME-associated cellular communication and a cytotoxic effector CD56^dim state were commonly enriched across tumor types. Stratification of patient survival data showed that a high ratio of the effector CD56^dim state to the stressed CD56^bright state correlated with improved survival in patients with osteosarcoma and melanoma. This resource provides a granular view of cancer-specific alterations of solid TiNK cells, identifying how the TME can lead to NK cell dysfunction, and may inspire new strategies to engineer cell therapy products with robust functional phenotypes resistant to TME-induced suppressive mechanisms.

Results

NK cell subset annotation using predictive gene signatures

To establish a pan-cancer atlas of TiNK cells, we first defined NK cell differentiation at the transcriptional level. We performed scRNA-seq of the total NK cell population from seven healthy donors and integrated our transcriptomes with five publicly available donor datasets^2,18 using scVI (single-cell Variational Inference)¹⁹ (Supplementary Table 1). By retaining only cell-to-cell variation independent of sample-to-sample variation, the cells that initially clustered by donor and laboratory origin were successfully integrated into a homogeneous population and visualized using diffusion maps²⁰ to preserve the continuous trajectories observed with biological differentiation (Fig. 1a). Although NK cell differentiation is best described as a continuum, CD56^bright and CD56^dim NK cells represent two distinct stages of differentiation. By performing gene signature scoring using AUCell²¹, we identified cells at the top of the diffusion map embedding scoring high for the CD56^bright gene signature², whereas the main body of the embedding exhibited increasing intensity of the CD56^dim signature² (Fig. 1b). Scoring of two independent gene signatures based on the CD56^bright/dim regulon¹¹ and proteome²² confirmed our results (Extended Data Fig. 1a,b).
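The signature scoring described above can be illustrated with a simplified, rank-based score in the spirit of AUCell. This is a minimal sketch, not the AUCell implementation: the function name `auc_score`, the `top_fraction` parameter, the deterministic tie-breaking and the toy gene sets are all assumptions made for illustration, and the gene sets are not the published CD56^bright/CD56^dim signatures.

```python
def auc_score(expression, signature, top_fraction=0.05):
    """Simplified AUCell-like score for one cell.

    expression: dict mapping gene -> expression value for a single cell.
    signature:  iterable of genes in the signature.
    Returns the area under the signature recovery curve within the
    top-ranked genes, normalized by the best achievable area (0 to 1).
    """
    # Rank genes from highest to lowest expression; ties are broken by
    # gene name here (an assumption made only for determinism).
    ranked = sorted(expression, key=lambda g: (-expression[g], g))
    max_rank = max(1, int(round(top_fraction * len(ranked))))
    sig = set(signature) & set(expression)
    if not sig:
        return 0.0
    hits = 0
    area = 0
    for gene in ranked[:max_rank]:
        if gene in sig:
            hits += 1
        area += hits  # height of the recovery curve at this rank
    # Best case: all signature genes occupy the very top ranks.
    n = min(len(sig), max_rank)
    max_area = sum(min(i, n) for i in range(1, max_rank + 1))
    return area / max_area


# Toy cell and toy gene sets (hypothetical, for illustration only).
cell = {"GZMK": 9, "SELL": 8, "XCL1": 7, "B2M": 6,
        "ACTB": 5, "FCGR3A": 1, "PRF1": 0, "GZMB": 0}
bright_like = ["GZMK", "SELL", "XCL1"]
dim_like = ["FCGR3A", "PRF1", "GZMB"]

print(auc_score(cell, bright_like, top_fraction=0.5))  # 1.0
print(auc_score(cell, dim_like, top_fraction=0.5))     # 0.0
```

Because the score depends only on the within-cell gene ranking, it is insensitive to per-cell scaling of the counts, which is one reason rank-based scores are convenient for comparing cells across donors and datasets.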

**Fig. 1: NK cell differentiation at the transcriptional level.**

The relatively large and heterogeneous population of CD56^dim NK cells is commonly divided phenotypically into functionally distinct subsets based on a selected number of inhibitory and activating receptors contributing to the functional tuning⁷. To identify predictive gene signatures associated with these functional stages encompassing NK cell differentiation, we sorted and sequenced equal numbers of CD56^bright NK cells and four CD56^dim NK cell subsets (NKG2A⁺KIR⁻CD57⁻, NKG2A⁻self-KIR⁺CD57⁻, NKG2A⁻nonself-KIR⁺CD57⁻, NKG2A⁻self-KIR⁺CD57⁺ or NKG2A^−/+self-KIR⁺CD57⁺NKG2C⁺) from two donors, one without and one with a large adaptive NK cell expansion (Fig. 1c and Extended Data Fig. 1c,d). Transcriptionally, the adaptive NK cell subset was the most distinct, whereas the remaining CD56^dim subsets exhibited a high degree of transcriptional overlap while still ordering themselves along the previously defined maturation scheme (Fig. 1c). As previously observed in bulk RNA-seq data²³, the transcriptomes of self and nonself KIR⁺ NK cells were highly similar even at the single-cell level and were thus merged for subsequent analysis (Fig. 1c). The five transcriptionally distinct NK cell subsets were renamed to reflect their maturation stage: ‘CD56^bright’, ‘early CD56^dim’, ‘intermediate CD56^dim’, ‘late CD56^dim’ and ‘adaptive’ (Fig. 1c).

We next trained a semi-supervised model, scANVI (single-cell ANnotation using Variational Inference)²⁴, to leverage our identified NK cell subset gene signatures to predict and infer subset annotation of compiled bulk NK cell scRNA-seq datasets. We first tested the accuracy of the prediction model (M1) on 15% of the subset-sorted NK cells (Fig. 1c) that were not included in the training of the model. Transcriptionally distinct subsets (CD56^bright, adaptive) were annotated with high accuracy, whereas subsets exhibiting higher transcriptional overlap were annotated with slightly reduced accuracy (Fig. 1d). Using this model, we could annotate the total NK cell dataset comprising 23,253 single-cell transcriptomes across 12 donors at the subset level (Fig. 1e). The transcriptional profiles of the subsets are captured by the model and used to identify differentially expressed genes (DEGs). The overlapping sets of genes illustrate the transition between the subsets (Fig. 1f). To validate our annotation model, we performed unbiased clustering (Leiden) of the total NK cell dataset (12 donors), identifying 5 clusters closely matching our annotated 5 NK cell subsets (Fig. 1g). A small portion of intermediate CD56^dim-annotated NK cells clustered together with late CD56^dim-annotated NK cells in cluster 4 (Fig. 1h), probably corresponding to more mature cells within the population. The subset stratification obtained through training of our model based on subset signatures, as well as the unbiased Leiden clustering, harmonizes well with the recently proposed NK1–3 nomenclature²⁵ (Extended Data Fig. 1e). Having confirmed the validity of our five NK cell subsets, we used M1 to identify donors with an adaptive NK cell expansion, all of whom were confirmed to be CMV seropositive (Fig. 1i). Thus, this first scANVI model forms a basis to interrogate cellular states layered on top of the natural transcriptional changes with NK cell subsets at different stages of differentiation.
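The held-out evaluation described above (15% of the subset-sorted cells withheld from training, then accuracy reported per subset) can be sketched as follows. This is an illustrative sketch only: the text does not state whether the split was stratified by subset, so stratification is an assumption here, and the helper names `stratified_holdout` and `per_subset_accuracy` are hypothetical. The classifier itself (scANVI in the study) is not reproduced.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction=0.15, seed=0):
    """Split cell indices into train/test, holding out a fixed fraction
    of each subset (stratification is an assumption, see lead-in)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    test = set()
    for idxs in by_label.values():
        rng.shuffle(idxs)
        n_test = max(1, round(test_fraction * len(idxs)))
        test.update(idxs[:n_test])
    train = [i for i in range(len(labels)) if i not in test]
    return train, sorted(test)


def per_subset_accuracy(true_labels, predicted_labels):
    """Fraction of correctly annotated cells for each true subset."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(true_labels, predicted_labels):
        total[truth] += 1
        correct[truth] += int(truth == pred)
    return {label: correct[label] / total[label] for label in total}


# Toy usage with two subsets of 20 cells each.
labels = ["CD56bright"] * 20 + ["adaptive"] * 20
train_idx, test_idx = stratified_holdout(labels)
print(len(train_idx), len(test_idx))
```

Reporting accuracy per subset rather than overall is what exposes the pattern noted in the text: transcriptionally distinct subsets are recovered more reliably than those with overlapping profiles.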

Fate-specific gene-regulatory networks

To decipher the regulatory gene pathways driving NK cell differentiation at the transcriptional level, we used Palantir²⁶ and RNA velocity to calculate pseudotime^27,28. Palantir identifies terminal cells based on a chosen starting cell, placing the remaining cells along a timeline (pseudotime). Defining the starting cell (blue) based on the lowest CD56^dim score² (Fig. 1b) identified two terminal cells (orange), predicted to be part of the late CD56^dim and adaptive population, respectively (Fig. 2a). To validate this trajectory, we utilized the dynamic model implemented in scVelo²⁷ to compute RNA velocity (spliced versus unspliced transcripts), inferring pseudotime without a predefined starting cell (Extended Data Fig. 2a,b). The resulting vector field and extrapolated pseudotime confirmed a trajectory starting within the CD56^bright NK cell subset and terminating in the adaptive subset (Fig. 2b). Last, to infer developmental relationships at the resolution of the five subsets, representing functionally distinct subsets and proposed stages of NK cell differentiation⁷, we applied partition-based graph abstraction (PAGA)²⁹ to quantify their connectivity and estimate transitions. In line with the two terminal fates (late CD56^dim, adaptive) identified by Palantir, we analyzed donors with conventional and adaptive NK cells separately (Fig. 1i). In both types of donors, early CD56^dim NK cells formed the connecting link between CD56^bright and the remaining CD56^dim populations (Fig. 2c,d). However, whereas NK cells from adaptive donors progressed through intermediate CD56^dim cells and terminated in the transcriptionally distinct adaptive population, NK cells from conventional donors instead progressed toward intermediate/late CD56^dim populations (Fig. 2c,d).
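The idea of ordering cells along a timeline from a chosen starting cell can be sketched as a shortest-path distance over a neighbor graph. This is a deliberately reduced illustration, assuming a tiny hand-built graph with hypothetical node names and edge weights; it does not reproduce Palantir's diffusion-map embedding, its waypoint refinement or its Markov-chain estimation of terminal states and fate probabilities.

```python
import heapq


def pseudotime_from_start(graph, start):
    """Dijkstra shortest-path distance from `start`, scaled to [0, 1].

    graph: dict mapping node -> list of (neighbor, distance) pairs.
    Only nodes reachable from `start` receive a pseudotime.
    """
    dist = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    longest = max(dist.values())
    if longest == 0:
        return {node: 0.0 for node in dist}
    return {node: d / longest for node, d in dist.items()}


def undirected(edges):
    """Build a symmetric adjacency dict from (u, v, distance) triples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    return graph


# Toy graph with one branch point and two end points (weights are made up).
toy = undirected([
    ("bright", "early", 1.0),
    ("early", "intermediate", 1.0),
    ("intermediate", "late", 1.0),
    ("intermediate", "adaptive", 2.0),
])
pt = pseudotime_from_start(toy, "bright")
print(pt)
```

In this toy graph the two end points sit at different pseudotimes after a shared path through the intermediate node, which mirrors the two-fate structure described in the text in a very schematic way.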

**Fig. 2: GRNs defining conventional and adaptive NK cell fates.**

Having established a temporal axis to NK cell differentiation, we utilized generalized additive models (GAMs) to compute gene expression trends as a function of time for each gene²⁶, which clustered into five distinct trends (Fig. 2e). Genes varying in expression across the two terminal fates were depicted in their trends for each fate, exemplified by KLRC2, CD52 (refs. ^15,18) and IL32 clustering into trend 1 in the conventional late CD56^dim fate and trend 4 in the adaptive fate (Fig. 2e). Based on the two-fate model, we constructed gene-regulatory networks (GRNs)²¹ stratified by the five gene trends and identified the dominant TFs across pseudotime and their known downstream target genes (Fig. 2f). Trend 1 is dominated by genes that are downregulated with differentiation from CD56^bright to CD56^dim cells, including previously reported TFs (MYC, LEF1, RUNX2)¹¹, RBPJ³⁰ involved in Notch signaling, the retinoic acid receptor (RXRA) and TFs regulating ID2 expression (HOXA9, HOXA10)³¹ (Fig. 2e,f). Trend 2 genes, compared with trend 1, are upregulated during differentiation from early to intermediate CD56^dim cells and include, among others, EGR1 (ref. ³²) (cell survival, proliferation, apoptosis, regulation of TRAIL expression), BHLHE40 (refs. ^33,34) (associated with NK cell activation and repression of RXRA) and IRF8 (refs. ^35,36) (role in orchestrating adaptive response, essential NK cell gene) (Fig. 2e,f). TFs exhibiting less dynamic changes across pseudotime are clustered in trend 3, such as IKZF1 (ref. ³⁷), XBP1 and KLF2, which play a role in regulating homeostatic proliferation, effector function and cytokine responsiveness^38,39. TFs exhibiting higher expression at the start and end of pseudotime fall into trend 4, including STAT3 (cell survival, IFN-γ production) and DDIT3 (ref. ⁴⁰) (stress response, metabolism). 
Last, expression of trend 5 genes steadily increases with differentiation, decreasing only during late differentiation, and includes previously reported TFs associated with CD56^dim NK cells (MAF, PRDM1, TBX21)¹¹, the AP-1 family member BATF, the ETS family member ETV7 and the Wnt target gene ASCL2 (Fig. 2e,f). The TF-based GRNs were further curated to only retain direct targets with significant motif enrichment, referred to as ‘regulons’ (denoted by ‘(+)’), expression of which was confirmed in an independent bulk RNA-seq dataset on sorted NK cell subsets. Regulon expression substantially differing between the conventional and adaptive fate includes conventional fate-associated BHLHE40 (ref. ³⁴), IRF8 (refs. ^35,36) and DDIT3 (ref. ⁴⁰) and adaptive fate-associated MAF¹¹, BATF and PRDM1 (ref. ⁴¹) regulons (Fig. 2g). Clustering of dominant TFs according to their temporal expression during NK cell differentiation revealed a set of highly connected regulatory circuits, expression of which diverged during terminal differentiation into one of the two cell fates: conventional or adaptive.

Transfer learning to generate pan-cancer atlas

Having transcriptionally defined NK cell differentiation in peripheral blood (PB), we proceeded to train a second model (M2) with publicly available scRNA-seq datasets encompassing 6 healthy tissues (prostate, lung, pancreas, skin, breast, brain) from a total of 136 donors using scVI¹⁹ to generate a healthy reference map (PB-NK + TrNK) (Fig. 3a and Supplementary Table 2). The tissue-specific datasets were integrated and annotated using scANVI, and CellTypist⁴² was used to identify immune subsets of interest at the pan-tissue level (Fig. 3b and Extended Data Fig. 3a) and within individual tissues (Extended Data Fig. 3b–f). The annotation and integration steps were repeated for the scRNA-seq datasets from 7 solid tumors (prostate (PRAD), lung (NSCLC), melanoma (SKCM), pancreas (PAAD), breast cancer (BRAC), glioblastoma (GBM) and osteosarcoma (SARC)) from a total of 427 patients (Supplementary Tables 3 and 4), at the pan-cancer level (Fig. 3c,d) and within individual tumor types (Extended Data Fig. 4a–g). CellTypist-annotated innate lymphoid cells (ILCs) (Extended Data Fig. 5a,b) were further stratified into ILC1/2/3 based on previously described scRNA-seq signatures⁴³. We could not identify ILC1s in either the tissue or the tumor datasets, but, importantly, ILC2- and ILC3-annotated cells scored highly for IL7R expression compared with CD56^bright- and CD56^dim-annotated NK cells, excluding contamination by ILC1s (Extended Data Fig. 5c,d).

**Fig. 3: Pan-cancer atlas of healthy tissue-resident and solid TiNK cells.**

To assess tissue-residency status in our annotated NK cells in the tissue- and tumor-derived datasets (Extended Data Fig. 5a,b), we utilized a literature-derived tissue-residency (TR) signature as well as our own atlas-derived TR (atlas-TR) signature (Fig. 3e). The atlas-TR signature is based on the top six genes differentially expressed by both CD56^bright and CD56^dim NK cells across tissue types when compared with the corresponding subset in the blood-derived NK cells (Extended Data Fig. 5e,f). CD56^bright NK cells generally scored higher for a TR signature than CD56^dim NK cells in both normal tissue and tumors, with a more distinct TR signal (compared with PB-NK) achieved with the atlas-TR signature (Fig. 3e and Extended Data Fig. 5g). NK cells annotated in healthy brain scored very low for tissue residency and thus we cannot exclude blood contamination in these samples (Extended Data Fig. 5g).

CD56^bright- and CD56^dim-annotated TiNK cells were mapped on to the reference map (PB-NK, TrNK) using transfer learning (scArches⁴⁴) to generate the final model (M3), our pan-cancer NK atlas (Fig. 3f). CD56^bright and CD56^dim subsets from PB, tissues and tumors clustered together and were more tightly connected to one another than to their respective tissue or tumor of origin, apart from skin-/SKCM-derived NK cells (Fig. 3g,h). Thus, differentiation stage had a greater influence on the NK cell transcriptome than tissue origin. Transfer learning facilitated incorporation of TiNK cells on to our healthy reference map of PB and TrNK cells, allowing for downstream systematic interrogation of cellular states within solid TiNK cells.

Altered NK cell subset frequencies across tissues and tumors

The TME is shaped by its cellular composition, in particular by the infiltrating immune cells, which in turn can be modulated by their surroundings. A pan-cancer comparison of the healthy tissue and tumor-annotated immune subtypes (Fig. 3b,d) identified an increased proportion of plasma cells and naive B cells, as well as a decreased proportion of CD56^dim NK cells, classic monocytes, dendritic cells, NK T cells, and effector memory/effector T helper cells (helper T_EM/EFF), effector memory/effector memory re-expressing CD45RA cytotoxic T cells (cytotoxic T_EM/EMRA) and resident memory cytotoxic T cells (cytotoxic T_RM) in the pan-cancer datasets (Fig. 4a). The fraction of CD56^bright NK cells out of total immune cells was enriched in BRAC, whereas CD56^dim NK cells were enriched in SKCM, but decreased in NSCLC and BRAC (Fig. 4a–c). We further annotated the CellTypist-identified NK cells at the subset level using our subset-trained model (M1) (Fig. 4d,e). Skewing of the CD56^bright:CD56^dim ratio between healthy blood or tissue and tumor was observed for most tumor types (Fig. 4d,e), including non-small cell lung cancer (NSCLC), which was independently validated by flow cytometry in an NSCLC cohort (Fig. 4f and Extended Data Fig. 6a). In line with this, we observed a general decrease in the intermediate CD56^dim population within the TiNK cells (Fig. 4d,e). Protein-based annotation of the CD56^dim population in the NSCLC cohort also identified a decrease of the early and intermediate CD56^dim subsets and a modest increase of the late CD56^dim subset compared with healthy blood controls (Fig. 4g and Extended Data Fig. 6b–e). Thus, solid TiNK cells were enriched for a CD56^bright transcriptional phenotype, whereas intermediate CD56^dim NK cells were reduced within the CD56^dim compartment in solid tumors, findings that were verified at the protein level in an NSCLC cohort⁴⁵.

**Fig. 4: Cellular composition of pan-cancer cell atlas and subset distribution of TiNK cells.**

Six functionally distinct cellular states of NK cells

TMEs of solid tumors are hostile and often immunosuppressive environments for immune cells to infiltrate⁴⁶. Understanding how the TME modulates NK cells at the transcriptional level can provide important insights into tumor-mediated immunosuppressive mechanisms and how to overcome them.

We implemented an unbiased approach (Milo⁴⁷) to ascertain cellular states in our pan-cancer NK cell atlas by identifying 6,932 individual neighborhoods without pre-clustering based on cellular origin. Annotating individual neighborhoods as subset specific (>70% of cells in the neighborhood) identified TiCD56^bright NK cells as having the most frequent, but also the most unique (differentially abundant), specific neighborhoods (Extended Data Fig. 7a). Notably, most neighborhoods were annotated as ‘mixed’, highlighting transcriptional similarities among NK cells found in PB, tissues and tumors (Extended Data Fig. 7a). The 6,932 neighborhoods were grouped into 6 distinct neighborhood groups and tested for differential abundance between TiNK cells and reference NK cells (Ref-NK; PB-NK and TrNK) (Fig. 5a and Extended Data Fig. 7b). Neighborhood groups 1 and 2 consisted of neighborhoods significantly enriched for TiNK cells and group 6 included neighborhoods enriched for Ref-NK cells (Fig. 5b and Extended Data Fig. 7b).
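The neighborhood annotation rule stated above (a neighborhood is subset specific when more than 70% of its cells carry the same label, and ‘mixed’ otherwise) can be written down in a few lines. This is a minimal sketch of that rule only; the function name `annotate_neighborhood` and the example labels are hypothetical, and the construction of neighborhoods and the differential abundance testing performed by Milo are not shown.

```python
from collections import Counter


def annotate_neighborhood(cell_labels, threshold=0.70):
    """Return the majority label if it covers more than `threshold`
    of the cells in the neighborhood, otherwise 'mixed'."""
    if not cell_labels:
        raise ValueError("a neighborhood must contain at least one cell")
    label, count = Counter(cell_labels).most_common(1)[0]
    return label if count / len(cell_labels) > threshold else "mixed"


# Toy neighborhoods (labels are illustrative only).
nhood_a = ["TiCD56bright"] * 8 + ["PB-CD56dim"] * 2   # 80% one label
nhood_b = ["TiCD56bright"] * 7 + ["PB-CD56dim"] * 3   # exactly 70%
nhood_c = ["TiCD56dim"] * 5 + ["TrCD56dim"] * 5       # evenly split

print(annotate_neighborhood(nhood_a))  # TiCD56bright
print(annotate_neighborhood(nhood_b))  # mixed
print(annotate_neighborhood(nhood_c))  # mixed
```

The strict inequality follows the ‘>70%’ wording in the text, so a neighborhood sitting exactly at the threshold is treated as mixed.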

**Fig. 5: Distinct cellular states of NK cells identified in pan-cancer atlas.**

Next, we visualized the distribution of NK cell subsets within each group using our annotation model (M1). Groups 1 and 2 were enriched for, but not exclusive to, CD56^bright cells, whereas groups 3–6 were dominated by CD56^dim NK cell subsets (Fig. 5c). The dominant TF regulons of PB-NK cell differentiation previously identified (Fig. 2f) confirmed groups 1 and 2 as two CD56^bright states and groups 3–6 as four CD56^dim NK cell states (Fig. 5d).

Cell-state-specific GRNs, DEGs, gene set enrichment analysis (GSEA) and signature scoring informed our annotation of the states as stressed CD56^bright (group 1), typical CD56^bright (group 2), effector CD56^dim (group 3), adaptive CD56^dim (group 4), activated CD56^dim (group 5) and typical CD56^dim (group 6) (Fig. 5e–n and Extended Data Fig. 7c–i). Comparing the stressed with the typical CD56^bright state (group 1 versus group 2) identified increased expression of the cellular stress response ATF3 regulon, the hypoxia-induced MAFF regulon and numerous heat shock proteins (Fig. 5e,g and Extended Data Fig. 7f). The stressed CD56^bright cell state scored highly for immunosuppressive pathways (transforming growth factor (TGF)-β signaling, hypoxia, reactive oxygen species (ROS)) and exhibited increased metabolic activation (glycolysis, cholesterol homeostasis, fatty acid metabolism and mTORC1 (mammalian target of rapamycin complex 1)) (Fig. 5g,j–l). Furthermore, a low NK cell cytotoxicity score, exemplified by reduced effector and activating signaling molecules, was suggestive of reduced functionality in this stressed CD56^bright cellular state, which was uniquely enriched across all seven tumor types (Fig. 5i,m,o). In line with increased infiltration of CD56^bright cells in the TME, the typical CD56^bright cellular state was also enriched in five of seven tumor types compared with healthy tissue, with both CD56^bright groups exhibiting higher expression of immunomodulatory molecules, including XCL1, XCL2 and IFNG (Fig. 5n–o).

Of the CD56^dim states, the effector state was most frequently enriched across tumor types (SARC, PAAD), characterized by an enrichment for apical junction, actin and cytoskeleton-related genes as well as effector molecules (Fig. 5h and Extended Data Fig. 7g). This state, phenotypically enriched for intermediate and late CD56^dim NK cell subsets, scored highly for NK cytotoxicity and oxidative phosphorylation and, importantly, low for immune suppression (Fig. 5i,k–m). The adaptive CD56^dim state was uniquely enriched for adaptive NK cells, in line with adaptive-associated genes (CD52, IL32, GZMH, CD3E) being upregulated in this state (Fig. 5c and Extended Data Fig. 7c). The activated CD56^dim state was distinguished by increased hypoxia, upregulated nutrient transporters and the mTORC1–Myc axis (Fig. 5i,k and Extended Data Fig. 7d,h). Last, the PB-enriched typical CD56^dim state exhibited a low stress score and a high cytotoxicity score and was associated with IFN, tumor necrosis factor (TNF) and JAK/STAT signaling (Fig. 5i–j,m and Extended Data Fig. 7e,i). Notably, although we observed enrichment of individual cellular states in the TME, including the two CD56^bright and the effector CD56^dim states, all states were represented in healthy blood and tissue samples, albeit at different frequencies.

State-specific signaling in the TME links to functionality

To elucidate any TME-based influence on the six functional states identified, we employed CellChat⁴⁸ to infer intercellular communication, focusing on commonly enriched signaling pathways across all seven tumor types. Group 1 and 2 NK cell states were enriched for incoming signaling across tumor types from four dominant communication pathways (Fig. 6a). Increased expression of CD44, CXCR4 and CD74 on group 1 and 2 NK cells, on which numerous signals from fibroblasts, endothelial cells, tumor cells and macrophages converged (COLLAGEN, MIF, LAMININ), facilitated the augmented incoming signaling in NSCLC (Fig. 6b,c). Notably, the fibroblasts, endothelial cells, tumor cells and cancer-associated fibroblasts (CAFs) also exhibited the strongest outgoing interaction strength across tumor types (Extended Data Fig. 8a–g). Furthermore, group 1 and 2 NK cells preferentially received inhibitory input via the major histocompatibility complex I (MHC-I) (HLA-E/KLRC1) pathway owing to high KLRC1 expression in these cellular states (Fig. 6a,d). Hence, group 1 and 2 cellular states were more receptive to TME-induced immunosuppressive signals via upregulated expression of CD44, CXCR4, CD74 and KLRC1.

**Fig. 6: Intercellular communication of distinct cellular states in the TME.**

To understand how NK cells contribute to shaping the TME via an immunomodulatory role, we focused our analysis on outgoing signaling largely restricted to NK cells. We identified three signaling pathways (CC chemokine ligand (CCL), protease-activated receptors (PARs), IFN-II) through which NK cells predominantly communicated with dendritic cells, macrophages, fibroblasts and endothelial cells (Fig. 6e,f). CCL3 and CCL5, expressed across all states, can lead to the recruitment of cells expressing ACKR1, CCR1 and CCR4 (Extended Data Fig. 6h). Release of granzyme A, highly expressed at the transcriptional level by the effector NK cell state (group 3), can induce apoptosis of F2R-expressing cells in the TME, such as fibroblasts (Fig. 6g). Granzyme A expression was reduced in both frequency and intensity in CD56^dim NK cells from central tumor samples from patients with NSCLC compared with healthy blood controls, hinting at a release of granzyme A by NK cells within the tumor (Fig. 6h,i). Release of IFN-γ, predominantly by the stressed CD56^bright (group 1) state, can induce surrounding cells to upregulate MHC-I expression, including HLA-E (Fig. 6g and Extended Data Fig. 9a–d). Inhibitory signaling via the HLA-E axis significantly inhibits degranulation and granzyme B release of both CD56^bright and CD56^dim NK cells, as demonstrated by co-culturing NK cells with A549 (NSCLC) target cells pre-stimulated with IFN-γ to upregulate HLA-E expression (Fig. 6j,k and Extended Data Fig. 9a–e). Blockade of the NKG2A–HLA-E axis, using an anti-NKG2A antibody, resulted in significant recovery of function, both degranulation and granzyme B release (Fig. 6j,k and Extended Data Fig. 9e). 
In summary, CD56^bright cellular states exhibited increased inhibitory signaling (MHC-I) and augmented susceptibility to TME-induced suppression (MIF, COLLAGEN, LAMININ), whereas CD56^dim states, particularly the effector state, exhibited high GZMA signaling, which was confirmed in CD56^dim NK cells from patients with NSCLC.

Ratio of cellular states is predictive of patient outcome

Having identified 6 functionally distinct cellular states of NK cells within our pan-cancer NK cell atlas comprising 89,850 scRNA-seq transcriptomes, we validated our findings in spatial RNA-seq datasets (Supplementary Table 5). Spatial RNA-seq tissue sections from SKCM, NSCLC and GBM were deconvoluted using Tangram⁴⁹ combined with our established scRNA-seq references for the tumor types being analyzed to identify the cell types in these datasets (Fig. 7a). The composition of the main immune subtypes in SKCM, NSCLC and GBM varied greatly across tumor types, but was highly consistent across sequencing techniques (scRNA-seq versus spatial-seq) (Fig. 7b). Focusing on SKCM, which harbored the highest proportion of NK cells (Fig. 7b), we could further stratify the annotated NK cells into CD56^bright and CD56^dim subsets (Fig. 7c) and cellular states (Fig. 7d). Importantly, confirming previous results (Fig. 5i,m), the effector (group 3) and typical (group 6) CD56^dim states scored highly for genes associated with NK cell cytotoxicity. Similarly, stress response-related genes, as well as immunosuppression-related genes (ROS, hypoxia), scored highest in the stressed CD56^bright (group 1) state (Fig. 7f,g), in line with results in the scRNA-seq data (Fig. 5i–k).

**Fig. 7: Distinct cellular states in spatial RNA-seq and association with patient outcome.**

The clinical benefit of NK cell infiltration in solid tumors has previously been assessed through a general NK cell signature score^50,51. Having identified six functional states of NK cells in blood, tissue and solid tumors, in both scRNA-seq and spatial-seq datasets, we proceeded to test the clinical relevance of these cellular states by using BayesPrism⁵² to deconvolute TCGA (The Cancer Genome Atlas) RNA-seq data for which survival data were also available^53,54 (Extended Data Fig. 10). A higher ratio of effector CD56^dim:stressed CD56^bright NK state signatures was predictive of improved survival in SARC and SKCM (Fig. 7h). Thus, the ratio between two of the six functional states identified in our pan-cancer NK cell atlas, and confirmed in spatial RNA-seq datasets, is predictive of outcome in patients with osteosarcoma and melanoma.
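The ratio-based stratification can be sketched as follows, starting from per-patient state fractions such as those a deconvolution step might return. This is an illustrative sketch under stated assumptions: the median split, the pseudocount and the names `state_ratio` and `stratify_by_ratio` are not taken from the study, the patient values are invented, and no survival modeling is performed here.

```python
from statistics import median


def state_ratio(effector_fraction, stressed_fraction, pseudocount=1e-6):
    """Effector CD56dim : stressed CD56bright ratio for one patient.
    The pseudocount (an assumption) avoids division by zero."""
    return (effector_fraction + pseudocount) / (stressed_fraction + pseudocount)


def stratify_by_ratio(patients):
    """Assign each patient to a 'high' or 'low' ratio group.

    patients: dict mapping patient id -> (effector_fraction, stressed_fraction).
    The cohort median is used as the cutoff (an assumption; the cutoff
    used in the study is not stated in this text).
    """
    ratios = {pid: state_ratio(e, s) for pid, (e, s) in patients.items()}
    cutoff = median(ratios.values())
    return {pid: ("high" if r > cutoff else "low") for pid, r in ratios.items()}


# Invented example cohort: (effector fraction, stressed fraction).
cohort = {
    "p1": (0.4, 0.1),
    "p2": (0.1, 0.4),
    "p3": (0.2, 0.2),
    "p4": (0.3, 0.1),
}
groups = stratify_by_ratio(cohort)
print(groups)
```

The resulting groups would then be compared with a survival method of choice (for example Kaplan–Meier curves with a log-rank test), which is outside the scope of this sketch.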

Discussion

In the present study, we report a compact description of the transcriptional diversification encompassing human NK cell differentiation at the single-cell level. By enriching for less frequent, but phenotypically well-defined, functionally distinct NK cell subsets, we could first train a model to correctly annotate five transcriptional subsets from bulk NK cell populations. By applying probabilistic models implemented in scvi-tools, we created a transcriptional reference map of human blood and TrNK cells from normal tissues, including blood, pancreas, lung, breast, skin, prostate and brain. Transfer learning using scArches facilitated integration of query datasets comprising a total of 2,176,214 transcriptomes from 427 patients spanning 7 solid tumor types. By extracting, annotating and mapping the TiNK cells on to our reference map of healthy donors, we could systematically interrogate TME-induced perturbations of GRNs and functional states of TiNK cells (Supplementary Fig. 1). Our pan-cancer atlas revealed six functionally distinct NK cell states with varying abundance across blood, tissues and tumor types, which we could confirm in spatial RNA-seq datasets (SKCM, NSCLC, GBM). Two states commonly enriched across tumor types were a dysfunctional CD56^bright cellular state susceptible to TME-induced immunosuppression and a cytotoxic TME-resistant CD56^dim state, the ratio of which was predictive of patient outcome.

The view that NK cells, like T cells and other immune cells, undergo a continuous process of differentiation is relatively recent and was originally based on phenotypic and functional classification of discrete subsets^7,55. There is abundant evidence to suggest that the CD56^bright NK cell subset is the most naive, giving rise to the more differentiated CD56^dim NK cells which can further differentiate toward terminal stages, a process accelerated by CMV infection^8,56,57. Instead of forcing individual NK cells into arbitrary clusters representing a snapshot of a given time point of differentiation, we clustered TFs and their target genes into five distinct gene expression trends as a function of pseudotime, reflecting continuous differentiation. The dominant TF regulons within these five gene trends correlated with functional traits of NK cells along the differentiation axis, such as cytokine responsiveness, as well as proliferative and cytotoxic capacity. By retaining fate-specific expression profiles (conventional versus adaptive fate in donors with CMV-induced clonal NK cell expansions), we could observe clear divergence of regulon expression (for example, BATF, MAF) during terminal differentiation. BATF belongs to the AP-1 TF family, members of which have been identified as potential drivers in shaping adaptive NK cell chromatin accessibility and thus dictating the unique functional features of this subset, including enhanced IFN-γ response to receptor stimulation¹⁵. Establishing dominant regulons defining NK cell differentiation in PB provided a vital reference for downstream interrogation of both TrNK and solid TiNK cells.

Utilizing CellTypist, we harmonized annotations of individual cell subtypes across multiple datasets from six different healthy tissues, extracting and integrating CD56^bright and CD56^dim NK cells using scVI¹⁹ to expand our transcriptional reference map. Importantly, tissue- as well as tumor-annotated NK cells did not express human ILC signature genes (IL7R), instead expressing both EOMES and TBX21. Literature-derived tissue-residency genes (for example, CD69, ITGAE, ITGA1, CXCR6, ZNF683 and IKZF3), originally extrapolated from tissue-resident T cell signatures^58,59,60,61, were more highly expressed in tissue-derived NK cells, particularly in CD56^bright NK cells⁶². Using our extensive pan-cancer NK cell atlas, we were able to generate a solely NK cell-derived, tissue-residency signature (atlas-TR: PSMA2, SLC5A3, CCL4L2, CLN3, SCGB1A1, AREG), which outperformed the conventional literature-derived TR signature across tissue and tumor type. CD56^bright and CD56^dim NK cells from healthy brain tissue exhibited a low TR score, indicative of potential blood contamination in this specific dataset. Importantly, GBM-derived CD56^bright and CD56^dim NK cells scored highly for tissue residency, supporting their infiltration into the tumor. Expression of CCL4L2, encoding a chemokine that induces chemotaxis of CCR5- and CCR1-expressing cells, such as T cells, dendritic cells and macrophages, has previously been described in NK cells isolated from melanoma samples⁶³. This represents an independent verification, because this dataset was not included in our study. These melanoma-infiltrating NK cells also exhibited high expression of AREG, which encodes an epidermal growth factor (EGF) receptor ligand. Notably, upregulation of AREG has also been described in the setting of healthy and cirrhotic liver-resident NK cells⁶⁴, a tissue type not included in our pan-cancer atlas. Intriguingly, SCGB1A1, a member of the secretoglobin family, functions as a potent inhibitor of phospholipase A₂ (ref. ⁶⁵), a well-described immunosuppressive molecule contributing to the development of the TME. Hence, it is tempting to speculate that secretion of the SCGB1A1-encoded protein could be another effector mechanism through which TiNK cells can positively contribute to remodeling of the TME.

The presence and abundance of NK cells that reside in the tumor bed vary across tumor types and treatments and between patients, and appear to be associated with the chemokine profiles in the different tissues/TMEs^66,67,68,69. In agreement with previous studies^45,67,70, we observed a predominance of CD56^bright NK cells in tumors compared with the corresponding normal tissue. TrNK cells are probably a mixed population including naturally residing TrNK cells and TiNK cells. Compositional differences between normal and tumor tissues suggest some degree of active recruitment, particularly in SKCM where NK cell frequencies starkly increased, although expansion from tissue-resident pools cannot be excluded. Migration into the TME is regulated by a broad family of integrins, selectins and chemokine receptors that are differentially expressed during NK cell differentiation. CXCR3, primarily expressed on CD56^bright NK cells, has been implicated in homing to several solid tumors based on CXCL10 gradients^71,72, and thus may contribute to the predominance of this subset in tumors. CCL2, CCL3, CCL5, CXCL8, CXCL9, CXCL10 and CXCL12 have similarly been implicated in mediating predominantly CD56^bright NK cell trafficking into solid tumors based on chemokine receptor expression⁶⁹. Release of CCL3 and CCL5 by NK cells can also recruit CCR1-expressing immune cells, such as macrophages. We observed increased CXCR4 expression in group 1 and 2 cellular states, corresponding to CD56^bright TrNK and TiNK cells. Previous reports^73,74 have demonstrated CD44-induced CXCR4 upregulation resulting in increased migration and invasiveness of malignant cells. Notably, CD44 was highly expressed on the tumor-enriched stressed CD56^bright state, alongside CXCR4 and CD74, possibly sensitizing this population to TME-mediated immunosuppression from CAFs, fibroblasts, endothelial and tumor cells, as noted by high scores for TGF-β signaling, hypoxia and ROS. High immunosuppression of this state is in line with the increased stress response noted, as exemplified by high expression of the cellular stress response-associated TF ATF3, the HSP70 co-chaperone BAG3, the stress-induced growth arrest gene GADD45B and DUSP1, which is associated with cellular response to environmental stress.

Transcriptional stress response programs, including heat shock proteins, have previously been reported as a potential artefact downstream of digestion of tissues⁷⁵. We therefore took several measures to rule out digestion artefacts when compiling the present resource. In addition to implementing upstream data-processing steps, including removal of ambient RNA using decontX⁷⁶, we found no evidence for systematic artefactual stress signal coming from a particular study or tumor type. Perhaps most importantly, the stress signature defining the group 1 NK cell state was also found in spatial transcriptomics data directly on tissue sample sections that have not undergone any upstream tissue dissociation/digestion.

We also found high KLRC1 expression on the group 1 and 2 states, which, alongside high IFNG expression, can induce an inhibitory feedback loop, whereby local IFN-γ secretion leads to HLA-E upregulation resulting in inhibitory input through CD94/NKG2A. Conversely, the effector CD56^dim state, associated with improved patient outcome, lacked CD44 expression and highly expressed GZMA. Notably, this state exhibited high expression of the KLF2, PRDM1, BATF, TBX21 and IKZF1 regulons, indicative of high effector function, regulation of homeostatic proliferation and survival, but also cell migration and tissue residency. Unique TiNK cell-specific regulons in this state consisted of NFYC, CTBP1, POLE4 and CEBPA, which are involved in DNA repair, monitoring of proliferation, regulating MHC expression and maintaining structural homeostasis in the Golgi complex^77,78,79,80. Conversely, TiNK cell-specific regulons in the stressed CD56^bright state included hypoxia-induced MAFF, cellular stress response regulon ATF3 and EGR3 (ref. ⁸¹) which induce negative regulators in response to activation. Metabolically, the effector CD56^dim state scored highly for oxidative phosphorylation, compared with the stressed CD56^bright state which favored glycolysis, mTORC1 activation and exhibited upregulated nutrient transporters and genes associated with cholesterol homeostasis.

Contrary to Tang et al.¹⁶, increased gene signature scoring of the tumor-enriched stressed CD56^bright state did not consistently associate with reduced survival across tumor types. Instead, we observed increased survival in patients exhibiting a high effector CD56^dim state, which was further augmented with a low signature for the stressed CD56^bright state. Of the four CD56^dim states, the effector CD56^dim state was enriched across two tumor types, painting a promising picture for the role of solid TiNK cells.

This resource provides a transcriptional reference map of human NK cells across healthy blood and tissues with harmonized annotations of transcriptional NK cell subsets. Uncovering the dominant gene-regulatory circuits during NK cell differentiation enabled identification of TME-induced perturbations in solid TiNK cells across tumor types. We identified functionally distinct NK cell states across healthy and malignant tissues, including tumor-enriched states predictive of patient outcome. Modeling of the intercellular communication pathways of outcome-predicting NK cell states with the surrounding TME identified potential pathways of TME-induced NK cell suppression. Thus, our analysis has the potential to inform the design of more potent NK cell therapy products able to resist suppressive factors operating within the TME of solid tumors. Ultimately, this resource can be extended through transfer learning to interrogate new datasets from experimental perturbations or different tumor types.

Methods

Cell processing

Peripheral blood mononuclear cells (PBMCs) were isolated using density gradient centrifugation from anonymized healthy blood donors (Oslo University Hospital; Karolinska University Hospital) with informed consent. The study was approved by the regional ethics committees in Norway (Regional etisk komité (REK): protocol no. 2018/2482) and Sweden (Regionala etikprövningsnämnden i Stockholm: protocol no. 2016/1415-32; Etikprövningsmyndigheten: protocol no. 2020-05289). Donor-derived PBMCs were screened for KIR education and adaptive status using flow cytometry. NK cells were purified using an AutoMACS (DepleteS program, Miltenyi Biotec) before overnight resting in complete Roswell Park Memorial Institute (RPMI) 1640 (Cytiva) (10% fetal bovine serum (FBS; GE Healthcare), 2 mM l-glutamine (GE Healthcare)) at 37 °C and 5% CO₂.

Flow cytometry screening

PBMCs were stained for surface antigens and viability in a 96-well V-bottomed plate, followed by fixation/permeabilization and intracellular staining at 4 °C. The following antibodies were used in the screening panel: CD3-V500 (clone UCHT1), CD14-V500 (clone MφP9), CD19-V500 (clone HIB19) and Granzyme B-AF700 (clone GB11) from Becton Dickinson; CD57-FITC (clone HNK-1), CD38-BV650 (clone HB-7) and CD158e1-BV421 (clone DX9) from BioLegend; CD158a-APC-Vio770 (clone REA284) and CD158a/h-PE-Vio770 (clone 11PB6) from Miltenyi Biotec; and CD158b1/b2,j-PE-Cy5.5 (clone GL183), CD159a-APC (clone Z199) and CD56-ECD (clone N901) from Beckman Coulter. LIVE/DEAD Fixable Aqua Dead Stain kit for 405-nm excitation (Life Technologies) was used to determine viability. Samples were acquired on an LSR-Fortessa equipped with a blue, red and violet laser and analyzed in FlowJo v.9 (TreeStar, Inc.).

FACS sorting

Cells were harvested and surface stained with the following antibodies: CD57-FITC (clone HNK-1) from BioLegend; CD158e1/e2-APC (clone Z27.3.7), CD56-ECD (clone N901) and CD158b1/b2,j-PE-Cy5.5 (clone GL183) from Beckman Coulter; and CD158a-APC-Vio770 (clone REA284), CD159c-PE (clone REA205) and CD159a-PE Vio770 (clone REA110) from Miltenyi Biotec. Cells, 12,000, were directly sorted into Eppendorf tubes at 4 °C for each sample using a FACSAriaII (Becton Dickinson). Sorting strategies for scRNA-seq for the donor with and without an adaptive NK cell expansion are depicted in Extended Data Fig. 1c,d.

ScRNA-seq

After sorting, cells were kept on ice during the washing (phosphate-buffered saline (PBS) + 0.05% bovine serum albumin (BSA)) and counting steps. Cells, 10,000, were resuspended in 35 μl of PBS + 0.05% BSA and immediately processed at the Genomics Core Facility (Oslo University Hospital) using the Chromium Single Cell 3′ Library & Gel Bead Kit v.2 (Chromium Controller System, 10x Genomics). The recommended 10x Genomics protocol was used to generate the sequencing libraries, and sequencing was performed on a NextSeq500 (Illumina) with ~5% PhiX as spike-in. Sequencing raw data were converted into fastq files by running Illumina’s bcl2fastq v.2.

ScRNA-seq data collection and processing

Previously published scRNA-seq data were collected mostly in the form of count matrices already aligned to GRCh38; the rest were collected as fastq files. For the datasets where we collected fastq files, the data were aligned to GRCh38 using Cell Ranger (10x Genomics Cell Ranger 7.0.0).

Quality control and normalization of scRNA-seq data

Data-cleaning steps were first carried out whereby cells not expressing a minimum of 1,000 molecules and genes expressed by <10 cells were filtered out. Doublets were removed using the SOLO algorithm⁸². The count matrices for all the tumor and tissue types were corrected for ambient RNA using decontX⁷⁶. The data were log-transformed for some of the downstream analyses as well as for visualization of gene expression, such as in dot plots. Quality control, transformation and most of the visualization of the gene expression data were performed using Scanpy⁸³. For analysis using scVI and scANVI, the raw count data were used.
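
The two filtering thresholds and the log transformation described above can be illustrated with a minimal, library-free Python sketch. This is not the Scanpy-based code used for the analysis. The function names, the dictionary data layout, the order of the two filters (cells first, then genes) and the per-cell scaling to a `target_sum` of 10,000 before taking the logarithm are all assumptions made for illustration.

```python
import math


def filter_counts(counts, min_molecules=1000, min_cells=10):
    """Filter a {cell: {gene: raw_count}} mapping (hypothetical layout).

    Cells with fewer than `min_molecules` total counts are dropped first,
    then genes detected in fewer than `min_cells` of the retained cells.
    """
    kept_cells = {
        cell: genes
        for cell, genes in counts.items()
        if sum(genes.values()) >= min_molecules
    }
    # Count, for each gene, the retained cells in which it is detected.
    detected = {}
    for genes in kept_cells.values():
        for gene, n in genes.items():
            if n > 0:
                detected[gene] = detected.get(gene, 0) + 1
    kept_genes = {gene for gene, n in detected.items() if n >= min_cells}
    return {
        cell: {gene: n for gene, n in genes.items() if gene in kept_genes}
        for cell, genes in kept_cells.items()
    }


def log_normalize(cell_counts, target_sum=10_000):
    """Scale one cell to `target_sum` total counts (assumed), then log1p."""
    total = sum(cell_counts.values())
    if total == 0:
        return {gene: 0.0 for gene in cell_counts}
    return {
        gene: math.log1p(n / total * target_sum)
        for gene, n in cell_counts.items()
    }
```

Only the log-transformed values would feed visualization; as stated above, scVI and scANVI take the raw counts.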

Integration of scRNA-seq data

The probabilistic models scVI and scANVI, as implemented in scvi-tools¹⁹, were used for integration of scRNA-seq data. These methods have been shown to perform well for integration of scRNA-seq data, especially when dealing with complex batch effects and integrating atlas-level data⁸⁴. For cell-type and -subset annotations and prediction, scANVI was used to capture annotation of single-cell profiles. For the analysis of PB-NK subsets, the sorted subsets provided labels for training the scANVI model. The subset prediction provided by the model was tested on a held-out set of cells (15%) from the sorted subset data, giving us a confusion matrix summarizing the performance of the prediction.
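
The evaluation on held-out cells can be sketched as follows. This minimal example only illustrates the 15% split and the confusion matrix; it does not train a scANVI model, and the function names, the fixed random seed and the row/column convention (rows are true labels, columns are predicted labels) are assumptions.

```python
import random
from collections import Counter


def split_held_out(cells, held_out_fraction=0.15, seed=0):
    """Shuffle cells and return (training cells, held-out cells)."""
    rng = random.Random(seed)  # fixed seed: an assumption, for reproducibility
    shuffled = list(cells)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * held_out_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def confusion_matrix(true_labels, predicted_labels):
    """Return (labels, matrix) with matrix[i][j] = number of cells whose
    true label is labels[i] and predicted label is labels[j]."""
    pairs = Counter(zip(true_labels, predicted_labels))
    labels = sorted(set(true_labels) | set(predicted_labels))
    matrix = [[pairs[(t, p)] for p in labels] for t in labels]
    return labels, matrix
```

A model would be fitted on the training cells only, and its predictions for the held-out cells compared with their sorted-subset labels through `confusion_matrix`.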

Dimensionality reduction, clustering and visualization of scRNA-seq data

We computed the Uniform Manifold Approximation and Projection (UMAP) embeddings for visualization using the embedding learned from scVI and scANVI. Unsupervised clustering was also carried out using this learned embedding with Phenograph and the Leiden algorithm as implemented in Scanpy. PAGA²⁹ was used to quantify the connectivity of different groups of cells, thereby providing a representation of the data as a simpler graph. The various plots were mostly generated using the plotting functions in Scanpy.

Cell-type annotations and harmonization

For many of the publicly available datasets, cell-type annotations were readily available and used as seed labels when training the scANVI model for that particular tissue/tumor type to annotate the nonimmune cells. The scANVI model allowed us to harmonize annotations that were needed for analysis across datasets. All immune cells for all tissue types were integrated using scVI and annotated using CellTypist⁴². The same was done for all immune cells across all tumor types. The CD16⁻ and CD16⁺ NK cells identified by CellTypist were annotated as CD56^bright and CD56^dim, respectively. Where CITE-seq data were available, the surface expression of key markers also helped validate the cell-type annotations. For the identified NK cells, the cells were also scored using NK1/NK2 (CD56^bright/CD56^dim) signatures to validate the annotation of CD56^bright and CD56^dim NK cells. We also performed our own unsupervised Leiden clustering, which identified two dominating clusters corresponding to CD56^bright and CD56^dim NK cells.

Calculation of signature scores

Signature scores were computed using AUCell²¹, allowing for exploration of the relative expression of the signatures of interest in the datasets. Various gene sets were taken from the MSigDB Hallmark gene set collection⁸⁵.
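
The idea behind a rank-based signature score can be sketched in a few lines of Python. This is a simplified illustration and not the AUCell implementation: the function name, the `top_fraction` default, the deterministic tie handling and the normalization by the best achievable area are assumptions.

```python
def recovery_auc(expression, gene_set, top_fraction=0.05):
    """Simplified rank-based signature score for one cell.

    `expression` maps gene -> expression value for a single cell. Genes are
    ranked from highest to lowest expression, and the score is the area
    under the recovery curve of `gene_set` within the top-ranked genes,
    divided by the best achievable area, giving a value between 0 and 1.
    """
    # Ties keep dictionary order here (an assumption of this sketch).
    ranked = sorted(expression, key=expression.get, reverse=True)
    max_rank = max(1, int(len(ranked) * top_fraction))
    genes = set(gene_set) & set(ranked)
    if not genes:
        return 0.0
    hits = 0
    area = 0
    for gene in ranked[:max_rank]:
        if gene in genes:
            hits += 1
        area += hits  # recovery-curve height at this rank
    # Best case: all signature genes occupy the top ranks.
    n = min(len(genes), max_rank)
    best_area = n * (n + 1) // 2 + (max_rank - n) * n
    return area / best_area
```

Because the score depends only on the within-cell ranking, it is insensitive to per-cell scaling of the expression values.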

Pseudotime and RNA velocity analysis

Pseudotime was computed using Palantir²⁶, which captures the continuous nature of differentiation, and cell fate, which allowed us to explore two terminal states and the gene expression changes seen along these trajectories. For this analysis, the starting cell was defined as the cell that was the least CD56^dim (the lowest score for the NK1 signature). GAMs fitted on cells ordered by pseudotime were used to calculate gene trends, where the contribution of cells was weighted by their probability to end up in the given terminal state as calculated by Palantir. The gene trends indicate how gene expression levels develop over the differentiation timeline. These trends were clustered using the Leiden clustering algorithm to give us five clusters of gene trends. RNA velocity²⁸ was also used to take advantage of splicing kinetics to identify directed dynamic information. We used velocyto²⁸ and scVelo²⁷ for this analysis, specifically the dynamical model implemented in the scVelo toolkit. The RNA velocity analysis was run on the 2 donors where sorted subsets were sequenced separately, as well as on the integrated data from 12 blood donors.

GRN analysis

SCENIC²¹ was used to infer TFs and GRNs from the scRNA-seq data. The SCENIC workflow⁸⁶ was followed and the pySCENIC implementation was used. TF–gene associations were inferred by GRNBoost⁸⁷ and motif–TF associations were downloaded from Aerts’s lab website and used for pruning the inferred associations. The inferred regulatory networks were also further pruned by removing lowly expressed TFs based on the bulk RNA-seq data. AUCell was used to compute the activity of the final regulons. The regulon activity was visualized using matrix plots, as implemented in Scanpy, to look at the activity across different groups of cells.

Bulk RNA-seq for TF and target validation

For validation of the TFs and targets, we checked their expression in bulk RNA-seq data from four sorted NK cell populations (CD56^bright, NKG2A⁻KIR⁻CD56^dim, NKG2A⁻KIR⁺CD56^dim and NKG2A⁻KIR⁺NKG2C⁺CD56^dim). Sequencing was performed using single-cell tagged reverse transcription⁸⁸.

Reference mapping

The TiNK cells were added after the model for a healthy NK cell reference was trained. Then, scArches⁴⁴ as implemented in scvi-tools¹⁹ was used to map these new data on to the established reference.

Cell–cell communication inference using CellChat

To infer the communication between the various cell types in the tumor datasets we used CellChat⁴⁸. Based on gene expression of receptors and ligands in the data and a curated database of pathways, CellChat computes the communication probability between various receptor–ligand pairs. CellChat also provided ways to aggregate this information and for us to visualize the inferred cell–cell communication networks. CellChat was computed separately for each of the tumor types included in the analysis.

Differential gene expression analysis

To perform differential gene expression analysis, we used a pseudobulk approach because this has shown good results when analyzing scRNA-seq data in various studies⁸⁹. This allowed us to aggregate counts for each sample and to treat the samples, instead of the cells, as replicates. We then used edgeR⁹⁰ on the pseudobulk data to identify DEGs between healthy reference NK cells and TiNK cells within and across subsets.
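
The aggregation step behind the pseudobulk approach amounts to summing raw counts over all cells of a sample. A minimal sketch, assuming cells are given as `(sample_id, {gene: count})` pairs (a hypothetical layout and function name), is:

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum raw counts per sample.

    `cells` is an iterable of (sample_id, {gene: raw_count}) pairs; the
    result maps each sample to its summed gene counts, so that samples
    rather than individual cells serve as replicates downstream.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for sample_id, counts in cells:
        for gene, n in counts.items():
            totals[sample_id][gene] += n
    return {sample: dict(genes) for sample, genes in totals.items()}
```

The resulting sample-by-gene count table is the kind of input that a bulk differential expression tool such as edgeR expects.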

Differential abundance analysis using Milo

We used Milo⁴⁷ to assign cells to neighborhoods on the k-nearest neighbors graph (k-NNG). The scVI representation of the cells was used for building the k-NNG. This allowed us to have a batch-corrected representation of the cells as input to this analysis. The differential abundance of the neighborhoods between the healthy reference and the TiNK cells was then computed. The neighborhoods were grouped into six groups using the groupNhoods function in Milo. These groups were considered as different NK cell states and further characterized using the functions in Milo for identification of DEGs. The differential expression analysis was done using pseudobulk by aggregating gene expression per sample. The single cells were then annotated using these groups for downstream analysis.

GSEA

GSEA was performed using the GSEA software⁹¹ and the MSigDB collection of gene sets. Genes were first ranked according to the differential expression results from either the pseudobulk approach or the Milo analysis.

Spatial transcriptomics

Spatial transcriptomics datasets from lung tumor, glioblastoma and melanoma were collected from the 10x Genomics website (https://www.10xgenomics.com/datasets). Squidpy⁹² was used for preprocessing and segmentation and Tangram⁴⁹ was used for deconvolution using our annotated scRNA-seq data for each of the tumor types as reference. The deconvolution was performed with the NK cells annotated as CD56^bright and CD56^dim, as well as using the group annotations established in this paper.

Clinical and bulk RNA-seq data from TCGA and TARGET

Bulk RNA-seq data and clinical data were downloaded from TCGA and TARGET using TCGAbiolinks⁵³ and curated survival data were downloaded from Xena⁵⁴.

Deconvolution of bulk RNA-seq

Deconvolution of the bulk RNA-seq data was performed for each of the tumor types using BayesPrism⁵². BayesPrism has been shown to work well for deconvolution of data from tumors and especially well in dealing with high cell-type granularity⁹³. The annotated reference datasets for each of the tumor types were used as prior information in the deconvolution. BayesPrism then computed both an expression matrix for each cell type and the cell-type fraction for each sample.

Survival analysis

The NK expression matrix inferred by BayesPrism for the various tumor types was used to score the signature genes for each of the identified NK cell states. The patients were then assigned as high and low for a group/state based on belonging to the highest or lowest half in terms of expression of these signature genes within the group of patients with a specific tumor type. The high and low designations could then be combined in an approach where a patient could be assigned as high or low in multiple groups. Survival analysis was conducted using Cox’s proportional hazards model from the R package survival⁹⁴, adjusting for confounding clinical factors such as tumor stage, gender and age. Subsequently, survival curves were derived using the Kaplan–Meier method within the same package. For visualization, the ggsurvplot function of the survminer package in R was utilized.
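
The stratification and the Kaplan–Meier step can be illustrated with a small Python sketch. The analysis itself used the R packages survival and survminer; this sketch does not reproduce them and does not cover the Cox model. The function names, the label format and the rule that patients exactly at the median are assigned to the low group are assumptions.

```python
from statistics import median


def high_low(scores):
    """Split patients of one tumor type at the median signature score.

    Scores equal to the median are labeled "low" (an assumption).
    """
    cutoff = median(scores.values())
    return {p: "high" if s > cutoff else "low" for p, s in scores.items()}


def combine(effector, stressed):
    """Combine two high/low assignments into one label per patient."""
    return {p: f"effector-{effector[p]}/stressed-{stressed[p]}" for p in effector}


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival probability)].

    `events` holds 1 for an observed death and 0 for a censored patient.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Collect all patients sharing this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

One curve would be computed per combined label, for example for patients labeled `effector-high/stressed-low` versus the remaining groups.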

Samples from patients with primary NSCLC

The patient cohort was collected, and processing of tissue specimens and flow cytometry staining were performed, as previously described⁴⁵.

Functional assay using A549 cells

A549 cells were cultured in Dulbecco’s modified Eagle’s medium/high glucose with l-glutamine, sodium pyruvate (Cytiva) + 10% heat-inactivated FBS (Sigma-Aldrich) at 37 °C in 5% CO₂. A549 cells, 20,000, were seeded per well in a 96-well F-bottom plate and pre-treated with and without 50 ng ml⁻¹ of IFN-γ (PeproTech) for 24 h before addition of NK cells. HLA-E expression after IFN-γ stimulation was evaluated using HLA-E–PE antibody (BioLegend, clone 3D12). NK cells were isolated using negative selection (NK cell isolation kit, Miltenyi Biotec) from previously cryopreserved PBMCs from healthy individuals. Cells were activated overnight with 5 ng ml⁻¹ of IL-15 (R&D) in RPMI 1640 (Cytiva) + 10% heat-inactivated FBS at 37 °C in 5% CO₂. NK cells were washed, resuspended in RPMI 1640 + 10% FBS and pre-incubated with and without α-NKG2A (a monalizumab biosimilar: immunoglobulin (Ig)G1 with PGLALA mutation, Merck) for 20 min before addition to the target cells. Target cells were washed in PBS before the addition of NK cells at a 1:1 effector:target (E:T) ratio in the presence of brefeldin A (GolgiPlug, 1:1,000, BD Biosciences), monensin (GolgiStop, 1:1,500, BD Biosciences) and anti-CD107a-BUV394 (BD Horizon, clone H4A3). After a 4-h incubation, the cells were stained with anti-IgG Fc–PE (Invitrogen), followed by surface staining, fixation and permeabilization (Cytofix/Cytoperm, BD) and finally intracellular staining using the following antibodies: CD159a-VioBright FITC (Miltenyi Biotec, clone REA110), Granzyme B-AF700 (BD, clone GB11), CD16-Pacific Blue (BD, clone 3G8), CD3-V500 (BD, clone UCHT1), TNF-α-BV650 (BioLegend, clone Mab11), IFN-γ-BV785 (BioLegend, clone 4S.B3), CD56-ECD (Beckman Coulter, clone N901) and perforin–PE-Cy7 (eBioscience, clone dG9), together with the LIVE/DEAD Fixable Aqua Dead Cell Stain kit (Thermo Fisher Scientific).

Reagents and antibodies

A full list containing company information, catalog nos and antibody clones for all reagents can be found in Supplementary Data.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The gene expression data generated for this paper are available at the National Center for Biotechnology Information’s Gene Expression Omnibus with accession no. GSE245690 and raw sequencing data are available at the European Genome–Phenome Archive with accession no. EGAS50000000014. The details about the publicly available data included in the analysis are available in Supplementary Tables 1, 2 and 3. For GSEA the Molecular Signature Database (v.2023.2.Hs), available at https://www.gsea-msigdb.org/gsea/msigdb, was used. Relevant gene sets for scoring were also retrieved from this database. Bulk RNA-seq data were downloaded from TCGA and TARGET. Curated survival data were downloaded from Xena. Processed data and models have also been made available via Zenodo at https://doi.org/10.5281/zenodo.8434223 (ref. ⁹⁵) and as an online resource at http://nk-scrna.malmberglab.com. Source data are provided with this paper.

Code availability

The code generated for our analysis is available on GitHub at https://github.com/hernet/transcriptional-map-nk.

References

Moretta, A., Bottino, C., Mingari, M. C., Biassoni, R. & Moretta, L. What is a natural killer cell? Nat. Immunol. 3, 6–8 (2002).
Crinier, A. et al. High-dimensional single-cell analysis identifies organ-specific signatures and conserved NK cell subsets in humans and mice. Immunity 49, 971–986.e5 (2018).
Cooper, M. A., Fehniger, T. A. & Caligiuri, M. A. The biology of human natural killer-cell subsets. Trends Immunol. 22, 633–640 (2001).
Horowitz, A. et al. Genetic and environmental determinants of human NK cell diversity revealed by mass cytometry. Sci. Transl. Med. 5, 208ra145 (2013).
Horowitz, A. et al. Class I HLA haplotypes form two schools that educate NK cells in different ways. Sci. Immunol. 1, eaag1672 (2016).
Goodridge, J. P., Önfelt, B. & Malmberg, K.-J. Newtonian cell interactions shape natural killer cell education. Immunol. Rev. 267, 197–213 (2015).
Björkström, N. K. et al. Expression patterns of NKG2A, KIR, and CD57 define a process of CD56^dim NK-cell differentiation uncoupled from NK-cell education. Blood 116, 3853–3864 (2010).
Schlums, H. et al. Cytomegalovirus infection drives adaptive epigenetic diversification of NK cells with altered signaling and effector function. Immunity 42, 443–456 (2015).
Lopez-Vergès, S. et al. CD57 defines a functionally distinct population of mature NK cells in the human CD56^dimCD16⁺ NK-cell subset. Blood 116, 3865–3874 (2010).
Juelke, K. et al. CD62L expression identifies a unique subset of polyfunctional CD56^dim NK cells. Blood 116, 1299–1307 (2010).
Collins, P. L. et al. Gene regulatory programs conferring phenotypic identities to human NK cells. Cell 176, 348–360.e12 (2019).
Smith, S. L. et al. Diversity of peripheral blood human NK cells identified by single-cell RNA sequencing. Blood Adv. 4, 1388–1406 (2020).
Melsen, J. E. et al. Single-cell transcriptomics in bone marrow delineates CD56^dim Granzyme K⁺ subset as intermediate stage in NK cell differentiation. Front. Immunol. 13, 1044398 (2022).
Holmes, T. D. et al. The transcription factor Bcl11b promotes both canonical and adaptive NK cell differentiation. Sci. Immunol. 6, eabc9801 (2021).
Rückert, T., Lareau, C. A., Mashreghi, M.-F., Ludwig, L. S. & Romagnani, C. Clonal expansion and epigenetic inheritance of long-lasting NK cell memory. Nat. Immunol. 23, 1551–1563 (2022).
Tang, F. et al. A pan-cancer single-cell panorama of human natural killer cells. Cell 186, 4235–4251.e20 (2023).
Rood, J. E., Maartens, A., Hupalowska, A., Teichmann, S. A. & Regev, A. Impact of the Human Cell Atlas on medicine. Nat. Med. 28, 2486–2496 (2022).
Yang, C. et al. Heterogeneity of human bone marrow and blood natural killer cells defined by single-cell transcriptome. Nat. Commun. 10, 3931 (2019).
Gayoso, A. et al. A Python library for probabilistic analysis of single-cell omics data. Nat. Biotechnol. 40, 163–166 (2022).
Haghverdi, L., Buettner, F. & Theis, F. J. Diffusion maps for high-dimensional single-cell analysis of differentiation data. Bioinformatics 31, 2989–2998 (2015).
Aibar, S. et al. SCENIC: single-cell regulatory network inference and clustering. Nat. Methods 14, 1083–1086 (2017).
Scheiter, M. et al. Proteome analysis of distinct developmental stages of human natural killer (NK) cells. Mol. Cell. Proteom. 12, 1099–1114 (2013).
Goodridge, J. P. et al. Remodeling of secretory lysosomes during education tunes functional potential in NK cells. Nat. Commun. 10, 514 (2019).
Xu, C. et al. Probabilistic harmonization and annotation of single-cell transcriptomics data with deep generative models. Mol. Syst. Biol. 17, e9620 (2021).
Vivier, E. et al. High-dimensional single-cell analysis of natural killer cell heterogeneity in human blood. Preprint at Research Square https://doi.org/10.21203/rs.3.rs-3870228/v1 (2024).
Setty, M. et al. Characterization of cell fate probabilities in single-cell data with Palantir. Nat. Biotechnol. 37, 451–460 (2019).
Bergen, V., Lange, M., Peidli, S., Wolf, F. A. & Theis, F. J. Generalizing RNA velocity to transient cell states through dynamical modeling. Nat. Biotechnol. 38, 1408–1414 (2020).
La Manno, G. et al. RNA velocity of single cells. Nature 560, 494–498 (2018).
Wolf, F. A. et al. PAGA: graph abstraction reconciles clustering with trajectory inference through a topology preserving map of single cells. Genome Biol. 20, 59 (2019).
Chaves, P. et al. Loss of canonical notch signaling affects multiple steps in NK cell development in mice. J. Immunol. 201, 3307–3319 (2018).
Nagel, S. et al. Polycomb repressor complex 2 regulates HOXA9 and HOXA10, activating ID2 in NK/T-cell lines. Mol. Cancer 9, 151 (2010).
Balzarolo, M., Watzl, C., Medema, J. P. & Wolkers, M. C. NAB2 and EGR-1 exert opposite roles in regulating TRAIL expression in human natural killer cells. Immunol. Lett. 151, 61–67 (2013).
Wiencke, J. K. et al. The DNA methylation profile of activated human natural killer cells. Epigenetics 11, 363–380 (2016).
Cho, Y. et al. The basic helix-loop-helix proteins differentiated embryo chondrocyte (DEC) 1 and DEC2 function as corepressors of retinoid X receptors. Mol. Pharmacol. 76, 1360–1369 (2009).
Adams, N. M. et al. Transcription factor IRF8 orchestrates the adaptive natural killer cell response. Immunity 48, 1172–1182.e6 (2018).
Mace, E. M. et al. Biallelic mutations in IRF8 impair human NK cell maturation and function. J. Clin. Invest. 127, 306–320 (2017).
Goh, W. et al. IKAROS and AIOLOS directly regulate AP-1 transcriptional complexes and are essential for NK cell development. Nat. Immunol. 25, 240–255 (2024).
Wang, Y. et al. The IL-15-AKT-XBP1s signaling pathway contributes to effector functions and survival in human NK cells. Nat. Immunol. 20, 10–17 (2019).
Rabacal, W. et al. Transcription factor KLF2 regulates homeostatic NK cell proliferation and survival. Proc. Natl Acad. Sci. USA 113, 5370–5375 (2016).
Li, M. et al. DDIT3 directs a dual mechanism to balance glycolysis and oxidative phosphorylation during glutamine deprivation. Adv. Sci. 8, e2003732 (2021).
Kallies, A. et al. A role for Blimp1 in the transcriptional network controlling natural killer cell maturation. Blood 117, 1869–1879 (2011).
Domínguez Conde, C. et al. Cross-tissue immune cell analysis reveals tissue-specific features in humans. Science 376, eabl5197 (2022).
Mazzurana, L. et al. Tissue-specific transcriptional imprinting and heterogeneity in human innate lymphoid cells revealed by full-length single-cell RNA-sequencing. Cell Res. 31, 554–568 (2021).
Lotfollahi, M. et al. Mapping single-cell data to reference atlases by transfer learning. Nat. Biotechnol. 40, 121–130 (2021).
Brownlie, D. et al. Accumulation of tissue-resident natural killer cells, innate lymphoid cells, and CD8⁺ T cells towards the center of human lung tumors. Oncoimmunology 12, 2233402 (2023).
Combes, A. J., Samad, B. & Krummel, M. F. Defining and using immune archetypes to classify and treat cancer. Nat. Rev. Cancer 23, 491–505 (2023).
Dann, E., Henderson, N. C., Teichmann, S. A., Morgan, M. D. & Marioni, J. C. Differential abundance testing on single-cell data using k-nearest neighbor graphs. Nat. Biotechnol. 40, 245–253 (2022).
Jin, S. et al. Inference and analysis of cell-cell communication using CellChat. Nat. Commun. 12, 1088 (2021).
Biancalani, T. et al. Deep learning and alignment of spatially resolved single-cell transcriptomes with Tangram. Nat. Methods 18, 1352–1362 (2021).
Nersesian, S. et al. NK cell infiltration is associated with improved overall survival in solid cancers: a systematic review and meta-analysis. Transl. Oncol. 14, 100930 (2021).
Cursons, J. et al. A gene signature predicting natural killer cell infiltration and improved survival in melanoma patients. Cancer Immunol. Res. 7, 1162–1174 (2019).
Chu, T., Wang, Z., Pe’er, D. & Danko, C. G. Cell type and gene expression deconvolution with BayesPrism enables Bayesian integrative analysis across bulk and single-cell RNA sequencing in oncology. Nat. Cancer 3, 505–517 (2022).
Colaprico, A. et al. TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res. 44, e71 (2016).
Goldman, M. J. et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat. Biotechnol. 38, 675–678 (2020).
Béziat, V., Descours, B., Parizot, C., Debré, P. & Vieillard, V. NK cell terminal differentiation: correlated stepwise decrease of NKG2A and acquisition of KIRs. PLoS ONE 5, e11966 (2010).
Béziat, V. et al. NK cell responses to cytomegalovirus infection lead to stable imprints in the human KIR repertoire and involve activating KIRs. Blood 121, 2678–2688 (2013).
Lee, J. et al. Epigenetic modification and antibody-dependent expansion of memory-like NK cells in human cytomegalovirus-infected individuals. Immunity 42, 431–442 (2015).
Dogra, P. et al. Tissue determinants of human NK cell development, function, and residence. Cell 180, 749–763.e13 (2020).
Poon, M. M. L. et al. Tissue adaptation and clonal segregation of human memory T cells in barrier sites. Nat. Immunol. 24, 309–319 (2023).
Szabo, P. A., Miron, M. & Farber, D. L. Location, location, location: tissue resident memory T cells in mice and humans. Sci. Immunol. 4, eaas9673 (2019).
Kumar, B. V. et al. Human tissue-resident memory T cells are defined by core transcriptional and functional signatures in lymphoid and mucosal sites. Cell Rep. 20, 2921–2934 (2017).
Melsen, J. E. et al. Human bone marrow-resident natural killer cells have a unique transcriptional profile and resemble resident memory CD8⁺ T cells. Front. Immunol. 9, 1829 (2018).
de Andrade, L. F. et al. Discovery of specialized NK cell populations infiltrating human melanoma metastases. JCI Insight 4, e133103 (2019).
Jameson, G. & Robinson, M. W. Insights into human intrahepatic NK cell function from single cell RNA sequencing datasets. Front. Immunol. 12, 649311 (2021).
Vecchi, L. et al. Phospholipase A₂ drives tumorigenesis and cancer aggressiveness through its interaction with annexin A1. Cells 10, 1472 (2021).
Cantoni, C. et al. NK cells, tumor cell transition, and tumor progression in solid malignancies: new hints for NK-based immunotherapy? J. Immunol. Res. 2016, 4684268 (2016).
Platonova, S. et al. Profound coordinated alterations of intratumoral NK cell phenotype and function in lung carcinoma. Cancer Res. 71, 5412–5422 (2011).
Carrega, P. et al. CD56^brightperforin^low noncytotoxic human NK cells are abundant in both healthy and neoplastic solid tissues and recirculate to secondary lymphoid organs via afferent lymph. J. Immunol. 192, 3805–3815 (2014).
Lachota, M. et al. Mapping the chemotactic landscape in NK cells reveals subset-specific synergistic migratory responses to dual chemokine receptor ligation. eBioMedicine 96, 104811 (2023).
Carrega, P. et al. Natural killer cells infiltrating human nonsmall-cell lung cancer are enriched in CD56 bright CD16⁻ cells and display an impaired capability to kill tumor cells. Cancer 112, 863–875 (2008).
Rezaeifard, S., Talei, A., Shariat, M. & Erfani, N. Tumor infiltrating NK cell (TINK) subsets and functional molecules in patients with breast cancer. Mol. Immunol. 136, 161–167 (2021).
Wendel, M., Galani, I. E., Suri-Payer, E. & Cerwenka, A. Natural killer cell accumulation in tumors is dependent on IFN-gamma and CXCR3 ligands. Cancer Res. 68, 8437–8445 (2008).
Bao, W. et al. HER2 interacts with CD44 to up-regulate CXCR4 via epigenetic silencing of microRNA-139 in gastric cancer cells. Gastroenterology 141, 2076–2087.e6 (2011).
Xie, P. et al. CD44 potentiates hepatocellular carcinoma migration and extrahepatic metastases via the AKT/ERK signaling CXCR4 axis. Ann. Transl. Med. 10, 689 (2022).
O’Flanagan, C. H. et al. Dissociation of solid tumor tissues with cold active protease for single-cell RNA-seq minimizes conserved collagenase-associated stress responses. Genome Biol. 20, 210 (2019).
Yang, S. et al. Decontamination of ambient RNA in single-cell RNA-seq with DecontX. Genome Biol. 21, 57 (2020).
Zhu, X. S. et al. Transcriptional scaffold: CIITA interacts with NF-Y, RFX, and CREB to cause stereospecific regulation of the class II major histocompatibility complex promoter. Mol. Cell. Biol. 20, 6051–6061 (2000).
Porse, B. T. et al. Loss of C/EBP alpha cell cycle control increases myeloid progenitor proliferation and transforms the neutrophil granulocyte lineage. J. Exp. Med. 202, 85–96 (2005).
Colanzi, A. et al. Molecular mechanism and functional role of brefeldin A-mediated ADP-ribosylation of CtBP1/BARS. Proc. Natl Acad. Sci. USA 110, 9794–9799 (2013).
Bellelli, R. et al. POLE3-POLE4 is a histone H3-H4 chaperone that maintains chromatin integrity during DNA replication. Mol. Cell 72, 112–126.e5 (2018).
Li, S. et al. The transcription factors Egr2 and Egr3 are essential for the control of inflammation and antigen-induced proliferation of B and T cells. Immunity 37, 685–696 (2012).
Bernstein, N. J. et al. Solo: doublet identification in single-cell RNA-seq via semi-supervised deep learning. Cell Systems 11, 95–101.e5 (2020).
Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).
Luecken, M. D. et al. Benchmarking atlas-level data integration in single-cell genomics. Nat. Methods 19, 41–50 (2022).
Liberzon, A. et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Systems 1, 417–425 (2015).
Van de Sande, B. et al. A scalable SCENIC workflow for single-cell gene regulatory network analysis. Nat. Protoc. 15, 2247–2276 (2020).
Moerman, T. et al. GRNBoost2 and Arboreto: efficient and scalable inference of gene regulatory networks. Bioinformatics 35, 2159–2161 (2019).
Islam, S. et al. Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Genome Res. 21, 1160–1167 (2011).
Murphy, A. E. & Skene, N. G. A balanced measure shows superior performance of pseudobulk methods in single-cell RNA-sequencing analysis. Nat. Commun. 13, 7851 (2022).
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).
Palla, G. et al. Squidpy: a scalable framework for spatial omics analysis. Nat. Methods 19, 171–178 (2022).
Tran, K. A. et al. Performance of tumour microenvironment deconvolution methods in breast cancer using single-cell simulated bulk mixtures. Nat. Commun. 14, 5758 (2023).
Therneau, T. A Package for Survival Analysis in R. R package version 3.5-7. CRAN https://CRAN.R-project.org/package=survival (2023).
Netskar, H., Pfefferle, A. & Malmberg, K.-J. Pan-cancer profiling of tumor-infiltrating natural killer cells through transcriptional reference mapping. Zenodo https://doi.org/10.5281/zenodo.8434223 (2024).


Acknowledgements

Large parts of the analyses were run using the Machine learning infrastructure (ML Nodes), University Centre for Information Technology, University of Oslo, Norway. This publication is part of the Human Cell Atlas (www.humancellatlas.org/publications), HCA-106. We thank the flow cytometry and genetics core facilities at Oslo University Hospital. We thank Merck KGaA for providing tool reagents. This work was supported by the Swedish Research Council (grant nos 223310 to K.-J.M. and 2021-03069 and 2021-01039 to N.M.), the Swedish Children’s Cancer Society (grant no. PR2020-1059 to K.-J.M.), the Swedish Cancer Society (grant nos 21-1793Pj to K.-J.M., 22-2319Pj to N.M. and 23-2946Pj to J.M.), Sweden’s Innovation Agency (K.-J.M.), the Center for Innovative Medicine (CIMED, grant no. 20200680 to N.M.), the Tornspiran Foundation (N.M.), the Karolinska Institutet (K.-J.M.), the Research Council of Norway (grant nos 275469 and 237579 to K.-J.M.), Center of Excellence: Precision Immunotherapy Alliance (grant no. 332727 to K.-J.M.), the Norwegian Cancer Society (grant nos 190386 and 223310 to K.-J.M.), the South-Eastern Norway Regional Health Authority (grant nos 2021-073 and 2024-053 to K.-J.M.), EU H2020-MSCA Research and Innovation program (grant no. 801133 to K.-J.M.), the Knut and Alice Wallenberg Foundation (grant no. 2018.0106 to K.-J.M.), the Swedish Foundation for Strategic Research (K.-J.M.) and the US National Cancer Institute (grant nos P01 CA111412 and P009500901 to K.-J.M. and R21AI130760 to A.H.).

Funding

Open access funding provided by Karolinska Institute.

Author information

These authors contributed equally: Herman Netskar, Aline Pfefferle.
These authors jointly supervised this work: Amir Horowitz, Karl-Johan Malmberg.

Authors and Affiliations

Department of Cancer Immunology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
Herman Netskar & Karl-Johan Malmberg
Precision Immunotherapy Alliance, University of Oslo, Oslo, Norway
Herman Netskar & Karl-Johan Malmberg
Center for Infectious Medicine, Department of Medicine Huddinge, Karolinska Institutet, Stockholm, Sweden
Aline Pfefferle, Ebba Sohlberg, Jakob Michaëlsson & Karl-Johan Malmberg
Fate Therapeutics, San Diego, CA, USA
Jodie P. Goodridge
Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
Olli Dufva
Wellcome-MRC Cambridge Stem Cell Institute, Jeffrey Cheah Biomedical Centre, Cambridge Biomedical Campus, University of Cambridge, Cambridge, UK
Sarah A. Teichmann
Department of Medicine, University of Cambridge, Cambridge, UK
Sarah A. Teichmann
Center for Hematology and Regenerative Medicine, Department of Medicine Huddinge, Karolinska Institutet, Huddinge, Sweden
Demi Brownlie & Nicole Marquardt
Oslo Cancer Cluster, NEC OncoImmunity AS, Oslo, Norway
Trevor Clancy
Department of Vaccine Informatics, Institute for Tropical Medicine, Nagasaki University, Nagasaki, Japan
Trevor Clancy
Department of Immunology & Immunotherapy, Lipschultz Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Amir Horowitz
Department of Oncological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Amir Horowitz

Authors

Herman Netskar
Aline Pfefferle
Jodie P. Goodridge
Ebba Sohlberg
Olli Dufva
Sarah A. Teichmann
Demi Brownlie
Jakob Michaëlsson
Nicole Marquardt
Trevor Clancy
Amir Horowitz
Karl-Johan Malmberg

Contributions

J.P.G., A.H. and A.P. performed the scRNA-seq experiments. A.P. performed the in vitro experiments and analysis. N.M., D.B. and J.M. provided the data on patients with NSCLC (flow cytometry). H.N. performed the computations. H.N. and A.P. performed the bioinformatic analysis. E.S., T.C., O.D., S.A.T., A.H. and K.-J.M. provided scientific input. A.P. wrote the manuscript with support from H.N., A.H., and K.-J.M. All authors edited the manuscript.

Corresponding authors

Correspondence to Aline Pfefferle or Karl-Johan Malmberg.

Ethics declarations

Competing interests

J.P.G. is an employee at Fate Therapeutics. T.C. is an employee at NEC OncoImmunity AS. K.-J.M. is a consultant at Fate Therapeutics and Vycellix and has research support from Fate Therapeutics and Oncopeptides for studies unrelated to this work. O.D. has received research funding from Gilead Sciences and Incyte, unrelated to this work, and personal fees from Sanofi, unrelated to this work. S.A.T. is a scientific advisory board member of ForeSite Labs, QIAGEN and Element Biosciences, a co-founder and equity holder of TransitionBio and EnsoCell Therapeutics, and a part-time employee of GlaxoSmithKline. A.H. has received funding from Astra Zeneca/MedImmune, unrelated to this work. A.H. is a consultant at Purple BioTech. The other authors declare no competing interests.

Peer review

Peer review information

Nature Immunology thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available. Primary Handling Editor: Jamie D. K. Wilson, in collaboration with the Nature Immunology team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Peripheral blood NK cell subsets and sorting strategy.

(a, b) AUCell scores of gene signatures for CD56^bright and CD56^dim NK regulons (a) and proteomes (b). (c, d) Sorting strategy for phenotypically defined functional PB-NK cell subsets sequenced in one donor with (c) and one without (d) an adaptive expansion. (e) Heatmap depicting similarity between our five annotated transcriptional NK cell subsets (y-axis) and the Meta-NK defined NK subsets (x-axis). The scale represents gene set activity calculated by AUCell (a-b, e).

Extended Data Fig. 2 RNA velocity.

(a) Graphical depiction of inferring RNA velocity based on spliced vs unspliced transcripts. (b) RNA velocity plots for ZEB2 and BCL11B transcripts stratified by subset annotation in donors with an adaptive expansion.

Extended Data Fig. 3 Healthy tissue dataset annotation using CellTypist.

(a) Heatmap depicting expression of signature genes of the main immune populations annotated by CellTypist across all tissue samples. (b-f) UMAP representation showing integration of all healthy tissue datasets, prostate (b), lung (c), pancreas (d), skin (e), breast (f), with individual cell subtypes annotated using CellTypist.

Extended Data Fig. 4 Solid tumor dataset annotation using CellTypist.

(a-g) UMAP representation showing integration of all solid tumor datasets, PRAD (a), NSCLC (b), SKCM (c), PAAD (d), BRAC (e), GBM (f), SARC (g), with individual cell subtypes annotated using CellTypist.

Extended Data Fig. 5 Tissue-residency scoring of NK cells.

(a, b) UMAP representation showing integration of all healthy tissue (a) and solid tumor (b) datasets, with lymphocyte populations visualized. (c, d) IL7R expression in annotated NK cells (CD56^bright, CD56^dim) and ILCs (ILC2, ILC3) in tissues (c) and tumors (d). (e, f) Dotplots depicting expression of genes defining the literature-TR and atlas-TR signatures in CD56^bright and CD56^dim subsets in healthy blood and across all tissue types (e) and stratified by individual tissues (f). (g) Tissue-residency scoring (atlas-TR) of CD56^bright and CD56^dim annotated NK cells in individual tissue and tumor types. The scale represents gene set activity calculated by AUCell (g).

Extended Data Fig. 6 Phenotyping of NK cells in NSCLC patient samples.

(a) Gating strategy for CD56^bright and CD56^dim NK cells in healthy donor (PBMC) and NSCLC samples (Tumor). (b-e) Representative plots of CD56^dim NK cells (b) and quantification of NKG2A (c), KIR (d) and CD57 (e) expression in PBMC (n = 19) and NSCLC samples (n = 25), from 23 independent experiments. Data were analyzed using two-tailed Mann-Whitney test (c-e). All bar graphs represent the mean ± s.d. Actual p values are indicated.


Extended Data Fig. 7 Characterization of cellular states of NK cells identified in pan-cancer cell atlas.

(a) Beeswarm plot depicting differential abundance of TiNK or Ref-NK (PB-NK, TrNK) enriched neighborhoods, clustered based on subset annotation of individual neighborhoods. (b) TiNK fraction of cells in neighborhoods within each neighborhood group. The boxplot indicates the median with the interquartile range (IQR); whiskers extend to the farthest point within 1.5 times the IQR from the box. n is the number of neighborhoods in each group: group 1, n = 382; group 2, n = 1261; group 3, n = 1239; group 4, n = 871; group 5, n = 1427; group 6, n = 1752. (c-e) Volcano plots depicting differentially expressed genes (DEGs) between Group 4 vs. Group 3/5/6 (c), Group 5 vs. Group 3/4/6 (d), Group 6 vs. Group 3/4/5 (e). Differential expression analysis was performed using the findNhoodGroupMarkers method within the miloR package. Counts were aggregated per sample; groups were compared using edgeR and the adjusted p-values were used for the plots. (f-i) Gene set enrichment analysis (GSEA) for DEGs identified between Group 1 vs Group 2 (f), Group 3 vs Group 4/5/6 (g), Group 5 vs Group 3/4/6 (h) and Group 6 vs Group 3/4/5 (i). Volcano plots: log fold change cutoff at 0.5, p < 0.05. GSEA plots: p value cutoff 0.5 (red line).

Extended Data Fig. 8 Intercellular communication in TME.

(a-g) Scatterplot depicting incoming and outgoing interaction strength of individual cell types in BRAC (a), PAAD (b), PRAD (c), NSCLC (d), SARC (e), SKCM (f), GBM (g) as identified by CellChat. (h) Violin plots showing expression of ligands for the CCL (CCL3, CCL5) communication pathway in NSCLC.

Extended Data Fig. 9 In vitro validation of IFNG-HLA-E-KLRC1 axis in NSCLC.

(a) Representative histogram of HLA-E expression of A549 cells pre-treated (24 h) with and without IFNγ. (b-d) Viability (b), frequency (c) and geometric MFI (d) of HLA-E⁺ A549 cells (n = 4, biological replicates from two independent experiments). (e) Gating strategy and representative contour plots for functional readout of CD56^bright and CD56^dim NK cells against A549 target cells pre-treated with and without IFNγ (24 h) in the presence and absence of α-NKG2A antibody (E:T 1:1, 4 h). Data were analyzed using two-tailed Mann-Whitney test (b-d). All bar graphs represent the mean ± s.d. Actual p values are indicated.


Extended Data Fig. 10 Deconvolution of TCGA datasets.

Distribution of CD56^bright and CD56^dim NK cells in deconvoluted TCGA datasets. The boxplots indicate the median with the interquartile range (IQR); whiskers extend to the farthest point within 1.5 times the IQR from the box. For each plot and each subset, n is the number of patients for each tumor type: SARC, n = 88; PAAD, n = 183; NSCLC, n = 600; BRAC, n = 1231; SKCM, n = 473; PRAD, n = 554; GBM, n = 175.
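
The boxplot convention stated in the captions of Extended Data Figs. 7 and 10 (median, IQR box, whiskers ending at the farthest observation within 1.5 times the IQR of the box) can be illustrated with a minimal Python sketch. This is not the authors' plotting code: the function name `boxplot_stats`, the linear-interpolation quartile method and the input values are assumptions chosen only for illustration.

```python
from statistics import quantiles


def boxplot_stats(values):
    """Summarize values with the 1.5 x IQR whisker rule (illustrative helper)."""
    data = sorted(values)
    # Quartiles by linear interpolation between order statistics (an assumption;
    # plotting libraries may use a different quartile definition).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observed points that lie inside the fences.
    whisker_low = min(x for x in data if x >= lower_fence)
    whisker_high = max(x for x in data if x <= upper_fence)
    # Points beyond the fences would be drawn individually as outliers.
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": whisker_low,
        "whisker_high": whisker_high,
        "outliers": outliers,
    }


# Hypothetical values, not taken from the study.
stats = boxplot_stats([2, 5, 6, 8, 10, 11, 13, 40])
print(stats)
```

With these hypothetical values the upper whisker stops at 13 rather than 40, because 40 lies beyond the upper fence and is reported as an outlier.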

Supplementary information

Supplementary Information

Supplementary Fig. 1 and Tables 1–5 including references.

Reporting Summary

Peer Review File

Supplementary Data 1

Reagent and antibody information, including catalog nos and titrations.

Source data

Source Data Figs. 1, 4 and 6 and Extended Data Figs. 6 and 9

Statistical source data.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Netskar, H., Pfefferle, A., Goodridge, J.P. et al. Pan-cancer profiling of tumor-infiltrating natural killer cells through transcriptional reference mapping. Nat Immunol (2024). https://doi.org/10.1038/s41590-024-01884-z


Received: 25 October 2023
Accepted: 30 May 2024
Published: 02 July 2024
DOI: https://doi.org/10.1038/s41590-024-01884-z

This article is cited by

Chameleon impersonation of NK cells and ILC1s
- M. Zeeshan Chaudhry
- Gabrielle T. Belz
Nature Immunology (2024)
Understanding NK cell heterogeneity
- Alexandra Flemming
Nature Reviews Immunology (2024)
