Characteristics of the TCR β chain repertoire in healthy human CD4+ and CD8+ T cells

Abstract
T cells are vital to the adaptive immune system, which relies on the T-cell receptor (TCR) to recognize and defend against infection and tumors. T cells are mainly divided into CD4+ and CD8+ T cells, which recognize short peptide antigens presented by major histocompatibility complex (MHC) class II and MHC class I, respectively, in humoral and cell-mediated immunity. Because of Human Leukocyte Antigen (HLA) diversity and the restriction imposed by peptide-HLA complexes, TCRs are highly diverse and complex. To better characterize the human TCR, the present study compares the TCR repertoires of CD4+ and CD8+ T cells from 30 healthy donors. Count, clonality, diversity, frequency, and VDJ usage differed between the CD4+ and CD8+ TCR-β repertoires, but CDR3 length did not. The Common Clone Cluster result showed that the CD4+ and CD8+ TCR repertoires are each connected across individuals, separately from one another, which is unexpected given HLA diversity. More knowledge about the TCR creates more opportunities for immunotherapy. The TCR repertoire remains a mystery awaiting discovery.


Introduction
When pathogens and antigens are encountered and processed by antigen-presenting cells (macrophages, dendritic cells, B cells, etc.), short peptides are presented by major histocompatibility complex (MHC) molecules for recognition by T-cell receptors (TCRs) on the surface of T cells [1]. TCR signaling cooperates with costimulatory molecules, cytokines, integrins, chemokines, and metabolites to drive T cells to differentiate into CD4+ and CD8+ T cells [2]. CD4+ T cells can differentiate into T helper type 1 (Th1), Th2, Th17, follicular helper T, and regulatory T (Treg) cells. CD8+ effector T cells fight pathogens at initial exposure, and memory T cells provide defense against future infection [3]. This immune balance is delicately regulated [4,5].
The TCR repertoire changes dynamically with the antigens the immune system faces, such as tumors and infections; this process involves HLA diversity and gives each individual a unique TCR repertoire. Because the repertoire changes over time and with antigen exposure, TCR repertoire analysis is an important way to comprehensively understand the nature of the TCR. TCR signaling affects the fate of T cells, including expansion, differentiation, and antigen recognition, but the contribution of TCR differences remains unclear.
The emergence of high-throughput sequencing technology and bioinformatics provides opportunities to analyze and annotate immune repertoire data, which can inform immune prediction and disease progression (tumor microenvironment characterization, minimal residual disease assessment, transplantation, autoimmune disease, and evaluation of immune checkpoint inhibitor efficacy) [6][7][8] and help identify target antigens (HIV, HBV, HCV, SARS-CoV-2, cancer, etc.) [9,10].
In our previous studies, we focused on MHC-I [11] and MHC-II [12] presentation function in diseases, and on TCR repertoire diversity and recognition of MR1 [13]. In the present study, we focus on the difference in TCR repertoire between CD4+ and CD8+ T cells. The meaning of the TCR repertoire is still waiting

Data analysis
Unless otherwise specified, unique productive TCRβ sequences were defined by CDR3 nucleotide sequence together with V and J gene. To identify unique productive TCRβ sequences, individual samples were downloaded from the ImmunoSEQ software and analyzed with VDJtools [17], and productive rearrangements were filtered. VDJtools is a software framework that analyzes the output of TCR repertoire processing tools and allows a diverse set of post-analysis strategies to be applied. The basic statistics and segment usage module provides general statistics (clonotype and read count, number and frequency of non-coding clonotypes, convergent recombination of CDR3 amino acid sequences) as well as variable and joining segment usage profiles and their pairing frequency in rearranged receptor junction sequences [17]. The repertoire overlap module includes routines for computing sets of overlapping clonotypes and their characteristics [17]. The diversity analysis module includes routines for visualizing the clonotype frequency distribution and computing repertoire diversity estimates [17]. Sample clustering is based on computed repertoire similarity measures [17]. When analyzed by amino acid sequence, unique productive TCRβ sequences were defined by CDR3 amino acid sequence. Sample template counts across unique productive TCRβ sequences were then normalized to the frequency of detection [15].
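The clonotype definition and frequency normalization described above can be illustrated with a minimal Python sketch. It is not the VDJtools or ImmunoSEQ implementation. The record layout (CDR3 nucleotide sequence, V gene, J gene, template count), the function name, and the example values are hypothetical, and the input is assumed to contain productive rearrangements only.

```python
from collections import Counter

def clonotype_frequencies(rearrangements):
    """Collapse rearrangements into unique clonotypes and normalize to frequency.

    `rearrangements` is an iterable of (cdr3_nt, v_gene, j_gene, templates)
    tuples (hypothetical layout). A unique clonotype is keyed by CDR3
    nucleotide sequence plus V and J gene, as in the text.
    """
    counts = Counter()
    for cdr3_nt, v_gene, j_gene, templates in rearrangements:
        counts[(cdr3_nt, v_gene, j_gene)] += templates
    total = sum(counts.values())
    if total == 0:
        return {}
    # Frequency of detection = templates of the clonotype / all templates.
    return {key: n / total for key, n in counts.items()}

# Made-up example records; the first two rows describe the same clonotype.
sample = [
    ("TGTGCCAGCAGT", "TRBV19-01", "TRBJ02-07", 6),
    ("TGTGCCAGCAGT", "TRBV19-01", "TRBJ02-07", 2),
    ("TGTGCCAGCAGC", "TRBV06-05", "TRBJ01-01", 2),
]
freqs = clonotype_frequencies(sample)
```

Keying by CDR3 amino acid sequence instead of the nucleotide triple would give the amino-acid-level definition mentioned in the same paragraph.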

Results
Count, clonality, diversity, and frequency in CD4+ and CD8+ TCR-β repertoire
The counts of the CD4+ TCRβ repertoire (mean 165639) from 30 healthy donors are higher than those of the CD8+ TCRβ repertoire (mean 112800), whether compared as two groups (P=0.0119) or within individuals (P=0.0082; Figure 1A). The productive Simpson clonality of the CD4+ TCRβ repertoire is lower than that of CD8+ TCRβ in both the group and the individual comparison (Figure 1B). The diversity of the CD4+ TCRβ repertoire, estimated by the extrapolated Chao diversity estimate, d50, the inverse Simpson index, and the Efron-Thisted estimate, is higher than that of CD8+ TCRβ in both comparisons (Figure 1C-F). The mean clonotype frequency of the CD4+ TCRβ repertoire is lower than that of CD8+ TCRβ in both comparisons (Figure 1G,H). The diversity of non-coding clonotypes in the CD4+ TCRβ repertoire is less than in CD8+ TCRβ in both comparisons (Figure 1I), whereas the frequency of non-coding clonotypes does not differ significantly between the CD4+ and CD8+ TCRβ repertoires (Figure 1J).
The productive Simpson clonality of a sample is calculated as the square root of Simpson's diversity index over all productive rearrangements. Values near 1 represent samples dominated by a few rearrangements, and values near 0 represent more polyclonal samples. Diversity estimates computed on the original data can be biased by uneven sampling depth (sample size); of these, only the extrapolated Chao estimate (chaoE) is properly normalized for comparison between samples. The d50 index derives from a method for identifying normal or abnormal immune status in an individual, in which a normal immune status is characterized by a greater diversity of clonotypes accounting for a significant percentage of the total number of cells, and an abnormal immune status by a significantly lower number of clonotypes accounting for that percentage [18].
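To make these measures concrete, the following Python sketch computes productive Simpson clonality as defined above (the square root of Simpson's index, taken here as the sum of squared clonotype frequencies), together with the inverse Simpson index. The d50 function uses an assumed operational definition, the smallest number of top-ranked clonotypes that together account for at least half of all templates; this reading and the function names are illustrative and may not match the VDJtools implementation.

```python
import math

def simpson_index(counts):
    """Sum of squared clonotype frequencies for a list of template counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("repertoire has no templates")
    return sum((c / total) ** 2 for c in counts)

def productive_simpson_clonality(counts):
    """Square root of Simpson's index: near 1 is oligoclonal, near 0 polyclonal."""
    return math.sqrt(simpson_index(counts))

def inverse_simpson(counts):
    """Inverse Simpson index: effective number of equally frequent clonotypes."""
    return 1.0 / simpson_index(counts)

def d50(counts):
    """Assumed definition: fewest top clonotypes covering >= 50% of templates."""
    total = sum(counts)
    running = 0
    for rank, c in enumerate(sorted(counts, reverse=True), start=1):
        running += c
        if 2 * running >= total:  # integer comparison avoids float rounding
            return rank
    return 0

# A single dominant clone versus an evenly spread repertoire.
monoclonal = productive_simpson_clonality([100])
polyclonal = productive_simpson_clonality([1] * 10000)
```

In this toy comparison the single-clone sample has clonality 1, while the evenly spread sample approaches 0, matching the interpretation given in the text.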

CDR3 length in CD4 and CD8 TCR-β repertoire
CDR3 length is measured in nucleotides, from the first base of the codon for the conserved cysteine in the V gene through the last base of the codon for the conserved residue in the J gene. The CDR3 length histograms of productive rearrangement frequency for the CD4+ and CD8+ TCRβ repertoires from healthy donors are shown in Figure 2A. The mean length of the CDR3 nucleotide sequence (Figure 2B), the mean number of inserted random nucleotides in the CDR3 sequence (Figure 2C), and the mean number of nucleotides lying between the V and J segments (Figure 2D) do not differ significantly between the CD4+ and CD8+ TCRβ repertoires from healthy donors.
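A length histogram and a frequency-weighted mean CDR3 length of the kind reported here can be sketched in Python as follows. The input layout (pairs of CDR3 nucleotide sequence and clonotype frequency), the function name, and the example sequences are hypothetical; the sketch assumes each sequence already spans the conserved cysteine codon through the conserved J-gene residue codon.

```python
from collections import defaultdict

def cdr3_length_profile(clonotypes):
    """Return (length histogram, frequency-weighted mean CDR3 length).

    `clonotypes` is an iterable of (cdr3_nt, frequency) pairs (hypothetical
    layout). The histogram maps CDR3 length in nucleotides to the summed
    frequency of clonotypes with that length.
    """
    histogram = defaultdict(float)
    for cdr3_nt, freq in clonotypes:
        histogram[len(cdr3_nt)] += freq
    total = sum(histogram.values())
    if total == 0:
        raise ValueError("no clonotype frequency to weight by")
    # Weighted mean: each length contributes in proportion to its frequency.
    mean_length = sum(length * f for length, f in histogram.items()) / total
    return dict(histogram), mean_length

# Made-up example: a 9-nt CDR3 at 75% and a 12-nt CDR3 at 25% frequency.
hist, mean_len = cdr3_length_profile([("TGTGCCAGC", 0.75), ("TGTGCCAGCAGT", 0.25)])
```

Dividing by the total frequency keeps the mean correct even when the supplied frequencies do not sum exactly to 1, for example after filtering.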

Common top 150 clones VDJ usage
Common clones were listed from the 30 CD4+ TCRβ repertoire samples or the 30 CD8+ TCRβ repertoire samples and ranked by accumulated frequency. For each sample, the VDJ usage frequency of the top 150 clones was analyzed. For TRBV04-02 (Figure 6A), TRBV06-04 (Figure 6B), TRBV06-05 (Figure 6C), and TRBV09-01 (Figure 6D), the common top 150 clones of the CD4+ and CD8+ TCRβ repertoires from healthy donors show trends similar to those of the bulk TCRβ repertoire. For TRBV19-01 (Figure 6F), the CD8+ common top 150 clones have a higher frequency than the CD4+ ones, a trend that differs from the bulk TCRβ repertoire. The common top 150 clones of TRBV19-01 are found in all samples, which sets this gene apart from the other TRBV genes. The common top 150 clones are barely found in TRBV10-03 (Figure 6E), TRBV29-01 (Figure 6G), and TRBV30-01 (Figure 6H), which have low frequency in the bulk repertoire.

Common clone cluster
The common clone frequency in productive rearrangements was analyzed in the 30 CD4+ TCRβ repertoire samples and the 30 CD8+ TCRβ repertoire samples. The common clone frequency of the CD4+ TCRβ repertoire from healthy donors is lower than that of CD8+ TCRβ, whether compared as two groups or within individuals (Figure 7A), which shows that CD8+ TCR clones are more widely shared and CD4+ TCR clones are more unique to each individual. Multi-dimensional scaling (MDS) was applied to the all-versus-all pairwise overlap (repertoire similarity) measures, and a pairwise overlap circos plot was used to show the read count, frequency, and diversity shared between samples. The MDS of the 30 CD4+ and 30 CD8+ TCRβ repertoire samples (Figure 7B) showed that the CD4+ and CD8+ TCRβ repertoires can be separated by a line, which means that the CD4+ repertoires, and likewise the CD8+ repertoires, are more similar or conserved across different people than the two subsets are to each other (Figure 7C). The MDS of the 30 CD4+ TCRβ repertoire samples (Figure 8A) and the MDS of the 30 CD8+ TCRβ repertoire samples (Figure 8B) both showed three groups: Cluster Yellow, Cluster Blue (Figure 8C-E), and Cluster Red (Figure 8C-E), indicating that different individuals share similar TCRβ clones.
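The all-versus-all pairwise overlap underlying such an analysis can be sketched in Python. The sketch uses the Jaccard index of shared clonotypes as the similarity measure, which is an assumption chosen for simplicity and not necessarily the measure used in this study or in VDJtools. The sample names, CDR3 strings, and function names are hypothetical. The resulting distance matrix is the kind of input an MDS routine would take; the MDS step itself is not shown.

```python
from itertools import combinations

def jaccard_overlap(rep_a, rep_b):
    """Fraction of clonotypes shared between two repertoires (Jaccard index)."""
    a, b = set(rep_a), set(rep_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def pairwise_distance_matrix(repertoires):
    """All-versus-all distances (1 - Jaccard overlap) between samples.

    `repertoires` maps a sample name to a dict of clonotype -> frequency
    (hypothetical layout). Returns the sorted sample names and a nested
    dict holding a symmetric distance matrix with a zero diagonal.
    """
    names = sorted(repertoires)
    dist = {name: {other: 0.0 for other in names} for name in names}
    for x, y in combinations(names, 2):
        d = 1.0 - jaccard_overlap(repertoires[x], repertoires[y])
        dist[x][y] = dist[y][x] = d
    return names, dist

# Made-up repertoires: two donors share one CD8+ clonotype, while the CD4+
# sample of donor 1 shares nothing with either CD8+ sample.
reps = {
    "D1_CD8": {"CASSA": 0.5, "CASSB": 0.5},
    "D2_CD8": {"CASSA": 0.7, "CASSC": 0.3},
    "D1_CD4": {"CASSD": 1.0},
}
names, dist = pairwise_distance_matrix(reps)
```

In this toy input the two CD8+ samples from different donors lie closer to each other than either does to the CD4+ sample, which is the qualitative pattern described for Figure 7B.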

Discussion
In this study, we collected the TCRβ repertoires of CD4+ and CD8+ T cells from 30 healthy donors reported in three papers (Supplementary Table S1) [14][15][16], to identify patterns in the TCRβ clones of CD4+ and CD8+ T cells. The CD4+ TCRβ repertoire has higher counts and diversity, and lower clonality and mean frequency, than the CD8+ TCRβ repertoire (Figure 1A-H). The CD4+ non-coding TCR clones (Figure 1I) have more diversity compared with CD8+ TCRβ, which is the same as the coding clones. CDR3 length (Figure 2A-D) and non-coding TCR frequency (Figure 1J) showed no significant difference between CD4+ and CD8+ cells. Non-coding TCR clones reflect the T-cell preselection repertoire [19]. These results suggest that the CD4+ and CD8+ TCRβ repertoire frequencies are the same in the preselection repertoire, that the difference in diversity is established at the beginning, and that frequency changes during T cell maturation and activation (Figure 1A-J).
The CD4+ TCRβ repertoire has fewer common clones than the CD8+ TCRβ repertoire (Figure 7A), while the bulk CD4+ TCRβ repertoire has higher counts and diversity (Figure 1A-H). This suggests that CD8+ TCRβ clones are shared between different individuals, possibly because of exposure to the same foreign antigens (bacteria or viruses). Interestingly, the MDS result (Figure 7B,C) showed that one person's CD8+ TCRβ repertoire is more similar to other people's CD8+ repertoires than to his or her own CD4+ TCRβ repertoire, suggesting that the CD8+ and CD4+ TCRβ repertoires may follow some hidden pattern. The MDS of the 30 CD4+ TCRβ repertoire samples (Figure 8A) and the MDS of the 30 CD8+ TCRβ repertoire samples (Figure 8B) both showed the same three groups (Figure 8C-H). Different people may therefore have similar TCRβ repertoires because of HLA similarity, which may provide a clue that HLA influences the TCR repertoire. Although one person's CD8+ TCRβ repertoire is more similar to other people's than to his or her own CD4+ TCRβ repertoire, the shared clusters (Figure 8A,B) of the CD4+ and CD8+ TCRβ repertoires showed that the CD4+ and CD8+ TCR repertoires are each connected across individuals, separately from one another.
TCR recognition is vital for defense against infection and tumors [9]. The TCR can recognize peptide antigens (MHC-I, MHC-II, and CD1) and other antigens [20], among which MR1 and HLA-E present metabolites and non-self lipids, indicating that T cells have additional roles in immune responses, tissue homeostasis, and inflammation [21]. Cancer studies have found that high diversity in the TCR repertoire may be associated with better prognosis [22]. Cancer immunotherapy, such as chimeric antigen receptor (CAR)-T cells and TCR-T cells, has recently undergone rapid development for clinical use.
T cells are vital in the adaptive immune response, not only in defending against mutation and foreign antigens but also in maintaining immune homeostasis. The recognition and function of T cells rely on the TCR, which is diverse and can recognize antigens, but the relationship between the TCR and antigen presentation molecules is still a mystery. Deciphering TCR diversity and clonality could help uncover the mystery of the immune system.

Figure 2. CDR3 length in CD4 and CD8 TCR-β repertoire. (A) CDR3 length histogram for productive rearrangements frequency. (B) The mean length of CDR3 nucleotide sequence, weighted by clonotype frequency. (C) The mean number of inserted random nucleotides in CDR3 sequence. (D) The mean number of nucleotides that lie between V and J segment sequences in CDR3. (N = 30; Left is non-paired t-test of CD4 and CD8 TCR-β repertoire as two groups, Right is paired t-test of CD4 and CD8 TCR-β repertoire for individual; ns, not significant).

Figure 7. Common Clone Cluster in CD4 and CD8 TCR-β repertoire. (A) Common clone in productive rearrangements (N = 30; Left is non-paired t-test of CD4 and CD8 TCR-β repertoire as two groups; Right is paired t-test of CD4 and CD8 TCR-β repertoire for individual; ****P<0.0001). (B) Multi-dimensional scaling (MDS) for an all-versus-all pairwise overlap of repertoire similarity measures. (C) Pairwise overlap circos plot. Count, frequency, and diversity panels correspond to the read count, frequency (both non-symmetric), and the total number of clonotypes that are shared between samples.

Table 1. Sample Overview (Continued)
Columns: Sample name; Total templates; Productive templates; Fraction productive
[14][15][16]otechnologies) from three different studies (Table 1 and Supplementary Table S1) [14][15][16]. The alignment of sequencing reads on V, D and J segments of TCR, defined according to IMGT (THE INTERNATIONAL IMMUNOGENETICS INFORMATION SYSTEM, www.imgt.org), assembly of aligned sequences into clonotypes, conversion from nucleotides into amino acid sequences, and computation of the sequencing counts were performed and retrieved from immuneACCESS by Adaptive ImmunoSEQ software.