Approximately half of the human genome consists of repetitive sequences, and these DNA sequences (as well as their transcribed repetitive RNA and translated amino-acid repeat sequences) are known as the repeatome. Within this repeatome there are approximately two million tandem repeats, dispersed throughout the genome and estimated to constitute ∼8% of the entire human genome. Tandem repeats are located throughout exons, introns and intergenic regions, thus potentially affecting the structure and function of tandemly repetitive DNA, RNA and protein sequences. Over more than three decades, more than 60 monogenic human disorders have been found to be caused by tandem-repeat mutations. These monogenic tandem-repeat disorders include Huntington's disease, a variety of ataxias, amyotrophic lateral sclerosis and frontotemporal dementia, as well as many other neurodegenerative diseases. They also include fragile X syndrome, related fragile X disorders, and other neurological and psychiatric disorders. However, these monogenic tandem-repeat disorders, which were discovered via their dominant or recessive modes of inheritance, may represent the ‘tip of the iceberg’ with respect to tandem-repeat contributions to human disorders. A previous proposal that tandem repeats may contribute to the ‘missing heritability’ of various common polygenic human disorders has recently been supported by a variety of new evidence. This includes genome-wide studies that associate tandem-repeat mutations with autism, schizophrenia, Parkinson's disease and various types of cancers. In this article, I will discuss how tandem-repeat mutations and polymorphisms could contribute to a wide range of common disorders, along with some of the many major challenges of tandem-repeat biology and medicine. Finally, I will discuss the potential of tandem repeats to be therapeutically targeted, so as to prevent and treat an expanding range of human disorders.

Our understanding of the human genome has been transformed in recent decades by genome sequencing and other approaches to mapping genetic mutations and polymorphisms. For monogenic disorders, this has involved identification of causative mutations. These mutations can take a variety of different forms, affecting both protein-coding and non-coding sequences across the genome, and ranging from point mutations to structural mutations. As approximately half of the human genome is constituted by repetitive sequences (the repeatome) [1], this provides a rich substrate for mutation, as many types of repetitive DNA, including tandem repeats, are highly mutable. A small subset of the approximately two million tandem repeats distributed across the human genome has thus far been implicated in monogenic disorders, as discussed below. However, this may represent only a fraction of the total contribution of tandem repeats to human health and disease. Given the latest estimate that ∼8% of the entire human genome consists of tandem repeats [2], there is enormous scope for tandem repeats to affect human development, function and dysfunction.

For polygenic disorders, approaches such as genome-wide association studies (GWAS) have mapped genes and intergenic regions associated with a variety of human conditions. However, GWAS of human disorders and traits has left ‘missing heritability’ for these polygenic disorders [3,4]. The approximately two million tandem-repeat sequences (tandemly repeated DNA motifs) in the human genome have only recently become a target for genome-wide studies [5–9]. It has been proposed that tandem repeats can contribute to this missing heritability for a wide range of common disorders [3]. This article will focus predominantly on this issue, and its relevance to our understanding of the pathogenesis of a wide range of disorders, as well as ongoing challenges in the field of tandem repeats. The relevance to the prevention and treatment of disease will also be discussed.

It is clear that the approximately two million tandem repeats (including short tandem repeats, or STRs, and variable number tandem repeats, or VNTRs) in the human genome are not unique to our bipedal primate species (which is only one of tens of millions of extant species that share the planet with us). Comparative genomics demonstrates high homologies of human genome-wide tandem repeats with other primates, other mammals and more disparate vertebrates [10]. Furthermore, invertebrate genomes are also extensively populated by tandem repeats, as are other non-animal species, from plants to microbes. This comparative genomics can help us understand how different types of tandem repeats, in different parts of genomes, evolved a diversity of structures and functions. These biological roles, at the molecular level, can include regulation of epigenetics, gene expression, RNA structure and function, and protein structure and function [11].
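To make the notion of a tandemly repeated DNA motif concrete, the following is a minimal sketch of how perfect tandem repeats might be located in a short sequence. It is an illustration only and not the method used by any of the genotyping tools cited in this article. The function names, the motif-length limit of 6 bp and the minimum of three copies are assumptions chosen for the example, and real repeats are often interrupted or imperfect, which this sketch ignores.

```python
import re

def is_primitive(unit):
    """True if the motif is not itself a repetition of a shorter motif
    (e.g. 'CAG' is primitive, 'CAGCAG' is not)."""
    return unit not in (unit + unit)[1:-1]

def find_tandem_repeats(seq, max_unit=6, min_copies=3):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    max_unit and min_copies are illustrative thresholds (assumptions),
    not values taken from the article.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_unit + 1):
        # A k-base motif followed by at least (min_copies - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            if is_primitive(unit):  # avoid reporting 'CAGCAG' as well as 'CAG'
                hits.append((m.start(), unit, (m.end() - m.start()) // k))
    return sorted(hits)

if __name__ == "__main__":
    # Hypothetical toy sequence: five CAG copies between arbitrary flanks.
    example = "TTGA" + "CAG" * 5 + "TTACG"
    for start, unit, copies in find_tandem_repeats(example):
        print(start, unit, copies)
```

The primitive-motif check keeps a single report per repeat tract, so a CAG tract is described by its shortest motif rather than also as a repeat of CAGCAG.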

Another key source of tandem-repeat variability involves heterogeneity across humans, in both healthy and diseased populations. Comparative human genomics reveals tight biological constraints on many thousands of tandem repeats. These constraints can be most extreme for tandem repeats that encode amino-acid repeats in proteins, such as polyglutamine tracts. One extreme example is the CAG/glutamine repeat in the FOXP2 gene/protein, which is ∼40 glutamines in length in the protein, and almost invariant in humans [12,13]. Notably, a family with a specific mutation (outside the tandem repeat) in the FOXP2 gene had an extreme speech disorder [12], reflecting the strong evolutionary pressures on this gene during human evolution [13]. In contrast, the CAG/glutamine repeat in the huntingtin (HTT) gene/protein that expands to cause Huntington's disease (HD) is far more polymorphic in the general population, with evidence for functional impacts (short of the HD threshold) of this tandem-repeat polymorphism (TRP) in the non-HD general population [14–17]. In fact, a key characteristic of ‘pathogenic repeats’ (those associated with monogenic tandem-repeat disorders) is that they tend to be polymorphic in the human population, relative to other tandem repeats [2,18].

Comparative genomics at the DNA level will thus have much to offer with respect to understanding the evolution and function of the millions of tandem repeats, which are currently largely ‘genomic dark matter’. Furthermore, comparative genome-wide epigenomics, transcriptomics and proteomics (of tandemly repeated DNA, RNA and polypeptide sequences) will also be highly informative. Many tandem repeats located outside of coding regions appear to regulate epigenetic modifications, and thus the spatiotemporal control of gene expression [7,19,20]. Similarly, the majority of the human genome can be transcribed, and tandemly repeated RNA sequences can modulate RNA structure and function in many different ways [21–24]. Finally, tandemly repeated DNA sequences in coding regions, and sometimes outside of coding regions via repeat-associated non-ATG (RAN) translation [25–27], can encode repetitive amino-acid sequences within proteins, thus modulating various aspects of protein structure and function.
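As a small illustration of how a tandemly repeated DNA sequence can encode a repetitive amino-acid sequence, the sketch below reads a CAG repeat in each of its three forward reading frames. In the standard genetic code CAG encodes glutamine (Q), AGC encodes serine (S) and GCA encodes alanine (A), so the same repeat tract can in principle specify three different homopolymeric peptides depending on the frame. The function name and the three-entry codon table are assumptions made for this example, and the sketch does not model how RAN translation is initiated.

```python
# Only the three codons present in a CAG repeat are listed (standard genetic code).
CODON_TO_AA = {"CAG": "Q", "AGC": "S", "GCA": "A"}

def translate_frames(dna):
    """Translate a DNA string in forward frames 0, 1 and 2.

    Codons missing from the small table above are shown as 'X'.
    """
    dna = dna.upper()
    peptides = {}
    for frame in range(3):
        # Take complete codons only, starting at the frame offset.
        codons = [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]
        peptides[frame] = "".join(CODON_TO_AA.get(c, "X") for c in codons)
    return peptides

if __name__ == "__main__":
    # Hypothetical tract of five CAG copies.
    for frame, peptide in translate_frames("CAG" * 5).items():
        print(frame, peptide)
```

Frame 0 gives a polyglutamine tract, while the shifted frames give polyserine and polyalanine, which is one reason a single expanded repeat can have several distinct protein-level consequences.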

There are now well over 60 monogenic tandem-repeat disorders identified, including many neurological disorders, and many others that also impinge upon the nervous system. Furthermore, this list of tandem-repeat disorders continues to expand, year by year. As these tandem-repeat disorders have been extensively reviewed in recent years (e.g. [11,28–39]), their individual pathogenic mechanisms, biologies and therapeutic targets will not be a focus of this article.

These monogenic disorders include Huntington's disease, Friedreich ataxia, various spinocerebellar ataxias, fragile X syndrome, and C9ORF72-associated amyotrophic lateral sclerosis (motor neuron disease) and frontotemporal dementia. The large number of monogenic tandem-repeat disorders is direct evidence of the functional importance of these tandem repeats. It demonstrates that the human body, and the nervous system in particular, is intolerant of major mutations (particularly expansions) in many of these tandem repeats.

However, the greatest collective burden of disease in humans is not due to monogenic disorders, but rather is associated with more common polygenic disorders (and their complex and heterogeneous pathogenic mixtures of genetic and environmental risk factors, or genomes and ‘enviromes’; [40]). The era of GWAS, which has thus far been largely based on microarrays genotyping single-nucleotide polymorphisms (SNPs), or single-nucleotide variants (SNVs), has found genetic associations across the genome for a wide variety of human disorders and traits. However, GWAS has also left substantial ‘missing heritability’, and it is an urgent priority to fully understand genetic contributions to disease, including those residing in the repeatome, and tandem repeats in particular.

It has been previously proposed that tandem repeats may make a major contribution to the missing heritability of polygenic disorders [3]. In recent years, genome-sequencing approaches, together with innovations in tandem-repeat bioinformatics, have provided new insights into common conditions such as autism [41–43], which have been followed up in subsequent studies [44]. There is evidence, although less extensive than for autism, that tandem-repeat mutations (and polymorphisms) could contribute to risk for schizophrenia [45,46] and Parkinson's disease [47]. Other genome-wide approaches to tandem repeats have revealed major contributions to various forms of cancer [48,49]. A recent study identified VNTRs with very strong associations to glaucoma and colorectal cancer [49]. Furthermore, genome-wide tandem repeats may not only contribute to common polygenic disorders, but also complex traits [50].

Considering how few polygenic disorders have been thoroughly investigated with respect to tandem-repeat associations, there is enormous potential for discovery. In theory, existing GWAS datasets based on SNP-chip microarrays can be reanalysed with imputation approaches, so that tandem repeats linked to disease-associated SNPs can be imputed [6]. However, a problem with such approaches is that many tandem repeats mutate more frequently (are far more mutable) than single nucleotides. Therefore, SNPs are unlikely to ‘tag’ many tandem repeats, as the increased mutability of tandem repeats may confound such linkage approaches [3].

As the cost of genome sequencing has decreased, genome-sequencing approaches have continued to transform clinical genetics. Whole-genome sequencing (WGS) and whole-exome sequencing (WES; cheaper and faster, but less comprehensive) allow detailed mapping of tandem repeats throughout genomes and exomes, and comparison between disease and control populations, or across the phenotypic spectrum of different traits. One caveat is that genome/exome sequences generated from short-read sequencing (e.g. Illumina) may not be able to accurately capture longer tandem-repeat sequences [10,11]. However, more recent long-read sequencing technologies (e.g. Oxford Nanopore Technologies and PacBio) are better positioned to fully map tandem repeats, and their associations with human disorders [10,11,51,52]. Furthermore, long-read sequencing technology is improving variant detection and the identification of causal variants in human disease. For example, long-read sequencing is replacing Southern blots as gold-standard genotyping for many repeat-expansion diseases [52].
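The short-read caveat can be illustrated with simple arithmetic: a single read can size a repeat directly only if it covers the whole repeat tract plus some unique flanking sequence on each side. The sketch below is a simplification under stated assumptions. The function names are hypothetical, the 150 bp and 15,000 bp read lengths and the 20 bp minimum flank are illustrative values rather than figures from this article, and short-read tools can also draw on other evidence (for example read pairs) that is not modelled here.

```python
def read_can_span(read_len, unit_len, copies, min_flank=20):
    """True if one read could cover the whole repeat plus min_flank bases
    of unique sequence on each side (min_flank is an assumed value)."""
    return read_len >= unit_len * copies + 2 * min_flank

def max_spannable_copies(read_len, unit_len, min_flank=20):
    """Largest copy number a single read could span under the same assumption."""
    return max(0, (read_len - 2 * min_flank) // unit_len)

if __name__ == "__main__":
    # Assumed read lengths: 150 bp (short read) and 15,000 bp (long read).
    for read_len in (150, 15000):
        for copies in (20, 40, 1000):
            # A trinucleotide repeat such as CAG has unit_len = 3.
            print(read_len, copies, read_can_span(read_len, 3, copies))
```

Under these assumptions a short read spans a 20-copy trinucleotide repeat but not a 40-copy one, whereas a long read spans all three examples, which is the intuition behind the move towards long-read genotyping.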

The science of tandem repeats, like many other areas of science, has been catalysed by new technologies and analytic approaches. In the case of tandem-repeat sequencing, the advent of long-read sequencing technologies (as mentioned above) has opened up new possibilities regarding genome-wide mapping of tandem repeats, within and across individuals [10,11,35,52–55]. Complementary advances have also been made using novel bioinformatic approaches [8,29,56–73]. In addition to new bioinformatic tools to interrogate tandem repeats, there has also been progress with the development of systematic cataloguing and databasing of tandem repeats (e.g. [74]).

The characterisation and mapping of tandem repeats across the human genome are thus becoming more routine (e.g. [5,75–77]). Importantly, genome-wide bioinformatic approaches have begun to link tandem-repeat sequences to molecular mechanisms, such as regulation of epigenetic modifications and associated gene expression [7,20,78,79]. However, there is an urgent need for further progress, so that whole-genome sequencing can be used routinely, and affordably, to accurately genotype all tandem repeats in the genome, and reliably identify pathogenic variants.

The ultimate aim of clinical genetics and genomics is to facilitate novel approaches to prevent, treat, and eventually cure, human disorders. The biology of tandem repeats provides a rich source of therapeutic targets. For example, in Huntington's disease (HD), a range of different approaches are being taken to either correct the tandem-repeat (CAG) expansion mutation (for example by CRISPR gene editing), lower somatic gene expression levels (for example by antisense oligonucleotide therapeutics; [80,81]), or target ‘downstream’ pathogenic pathways, such as polyglutamine toxicity (e.g. [31,82,83]) and various aspects of brain-body interactions, including the microbiota-gut-brain axis (e.g. [84,85]).

CRISPR (clustered regularly interspaced short palindromic repeats) gene editing is being pursued by several academic groups and companies, as are antisense oligonucleotides [86] and small-molecule drugs [87]. A specific variant of the CRISPR approach has been developed to target transcription from tandem repeats [88] and these and other approaches offer significant hope for the prevention and treatment of such fatal monogenic diseases.

In theory, somatic gene-editing approaches, such as those provided by CRISPR technologies, could be applied to all tandem-repeat disorders. In practice, there are many remaining challenges. One challenge is delivery, and this is a problem that has faced the gene therapy field more widely, for decades. The brain is a particularly challenging organ for targeted gene editing, due primarily to the blood-brain barrier. Another challenge is constituted by potential off-target effects of gene editing. This is a particularly significant problem for tandem-repeat disorders, as the repetitive nature of the target can be a challenge for specificity. Similarly, for the many autosomal dominant tandem-repeat disorders (with Huntington's disease being amongst the most common), ideal therapeutic approaches will be allele specific, and thus the therapy must selectively target the tandem repeat-expanded allele, leaving the normal ‘healthy’ allele intact [81,86].

For some tandem-repeat disorders, it appears that somatic expansion of the tandem repeat contributes to pathogenesis (e.g. [89–93]). This may occur during development and/or adulthood, via various mechanisms (e.g. [94,95]). Therefore, the tandem-repeat target for gene-editing may vary in length between different cells, systems and organs. This presents an additional challenge for gene-editing approaches that target highly specific repeat lengths. One approach to target such somatic repeat expansion involves small-molecule therapeutics that could inhibit expansion, and perhaps even induce contraction of pathologically expanded repeats (e.g. [96–98]).

The first grand challenge of tandem-repeat biology will be accurately mapping all tandem repeats across all known species. We know for example that tandem repeats have evolved extensively during human evolution, in large populations (e.g. [99]), with short-term mutational dynamics (without long-term evolutionary pressures) also observed in individual families (e.g. [100]). We need to know much more about how tandem repeats evolve (e.g. [22,101]) and how this relates to organismal development, structure, function and evolution.

The second grand challenge will be to fully characterise tandem-repeat polymorphisms across large human populations, and relate this to phenotypic information. This approach is beginning to emerge (e.g. [102]). However, with approximately eight billion humans on the planet, many of whom are located in countries embarking on population-wide genome sequencing approaches to facilitate genomic and precision medicine, we are only at the beginning of a long journey. Integrating this tandem-repeat genomics with phenomics, to establish how tandem-repeat polymorphisms and mutations contribute to human diseases and traits, will facilitate novel approaches to the prevention and treatment of a wide variety of disorders.

The third grand challenge is to accelerate therapeutic development (discussed above) to find novel ways to prevent or treat all human disorders involving tandem repeats. The development of new therapies will in some cases be specific to a particular tandem repeat and disease. However, commonalities across disorders (e.g. polyglutamine and polyalanine tracts) may inform novel approaches targeting multiple diseases. Furthermore, tandem-repeat therapeutics may have DNA, RNA or protein targets, and involve a wide range of rapidly evolving technologies, from small molecules, to DNA and RNA editing, and biologics targeting repetitive RNA and amino-acid sequences.

Due to space constraints, this article has only been able to provide a flavour of the excitement and enormous potential of the tandem-repeat field. Whilst the focus has been on ‘tandem-repeat medicine’ and ‘tandem-repeat therapeutics’, the potential applications run far beyond human biology and disease. As the vast majority, if not all, other species have tandem repeats in their genomes, our understanding of genome-wide tandem-repeat biology at molecular, cellular and systems levels could have immense impacts. Whilst tandem repeats, and the rest of the repeatome, have long been the ‘dark matter of the genome’, scientific illumination promises to be transformative. There are undoubtedly novel applications of tandem-repeat biology in ecology, conservation, agriculture, and beyond. But most excitingly, understanding and targeting tandem repeats offers new hope to prevent, treat, and eventually cure, a wide range of devastating disorders, thus improving human health and reducing morbidity and mortality.

  • Approximately 8% of the human genome consists of tandemly repeated DNA sequences, known as tandem repeats, located throughout both genic and intergenic regions

  • These tandem repeats, the most common of which are short tandem repeats (STRs), have been associated with over 60 monogenic human disorders (e.g. Huntington's disease, many ataxias, amyotrophic lateral sclerosis, frontotemporal dementia, fragile X syndrome and other predominantly neurological disorders)

  • Tandem repeats have recently been associated with common polygenic disorders, including autism, schizophrenia, Parkinson's disease, and many cancers, suggesting major involvement in ‘missing heritability’

  • Tandem repeats constitute major therapeutic targets for this large, and expanding, list of human disorders, and current candidate approaches include tandem-repeat targeted CRISPR gene editing, antisense oligonucleotides, biologics and small-molecule drugs

The author declares that there are no competing interests associated with this manuscript.

A.J.H. has been supported by a National Health and Medical Research Council (NHMRC) Principal Research Fellowship (GNT1117148) and his laboratory is also supported by NHMRC Project and Ideas Grants, an Australian Research Council (ARC) Discovery Project, European Union Joint Programme on Neurodegenerative Diseases (EU-JPND; with NHMRC co-funding), the DHB Foundation (Equity Trustees), the Hereditary Disease Foundation (HDF) and the Flicker of Hope Foundation.

Open access for this article was enabled by the participation of University of Melbourne in an all-inclusive Read & Publish agreement with Portland Press and the Biochemical Society under a transformative agreement with CAUL.

I thank the many wonderful past and present members of the Hannan Laboratory, and many collaborators and colleagues, for their research which has informed my thoughts in this field. I also gratefully acknowledge the many families affected by Huntington's disease, and other tandem-repeat disorders, that I have had the pleasure and privilege of interacting with over the past three decades, as they have provided a constant source of inspiration for our ongoing research.

CRISPR: clustered regularly interspaced short palindromic repeats
GWAS: genome-wide association studies
HD: Huntington's disease
HTT: huntingtin
SNPs: single-nucleotide polymorphisms
SNVs: single-nucleotide variants
STRs: short tandem repeats
VNTRs: variable number tandem repeats
WES: whole-exome sequencing
WGS: whole-genome sequencing

1. Nurk, S., Koren, S., Rhie, A., Rautiainen, M., Bzikadze, A.V., Mikheenko, A. et al. (2022) The complete sequence of a human genome. Science 376, 44–53
2. English, A., Dolzhenko, E., Jam, H.Z., Mckenzie, S., Olson, N.D., De Coster, W. et al. (2023) Benchmarking of small and large variants across tandem repeats. bioRxiv
3. Hannan, A.J. (2010) Tandem repeat polymorphisms: modulators of disease susceptibility and candidates for ‘missing heritability’. Trends Genet. 26, 59–65
4. Brandes, N., Weissbrod, O. and Linial, M. (2022) Open problems in human trait genetics. Genome Biol. 23, 131
5. Willems, T., Zielinski, D., Yuan, J., Gordon, A., Gymrek, M. and Erlich, Y. (2017) Genome-wide profiling of heritable and de novo STR variations. Nat. Methods 14, 590–592
6. Saini, S., Mitra, I., Mousavi, N., Fotsing, S.F. and Gymrek, M. (2018) A reference haplotype panel for genome-wide imputation of short tandem repeats. Nat. Commun. 9, 4397
7. Fotsing, S.F., Margoliash, J., Wang, C., Saini, S., Yanicky, R., Shleizer-Burko, S. et al. (2019) The impact of short tandem repeat variation on gene expression. Nat. Genet. 51, 1652–1659
8. Mousavi, N., Shleizer-Burko, S., Yanicky, R. and Gymrek, M. (2019) Profiling the genome-wide landscape of tandem repeat expansions. Nucleic Acids Res. 47, e90
9. Shi, Y., Niu, Y., Zhang, P., Luo, H., Liu, S., Zhang, S. et al. (2023) Characterization of genome-wide STR variation in 6487 human genomes. Nat. Commun. 14, 2092
10. Gymrek, M. (2017) A genomic view of short tandem repeats. Curr. Opin. Genet. Dev. 44, 9–16
11. Hannan, A.J. (2018) Tandem repeats mediating genetic plasticity in health and disease. Nat. Rev. Genet. 19, 286–298
12. Lai, C.S., Fisher, S.E., Hurst, J.A., Vargha-Khadem, F. and Monaco, A.P. (2001) A forkhead-domain gene is mutated in a severe speech and language disorder. Nature 413, 519–523
13. Enard, W., Przeworski, M., Fisher, S.E., Lai, C.S., Wiebe, V., Kitano, T. et al. (2002) Molecular evolution of FOXP2, a gene involved in speech and language. Nature 418, 869–872
14. Gardiner, S.L., van Belzen, M.J., Boogaard, M.W., van Roon-Mom, W.M.C., Rozing, M.P., van Hemert, A.M. et al. (2017) Huntingtin gene repeat size variations affect risk of lifetime depression. Transl. Psychiatry 7, 1277
15. Faquih, T.O., Aziz, N.A., Gardiner, S.L., Li-Gao, R., de Mutsert, R., Milaneschi, Y. et al. (2023) Normal range CAG repeat size variations in the HTT gene are associated with an adverse lipoprotein profile partially mediated by body mass index. Hum. Mol. Genet. 32, 1741–1752
16. Estevez-Fraga, C., Altmann, A., Parker, C.S., Scahill, R.I., Costa, B., Chen, Z. et al. (2023) Genetic topography and cortical cell loss in Huntington's disease link development and neurodegeneration. Brain 146, 4532–4546
17. Schultz, J.L., Neema, M. and Nopoulos, P.C. (2023) Unravelling the role of huntingtin: from neurodevelopment to neurodegeneration. Brain 146, 4408–4410
18. Ziaei Jam, H., Li, Y., DeVito, R., Mousavi, N., Ma, N., Lujumba, I. et al. (2023) A deep population reference panel of tandem repeat variation. Nat. Commun. 14, 6711
19. Bakhtiari, M., Park, J., Ding, Y.C., Shleizer-Burko, S., Neuhausen, S.L., Halldórsson, B.V. et al. (2021) Variable number tandem repeats mediate the expression of proximal genes. Nat. Commun. 12, 2075
20. Mortazavi, M., Ren, Y., Saini, S., Antaki, D., St Pierre, C.L., Williams, A. et al. (2022) SNPs, short tandem repeats, and structural variants are responsible for differential gene expression across C57BL/6 and C57BL/10 substrains. Cell Genom. 2, 100102
21. Hale, M.A., Johnson, N.E. and Berglund, J.A. (2019) Repeat-associated RNA structure and aberrant splicing. Biochim. Biophys. Acta Gene Regul. Mech. 1862, 194405
22. Herbert, A. (2020) Simple repeats as building blocks for genetic computers. Trends Genet. 36, 739–750
23. Hasuike, Y., Tanaka, H., Gall-Duncan, T., Mehkary, M., Nakatani, K., Pearson, C.E. et al. (2021) Overlapping mechanisms of lncRNA and expanded microsatellite RNA. Wiley Interdiscip. Rev. RNA 12, e1634
24. Ninomiya, K. and Hirose, T. (2020) Short tandem repeat-enriched architectural RNAs in nuclear bodies: functions and associated diseases. Noncoding RNA 6, 6
25. Zu, T., Gibbens, B., Doty, N.S., Gomes-Pereira, M., Huguet, A., Stone, M.D. et al. (2011) Non-ATG-initiated translation directed by microsatellite expansions. Proc. Natl Acad. Sci. U.S.A. 108, 260–265
26. Green, K.M., Linsalata, A.E. and Todd, P.K. (2016) RAN translation—what makes it run? Brain Res. 1647, 30–42
27. Guo, S., Nguyen, L. and Ranum, L.P.W. (2022) RAN proteins in neurodegenerative disease: repeating themes and unifying therapeutic strategies. Curr. Opin. Neurobiol. 72, 160–170
28. Pattamatta, A., Cleary, J.D. and Ranum, L.P.W. (2018) All in the family: repeats and ALS/FTD. Trends Neurosci. 41, 247–250
29. Tankard, R.M., Bennett, M.F., Degorski, P., Delatycki, M.B., Lockhart, P.J. and Bahlo, M. (2018) Detecting expansions of tandem repeats in cohorts sequenced with short-read sequencing data. Am. J. Hum. Genet. 103, 858–873
30. Buijsen, R.A.M., Toonen, L.J.A., Gardiner, S.L. and van Roon-Mom, W.M.C. (2019) Genetics, mechanisms, and therapeutic progress in polyglutamine spinocerebellar ataxias. Neurotherapeutics 16, 263–286
31. Gonzalez-Alegre, P. (2019) Recent advances in molecular therapies for neurological disease: triplet repeat disorders. Hum. Mol. Genet. 28, R80–R87
32. Rodriguez, C.M. and Todd, P.K. (2019) New pathologic mechanisms in nucleotide repeat expansion disorders. Neurobiol. Dis. 130, 104515
33. Depienne, C. and Mandel, J.L. (2021) 30 years of repeat expansion disorders: what have we learned and what are the remaining challenges? Am. J. Hum. Genet. 108, 764–785
34. Malik, I., Kelley, C.P., Wang, E.T. and Todd, P.K. (2021) Molecular mechanisms underlying nucleotide repeat expansion disorders. Nat. Rev. Mol. Cell Biol. 22, 589–607
35. Gall-Duncan, T., Sato, N., Yuen, R.K.C. and Pearson, C.E. (2022) Advancing genomic technologies and clinical awareness accelerates discovery of disease-associated tandem repeat sequences. Genome Res. 32, 1–27
36. Swinnen, B., Robberecht, W. and Van Den Bosch, L. (2020) RNA toxicity in non-coding repeat expansion disorders. EMBO J. 39, e101112
37. Annear, D.J. and Kooy, R.F. (2023) Unravelling the link between neurodevelopmental disorders and short tandem CGG-repeat expansions. Emerg. Top. Life Sci., ETLS20230021
38. Kumar, M., Tyagi, N. and Faruq, M. (2023) The molecular mechanisms of spinocerebellar ataxias for DNA repeat expansion in disease. Emerg. Top. Life Sci., ETLS20230013
39. Panoyan, M.A. and Wendt, F.R. (2023) The role of tandem repeat expansions in brain disorders. Emerg. Top. Life Sci., ETLS20230022
40. McOmish, C.E., Burrows, E.L. and Hannan, A.J. (2014) Identifying novel interventional strategies for psychiatric disorders: integrating genomics, ‘enviromics’ and gene-environment interactions in valid preclinical models. Br. J. Pharmacol. 171, 4719–4728
41. Mitra, I., Huang, B., Mousavi, N., Ma, N., Lamkin, M., Yanicky, R. et al. (2021) Patterns of de novo tandem repeat mutations and their role in autism. Nature 589, 246–250
42. Trost, B., Engchuan, W., Nguyen, C.M., Thiruvahindrapuram, B., Dolzhenko, E., Backstrom, I. et al. (2020) Genome-wide detection of tandem DNA repeats that are expanded in autism. Nature 586, 80–86
43. Hannan, A.J. (2021) Repeat DNA expands our understanding of autism spectrum disorder. Nature 589, 200–202
44. Annear, D.J., Vandeweyer, G., Sanchis-Juan, A., Raymond, F.L. and Kooy, R.F. (2022) Non-Mendelian inheritance patterns and extreme deviation rates of CGG repeats in autism. Genome Res. 32, 1967–1980
45. Mojarad, B.A., Engchuan, W., Trost, B., Backstrom, I., Yin, Y., Thiruvahindrapuram, B. et al. (2022) Genome-wide tandem repeat expansions contribute to schizophrenia risk. Mol. Psychiatry 27, 3692–3698
46. Wen, J., Trost, B., Engchuan, W., Halvorsen, M., Pallotto, L.M., Mitina, A. et al. (2023) Rare tandem repeat expansions associate with genes involved in synaptic and neuronal signaling functions in schizophrenia. Mol. Psychiatry 28, 475–482
47. Bustos, B.I., Billingsley, K., Blauwendraat, C., Gibbs, J.R., Gan-Or, Z., Krainc, D. et al. (2023) Genome-wide contribution of common short-tandem repeats to Parkinson's disease genetic risk. Brain 146, 65–74
48. Erwin, G.S., Gürsoy, G., Al-Abri, R., Suriyaprakash, A., Dolzhenko, E., Zhu, K. et al. (2023) Recurrent repeat expansions in human cancer genomes. Nature 613, 96–102
49. Mukamel, R.E., Handsaker, R.E., Sherman, M.A., Barton, A.R., Hujoel, M.L.A., McCarroll, S.A. et al. (2023) Repeat polymorphisms underlie top genetic risk loci for glaucoma and colorectal cancer. Cell 186, 3659–3673.e23
50. Wendt, F.R., Pathak, G.A. and Polimanti, R. (2022) Phenome-wide association study of loci harboring de novo tandem repeat mutations in UK Biobank exomes. Nat. Commun. 13, 7682
51. Logsdon, G.A., Vollger, M.R. and Eichler, E.E. (2020) Long-read human genome sequencing and its applications. Nat. Rev. Genet. 21, 597–614
52. Chaisson, M.J.P., Sulovari, A., Valdmanis, P.N., Miller, D.E. and Eichler, E.E. (2023) Advances in the discovery and analyses of human tandem repeats. Emerg. Top. Life Sci., ETLS20230074
53. Chintalaphani, S.R., Pineda, S.S., Deveson, I.W. and Kumar, K.R. (2021) An update on the neurological short tandem repeat expansion disorders and the emergence of long-read sequencing diagnostics. Acta Neuropathol. Commun. 9, 98
54. Stevanovski, I., Chintalaphani, S.R., Gamaarachchi, H., Ferguson, J.M., Pineda, S.S., Scriba, C.K. et al. (2022) Comprehensive genetic diagnosis of tandem repeat expansion disorders with programmable targeted nanopore sequencing. Sci. Adv. 8, eabm5386
55. Rafehi, H., Bennett, M.F. and Bahlo, M. (2023) Detection and discovery of repeat expansions in ataxia enabled by next-generation sequencing: present and future. Emerg. Top. Life Sci., ETLS20230018
56. Gymrek, M., Golan, D., Rosset, S. and Erlich, Y. (2012) lobSTR: a short tandem repeat profiler for personal genomes. Genome Res. 22, 1154–1162
57. Gymrek, M. and Erlich, Y. (2013) Profiling short tandem repeats from short reads. Methods Mol. Biol. 1038, 113–135
58. Gelfand, Y., Hernandez, Y., Loving, J. and Benson, G. (2014) VNTRseek-a computational tool to detect tandem repeat variants in high-throughput sequencing data. Nucleic Acids Res. 42, 8884–8894
59. Dolzhenko, E., van Vugt, J.J.F.A., Shaw, R.J., Bekritsky, M.A., van Blitterswijk, M., Narzisi, G. et al. (2017) Detection of long repeat expansions from PCR-free whole-genome sequence data. Genome Res. 27, 1895–1903
60. Kristmundsdóttir, S., Sigurpálsdóttir, B.D., Kehr, B. and Halldórsson, B.V. (2017) popSTR: population-scale detection of STR variants. Bioinformatics 33, 4041–4048
61. Bahlo, M., Bennett, M.F., Degorski, P., Tankard, R.M., Delatycki, M.B. and Lockhart, P.J. (2018) Recent advances in the detection of repeat expansions with short-read next-generation sequencing. F1000Res. 7, F1000 Faculty Rev-736
62. Bakhtiari, M., Shleizer-Burko, S., Gymrek, M., Bansal, V. and Bafna, V. (2018) Targeted genotyping of variable number tandem repeats with adVNTR. Genome Res. 28, 1709–1719
63. Dashnow, H., Lek, M., Phipson, B., Halman, A., Sadedin, S., Lonsdale, A. et al. (2018) STRetch: detecting and discovering pathogenic short tandem repeat expansions. Genome Biol. 19, 121
64. Dolzhenko, E., Deshpande, V., Schlesinger, F., Krusche, P., Petrovski, R., Chen, S. et al. (2019) Expansionhunter: a sequence-graph-based tool to analyze variation in short tandem repeat regions. Bioinformatics 35, 4754–4756
65. Mitsuhashi, S., Frith, M.C., Mizuguchi, T., Miyatake, S., Toyota, T., Adachi, H. et al. (2019) Tandem-genotypes: robust detection of tandem repeat expansions from long DNA reads. Genome Biol. 20, 58
66. Bolognini, D., Magi, A., Benes, V., Korbel, J.O. and Rausch, T. (2020) TRiCoLOR: tandem repeat profiling using whole-genome long-read sequencing data. Gigascience 9, giaa101
67. Dolzhenko, E., Bennett, M.F., Richmond, P.A., Trost, B., Chen, S., van Vugt, J.J.F.A. et al. (2020) Expansionhunter Denovo: a computational method for locating known and novel repeat expansions in short-read sequencing data. Genome Biol. 21, 102
68
Chiu
,
R.
,
Rajan-Babu
,
I.-S.
,
Friedman
,
J.M.
and
Birol
,
I.
(
2021
)
Straglr: discovering and genotyping tandem repeat expansions using whole genome long-read sequences
.
Genome Biol.
22
,
224
69
Mousavi
,
N.
,
Margoliash
,
J.
,
Pusarla
,
N.
,
Saini
,
S.
,
Yanicky
,
R.
and
Gymrek
,
M.
(
2021
)
TRTools: a toolkit for genome-wide analysis of tandem repeats
.
Bioinformatics
37
,
731
733
70
Dashnow
,
H.
,
Pedersen
,
B.S.
,
Hiatt
,
L.
,
Brown
,
J.
,
Beecroft
,
S.J.
,
Ravenscroft
,
G.
et al. (
2022
)
STRling: a k-mer counting approach that detects short tandem repeat expansions at known and novel loci
.
Genome Biol.
23
,
257
71
Fang
,
L.
,
Liu
,
Q.
,
Monteys
,
A.M.
,
Gonzalez-Alegre
,
P.
,
Davidson
,
B.L.
and
Wang
,
K.
(
2022
)
Deeprepeat: direct quantification of short tandem repeats on signal data from nanopore sequencing
.
Genome Biol.
23
,
108
72
Vollger
,
M.R.
,
Kerpedjiev
,
P.
,
Phillippy
,
A.M.
and
Eichler
,
E.E.
(
2022
)
Stainedglass: interactive visualization of massive tandem repeat structures with identity heatmaps
.
Bioinformatics
38
,
2049
2051
73
Taylor
,
A.S.
,
Barros
,
D.
,
Gobet
,
N.
,
Schuepbach
,
T.
,
McAllister
,
B.
,
Aeschbach
,
L.
et al. (
2022
)
Repeat detector: versatile sizing of expanded tandem repeats and identification of interrupted alleles from targeted DNA sequencing
.
NAR Genom. Bioinform.
4
,
lqac089
74
Lundström
,
O.S.
,
Adriaan Verbiest
,
M.
,
Xia
,
F.
,
Jam
,
H.Z.
,
Zlobec
,
I.
,
Anisimova
,
M.
et al. (
2023
)
WebSTR: a population-wide database of short tandem repeat variation in humans
.
J Mol Biol.
435
,
168260
75
Rajan-Babu
,
I.-S.
,
Peng
,
J.J.
,
Chiu
,
R.
;
IMAGINE Study
;
CAUSES Study
;
Li
,
C.
et al. (
2021
).
Genome-wide sequencing as a first-tier screening test for short tandem repeat expansions
.
Genome Med.
13
,
126
76
Fearnley
,
L.G.
,
Bennett
,
M.F.
and
Bahlo
,
M.
(
2022
)
Detection of repeat expansions in large next generation DNA and RNA sequencing data without alignment
.
Sci. Rep.
12
,
13124
77
Erdmann
,
H.
,
Schöberl
,
F.
,
Giurgiu
,
M.
,
Leal Silva
,
R.M.
,
Scholz
,
V.
,
Scharf
,
F.
et al. (
2023
)
Parallel in-depth analysis of repeat expansions in ataxia patients by longread sequencing
.
Brain
146
,
1831
1843
78
Gymrek
,
M.
,
Willems
,
T.
,
Guilmatre
,
A.
,
Zeng
,
H.
,
Markus
,
B.
,
Georgiev
,
S.
et al. (
2016
)
Abundant contribution of short tandem repeats to gene expression variation in humans
.
Nat. Genet.
48
,
22
29
79
Quilez
,
J.
,
Guilmatre
,
A.
,
Garg
,
P.
,
Highnam
,
G.
,
Gymrek
,
M.
,
Erlich
,
Y.
et al. (
2016
)
Polymorphic tandem repeats within gene promoters act as modifiers of gene expression and DNA methylation in humans
.
Nucleic Acids Res.
44
,
3750
3762
80
Zain
,
R.
and
Smith
,
C.I.E.
(
2019
)
Targeted oligonucleotides for treating neurodegenerative tandem repeat diseases
.
Neurotherapeutics
16
,
248
262
81
Tabrizi
,
S.J.
,
Estevez-Fraga
,
C.
,
van Roon-Mom
,
W.M.C.
,
Flower
,
M.D.
,
Scahill
,
R.I.
,
Wild
,
E.J.
et al. (
2022
)
Potential disease-modifying therapies for huntington's disease: lessons learned and future opportunities
.
Lancet Neurol.
21
,
645
658
82
van der Bent
,
M.L.
,
Evers
,
M.M.
and
Vallès
,
A.
(
2022
)
Emerging therapies for Huntington's disease: focus on N-terminal Huntingtin and Huntingtin exon 1
.
Biologics
16
,
141
160
83
Moreira-Gomes
,
T.
and
Nóbrega
,
C.
(
2023
)
From the disruption of RNA metabolism to the targeting of RNA-binding proteins: the case of polyglutamine spinocerebellar ataxias
.
J. Neurochem.
84
Kong
,
G.
,
Cao
,
K.L.
,
Judd
,
L.M.
,
Li
,
S.
,
Renoir
,
T.
and
Hannan
,
A.J.
(
2020
)
Microbiome profiling reveals gut dysbiosis in a transgenic mouse model of Huntington's disease
.
Neurobiol. Dis.
135
,
104268
85
Gubert
,
C.
,
Choo
,
J.M.
,
Love
,
C.J.
,
Kodikara
,
S.
,
Masson
,
B.A.
,
Liew
,
J.J.M.
et al. (
2022
)
Faecal microbiota transplant ameliorates gut dysbiosis and cognitive deficits in Huntington's disease mice
.
Brain Commun.
4
,
fcac205
86
Bennett
,
C.F.
,
Kordasiewicz
,
H.B.
and
Cleveland
,
D.W.
(
2021
)
Antisense drugs make sense for neurological diseases
.
Annu. Rev. Pharmacol. Toxicol.
61
,
831
852
87
Childs-Disney
,
J.L.
,
Yang
,
X.
,
Gibaut
,
Q.M.R.
,
Tong
,
Y.
,
Batey
,
R.T.
and
Disney
,
M.D.
(
2022
)
Targeting RNA structures with small molecules
.
Nat. Rev. Drug Discov.
21
,
736
762
88
Pinto
,
B.S.
,
Saxena
,
T.
,
Oliveira
,
R.
,
Méndez-Gómez
,
H.R.
,
Cleary
,
J.D.
,
Denes
,
L.T.
et al. (
2017
)
Impeding transcription of expanded microsatellite repeats by deactivated Cas9
.
Mol. Cell
68
,
479
490.e5
89
Yau
,
W.Y.
,
O'Connor
,
E.
,
Sullivan
,
R.
,
Akijian
,
L.
and
Wood
,
N.W.
(
2018
)
DNA repair in trinucleotide repeat ataxias
.
FEBS J.
285
,
3669
3682
90
Deshmukh
,
A.L.
,
Caron
,
M.C.
,
Mohiuddin
,
M.
,
Lanni
,
S.
,
Panigrahi
,
G.B.
,
Khan
,
M.
et al. (
2021
)
FAN1 exo- not endo-nuclease pausing on disease-associated slipped-DNA repeats: a mechanism of repeat instability
.
Cell Rep.
37
,
110078
91
Deshmukh
,
A.L.
,
Porro
,
A.
,
Mohiuddin
,
M.
,
Lanni
,
S.
,
Panigrahi
,
G.B.
,
Caron
,
M.C.
et al. (
2021
)
FAN1, a DNA repair nuclease, as a modifier of repeat expansion disorders
.
J. Huntingtons Dis.
10
,
95
122
92
Wheeler
,
V.C.
and
Dion
,
V.
(
2021
)
Modifiers of CAG/CTG repeat instability: insights from mammalian models
.
J. Huntingtons Dis.
10
,
123
148
93
Zhao
,
X.
,
Kumari
,
D.
,
Miller
,
C.J.
,
Kim
,
G.Y.
,
Hayward
,
B.
,
Vitalo
,
A.G.
et al. (
2021
)
Modifiers of somatic repeat instability in mouse models of friedreich ataxia and the fragile X-related disorders: implications for the mechanism of somatic expansion in Huntington's disease
.
J. Huntingtons Dis.
10
,
149
163
94
McGinty
,
R.J.
and
Mirkin
,
S.M.
(
2018
)
Cis- and trans-modifiers of repeat expansions: blending model systems with human genetics
.
Trends Genet.
34
,
448
465
95
Polleys
,
E.J.
,
Del Priore
,
I.
,
Haber
,
J.E.
and
Freudenreich
,
C.H.
(
2023
)
Structure-forming CAG/CTG repeats interfere with gap repair to cause repeat expansions and chromosome breaks
.
Nat. Commun.
14
,
2469
96
Nakamori
,
M.
,
Panigrahi
,
G.B.
,
Lanni
,
S.
,
Gall-Duncan
,
T.
,
Hayakawa
,
H.
,
Tanaka
,
H.
et al. (
2020
)
A slipped-CAG DNA-binding small molecule induces trinucleotide-repeat contractions in vivo
.
Nat. Genet.
52
,
146
159
97
Verma
,
A.K.
,
Khan
,
E.
,
Bhagwat
,
S.R.
and
Kumar
,
A.
(
2020
)
Exploring the potential of small molecule-based therapeutic approaches for targeting trinucleotide repeat disorders
.
Mol. Neurobiol.
57
,
566
584
98
Hasuike
,
Y.
,
Tanaka
,
H.
,
Gall-Duncan
,
T.
,
Mehjary
,
M.
,
Nakatani
,
K.
,
Pearson
,
C.E.
et al. (
2022
)
CAG repeat-binding small molecule improves motor coordination impairment in a mouse model of Dentatorubral-pallidoluysian atrophy
.
Neurobiol. Dis.
163
,
105604
99
Sulovari
,
A.
,
Li
,
R.
,
Audano
,
P.A.
,
Porubsky
,
D.
,
Vollger
,
M.R.
,
Logsdon
,
G.A.
et al. (
2019
)
Human-specific tandem repeat expansion and differential gene expression during primate evolution
.
Proc. Natl Acad. Sci. U.S.A.
116
,
23243
23253
100
Steely
,
C.J.
,
Watkins
,
W.S.
,
Baird
,
L.
and
Jorde
,
L.B.
(
2022
)
The mutational dynamics of short tandem repeats in large, multigenerational families
.
Genome Biol.
23
,
253
101
Verbiest
,
M.
,
Maksimov
,
M.
,
Jin
,
Y.
,
Anisimova
,
M.
,
Gymrek
,
M.
and
Bilgin Sonay
,
T.
(
2023
)
Mutation and selection processes regulating short tandem repeats give rise to genetic and phenotypic diversity across species
.
J. Evol. Biol.
36
,
321
336
102
Mukamel
,
R.E.
,
Handsaker
,
R.E.
,
Sherman
,
M.A.
,
Barton
,
A.R.
,
Zheng
,
Y.
,
McCarroll
,
S.A.
et al. (
2021
)
Protein-coding repeat polymorphisms strongly shape diverse human phenotypes
.
Science
373
,
1499
1505
This is an open access article published by Portland Press Limited on behalf of the Biochemical Society and the Royal Society of Biology and distributed under the Creative Commons Attribution License 4.0 (CC BY). Open access for this article was enabled by the participation of University of Melbourne in an all-inclusive Read & Publish agreement with Portland Press and the Biochemical Society under a transformative agreement with CAUL.