EFTA01140242.pdf

Personalized genomic disease risk of volunteers
Manuel L. Gonzalez-Garay(a,1), Amy L. McGuire(b), Stacey Pereira(b), and C. Thomas Caskey(c,1)

(a) Center for Molecular Imaging, Division of Genomics and Bioinformatics, The Brown Foundation Institute of Molecular Medicine, University of Texas Health Science Center, Houston, TX 77030; and (b) Center for Medical Ethics and Health Policy, Department of Medicine and Medical Ethics, and (c) Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030
Contributed by C. Thomas Caskey, August 27, 2013 (sent for review July 11, 2013)
Next-generation sequencing (NGS) is commonly used for researching the causes of genetic disorders. However, its usefulness in clinical practice for medical diagnosis is in early development. In this report, we demonstrate the value of NGS for genetic risk assessment and evaluate the limitations and barriers for the adoption of this technology into medical practice. We performed whole exome sequencing (WES) on 81 volunteers, and for each volunteer, we requested personal medical histories, constructed a three-generation pedigree, and required their participation in a comprehensive educational program. We limited our clinical reporting to disease risks based on only rare damaging mutations and known pathogenic variations in genes previously reported to be associated with human disorders. We identified 271 recessive risk alleles (214 genes), 126 dominant risk alleles (101 genes), and 3 X-recessive risk alleles (3 genes). We linked personal disease histories with causative disease genes in 18 volunteers. Furthermore, by incorporating family histories into our genetic analyses, we identified an additional five heritable diseases. Traditional genetic counseling and disease education were provided in verbal and written reports to all volunteers. Our report demonstrates that when genome results are carefully interpreted and integrated with an individual's medical records and pedigree data, NGS is a valuable diagnostic tool for genetic disease risk.

molecular medicine | disease prediction | whole exome sequencing

Significance
Replacing traditional methods for genetic testing of inheritable disorders with next-generation sequencing (NGS) will reduce the cost of genetic testing and increase the information available for the patients. NGS will become an invaluable resource for the patient and physicians, especially if the sequencing information is stored properly and reanalyzed as bioinformatics tools and annotations improve. NGS is still at the early stages of development and it is full of false-positive and -negative results and requires infrastructure and specialized personnel to properly analyze the results. This paper will explain our experience with an adult population, our bioinformatics analysis, and our clinical decisions to assure that our genetic diagnostics were accurate to detect carrier status and serious medical conditions in our volunteers.

Author contributions: M.L.G.-G. and C.T.C. designed research; M.L.G.-G., A.L.M., S.P., and C.T.C. performed research; M.L.G.-G. analyzed data; and M.L.G.-G., A.L.M., S.P., and C.T.C. wrote the paper.
The authors declare no conflict of interest.
Freely available online through the PNAS open access option.
1To whom correspondence may be addressed. E-mail: Manuel.L.GonzalezGaray@uth.tmc.edu or tcaskey@bcm.edu.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1315934110/-/DCSupplemental.

Sequencing the whole genome of patients with genetic disorders has become reality since the sequencing of the first individual human in 2007 (1). Further advances in massively parallel DNA sequencing are reducing the price of sequencing an entire genome or exome. The quality and speed of sequencing and analyzing a personal genome are improving at an unprecedented pace, making possible the introduction of next-generation sequencing (NGS) into the clinic on a research basis (2-7). Advancements in NGS have stimulated international research initiatives to identify genetic links to rare disorders in children, with an average diagnostic success of 20-25% and the discovery of new disease-gene associations (8-12).
   The rapidly increasing number of aging adults in our society will place unprecedented demands on the health care system. To provide adults with a healthy longevity we need to develop a system to identify genetic risk and apply early intervention on pathology progression. In this report, we decided to sequence the whole exomes of a healthy adult cohort of 81 volunteers and evaluate the value of applying NGS in combination with medical history and pedigree data. In this report we plan to address three main questions. (i) What genetic discoveries need to be provided to the volunteers? (ii) What is the practical value of delivering this information to volunteers? (iii) What are the challenges and barriers to the adoption of this powerful technology into medical practice?
   The individual genetic reports yield helpful medical risk information, suggesting that population sequencing of asymptomatic adults may prove to be valuable and useful. We provided to the participants, under our institutional review board, genetic risk findings from the analyses and genetic counseling to discuss their results.

Results
Categories of Variants to Report to Patients. Variants obtained from our workflow (described in Fig. 1) were reported using three categories. Our first variant category consists of variants identified in an individual where the alleles are found in the Human Gene Mutation Database (HGMD) (13, 14) and labeled disease-causing mutations (DM). These alleles also were required to be rare [<1% allele frequency in 6,500 exomes from the National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project (15) and the 1,000 Genomes Project (16, 17)] and predicted to be damaging to protein function by two of three prediction algorithms [Polyphen 2.0 (18), Sift (19-24), and MutationTaster (25)] using Database of Human Non-synonymous SNVs and their functional predictions and annotations (dbNSFP) (26) as described in Fig. 2. The genome sequence data of each volunteer were reviewed and interpreted, taking into account personal medical history, a three-generation pedigree with family history of diseases, and bioinformatics analysis. The medical history of each volunteer in this cohort was rich with detail because each had a private physician used for annual examinations, and in some cases, disease therapy. Fig. 3 summarizes the results of our pipeline: we recruited 81 nonrelated volunteers and sequenced their genomic DNA using exome sequencing. We detected 65,582 unique nonsynonymous coding variants (nscv). Every nscv was interrogated for human inherited disease mutations using the HGMD (13, 14) database from Biobase (DM category consisting of 109,708 variations). We were able to detect 1,036 HGMD (13, 14) DM variations. After using the filters described in Fig. 2, the number was reduced to 275 pathogenic variants. We identified in our cohort 208 autosomal recessive (AR) alleles (169 genes), 64 autosomal dominant (AD) alleles (44 genes), and three X-linked recessive (XLR)

www.pnas.org/cgi/doi/10.1073/pnas.1315934110    PNAS Early Edition | 1 of 6




alleles (3 genes). These data resulted in an average of 3.5 disease allele reports per volunteer.

Fig. 1. Workflow for processing NGS data. Raw sequencing data are aligned against the reference sequence using Novoalign software from NovoCraft. SAM files are preprocessed using SAMtools and Picard to create BAM files and remove duplicates. The Genome Analysis Toolkit (GATK) is then used to recalibrate the alignments, perform local realignments, and identify SNPs and indels. Finally, SnpEff and ANNOVAR are used to annotate variants.

   The approach for a second category of variants consisted of creating a personalized list of candidate genes from Online Mendelian Inheritance in Man (OMIM) (27, 28) known to be associated with the disorders reported in the medical literature. We detected 131 alleles (131 genes) using this approach. Each one of these variants provided a potential causation for the volunteer's disorders. Each one of the variations obtained from this approach passed our stringent pipeline. This approach added on average another 2.0 disease alleles per volunteer report.
   The third approach used a family history to create a personalized list of candidate genes from OMIM (27, 28), and as before, we compared our list of candidate genes with the disorders reported in the family history.
   Before reporting an allele to the volunteer, we reviewed the original publications that support the pathogenicity of all of the alleles (HGMD) and/or the evidence associating the gene with the disorder (OMIM). At this time, all three abovementioned categories of investigation were reported in full recognition that some would be found to be non-disease-producing alleles as databases improve and functional assays complement informatics predictions. We have updated clinical reports as these data emerged and counseled the patients on the options for reducing or eliminating the disease risk.

Disease Genes Identified in the Cohort. Table S1 summarizes our disease associations. Matching personal medical records to personal genome reports was informative. We elected to report findings as disease-gene associations instead of reporting findings as diagnostic because we did not include in our study traditional "surrogate markers" (analytes, proteins, and imaging) for the confirmation of a disease diagnosis. We considered potentially causative findings to be those mutations that are predicted to be damaging in addition to being reported in either HGMD (13, 14) or OMIM (27, 28) databases. These mutations are considered to be "need to know" and are reported to volunteers. There was identification of associations for vascular disease and/or hypercholesterolemia in five individuals related to LDL receptor (LDLR) alleles. LDLR mutations are causative of early onset autosomal dominant coronary artery disease (CAD) and manifest hypercholesterolemia (29, 30). Three individuals were taking statins related to their hypercholesterolemia. Two individuals were not under care but had history of personal hypercholesterolemia and in one case a son with hypercholesterolemia.
   There were four volunteers detected with risk genes for diabetes mellitus (31-34). Two of the individuals were under therapy for diabetes 2, whereas two additional volunteers had elevated fasting blood sugars and were being followed by their physicians for further analytes measurements. There were two individuals with morbid obesity (body mass index of 32 and 37 kg/m2) who carried an MC4R allele associated with pediatric obesity and rare heterozygotic adults (35, 36). Two ophthalmologic disease/gene associations were identified. The childhood brittle corneal syndrome type 1 occurred in a volunteer who had undergone successful corneal transplant and carried a putative compound heterozygosity in ZNF469 (37). One volunteer was under care for macular dystrophy and carried an ABCA4 allele (38). One sterile male volunteer was found to have an insertion in gene USP26 (known to be responsible for infertility in men) (39). Associations for melanoma and breast cancer were identified. The two patients with melanoma carried different gene allele associations: GRIN2A and BAG4 (40-42). Two volunteers diagnosed with breast cancer had different allele associations in BRCA2 (43, 44). Single cases of early onset prostate cancer (LRP2) (45) and follicular thyroid cancer (TPR) were identified (46, 47). A volunteer with nonsyndromic deafness was found to have risk alleles in two genes associated with autosomal dominant (AD) deafness and had a three-generation positive family history of deafness (48). In each case, the volunteer was instructed to inform their physician and was requested to confirm the genomic allele identification in a Clinical Laboratory Improvement Amendments (CLIA)-certified laboratory, even when each reported allele had been sequenced twice in independent studies. The finding provided information for personal and family risk counseling not possible before gene association.

Incorporation of Three-Generation Pedigrees into the Genetic Analyses. The three-generation pedigree medical information was analyzed to identify those volunteer families who warranted additional genetic study. Table S2 lists those genetic disorders identified by pedigree/familial medical history. In each case, the volunteer was counseled for the family risk and encouraged to contact at-risk family members who may benefit from focused genetic studies. Three of the families have reported that they have had their familial genetic diagnosis resolved at this time [paraganglioma (49), Prader-Willi syndrome (50, 51), and ankylosing spondylitis (AS) (52)]. One additional family is under study [Tourette syndrome (53)]. Additional familial disease risks were identified by history for atrial fibrillation (AR), bicuspid aortic valve (BAV), dyslexia (AR), Fabry's (XLR), gall stones (AD), and myotonic dystrophy (anticipation AD). Success with this approach was productive but not universally accepted because disease/gene resolution requires interaction with interested and motivated family members.

Fig. 2. Pipeline to generate variants reports. Every variant in the variant call format file is annotated using SnpEff and ANNOVAR; nonsynonymous coding variants are annotated using the commercial version of the HGMD database. (Left) Our selection of variants by the creation of a personalized candidate gene list using medical history and family history for each volunteer. Mutations with a minor allele frequency of >1% are removed using frequencies from the NHLBI Exome Sequencing Project (ESP) and the 1,000 Genomes Project. Variants that are considered benign by two of three prediction tools are removed (using dbNSFP). Finally, we remove variants that are present in our cohort more than three times.
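
The filtering steps described in Fig. 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Variant` record, its field names, and the helpers `passes_filters` and `reportable_variants` are hypothetical, each predictor call is reduced to a boolean, and the example values are made up. The thresholds follow the text: allele frequency below 1% in both reference datasets, damaging by at least two of three predictors, seen no more than three times in the cohort, and either an HGMD DM entry or membership in a personalized OMIM candidate gene list.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one nonsynonymous coding variant."""
    gene: str
    maf_esp: float            # allele frequency in the NHLBI ESP exomes
    maf_1000g: float          # allele frequency in the 1,000 Genomes Project
    polyphen2_damaging: bool  # predictor calls reduced to True/False
    sift_damaging: bool
    mutationtaster_damaging: bool
    cohort_count: int         # times the variant was seen in the cohort
    hgmd_dm: bool             # listed in HGMD as a disease-causing mutation


MAX_MAF = 0.01          # keep variants rarer than 1%
MIN_DAMAGING_CALLS = 2  # damaging by at least two of three predictors
MAX_COHORT_COUNT = 3    # drop variants seen more than three times


def passes_filters(v: Variant, candidate_genes: Optional[Set[str]] = None) -> bool:
    """Apply the frequency, prediction, cohort, and database filters in turn."""
    if max(v.maf_esp, v.maf_1000g) >= MAX_MAF:
        return False
    damaging_calls = sum(
        [v.polyphen2_damaging, v.sift_damaging, v.mutationtaster_damaging]
    )
    if damaging_calls < MIN_DAMAGING_CALLS:
        return False
    if v.cohort_count > MAX_COHORT_COUNT:
        return False
    # First category: HGMD DM. Second and third categories: gene is on the
    # personalized candidate list built from medical or family history.
    in_candidate_list = candidate_genes is not None and v.gene in candidate_genes
    return v.hgmd_dm or in_candidate_list


def reportable_variants(
    variants: Iterable[Variant], candidate_genes: Optional[Set[str]] = None
) -> List[Variant]:
    return [v for v in variants if passes_filters(v, candidate_genes)]


# Made-up example values, for illustration only.
example = [
    Variant("LDLR", 0.0002, 0.0, True, True, False, 1, True),   # kept: rare HGMD DM
    Variant("ABCA4", 0.03, 0.02, True, True, True, 1, True),    # dropped: too common
    Variant("BRCA2", 0.001, 0.0, True, False, False, 1, True),  # dropped: 1 of 3 damaging
    Variant("MC4R", 0.001, 0.0, True, True, True, 1, False),    # kept via candidate list
]
kept = reportable_variants(example, candidate_genes={"MC4R"})
print([v.gene for v in kept])
```

Keeping the candidate gene list as an optional argument lets the same function serve the HGMD-only category and the personalized OMIM categories.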


[Fig. 3 flowchart, recovered from the figure text. 81 volunteers; exome sequencing yielded 65,582 NSC-SNPs; using HGMD (109,708 annotated variants), 1,036 NSC-SNPs were found in HGMD; 275 NSC-SNPs from HGMD remained after filtering, and 160 NSC-SNPs came from OMIM. Medical and family history interpretation: medical history, 23 disease-gene associations; family history, 4 resolved and 1 in progress; negative history, 206 HGMD autosomal recessive (169 genes), 63 OMIM (non-HGMD) autosomal recessive (63 genes), 3 HGMD X-linked recessive (3 genes), 6 OMIM (non-HGMD) X-linked recessive (6 genes), 64 HGMD autosomal dominant (44 genes), and 62 OMIM (non-HGMD) autosomal dominant (62 genes).]

Fig. 3. Summary of results. The flowchart provides the number of variants from each step of the pipeline described in Fig. 2.
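
As a quick arithmetic check, the per-category allele counts quoted in this report can be summed and compared with the stated totals. The sketch below is illustrative only: the dictionary layout and variable names are assumptions, the HGMD counts are taken from the Results text (208 AR, 64 AD, 3 XLR), and the OMIM-only counts are taken from the Fig. 3 flowchart (63 AR, 62 AD, 6 XLR).

```python
# Allele counts by inheritance mode, copied from the report.
hgmd = {"AR": 208, "AD": 64, "XLR": 3}       # HGMD DM alleles after filtering
omim_only = {"AR": 63, "AD": 62, "XLR": 6}   # OMIM candidate-gene alleles not in HGMD

hgmd_total = sum(hgmd.values())              # stated in the text as 275
omim_total = sum(omim_only.values())         # stated in the text as 131
recessive_total = hgmd["AR"] + omim_only["AR"]  # abstract: 271 recessive risk alleles
dominant_total = hgmd["AD"] + omim_only["AD"]   # abstract: 126 dominant risk alleles

print(hgmd_total, omim_total, recessive_total, dominant_total)
```

Under these assumptions the sums reproduce the 275 filtered HGMD variants, the 131 OMIM-derived alleles, and the 271 recessive and 126 dominant risk alleles given in the abstract.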

   Table S3 provides a sampling of the recessive risk alleles. They constitute the majority of the observed alleles. Of the 160 offspring of the 81 volunteers, no children were affected with these disorders. All volunteers indicated their families were complete, and thus, no spousal genetic studies were recommended, but information was proposed to be provided to reproductive age descendants. Many of the genes identified are part of prenatal carrier screens and/or newborn state-sponsored screening programs [phenylketonuria, maple syrup urine disease, cystic fibrosis, Niemann-Pick disease, Gaucher disease, factor V Leiden thrombophilia, medium-chain acyl-CoA dehydrogenase (MCAD) deficiency]. Undoubtedly, NGS will expand the number of unreported disease alleles and scope of genes studied for couples in the pregnancy setting. The Beyond Batten Disease Foundation of Austin, TX (54), has this goal.
   Table S4 shows that a category of high concern was the identification of XLR disease risk alleles among our female volunteers. One volunteer had an affected son (isolated case) with Fabry disease that was diagnosed before our study. There were four disease alleles identified, each listed in HGMD (13, 14). There was no family history of these disorders found in the three-generation pedigree of each. All were counseled to have their test confirmed and daughters studied in a CLIA-certified laboratory given the high disease risk (50% for men). Three men in our study had alleles predicted from the OMIM (27, 28) disease database to be causative for cutis laxa, Duchenne muscular dystrophy, congenital nystagmus, and hemophilia A, illustrating the challenge of predicting damaging mutations bioinformatically. None had the disorders. Counseling and family study were individualized for each disease risk. Volunteers were made aware of database errors in the reports.
   Tables S5-S10 provide a third category that is very problematic, the AD group. The allele identification is as previously described, but counseling is more difficult because of variation in severity and time of onset. For this age group of volunteers, the interest was high because disease prevention was frequently expressed as a goal in the face-to-face counseling meetings. A poststudy survey also reflected this objective. We focused in this paper on the three major causes of death in the United States: cancer, cardiovascular disease, and neurodegenerative disease. In our analysis of each volunteer, we reviewed the genomic and family data.
   Table S5 lists the breast cancer risk results. There were 12 volunteers found to have breast cancer risk alleles of genes BRCA1, BRCA2, PALB2, RAD51C, and RAD50. Two volunteers with BRCA2 risk alleles were diagnosed with breast cancer. One man carried a premature chain termination mutation and has a first-degree relative with breast cancer (50s). A third volunteer had a frameshift mutation (high-risk allele) but was not found to have breast cancer. All alleles were predicted to be damaging. Eight volunteers had first-degree relatives with breast cancer, whereas four had a negative family history of disease. All were advised to seek confirmation via a CLIA-certified laboratory. One patient with an HGMD (13, 14) allele was confirmed but predicted to be "neutral" by a commercial laboratory. All were counseled regarding the need for regular mammograms and gynecological examinations and were requested to inform their physician of this research risk allele identification.
   Table S6 displays the colon cancer alleles. There was no disease incidence of colon cancer in this group with the exception of one volunteer with a positive dysplastic polyp biopsy. Five volunteers had a positive family history of colon cancer. Five volunteers had no family history of disease. All were advised to obtain confirmatory CLIA-certified laboratory diagnosis and advise their physician of the research allele identification. Of the 10 volunteers, many had undergone colonoscopy as part of their health care.
   Table S7 includes all of the remaining types of cancer. Two volunteers diagnosed with melanomas were found to have different disease gene risk alleles. We identified 10 volunteers with prostate risk alleles. One volunteer reported a diagnosis of prostate cancer at age 55 while the other nine volunteers reported no familial history of the disease. Genetic counseling for cancer risk required the greatest counseling time. The concepts of the two-hit hypothesis (55) and "somatic mutations" (56) were difficult to grasp for the volunteers, even when we discussed the subject in great detail during the education session. All volunteers were provided information regarding standard of practice approaches for early detection of the respective cancer.
   Table S8 lists all of the affected volunteers with cardiomyopathies (57). Five volunteers had a medical history of cardiac dysrhythmia with identified risk alleles. One younger (50s) volunteer had first-degree relatives requiring pacemakers and carried two risk alleles. Three volunteers had either stent placements or bypass procedures related to CAD. Each was in their 70s.
   Table S9 lists the 11 volunteers who had no apparent disease but had a positive family history of tachycardia, sudden death, and CAD and carried risk alleles. We provide this experience to broaden alertness to both genetic causation and risk of disease

for adult-onset cardiovascular disease (58). Of the alleles listed        and the 69 sets of whole genomes from CGI (15-17, 67). How-
in Tables S8 and S9, 13 alleles were found in HGMD (13, 14).              ever, we need larger datasets from very carefully phenotyped
We advised volunteers to inform their physicians of these results         patients to assist in the interpretation of the variants in our
for their long-term clinical care.                                        patients. The million genome project of the US Department of
   In Table S10, we listed the results for adult-onset neurodegen-        Veterans Affairs (68) has the potential to provide such data, as
erative diseases. Our findings were limited but of high interest to the   well as private health plans considering adaptation of genome
cohort. It was frequently asked by volunteers if they had Alz-            sequencing.
heimer's risk. We summarize our findings for Alzheimer's and
Parkinson risk alleles (59, 60). The genes included APOE, APP,            Genetic Discoveries Provided to Volunteers. There are several
PSEN1, MAPT, EIF4G1, GBA, GIGYF2, LRRK2, PARK2, PM20D1,                   approaches to disclose the results to volunteers. Groups like
and SNCA. There were nine volunteers with HGMD (13, 14) listed            Patel et al. use the statistics and epidemiology approach in
risk alleles. Of these, two had a positive family history of Parkinson    reporting the polygenic risk assessment using common SNPs that
disease and one with Alzheimer's disease. One of the PARK2 alleles        have been previously associated with genetic disorders from ge-
occurred in a volunteer who provided a history of three second-           nome-wide association studies (69). The PGP-10 project uses an
degree relatives in a sibship affected with disease. The remainder had    automated tool or Genome Environment Trait Evidence (GET-
no family history of either disease. There were 25 alleles predicted      Evidence) system, which is a system that is collaboratively edited
to be damaging. One is a frameshift allele. None of these volunteers      (70). For this project, we decided to focus on reporting only high-
had a family history of disease.                                          quality variants that are rare in the population and considered
                                                                          damaging by two of three commonly used prediction algorithms.
Discussion                                                                In addition, the variant has to be either reported in HGMD
Exome Sequencing Is Limited. The full spectrum of disease muta-           under category DM or the gene has to have been previously
tion identification is not satisfied by exome sequencing alone            associated with a genetic disorder (OMIM). The group of vol-
because large deletions, copy number variations (CNVs), and               unteers consisted of adults with complete medical and family
triplet repeats are not reliably identified at this time. Further-        history so we personalized the reports as described in Fig. 2 to
more, exon capture relies on probe design. For example, the               specifically try to identify molecular explanations for the mal-
discovery of the MAGEL2 mutation in our Prader-Willi patient              adies reported in their medical or family history. This approach
was made using whole genome sequencing (WGS) from Com-                    generated reports that were easy to explain and accepted by the
plete Genomics and missed by exome capture because of high GC             patients during the genetic counseling session.
content (51). The accuracy of coding allele identifications was,
however, quite high and thus of great utility as a genome                 Medical Histories and Family Pedigrees Complement Sequencing
screening approach. CGI (61) sequencing produced higher cov-              Resift. The utility of genome data was significantly enhanced
erage than exome sequencing data for CNV, large deletions,                when integrating standard medical care features of personal and
and regulatory elements will have utility as we analyze previously        family disease diagnosis. The significant number of 23 disease
labeled "junk" DNA for disease causation (62). There is also the          associations in all likelihood represents a bias of our volunteers
issue of our limited knowledge of disease alleles within the              to seek answers to their personal disease history. This observa-
databases. One of our biggest challenges for the interpretation of        tion may hold a key to how we obtain maximal use of genome
human genomes is the lack of gene annotations and the errors in           sequencing--sequence the disease index cases. Our experience
databases. Our knowledge base for human disorders is small.               would suggest a high value for that utilization. This approach has
There are only —100,000 pathogenic variants in the HGMD (13,              been clearly documented to be successful for pediatric genetic
14) database and a fraction of them have errors. If we do not use         disorders but not exploited for adult-onset disease. The practical
annotated variants but instead gene annotations as our source of          value of this study is summarized in Tables SI and S2 and fell
information, we can calculate the fraction of knowledge that we           into two general categories: (i) new knowledge of the genetic risk
can use at this time. For example, the number of genes associ-            and heritability for themselves and family; and (ii) options for
ated with human disorders reported by HGMD (13, 14), OMIM                 therapy (CAD) or imaging (cancer) for personal and extended
(27, 28), UniProtICB (63), Gene Atlas (64), etc. is 4,622. From           family care. By using the medical and family history, we were
the 4,622 genes, only 1,955 genes have high-quality data because          able to clarify the genetic risk in 6 of the 81 cases. One of the
they are part of the GeneTest (65) database. GeneTest (65) is             cases yielded a new discovery of a gene associated with Prader-
a database originally created by the National Center for Bio-             Willi syndrome. which is described in another paper (51).
technology Information to track all of the laboratories worldwide
that offer a genetic test for a gene. With this information, we           Prenatal vs. Adult Genetic Screening. The technology and this report
know that the fraction of genes that we can use for the in-               beg the question of whether we are prepared to offer adult disease
terpretation of a human genome of a successful high-quality               risk screening. Currently, prenatal and newborn screening for
whole exome or whole genome dataset is -7-18% when using                  a selected set of frequently occurring disease alleles (not genome
the high confidence set of 1,955 genes or a set of 4,622 genes.           sequencing) is a standard of practice. There are questions that
Despite these limitations, this report documents the utility for          deserve medical and ethical review before adult screening
disease associations and risk.                                            becomes a standard of practice. First, for reproductive and new-
   During the last few years, the field of NOS has developed              born diagnosis, typically only actionable childhood diseases are
a large number of tools that make it easier to handle the analysis        explored, which respects the future autonomy of the child and
of reads, variant calling, functional prediction, and annotation          preserves her right to an open future (71, 72). Because adult
(66). There are also large publicly available datasets of healthy         screening decisions would be made by an autonomous individual
individuals that can be used as controls that can be used to              for her own health decisions, broader conceptions of utility, in-
remove technology specific errors or filter out common poly-              cluding personal utility, need to be considered (73). It is a clear
morphisms. As we begin to use whole genome sequencing at an               and simple decision to provide patients with actionable genetic
increasing depth, we are discovering more variants, so these              information from a WES study; on the other hand, it is challenging
public datasets are becoming increasingly important for quality           and it raises a difficult ethical question to decide what to do with
control and filtering of variants in smaller projects. One of the         incidental genetic findings that are not actionable and could lead
main limitations is the lack of access to public and private ge-          to physiological distress to the patient (e.g. APO-E for Alzheimer
nome and exome variants. There are thousands of datasets, but             dictate). Despite this ethical dilemma our group of volunteers
the majority are inaccessible to the scientific community. We             elected to receive information even if the genetic information
recognize the existence of the 1,000 Genomes project, the                 might not be actionable. Only 3% of the volunteers were uncertain
NHLBI Exome Sequencing Project (ESP), Exome variant server,               about receiving nonactionable information (SIPausnuly Survey).
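The reporting criteria stated in the Discussion (a high-quality variant that is rare, predicted damaging by two of three prediction algorithms, and either listed in HGMD under category DM or located in a gene with an OMIM disorder association) can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the `Variant` record, its field names, and the `is_reportable` function are hypothetical, the three prediction algorithms are not named in this section, and the 1% frequency cutoff is taken from the rarity threshold given in Results.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed thresholds: "<1% allele frequency" comes from Results;
# "two of three" prediction algorithms comes from the Discussion.
MAX_ALLELE_FREQ = 0.01
MIN_DAMAGING_CALLS = 2


@dataclass
class Variant:
    """Hypothetical annotated variant record (field names are illustrative)."""
    gene: str
    allele_freq: float                        # population allele frequency, 0..1
    high_quality: bool                        # passed sequencing quality filters
    damaging_calls: Tuple[bool, bool, bool]   # verdicts of three predictors
    hgmd_class: Optional[str]                 # e.g. "DM", or None if absent from HGMD
    gene_in_omim: bool                        # gene has an OMIM disorder association


def is_reportable(v: Variant) -> bool:
    """Apply the four reporting criteria described in the text."""
    rare = v.allele_freq < MAX_ALLELE_FREQ
    damaging = sum(v.damaging_calls) >= MIN_DAMAGING_CALLS
    known_disease_link = v.hgmd_class == "DM" or v.gene_in_omim
    return v.high_quality and rare and damaging and known_disease_link


if __name__ == "__main__":
    example = Variant("PARK2", 0.002, True, (True, True, False), None, True)
    print(is_reportable(example))
```

Keeping each criterion as a separate named boolean makes it easy to report which condition excluded a given variant during counseling.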

Volunteer Response to Clinical Reports. From our poststudy survey, we found that 72% of the responders reported speaking with their physician about their results. This raises important questions about whether nongeneticists are adequately prepared to counsel patients based on WES results and whether such follow-up will lead to iatrogenic harm or unjustified use of health care resources (74). Twenty-five percent reported changing their behaviors because of the results, which is surprising given that previous reports found no significant behavior change resulting from adult risk screening in a direct-to-consumer setting (75). Although all of the participants were clearly informed that their results originated from two independent sequencing experiments and we advised them to have their results clinically validated in a CLIA-certified laboratory, 78% reported that they did not have the results confirmed. This low rate of confirmation by the volunteers raises the question of whether it is sufficient to counsel research participants to have results clinically confirmed or if investigators should be required to confirm results before disclosure.

It was apparent for some volunteers that they were seeking information related to familial diseases. Resolution of these questions required family member interest and motivation because, in all cases, we had sequenced the nonrisk family member. We followed up each case with a referral to a qualified genetics program with diagnostic capacity for the suspected genetic disease.

Our efforts to analyze cancer, cardiovascular, neurodegenerative, and obesity/diabetes risk were successful but needed considerable education/counseling to avoid confusion over risk vs. diagnosis. Second, there are standard of care options for those with risk alleles for cancer, cardiovascular disease, and diabetes for disease modification or early diagnosis. Thus, sequencing serves as a new screening risk detection approach toward the objective of improved health. It is expected that genomic studies will increase surveillance studies (e.g., colonoscopy, gynecologic examinations, mammograms, cardiovascular markers and scanning studies) but have the possibility of more precisely identifying the patients who may benefit from disease prevention surveillance.

The area of adult-onset neurologic disorders is an increasing concern worldwide as our population ages, thus exposing disease incidence not seen earlier. The genetic disease discoveries are limited. Confirmatory diagnostics such as image analysis and biomarkers/surrogate markers are just emerging, and prevention therapeutic options are nonexistent. Although one might question the utility of screening for these disorders at this time, the experience with Huntington disease (76) screening taught valuable lessons on how to proceed with studying and counseling families at risk. Furthermore, there are new therapeutic trials in disease prevention for Alzheimer's (58) and Parkinson disease based on the genetic cause of disease. These clinical trials use genetic diagnosis to select participants, which is also a successful approach in cancer drug development (77-79).

Barriers to the Adoption of Genetic Screening via Sequencing. Although the above comments would present the case for the value of adult genetic screening via whole genome sequencing, there are major issues to be addressed. In our opinion, the least is sequencing technology and cost. Bioinformatics focused on the practical extraction of medically relevant/actionable data is a challenge. We relied heavily on HGMD alleles for "need to know" information to patients. This approach is flawed in three ways: (i) databases contain errors; (ii) highly validated disease databases are scattered, private, and limited; and (iii) the future will provide more disease risk alleles by sequencing than by patient reports in the literature. Our current limitation for interpretation of a genome is not the quality of the data or the coverage of the genome but our disease knowledge database. R. Cotton's Human Variome Project (62) together with Beijing Genome Institute are proposing to create a highly validated disease allele database.

New technological advances such as structure-based prediction of protein-protein interactions on a genome-wide scale (80), 3D structure of protein active and contact sites (SI), high-throughput functional assays of damaging alleles (81-83), and new approaches that combine analytes, metabolomics and genetic information from a single individual (84) are just a few examples of the new technologies that will help us to generate better interpretation of genomic data.

The delivery of the genome risk information will need to be carried out by a new cadre of physicians and counselors skilled in medicine, genetics, and education/counseling. These experts will need to integrate into medical care as well as has been done for newborn screening, prenatal diagnosis, and newborn genetic disease diagnosis.

The approach of adult screening is in its early phase but from our data appears very promising. We conclude that the genomic study of adults deserves intensified effort to determine if "need to know" genome information has the utility for improved quality of health for our aging population.

Materials and Methods

The oversight of this research was under two institutional review boards: (i) HSC-IMM-08-0641 (University of Texas Health Science Center at Houston) and (ii) H-30710 (Baylor College of Medicine).

Cohort Description. The cohort consists of members and spouses in the Houston Chapter of the Young President Organization (YPO) (85). The entire description of the cohort can be found in SI Materials and Methods.

MS Sequencing. Standard NGS was performed using Illumina HiSeq; an extended explanation can be found in SI Materials and Methods.

Sequencing Analysis. Fig. 1 illustrates our pipeline, and Fig. 2 describes our pipeline to detect known pathogenic variations. Additional details can be found in SI Materials and Methods.

Counseling. Genome counseling was conducted by a board-certified internist and medical geneticist by both individual meetings and two written summaries over a period of 12 mo. Additional information can be found in SI Materials and Methods.

ACKNOWLEDGMENTS. This work was supported by the Cullen Foundation for Higher Education and the Governing Board of the Greater Houston Community Foundation. The funding organizations made the awards to the University of Texas Health Science Center at Houston and Baylor College of Medicine. C.T.C. was the principal investigator of both grants.


1. Levy S, et al. (2007) The diploid genome sequence of an individual human. PLoS Biol 5(10):e254.
2. Bamshad MJ, et al. (2011) Exome sequencing as a tool for Mendelian disease gene discovery. Nat Rev Genet 12(11):745-755.
3. Tabor HK, Berkman BE, Hull SC, Bamshad MJ (2011) Genomics really gets personal: How exome and whole genome sequencing challenge the ethical framework of human genetics research. Am J Med Genet A 155A(12):2916-2924.
4. Lander ES (2011) Genome-sequencing anniversary. The accelerator. Science 331(6020):1024.
5. Lander ES (2011) Initial impact of the sequencing of the human genome. Nature 470(7333):187-197.
6. Biesecker LG, Burke W, Kohane I, Plon SE, Zimmern R (2012) Next-generation sequencing in the clinic: Are we ready? Nat Rev Genet 13(11):818-824.
7. Hennekam RC, Biesecker LG (2012) Next-generation sequencing demands next-generation phenotyping. Hum Mutat 33(5):884-886.
8. Anonymous Finding of rare disease genes in Canada (FORGE Canada). Available at http://www.genomebc.ca/portfolio/projects/health-projects/finding-of-rare-disease-genes-in-canada-forge-canada/. Accessed September 19, 2013.
9. Gahl WA, et al. (2012) The National Institutes of Health Undiagnosed Diseases Program: Insights into rare diseases. Genet Med 14(1):51-59.
10. Gahl WA, et al. (2012) The National Institutes of Health Undiagnosed Diseases Program: Insights into rare diseases. Genet Med 14(1):51-59.
11. Gahl WA, Tifft CJ (2011) The NIH Undiagnosed Diseases Program: Lessons learned. JAMA 305(18):1904-1905.
12. Koenekoop RK, et al.; Finding of Rare Disease Genes (FORGE) Canada Consortium (2012) Mutations in NMNAT1 cause Leber congenital amaurosis and identify a new disease pathway for retinal degeneration. Nat Genet 44(9):1035-1039.
13. Stenson PD, et al. (2012) The Human Gene Mutation Database (HGMD) and its exploitation in the fields of personalized genomics and molecular evolution. Curr Protoc Bioinformatics 39:1.13.1-1.13.20.
14. Stenson PD, et al. (2009) The Human Gene Mutation Database: 2008 update. Genome Med 1(1):13.
15. Anonymous NHLBI exome sequencing project (ESP) exome variant server. Available at http://evs.gs.washington.edu/EVS/. Accessed September 19, 2013.
16. Clarke L, Zheng-Bradley X, et al. (2012) The 1000 Genomes Project: Data management and community access. Nat Methods 9(5):459-462.
48. Ruel J, et al. (2008) Impairment of SLC17A8 encoding vesicular glutamate transporter-3, VGLUT3, underlies nonsyndromic deafness DFNA25 and inner hair cell dysfunction in null mice. Am J Hum Genet 83(2):278-292.
49. van Hulsteijn LT, Dekkers OM, Hes FJ, Smit JW, Corssmit EP (2012) Risk of malignant paraganglioma in SDHB-mutation and SDHD-mutation carriers: A systematic review and meta-analysis. J Med Genet 49(12):768