REPORT OF THE RUTGERS RESEARCH
ADVISORY BOARD
Investigation into Allegations of Research Misconduct
Against Dr. William Brown
April 25, 2012
EFTA01145646
I. HISTORICAL BACKGROUND
Drs. Trivers, Palestis and Zaatari (2009), in a paper titled "Anatomy of a Fraud," accused Dr.
William Brown of committing research misconduct and including false research results in Brown
et al. (2005), a paper published in Nature with Dr. Trivers as a co-author. Dr. Brown and
another coauthor of the same Nature paper, Dr. Cronk, have denied these charges in a written
rebuttal (Brown and Cronk, 2009). The research in question was funded by the National Science
Foundation (NSF), which was acknowledged in the paper.
Rutgers University became aware of these accusations in 2009 and, following NSF guidelines
and University policy, completed an inquiry into these allegations. The recommendation from
this inquiry, as noted on December 22, 2009 in a letter from Dr. Pazzani to NSF, was to
undertake a full investigation of the allegations. The NSF agreed with this recommendation and,
on February 18, 2010, asked Rutgers to undertake the full investigation.
We begin this report of the full investigation with a summary of our findings and then present a
more detailed description of the study design, the charges of misconduct, and the actions we took
to investigate the charges. We conclude the report with detailed explanations of our analysis of
the evidence for the three main allegations that were made against Dr. Brown by Drs. Trivers,
Palestis and Zaatari (2009) and the reasoning for the findings.
II. SUMMARY OF OUR FINDINGS - WE FIND THAT:
• Substantial (clear and convincing) evidence exists that research fraud has occurred in
several areas. Evidence exists that:
o Based upon the investigator's (Dr. Brown's) knowledge of subject performance,
or access to existing evaluations of subject performance, there was biased
selection of subjects who were to be included in the symmetry / asymmetry
comparison groups so as to artificially obtain desired results;
o There was falsification of averaged data scores from the Jamaican children's
evaluations;
o There were omissions of data availability and documentation, as well as
conflicting data sets that are consistent with a cover-up against the charges.
• The study design is very complicated and, in some instances, not well defined by the
investigators. There are multiple copies of files, some with > 80,000 data fields, for
analysis. There are innumerable accusations and rebuttals.
o The scale and complexity of the study make it very hard for us to document each
and every allegation against Dr. Brown that was made by Drs. Trivers, Palestis
and Zaatari (2009) in such a way that:
■ Will be easily understood even by analytically oriented persons;
■ Will address all conceivable rebuttals that could be made by the accused
and accusers.
o With these concerns in mind, we decided to focus on the most substantive of the
allegations.
■ This report makes no findings with regard to whether other allegations
regarding Dr. Brown's research are well founded or not.
III. BRIEF SYNOPSIS OF THE STUDY IN QUESTION (Brown et al., 2005)
Hypothesis — Persons (Jamaican children) who are more physically symmetrical will be
perceived to be better dancers by their peers. This will be true more so for males than for
females.
Data Source - Ongoing Study of ~183 Jamaican Children that began in the mid-1990s
- Measures of Symmetry were taken on the Jamaican Children in 1996 and 2002:
o Summed relative absolute differences between left and right side of body were used
to calculate what is defined as "fluctuating asymmetry" (FA).
o This was a cumulative score of mean adjusted absolute differences in relative
asymmetry size (FA) summed across nine body parts, with a higher score indicating a
person is more asymmetric.
NOTE — While we believe that the score more accurately should be
labeled Relative Fluctuating Asymmetry (RFA), due to the adjustment of
the differences by the mean score, the term FA is used in the paper so we
also do so here in the report to be consistent.
o The calculations of fluctuating asymmetry and summed score are described in more
detail later in this report.
- Videotapes of dancing were made of some of the children in 2004/2005.
o As part of a complicated process described later, the same Jamaican children in the
main study later also served as the judges of the dancing ability of the selected 40
Jamaican dancers described below.
Study Analysis Sample chosen in Early 2005
- From the larger subject population, 40 children (20 girls and 20 boys) were selected into four
groups based on the following stated criteria:
Asymmetric boys = 10 boys who were in the "top 1/3rd" for FA asymmetry scores both in
1996 and again in 2002
Symmetric boys = 10 boys who were in the "lowest 1/3rd" for FA asymmetry scores both
in 1996 and again in 2002
Asymmetric girls = 10 girls who were in the "top 1/3rd" for FA asymmetry scores both in
1996 and again in 2002
Symmetric girls = 10 girls who were in the "lowest 1/3rd" for FA asymmetry scores both
in 1996 and again in 2002
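The stated criteria can be read as a simple set intersection. The following Python sketch is only an illustration under assumptions the paper does not confirm: the sexes are pooled, only subjects with FA data in both years are ranked, and the 1/3rd cutoff is rank-based. The function names and the toy scores are hypothetical and are not taken from the study data.

```python
def tertile_ids(scores, which):
    """Return the subject IDs in the "top" or "lowest" third of {id: FA score}.

    Assumption: a rank-based cutoff (len // 3); the paper does not define one.
    """
    ranked = sorted(scores, key=scores.get)  # ascending FA: most symmetric first
    k = len(ranked) // 3
    if which == "lowest":
        return set(ranked[:k])
    if which == "top":
        return set(ranked[len(ranked) - k:])
    raise ValueError("which must be 'top' or 'lowest'")


def eligible(fa_1996, fa_2002, which):
    """IDs that fall in the same third in BOTH 1996 and 2002.

    Assumption: only subjects with FA scores in both years are ranked.
    """
    both = fa_1996.keys() & fa_2002.keys()
    s96 = {i: fa_1996[i] for i in both}
    s02 = {i: fa_2002[i] for i in both}
    return tertile_ids(s96, which) & tertile_ids(s02, which)


# Hypothetical toy scores for six subjects (not study data).
fa_1996 = {1: 0.10, 2: 0.20, 3: 0.30, 4: 0.40, 5: 0.50, 6: 0.60}
fa_2002 = {1: 0.12, 2: 0.35, 3: 0.22, 4: 0.41, 5: 0.62, 6: 0.55}
asymmetric_pool = eligible(fa_1996, fa_2002, "top")
symmetric_pool = eligible(fa_1996, fa_2002, "lowest")
```

When a pool produced this way holds more than 10 subjects, some further rule is needed to reduce it to 10, and that reduction step is the one at issue in the allegations.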
NOTES
1. While there were ~183 original subjects eligible to be selected for the 40
dancers, the number of potential subjects was smaller (~106) for various
reasons, including missing FA data in 1996 and/or 2002 and/or not later
being filmed dancing.
2. It is not totally clear from the paper (Brown et al., 2005) whether the
top/bottom 1/3rd was calculated for boys and girls pooled together or
for each sex separately; the indications
were that the sexes were pooled together. Nor is it clear from the paper
whether the pools of subjects used for consideration in 1996 and 2002
included all persons with available data for the given year of consideration
or were restricted to persons with complete data for both 1996 and 2002.
3. From the allegations made by Drs. Trivers, Palestis and Zaatari (2009) and
the rebuttal of Drs. Brown and Cronk (2009), more than 10 subjects were
eligible to be included in three of the above four groups, and the process by
which the total available pool of subjects was reduced to 10 for these three
groups is one of the salient issues in the allegations of misconduct.
4. There are also claims in the rebuttal of Drs. Brown and Cronk (2009) that
persons with what was deemed to be "poor dancing tape quality" were
excluded from consideration for these four symmetry/asymmetry groups.
But Dr. Brown and Dr. Cronk presented no evidence as to how this was
done nor any information as to whether and how such exclusions were
documented.
5. Importantly for the charges of fraud being made, before these 40 subjects
were chosen, two Rutgers undergraduates had evaluated the dancing tapes
and it appears that these scores were available to Dr. Brown prior to the
selection process. Drs. Trivers, Palestis and Zaatari (2009) claim this on
page 13. The rebuttal by Drs. Brown and Cronk (2009) states that the
Rutgers undergraduate evaluations of the tapes were "not all available" at
the time the 40 dancers were selected. But Drs. Brown and Cronk do not
elaborate further on this to indicate what portion of the Rutgers
undergraduate evaluations were available at that time. However, it is clear
on page 3 of the rebuttal by Drs. Brown and Cronk (2009) that the dance
animations had been viewed by the Rutgers undergraduates and that at least
some of the dance scores assigned by these undergraduates were used as
part of a grant application made on February 23, 2005 by Dr. Brown, which
was prior to the selection of subjects to groups. Thus it is not disputed that
Dr. Brown had access to at least some of the Rutgers undergraduate
evaluations of the dance animations before the selection of the 40 dancers.
In any case, none of this elaborate prescreening of the subjects is
mentioned in the Nature (2005) paper or in any available Appendices to the
paper.
Evaluation of Dancing Outcome in March 2005
1. The same Jamaican Children (155 of the 183 children) evaluated the
"digitalized" dance routines.
• Digitalized means that the identity and appearance of the dancer
was hidden.
2. The plan was for each of the 155 children to evaluate each of the tapes
from the digitalized dancers.
• The overall score of each of the 40 tapes was the average score of
all children who evaluated the tapes with these caveats:
i. If a child evaluated his/her own digitalized dance, that score
was excluded;
ii. The data (sent to us by Dr. Palestis, as described later in
this report) contain evaluations that were
incomplete or incorrect and may thus have been excluded
from the mean score. But this is not fully documented in the
article or by other information sent to us by Dr. Brown (or
by Dr. Palestis).
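The averaging rule described above can be sketched in a few lines of Python. This is a hedged illustration, not the procedure actually used in the study: the function name is hypothetical, and representing an incomplete or incorrect evaluation as None is an assumption, since the report notes that such exclusions were not fully documented.

```python
def mean_dance_score(ratings, dancer_id):
    """Average the ratings given to one dancer's digitalized clip.

    ratings: {rater_id: score, or None for an unusable evaluation}
    Caveat i:  a rater who is also the dancer is excluded.
    Caveat ii: unusable evaluations (None here, by assumption) are excluded.
    Returns None when no usable rating remains.
    """
    kept = [score for rater, score in ratings.items()
            if rater != dancer_id and score is not None]
    if not kept:
        return None
    return sum(kept) / len(kept)


# Hypothetical ratings: subject 7 rated their own clip, and rater 103 left it blank.
example = {101: 4, 102: 5, 103: None, 7: 9}
score = mean_dance_score(example, dancer_id=7)
```

Because every exclusion changes the mean, a record of which ratings were dropped, and why, is needed to reproduce a reported average from raw ratings.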
IV. SUMMARY OF THE ALLEGATIONS OF MISCONDUCT THAT WERE
MADE BY DRS. TRIVERS, PALESTIS AND ZAATARI (2009)
1. Parties Involved:
a. The party alleged to have engaged in research misconduct:
• Dr. William Brown — First author of Brown et al. (2005) who was a
post-doctoral student in Anthropology at Rutgers when this work was
done. He is now a faculty member at The University of Bedfordshire
in Bedford, UK. He was one of 7 authors on the paper.
b. The parties alleging research misconduct (Drs. Trivers, Palestis, Zaatari,
2009):
• Dr. Robert Trivers — Senior author of the paper Brown et al. (2005)
and in the same Department as Dr. Cronk (Anthropology).
• Dr. Brian Palestis — Not a coauthor of Brown et al. (2005) and has not
been affiliated with Rutgers (he is at Wagner College). He apparently
has done the statistical analysis to "confirm" and support Dr. Trivers'
position.
• Dr. Darin Zaatari — Not a coauthor of Brown et al. (2005). She was a
Ph.D. student at Rutgers when the Nature paper was written and has
since graduated. She was apparently involved in much of the initial
investigation by the accusers.
c. Dr. Lee Cronk - The second author of Brown et al. (2005) and the Principal
Investigator on the Grant. He is not accused of committing any misconduct
but has come to the defense of Dr. Brown.
2. Sequence of accusations and rebuttals as stated by the accusers and then the accused
(paraphrasing the words used).
a. Soon after the publication of Brown et al. (2005), Dr. Trivers (through
communications with persons who were unable to obtain the same results that
were in the paper from what they believed to be the data used for the analyses)
developed concerns about the data Dr. Brown had used and analyses that
appeared in the article. He and the other accusers began contacting Dr. Brown
for explanations and the specific data sets that Dr. Brown used. (A series of
emails resulting from these contacts was sent to us by Dr. Palestis).
b. Not being satisfied with Dr. Brown's response, Drs. Trivers, Palestis and
Zaatari (2009) conducted their own analysis of the data and facts as they saw
them, ultimately leading to the point where they concluded that fraud had
been committed by Dr. Brown.
c. Drs. Trivers, Palestis and Zaatari attempted to have the journal "Nature"
publish a letter retracting the article. When Nature refused to do this, they
attempted to have an expose published in another journal. When this did not
happen, they published their own 91-page analysis "Anatomy of a Fraud"
(Trivers, Palestis and Zaatari, 2009).
d. When contacted by the Rutgers Office of the General Counsel about the
document "Anatomy of a Fraud," Drs. Brown and Cronk prepared a ~50-page
(including appendices) rebuttal to the accusations (Brown and Cronk, 2009).
3. The Major Accusations by Drs. Trivers, Palestis and Zaatari:
a. Dr. Brown falsified some of the 1996 and 2002 fluctuating asymmetry (FA)
scores on selected subjects in a fashion that i) moved boy/girl dancers to
whom the two Rutgers undergraduates had given worse dance ratings into the
top 1/3rd of the FA asymmetry scales (most asymmetric) for 1996 and 2002
(and thus caused these worse dancers to meet the selection criteria for being
asymmetric) and ii) moved boy/girl dancers who had been accorded better
dance ratings by the two Rutgers undergraduate students into the bottom 1/3rd
of the FA asymmetry scales for 1996 and 2002 (and thus caused these better
dancers to meet the selection criteria for being symmetric).
• The hypothesis, as presented in the allegations, was that Jamaican
children and the Rutgers undergraduates would rate the dancers in
about the same way. The specific allegation is that Dr. Brown
leveraged this possibility to spike the top 1/3rd asymmetric group with
bad dancers and the bottom 1/3rd symmetric group with good dancers.
b. More than 10 boys or girls met the criteria to potentially be included in three
of the four groups. When that happened, Dr. Brown selected the 10 who
would be included into the groups. The allegation is that he did this in a
biased fashion so as to selectively choose from the eligible subjects those who
were rated as worse dancers by the Rutgers undergraduates to place into the
asymmetric groups and similarly selectively chose subjects who were rated as
better dancers by the Rutgers undergraduates into the symmetric groups.
c. After the Jamaican children dance evaluations were collected and scored, Dr.
Brown falsified the Jamaican children's dancing score ratings to enable results
which statistically supported the hypothesis of the paper.
d. NOTE: As was stated earlier in this report, Drs. Trivers, Palestis and
Zaatari's 2009 document further contains a multitude of other accusations. We
did not investigate every allegation, but focused on those where it could be
efficiently proved and substantively determined that misconduct occurred
(i.e. by Dr. Brown) in relation to the 2005 Nature paper.
V. SUMMARY OF THE ACTIONS WE HAVE TAKEN TO INDEPENDENTLY
INVESTIGATE THE CHARGES AND REBUTTALS
1. The Rutgers Office of the General Counsel had already requested and received data
sets relevant to the accusations when this committee became involved. We
nevertheless requested from both Dr. Brown and Dr. Palestis (who did much of the
analysis for the Trivers, Palestis and Zaatari (2009) report) all data sets and materials
that had relevance to the allegations and, in particular, instructions on how to
calculate a) the 1996 and 2002 FA scores and b) the Jamaican student dance rating
scores from the raw data.
• Dr. Palestis responded within 1 week of the request and sent us, among other
things, data sets that included:
i. The data for 1996 and 2002 asymmetry scores which Dr. Palestis said
Dr. Brown had sent him earlier, and data for the same scores that
exist in the database that Dr. Trivers' group maintains for the
ongoing Jamaican study. (Dr. Brown did not collect the FA score data
himself but received it from others who had collected these data).
ii. The individual ~155 Jamaican students' dance ratings of the 40 study
subjects. Dr. Palestis said he had received these data earlier from Dr.
Brown.
iii. Instructions on how to calculate all scores in the data described above
in i. and ii.
iv. Summaries of Dr. Palestis', Trivers' and Zaatari's comparative
analyses of the data they received from Dr. Brown with their own data
from the Trivers group database and the results reported in the 2005
Nature article.
• Dr. Brown responded later:
i. He did send his data for the 1996 and 2002 asymmetry scores, which,
after a considerable number of comparisons, appear to us to be
essentially the same data that Dr. Palestis had sent us and that Dr. Palestis stated
he had received from Dr. Brown.
ii. Dr. Brown indicated that he no longer had raw data on the Jamaican
students' ratings of the 40 dancers. When we questioned him further
about these data, Dr. Brown said that the data Dr. Palestis et al. used
for their analysis of dance scores (i.e. attributed to being from Dr.
Brown) must be old or corrupt and the correct data could not be
recovered from it. Quoting Dr. Brown from his January 25, 2011
email to Dr. Pazzani, "I sent a file to Dr. Brian Palestis some time
ago, but it appears that this file is either corrupted or an earlier
version of the one used by the research assistant (i.e., to decide which
ratings would be included in the average). If I could find the file
or figure out how to calculate the averages from the one I sent Dr.
Brian Palestis, I would send it to you along with detailed instructions
to help with the investigation. .... Nonetheless, I will look for the file I
sent to Dr. Palestis and attempt again to reconstruct the average
ratings." We have not received any further correspondence from Dr.
Brown on this issue.
2. We asked all participants who we believed might have access to these
documents (Drs. Trivers, Brown and Cronk) to send us copies of earlier versions of
the paper and the reviews, most notably the original paper that was submitted to
Nature along with the review and the response to the review.
• The first response was from Dr. Cronk who sent us multiple copies of earlier
versions of the paper.
• Dr. Brown followed with one copy of an earlier version of the paper.
• Dr. Trivers sent a copy of the review report of the first submission of the
paper.
3. To be thorough, we sent emails and/or made phone calls to other coauthors (Drs.
Keith Grochow, Amy Jacobson, C. Karen Liu, and Zoran Popovic) asking if they
had any knowledge that could be relevant to the investigation. None of these authors
were alleged to have engaged in misconduct, and while some were mentioned in the
rebuttal, it did not seem as if they would have knowledge pertinent to the charges.
• Dr. Popovic responded (and we believe he was also speaking for Drs. Liu and
Grochow) that they had been aware of these allegations for quite some time,
had been contacted about them by several sources and, as coauthors of the
paper, were anxious to know the findings of the investigation. They had no
significant new information on this to share with us.
• Dr. Jacobson responded that she had no role in the study beyond being the
field site manager for the general project and thus could not help us further.
4. After we had undertaken our analysis and were ready to finalize the report, we sent
emails to Dr. Brown and Dr. Cronk requesting explanations on two findings we made
that could reflect inconsistencies or fraud in the study design and analysis.
• Dr. Cronk met with us at Rutgers in early October 2011 and Dr. Brown sent
a written reply in December 2011.
VI. SUMMARY OF OUR ANALYSIS AND FINDINGS
We first followed the approach that Drs. Trivers, Palestis and Zaatari (2009) had used (including
examining the rebuttals made by Drs. Brown and Cronk (2009)) to see if the arguments put forth
were valid and then to see if we could replicate the different analyses with the data sets we had
been given. While we found that the previous approaches which had been used were well
reasoned and exhaustive, we tried to streamline and distill the analyses to be more easily
understandable, communicable and addressable. The result is the following findings relevant to
the main allegations of fraud that were made in the last paragraph of page 5 of Trivers, Palestis
and Zaatari (2009).
1. Allegation - The 1996 and 2002 FA asymmetry scores of the 40 dancers who were
chosen for the study groups were systematically fabricated in a fashion to make better
dancers more symmetric and worse dancers less symmetric.
Our Conclusion — There is clear and convincing evidence to support the allegations that
this alleged research misconduct occurred.
a. This fabrication occurred and did cause dancers who were rated better by the
Rutgers undergraduates to be more likely to be inserted into the "symmetric" boys
and girls groups and dancers who were rated worse by the Rutgers undergraduates
to be more likely to be inserted into the "asymmetric" boys and girls groups.
b. It does not seem possible that this fabrication i) could have happened by chance,
ii) could have been perpetrated by anyone other than Dr. Brown or iii) had it
been perpetrated by someone other than Dr. Brown, that Dr. Brown would not
have noticed this problem and reported it after years of questioning by Dr.
Trivers' group and then by us.
2. Allegation - When Dr. Brown had the opportunity of choosing 10 subjects from a group
of more than 10 to make the final top/bottom symmetry group for boys and girls, he
chose the subjects in a way that favored the alternative hypothesis (i.e. based on the
Rutgers undergraduate students dance evaluations).
Our Conclusion - There is clear and convincing statistical evidence to support the
allegations that the alleged research misconduct occurred. Dr. Brown used the data
collected by the Rutgers undergraduates (or some other informed evaluations of the
digitalized dances) to carefully select subjects as alleged. Thus, for three of the four
groups, among eligible dancers, those with better Rutgers undergraduate ratings were
placed into the symmetric groups and those with poorer Rutgers undergraduate ratings
were placed into the asymmetric groups.
3. Allegation - Dr. Brown fabricated the Jamaican children averaged dance score summaries
of the 40 dancers in order to obtain statistically significant findings that supported the
alternative hypothesis.
Our Conclusion - There is enough evidence to support the allegation that the alleged research
misconduct occurred. Dr. Brown is unable to produce data that can support the findings
he reported in Nature (2005) which, as both the first author and as the person who
undertook the data analysis, he should be able to do. However, Dr. Palestis produced a
data set he claims to have received from Dr. Brown. Dr. Brown subsequently
acknowledged he sent this data to Dr. Palestis and sent the same data to us, but claims
that this data is incorrect / unusable and that he no longer has the correct data. It is thus
impossible to know exactly what was done in the analysis by Dr. Brown because he relies
on claims of unwritten / undocumented or otherwise unexaminable reasons for exclusions
and/or incorrectness of some values in this existing data. Nonetheless, the findings of our
analyses on the only existing raw dancer rating data initially provided by Dr. Palestis, are
very consistent with those of Trivers et al. and are incompatible with the findings
reported in Nature (2005).
We now present detailed explanations of our findings on the three main allegations of research
misconduct that were made against Dr. Brown.
1. Allegation - The 1996 and 2002 FA asymmetry scores of the 40 dancers who were
chosen for the study groups were systematically fabricated in a fashion to make
better dancers more symmetric and worse dancers less symmetric.
Our Conclusion - There is clear and convincing evidence to support the allegations
that this alleged research misconduct occurred.
a. This fabrication occurred and did cause dancers who were rated better by
the Rutgers undergraduates to be more likely to be inserted into the
"symmetric" boys and girls groups and dancers who were rated worse by the
Rutgers undergraduates to be more likely to be inserted into the
"asymmetric" boys and girls groups.
b. It seems impossible that i) this fabrication could have happened by chance,
ii) it could have been done by anyone other than Dr. Brown, or iii) had it
been done by someone other than Dr. Brown, that Dr. Brown would not have
noticed this problem and reported it after years of questioning by Dr.
Trivers' group and then by us.
EVIDENCE
For Fabrication of Asymmetry Scores
A. The 1996 and 2002 asymmetry scores in the data sets sent to us by Dr. Palestis were entirely
internally self-consistent (i.e. the data did not contradict itself) with respect to the Fluctuating
Asymmetry (FA) scores and their component variables.
B. The 1996 and 2002 FA scores and their components in the data sent to us by Dr. Brown
were:
1) Internally self-consistent for all subjects who were not chosen to be one of the 40
dancers.
2) In general, not internally self-consistent (data contradicted itself) for the 40 subjects who
were chosen to be dancers as described in Section C below.
C. The non-self-consistency of FA scores in Dr. Brown's data is, in our view, impossible to
explain by anything other than fabrication of some of the data by a person who, at the time of
the fabrication, did not realize that the other items also needed to be changed for the data to
be self-consistent or otherwise did not think to change these items.
For each subject, the Fluctuating Asymmetry (FA) score was calculated as a sum of absolute
relative asymmetry for 9 body parts (elbow, wrist, knee, ankle, foot, ear, 3rd digit, 4th digit and 5th
digit) as described below.
FA = \sum_{P=1}^{9} RA_P, where P = 1, ..., 9 enumerates the nine body parts and RA_P is the
relative asymmetry of the given body part (e.g. hand, ear, 4th digit, etc.)
For each body part, RA_P = AD_P / M_P, with
AD_P = Absolute Value of [Left Side Measure - Right Side Measure]
M_P = Average of Left Side Measure and Right Side Measure
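The definitions above translate directly into code. The Python sketch below is illustrative only: the function names and the dictionary layout are hypothetical, and returning None when any of the nine summed body parts is missing is an assumption based on the report's remark that a sum is not possible when a component is absent.

```python
# The nine body parts that the report lists as summed into the FA score.
BODY_PARTS = ("elbow", "wrist", "knee", "ankle", "foot", "ear",
              "3rd digit", "4th digit", "5th digit")


def relative_asymmetry(left, right):
    """RA_P = AD_P / M_P for one body part."""
    ad = abs(left - right)    # AD_P: absolute left-right difference
    m = (left + right) / 2    # M_P: mean of the two sides
    return ad / m


def fa_score(measures):
    """FA = sum of RA_P over the nine body parts.

    measures: {body_part: (left, right)}
    Returns None if any of the nine parts is missing (assumed handling).
    """
    if any(part not in measures for part in BODY_PARTS):
        return None
    return sum(relative_asymmetry(*measures[part]) for part in BODY_PARTS)


# Hypothetical measurements: every part measures 11.0 (left) and 9.0 (right).
toy = {part: (11.0, 9.0) for part in BODY_PARTS}
toy_fa = fa_score(toy)  # each RA_P is 2.0 / 10.0 = 0.2
```

Since FA is a plain sum, changing a recorded RA_P without changing the matching AD_P and M_P leaves a detectable mismatch between the stored ratio and the stored components.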
The values of AD_P, M_P and RA_P are saved in the data sets we received from Drs. Brown and
Palestis for each person and body part P (Dr. Brown's data sent to us is missing AD_P for the 3rd
digit in 1996 and for the ears in 2002). For each subject, if we
go into the data sets and, for the Pth body part and year (1996, 2002), take the ratio
AD_P / M_P of a given child, this ratio is always the same as the value of RA_P for that Pth body part
of that child during the same year in the data (i.e. self-consistent) in Dr. Palestis' data, as it should
be. The observed ratio AD_P / M_P is also always equal (i.e. to within three decimal places) to the
value of RA_P for the same body part of the child in the same year in Dr. Brown's data for all
subjects not selected into the study; any differences beyond 3 decimal places
were very small (i.e. of order < 10^-10) and thus likely due to round-off error at some stage,
with no impact on the FA score.
However, the ratios AD_P / M_P are largely not equal (within 3 decimal places) to the values of
RA_P for the same body part of the same child in the given year (1996 or 2002) for almost all of
the 40 subjects selected to the study groups in Dr. Brown's data, among the body parts that were
included in the FA score.
For example, with P = 4th digit, going to subject 15 in 1996 (who was selected as one of the 40
dancers) in Dr. Brown's data we observe
AD_P = 0.875
M_P = 55.888
RA_P = 0.0076 (which rounded to three decimal places in Tables 1 and 2 is 0.008)
But looking at AD_P / M_P for this person gives the self-consistent value of RA_P as 0.875 /
55.888 = 0.0157 (which rounded to three decimal places in Tables 1 and 2 is 0.016)
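The check applied to subject 15 can be written as a short Python sketch. The function name is hypothetical, and the tolerance of 0.0005 is one assumed reading of "equal to within three decimal places"; the three numeric values are those quoted above for the 4th digit of subject 15 in 1996.

```python
def is_self_consistent(ad, m, ra, tol=5e-4):
    """True when the recorded RA matches AD / M to about three decimal places.

    tol=5e-4 is an assumed interpretation of "within three decimal places".
    """
    return abs(ad / m - ra) < tol


# Values quoted in the report for subject 15, 4th digit, 1996.
ad_p, m_p = 0.875, 55.888
recorded_ra = 0.0076          # RA_P as recorded in Dr. Brown's data
recomputed_ra = ad_p / m_p    # the ratio implied by AD_P and M_P

consistent = is_self_consistent(ad_p, m_p, recorded_ra)
```

The recorded value differs from the recomputed ratio by roughly 0.008, which is far larger than the tolerance, so the record is flagged as inconsistent.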
In their rebuttal to the allegations by Drs. Trivers, Palestis and Zaatari (2009), Drs. Brown and
Cronk (2009) suggest that some data discrepancies might be due to "rounding" errors. However,
it is obvious that the difference between 0.0157 and 0.0076 is too large to be due to round-off
error and that this difference does not qualitatively change by taking the measures of AD_P
and M_P out to 3 vs. 4 or more decimal places. The same is true for the other inconsistencies we
observed between RA_P and AD_P / M_P in Dr. Brown's data in the 40 selected dancers.
Furthermore, it should be noted that for all study subjects and all body parts P, the values of
AD_P and M_P for body part P of any given subject do not differ between Dr. Brown's and Dr.
Palestis' data sets. The inconsistent values of RA_P do not equal the ratios of the corresponding
AD_P / M_P for the vast majority of body parts in the 40 selected dancers in Dr. Brown's data set
and differ from the RA_P in Dr. Palestis' data set (which always equals the ratio of the
corresponding AD_P / M_P). As we just noted, when these differences between Dr. Brown's and
Dr. Palestis' data occur, the RA_P in Dr. Palestis' data is equal to the ratio of the corresponding
AD_P / M_P while the RA_P in Dr. Brown's data does not equal the ratio of the corresponding
AD_P / M_P.
In other words, if we look at the 4th digit of subject 15 for 1996 in Dr. Palestis' data, we see the
correct and self-consistent values
AD_P = 0.875
M_P = 55.888
RA_P = 0.0157 (i.e. = 0.875 / 55.888)
Due to there being 9 body parts measured in 2 different years (1996 and 2002) and 290 subjects
in the data set, we cannot show all the comparisons here. However, Table 1 displays the values
of AD_P, M_P, the actual ratio of these values AD_P / M_P, and RA_P for the 4th digit in 1996 among
the first 30 dancers in Dr. Brown's data set, which includes some who were selected into the 40
asymmetric / symmetric dancers. Those who were selected into the final 40 asymmetric /
symmetric dancers are highlighted in red in Table 1. When the recorded RA_P does not equal the
ratio of the corresponding AD_P / M_P, the last 2 columns in Table 1 are highlighted in bold. Table
2 shows the same comparisons for the 40 selected dancers. For 34 of these subjects, the RA_P
does not equal the corresponding ratio AD_P / M_P. As is true for the other subjects and body
parts in 1996 and 2002, the recorded RA_P always equals the observed ratio AD_P / M_P in
subjects who were not selected to be in the 40 dancers but usually does not for those who were
selected.
It should be noted that measures for AD_P, M_P and RA_P are recorded for 1996 in Dr. Brown's
dataset on one body part (the hand) that was not used in the FA score. For this body part, the
ratio AD_P / M_P always equals the corresponding RA_P in the 40 selected dancers, in spite of the
fact just noted above that it seldom does for the body parts that were included in the 1996 FA
score. It should also be noted that sometimes values for AD_P and M_P are present but the
corresponding result for RA_P is missing in Dr. Brown's data. For example, this happens with
ID 7 in Table 1 and ID 287 in Table 2. However, we have found that when this
happens, an entire set of values for at least one other summed body part in that year is missing.
For example, ID 7 is missing measures of AD_P, M_P and RA_P for foot in 1996 and ID 287 is
missing the values of AD_P, M_P and RA_P for elbow in 1996. Thus the missing RA_P values for the 4th
digit of IDs 7 and 287 in 1996 could reflect that not all of the RA_P values needed for the 1996 sum were
available for those IDs. (While Dr. Brown in fact has a sum recorded for the 1996 FA of ID 287,
as shown in column 2 of Table 3 that is mentioned later in this report, this was not possible, as
column 3 of the same table shows, since elbow was missing.)
When we met with Dr. Cronk in October 2011, he had no explanation for the discrepancies
between the recorded RA_P and the actual ratios AD_P / M_P for body parts among the 40 selected
subjects. Dr. Brown also acknowledged that the inconsistencies existed when he replied to
our questions on December 1, 2011, but his only explanations were that he
did not know how they could have happened and/or that the data we had received from him may not
have been the same data that he actually used in 2005 and/or that these errors may have been
introduced by other people before he received the data.
To quote (with salient phrases underlined by us) from part of his response to our questions on
this issue that he returned on December 1, 2011 .... "This is interesting as you rightly point out
the hand was not used as one of the FA's in the composite. Recall that all information that was
used and presented in the Nature paper was not from the master dataset I sent you. Any values
that are included in this file were pasted from the file used in 2005. This is clear evidence that
the file I was working with in 2005 is indeed different from the file you attached as I previously
claimed". It is challenging to explain why these inconsistencies occur. Recall that when
making the dataset for Dr Palestis well after the dance paper was published in Nature (the email
and file time stamps indicate this fact) it came to my awareness that there were errors. I should
point out that these initial errors were introduced before I began working on the project. Indeed
to make the so-called master file for Dr Palestis involved me merging, cutting and pasting from
different files some of which I no longer have access. Since errors were discovered after I made
the file I am skeptical about the validity of this file. You have discovered another problem, to
which i have no logical explanation. I acknowledge it to be there but as to how it emerged (and
when) is unclear to me. Without the original files i was working with it difficult to isolate how
and when discrepancies emerged in this post-publication dataset."
The only explanation we can see for the non-self-consistencies in Dr. Brown's data is that Dr.
Palestis' data set is correct and that the values for RAp were altered in Dr. Brown's data so that
they would sum to the values of FA for those subjects in 1996 and 2002, which had also been
altered. But this was only done within the 40 selected dancers and was done by someone who
was either not aware that the corresponding values for ADp and Mp also needed to be altered to
make the data self-consistent or did not bother to do so. We see no conceivable way
this alteration could happen by chance or accident; we conclude it must be the result of
fabrication. For example, non-self-consistencies between ADp/Mp and RAp were observed at
least once in 39 of the 40 selected dancers compared to never in the 66 other filmed dancers with
available FA data for 1996 and 2002 who were not selected. The P-value for this to occur by
chance alone is less than one in 10^27 by exact test.
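The order of magnitude of this P-value can be checked with a short calculation. The sketch below assumes that the exact test was Fisher's exact test on the 2x2 table of selected versus non-selected dancers by presence or absence of at least one inconsistency; the report does not name the test, and the function names are illustrative only.

```python
from fractions import Fraction
from math import comb


def table_probability(k, n_selected, n_other, n_flagged):
    """Hypergeometric probability that k of the flagged dancers are in the selected group."""
    total = n_selected + n_other
    return Fraction(comb(n_selected, k) * comb(n_other, n_flagged - k),
                    comb(total, n_flagged))


def fisher_exact_two_sided(k_obs, n_selected, n_other, n_flagged):
    """Sum the probabilities of all tables no more likely than the observed one."""
    k_min = max(0, n_flagged - n_other)
    k_max = min(n_selected, n_flagged)
    p_obs = table_probability(k_obs, n_selected, n_other, n_flagged)
    probs = (table_probability(k, n_selected, n_other, n_flagged)
             for k in range(k_min, k_max + 1))
    return sum(p for p in probs if p <= p_obs)


# Counts taken from the text: 39 of the 40 selected dancers and 0 of the 66
# non-selected dancers show at least one inconsistency.
p_value = fisher_exact_two_sided(k_obs=39, n_selected=40, n_other=66, n_flagged=39)
print(f"{float(p_value):.1e}")
```

Under these assumptions the observed table is the most extreme one possible, so the P-value reduces to 40 divided by the number of ways of choosing 39 dancers out of 106.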
The Alteration of Asymmetry Scores Was Done by Dr. Brown
It seems impossible that anyone other than Dr. Brown (who did the data analysis for the paper
and held the data set) would have had the access to these data needed to alter only the values
of RAp and the corresponding summed FA. We do not see how someone creating a data set in 2005
before Dr. Brown began working on the project would have had the reason or ability to alter
these values only among those 40 people who ultimately, at a later date, were selected as
dancers using what is now an incompletely defined process and what would have been, at the
time of that alteration, an unknowable process.
The Alteration of Asymmetry Scores Favored the Investigator's Hypothesis in a Way That
Could Have Been Foreseen by Dr. Brown
The complexity of the study design and the fact that this design was not clearly explained (and
was further confused by caveats such as persons being excluded from consideration because their
videos were deemed un-evaluable) complicate any certain determination of "what would have
happened" if the data had not been fabricated as we believe it was. However, we compare in
Tables 3 and 4 the differences [Dr. Brown's data summed FA - Correct Summed FA] for 1996 and
2002 respectively. By "Correct Summed FA" we mean the summed FA that is
self-consistent with the ADp and Mp in the data set. For example, in Table 3 for ID 15, the value
for summed 1996 FA in Dr. Brown's data was 0.110 (in column 2). However, based on the
actual values of ADp and Mp for the 9 body parts in 1996 and their ratios, the correct (i.e.
self-consistent) 1996 FA for ID 15 was 0.163 (in column 3). This means that Dr. Brown's summed
1996 FA for ID 15 was shifted -0.053 (in column 4) from the correct value (-0.053 = 0.110 -
0.163), making that person more symmetric than they would be by the self-consistent FA
measure. Column 5 has the averaged Rutgers undergraduate dancer score for ID 15, which was
123.93. This was one of the higher scores, meaning that this person's summed FA was
shifted lower by 0.053 to make this person more symmetric by Dr. Brown's score, and this
person was also rated as a relatively good dancer by the Rutgers undergraduate students. The
format for Table 4 is the same as that for Table 3 except that 2002 rather than 1996 FA scores are
involved.
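The arithmetic behind columns 2 to 4 can be sketched as follows. This is a minimal illustration, assuming that the self-consistent summed FA is the sum over body parts of the ratio of ADp to Mp, as the text describes; the function names and the body-part measurements in `rows` are hypothetical and are not taken from the report or its tables.

```python
from math import isclose


def self_consistent_fa(parts):
    """Summed FA implied by the measurements: sum of AD_p / M_p over body parts."""
    return sum(ad / m for ad, m in parts)


def fa_shift(recorded_fa, correct_fa):
    """Recorded summed FA minus self-consistent summed FA (negative = more symmetric)."""
    return recorded_fa - correct_fa


def inconsistent_parts(rows, tol=5e-4):
    """Names of body parts whose recorded RA_p disagrees with AD_p / M_p."""
    return [name for name, ad, m, ra in rows
            if not isclose(ra, ad / m, rel_tol=0.0, abs_tol=tol)]


# Figures quoted in the text for ID 15 in 1996 (columns 2 and 3 of Table 3).
print(round(fa_shift(0.110, 0.163), 3))

# Hypothetical rows of (body part, AD_p, M_p, recorded RA_p), for illustration only.
rows = [("ear", 0.5, 50.0, 0.010), ("elbow", 1.2, 60.0, 0.012)]
print(inconsistent_parts(rows))
```

The tolerance `tol` is an assumption to allow for rounding in the recorded values; a different rounding convention in the data would call for a different threshold.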
In order to see if the shifts (from self-consistent) in the 1996 and 2002 FA scores in Dr. Brown's
data were associated with the Rutgers undergraduates' dance scores, we examined the
correlations of the shifts (column 4) with the averaged Rutgers undergraduate scores (column 5)
in Tables 3 and 4. These analyses were restricted to those subjects, in 1996 and 2002
respectively, for whom Dr. Brown's FA differed from the correct self-consistent FA. For 1996
(Table 3) the shift between Dr. Brown's value and the self-consistent value was negatively
correlated with the averaged Rutgers undergraduate dancer scores (ρ = -0.39 with P=0.0157 for
no association by Formula 16.25 in Berenson and Levine, 1999). For 2002 (Table 4) the shift
between Dr. Brown's value and the self-consistent value was also negatively correlated with the
averaged Rutgers undergraduate dancer scores (ρ = -0.24 with P=0.245 by Formula 16.25 in
Berenson and Levine, 1999). This means that, compared to bad dancers, good dancers were
more shifted towards symmetry by the alterations in Dr. Brown's FA scores in both 1996 and
2002, something that would support the alternative hypothesis.
Using Fisher's (1950) method (as described below on page 23 of this report) to pool the
P-values from 1996 and 2002, together with the fact that the shifts were in the same direction,
gives an overall two-sided P-value of 0.0152 for the shifts in 1996 and 2002 simultaneously being
directionally associated with Rutgers undergraduate scores. In other words, it is unlikely that,
by chance alone, the shifts in the FA scores that Dr. Brown's data had from the correct
self-consistent FA scores for 1996 and 2002 would correlate with the averaged Rutgers
undergraduate evaluations in the direction of the alternative hypothesis as strongly as they did.
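One way to arrive at a pooled value close to the one reported is sketched below. The procedure is an assumption on our part, since the description on page 23 is not reproduced here: each two-sided P-value (read here as 0.0157 for 1996 and 0.245 for 2002) is halved because both shifts lie in the same direction, the one-sided values are combined by Fisher's method, and the result is doubled to give a two-sided value. The function names are illustrative.

```python
from math import exp, factorial, log


def chi2_sf_even_df(x, k):
    """Upper tail of a chi-square distribution with 2*k degrees of freedom (closed form)."""
    half = x / 2.0
    return exp(-half) * sum(half ** i / factorial(i) for i in range(k))


def fisher_pooled_two_sided(two_sided_p_values):
    """Pool same-direction tests: halve each P-value, combine by Fisher's method, double."""
    one_sided = [p / 2.0 for p in two_sided_p_values]
    statistic = -2.0 * sum(log(p) for p in one_sided)
    pooled_one_sided = chi2_sf_even_df(statistic, k=len(one_sided))
    return min(1.0, 2.0 * pooled_one_sided)


# Two-sided P-values as quoted in the text for 1996 and 2002.
pooled = fisher_pooled_two_sided([0.0157, 0.245])
print(round(pooled, 4))
```

Under these assumptions the pooled value comes out at roughly 0.015, in line with the reported 0.0152; any small difference may reflect rounding of the two input P-values.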
It should be noted that Drs. Brown and Cronk's rebuttal (2009) claims that some or all of the
Rutgers undergraduate evaluations were not available when the 40 symmetric / asymmetric
dancers were
selected. But even if that were the case, it does not invalidate the findings of this test which
indicate that the changes in FA within Dr. Brown's data were directionally associated with a
supposedly independent measure of the dancing ability. For example, others (including we
believe almost certainly Dr. Brown) were also able to view the animation tapes before the 40
dancers were selected. Thus Dr. Brown could have used dancer evaluation information from
sources other than the Rutgers undergraduate students to base any decisions for fabrication. As
these dancer evaluations from other sources would also likely agree with the Rutgers
undergraduate students with respect to quality of dance, the fabricated shifts in FA would still be
statistically associated with the Rutgers undergraduate scores in the direction of the alternative
hypothesis even if the undergraduate scores were not used in the fabrication process. The
rebuttal from Drs. Brown and Cronk (2009) mentions tapes being excluded from consideration
for selection by the investigators due to poor quality, an assertion that means that the tapes must
have been viewed in advance to screen for this. It stands to reason that the perceptions of
dancing ability held by Dr. Brown and others would be in the same direction as those of the
Rutgers undergraduate students and, if so, the association of the shifts in FA (from the
self-consistent value to Dr. Brown's value) with Rutgers undergraduate ratings would carry over
to other ratings of dancing ability as well.
2. Allegation - When Dr. Brown had the opportunity of choosing 10 subjects from a
group of more than 10 to make the final top/bottom symmetry group for boys and
girls, he chose the subjects in a way that favored the alternative hypothesis (i.e.
based on the Rutgers undergraduate students' dance evaluations).
Our conclusion - There is clear and convincing statistical evidence to support the
allegations that the alleged research misconduct occurred. Dr. Brown either used
the data collected by the Rutgers undergraduates or some other informed
evaluations of the digitalized dances to carefully select subjects as alleged. Thus, for
three of the four groups, among eligible dancers, those with better Rutgers
undergraduate ratings were placed into the symmetric groups and those with worse
Rutgers undergraduate ratings were placed into the asymmetric groups.
EVIDENCE
With respect to this charge (that there was a biased pre-selection of the 10 subjects when more
than 10 were eligible such that those chosen were biased in the direction of the alternative
hypothesis when the Jamaican students evaluated the tapes), the background may be summarized
as follows: 167 individuals were assessed for FA in 1996 and 2002. Of these, according to
Trivers, Palestis, Zaatari (2009), 167 were filmed while dancing using a motion capture
technique of whom 106 had complete FA data for 1996 and 2002. It was then decided that the
effect of FA on perceived dance ability would be compared across four groups of 10 individuals
each: symmetrical males, asymmetrical males, symmetrical females and asymmetrical females.
To identify the 10 subjects for each group that would be drawn from the larger population, a
criterion was established, namely that each of the 10 subjects for each group must fall in either
i) the upper third of the symmetry-asymmetry scale for both 1996 and 2002 or ii) the lower
third of the symmetry-asymmetry scale for both 1996 and 2002. Dr. Brown's review of his FA
data for both years using these criteria identified 13 "symmetrical" eligible males, 13
asymmetrical eligible males, 10 symmetrical eligible females and 16 asymmetrical eligible
females (Trivers, Palestis, Zaatari 2009; Brown and Cronk 2009). That is, for three of the four
groups, there were too many possible subjects and 10 subjects needed to be selected from the
pool. The charge against Dr. Brown is that the selection process was not random or blind but
done deliberately with the intent of increasing the probability that the main alternative hypothesis
would be statistically substantiated.
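The eligibility criterion described above can be sketched as follows. This is only an illustration under stated assumptions: thirds are taken by rank of summed FA within the group being classified, a lower FA is treated as more symmetrical, and ties and separate handling by sex are ignored. The function names and the six FA values are hypothetical, not data from the study.

```python
def third_labels(fa_by_id):
    """Label each subject 'low', 'mid' or 'high' by the rank of their FA score."""
    ranked = sorted(fa_by_id, key=fa_by_id.get)
    n = len(ranked)
    cut = n // 3
    labels = {}
    for rank, subject in enumerate(ranked):
        if rank < cut:
            labels[subject] = "low"
        elif rank >= n - cut:
            labels[subject] = "high"
        else:
            labels[subject] = "mid"
    return labels


def eligible_subjects(fa_1996, fa_2002):
    """Subjects in the same extreme third of the FA scale in both years."""
    labels_96 = third_labels(fa_1996)
    labels_02 = third_labels(fa_2002)
    symmetric = [s for s in fa_1996
                 if labels_96[s] == "low" and labels_02.get(s) == "low"]
    asymmetric = [s for s in fa_1996
                  if labels_96[s] == "high" and labels_02.get(s) == "high"]
    return symmetric, asymmetric


# Hypothetical summed FA scores keyed by subject ID (illustration only).
fa_1996 = {1: 0.10, 2: 0.12, 3: 0.20, 4: 0.22, 5: 0.30, 6: 0.35}
fa_2002 = {1: 0.11, 2: 0.25, 3: 0.13, 4: 0.21, 5: 0.33, 6: 0.31}
symmetric, asymmetric = eligible_subjects(fa_1996, fa_2002)
print(symmetric, asymmetric)
```

When a pool produced this way contains more than 10 eligible subjects, some further rule is needed to pick the final 10, and it is that further step which the allegation concerns.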
The 40 dance animations were ultimately evaluated by 155 Jamaicans who had also served as
dancers or dancer candidates to provide the outcome data for Dr. Brown's study. However, as
noted earlier, the animations were pre-evaluated by two undergraduate dance students of Rutgers
University. Dr. Brown allegedly had access to these evaluations and, allegedly, used them to
select the 40 animations from the larger pool of eligible subjects as described above. However,
even if Dr. Brown did not have access to these Rutgers undergraduates' dance evaluation scores,
he and/or others had access to the tapes and their own ratings of these tapes might be similar to
those of the Rutgers undergraduate students. So, is there evidence either way that Dr. Brown did
or did not use a randomized / blind procedure to select the 40 subjects from the larger pool of 52
= 13 + 10 + 13 + 16?
The expl