EFTA01145646.pdf

DataSet-9 38 pages 13,473 words document
REPORT OF THE RUTGERS RESEARCH ADVISORY BOARD
Investigation into Allegations of Research Misconduct Against Dr. William Brown
April 25, 2012

I. HISTORICAL BACKGROUND

Drs. Trivers, Palestis and Zaatari (2009), in a paper titled "Anatomy of a Fraud," accused Dr. William Brown of committing research misconduct and including false research results in Brown et al. (2005), a paper published in Nature with Dr. Trivers as a co-author. Dr. Brown and another coauthor of this same Nature paper (Dr. Cronk) have denied these charges in a written rebuttal (Brown and Cronk, 2009). The research in question was funded by the National Science Foundation (NSF), which was acknowledged in the paper.

Rutgers University became aware of these accusations in 2009 and, following NSF guidelines and University policy, completed an inquiry into these allegations. The recommendation from this inquiry, as noted on December 22, 2009 in a letter from Dr. Pazzani to NSF, was to undertake a full investigation of the allegations. The NSF agreed with this recommendation and, on February 18, 2010, asked Rutgers to undertake the full investigation.

We begin this report of the full investigation with a summary of our findings and then present a more detailed description of the study design, the charges of misconduct, and the actions we took to investigate the charges. We conclude the report with detailed explanations of our analysis of the evidence for the three main allegations that were made against Dr. Brown by Drs. Trivers, Palestis and Zaatari (2009) and the reasoning for the findings.

II. SUMMARY OF OUR FINDINGS - WE FIND THAT:

• Substantial (clear and convincing) evidence exists that research fraud has occurred in several areas. Evidence exists that:
o Based upon the investigator's (Dr.
Brown's) knowledge of subject performance, or access to existing evaluations of subject performance, there was biased selection of subjects who were to be included in the symmetry / asymmetry comparison groups so as to artificially obtain desired results;
o There was falsification of averaged data scores from the Jamaican children's evaluations;
o There were omissions of data availability and documentation, as well as conflicting data sets that are consistent with a cover-up against the charges.
• The study design is very complicated and, in some instances, not well defined by the investigators. There are multiple copies of files, some with > 80,000 data fields, for analysis. There are innumerable accusations and rebuttals.
o The scale and complexity of the study make it very hard for us to document each and every allegation against Dr. Brown that was made by Drs. Trivers, Palestis and Zaatari (2009) in such a way that:
■ Will be easily understood even by analytically oriented persons;
■ Will address all conceivable rebuttals that could be made by the accused and accusers.
o With these concerns in mind, we decided to focus on the most substantive of the allegations.
■ This report makes no findings with regard to whether other allegations regarding Dr. Brown's research are well founded or not.

III. BRIEF SYNOPSIS OF THE STUDY IN QUESTION (Brown et al., 2005)

Hypothesis - Persons (Jamaican children) who are more physically symmetrical will be perceived to be better dancers by their peers. This will be true more so for males than for females.

Data Source - Ongoing Study of ~183 Jamaican Children that began in the middle 1990s
- Measures of Symmetry were taken on the Jamaican Children in 1996 and 2002:
o Summed relative absolute differences between left and right side of body were used to calculate what is defined as "fluctuating asymmetry" (FA).
o This was a cumulative score of mean adjusted absolute differences in relative asymmetry size (FA) summed across nine body parts, with a higher score indicating a person is more asymmetric. NOTE - While we believe that the score more accurately should be labeled Relative Fluctuating Asymmetry (RFA), due to the adjustment of the differences by the mean score, the term FA is used in the paper so we also do so here in the report to be consistent.
o The calculations of fluctuating asymmetry and summed score are described in more detail later in this report.
- Videotapes of dancing were made on some of the children in 2004/2005.
o As part of a complicated process described later, the same Jamaican children in the main study were also later the judges of the dancing ability of the selected 40 Jamaican dancers described below.

Study Analysis Sample chosen in Early 2005
- From the larger subject population, 40 children (20 girls and 20 boys) were selected into four groups based on the following stated criteria:
Asymmetric boys = 10 boys who were in the "top 1/3rd" for FA asymmetry scores both in 1996 and again in 2002
Symmetric boys = 10 boys who were in the "lowest 1/3rd" for FA asymmetry scores both in 1996 and again in 2002
Asymmetric girls = 10 girls who were in the "top 1/3rd" for FA asymmetry scores both in 1996 and again in 2002
Symmetric girls = 10 girls who were in the "lowest 1/3rd" for FA asymmetry scores both in 1996 and again in 2002

NOTES
1. While there were ~183 original subjects eligible to be selected for the 40 dancers, the number of potential subjects was smaller, ~106, due to various reasons including missing FA data in 1996 and/or 2002, and/or not later being filmed dancing.
2. It is not totally clear from the paper (Brown et al., 2005) whether the top/bottom 1/3rd was calculated for boys and girls pooled together or if the top/bottom 1/3rds were calculated for each sex separately; the indications were that the sexes were pooled together.
Nor is it clear from the paper whether the pools of subjects used for consideration in 1996 and 2002 included all persons with available data for the given year of consideration or were restricted to persons with complete data for both 1996 and 2002.
3. From the allegations made by Drs. Trivers, Palestis and Zaatari (2009) and the rebuttal of Drs. Brown and Cronk (2009), more than 10 subjects were eligible to be included in three of the above four groups, and the process by which the total available pool of subjects was reduced to 10 for these three groups is one of the salient issues in the allegations of misconduct.
4. There are also claims in the rebuttal of Drs. Brown and Cronk (2009) that persons with what was deemed to be "poor dancing tape quality" were excluded from consideration for these four symmetry/asymmetry groups. But Dr. Brown and Dr. Cronk presented no evidence as to how this was done nor any information as to whether and how such exclusions were documented.
5. Importantly for the charges of fraud being made, before these 40 subjects were chosen, two Rutgers undergraduates had evaluated the dancing tapes and it appears that these scores were available to Dr. Brown prior to the selection process. Drs. Trivers, Palestis and Zaatari (2009) claim this on page 13. The rebuttal by Drs. Brown and Cronk (2009) states that the Rutgers undergraduate evaluations of the tapes were "not all available" at the time the 40 dancers were selected. But Drs. Brown and Cronk do not elaborate further on this to indicate what portion of the Rutgers undergraduate evaluations were available at that time. However, it is clear on page 3 of the rebuttal by Drs. Brown and Cronk (2009) that the dance animations had been viewed by the Rutgers undergraduates and that at least some of the dance scores assigned by these undergraduates were used as part of a grant application made on February 23, 2005 by Dr. Brown, which was prior to the selection of subjects to groups.
Thus it is not disputed that Dr. Brown had access to at least some of the Rutgers undergraduate evaluations of the dance animations before the selection of the 40 dancers. In any case, none of this elaborate prescreening of the subjects is mentioned in the Nature (2005) paper or in any available Appendices to the paper.

Evaluation of Dancing Outcome in March 2005
1. The same Jamaican Children (155 of the 183 children) evaluated the "digitalized" dance routines.
• Digitalized means that the identity and appearance of the dancer was hidden.
2. The plan was for each of the 155 children to evaluate each of the tapes from the digitalized dancers.
• The overall score of each of the 40 tapes was the average score of all children who evaluated the tapes with these caveats:
i. If a child evaluated his/her own digitalized dance, that score was excluded;
ii. There appear in the data (i.e. the data sent to us by Dr. Palestis as described later in this report) evaluations that were incomplete or incorrect and may have thus been excluded from the mean score. But this is not fully documented in the article or by other information sent to us by Dr. Brown (or by Dr. Palestis).

IV. SUMMARY OF THE ALLEGATIONS OF MISCONDUCT THAT WERE MADE BY DRS. TRIVERS, PALESTIS AND ZAATARI (2009)

1. Parties Involved:
a. The party alleged to have engaged in research misconduct:
• Dr. William Brown - First author of Brown et al. (2005) who was a post-doctoral student in Anthropology at Rutgers when this work was done. He is now a faculty member at The University of Bedfordshire in Bedford, UK. He was one of 7 authors on the paper.
b. The parties alleging research misconduct (Drs. Trivers, Palestis, Zaatari, 2009):
• Dr. Robert Trivers - Senior author of the paper Brown et al. (2005) and in the same Department as Dr. Cronk (Anthropology).
• Dr. Brian Palestis - Not a coauthor of Brown et al. (2005) and has not been affiliated with Rutgers (he is at Wagner College).
He apparently has done the statistical analysis to "confirm" and support Dr. Trivers' position.
• Dr. Darin Zaatari - Not a coauthor of Brown et al. (2005). She was a Ph.D. student at Rutgers when the Nature paper was written and has since graduated. She was apparently involved in much of the initial investigation by the accusers.
c. Dr. Lee Cronk - The second author of Brown et al. (2005) and the Principal Investigator on the Grant. He is not accused of committing any misconduct, but has come to the defense of Dr. Brown.

2. Sequence of accusations and rebuttals as stated by the accusers and then the accused (paraphrasing the words used).
a. Soon after the publication of Brown et al. (2005), Dr. Trivers (through communications with persons who were unable to obtain the same results that were in the paper from what they believed to be the data used for the analyses) developed concerns about the data Dr. Brown had used and analyses that appeared in the article. He and the other accusers began contacting Dr. Brown for explanations and the specific data sets that Dr. Brown used. (A series of emails resulting from these contacts was sent to us by Dr. Palestis).
b. Not being satisfied with Dr. Brown's response, Drs. Trivers, Palestis and Zaatari (2009) conducted their own analysis of the data and facts as they saw them, ultimately leading to the point where they concluded that fraud had been committed by Dr. Brown.
c. Drs. Trivers, Palestis and Zaatari attempted to have the journal "Nature" publish a letter retracting the article. When Nature refused to do this, they attempted to have an expose published in another journal. When this did not happen, they published their own 91 page analysis "Anatomy of a Fraud" (Trivers, Palestis and Zaatari, 2009).
d. When contacted by the Rutgers Office of the General Counsel about the document "Anatomy of a Fraud," Drs. Brown and Cronk prepared a ~50 page (including appendices) rebuttal to the accusations (Brown and Cronk, 2009).

3.
The Major Accusations by Drs. Trivers, Palestis and Zaatari:
a. Dr. Brown falsified some of the 1996 and 2002 fluctuating asymmetry (FA) scores on selected subjects in a fashion that i) moved boy/girl dancers to whom the two Rutgers undergraduates had given worse dance ratings into the top 1/3rds of the FA asymmetry scales (most asymmetric) for 1996 and 2002 (and thus caused these worse dancers to meet the selection criteria for being asymmetric) and ii) moved boy/girl dancers who had been accorded better dance ratings by the two Rutgers undergraduate students into the bottom 1/3rds of the FA asymmetry scales for 1996 and 2002 (and thus caused these better dancers to meet the selection criteria for being symmetric).
• The hypothesis, as presented in the allegations, was that Jamaican children and the Rutgers undergraduates would rate the dancers in about the same way. The specific allegation is that Dr. Brown leveraged this possibility to spike the top 1/3rd asymmetric group with bad dancers and the bottom 1/3rd symmetric group with good dancers.
b. More than 10 boys or girls met the criteria to potentially be included in three of the four groups. When that happened, Dr. Brown selected the 10 who would be included into the groups. The allegation is that he did this in a biased fashion so as to selectively choose from the eligible subjects those who were rated as worse dancers by the Rutgers undergraduates to place into the asymmetric groups and similarly selectively chose subjects who were rated as better dancers by the Rutgers undergraduates into the symmetric groups.
c. After the Jamaican children dance evaluations were collected and scored, Dr. Brown falsified the Jamaican children's dancing score ratings to enable results which statistically supported the hypothesis of the paper.
d. NOTE - As was stated earlier in this report, Drs. Trivers, Palestis and Zaatari's 2009 document further contains a multitude of other accusations.
We did not investigate every allegation, but focused on those where it could be efficiently proved and substantively determined that misconduct occurred (i.e. by Dr. Brown) in relation to the 2005 Nature paper.

V. SUMMARY OF THE ACTIONS WE HAVE TAKEN TO INDEPENDENTLY INVESTIGATE THE CHARGES AND REBUTTALS

1. The Rutgers Office of the General Counsel had already requested and received data sets relevant to the accusations when this committee became involved. We nevertheless requested from both Dr. Brown and Dr. Palestis (who did much of the analysis for the Trivers, Palestis and Zaatari (2009) report) all data sets and materials that had relevance to the allegations and, in particular, instructions on how to calculate a) the 1996 and 2002 FA scores and b) the Jamaican student dance rating scores from the raw data.
• Dr. Palestis responded within 1 week of the request and sent us, among other things, data sets that included:
i. The data for 1996 and 2002 asymmetry scores which Dr. Palestis said Dr. Brown had sent him earlier and data for the same scores that exist in the database that Dr. Trivers' group maintains for the ongoing Jamaican study. (Dr. Brown did not collect the FA score data himself but received it from others who had collected these data).
ii. The individual ~155 Jamaican students' dance ratings of the 40 study subjects. Dr. Palestis said he had received these data earlier from Dr. Brown.
iii. Instructions on how to calculate all scores in the data described above in i. and ii.
iv. Summaries of Drs. Palestis', Trivers' and Zaatari's comparative analyses of the data they received from Dr. Brown with their own data from the Trivers group database and the results reported in the 2005 Nature article.
• Dr. Brown responded later:
i. He did send his data for the 1996 and 2002 asymmetry scores which, after making a considerable number of comparisons, appear to us to be essentially the same data that Dr.
Palestis had sent us which he stated he had received from Dr. Brown.
ii. Dr. Brown indicated that he no longer had raw data on the Jamaican students' ratings of the 40 dancers. When we questioned him further about these data, Dr. Brown said that the data Dr. Palestis et al. used for their analysis of dance scores (i.e. attributed to being from Dr. Brown) must be old or corrupt and the correct data could not be recovered from it. Quoting Dr. Brown from his January 25, 2011 email to Dr. Pazzani, "I sent a file to Dr. Brian Palestis some time ago, but it appears that this file is either corrupted or an earlier version of the one used by the research assistant (i.e., to decide which ratings would be included in the average). If I could find the file or figure out how to calculate the averages from the one I sent Dr. Brian Palestis, I would send it to you along with detailed instructions to help with the investigation. .... Nonetheless, I will look for the file I sent to Dr. Palestis and attempt again to reconstruct the average ratings." We have not received any further correspondence from Dr. Brown on this issue.

2. We requested from all participants who we believed might have access to these documents (Drs. Trivers, Brown and Cronk) to send us copies of earlier versions of the paper and the reviews, most notably the original paper that was submitted to Nature along with the review and the response to the review.
• The first response was from Dr. Cronk who sent us multiple copies of earlier versions of the paper.
• Dr. Brown followed with one copy of an earlier version of the paper.
• Dr. Trivers sent a copy of the review report of the first submission of the paper.

3. To be thorough, we sent emails and/or made phone calls to other coauthors (Drs. Keith Grochow, Amy Jacobson, C. Karen Liu, and Zoran Popovic) asking if they had any knowledge that could be relevant to the investigation.
None of these authors were alleged to have engaged in misconduct, and while some were mentioned in the rebuttal, it did not seem as if they would have knowledge pertinent to the charges.
• Dr. Popovic responded (and we believe he was also speaking for Drs. Liu and Grochow) that they had been aware of these allegations for quite some time, had been contacted about them by several sources and, as coauthors of the paper, were anxious to know the findings of the investigation. They had no significant new information on this to share with us.
• Dr. Jacobson responded that she had no role in the study beyond being the field site manager for the general project and thus could not help us further.

4. After we had undertaken our analysis and were ready to finalize the report, we sent emails to Dr. Brown and Dr. Cronk requesting explanations on two findings we made that could reflect inconsistencies or fraud in the study design and analysis.
• Dr. Cronk met with us at Rutgers in early October 2011 and Dr. Brown sent a written reply in December 2011.

VI. SUMMARY OF OUR ANALYSIS AND FINDINGS

We first followed the approach that Drs. Trivers, Palestis and Zaatari (2009) had used (including examining the rebuttals made by Drs. Brown and Cronk (2009)) to see if the arguments put forth were valid and then to see if we could replicate the different analyses with the data sets we had been given. While we found that the previous approaches which had been used were well reasoned and exhaustive, we tried to streamline and distill the analyses to be more easily understandable, communicable and addressable. The result is the following findings relevant to the main allegations of fraud that were made in the last paragraph of page 5 of Trivers, Palestis and Zaatari (2009).

1.
Allegation - The 1996 and 2002 FA asymmetry scores of the 40 dancers who were chosen for the study groups were systematically fabricated in a fashion to make better dancers more symmetric and worse dancers less symmetric.
Our Conclusion - There is clear and convincing evidence to support the allegations that this alleged research misconduct occurred.
a. This fabrication occurred and did cause dancers who were rated better by the Rutgers undergraduates to be more likely to be inserted into the "symmetric" boys and girls groups and dancers who were rated worse by the Rutgers undergraduates to be more likely to be inserted into the "asymmetric" boys and girls groups.
b. It does not seem possible that this fabrication i) could have happened by chance, ii) could have been perpetrated by anyone other than Dr. Brown or iii) had it been perpetrated by someone other than Dr. Brown, that Dr. Brown would not have noticed this problem and reported it after years of questioning by Dr. Trivers' group and then by us.

2. Allegation - When Dr. Brown had the opportunity of choosing 10 subjects from a group of more than 10 to make the final top/bottom symmetry group for boys and girls, he chose the subjects in a way that favored the alternative hypothesis (i.e. based on the Rutgers undergraduate students' dance evaluations).
Our Conclusion - There is clear and convincing statistical evidence to support the allegations that the alleged research misconduct occurred. Dr. Brown used either the data collected by the Rutgers undergraduates or some other informed evaluations of the digitalized dances to carefully select subjects as alleged. Thus, for three of the four groups, among eligible dancers, those with better Rutgers undergraduate ratings were placed into the symmetric groups and those with poorer Rutgers undergraduate ratings were placed into the asymmetric groups.

3. Allegation - Dr.
Brown fabricated the Jamaican children averaged dance score summaries of the 40 dancers in order to obtain statistically significant findings that supported the alternative hypothesis.
Our Conclusion - There is enough evidence to support that the alleged research misconduct occurred. Dr. Brown is unable to produce data that can support the findings he reported in Nature (2005) which, as both the first author and as the person who undertook the data analysis, he should be able to do. However, Dr. Palestis produced a data set he claims to have received from Dr. Brown. Dr. Brown subsequently acknowledged he sent this data to Dr. Palestis and sent the same data to us, but claims that this data is incorrect / unusable and that he no longer has the correct data. It is thus impossible to know exactly what was done in the analysis by Dr. Brown because he relies on claims of unwritten / undocumented or otherwise unexaminable reasons for exclusions and/or incorrectness of some values in this existing data. Nonetheless, the findings of our analyses on the only existing raw dancer rating data, initially provided by Dr. Palestis, are very consistent with those of Trivers et al. and are incompatible with the findings reported in Nature (2005).

We now present detailed explanations of our findings on the three main allegations of research misconduct that were made against Dr. Brown.

1. Allegation - The 1996 and 2002 FA asymmetry scores of the 40 dancers who were chosen for the study groups were systematically fabricated in a fashion to make better dancers more symmetric and worse dancers less symmetric.
Our Conclusion - There is clear and convincing evidence to support the allegations that this alleged research misconduct occurred.
a.
This fabrication occurred and did cause dancers who were rated better by the Rutgers undergraduates to be more likely to be inserted into the "symmetric" boys and girls groups and dancers who were rated worse by the Rutgers undergraduates to be more likely to be inserted into the "asymmetric" boys and girls groups.
b. It seems impossible that i) this fabrication could have happened by chance, ii) it could have been done by anyone other than Dr. Brown, or iii) had it been done by someone other than Dr. Brown, that Dr. Brown would not have noticed this problem and reported it after years of questioning by Dr. Trivers' group and then by us.

EVIDENCE For Fabrication of Asymmetry Scores

A. The 1996 and 2002 asymmetry scores in the data sets sent to us by Dr. Palestis were entirely internally self-consistent (i.e. the data did not contradict itself) with respect to the Fluctuating Asymmetry (FA) scores and their component variables.
B. The 1996 and 2002 FA scores and their components in the data sent to us by Dr. Brown were:
1) Internally self-consistent for all subjects who were not chosen to be one of the 40 dancers.
2) In general, not internally self-consistent (data contradicted itself) for the 40 subjects who were chosen to be dancers as described in Section C below.
C. The non-self-consistency of FA scores in Dr. Brown's data is, in our view, impossible to explain by anything other than fabrication of some of the data by a person who, at the time of the fabrication, did not realize that the other items also needed to be changed for the data to be self-consistent or otherwise did not think to change these items.

For each subject, the Fluctuating Asymmetry (FA) score was calculated as a sum of absolute relative asymmetry for 9 body parts (elbow, wrist, knee, ankle, foot, ear, 3rd digit, 4th digit and 5th digit) as described below.

$$FA = \sum_{P=1}^{9} RA_P$$

where P = 1,..., 9 enumerates the nine body parts and $RA_P$ is the relative asymmetry of the given body part (i.e.
hand, ear, 4th digit, etc.)

For each body part,

$$RA_P = \frac{AD_P}{M_P}$$

with
$AD_P$ = Absolute Value of [Left Side Measure - Right Side Measure]
$M_P$ = Average of Left Side Measure and Right Side Measure

The values of $AD_P$, $M_P$ and $RA_P$ are saved in the data sets we received from Drs. Brown and Palestis for each person and body part P (Dr. Brown's data sent to us is missing $AD_P$ for the 3rd digit in 1996 and for the ears in 2002).

As described above for each subject and body part, if we go into the data sets and for the Pth body part and year (1996, 2002) take the ratio of the values $AD_P / M_P$ of a given child, this ratio is always the same as the value of $RA_P$ for that Pth body part of that child during the same year in the data (i.e. self-consistent) in Dr. Palestis' data, as it should be. The observed ratio $AD_P / M_P$ is also always equal (i.e. to within three decimal places) to the value of $RA_P$ for the same body part of the child in the same year in Dr. Brown's data for all subjects not selected into the study, with any differences that were less than 3 decimal places being very small (i.e. of order < $10^{-10}$) and thus being likely due to round off error at some stage and otherwise having no impact on the FA score.

However, the ratios $AD_P / M_P$ are largely not equal (within 3 decimal places) to the values of $RA_P$ for the same body part of the same child in the given year (1996 or 2002) for almost all of the 40 subjects selected to the study groups in Dr. Brown's data, among the body parts that were included in the FA score. For example, with P = 4th digit, going to subject 15 in 1996 (who was selected as one of the 40 dancers) in Dr.
Brown's data we observe

$AD_P$ = 0.875
$M_P$ = 55.888
$RA_P$ = 0.0076 (which rounded to three decimal places in Tables 1 and 2 is 0.008)

But looking at $AD_P / M_P$ for this person gives the self-consistent value of $RA_P$ as

0.875 / 55.888 = 0.0157 (which rounded to three decimal places in Tables 1 and 2 is 0.016)

In their rebuttal to the allegations by Drs. Trivers, Palestis and Zaatari (2009), Drs. Brown and Cronk (2009) suggest that some data discrepancies might be due to "rounding" errors. However, it is obvious that the difference between 0.0157 and 0.0076 is too large to be due to round-off error and that this difference does not qualitatively change by only taking the measures of $AD_P$ and $M_P$ out to 3 vs. 4 or more decimal places. The same is true for the other inconsistencies we observed between $RA_P$ and $AD_P / M_P$ in Dr. Brown's data in the 40 selected dancers.

Furthermore, it should be noted that for all study subjects and all body parts P, the values of $AD_P$ and $M_P$ for body part P of any given subject do not differ between Dr. Brown's and Dr. Palestis' data sets. The inconsistent values of $RA_P$ do not equal the ratios of the corresponding $AD_P / M_P$ for the vast majority of body parts in the 40 selected dancers in Dr. Brown's data set and differ from the $RA_P$ in Dr. Palestis' data set (which always equals the ratio of the corresponding $AD_P / M_P$). As we just noted, when these differences between Dr. Brown's and Dr. Palestis' data occur, the $RA_P$ in Dr. Palestis' data is equal to the ratio of the corresponding $AD_P / M_P$ while the $RA_P$ in Dr. Brown's data does not equal the ratio of the corresponding $AD_P / M_P$. In other words, if we look at the 4th digit of subject 15 for 1996 in Dr. Palestis' data, we see the correct and self-consistent values

$AD_P$ = 0.875
$M_P$ = 55.888
$RA_P$ = 0.0157 (i.e. = 0.875 / 55.888)

Due to there being 9 body parts measured on 2 different years (1996 and 2002) and 290 subjects in the data set, we cannot show all the comparisons here.
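As a minimal sketch of the self-consistency check described in this section (not the Board's actual procedure or software), the comparison of a recorded RA value against the ratio AD / M can be written in a few lines of Python. The function names and the tolerance of half a unit in the third decimal place are assumptions chosen for illustration; the only data used are the subject 15, 4th digit, 1996 values quoted in the text.

```python
def relative_asymmetry(ad: float, m: float) -> float:
    """Return AD_P / M_P, the ratio that a recorded RA_P should equal."""
    return ad / m


def is_self_consistent(ad: float, m: float, recorded_ra: float,
                       tol: float = 0.0005) -> bool:
    """True if the recorded RA_P agrees with AD_P / M_P to about 3 decimals.

    The tolerance is an illustrative assumption, not a value from the report.
    """
    return abs(relative_asymmetry(ad, m) - recorded_ra) < tol


# Subject 15, 4th digit, 1996, as quoted in the report.
AD, M = 0.875, 55.888
brown_ra = 0.0076      # RA_P recorded in Dr. Brown's data
palestis_ra = 0.0157   # RA_P recorded in Dr. Palestis' data

print(round(relative_asymmetry(AD, M), 4))    # ratio implied by AD_P and M_P
print(is_self_consistent(AD, M, brown_ra))    # Dr. Brown's recorded value
print(is_self_consistent(AD, M, palestis_ra)) # Dr. Palestis' recorded value
```

Running the same check over every subject, body part and year would flag the rows where the recorded value and the ratio disagree, which is the pattern the report tabulates in Tables 1 and 2.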
However, Table 1 displays the values of $AD_P$, $M_P$, the actual ratio of these values $AD_P / M_P$, and $RA_P$ for the 4th digit in 1996 among the first 30 dancers in Dr. Brown's data set, which includes some who were selected into the 40 asymmetric / symmetric dancers. Those who were selected into the final 40 asymmetric / symmetric dancers are highlighted in red in Table 1. When the recorded $RA_P$ does not equal the ratio of the corresponding $AD_P / M_P$, the last 2 columns in Table 1 are highlighted in bold. Table 2 shows the same comparisons for the 40 selected dancers. For 34 of these subjects, the $RA_P$ does not equal the corresponding ratio $AD_P / M_P$. As is true for the other subjects and body parts in 1996 and 2002, the recorded $RA_P$ always equals the observed ratio $AD_P / M_P$ in subjects who were not selected to be in the 40 dancers but usually does not for those who were selected.

It should be noted that measures for $AD_P$, $M_P$ and $RA_P$ are recorded for 1996 in Dr. Brown's dataset on one body part (the hand) that was not used in the FA score. For this body part, the ratio $AD_P / M_P$ always equals the corresponding $RA_P$ in the 40 selected dancers, in spite of the fact just noted above that it seldom does for the body parts that were included in the 1996 FA score.

It should also be noted that sometimes values for $AD_P$ and $M_P$ are present but the corresponding result for $RA_P$ is missing in Dr. Brown's data. For example, this happens with ID 7 in Table 1 and ID 287 in Table 2. However, we have found that in settings when this happens an entire set of values for at least one other summed body part in that year is missing. For example, ID 7 is missing measures of $AD_P$, $M_P$ and $RA_P$ for foot in 1996 and ID 287 is missing the values of $AD_P$, $M_P$ and $RA_P$ for elbow in 1996. Thus the missing $RA_P$'s for the 4th digit of IDs 7 and 287 in 1996 could reflect that all of the $RA_P$'s needed for the 1996 sum were not available for those IDs. (While Dr.
Brown in fact has a sum recorded for 1996 FA of ID 287 as shown in column 2 of Table 3 that is mentioned later in this report, this was not possible as column 3 of the same table shows since elbow was missing.)

When we met with Dr. Cronk in October 2011, he had no explanation for the discrepancies between the recorded $RA_P$ and the actual ratios $AD_P / M_P$ for body parts among the 40 selected subjects. Dr. Brown also acknowledged that the inconsistencies existed when he replied to our questions on December 1, 2011, but his only explanations alluded to the fact that either he did not know how they could happen and/or that the data we had received from him may not have been the same data that he actually used in 2005 and/or that these errors may have been introduced by other people before he received the data. To quote (with salient phrases underlined by us) from part of his response to our questions on this issue that he returned on December 1, 2011:

"This is interesting as you rightly point out the hand was not used as one of the FA's in the composite. Recall that all information that was used and presented in the Nature paper was not from the master dataset I sent you. Any values that are included in this file were pasted from the file used in 2005. This is clear evidence that the file I was working with in 2005 is indeed different from the file you attached as I previously claimed".

"It is challenging to explain why these inconsistencies occur. Recall that when making the dataset for Dr Palestis well after the dance paper was published in Nature (the email and file time stamps indicate this fact) it came to my awareness that there were errors. I should point out that these initial errors were introduced before I began working on the project. Indeed to make the so-called master file for Dr Palestis involved me merging, cutting and pasting from different files some of which I no longer have access.
Since errors were discovered after I made the file I am skeptical about the validity of this file. You have discovered another problem, to which i have no logical explanation. I acknowledge it to be there but as to how it emerged (and when) is unclear to me. Without the original files i was working with it difficult to isolate how and when discrepancies emerged in this post-publication dataset."

The only explanation we can see for the non-self-consistencies in Dr. Brown's data is that Dr. Palestis' data set is correct and that the values for RAP were altered in Dr. Brown's data so that they would sum to the values of FA for those subjects in 1996 and 2002, which had also been altered. But this was only done within the 40 selected dancers and was done by someone who either was not aware that the corresponding values for ADP and MP also needed to be altered to make the data self-consistent or did not bother to do so. We see no conceivable way this alteration could happen by chance or accident; we conclude it must be the result of fabrication. For example, non-self-consistencies between ADP / MP and RAP were observed at least once in 39 of the 40 selected dancers compared to never in the 66 other filmed dancers with available FA data for 1996 and 2002 who were not selected. The P-value for this to occur by chance alone is less than 1 in 10^27 by exact test.

The Alteration of Asymmetry Scores Was Done by Dr. Brown

It seems impossible that anyone other than Dr. Brown (who did the data analysis for the paper and held the data set) would have had access to these data to alter only the values of RAP and the corresponding summed FA. We do not see how someone creating a data set in 2005 before Dr.
Brown began working on the project would have had the reason or ability to alter these values only among those 40 people who were later selected to be dancers by what is now an incompletely defined process and what would have been, at the time of that alteration, an unknowable process.

The Alteration of Asymmetry Scores Favored the Investigator's Hypothesis in a Way That Could Have Been Foreseen by Dr. Brown

The complexity of the study design and the fact that this design was not clearly explained (and further confused by caveats such as persons being excluded from consideration because their videos were deemed un-evaluable) complicate any certain determination of "what would have happened" if the data had not been fabricated, as we believe they were. However, we compare in Tables 3 and 4 the differences [Dr. Brown's summed FA - Correct Summed FA] for 1996 and 2002 respectively. By "Correct Summed FA" we mean the summed FA that is self-consistent with the ADP and MP in the data set. For example, in Table 3 for ID 15, the value for summed 1996 FA in Dr. Brown's data was 0.110 (in column 2). However, based on the actual values of ADP and MP for the 9 body parts in 1996 and their ratios, the correct (i.e. self-consistent) 1996 FA for ID 15 was 0.163 (in column 3). This means that Dr. Brown's summed 1996 FA for ID 15 was shifted by -0.053 (in column 4) from the correct value (-0.053 = 0.110 - 0.163), making that person more symmetric than they would be by the self-consistent FA measure. Column 5 has the averaged Rutgers undergraduate dancer score for ID 15, which was 123.93. Now 123.93 was one of the higher scores, meaning this person's summed FA was shifted lower by 0.053 to make this person more symmetric by Dr. Brown's score, and this person was also rated as a relatively good dancer by the Rutgers undergraduate students. The format for Table 4 is the same as that for Table 3 except that 2002 rather than 1996 FA scores are involved.
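The self-consistency check and the shift calculation described above can be sketched in a few lines. This is only an illustration under stated assumptions: it takes the summed FA to be the sum over body parts of RAP = ADP / MP, as the text indicates, and the function names, the tolerance, and the two-part example values are hypothetical. Only the ID 15 figures (0.110 and 0.163) are taken from Table 3 as quoted in this report.

```python
from math import isclose

def self_consistent_fa(parts):
    """Sum ADP / MP over body parts; `parts` is a list of (ADP, MP) pairs."""
    return sum(ad / m for ad, m in parts)

def inconsistent_parts(parts, recorded_ra, tol=1e-3):
    """Indices of body parts whose recorded RAP differs from ADP / MP.

    `tol` is an assumed rounding tolerance, not a value from the report.
    """
    return [
        i
        for i, ((ad, m), ra) in enumerate(zip(parts, recorded_ra))
        if not isclose(ad / m, ra, abs_tol=tol)
    ]

def shift(recorded_fa, correct_fa):
    """Recorded summed FA minus the self-consistent summed FA."""
    return round(recorded_fa - correct_fa, 3)

# Hypothetical two-part subject: the second recorded RAP disagrees with ADP / MP.
example_parts = [(0.5, 50.0), (1.2, 60.0)]
example_recorded = [0.010, 0.015]
flagged = inconsistent_parts(example_parts, example_recorded)
correct_sum = self_consistent_fa(example_parts)

# ID 15 in 1996, using the two summed values quoted from Table 3.
id15_shift = shift(0.110, 0.163)
```

A negative shift means the recorded summed FA is lower, and the subject therefore appears more symmetric, than the self-consistent value.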
In order to see if the shifts (from self-consistent) in the 1996 and 2002 FA scores in Dr. Brown's data were associated with the Rutgers undergraduates' dance scores, we examined the correlations of the shifts (column 4) with the averaged Rutgers undergraduate scores (column 5) in Tables 3 and 4. These analyses were restricted to only those subjects in 1996 and 2002 respectively where Dr. Brown's FA differed from the correct self-consistent FA. For 1996 (Table 3) the shift between Dr. Brown's value and the self-consistent value was negatively correlated with the averaged Rutgers undergraduate dancer scores (ρ = -0.39 with P=0.0157 for no association, by Formula 16.25 in Berenson and Levine, 1999). For 2002 (Table 4) the shift between Dr. Brown's value and the self-consistent value was also negatively correlated with the averaged Rutgers undergraduate dancer scores (ρ = -0.24 with P=0.245, by Formula 16.25 in Berenson and Levine, 1999). This means that, compared to bad dancers, good dancers were more shifted towards symmetry by the alterations in Dr. Brown's FA scores in both 1996 and 2002, something that would support the alternative hypothesis. Using Fisher's (1950) method (as described below on page 23 of this report) to pool the P-values from 1996 and 2002, together with the fact that the shifts were in the same direction, gives an overall two-sided P-value of 0.0152 for the shifts in 1996 and 2002 simultaneously being directionally associated with Rutgers undergraduate scores. In other words, it is not likely that the shifts of the FA scores in Dr. Brown's data from the correct self-consistent FA scores for 1996 and 2002 would, by chance, correlate with the averaged Rutgers undergraduate evaluations in the direction of the alternative hypothesis as strongly as they did. It should be noted that Drs.
Brown and Cronk's rebuttal (2009) claims that some or all of the Rutgers undergraduate evaluations were not available when the 40 symmetric / asymmetric dancers were selected. But even if that were the case, it does not invalidate the findings of this test, which indicate that the changes in FA within Dr. Brown's data were directionally associated with a supposedly independent measure of dancing ability. For example, others (including, we believe, almost certainly Dr. Brown) were also able to view the animation tapes before the 40 dancers were selected. Thus Dr. Brown could have used dancer evaluation information from sources other than the Rutgers undergraduate students as the basis for any decisions on fabrication. As dancer evaluations from other sources would likely agree with those of the Rutgers undergraduate students with respect to quality of dance, the fabricated shifts in FA would still be statistically associated with the Rutgers undergraduate scores in the direction of the alternative hypothesis even if the undergraduate scores were not used in the fabrication process. The rebuttal from Drs. Brown and Cronk (2009) mentions tapes being excluded from consideration for selection by the investigators due to poor quality, an assertion that means the tapes must have been viewed in advance to screen for this. It stands to reason that the perceptions of Dr. Brown and others of dancing ability would be in the same direction as those of the Rutgers undergraduate students and, if so, the association of shifts in FA (from the self-consistent value to Dr. Brown's value) with Rutgers undergraduate ratings would carry over to other ratings of dancing ability as well.

2. Allegation - When Dr. Brown had the opportunity of choosing 10 subjects from a group of more than 10 to make the final top/bottom symmetry group for boys and girls, he chose the subjects in a way that favored the alternative hypothesis (i.e.
based on the Rutgers undergraduate students' dance evaluations).

Our conclusion - There is clear and convincing statistical evidence to support the allegation that the alleged research misconduct occurred. Dr. Brown either used the data collected by the Rutgers undergraduates or some other informed evaluations of the digitalized dances to carefully select subjects as alleged. Thus, for three of the four groups, among eligible dancers, those with better Rutgers undergraduate ratings were placed into the symmetric groups and those with worse Rutgers undergraduate ratings were placed into the asymmetric groups.

EVIDENCE

With respect to this charge (that there was a biased pre-selection of the 10 subjects when more than 10 were eligible, such that those chosen were biased in the direction of the alternative hypothesis when the Jamaican students evaluated the tapes), the background may be summarized as follows: 167 individuals were assessed for FA in 1996 and 2002. Of these, according to Trivers, Palestis, Zaatari (2009), 167 were filmed while dancing using a motion capture technique, of whom 106 had complete FA data for 1996 and 2002. It was then decided that the effect of FA on perceived dance ability would be compared across four groups of 10 individuals each: symmetrical males, asymmetrical males, symmetrical females and asymmetrical females. To identify the 10 subjects for each group that would be drawn from the larger population, a criterion was established, namely that each of the 10 subjects for each group must fall in either i) the upper third of the symmetry-asymmetry scale for both 1996 and 2002 or ii) the lower third of the symmetry-asymmetry scale for both 1996 and 2002. Dr. Brown's review of his FA data for both years using these criteria identified 13 symmetrical eligible males, 13 asymmetrical eligible males, 10 symmetrical eligible females and 16 asymmetrical eligible females (Trivers, Palestis, Zaatari 2009; Brown and Cronk 2009).
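The eligibility criterion just described can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used in the study: it assumes that a lower FA means a more symmetric subject, that thirds are cut by rank within whatever set of scores is passed in, and that ties are broken arbitrarily. The report does not state how the cut points were computed. All names and the six-subject data are hypothetical.

```python
def tertile_labels(scores):
    """Label each ID 'low', 'mid' or 'high' by the rank of its FA score.

    `scores` maps subject ID -> summed FA. The lowest third by rank is
    'low' (most symmetric, by assumption) and the highest third is 'high'.
    """
    ranked = sorted(scores, key=scores.get)
    k = len(ranked) // 3
    labels = {}
    for rank, sid in enumerate(ranked):
        if rank < k:
            labels[sid] = "low"
        elif rank >= len(ranked) - k:
            labels[sid] = "high"
        else:
            labels[sid] = "mid"
    return labels

def eligible(fa_1996, fa_2002):
    """Return (symmetric, asymmetric) IDs in the same extreme third in both years."""
    t96 = tertile_labels(fa_1996)
    t02 = tertile_labels(fa_2002)
    both = sorted(set(t96) & set(t02))
    symmetric = [s for s in both if t96[s] == "low" and t02[s] == "low"]
    asymmetric = [s for s in both if t96[s] == "high" and t02[s] == "high"]
    return symmetric, asymmetric

# Hypothetical scores for six subjects (not data from the report).
fa_1996 = {1: 0.10, 2: 0.12, 3: 0.20, 4: 0.22, 5: 0.30, 6: 0.35}
fa_2002 = {1: 0.11, 2: 0.25, 3: 0.13, 4: 0.21, 5: 0.33, 6: 0.31}
symmetric_ids, asymmetric_ids = eligible(fa_1996, fa_2002)
```

In this toy data, subject 2 is in the lowest third in 1996 but not in 2002, so it is not eligible for the symmetric group under the both-years rule.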
That is, for three of the four groups, there were too many possible subjects and 10 subjects needed to be selected from the pool. The charge against Dr. Brown is that the selection process was not random or blind but was done deliberately with the intent of increasing the probability that the main alternative hypothesis would be statistically substantiated. The 40 dance animations were ultimately evaluated by 155 Jamaicans who had also served as dancers or dancer candidates to provide the outcome data for Dr. Brown's study. However, as noted earlier, the animations were pre-evaluated by two undergraduate dance students of Rutgers University. Dr. Brown allegedly had access to these evaluations and, allegedly, used them to select the 40 animations from the larger pool of eligible subjects as described above. However, even if Dr. Brown did not have access to these Rutgers undergraduates' dance evaluation scores, he and/or others had access to the tapes, and their own ratings of these tapes might be similar to those of the Rutgers undergraduate students. So, is there evidence either way that Dr. Brown did or did not use a randomized / blind procedure to select the 40 subjects from the larger pool of 52 = 13 + 10 + 13 + 16? The expl
ℹ️ Document Details
SHA-256
1d741fc8e9be21df548ee04e7882f82ce03fc22bc4a5ef73c4956e911632d5e1
Bates Number
EFTA01145646
Dataset
DataSet-9
Document Type
document
Pages
38
