Korean J Pain 2010; 23(1): 35-41
Published online March 31, 2010 https://doi.org/10.3344/kjp.2010.23.1.35
Copyright © The Korean Pain Society.
Kyoung Hoon Yim, MD, Francis Sahngun Nahm, MD*, Kyoung Ah Han, MD*, and Soo Young Park, MD*
Department of Anesthesiology and Pain Medicine, Seoul National University Bundang Hospital, Seongnam, Korea.
*Department of Anesthesiology and Pain Medicine, Seoul National University Hospital, Seoul, Korea.
Correspondence to: Francis Sahngun Nahm, MD. Department of Anesthesiology and Pain Medicine, Seoul National University Bundang Hospital, 166, Gumi-ro, Bundang-gu, Seongnam 463-707, Korea. Tel: +82-31-787-7499, Fax: +82-31-787-4063, hiitsme@hanmail.net
Received: October 16, 2009; Revised: October 26, 2009; Accepted: November 11, 2009
This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Statistical analysis is essential for obtaining objective reliability in medical research. However, many medical researchers lack sufficient statistical knowledge to properly analyze their study data. To help understand and potentially alleviate this problem, we analyzed the statistical methods used and the statistical errors made in articles published in the Korean Journal of Pain.
All the articles, except case reports and editorials, published in the Korean Journal of Pain from 2004 to 2008 were included in this study, and the statistical methods and errors in each article were evaluated.
One hundred and thirty-nine original articles were reviewed. Inferential statistics and descriptive statistics were used in 119 papers and 20 papers, respectively. Only 20.9% of the papers were free from statistical errors. The most commonly adopted statistical method was the t-test (21.0%) followed by the chi-square test (15.9%). Errors of omission were encountered 101 times in 70 papers. Among the errors of omission, "no statistics used even though statistical methods were required" was the most common (40.6%). The errors of commission were encountered 165 times in 86 papers, among which "parametric inference for nonparametric data" was the most common (33.9%).
We found various types of statistical errors in the articles published in the Korean Journal of Pain.
Keywords: data interpretation, statistical analysis, statistics
Statistical analysis is the process of collecting and arranging data and drawing general regularities from them; it is recognized as the most fundamental and universal way to establish the soundness of conclusions in all scientific research. Statistical analysis is particularly important in medicine because the ultimate purpose of medical research is clinical application: inappropriate statistical techniques can degrade the quality of research articles or produce decisive errors that lead to incorrect treatment.
Although the rapid progress of statistical software in recent years has made data analysis more convenient, the danger of obtaining wrong results or misinterpreting the analyzed results has also increased when a correct understanding of fundamental statistical concepts is lacking [1].
Although many articles have been published since the first issue of the Korean Journal of Pain, the statistical methods and errors in these articles have not been systematically evaluated. We therefore reviewed the articles published in the Korean Journal of Pain to analyze the statistical methods used and the statistical errors made.
Among the 296 articles published in the Korean Journal of Pain from 2004 to 2008, 139 original articles were selected for analysis after excluding case reports and editorials.
When only descriptive statistics were used, the number of such articles was counted; when inferential statistics were applied, the types and frequencies of the statistical methods used were analyzed. The validity of the statistical methods in each article was evaluated using the revised Checklist for Assessing the Methodological and Statistical Validity of Medical Articles (Table 1) [2]. The checklist covers the type of study, the type of statistical method applied, and the validity of its application. The item on validity was divided into two categories: "errors of omission," caused by insufficient reporting of the analysis procedure and data by the researchers, and "errors of commission," caused by statistical mistreatment of the data. The "errors of omission" included the following: ① incomplete description of basic data, ② incomplete description of applied statistical methods, ③ no statistics used even though statistical methods were required, and ④ no evidence that the described statistical methods were used. The "errors of commission" included the following: ① inadequate description of measures of central tendency or dispersion, ② incorrect analysis, and ③ unwarranted conclusion.
The statistics checklist for each article was completed jointly by statistics professionals and pain medicine specialists. If more than one statistical method was used in an article, each use was counted individually. If different statistical errors occurred in one article, each error was counted separately, whereas the same error repeated more than once in an article was counted only once.
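The counting rules above can be illustrated with a short sketch. The records and numbers below are hypothetical and are not taken from the study; the sketch only shows how every use of a method is counted while a repeated error type is counted once per article.

```python
# Hypothetical illustration of the counting rules described above (not study data).
from collections import Counter

# Each record lists the statistical methods used and the error types found in one article.
articles = [
    {"methods": ["t-test", "chi-square test", "t-test"],
     "errors": ["incorrect analysis", "incorrect analysis", "unwarranted conclusion"]},
    {"methods": ["ANOVA"],
     "errors": []},
]

method_counts = Counter()
error_counts = Counter()
for article in articles:
    method_counts.update(article["methods"])      # every use of a method is counted
    error_counts.update(set(article["errors"]))   # a repeated error type counts once per article

print(method_counts)  # t-test: 2, chi-square test: 1, ANOVA: 1
print(error_counts)   # incorrect analysis: 1, unwarranted conclusion: 1
```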
The completed checklists were statistically analyzed with SPSS statistics version 17.0 (SPSS Inc., Chicago, USA) to derive the frequency and percentage of each item.
A total of 20 (14.4%) of the 139 articles employed only descriptive statistics, and inferential statistics were used in 119 (85.6%) articles (Table 2). Inferential statistical methods were used 252 times in the 119 articles; the t-test was the most frequently used at 53 times (21.0%), followed by the chi-square (χ2) test at 40 times (15.9%), analysis of variance (ANOVA) at 25 times (9.9%), the Mann-Whitney U test at 23 times (9.1%), and the paired t-test at 22 times (8.7%). The distribution of the applied statistical methods is shown in Table 3.
Of the 139 target articles, 29 (20.9%) were free from statistical errors. In the remaining 110 (79.1%) articles, in which statistical analysis was inappropriately applied, 266 errors were found according to the statistics checklist (Table 1), or 2.4 errors per article.
"Errors of omission" were found 101 times in 70 articles (1.44 time/article). Among these, the most frequent error was "no statistics were used even though statistical methods were required" at 41 times (40.6%), followed by "incomplete description of applied statistical methods" at 24 times (23.8%) (Table 4).
"Errors of commission" were found 165 times in 86 articles (1.92 time/article). Among these, "inadequate description of measures of central tendency or dispersion" was registered at 35 times (21.2%), "incorrect analysis" at 123 times (74.5%), and "unwarranted conclusion" at 7 times (4.2%). Out of the 123 occasions where incorrect statistical analysis was used, "parametric inference for nonparametric data" was the most common error at 56 times (33.9%), followed by "chi-square test on the data with inappropriate sample size" at 24 times (14.5%) (Table 5).
The importance of statistical analysis in medical research is increasing steadily; accordingly, in an era when evidence-based medicine is highly valued, evaluating the statistical validity of medical research articles has become essential.
Since the 1990s, several academic societies in Korea have investigated the statistical methods applied in the articles published in their journals [3-7], and the Checklist for Assessing the Methodological and Statistical Validity of Medical Articles [2] has been revised by individual academic societies for use in analyzing their articles [3,5,7]. In this study, we also used this checklist [2], which has been employed many times in previous studies, to analyze the target articles as objectively as possible.
We confirmed that a wide range of statistical methods has been used in the original articles published in the Korean Journal of Pain.
"No statistics used even though statistical methods were required" was the most frequent item among the "errors of omission," with 41 instances found in this study. One representative example was a study in which different concentrations of a drug were given to separate groups: although only the difference in the mean effects between the groups was compared and analyzed, the conclusion that "the effect was proportional to the dose of the drug" was drawn and reported. In such a case, an analysis that can demonstrate the correlation between dose and effect should be carried out to support that conclusion. Another example was the failure to state the number of animals or subjects clearly; descriptions such as "5-7 for each group" or "18-20 persons for each group" are incorrect. In addition, cases of "no evidence that the described methods were used" were found among the "errors of omission." If a statistical method was actually used, this should be explicitly stated, and the statistical method used for each analysis should be precisely identified and described rather than simply listing the names of statistical methods.
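As a hedged illustration of the dose-response example above, the sketch below uses made-up numbers and tests for a monotonic relationship between dose and effect with Spearman's rank correlation; other trend analyses, such as regression, could equally be used.

```python
# Hypothetical data: testing whether the effect is actually related to dose,
# rather than only comparing group means.
import numpy as np
from scipy import stats

dose = np.array([1, 1, 1, 2, 2, 2, 4, 4, 4])              # made-up doses per subject
effect = np.array([10, 12, 11, 15, 14, 16, 20, 22, 21])   # made-up responses

# Spearman's rank correlation tests for a monotonic dose-response relationship.
rho, p_value = stats.spearmanr(dose, effect)
print(f"rho = {rho:.2f}, p = {p_value:.4f}")
```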
A considerable number of instances of "incorrect analysis," an "error of commission," were identified in this study, including the following examples:
First, the errors found most frequently were "parametric inference for nonparametric data" (33.9%) and "chi-square test on data with an inappropriate sample size" (14.5%). This result is particularly significant for evaluating the quality not only of the individual articles but also of the Korean Journal of Pain itself. Before a parametric test is applied, the normality of the data should be verified, and when the expected cell counts of a contingency table are too small, Fisher's exact test should be used instead of the chi-square test.
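These two error types can be avoided with simple checks. The sketch below, using made-up data, first tests normality before choosing between a parametric and a nonparametric comparison, and then applies Fisher's exact test to a small 2 × 2 table where the chi-square test would be inappropriate.

```python
import numpy as np
from scipy import stats

# Made-up measurements for two groups; group_a is deliberately skewed.
group_a = np.array([2.1, 2.4, 2.2, 8.9, 2.3, 2.5, 2.2])
group_b = np.array([3.0, 3.1, 2.9, 3.2, 3.3, 3.0, 3.1])

# Check normality before choosing a parametric or nonparametric test.
_, p_a = stats.shapiro(group_a)
_, p_b = stats.shapiro(group_b)
if min(p_a, p_b) < 0.05:
    _, p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
else:
    _, p = stats.ttest_ind(group_a, group_b)
print(f"group comparison: p = {p:.4f}")

# A 2 x 2 table with small expected counts: use Fisher's exact test, not the chi-square test.
table = np.array([[3, 1],
                  [2, 4]])
odds_ratio, p_exact = stats.fisher_exact(table)
print(f"Fisher's exact test: p = {p_exact:.4f}")
```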
Second, 32 instances were found in which experimental data were expressed as "mean ± standard error." The standard error estimates how much the sample mean would vary if repeated samples of the same size were drawn from the population; it therefore describes the precision of the estimated population mean, not the spread of the observed data. Data observed by the researcher should thus be expressed as "mean ± standard deviation" rather than "mean ± standard error" [10].
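A minimal sketch with a made-up sample illustrates the distinction: the standard deviation describes the spread of the observed data, while the standard error describes the precision of the estimated mean.

```python
import numpy as np

# Made-up observations from a single sample.
sample = np.array([4.1, 5.3, 4.8, 5.9, 4.4, 5.1, 4.7, 5.5])

mean = sample.mean()
sd = sample.std(ddof=1)            # sample standard deviation: spread of the data
sem = sd / np.sqrt(len(sample))    # standard error: uncertainty of the mean

print(f"mean ± SD : {mean:.2f} ± {sd:.2f}")   # appropriate for describing the data
print(f"mean ± SEM: {mean:.2f} ± {sem:.2f}")  # appropriate for the precision of the mean
```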
Third, when comparing the means of three or more groups, it is necessary to identify which group differs by post hoc analysis; the error of concluding that a specific group had a different mean without this step was found in 14 cases. The parametric method for comparing the means of three or more groups is one-way analysis of variance (ANOVA), whose null hypothesis is that the means of all groups are equal (H0: µ1 = µ2 = µ3 = ··· = µn). Although ANOVA tests whether all group means are equal, it cannot identify which specific groups differ from one another. Therefore, if a difference is found, post hoc analysis is required to determine which groups differ from the others [11].
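As an illustration of this step, the sketch below (made-up measurements) runs a one-way ANOVA and, only if the overall test is significant, performs Bonferroni-corrected pairwise comparisons as the post hoc analysis; Tukey's HSD and other post hoc tests are equally common choices.

```python
import numpy as np
from scipy import stats

# Made-up measurements for three groups.
g1 = np.array([5.1, 4.9, 5.3, 5.0, 5.2])
g2 = np.array([5.8, 6.1, 5.9, 6.0, 6.2])
g3 = np.array([5.0, 5.2, 4.8, 5.1, 4.9])

# One-way ANOVA: H0 is that all group means are equal.
f_stat, p = stats.f_oneway(g1, g2, g3)
print(f"ANOVA: F = {f_stat:.2f}, p = {p:.4f}")

if p < 0.05:
    # Post hoc analysis: pairwise t-tests with Bonferroni correction show which groups differ.
    pairs = {"g1 vs g2": (g1, g2), "g1 vs g3": (g1, g3), "g2 vs g3": (g2, g3)}
    for name, (a, b) in pairs.items():
        _, p_pair = stats.ttest_ind(a, b)
        print(f"{name}: corrected p = {min(p_pair * len(pairs), 1.0):.4f}")
```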
Fourth, there were three cases in which the means of categorical variables were compared. For example, when patient satisfaction is measured in three classes (high, moderate, and low), it is not appropriate to assign arbitrary scores to each class and compare the means, because patient satisfaction is a categorical variable. In this case, a statistical method for categorical data must be applied. The most fundamental step in statistical analysis is to understand the type of variable being analyzed, because the analytical method depends on the scale of measurement.
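A brief sketch with hypothetical counts shows the appropriate approach: tabulate the categories and compare their distribution with a chi-square test instead of averaging arbitrary scores.

```python
import numpy as np
from scipy import stats

# Hypothetical counts. Rows: treatments A and B; columns: satisfaction high / moderate / low.
table = np.array([[18, 10, 4],
                  [9, 12, 11]])

chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.4f}")
```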
Fifth, whether the samples being compared are independent or paired is also important. A representative example is the test for paired samples: the paired t-test, which is frequently used to compare the degree of pain before and after a treatment, requires the two sets of measurements to have the same sample size, because it analyzes the differences within matched pairs from the same subjects.
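The sketch below, with made-up pain scores, shows such a paired comparison: each "before" value is matched to an "after" value from the same patient, so the two arrays must have the same length.

```python
import numpy as np
from scipy import stats

# Made-up pain scores from the same eight patients before and after treatment.
pain_before = np.array([7, 6, 8, 5, 7, 6, 8, 7])
pain_after = np.array([4, 3, 6, 4, 5, 3, 5, 4])

# Paired t-test: compares the within-patient differences, so the arrays must be the same length.
t_stat, p = stats.ttest_rel(pain_before, pain_after)
print(f"paired t = {t_stat:.2f}, p = {p:.4f}")
```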
The fact that only 20.9% of the articles published in the Korean Journal of Pain were free from statistical errors shows that statistical errors are widespread in the journal.
Analyses of the statistical errors in other Korean journals have reported similarly low proportions of error-free articles, for example 19.0% in one journal, which suggests that this problem is not unique to the Korean Journal of Pain.
It is important to differentiate major, fatal errors from minor statistical errors, since the former may raise serious questions about the validity and reliability of a study. At the same time, care should be taken not to devalue significant academic achievements by treating all statistical errors as equally serious and exaggerating minor errors; this balance should be kept in mind during the review process.
Although statistics is a vast field, only a handful of statistical methods are employed in most medical articles. According to an analysis of 1,828 medical articles, a reader who precisely understands descriptive statistics, the t-test, the chi-square test, and Fisher's exact test can understand and interpret 70% of medical articles [20]. Our results similarly show that the most frequently used methods in the Korean Journal of Pain were the t-test and the chi-square test; a precise understanding of these basic methods would therefore allow researchers to carry out and interpret most of the analyses their studies require.
In conclusion, we found many statistical errors in the articles published in the Korean Journal of Pain. Closer attention to statistical methodology by researchers, together with statistical review during the editorial process, would help to reduce these errors.
Each number represents the number of articles.
ANOVA: analysis of variance. Each different kind of statistical technique in the same article was counted separately.
Each different kind of error in the same article was counted separately. But two or more of the same kind of errors in an article were counted as one.
ANOVA: analysis of variance. Each different kind of error in the same article was counted separately. But two or more of the same kind of errors in an article were counted as one.