Wanna try the paired sample t-test and Wilcoxon signed-rank test?
Today, I am going to walk you through a paired sample t-test with an interesting example! (With a surprise turn to the Wilcoxon signed-rank test at the end :))
Research Story: Imagine that we have a face photo of someone (maybe you?:). Then, we manipulate this image and blur it. Now, we have both clear and blurred versions of that image. We ask people to rate these images in terms of attractiveness. We want to know if the attractiveness ratings given to these two images differ. Perhaps the blurred face image looks more attractive than the normal one OR not:)
What to do: Since the raters are the same, a paired t-test might be suitable for the analysis. However, it has some assumptions. The sampling distribution of the differences between ratings should be normal (we are going to test that), and the data should be measured at least on an interval level (we got this one for sure!).
Checking for the outliers: Let us first check if the data has outliers. To do that, we are going to use pandas, scipy, and matplotlib in Python. By the way, our data (ttestblog.csv) consists of two columns: Clear & Blurred.
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt

df = pd.read_csv('/Users/your path/ttestblog.csv')

# Check for outliers:
df[['Clear', 'Blurred']].plot(kind='box')
plt.savefig('Outliers.png')
The resulting boxplot looked like this:
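Alongside the boxplot, it can help to list the flagged values as numbers. Here is a minimal sketch of the 1.5 × IQR rule that boxplot whiskers conventionally follow. The helper name iqr_outliers and the toy ratings are made up for illustration; they are NOT the values from ttestblog.csv.

```python
import pandas as pd

def iqr_outliers(series, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return series[(series < lower) | (series > upper)]

# Made-up ratings for illustration only (not the blog's data)
toy = pd.DataFrame({
    'Clear':   [4, 5, 5, 6, 5, 4, 6, 5, 5, 10],
    'Blurred': [3, 4, 5, 6, 7, 4, 5, 6, 5, 5],
})

for col in ['Clear', 'Blurred']:
    print(col, iqr_outliers(toy[col]).tolist())
```

With your own file, you would call the helper on df['Clear'] and df['Blurred'] instead of the toy columns.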
Checking the normality assumption: As mentioned above, the paired sample t-test requires the differences between the ratings to be normally distributed. Therefore, we first need to calculate the difference between the attractiveness ratings given to the clear image and the blurred image. Then, we are going to draw a histogram showing the distribution of these differences.
df['difference'] = df['Clear'] - df['Blurred']

plt.figure()  # start a fresh figure so the histogram is not drawn over the boxplot
df['difference'].plot(kind='hist', title='Clear Image minus Blurred Image')
plt.savefig('Clear Image minus Blurred Image')
The resulting histogram:
Let us also draw a Q-Q plot:
plt.figure()  # start a fresh figure so the Q-Q plot is not drawn over the histogram
stats.probplot(df['difference'], plot=plt)
plt.title('Image Difference Q-Q Plot')
plt.savefig('Image Difference Q-Q Plot')
The resulting Q-Q plot:
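If you want a number to go with the Q-Q plot, stats.probplot also returns the correlation r between the ordered data and the theoretical normal quantiles (closer to 1 means closer to the straight line). A small sketch on synthetic samples, which are assumptions for illustration and not the ratings from ttestblog.csv:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
samples = {
    'normal-like': rng.normal(size=200),    # should hug the Q-Q line
    'skewed': rng.exponential(size=200),    # should bend away from it
}

qq_r = {}
for name, sample in samples.items():
    # Without plot=..., probplot only computes the quantiles and the line fit
    (osm, osr), (slope, intercept, r) = stats.probplot(sample)
    qq_r[name] = r
    print(f"{name}: r = {r:.3f}")
```

There is no strict cut-off for r, so treat it as a rough companion to the plot rather than a formal test.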
Normality check with the Shapiro-Wilk test:
normality = stats.shapiro(df['difference'])
print(normality)
The result: (0.6627291440963745, 2.613695088065653e-20), that is, the W statistic followed by the p-value. The test is significant at the conventional .05 level; therefore, the difference ratings are NOT normally distributed. Nay!
We should not be computing a paired sample t-test, but for the sake of the example, let us do it:
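To make the decision rule explicit, here is a hedged sketch that reads the statistic and p-value by name and compares the p-value with alpha. The alpha of 0.05 is the usual convention (an assumption, pick your own), and the skewed toy_diff sample is synthetic, not the blog's data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic, strongly skewed "differences" for illustration only
toy_diff = rng.exponential(scale=1.0, size=200)

result = stats.shapiro(toy_diff)
alpha = 0.05  # conventional significance level (an assumption)

print(f"W = {result.statistic:.3f}, p = {result.pvalue:.3g}")
if result.pvalue < alpha:
    print("Normality rejected: consider a non-parametric test")
else:
    print("No evidence against normality")
```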
ttest = stats.ttest_rel(df['Clear'], df['Blurred'])
print(ttest)
The result: statistic=0.8990311874310098, pvalue=0.36966086326571657.
So, if our difference data were normally distributed, we would conclude that the ratings given to the two images do not differ significantly (p ≈ 0.37).
Going back to our non-normality problem: What should we do when the difference data is not normally distributed? The Wilcoxon signed-rank test can be used instead. To do that:
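Why does the normality of the differences matter so much? Because the paired t-test is just a one-sample t-test on the differences: t = mean(d) / (sd(d) / sqrt(n)), with n - 1 degrees of freedom. The sketch below checks this by hand against stats.ttest_rel. The clear and blurred arrays are randomly generated stand-ins, not the ratings from ttestblog.csv.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Synthetic paired ratings for illustration only
clear = rng.normal(5.0, 1.0, size=30)
blurred = clear + rng.normal(0.0, 1.0, size=30)
diff = clear - blurred

# t statistic computed by hand from the differences
n = diff.size
t_manual = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))
# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

paired = stats.ttest_rel(clear, blurred)
print(t_manual, p_manual)
print(paired.statistic, paired.pvalue)
```

Both lines should print the same pair of numbers (up to floating-point rounding).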
import pandas as pd
from scipy import stats

df = pd.read_csv('/Users/yourpath/ttestblog.csv')
df['difference'] = df['Clear'] - df['Blurred']

result = stats.wilcoxon(df['difference'])
print(result)
The result was: statistic=1101.5, pvalue=0.6381557112285803. Therefore, the attractiveness ratings given to the clear and blurred versions of the face image did not differ significantly.
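A small side note on the call itself: stats.wilcoxon also accepts the two paired columns directly and computes the differences for you. By default, zero differences are dropped, and for the two-sided test the reported statistic is the smaller of the two rank sums (positive vs. negative differences). A minimal sketch with made-up integer ratings (again, not the blog's data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Made-up 1-7 ratings for illustration only
clear = rng.integers(1, 8, size=40).astype(float)
blurred = rng.integers(1, 8, size=40).astype(float)

on_diff = stats.wilcoxon(clear - blurred)   # pass the differences
on_pair = stats.wilcoxon(clear, blurred)    # pass both paired samples

print(on_diff.statistic, on_diff.pvalue)
print(on_pair.statistic, on_pair.pvalue)
```

Both calls should report the same statistic and p-value, so use whichever reads better in your script.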