ttest()

Description

Conducts a comparison test between two groups and returns the results as Pandas DataFrames: one with summary statistics and one with the results of the statistical test conducted.

This method can perform the following tests:

Independent sample t-test [1]

pseudo-code: ttest(group1, group2, equal_variances = True, paired = False)

Paired sample t-test [2]

pseudo-code: ttest(group1, group2, equal_variances = True, paired = True)

Welch’s t-test [1]

pseudo-code: ttest(group1, group2, equal_variances = False, paired = False)

Wilcoxon signed-rank test [3]

pseudo-code: ttest(group1, group2, equal_variances = False, paired = True)
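
The mapping from the two flags to the test that is run can be summarized in a small sketch. The helper name select_test is hypothetical and is not part of researchpy; it only mirrors the four combinations listed above.

```python
def select_test(equal_variances=True, paired=False):
    """Return the name of the test implied by the two flags.

    Hypothetical helper for illustration only; not a researchpy function.
    """
    if paired:
        # Paired data: t-test if equal variances are assumed, otherwise Wilcoxon
        return "Paired sample t-test" if equal_variances else "Wilcoxon signed-rank test"
    # Unpaired data: classic t-test if equal variances are assumed, otherwise Welch
    return "Independent sample t-test" if equal_variances else "Welch's t-test"


for ev in (True, False):
    for p in (False, True):
        print(f"equal_variances={ev}, paired={p} -> {select_test(ev, p)}")
```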

Note

Deprecation Warning

This function is scheduled for deprecation as the package is updated and streamlined.

Parameters

Input

ttest(group1, group2, group1_name= None, group2_name= None, equal_variances= True, paired= False, wilcox_parameters= {"zero_method": "pratt", "correction": False, "mode": "auto"}, welch_dof= "satterthwaite")

group1 and group2 : The data for each group. Each must be a Pandas Series.

group1_name and group2_name : Optional names that override the Series names in the output.

equal_variances : Whether equal variances are assumed. If equal variances are not assumed and the data are unpaired, Welch’s t-test is conducted using either the Satterthwaite or the Welch degrees of freedom (default is Satterthwaite).

paired : Whether the data are paired. If the data are paired and equal variances are assumed, a paired sample t-test is conducted. If the data are paired and equal variances are not assumed, a Wilcoxon signed-rank test is conducted.

wilcox_parameters : A dictionary containing the testing specifications for the Wilcoxon signed-rank test.

welch_dof : A string indicating which calculation to use for the degrees of freedom in Welch’s t-test. Can be either "welch" or "satterthwaite" (default).

Returns

By default, returns a tuple of two Pandas DataFrames. The first DataFrame contains the summary statistics and the second contains the test results.

DataFrame 1

(All except Wilcoxon signed-rank test) Contains summary statistics: variable name, number of non-missing observations, mean, standard deviation, standard error, and the 95% confidence interval. This is the same information returned by the summary_cont() method.

For the Wilcoxon signed-rank test, this contains descriptive information about the signed ranks.

DataFrame 2

(All except Wilcoxon signed-rank test) Contains the test results, including the effect size measures r, Cohen’s d, Hedges’s g, and Glass’s \(\Delta\) for the independent sample t-test, paired sample t-test, and Welch’s t-test.

For the Wilcoxon signed-rank test, the returned DataFrame contains the mean for both comparison points, the W-statistic, the Z-statistic, the two-sided p-value, and effect size measures of Pearson r and Rank-Biserial r.

Welch Degrees of freedom

There are two degrees of freedom options available when calculating Welch’s t-test. The default is the Satterthwaite (1946) calculation; the Welch (1947) calculation is available as an option.

Satterthwaite:

\[\frac{(\frac{s^2_x}{n_x} + \frac{s^2_y}{n_y})^2}{\frac{(\frac{s^2_x}{n_x})^2}{n_x-1} + \frac{(\frac{s^2_y}{n_y})^2}{n_y-1} }\]

Welch:

\[-2 + \frac{(\frac{s^2_x}{n_x} + \frac{s^2_y}{n_y})^2}{\frac{(\frac{s^2_x}{n_x})^2}{n_x+1} + \frac{(\frac{s^2_y}{n_y})^2}{n_y+1} }\]
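
As a minimal sketch, the two calculations can be written in plain Python. It takes the first formula (with n - 1 terms) as the Satterthwaite calculation and the second (with n + 1 terms and the -2 offset) as the Welch calculation. The function names are illustrative and are not researchpy functions; the inputs are the standard deviations and sample sizes reported in the Examples section.

```python
def satterthwaite_dof(s_x, n_x, s_y, n_y):
    """Satterthwaite (1946) degrees of freedom from SDs and sample sizes."""
    v_x = s_x ** 2 / n_x  # squared standard error of group x
    v_y = s_y ** 2 / n_y  # squared standard error of group y
    return (v_x + v_y) ** 2 / (v_x ** 2 / (n_x - 1) + v_y ** 2 / (n_y - 1))


def welch_dof(s_x, n_x, s_y, n_y):
    """Welch (1947) degrees of freedom from SDs and sample sizes."""
    v_x = s_x ** 2 / n_x
    v_y = s_y ** 2 / n_y
    return -2 + (v_x + v_y) ** 2 / (v_x ** 2 / (n_x + 1) + v_y ** 2 / (n_y + 1))


# SDs and Ns taken from the summary table in the Examples section
print(satterthwaite_dof(2.749086, 100, 3.132495, 100))
print(welch_dof(2.749086, 100, 3.132495, 100))
```

With these inputs the Satterthwaite value comes out at roughly 194.72, in line with the degrees of freedom shown in the Welch’s t-test example.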

Effect Size Measures Formulas

Cohen’s d_s (between subjects design)

Cohen’s d_s [4] for a between groups design is calculated with the following equation:

\[d_s = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{(n_1 - 1)SD^2_1 + (n_2 - 1)SD^2_2}{n_1 + n_2 - 2}}}\]
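
The equation above can be sketched in a few lines of Python. The function name cohens_d_s is an illustrative assumption, not a researchpy function, and the inputs are the means, standard deviations, and sample sizes from the Examples section.

```python
import math


def cohens_d_s(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d_s: mean difference divided by the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


d = cohens_d_s(4.59, 2.749086, 100, 4.16, 3.132495, 100)
print(round(d, 4))  # 0.1459
```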

Hedges’s g_s (between subjects design)

Cohen’s d_s gives a biased estimate of the population effect size; Hedges and Olkin [5] provide an unbiased estimate. The difference between Hedges’s g and Cohen’s d is negligible when sample sizes are above 20, but it is still preferable to report Hedges’s g [6]. Hedges’s g_s is calculated using the following formula:

\[\text{Hedges's g}_s = \text{Cohen's d}_s \times (1 - \frac{3}{4(n_1 + n_2) - 9})\]
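
A short sketch of the correction, using the small-sample factor 1 - 3 / (4(n_1 + n_2) - 9). The function name hedges_g_s is illustrative only; the Cohen’s d_s value passed in is the rounded figure from the independent t-test example, so the result is approximate.

```python
def hedges_g_s(d_s, n1, n2):
    """Apply the small-sample bias correction to Cohen's d_s."""
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d_s * correction


# d_s is the rounded Cohen's d from the independent t-test example
g = hedges_g_s(0.1459, 100, 100)
print(g)
```

With 100 observations per group the correction factor is about 0.996, which is why g and d differ only in the fourth decimal place in the example output.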

Glass’s \(\Delta\) (between or within subjects design)

Glass’s \(\Delta\) is the mean difference between the two groups divided by the standard deviation of the control group. When used in a within subjects design, it is recommended to use the pre-measurement standard deviation in the denominator [7]. The following formula is used to calculate Glass’s \(\Delta\):

\[\Delta = \frac{(\bar{x}_1 - \bar{x}_2)}{SD_1}\]
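
A minimal sketch of this calculation, assuming the first group supplies the standard deviation in the denominator as in the formula. The function name glass_delta is illustrative and not a researchpy function.

```python
def glass_delta(mean1, mean2, sd1):
    """Glass's delta: mean difference divided by the SD of the first group."""
    return (mean1 - mean2) / sd1


# Means and SD of the first group taken from the Examples section
print(round(glass_delta(4.59, 4.16, 2.749086), 4))  # 0.1564
```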

Cohen’s d_av (within subject design)

Another version of Cohen’s d is used in within subject designs. This is noted by the subscript “av”. The formula for Cohen’s d_av [7] is as follows:

\[d_{av} = \frac{M_{diff}}{\frac{SD_{1} + SD_{2}}{2}}\]
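
The d_av formula can be sketched as follows. The function name cohens_d_av is illustrative; the inputs are the mean difference and the two standard deviations from the paired example data, and the result is not claimed to match any particular row of the example output.

```python
def cohens_d_av(mean_diff, sd1, sd2):
    """Cohen's d_av: mean difference divided by the average of the two SDs."""
    return mean_diff / ((sd1 + sd2) / 2)


print(cohens_d_av(0.43, 2.749086, 3.132495))
```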

Pearson correlation coefficient r (between or within subjects design)

Rosenthal [8] provided the following formula to calculate the Pearson correlation coefficient r using the t-value and degrees of freedom:

\[r = \sqrt{\frac{t^2}{t^2 + df}}\]

Rosenthal [8] also provided the following formula to calculate the Pearson correlation coefficient r using the Z-value and N. This formula is used to calculate the r coefficient for the Wilcoxon signed-rank test. Note that N is the total number of observations.

\[r = \frac{Z}{\sqrt{N}}\]
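
Both conversions are one-liners. The sketch below uses illustrative function names (not researchpy functions) and plugs in the t, degrees of freedom, Z, and N values reported in the Examples section.

```python
import math


def r_from_t(t, df):
    """Pearson r from a t-value and its degrees of freedom."""
    return math.sqrt(t ** 2 / (t ** 2 + df))


def r_from_z(z, n):
    """Pearson r from a Z-value and the total number of observations."""
    return z / math.sqrt(n)


print(round(r_from_t(1.0317, 198), 4))  # 0.0731 (independent t-test example)
print(round(r_from_z(1.0411, 100), 4))  # 0.1041 (Wilcoxon example)
```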

Rank-Biserial correlation coefficient r (between or within subjects design)

The Rank-Biserial r [9] is also provided for the Wilcoxon signed-rank test and is calculated as:

\[\text{Rank-Biserial } r = \frac{\sum \text{Ranks}_{+} - \sum \text{Ranks}_{-}}{\sum \text{Ranks}_{total}}\]
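
As a quick illustration, the rank sums from the Wilcoxon example in the Examples section can be plugged into this formula. The function name rank_biserial_r is an assumption for the sketch, not a researchpy function.

```python
def rank_biserial_r(sum_ranks_pos, sum_ranks_neg, sum_ranks_total):
    """Rank-biserial r from the positive, negative, and total rank sums."""
    return (sum_ranks_pos - sum_ranks_neg) / sum_ranks_total


# Rank sums from the Wilcoxon signed-rank example
print(round(rank_biserial_r(2804.5, 2200.5, 5050.0), 4))  # 0.1196
```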

Examples

Loading Packages and Data

import numpy
import pandas
import researchpy

numpy.random.seed(12345678)

df = pandas.DataFrame(numpy.random.randint(10, size= (100, 2)),
                  columns= ['healthy', 'non-healthy'])

Independent t-test

# Independent t-test

# If the two returned DataFrames are not stored, they are displayed
# as a tuple
researchpy.ttest(df['healthy'], df['non-healthy'])

(      Variable      N   Mean        SD        SE  95% Conf.  Interval
    healthy  100.0  4.590  2.749086  0.274909   4.044522  5.135478
non-healthy  100.0  4.160  3.132495  0.313250   3.538445  4.781555
   combined  200.0  4.375  2.947510  0.208420   3.964004  4.785996,
                                  Independent t-test   results
           Difference (healthy - non-healthy) =     0.4300
                           Degrees of freedom =   198.0000
                                            t =     1.0317
                        Two side test p value =     0.3035
                       Difference < 0 p value =     0.8483
                       Difference > 0 p value =     0.1517
                                    Cohen's d =     0.1459
                                    Hedge's g =     0.1454
                                Glass's delta =     0.1564
                                            r =     0.0731)

# Otherwise you can store them as objects
des, res = researchpy.ttest(df['healthy'], df['non-healthy'])

des

	Variable	N	Mean	SD	SE	95% Conf.	Interval
0	healthy	100.0	4.590	2.749086	0.274909	4.044522	5.135478
1	non-healthy	100.0	4.160	3.132495	0.313250	3.538445	4.781555
2	combined	200.0	4.375	2.947510	0.208420	3.964004	4.785996

res

	Independent t-test	results
0	Difference (healthy - non-healthy) =	0.4300
1	Degrees of freedom =	198.0000
2	t =	1.0317
3	Two side test p value =	0.3035
4	Difference < 0 p value =	0.8483
5	Difference > 0 p value =	0.1517
6	Cohen's d =	0.1459
7	Hedge's g =	0.1454
8	Glass's delta =	0.1564
9	r =	0.0731
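
As a rough cross-check, the t statistic and degrees of freedom in the results table can be recomputed from the summary statistics with the textbook pooled-variance formula. This is a sketch of the standard calculation, not necessarily how researchpy computes it internally, and the function name is illustrative.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic and df, assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df


t, df = pooled_t(4.59, 2.749086, 100, 4.16, 3.132495, 100)
print(round(t, 4), df)  # 1.0317 198
```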

Paired Sample t-test

# Paired samples t-test
des, res = researchpy.ttest(df['healthy'], df['non-healthy'],
                            paired= True)

des

	Variable	N	Mean	SD	SE	95% Conf.	Interval
0	healthy	100.0	4.59	2.749086	0.274909	4.044522	5.135478
1	non-healthy	100.0	4.16	3.132495	0.313250	3.538445	4.781555
2	diff	100.0	0.43	4.063275	0.406327	-0.376242	1.236242

res

	Paired samples t-test	results
0	Difference (healthy - non-healthy) =	0.4300
1	Degrees of freedom =	99.0000
2	t =	1.0583
3	Two side test p value =	0.2925
4	Difference < 0 p value =	0.8537
5	Difference > 0 p value =	0.1463
6	Cohen's d =	0.1058
7	Hedge's g =	0.1054
8	Glass's delta =	0.1564
9	r =	0.1058
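
The paired t statistic can likewise be recovered from the diff row of the summary table: it is the mean difference divided by its standard error, with n - 1 degrees of freedom. The sketch below uses the textbook formula and an illustrative function name; it is not taken from the researchpy source.

```python
import math


def paired_t(mean_diff, sd_diff, n):
    """Paired-samples t statistic and df from the differences' mean and SD."""
    se_diff = sd_diff / math.sqrt(n)
    return mean_diff / se_diff, n - 1


t, df = paired_t(0.43, 4.063275, 100)
print(round(t, 4), df)  # 1.0583 99
```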

Welch’s t-test

# Welch's t-test
des, res = researchpy.ttest(df['healthy'], df['non-healthy'],
                            equal_variances= False)

des

	Variable	N	Mean	SD	SE	95% Conf.	Interval
0	healthy	100.0	4.590	2.749086	0.274909	4.044522	5.135478
1	non-healthy	100.0	4.160	3.132495	0.313250	3.538445	4.781555
2	combined	200.0	4.375	2.947510	0.208420	3.964004	4.785996

res

	Welch's t-test	results
0	Difference (healthy - non-healthy) =	0.4300
1	Degrees of freedom =	194.7181
2	t =	1.0317
3	Two side test p value =	0.3035
4	Difference < 0 p value =	0.8483
5	Difference > 0 p value =	0.1517
6	Cohen's d =	0.1459
7	Hedge's g =	0.1454
8	Glass's delta =	0.1564
9	r =	0.0737

Wilcoxon Signed-Rank Test

# Wilcoxon signed-rank test
desc, res = researchpy.ttest(df['healthy'], df['non-healthy'],
                             equal_variances= False, paired= True)

sign	obs	sum ranks	expected
positive	52	2,804.5000	2,502.5000
negative	39	2,200.5000	2,502.5000
zero	9	45.0000	45.0000
all	100	5,050.0000	5,050.0000

Wilcoxon signed-rank test	results
Mean for healthy =	4.5900
Mean for non-healthy =	4.1600
W value =	2,200.5000
Z value =	1.0411
p value =	0.2978
Rank-Biserial r =	0.1196
Pearson r =	0.1041