# difference_test()¶

Conducts one of several statistical tests for a difference between independent or related samples, with or without the assumption of equal variances, and can optionally calculate the effect size of the observed difference. Results are returned in a pandas DataFrame by default, but can be returned as a dictionary if specified.

This method is similar to researchpy.ttest(), except it allows the user to use formula syntax.

This method can perform the following tests:

- Independent sample t-test []

  pseudo-code: `difference_test(formula_like, data, equal_variances = True, independent_samples = True)`

- Paired sample t-test []

  pseudo-code: `difference_test(formula_like, data, equal_variances = True, independent_samples = False)`

- Welch's t-test []

  pseudo-code: `difference_test(formula_like, data, equal_variances = False, independent_samples = True)`

- Wilcoxon signed-rank test []

  By default, all zero-differences are discarded; this is known as the "wilcox" method.

  pseudo-code: `difference_test(formula_like, data, equal_variances = False, independent_samples = False)`

Two objects will be returned for all available tests: the first will be a descriptive summary table and the second will be the test result information, which will include the effect size measures if indicated.
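For orientation, the four flag combinations line up with the following scipy.stats routines, which researchpy builds on (the sample data below is made up for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, 30)
b = rng.normal(0.5, 1.0, 30)

# equal_variances=True,  independent_samples=True  -> independent t-test
t_ind, p_ind = stats.ttest_ind(a, b, equal_var=True)

# equal_variances=True,  independent_samples=False -> paired t-test
t_rel, p_rel = stats.ttest_rel(a, b)

# equal_variances=False, independent_samples=True  -> Welch's t-test
t_welch, p_welch = stats.ttest_ind(a, b, equal_var=False)

# equal_variances=False, independent_samples=False -> Wilcoxon signed-rank test
w, p_w = stats.wilcoxon(a, b, zero_method="wilcox")
```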

## Arguments¶

**difference_test(formula_like, data = {}, conf_level = 0.95, equal_variances = True, independent_samples = True, wilcox_parameters = {"zero_method" : "wilcox", "correction" : False, "mode" : "auto"}, **keywords)**

formula_like: A valid formula; for example, "DV ~ IV".

data: data to perform the analysis on - contains the dependent and independent variables.

conf_level: Specify the confidence interval to be calculated.

equal_variances: Boolean to indicate if equal variances are assumed.

independent_samples: Boolean to indicate if groups are independent of each other.

wilcox_parameters: A dictionary with optional methods for calculating the Wilcoxon signed-rank test. For more information, see scipy.stats.wilcoxon.
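These keys are forwarded to scipy.stats.wilcoxon, so their effect can be previewed with scipy directly. A sketch comparing the default "wilcox" zero-handling with "pratt" (the sample data is made up):

```python
import numpy as np
from scipy import stats

before = np.array([5, 7, 7, 6, 4, 5, 6, 8])
after  = np.array([5, 6, 8, 5, 3, 5, 7, 6])  # two pairs tie (zero difference)

# "wilcox" (the default) drops zero differences before ranking;
# "pratt" keeps them in the ranking, which changes W and the p-value
w_wilcox, p_wilcox = stats.wilcoxon(before, after, zero_method="wilcox")
w_pratt,  p_pratt  = stats.wilcoxon(before, after, zero_method="pratt")
```

With researchpy, the same choice would be made by passing, e.g., `wilcox_parameters = {"zero_method" : "pratt"}` to difference_test().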

**conduct(return_type = "Dataframe", effect_size = None)**

return_type: Specify whether the results should be returned as a pandas DataFrame (the default, "Dataframe") or a Python dictionary ("Dictionary").

effect_size: Specify which effect sizes should be calculated; the default value is None. Available options are: None, "Cohen's D", "Hedge's G", "Glass's delta1", "Glass's delta2", "r", and "all".

Users can specify any combination of effect sizes, or use "all" to calculate all of them.

Only effect size "r" is supported for the Wilcoxon signed-rank test.

**returns**

- Two objects will be returned within a tuple:

First object will be the descriptive summary information.

Second object will be the statistical testing information.

Note

This can be a one-step or a two-step process.

**One step**

```
difference_test("DV ~ IV", data).conduct()
```

**Two step**

```
model = difference_test("DV ~ IV", data)
model.conduct()
```

### Effect size measures formulas¶

#### Cohen’s d_{s} (between subjects design)¶

Cohen's d_{s} [] for a between-groups design is calculated
with the following equation:

\[d_{s} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{(n_1 - 1)SD_1^2 + (n_2 - 1)SD_2^2}{n_1 + n_2 - 2}}}\]
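A minimal numpy sketch of this pooled-standard-deviation computation (the sample values are made up):

```python
import numpy as np

def cohens_ds(x1, x2):
    """Cohen's d_s for two independent samples (pooled SD, ddof=1)."""
    n1, n2 = len(x1), len(x2)
    v1, v2 = np.var(x1, ddof=1), np.var(x2, ddof=1)
    pooled_sd = np.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (np.mean(x1) - np.mean(x2)) / pooled_sd

group_a = np.array([4.0, 5.0, 6.0, 5.0, 4.0])
group_b = np.array([3.0, 4.0, 4.0, 3.0, 5.0])
print(round(cohens_ds(group_a, group_b), 4))  # -> 1.1952
```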

#### Cohen’s d_{av} (within subject design)¶

Another version of Cohen's d is used in within-subject designs, noted
by the subscript "av". The formula for Cohen's d_{av} [] is
as follows:

\[d_{av} = \frac{M_{diff}}{\dfrac{SD_1 + SD_2}{2}}\]

#### Hedges’s g_{s} (between subjects design)¶

Cohen's d_{s} gives a biased estimate of the effect size for a population,
and Hedges and Olkin [] provide an unbiased estimation. The
difference between Hedges's g and Cohen's d is negligible when sample sizes
are above 20, but it is still preferable to report Hedges's g [].
Hedges's g_{s} is calculated using the following formula:

\[g_{s} = d_{s} \times \left(1 - \frac{3}{4(n_1 + n_2) - 9}\right)\]
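The correction factor can be applied directly to a previously computed d_{s}; a small sketch (the input values are made up) showing that the correction shrinks the estimate toward zero and matters less as the samples grow:

```python
def hedges_gs(d_s, n1, n2):
    """Small-sample correction factor applied to Cohen's d_s."""
    return d_s * (1 - 3 / (4 * (n1 + n2) - 9))

print(round(hedges_gs(0.5, 10, 10), 4))    # -> 0.4789
print(round(hedges_gs(0.5, 100, 100), 4))  # -> 0.4981
```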

#### Hedges’s g_{av} (within subjects design)¶

Cohen's d_{av} gives a biased estimate of the effect size for a population,
and Hedges and Olkin [] provide a correction that can be applied to obtain an unbiased estimate.
Hedges's g_{av} is calculated using the following formula []:

#### Glass’s \(\Delta\) (between or within subjects design)¶

Glass's \(\Delta\) is the mean difference between the two groups divided by the standard deviation of the first condition/group (\(\Delta_1\)) or of the second condition/group (\(\Delta_2\)). When used in a within-subjects design, it is recommended to use the pre-measurement standard deviation in the denominator []. The following formulas are used to calculate Glass's \(\Delta\):

\[\Delta_1 = \frac{\bar{x}_1 - \bar{x}_2}{SD_1} \qquad \Delta_2 = \frac{\bar{x}_1 - \bar{x}_2}{SD_2}\]
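A sketch of both denominator choices in numpy (the pre/post values are made up; `denominator=1` uses the first argument's SD, `denominator=2` the second's):

```python
import numpy as np

def glass_delta(x1, x2, denominator=1):
    """Glass's delta: mean difference over one group's SD (ddof=1)."""
    sd = np.std(x1 if denominator == 1 else x2, ddof=1)
    return (np.mean(x1) - np.mean(x2)) / sd

pre  = np.array([10.0, 12.0, 11.0,  9.0, 13.0])
post = np.array([12.0, 15.0, 13.0, 11.0, 16.0])

# within-subjects: recommended to standardize by the pre-measurement SD
print(round(glass_delta(post, pre, denominator=2), 4))  # -> 1.5179
```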

#### Point-Biserial correlation coefficient r (between or within subjects design)¶

The following formula is used to calculate the Point-Biserial correlation coefficient r from the t-value and degrees of freedom:

\[r = \sqrt{\frac{t^2}{t^2 + df}}\]

The following formula is used to calculate the Point-Biserial correlation coefficient r from the W-value and N; this formula is used to calculate the r coefficient for the Wilcoxon signed-rank test:

\[r = \frac{W}{\sum{\text{rank}}}\]
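Both coefficients can be checked numerically; computing the W-based r as W divided by the total rank sum N(N+1)/2 reproduces the value reported in the Wilcoxon example below:

```python
import math

def r_from_t(t, df):
    """Point-biserial r from a t statistic and its degrees of freedom."""
    return math.sqrt(t ** 2 / (t ** 2 + df))

def r_from_w(w, n):
    """r for the Wilcoxon signed-rank test: W over the total rank sum."""
    return w / (n * (n + 1) / 2)

# values taken from the worked examples below
print(round(r_from_t(1.031736, 198), 4))  # -> 0.0731
print(round(r_from_w(1849.5, 100), 4))    # -> 0.3662
```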

## Examples¶

First let’s create an example data set to work through the examples. This will be done using numpy (to create fake data) and pandas (to hold the data in a data frame).

```
import numpy, pandas, researchpy

numpy.random.seed(12345678)

df = pandas.DataFrame(numpy.random.randint(10, size = (100, 2)),
                      columns = ['No', 'Yes'])
df["id"] = range(1, df.shape[0] + 1)
df.head()
```

```
   No  Yes  id
0   3    2   1
1   4    1   2
2   0    1   3
3   8    2   4
4   6    6   5
```

If one has data in this wide format and doesn't want to reshape it, then *researchpy.difference_test()* will not work and
*researchpy.ttest()* should be used instead. However, researchpy is moving in the
direction of formula-syntax input, so it is recommended to get comfortable with this
approach if one plans to use researchpy in the future.

Currently the data is in a wide format, but it needs to be in a long format, i.e. one variable holding the dependent-variable data and another holding the independent-variable data. There are a few ways to reshape it; one is shown below.

```
df2 = pandas.melt(df, id_vars = "id", value_vars = ["No", "Yes"],
                  var_name = "Exercise", value_name = "StressReactivity")
df2.head()
```

```
   id Exercise  StressReactivity
0   1       No                 3
1   2       No                 4
2   3       No                 0
3   4       No                 8
4   5       No                 6
```

Now the data is in the correct structure.

```
# Independent t-test
# If the 2 returned DataFrames are not stored, they are output as a tuple
# and displayed
researchpy.difference_test("StressReactivity ~ C(Exercise)",
                           data = df2,
                           equal_variances = True,
                           independent_samples = True).conduct(effect_size = "all")
```

```
(   Variable      N   Mean        SD        SE  95% Conf.  Interval
 0        No  100.0  4.590  2.749086  0.274909   4.044522  5.135478
 1       Yes  100.0  4.160  3.132495  0.313250   3.538445  4.781555
 2  combined  200.0  4.375  2.947510  0.208420   3.964004  4.785996,
          Independent t-test   results
 0   Difference (No - Yes) =    0.4300
 1      Degrees of freedom =  198.0000
 2                       t =    1.0317
 3   Two side test p value =    0.3035
 4  Difference < 0 p value =    0.8483
 5  Difference > 0 p value =    0.1517
 6               Cohen's d =    0.1459
 7               Hedge's g =    0.1454
 8           Glass's delta =    0.1564
 9                       r =    0.0731)
```

```
# Otherwise you can store them as objects
summary, results = researchpy.difference_test("StressReactivity ~ C(Exercise)",
                                              data = df2,
                                              equal_variances = True,
                                              independent_samples = True).conduct(effect_size = "all")
summary
```

```
       Name    N   Mean  Variance       SD        SE  95% Conf.  Interval
0        No  100  4.590   7.55747  2.74909  0.274909   4.044522  5.135478
1       Yes  100  4.160   9.81253  3.13250  0.313250   3.538445  4.781555
2  combined  200  4.375   8.68781  2.94751  0.208420   3.964004  4.785996
3      diff       0.430                     0.416773  -0.391884  1.251884
```

```
results
```

```
Independent samples t-test Results
0 Difference (No - Yes) 0.430000
1 Degrees of freedom = 198.000000
2 t = 1.031736
3 Two sided test p-value = 0.303454
4 Difference < 0 p-value = 0.848273
5 Difference > 0 p-value = 0.151727
6 Cohen's Ds 0.145909
7 Hedge's G 0.145356
8 Glass's delta1 0.156416
9 Glass's delta2 0.137271
10 Point-Biserial r 0.073126
```

```
# Paired samples t-test
summary, results = researchpy.difference_test("StressReactivity ~ C(Exercise)",
                                              data = df2,
                                              equal_variances = True,
                                              independent_samples = False).conduct(effect_size = "all")
summary
```

```
   Name    N  Mean  Variance        SD        SE  95% Conf.  Interval
0    No  100  4.59   7.55747  2.749086  0.274909   4.044522  5.135478
1   Yes  100  4.16   9.81253  3.132495  0.313250   3.538445  4.781555
3  diff        0.43            4.063275  0.406327  -0.376242  1.236242
```

```
results
```

```
Paired samples t-test Results
0 Difference (No - Yes) 0.430000
1 Degrees of freedom = 99.000000
2 t = 1.058260
3 Two sided test p-value = 0.292512
4 Difference < 0 p-value = 0.853744
5 Difference > 0 p-value = 0.146256
6 Cohen's Dav 0.146219
7 Hedge's Gav 0.145665
8 Glass's delta1 0.156416
9 Glass's delta2 0.137271
10 Point-Biserial r 0.105763
```

```
# Welch's t-test
summary, results = researchpy.difference_test("StressReactivity ~ C(Exercise)",
                                              data = df2,
                                              equal_variances = False,
                                              independent_samples = True).conduct(effect_size = "all")
summary
```

```
       Name    N   Mean  Variance       SD        SE  95% Conf.  Interval
0        No  100  4.590   7.55747  2.74909  0.274909   4.044522  5.135478
1       Yes  100  4.160   9.81253  3.13250  0.313250   3.538445  4.781555
2  combined  200  4.375   8.68781  2.94751  0.208420   3.964004  4.785996
3      diff       0.430                     0.416773  -0.391919  1.251919
```

```
results
```

```
Welch's t-test Results
0 Difference (No - Yes) 0.430000
1 Degrees of freedom = 196.651845
2 t = 1.031736
3 Two sided test p-value = 0.303476
4 Difference < 0 p-value = 0.848268
5 Difference > 0 p-value = 0.151732
6 Cohen's Ds 0.145909
7 Hedge's G 0.145356
8 Glass's delta1 0.156416
9 Glass's delta2 0.137271
10 Point-Biserial r 0.073375
```

```
# Wilcoxon signed-rank test
summary, results = researchpy.difference_test("StressReactivity ~ C(Exercise)",
                                              data = df2,
                                              equal_variances = False,
                                              independent_samples = False).conduct(effect_size = "r")
summary
```

```
  Name    N  Mean  Variance       SD        SE  95% Conf.  Interval
0   No  100  4.59   7.55747  2.74909  0.274909   4.044522  5.135478
1  Yes  100  4.16   9.81253  3.13250  0.313250   3.538445  4.781555
```

```
results
```

```
Wilcoxon signed-rank test Results
0 (No = Yes)
1 W = 1849.5
2 Two sided p-value = 0.333755
3 Point-Biserial r 0.366238
```

```
# Exporting the descriptive table (summary) and the result table (results)
# to the same csv file
summary.to_csv("C:\\Users\\...\\test.csv", index = False)
results.to_csv("C:\\Users\\...\\test.csv", index = False, mode = 'a')
```

## References¶

*scipy.stats.chi2_contingency*. The SciPy community, 2016. Retrieved when last updated May 12, 2016. URL: http://lagrange.univ-lyon1.fr/docs/scipy/0.17.1/generated/scipy.stats.chi2_contingency.html.

*scipy.stats.fisher_exact*. The SciPy community, 2018. Retrieved when last updated May 5, 2018. URL: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.fisher_exact.html.

*scipy.stats.kendalltau*. The SciPy community, 2018. Retrieved when last updated May 5, 2018. URL: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kendalltau.html.

*scipy.stats.pearsonr*. The SciPy community, 2018. Retrieved when last updated May 11, 2014. URL: https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.pearsonr.html.

*scipy.stats.spearmanr*. The SciPy community, 2018. Retrieved when last updated May 11, 2014. URL: https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.spearmanr.html.

*scipy.stats.ttest_ind*. The SciPy community, 2018. Retrieved when last updated May 5, 2018. URL: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html.

*scipy.stats.ttest_rel*. The SciPy community, 2018. Retrieved when last updated May 5, 2018. URL: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_rel.html.

*scipy.stats.wilcoxon*. The SciPy community, 2018. Retrieved when last updated May 5, 2018. URL: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.wilcoxon.html.

*statsmodels.stats.contingency_tables.mcnemar*. Statsmodels-developers, 2018. URL: https://www.statsmodels.org/dev/generated/statsmodels.stats.contingency_tables.mcnemar.html.

Jacob Cohen. *Statistical Power Analysis for the Behavioral Sciences*. Lawrence Erlbaum Associates, second edition, 1988. ISBN 0-8058-0283-5.

Harald Cramér. *Mathematical Methods of Statistics (PMS-9)*. Volume 9. Princeton University Press, 2016.

Larry Hedges and Ingram Olkin. *Statistical Methods in Meta-Analysis*. Academic Press, Inc., 1985. doi:10.2307/1164953.

Rex B. Kline. *Beyond Significance Testing: Reforming Data Analysis Methods in Behavioral Research*. American Psychological Association, 2004. URL: http://dx.doi.org/10.1037/10693-000.

Daniel Lakens. Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. *Frontiers in Psychology*, November 2013. doi:10.3389/fpsyg.2013.00863.

Robert Rosenthal. Parametric measures of effect size. In *The Handbook of Research Synthesis*, pages 231–244. New York, NY: Russell Sage Foundation, 1994.