summary_cont()
==============

Returns a summary table as a Pandas DataFrame that includes the variable name,
the number of non-missing observations, the mean, the standard deviation, the
standard error, and the confidence interval (95% by default). It accepts Pandas
Series, DataFrame, and GroupBy objects.

Arguments
---------

**summary_cont(group1, conf = 0.95, decimals = 4)**

* **group1**: a Pandas Series, a DataFrame with one or more columns selected,
  or a GroupBy object.
* **conf**: the confidence level, entered as a decimal (for example, 0.95).
  The default is a 95% confidence interval.
* **decimals**: the number of decimal places to which the output table is
  rounded.

**returns**

* Pandas DataFrame

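
The columns described above can be reproduced by hand, which may help when
checking results. The following is a minimal sketch, not researchpy's
implementation: ``manual_summary`` is a hypothetical helper, and it assumes the
interval is the usual two-sided t-based interval around the mean with
``N - 1`` degrees of freedom.

.. code:: python

    import numpy as np
    import pandas as pd
    from scipy import stats


    def manual_summary(series, conf=0.95, decimals=4):
        """Hypothetical helper, not part of researchpy."""
        values = series.dropna()          # keep non-missing observations only
        n = int(values.count())
        mean = values.mean()
        sd = values.std(ddof=1)           # sample standard deviation
        se = sd / np.sqrt(n)              # standard error of the mean
        # Assumption: two-sided t critical value with n - 1 degrees of freedom
        t_crit = stats.t.ppf((1 + conf) / 2, df=n - 1)
        table = pd.DataFrame({
            "Variable": [series.name],
            "N": [n],
            "Mean": [mean],
            "SD": [sd],
            "SE": [se],
            f"{conf:.0%} Conf.": [mean - t_crit * se],
            "Interval": [mean + t_crit * se],
        })
        # round() only affects the numeric columns
        return table.round(decimals)


    example = pd.Series([2, 4, 4, 4, 5, 5, 7, 9, None], name="score")
    print(manual_summary(example))

The missing value is dropped before counting, so ``N`` is 8 for this input.
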
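Examples
--------

.. code:: python

    import numpy, pandas, researchpy

    numpy.random.seed(12345678)

    # Two integer-valued outcome columns with 100 rows each
    df = pandas.DataFrame(numpy.random.randint(10, size= (100, 2)),
                          columns= ['healthy', 'non-healthy'])

    # Treatment arm. .loc slices include both endpoints, so row 50 is first
    # labeled "Placebo" and then overwritten with "Experimental".
    df['tx'] = ""
    df.loc[0:50, 'tx'] = "Placebo"
    df.loc[50:101, 'tx'] = "Experimental"

    # Dose level, assigned the same way (later slices overwrite shared rows)
    df['dose'] = ""
    df.loc[0:26, 'dose'] = "10 mg"
    df.loc[26:51, 'dose'] = "25 mg"
    df.loc[51:76, 'dose'] = "10 mg"
    df.loc[76:101, 'dose'] = "25 mg"
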
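
Because ``.loc`` slices by label and includes both endpoints, the overlapping
slices above do not split the rows into equal groups; each later assignment
overwrites the rows it shares with an earlier one. The sketch below checks the
resulting group sizes using only pandas. It rebuilds just the ``tx`` and
``dose`` columns, and the names ``groups`` and ``sizes`` are illustrative.

.. code:: python

    import pandas as pd

    # Same labeling steps as above, without the random outcome columns
    groups = pd.DataFrame(index=range(100))
    groups['tx'] = ""
    groups.loc[0:50, 'tx'] = "Placebo"
    groups.loc[50:101, 'tx'] = "Experimental"
    groups['dose'] = ""
    groups.loc[0:26, 'dose'] = "10 mg"
    groups.loc[26:51, 'dose'] = "25 mg"
    groups.loc[51:76, 'dose'] = "10 mg"
    groups.loc[76:101, 'dose'] = "25 mg"

    # Count rows in each treatment-by-dose cell
    sizes = pd.crosstab(groups['tx'], groups['dose'])
    print(sizes)

The Placebo arm ends up with 26 rows at 10 mg and 24 at 25 mg, while the
Experimental arm has 25 rows at each dose. These are the group sizes reported
in the grouped examples.
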
.. code:: python

    # Summary statistics for a Series (single variable)
    researchpy.summary_cont(df['healthy'])

.. csv-table::
    :header: "", "Variable", "N", "Mean", "SD", "SE", "95% Conf.", "Interval"

    0, "healthy", 100.0, 4.59, 2.749086, 0.274909, 4.044522, 5.135478

.. code:: python

    # Summary statistics for multiple Series
    researchpy.summary_cont(df[['healthy', 'non-healthy']])

.. csv-table::
    :header: "", "Variable", "N", "Mean", "SD", "SE", "95% Conf.", "Interval"

    0, "healthy", 100.0, 4.59, 2.749086, 0.274909, 4.044522, 5.135478
    1, "non-healthy", 100.0, 4.16, 3.132495, 0.313250, 3.538445, 4.781555

.. code:: python

    # The result is a Pandas DataFrame, so it can be assigned to a variable
    # and exported like any other DataFrame
    results = researchpy.summary_cont(df[['healthy', 'non-healthy']])
    results.to_csv("results.csv", index= False)

.. code:: python

    # This works with GroupBy objects as well
    researchpy.summary_cont(df['healthy'].groupby(df['tx']))

.. csv-table::
    :header: "tx", "N", "Mean", "SD", "SE", "95% Conf.", "Interval"

    "Experimental", 50, 4.66, 2.560373, 0.362091, 3.943096, 5.376904
    "Placebo", 50, 4.52, 2.950199, 0.417221, 3.693944, 5.346056

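
For comparison, pandas can compute the count, mean, standard deviation, and
standard error directly with ``GroupBy.agg``; only the confidence interval
needs extra work. This is a minimal sketch on a small made-up data set: the
column names mirror the example above, but the values are invented for
illustration.

.. code:: python

    import pandas as pd

    # Toy data, not the data frame used in the examples above
    toy = pd.DataFrame({
        'tx': ["Placebo"] * 4 + ["Experimental"] * 4,
        'healthy': [1, 3, 5, 7, 2, 4, 6, 8],
    })

    # One row per group; 'std' and 'sem' use the sample (n - 1) denominator
    basic = toy.groupby('tx')['healthy'].agg(['count', 'mean', 'std', 'sem'])
    print(basic)
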
.. code:: python

    # This also works for a GroupBy object with a hierarchical index.
    # Select multiple columns with a list (double brackets).
    researchpy.summary_cont(df.groupby(['tx', 'dose'])[['healthy', 'non-healthy']])

.. csv-table::
    :header-rows: 2

    "", "", "healthy", "", "", "", "", "", "non-healthy", "", "", "", "", ""
    "tx", "dose", "count", "mean", "std", "sem", "95% Conf.", "Interval", "count", "mean", "std", "sem", "95% Conf.", "Interval"
    "Experimental", "10 mg", 25, 4.360000, 2.514624, 0.502925, 3.374267, 5.345733, 25, 4.160000, 3.197395, 0.639479, 2.906621, 5.413379
    "", "25 mg", 25, 4.960000, 2.621704, 0.524341, 3.932292, 5.987708, 25, 4.240000, 3.205204, 0.641041, 2.983560, 5.496440
    "Placebo", "10 mg", 26, 4.115385, 2.984318, 0.585273, 2.968250, 5.262520, 26, 3.961538, 3.143002, 0.616393, 2.753407, 5.169670
    "", "25 mg", 24, 4.958333, 2.911434, 0.594294, 3.793517, 6.123150, 24, 4.291667, 3.168859, 0.646841, 3.023859, 5.559474

.. code:: python

    # The side-by-side layout above is the default. To stack the variables
    # as rows within each group instead, use .apply()
    df.groupby(['tx', 'dose'])[['healthy', 'non-healthy']].apply(researchpy.summary_cont)

.. csv-table::
    :header: "tx", "dose", "", "Variable", "N", "Mean", "SD", "SE", "95% Conf.", "Interval"

    "Experimental", "10 mg", 0, "healthy", 25.0, 4.360000, 2.514624, 0.502925, 3.322014, 5.397986
    "", "", 1, "non-healthy", 25.0, 4.160000, 3.197395, 0.639479, 2.840180, 5.479820
    "", "25 mg", 0, "healthy", 25.0, 4.960000, 2.621704, 0.524341, 3.877814, 6.042186
    "", "", 1, "non-healthy", 25.0, 4.240000, 3.205204, 0.641041, 2.916957, 5.563043
    "Placebo", "10 mg", 0, "healthy", 26.0, 4.115385, 2.984318, 0.585273, 2.909992, 5.320777
    "", "", 1, "non-healthy", 26.0, 3.961538, 3.143002, 0.616393, 2.692052, 5.231024
    "", "25 mg", 0, "healthy", 24.0, 4.958333, 2.911434, 0.594294, 3.728942, 6.187724
    "", "", 1, "non-healthy", 24.0, 4.291667, 3.168859, 0.646841, 2.953575, 5.629758

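
The intervals in the last two outputs differ even though the counts, means,
and standard errors agree. Dividing the interval half-width by the standard
error recovers the critical value that each output implies. The sketch below
does this for the Experimental, 10 mg, ``healthy`` cell, using only numbers
printed above. It is an inference from the displayed output, not a statement
about how researchpy is implemented, and other versions may behave differently.

.. code:: python

    from scipy import stats

    # Values copied from the two outputs above (Experimental, 10 mg, healthy)
    n, mean, sem = 25, 4.36, 0.502925
    upper_groupby = 5.345733   # upper bound from the direct GroupBy call
    upper_apply = 5.397986     # upper bound from the .apply() call

    # Critical value implied by each interval: half-width / standard error
    implied_groupby = (upper_groupby - mean) / sem
    implied_apply = (upper_apply - mean) / sem

    z_crit = stats.norm.ppf(0.975)          # normal critical value
    t_crit = stats.t.ppf(0.975, df=n - 1)   # t critical value, 24 df

    print(round(implied_groupby, 3), round(z_crit, 3))
    print(round(implied_apply, 3), round(t_crit, 3))

In this cell the direct GroupBy interval is consistent with a normal critical
value of about 1.96, while the ``.apply()`` interval is consistent with a t
critical value on 24 degrees of freedom (about 2.064), which is why it is
slightly wider.
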