summary_cont()
==============

Returns a summary table as a Pandas DataFrame that includes the variable name,
the total number of non-missing observations, the mean, the standard deviation,
the standard error, and the confidence interval (95% by default). This is
compatible with Pandas Series, DataFrame, and GroupBy objects.

Arguments
---------

**summary_cont(group1, conf = 0.95, decimals = 4)**

* **group1**, must be a Pandas Series, a DataFrame with the columns of interest selected, or a GroupBy object
* **conf**, the confidence level, entered in decimal format. The default is 0.95, which gives a 95% confidence interval
* **decimals**, the number of decimal places the output table is rounded to

**returns**

* Pandas DataFrame

Examples
--------

.. code:: python

    import numpy, pandas, researchpy

    numpy.random.seed(12345678)

    df = pandas.DataFrame(numpy.random.randint(10, size= (100, 2)),
                          columns= ['healthy', 'non-healthy'])

    # .loc slices include both endpoints, so each later assignment
    # overwrites the row it shares with the previous slice
    df['tx'] = ""
    df.loc[0:50, 'tx'] = "Placebo"
    df.loc[50:101, 'tx'] = "Experimental"

    df['dose'] = ""
    df.loc[0:26, 'dose'] = "10 mg"
    df.loc[26:51, 'dose'] = "25 mg"
    df.loc[51:76, 'dose'] = "10 mg"
    df.loc[76:101, 'dose'] = "25 mg"

.. code:: python

    # Summary statistics for a Series (single variable)
    researchpy.summary_cont(df['healthy'])

.. raw:: html

    <table border="1" class="dataframe">
      <thead>
        <tr><th></th><th>Variable</th><th>N</th><th>Mean</th><th>SD</th><th>SE</th><th>95% Conf.</th><th>Interval</th></tr>
      </thead>
      <tbody>
        <tr><th>0</th><td>healthy</td><td>100.0</td><td>4.59</td><td>2.749086</td><td>0.274909</td><td>4.044522</td><td>5.135478</td></tr>
      </tbody>
    </table>

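As a rough cross-check of the columns in the table above, the sketch below computes the same kind of summary by hand. It assumes the interval is built from Student's t distribution with ``N - 1`` degrees of freedom, which is consistent with the Series output shown (4.59 ± 1.984 × 0.274909), but this is an inference from the numbers rather than a statement about researchpy's internals. The helper name ``summary_by_hand`` and the column labels ``CI lower`` and ``CI upper`` are hypothetical, and ``scipy`` is an extra dependency not used elsewhere on this page.

.. code:: python

    import pandas
    from scipy import stats


    def summary_by_hand(series, conf=0.95):
        """Hypothetical helper: summary statistics with a t-based interval."""
        values = series.dropna()          # keep only non-missing observations
        n = values.count()
        mean = values.mean()
        sd = values.std()                 # sample standard deviation (ddof=1)
        se = sd / n ** 0.5                # standard error of the mean
        # Two-sided critical value from Student's t with n - 1 degrees of freedom
        t_crit = stats.t.ppf((1 + conf) / 2, n - 1)
        return pandas.DataFrame({
            "Variable": [series.name],
            "N": [float(n)],
            "Mean": [mean],
            "SD": [sd],
            "SE": [se],
            "CI lower": [mean - t_crit * se],
            "CI upper": [mean + t_crit * se],
        })


    # Small made-up Series, used only to exercise the helper
    example = summary_by_hand(pandas.Series([1, 2, 3, 4, 5], name="x"))
    print(example)

Raising ``conf`` in this sketch widens the interval, because the critical value grows with the confidence level.
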
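.. code:: python

    # Summary statistics for multiple Series
    researchpy.summary_cont(df[['healthy', 'non-healthy']])

.. raw:: html

    <table border="1" class="dataframe">
      <thead>
        <tr><th></th><th>Variable</th><th>N</th><th>Mean</th><th>SD</th><th>SE</th><th>95% Conf.</th><th>Interval</th></tr>
      </thead>
      <tbody>
        <tr><th>0</th><td>healthy</td><td>100.0</td><td>4.59</td><td>2.749086</td><td>0.274909</td><td>4.044522</td><td>5.135478</td></tr>
        <tr><th>1</th><td>non-healthy</td><td>100.0</td><td>4.16</td><td>3.132495</td><td>0.313250</td><td>3.538445</td><td>4.781555</td></tr>
      </tbody>
    </table>
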
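.. code:: python

    # The result is a Pandas DataFrame, so it can be assigned to a variable
    # and exported with the usual DataFrame methods
    results = researchpy.summary_cont(df[['healthy', 'non-healthy']])

    results.to_csv("results.csv", index= False)

.. code:: python

    # This works with GroupBy objects as well
    researchpy.summary_cont(df['healthy'].groupby(df['tx']))

.. raw:: html

    <table border="1" class="dataframe">
      <thead>
        <tr><th></th><th>N</th><th>Mean</th><th>SD</th><th>SE</th><th>95% Conf.</th><th>Interval</th></tr>
        <tr><th>tx</th><th colspan="6"></th></tr>
      </thead>
      <tbody>
        <tr><th>Experimental</th><td>50</td><td>4.66</td><td>2.560373</td><td>0.362091</td><td>3.943096</td><td>5.376904</td></tr>
        <tr><th>Placebo</th><td>50</td><td>4.52</td><td>2.950199</td><td>0.417221</td><td>3.693944</td><td>5.346056</td></tr>
      </tbody>
    </table>
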
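.. code:: python

    # This also works with a GroupBy object that has a hierarchical index.
    # Select several columns with a list; tuple-style selection
    # (['healthy', 'non-healthy'] without the inner brackets) fails in recent pandas
    researchpy.summary_cont(df.groupby(['tx', 'dose'])[['healthy', 'non-healthy']])

.. raw:: html

    <table border="1" class="dataframe">
      <thead>
        <tr><th></th><th></th><th colspan="6">healthy</th><th colspan="6">non-healthy</th></tr>
        <tr><th></th><th></th><th>count</th><th>mean</th><th>std</th><th>sem</th><th>95% Conf.</th><th>Interval</th><th>count</th><th>mean</th><th>std</th><th>sem</th><th>95% Conf.</th><th>Interval</th></tr>
        <tr><th>tx</th><th>dose</th><th colspan="12"></th></tr>
      </thead>
      <tbody>
        <tr><th rowspan="2">Experimental</th><th>10 mg</th><td>25</td><td>4.360000</td><td>2.514624</td><td>0.502925</td><td>3.374267</td><td>5.345733</td><td>25</td><td>4.160000</td><td>3.197395</td><td>0.639479</td><td>2.906621</td><td>5.413379</td></tr>
        <tr><th>25 mg</th><td>25</td><td>4.960000</td><td>2.621704</td><td>0.524341</td><td>3.932292</td><td>5.987708</td><td>25</td><td>4.240000</td><td>3.205204</td><td>0.641041</td><td>2.983560</td><td>5.496440</td></tr>
        <tr><th rowspan="2">Placebo</th><th>10 mg</th><td>26</td><td>4.115385</td><td>2.984318</td><td>0.585273</td><td>2.968250</td><td>5.262520</td><td>26</td><td>3.961538</td><td>3.143002</td><td>0.616393</td><td>2.753407</td><td>5.169670</td></tr>
        <tr><th>25 mg</th><td>24</td><td>4.958333</td><td>2.911434</td><td>0.594294</td><td>3.793517</td><td>6.123150</td><td>24</td><td>4.291667</td><td>3.168859</td><td>0.646841</td><td>3.023859</td><td>5.559474</td></tr>
      </tbody>
    </table>
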
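.. code:: python

    # The table above is the default output. To stack the variables in rows
    # so the results can be compared above/below each other, use .apply()
    df.groupby(['tx', 'dose'])[['healthy', 'non-healthy']].apply(researchpy.summary_cont)

.. raw:: html

    <table border="1" class="dataframe">
      <thead>
        <tr><th></th><th></th><th></th><th>Variable</th><th>N</th><th>Mean</th><th>SD</th><th>SE</th><th>95% Conf.</th><th>Interval</th></tr>
        <tr><th>tx</th><th>dose</th><th></th><th colspan="7"></th></tr>
      </thead>
      <tbody>
        <tr><th rowspan="4">Experimental</th><th rowspan="2">10 mg</th><th>0</th><td>healthy</td><td>25.0</td><td>4.360000</td><td>2.514624</td><td>0.502925</td><td>3.322014</td><td>5.397986</td></tr>
        <tr><th>1</th><td>non-healthy</td><td>25.0</td><td>4.160000</td><td>3.197395</td><td>0.639479</td><td>2.840180</td><td>5.479820</td></tr>
        <tr><th rowspan="2">25 mg</th><th>0</th><td>healthy</td><td>25.0</td><td>4.960000</td><td>2.621704</td><td>0.524341</td><td>3.877814</td><td>6.042186</td></tr>
        <tr><th>1</th><td>non-healthy</td><td>25.0</td><td>4.240000</td><td>3.205204</td><td>0.641041</td><td>2.916957</td><td>5.563043</td></tr>
        <tr><th rowspan="4">Placebo</th><th rowspan="2">10 mg</th><th>0</th><td>healthy</td><td>26.0</td><td>4.115385</td><td>2.984318</td><td>0.585273</td><td>2.909992</td><td>5.320777</td></tr>
        <tr><th>1</th><td>non-healthy</td><td>26.0</td><td>3.961538</td><td>3.143002</td><td>0.616393</td><td>2.692052</td><td>5.231024</td></tr>
        <tr><th rowspan="2">25 mg</th><th>0</th><td>healthy</td><td>24.0</td><td>4.958333</td><td>2.911434</td><td>0.594294</td><td>3.728942</td><td>6.187724</td></tr>
        <tr><th>1</th><td>non-healthy</td><td>24.0</td><td>4.291667</td><td>3.168859</td><td>0.646841</td><td>2.953575</td><td>5.629758</td></tr>
      </tbody>
    </table>
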
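The two hierarchical GroupBy outputs above report the same means and standard errors but slightly different confidence intervals. The sketch below takes the Experimental / 10 mg row for ``healthy`` and shows that a fixed critical value of 1.96 gives roughly (3.3743, 5.3457), as in the default GroupBy table, while a Student's t critical value with ``N - 1`` degrees of freedom gives roughly (3.3220, 5.3980), as in the ``.apply()`` table. This is inferred from the printed numbers only; how researchpy computes each interval internally is an assumption here, and ``scipy`` is an extra dependency not used elsewhere on this page.

.. code:: python

    from scipy import stats

    # Values copied from the Experimental / 10 mg row for 'healthy'
    n, mean, se = 25, 4.36, 0.502925

    z_crit = 1.96                          # fixed normal-approximation value
    t_crit = stats.t.ppf(0.975, n - 1)     # Student's t, 24 degrees of freedom

    z_interval = (mean - z_crit * se, mean + z_crit * se)
    t_interval = (mean - t_crit * se, mean + t_crit * se)

    print([round(v, 6) for v in z_interval])
    print([round(v, 6) for v in t_interval])

With small groups the t-based interval is noticeably wider, so it helps to check which form a table uses before comparing intervals across the two layouts.
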