summary_cat()
=============

Returns a data table as a Pandas DataFrame that includes the counts and
percentages of each category. Missing values stored as ``numpy.nan`` are
excluded from the counts. Missing values coded as a string are included as
their own category.

Arguments
---------

**summary_cat(group1, ascending= False)**

* **group1**, a Pandas Series, or a DataFrame with multiple columns selected
* **ascending**, whether to sort the output in ascending order of count. Default is False (descending).

**returns**

* Pandas DataFrame

Examples
--------

.. code:: python

    import numpy, pandas, researchpy

    numpy.random.seed(123)

    df = pandas.DataFrame(numpy.random.randint(2, size= (101, 2)),
                          columns= ['disease', 'treatment'])

.. code:: python

    # Handles a single Pandas Series
    researchpy.summary_cat(df['disease'])

+---+----------+---------+-------+---------+
|   | Variable | Outcome | Count | Percent |
+===+==========+=========+=======+=========+
| 0 | disease  | 0       | 53    | 52.48   |
+---+----------+---------+-------+---------+
| 1 |          | 1       | 48    | 47.52   |
+---+----------+---------+-------+---------+
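
The table above can be cross-checked with plain pandas. The following is a
minimal sketch, not researchpy's actual implementation: it assumes that
Percent is each count divided by the number of non-missing values, rounded to
two decimals, which is consistent with the output shown (53 / 101 = 52.48).
The series ``example`` is hypothetical data made up for illustration.

.. code:: python

    import pandas

    # Hypothetical data for illustration only
    example = pandas.Series([0, 0, 1, 0, 1], name="disease")

    # value_counts() drops NaN by default and sorts by count, largest first
    counts = example.value_counts()

    # Assumed formula: share of non-missing values, as a percentage
    percent = (counts / counts.sum() * 100).round(2)

    check = pandas.DataFrame({"Count": counts, "Percent": percent})
    print(check)
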
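.. code:: python

    # Can handle multiple Series, although the output is not pretty
    researchpy.summary_cat(df[['disease', 'treatment']])

+---+-----------+---------+-------+---------+
|   | Variable  | Outcome | Count | Percent |
+===+===========+=========+=======+=========+
| 0 | disease   | 0       | 53    | 52.48   |
+---+-----------+---------+-------+---------+
| 1 |           | 1       | 48    | 47.52   |
+---+-----------+---------+-------+---------+
| 2 | treatment | 1       | 52    | 51.49   |
+---+-----------+---------+-------+---------+
| 3 |           | 0       | 49    | 48.51   |
+---+-----------+---------+-------+---------+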
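.. code:: python

    # If a missing value is coded as a string, it shows up as its own category
    # (.loc is used here to avoid chained assignment)
    df.loc[0, 'disease'] = ""

    researchpy.summary_cat(df['disease'])

+---+----------+---------+-------+---------+
|   | Variable | Outcome | Count | Percent |
+===+==========+=========+=======+=========+
| 0 | disease  | 0       | 52    | 51.49   |
+---+----------+---------+-------+---------+
| 1 |          | 1       | 48    | 47.52   |
+---+----------+---------+-------+---------+
| 2 |          |         | 1     | 0.99    |
+---+----------+---------+-------+---------+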
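.. code:: python

    # However, if the missing value is numpy.nan, it is excluded from the counts
    # (.loc is used here to avoid chained assignment)
    df.loc[0, 'disease'] = numpy.nan

    researchpy.summary_cat(df['disease'])

+---+----------+---------+-------+---------+
|   | Variable | Outcome | Count | Percent |
+===+==========+=========+=======+=========+
| 0 | disease  | 0       | 52    | 52.0    |
+---+----------+---------+-------+---------+
| 1 |          | 1       | 48    | 48.0    |
+---+----------+---------+-------+---------+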
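
When missing values arrive coded as strings, one option is to recode them to
``numpy.nan`` before summarizing, so that they are excluded rather than counted
as a category. The following is a minimal sketch using plain pandas; the
series ``raw`` and the codes in ``missing_codes`` are hypothetical and should
be replaced with whatever codes the data actually uses.

.. code:: python

    import pandas

    # Hypothetical data: "" and "missing" stand in for string-coded missing values
    raw = pandas.Series([0, 1, "", 1, "missing"], name="disease", dtype=object)
    missing_codes = ["", "missing"]

    # mask() replaces entries where the condition is True with NaN
    cleaned = raw.mask(raw.isin(missing_codes))

    # Number of values that are now treated as missing
    print(cleaned.isna().sum())

The recoded series can then be passed to ``researchpy.summary_cat()`` in the
same way as ``df['disease']`` above.
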
.. code:: python

    # Results can easily be exported using many methods, including the default
    # Pandas exporting methods
    results = researchpy.summary_cat(df['disease'])

    results.to_csv("summary_cats.csv", index= False)

.. code:: python

    # This is the default, shown for comparison with the example immediately below
    researchpy.summary_cat(df['disease'], ascending= False)

+---+----------+---------+-------+---------+
|   | Variable | Outcome | Count | Percent |
+===+==========+=========+=======+=========+
| 0 | disease  | 0       | 52    | 52.0    |
+---+----------+---------+-------+---------+
| 1 |          | 1       | 48    | 48.0    |
+---+----------+---------+-------+---------+
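.. code:: python

    researchpy.summary_cat(df['disease'], ascending= True)

+---+----------+---------+-------+---------+
|   | Variable | Outcome | Count | Percent |
+===+==========+=========+=======+=========+
| 0 | disease  | 1       | 48    | 48.0    |
+---+----------+---------+-------+---------+
| 1 |          | 0       | 52    | 52.0    |
+---+----------+---------+-------+---------+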