9 Presenting the Results of Quantitative Analysis

Mikaila Mariel Lemonik Arthur

This chapter provides an overview of how to present the results of quantitative analysis, in particular how to create effective tables for displaying quantitative results and how to write quantitative research papers that effectively communicate the methods used and findings of quantitative analysis.

Writing the Quantitative Paper

Standard quantitative social science papers follow a specific format. They begin with a title page that includes a descriptive title, the author(s)’ name(s), and a 100 to 200 word abstract that summarizes the paper. Next is an introduction that makes clear the paper’s research question, details why this question is important, and previews what the paper will do. After that comes a literature review, which ends with a summary of the research question(s) and/or hypotheses. A methods section, which explains the source of data, sample, and variables and quantitative techniques used, follows. Many analysts will include a short discussion of their descriptive statistics in the methods section. A findings section details the findings of the analysis, supported by a variety of tables, and in some cases graphs, all of which are explained in the text. Some quantitative papers, especially those using more complex techniques, will include equations. Many papers follow the findings section with a discussion section, which provides an interpretation of the results in light of both the prior literature and theory presented in the literature review and the research questions/hypotheses. A conclusion ends the body of the paper. This conclusion should summarize the findings, answering the research questions and stating whether any hypotheses were supported, partially supported, or not supported. Limitations of the research are detailed. Papers typically include suggestions for future research, and where relevant, some papers include policy implications. After the body of the paper comes the works cited; some papers also have an Appendix that includes additional tables and figures that did not fit into the body of the paper or additional methodological details. While this basic format is similar for papers regardless of the type of data they utilize, there are specific concerns relating to quantitative research in terms of the methods and findings that will be discussed here.

Methods

In the methods section, researchers clearly describe the methods they used to obtain and analyze the data for their research. When relying on data collected specifically for a given paper, researchers will need to discuss the sample and data collection; in most cases, though, quantitative research relies on pre-existing datasets. In these cases, researchers need to provide information about the dataset, including the source of the data, the time it was collected, the population, and the sample size. Regardless of the source of the data, researchers need to be clear about which variables they are using in their research and any transformations or manipulations of those variables. They also need to explain the specific quantitative techniques that they are using in their analysis; if different techniques are used to test different hypotheses, this should be made clear. In some cases, publications will require that papers be submitted along with any code that was used to produce the analysis (in SPSS terms, the syntax files), which more advanced researchers will usually have on hand. In many cases, basic descriptive statistics are presented in tabular form and explained within the methods section.

Findings

The findings sections of quantitative papers are organized around explaining the results as shown in tables and figures. Not all results are depicted in tables and figures—some minor or null findings will simply be referenced—but tables and figures should be produced for all findings to be discussed at any length. If there are too many tables and figures, some can be moved to an appendix after the body of the text and referred to in the text (e.g. “See Table 12 in Appendix A”).

Discussions of the findings should not simply restate the contents of the table. Rather, they should explain and interpret it for readers, and they should do so in light of the hypothesis or hypotheses that are being tested. Conclusions—discussions of whether the hypothesis or hypotheses are supported or not supported—should wait for the conclusion of the paper.

Creating Effective Tables

When creating tables to display the results of quantitative analysis, the most important goals are to create tables that are clear and concise but that also meet standard conventions in the field. This means, first of all, paring down the volume of information produced in the statistical output to just include the information most necessary for interpreting the results, but doing so in keeping with standard table conventions. It also means making tables that are well-formatted and designed, so that readers can understand what the tables are saying without struggling to find information. For example, tables (as well as figures such as graphs) need clear captions; they are typically numbered and referred to by number in the text. Columns and rows should have clear headings. Depending on the content of the table, formatting tools may need to be used to set off header rows/columns and/or total rows/columns; cell-merging tools may be necessary; and shading may be important in tables with many rows or columns.

Here, you will find some instructions for creating tables of results from descriptive, crosstabulation, correlation, and regression analysis that are clear, concise, and meet normal standards for data display in social science. In addition, after the instructions for creating tables, you will find an example of how a paper incorporating each table might describe that table in the text.

Descriptive Statistics

When presenting the results of descriptive statistics, we create one table with columns for each type of descriptive statistic and rows for each variable. Note, of course, that depending on level of measurement only certain descriptive statistics are appropriate for a given variable, so there may be many cells in the table marked with an — to show that this statistic is not calculated for this variable. So, consider the set of descriptive statistics below, for occupational prestige, age, highest degree earned, and whether the respondent was born in this country.

Table 1. SPSS Output: Selected Descriptive Statistics

Statistics
		R’s occupational prestige score (2010)	Age of respondent
N	Valid	3873	3699
N	Missing	159	333
Mean		46.54	52.16
Median		47.00	53.00
Std. Deviation		13.811	17.233
Variance		190.745	296.988
Skewness		.141	.018
Std. Error of Skewness		.039	.040
Kurtosis		-.809	-1.018
Std. Error of Kurtosis		.079	.080
Range		64	71
Minimum		16	18
Maximum		80	89
Percentiles	25	35.00	37.00
	50	47.00	53.00
	75	59.00	66.00

Statistics
R’s highest degree
N	Valid	4009
N	Missing	23
Median		2.00
Mode		1
Range		4
Minimum		0
Maximum		4

R’s highest degree
		Frequency	Percent	Valid Percent	Cumulative Percent
Valid	less than high school	246	6.1	6.1	6.1
	high school	1597	39.6	39.8	46.0
	associate/junior college	370	9.2	9.2	55.2
	bachelor’s	1036	25.7	25.8	81.0
	graduate	760	18.8	19.0	100.0
	Total	4009	99.4	100.0
Missing	System	23	.6
Total		4032	100.0

Statistics
Was r born in this country
N	Valid	3960
N	Missing	72
Mean		1.11
Mode		1

Was r born in this country
		Frequency	Percent	Valid Percent	Cumulative Percent
Valid	yes	3516	87.2	88.8	88.8
	no	444	11.0	11.2	100.0
	Total	3960	98.2	100.0
Missing	System	72	1.8
Total		4032	100.0

To display these descriptive statistics in a paper, one might create a table like Table 2. Note that for discrete variables, we use the value label in the table, not the value.

Table 2. Descriptive Statistics
	Occupational Prestige Score	Age	Highest Degree Earned	Born in This Country?
Mean	46.54	52.16	—	1.11
Median	47	53	2: Associate’s (9.2%)	—
Mode	—	—	1: High School (39.8%)	1: Yes (88.8%)
Standard Deviation	13.811	17.233	—	—
Variance	190.745	296.988	—	—
Skewness	0.141	0.018	—	—
Kurtosis	-0.809	-1.018	—	—
Range	64 (16-80)	71 (18-89)	Less than High School (0) – Graduate (4)	—
Interquartile Range	35-59	37-66	—	—
N	3873	3699	4009	3960
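The median and modal categories for highest degree earned can be recovered directly from the frequency counts in the SPSS output. The following is a minimal sketch in Python rather than SPSS; the function name `summarize_ordinal` and the dictionary layout are illustrative assumptions, and the median rule used here (the first category whose cumulative percent reaches 50) is a simplification that suits ordinal categories.

```python
# Frequency counts for "R's highest degree", copied from the SPSS output above.
# Keys are (value code, value label) pairs listed in ascending order of the code.
degree_counts = {
    (0, "less than high school"): 246,
    (1, "high school"): 1597,
    (2, "associate/junior college"): 370,
    (3, "bachelor's"): 1036,
    (4, "graduate"): 760,
}


def summarize_ordinal(counts):
    """Return N, per-category rows, the median category, and the modal category.

    Hypothetical helper for illustration; not part of SPSS or any library.
    """
    n_valid = sum(counts.values())
    rows = []
    cumulative = 0
    median = None
    for (code, label), freq in counts.items():
        cumulative += freq
        valid_pct = 100 * freq / n_valid
        cum_pct = 100 * cumulative / n_valid
        rows.append((code, label, freq, round(valid_pct, 1), round(cum_pct, 1)))
        # The median category is the first one whose cumulative share reaches 50%.
        if median is None and cum_pct >= 50:
            median = (code, label)
    # The mode is the category with the largest frequency.
    mode = max(counts, key=counts.get)
    return n_valid, rows, median, mode


n_valid, rows, median, mode = summarize_ordinal(degree_counts)
for code, label, freq, valid_pct, cum_pct in rows:
    print(f"{code}  {label:<26}{freq:>6}{valid_pct:>7.1f}{cum_pct:>7.1f}")
print("N:", n_valid, "| median:", median, "| mode:", mode)
```

Checking the value codes this way helps ensure that the labels placed in a publication table match the codes that SPSS reports for the median and mode.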

If we were then to discuss our descriptive statistics in a quantitative paper, we might write something like this (note that we do not need to repeat every single detail from the table, as readers can peruse the table themselves):

This analysis relies on four variables from the 2021 General Social Survey: occupational prestige score, age, highest degree earned, and whether the respondent was born in the United States. Descriptive statistics for all four variables are shown in Table 2. The median occupational prestige score is 47, with a range from 16 to 80. 50% of respondents had occupational prestige scores between 35 and 59. The median age of respondents is 53, with a range from 18 to 89. 50% of respondents are between ages 37 and 66. Both variables have little skew. Highest degree earned ranges from less than high school to a graduate degree; the median respondent has earned an associate’s degree, while the modal response (given by 39.8% of the respondents) is a high school degree. 88.8% of respondents were born in the United States.

Crosstabulation

When presenting the results of a crosstabulation, we simplify the table so that it highlights the most important information—the column percentages—and include the significance and association below the table. Consider the SPSS output below.

Table 3. R’s highest degree * R’s subjective class identification Crosstabulation
			R’s subjective class identification				Total
			lower class	working class	middle class	upper class	Total
R’s highest degree	less than high school	Count	65	106	68	7	246
	less than high school	% within R’s subjective class identification	18.8%	7.1%	3.4%	4.2%	6.2%
	high school	Count	217	800	551	23	1591
	high school	% within R’s subjective class identification	62.9%	53.7%	27.6%	13.9%	39.8%
	associate/junior college	Count	30	191	144	3	368
	associate/junior college	% within R’s subjective class identification	8.7%	12.8%	7.2%	1.8%	9.2%
	bachelor’s	Count	27	269	686	49	1031
	bachelor’s	% within R’s subjective class identification	7.8%	18.1%	34.4%	29.5%	25.8%
	graduate	Count	6	123	546	84	759
	graduate	% within R’s subjective class identification	1.7%	8.3%	27.4%	50.6%	19.0%
Total		Count	345	1489	1995	166	3995
Total		% within R’s subjective class identification	100.0%	100.0%	100.0%	100.0%	100.0%

Chi-Square Tests
	Value	df	Asymptotic Significance (2-sided)
Pearson Chi-Square	819.579^a	12	<.001
Likelihood Ratio	839.200	12	<.001
Linear-by-Linear Association	700.351	1	<.001
N of Valid Cases	3995
a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 10.22.

Symmetric Measures
		Value	Asymptotic Standard Error^a	Approximate T^b	Approximate Significance
Interval by Interval	Pearson’s R	.419	.013	29.139	<.001^c
Ordinal by Ordinal	Spearman Correlation	.419	.013	29.158	<.001^c
N of Valid Cases		3995
a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.
c. Based on normal approximation.

Table 4 shows how a table suitable for inclusion in a paper might look if created from the SPSS output in Table 3. Note that we use asterisks to indicate the significance level of the results: * means p < 0.05; ** means p < 0.01; *** means p < 0.001; and no stars means p ≥ 0.05 (and thus that the result is not significant). Also note that N is the abbreviation for the number of respondents.


Table 4. Highest Degree Earned by Respondent’s Subjective Class Identification
		Respondent’s Subjective Class Identification
		Lower Class	Working Class	Middle Class	Upper Class	Total
Highest Degree Earned	Less than High School	18.8%	7.1%	3.4%	4.2%	6.2%
	High School	62.9%	53.7%	27.6%	13.9%	39.8%
	Associate’s / Junior College	8.7%	12.8%	7.2%	1.8%	9.2%
	Bachelor’s	7.8%	18.1%	34.4%	29.5%	25.8%
	Graduate	1.7%	8.3%	27.4%	50.6%	19.0%
	N: 3995 Spearman Correlation 0.419***
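The column percentages kept in the simplified table, along with the Pearson chi-square and its degrees of freedom, can be reproduced from the cell counts in the SPSS crosstabulation. The sketch below uses plain Python with no statistics library; the function names `column_percentages` and `pearson_chi_square` are illustrative assumptions, not SPSS procedures.

```python
# Cell counts from the SPSS crosstabulation (Table 3).
# Rows: highest degree earned; columns: subjective class identification.
row_labels = ["Less than High School", "High School",
              "Associate's / Junior College", "Bachelor's", "Graduate"]
col_labels = ["Lower Class", "Working Class", "Middle Class", "Upper Class"]
cell_counts = [
    [65, 106, 68, 7],
    [217, 800, 551, 23],
    [30, 191, 144, 3],
    [27, 269, 686, 49],
    [6, 123, 546, 84],
]


def column_percentages(table):
    """Express each cell as a percentage of its column total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[100 * cell / total for cell, total in zip(row, col_totals)]
            for row in table]


def pearson_chi_square(table):
    """Return (chi-square, degrees of freedom, N) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, n


col_pcts = column_percentages(cell_counts)
chi2, df, n = pearson_chi_square(cell_counts)
for label, row in zip(row_labels, col_pcts):
    print(f"{label:<30}" + "".join(f"{pct:>8.1f}%" for pct in row))
print(f"N: {n}  chi-square: {chi2:.3f}  df: {df}")
```

Recomputing the percentages from counts is a quick way to confirm that the figures were copied into the publication table without transcription errors.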

If we were going to discuss the results of this crosstabulation in a quantitative research paper, the discussion might look like this:

A crosstabulation of respondent’s class identification and their highest degree earned, with class identification as the independent variable, is significant, with a Spearman correlation of 0.419, as shown in Table 4. Among lower class and working class respondents, more than 50% had earned a high school degree. Less than 20% of lower class respondents and less than 40% of working class respondents had earned more than a high school degree. In contrast, the majority of middle class and upper class respondents had earned at least a bachelor’s degree. In fact, just over 50% of upper class respondents had earned a graduate degree.

Correlation

When presenting a correlation matrix, one of the most important things to note is that we only present half the table so as not to include duplicated results. Think of the diagonal line of cells that represent the correlation between each variable and itself (each of which equals 1), and include only the triangle of data either above or below that line of cells. Consider the output in Table 5.

Table 5. SPSS Output: Correlations
		Age of respondent	R’s occupational prestige score (2010)	Highest year of school R completed	R’s family income in 1986 dollars
Age of respondent	Pearson Correlation	1	.087^**	.014	.017
	Sig. (2-tailed)		<.001	.391	.314
	N	3699	3571	3683	3336
R’s occupational prestige score (2010)	Pearson Correlation	.087^**	1	.504^**	.316^**
	Sig. (2-tailed)	<.001		<.001	<.001
	N	3571	3873	3817	3399
Highest year of school R completed	Pearson Correlation	.014	.504^**	1	.360^**
	Sig. (2-tailed)	.391	<.001		<.001
	N	3683	3817	3966	3497
R’s family income in 1986 dollars	Pearson Correlation	.017	.316^**	.360^**	1
	Sig. (2-tailed)	.314	<.001	<.001
	N	3336	3399	3497	3509
**. Correlation is significant at the 0.01 level (2-tailed).

Table 6 shows what the contents of Table 5 might look like when a table is constructed in a fashion suitable for publication.

Table 6. Correlation Matrix
	Age	Occupational Prestige Score	Highest Year of School Completed	Family Income in 1986 Dollars
Age	1
Occupational Prestige Score	0.087***	1
Highest Year of School Completed	0.014	0.504***	1
Family Income in 1986 Dollars	0.017	0.316***	0.360***	1
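The two conventions used in the correlation matrix, keeping only the lower triangle and marking significance with asterisks, can be sketched in a few lines of Python. The helper names `stars` and `lower_triangle` are illustrative assumptions. Because SPSS prints very small p-values only as "<.001", the value 0.0005 below is a placeholder standing in for those cells, not a reported figure.

```python
def stars(p):
    """Map a p-value to significance asterisks (* p<0.05, ** p<0.01, *** p<0.001)."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


labels = ["Age", "Occupational Prestige Score",
          "Highest Year of School Completed", "Family Income in 1986 Dollars"]

# Pearson correlations from the SPSS output (full symmetric matrix).
r = [
    [1, 0.087, 0.014, 0.017],
    [0.087, 1, 0.504, 0.316],
    [0.014, 0.504, 1, 0.360],
    [0.017, 0.316, 0.360, 1],
]

# Two-tailed p-values; 0.0005 is a placeholder for cells SPSS shows as "<.001".
p = [
    [None, 0.0005, 0.391, 0.314],
    [0.0005, None, 0.0005, 0.0005],
    [0.391, 0.0005, None, 0.0005],
    [0.314, 0.0005, 0.0005, None],
]


def lower_triangle(labels, r, p):
    """Build tab-separated rows containing only the diagonal and the cells below it."""
    lines = []
    for i, label in enumerate(labels):
        cells = []
        for j in range(i + 1):
            if i == j:
                cells.append("1")  # correlation of a variable with itself
            else:
                cells.append(f"{r[i][j]:.3f}{stars(p[i][j])}")
        lines.append("\t".join([label] + cells))
    return lines


lines = lower_triangle(labels, r, p)
for line in lines:
    print(line)
```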

If we were to discuss the results of this bivariate correlation analysis in a quantitative paper, the discussion might look like this:

Bivariate correlations were run among variables measuring age, occupational prestige, the highest year of school respondents completed, and family income in constant 1986 dollars, as shown in Table 6. Correlations between age and highest year of school completed and between age and family income are not significant. All other correlations are positive and significant at the p<0.001 level. The correlation between age and occupational prestige is weak; the correlations between income and occupational prestige and between income and educational attainment are moderate, and the correlation between education and occupational prestige is strong.

Regression

To present the results of a regression, we create one table that includes all of the key information from the multiple tables of SPSS output. This includes the R² and significance of the regression, either the B or the beta values (different analysts have different preferences here) for each variable, and the standard error and significance of each variable. Consider the SPSS output in Table 7.

Table 7. SPSS Output: Regression
Model	R	R Square	Adjusted R Square	Std. Error of the Estimate
1	.395^a	.156	.155	36729.04841
a. Predictors: (Constant), Highest year of school R completed, Age of respondent, R’s occupational prestige score (2010)

ANOVA^a
Model		Sum of Squares	df	Mean Square	F	Sig.
1	Regression	805156927306.583	3	268385642435.528	198.948	<.001^b
	Residual	4351948187487.015	3226	1349022996.741
	Total	5157105114793.598	3229
a. Dependent Variable: R’s family income in 1986 dollars
b. Predictors: (Constant), Highest year of school R completed, Age of respondent, R’s occupational prestige score (2010)

Coefficients^a
Model		Unstandardized Coefficients		Standardized Coefficients	t	Sig.	Collinearity Statistics
Model		B	Std. Error	Beta	t	Sig.	Tolerance	VIF
1	(Constant)	-44403.902	4166.576		-10.657	<.001
	Age of respondent	9.547	38.733	.004	.246	.805	.993	1.007
	R’s occupational prestige score (2010)	522.887	54.327	.181	9.625	<.001	.744	1.345
	Highest year of school R completed	3988.545	274.039	.272	14.555	<.001	.747	1.339
a. Dependent Variable: R’s family income in 1986 dollars

The regression output shown in Table 7 contains a lot of information. We do not include all of this information when making tables suitable for publication. As can be seen in Table 8, we include the Beta (or the B), the standard error, and the significance asterisk for each variable; the R² and significance for the overall regression; the degrees of freedom (which tell readers the sample size, since N is the total degrees of freedom plus one); and the constant; along with the key to p/significance values.

Table 8. Regression Results for Dependent Variable Family Income in 1986 Dollars
	Beta & SE
Age	0.004 (38.733)
Occupational Prestige Score	0.181*** (54.327)
Highest Year of School Completed	0.272*** (274.039)
R²	0.156***
Degrees of Freedom	3229
Constant	-44,403.902
* p<0.05 ** p<0.01 *** p<0.001
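The summary figures carried into the regression table follow from the ANOVA portion of the SPSS output. As a rough check, the sketch below recomputes R², adjusted R², the F statistic, and the sample size from the sums of squares and degrees of freedom; the variable names are illustrative assumptions.

```python
# Values copied from the ANOVA table in the SPSS regression output (Table 7).
ss_regression = 805156927306.583
ss_total = 5157105114793.598
df_regression = 3      # number of predictors
df_total = 3229        # N - 1

ss_residual = ss_total - ss_regression
df_residual = df_total - df_regression

# R-squared: the share of variance in the dependent variable that is explained.
r_squared = ss_regression / ss_total

# Adjusted R-squared penalizes R-squared for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * df_total / df_residual

# F statistic: mean square for the regression over mean square for the residual.
f_statistic = (ss_regression / df_regression) / (ss_residual / df_residual)

# The sample size is the total degrees of freedom plus one.
n = df_total + 1

print(f"R squared:          {r_squared:.3f}")
print(f"Adjusted R squared: {adj_r_squared:.3f}")
print(f"F:                  {f_statistic:.3f}")
print(f"N:                  {n}")
```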

If we were to discuss the results of this regression in a quantitative paper, the results might look like this:

Table 8 shows the results of a regression in which age, occupational prestige, and highest year of school completed are the independent variables and family income is the dependent variable. The regression results are significant, and all of the independent variables taken together explain 15.6% of the variance in family income. Age is not a significant predictor of income, while occupational prestige and educational attainment are. Educational attainment has a larger effect on family income than does occupational prestige. For every year of additional education attained, family income goes up on average by $3,988.55; for every one-unit increase in occupational prestige score, family income goes up on average by $522.89.^[1]
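The interpretation of the unstandardized coefficients can be illustrated by plugging values into the regression equation. The sketch below uses the constant and B values from the SPSS Coefficients output; the respondent profile (age 40, prestige score 47, 12 years of school) is a hypothetical example chosen for illustration, and `predicted_income` is an assumed helper name.

```python
# Constant and unstandardized coefficients (B) from the SPSS Coefficients table.
CONSTANT = -44403.902
B = {"age": 9.547, "prestige": 522.887, "education": 3988.545}


def predicted_income(age, prestige, education):
    """Predicted family income (1986 dollars) from the fitted regression equation."""
    return (CONSTANT
            + B["age"] * age
            + B["prestige"] * prestige
            + B["education"] * education)


# Hypothetical respondent: 40 years old, prestige score 47, 12 years of school.
base = predicted_income(age=40, prestige=47, education=12)

# The same respondent with one additional year of education.
one_more_year = predicted_income(age=40, prestige=47, education=13)

print(f"Predicted income:              {base:,.2f}")
print(f"With one more year of school:  {one_more_year:,.2f}")
print(f"Difference:                    {one_more_year - base:,.2f}")
```

Holding the other variables constant, the difference between the two predictions equals the B value for education, which is what the phrase "for every year of additional education" expresses.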

Exercises

1. Choose two discrete variables and three continuous variables from a dataset of your choice. Produce appropriate descriptive statistics on all five of the variables and create a table of the results suitable for inclusion in a paper.
2. Using the two discrete variables you have chosen, produce an appropriate crosstabulation, with significance and measure of association. Create a table of the results suitable for inclusion in a paper.
3. Using the three continuous variables you have chosen, produce a correlation matrix. Create a table of the results suitable for inclusion in a paper.
4. Using the three continuous variables you have chosen, produce a multivariate linear regression. Create a table of the results suitable for inclusion in a paper.
5. Write a methods section describing the dataset, analytical methods, and variables you utilized in questions 1, 2, 3, and 4 and explaining the results of your descriptive analysis.
6. Write a findings section explaining the results of the analyses you performed in questions 2, 3, and 4.

[1] Note that the actual numerical increase comes from the B values, which are shown in the SPSS output in Table 7 but not in the reformatted Table 8.

License

Social Data Analysis Copyright © 2021 by Mikaila Mariel Lemonik Arthur and Roger Clark is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.