Happy Labor Day! 🛠️ As we celebrate the achievements of workers and the contributions they’ve made to society, what better way to delve into the world of labor and employment than with a fresh Tidy Tuesday dataset? This week’s data comes from the comprehensive “Union Membership, Coverage, and Earnings from the CPS” dataset, courtesy of Barry Hirsch from Georgia State University, David Macpherson from Trinity University, and William Even from Miami University.
This dataset is a rich source of information on union membership, coverage, and earnings. So grab your coffee, get comfortable, and let's explore the labor data together on this Tidy Tuesday!
Import data
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from scipy import stats
import geopandas
from lets_plot.geo_data import *
from lets_plot import *
from lets_plot.mapping import as_discrete
LetsPlot.setup_html()
There are three datasets to investigate: one with demographic information, one with wage data, and one broken down by state. Let's read them all in.
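The reading step can be sketched as follows. This is a minimal sketch, not necessarily the code used for this post: the `BASE_URL` value and the file names `demographics.csv`, `wages.csv`, and `states.csv` are assumptions about where the Tidy Tuesday files live, and `load_union_data` is a hypothetical helper name.

```python
import pandas as pd

# Assumed location of this week's files in the TidyTuesday repository.
# The post does not state the URL, so verify it before relying on it.
BASE_URL = (
    "https://raw.githubusercontent.com/rfordatascience/tidytuesday/"
    "master/data/2023/2023-09-05"
)


def load_union_data(base=BASE_URL):
    """Read the three CSV files from a URL prefix or a local folder."""
    names = ("demographics", "wages", "states")
    return {name: pd.read_csv(f"{base}/{name}.csv") for name in names}
```

Calling `data = load_union_data()` should then give a dictionary with one DataFrame per dataset, for example `data["wages"]`. Passing a local folder instead of the URL works the same way if the files have been downloaded already.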
Demographics data: Data sources:
1983-2022: CPS Outgoing Rotation Group (ORG) Earnings Files
The definition of union membership was expanded in 1977 to include “employee associations similar to a union”.
| variable | class | description |
|---|---|---|
| year | double | When the data was collected. |
| sample_size | double | The number of wage and salary workers ages 16 and over who were surveyed. |
| employment | double | Wage and salary employment in thousands. |
| members | double | Employed workers who are union members in thousands. |
| covered | double | Workers covered by a collective bargaining agreement in thousands. |
| p_members | double | Percent of employed workers who are union members. |
| p_covered | double | Percent of employed workers who are covered by a collective bargaining agreement. |
| facet | character | The sector or demographic group contained in this row of data. |
Wages data: Data sources:
1973-1981: May Current Population Survey (CPS)
1982: No union questions available
1983-2022: CPS Outgoing Rotation Group (ORG) Earnings Files
The definition of union membership was expanded in 1977 to include “employee associations similar to a union”.
| variable | class | description |
|---|---|---|
| year | double | When the data was collected. |
| sample_size | double | The number of wage and salary workers ages 16 and over who were surveyed and provided earnings and hours worked information. |
| wage | double | Mean hourly earnings in nominal dollars. |
| at_cap | double | Percent of workers with weekly earnings at the top code of $999 through 1988, $1923 in 1989-97, and $2885 beginning in 1998, with individuals assigned mean earnings above the cap based on annual estimates of the gender-specific Pareto distribution. |
| union_wage | double | Mean wage among union members. |
| nonunion_wage | double | Mean wage among nonunion workers. |
| union_wage_premium_raw | double | The percentage difference between the union and nonunion wage. |
| union_wage_premium_adjusted | double | Estimated as exp(b)-1 where b is the regression coefficient on a union membership variable (equal to 1 if union and 0 otherwise) from a semi-logarithmic wage equation, with controls included for worker/job characteristics. Included in the all-worker wage equation are the control variables: years of schooling, potential years of experience [proxied by age minus years of schooling minus 6] and its square [both interacted with gender], and categorical variables for marital status, race and ethnicity, gender, part-time, large metropolitan area, state, public sector, broad industry, and broad occupation. Controls are omitted, as appropriate, for estimates within sectors or by demographic group [i.e., by class, gender, race, or industry sector]. Workers who do not report earnings but instead have them imputed [i.e., assigned] by the Census are removed from the estimation samples in all years, except 1994 and 1995 when imputed earners cannot be identified. Inclusion of imputed earners causes union wages to be understated, nonunion wages overstated, and union-nonunion wage differences understated. For 1994-95, the sample includes imputed earners and estimates in those years have been adjusted to remove the bias from imputation. |
| facet | character | The sector or demographic group contained in this row of data. |
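To make the exp(b)-1 definition of the adjusted premium concrete, here is a small worked example. The coefficient value 0.15 is purely illustrative and does not come from the dataset, and `premium_from_log_coefficient` is a hypothetical helper name.

```python
import math


def premium_from_log_coefficient(b):
    """Convert a semi-log regression coefficient b into a proportional premium."""
    return math.expm1(b)  # numerically stable form of exp(b) - 1


# Illustrative coefficient only, not an estimate taken from the data.
b = 0.15
print(round(premium_from_log_coefficient(b), 3))  # 0.162
```

In a semi-logarithmic wage equation the coefficient is a difference in log wages, so a coefficient of 0.15 corresponds to a premium of about 16.2%, slightly more than 15%. For small coefficients the two numbers are nearly the same, and the gap widens as b grows.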
States data:
Data source: Current Population Survey (CPS) Outgoing Rotation Group (ORG) Earnings Files
| variable | class | description |
|---|---|---|
| state_census_code | double | Census state code used in CPS. |
| state | character | State name. |
| sector | character | Employment sector. |
| observations | double | CPS sample size. |
| employment | double | Wage and salary employment in thousands. |
| members | double | Employed workers who are union members in thousands. |
| covered | double | Workers covered by a collective bargaining agreement in thousands. |
| p_members | double | Percent of employed workers who are union members. |
| p_covered | double | Percent of employed workers who are covered by a collective bargaining agreement. |
| state_abbreviation | character | State abbreviation. |
| year | double | Year of the survey. |
As in our previous Tidy Tuesday blog, I believe we can address the following questions using this dataset:
What are the overarching trends in the labor force, particularly regarding union memberships?
Do specific demographic groups or occupations display a higher likelihood of union membership?
Do union members experience any financial advantages or benefits compared to non-union workers?
Which states have the highest number of union members or affiliated unions?
What are the overarching trends in the labor force, particularly regarding union memberships?
The figure below shows that total wage and salary employment has grown steadily, whereas both union membership and the number of workers covered by a collective bargaining agreement have declined over the decades.
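A figure like this can be built from the demographics data in two steps: select the national rows, then draw the three series. The sketch below assumes the national rows carry the facet label "all wage and salary workers"; that label and the function names are assumptions, so check `demographics["facet"].unique()` first.

```python
import pandas as pd


def national_trend(demographics, facet="all wage and salary workers"):
    """Return employment, members and covered (in thousands) by year for one facet."""
    cols = ["year", "employment", "members", "covered"]
    subset = demographics.loc[demographics["facet"] == facet, cols]
    return subset.sort_values("year").set_index("year")


def plot_national_trend(trend):
    """Draw the three series as lines on one axis (pandas uses matplotlib for this)."""
    ax = trend.plot(figsize=(8, 4))
    ax.set_xlabel("Year")
    ax.set_ylabel("Workers (thousands)")
    ax.set_title("Employment, union members and covered workers")
    return ax
```

With the data loaded into a DataFrame named `demographics`, `plot_national_trend(national_trend(demographics))` would draw the chart. Because employment is much larger than the two union series, a log scale or separate panels may make the decline easier to see.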
Do specific demographic groups or occupations display a higher likelihood of union membership?
Next, we look for demographic differences in union membership and identify the sectors where membership is most common. We'll use the "p_members" column, which represents the percentage of employed workers who are union members.
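One way to prepare "p_members" for this comparison is to pivot it into a year-by-group table. This is a sketch under the assumption that demographic facets share a common prefix such as "demographics: " and that each year and facet pair appears only once; the prefix and the function name are assumptions, not taken from the post.

```python
import pandas as pd


def membership_by_group(demographics, prefix="demographics: "):
    """Pivot p_members into a year x group table for facets starting with `prefix`."""
    mask = demographics["facet"].str.startswith(prefix, na=False)
    subset = demographics.loc[mask].copy()
    # Drop the shared prefix so the column names are just the group labels.
    subset["group"] = subset["facet"].str[len(prefix):]
    table = subset.pivot(index="year", columns="group", values="p_members")
    return table.sort_index()
```

The resulting table has one column per group, so `membership_by_group(demographics).plot()` draws one line per group. Calling it with a different prefix, for example one shared by the public sector facets, gives the sector view. It is also worth checking whether the column is stored as a fraction or a percentage before labeling the axis.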
As evident in Figure 2, a substantial decrease in union membership is observable across nearly all demographic groups over the decades.
While certain professions exhibit declining trends in union memberships, it’s noteworthy that some public sector occupations, such as postal service, police, and local government jobs, maintain the highest and most consistent levels of union participation (Figure 3).
Do union members experience any financial advantages or benefits compared to non-union workers?
The next figure gives a direct comparison of mean hourly wages for union members and non-union workers.
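The comparison can be prepared directly from the "union_wage" and "nonunion_wage" columns. The sketch below again assumes the facet label "all wage and salary workers" for the national rows, and the `gap` column is an addition for illustration.

```python
import pandas as pd


def wage_comparison(wages, facet="all wage and salary workers"):
    """Return union and nonunion mean hourly wages by year, plus their difference."""
    cols = ["year", "union_wage", "nonunion_wage"]
    out = wages.loc[wages["facet"] == facet, cols]
    out = out.sort_values("year").set_index("year")
    out["gap"] = out["union_wage"] - out["nonunion_wage"]  # nominal dollars per hour
    return out
```

Plotting `wage_comparison(wages)[["union_wage", "nonunion_wage"]]` gives the two lines. Since the wages are in nominal dollars, part of the upward trend in both series reflects inflation, so the gap between the lines is more informative than their level.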
As with the previous figures, we now turn to differences across demographic groups and job sectors. For this we use the "union_wage_premium_raw" column, which gives the percentage difference between union and non-union wages.
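Before plotting, a per-facet summary of the premium shows which groups stand out and how much each has varied. This is a minimal sketch; `premium_summary` is a hypothetical helper, and whether the column is stored as a fraction or a percentage should be checked in the data.

```python
import pandas as pd


def premium_summary(wages, column="union_wage_premium_raw"):
    """Lowest, highest and most recent premium per facet, largest peak first."""
    ordered = wages.sort_values("year")  # so "last" picks the latest year
    summary = ordered.groupby("facet")[column].agg(
        lowest="min", highest="max", latest="last"
    )
    return summary.sort_values("highest", ascending=False)
```

Facets with a negative value in the `latest` column are those where non-union workers currently earn more on average. The same call with `column="union_wage_premium_adjusted"` shows how much of each gap remains after the controls are applied.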
The largest differences in hourly wages between unionized and non-unionized workers (Figure 5) are found in the construction sector, where the premium has fluctuated over the decades from below 40% to as high as 80%. Almost all sectors show higher earnings for union members; the exceptions are federal, manufacturing, and wholesale/retail workers, particularly in recent years.
In addition to gender-specific differences, it’s intriguing to observe the relationship between education levels and wage disparities between unionized and non-unionized workers. Workers with education levels below college tend to benefit more from union membership, whereas, in general, individuals with higher education levels tend to have lower wages when affiliated with a union.
Which states have the highest number of union members or affiliated unions?
To illustrate variations in union membership across states, we will use Lets-Plot's support for visualizing spatial data. Because the dataset is extensive, we focus on a subset: the most recent year, 2022, across all sectors.
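The subset described above can be selected with a simple filter. In this sketch the sector label "Total" is an assumption about how the all-sector rows are coded in the states data, so check `states["sector"].unique()` first; `state_membership` is a hypothetical helper name.

```python
import pandas as pd


def state_membership(states, year=2022, sector="Total"):
    """Return one row per state for the given year and sector, highest share first."""
    mask = (states["year"] == year) & (states["sector"] == sector)
    cols = ["state", "state_abbreviation", "p_members"]
    ranked = states.loc[mask, cols].sort_values("p_members", ascending=False)
    return ranked.reset_index(drop=True)
```

The resulting table has one "p_members" value per state, which is the shape a choropleth needs: it can be joined to state boundaries on the state name and mapped to fill color. `state_membership(states).head()` and `.tail()` also give a quick ranking without drawing the map.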
As evident in Figure 7, certain states like New York, Alaska, Hawaii, Washington, California, and Oregon have a significant portion of their workforce as union members. Conversely, states like Utah, South Dakota, and the Carolinas display lower percentages of union membership among their workforce.
There you have it! I hope you found this week’s Tidy Tuesday analysis insightful. I would encourage everyone to dive deeper into this dataset as I’ve only scratched the surface; there’s a wealth of knowledge waiting to be uncovered here.
Happy coding, and see you in our next exploration!