  • Open access
  • Published: 22 October 2022

The impact of gender diversity on scientific research teams: a need to broaden and accelerate future research

  • Hannah B. Love   ORCID: orcid.org/0000-0003-0011-1328 1 , 2 ,
  • Alyssa Stephens 2 , 3 ,
  • Bailey K. Fosdick   ORCID: orcid.org/0000-0003-3736-2219 2 ,
  • Elizabeth Tofany 2 &
  • Ellen R. Fisher   ORCID: orcid.org/0000-0001-6828-8600 1 , 2 , 4  

Humanities and Social Sciences Communications volume 9, Article number: 386 (2022)


  • Business and management
  • Complex networks
  • Science, technology and society

Multiple studies from the literature suggest that a high proportion of women on scientific teams contributes to successful team collaboration, but how the proportion of women impacts team success and why this is the case, is not well understood. One perspective suggests that having a high proportion of women matters because women tend to have greater social sensitivity and promote even turn-taking in meetings. Other studies have found women are more likely to collaborate and are more democratic. Both explanations suggest that women team members fundamentally change team functioning through the way they interact. Yet, most previous studies of gender on scientific teams have relied heavily on bibliometric data, which focuses on the prevalence of women team members rather than how they act and interact throughout the scientific process. In this study, we explore gender diversity in scientific teams using various types of relational data to investigate how women impact team interactions. This study focuses on 12 interdisciplinary university scientific teams that were part of an institutional team science program from 2015 to 2020 aimed at cultivating, integrating, and translating scientific expertise. The program included multiple forms of evaluation, including participant observation, focus groups, interviews, and surveys at multiple time points. Using social network analysis, this article tested five hypotheses about the role of women on university-based scientific teams. The hypotheses were based on three premises previously established in the literature. Our analyses revealed that only one of the five hypotheses regarding gender roles on teams was supported by our data. These findings suggest that scientific teams may create ingroups , when an underrepresented identity is included instead of excluded in the outgroup , for women in academia. This finding does not align with the current paradigm and the research on the impact of gender diversity on teams. 
Future research to determine if high-functioning scientific teams disrupt rather than reproduce existing hierarchies and gendered patterns of interactions could create an opportunity to accelerate the advancement of knowledge while promoting a just and equitable culture and profession.


Introduction

Diversity in scientific teams is often a catalyst for creativity and innovation (Misra et al., 2017 ; Smith-Doerr et al., 2017 ), and numerous studies have documented that gender diversity, the equitable representation of genders, is important for the development, process, and outcomes of scientific teams (Bear and Woolley, 2011 ; Hall et al., 2018 ; Misra et al., 2017 ; Riedl et al., 2021 ; Smith-Doerr et al., 2017 ; Woolley et al., 2010 ). Furthermore, research has found evidence that a higher proportion of women on a team increases collective intelligence (Riedl et al., 2021 ; Woolley et al., 2010 ), and that gender-balanced teams lead to the best outcomes for group process (Bear and Woolley, 2011 ; Carli, 2001 ; Taps and Martin, 1990 ). When scientists hear that the proportion of women influences team performance, they often ask “What proportion is needed, and why does the proportion of women impact team success?”

The answers to these questions remain unclear. To date, most research on the impact of gender composition on scientific teams uses only quantitative metrics (e.g., comparing team rosters and bibliometric data) (Badar et al., 2013; Lee, 2005; Lerback et al., 2020; Pezzoni et al., 2016; Wagner, 2016; Zeng et al., 2016). Although these quantitative metrics provide a reasonable starting point, they emphasize the presence of women rather than their levels of integration or participation, which may perpetuate tokenism on scientific teams. As Smith-Doerr et al. (2017) reported:

Our journey through the literature demonstrated a critical difference between diversity as the simple presence of women and minority scientists on teams and in workplaces, and their full integration (p. 140).

Similarly, Bear and Woolley ( 2011 ) conducted a meta-analysis of the literature from multiple disciplines and found that when diverse team members were integrated holistically, team diversity contributed to innovation. Conversely, in studies where teams had diverse membership but failed, these teams were often relying on token members and did not have authentic and full integration of those diverse members. Bear and Woolley ( 2011 ) suggest that the proportion of women on a team roster should be studied as follows:

It is not enough to simply examine the number of women in a particular institution or role. … In order to be truly effective, the role that women play in scientific teams should also be taken into consideration and promoted in order to yield the substantial benefits of increased gender diversity (p. 151).

These recent studies signal a paradigm shift in how the literature perceives diversity on teams, because historically diversity on teams was perceived as negative. Baugh and Graen (1997) described how teams with women and minorities were perceived to be less effective. Benschop and Doorewaard (1998) described how teams simply (re)produce gender inequality, and they did not see a future in teams providing opportunities for women. Guimerà et al. (2005) claimed that while diversity may potentially spur creativity, it typically promotes conflict and miscommunication. Today, it is well accepted in the literature that diversity in teams matters for creating new knowledge and solving complex global problems: studies in the science of team science (SciTS), knowledge innovation, creativity, and other fields have documented that diversity is important for team process, interactions, and outcomes (Bear and Woolley, 2011; Hall et al., 2018; Misra et al., 2017; Riedl et al., 2021; Soler-Gallart, 2017; Ulibarri et al., 2019; Woolley et al., 2010).

Numerous researchers have called for varied approaches to the study of women on teams. Madlock-Brown and Eichmann (2016) wrote that we “need a multi-pronged approach to deal with the persisting gender gap issues” (p. 654). Bozeman et al. (2013) explained that we understand collaboration from a bibliometric standpoint, but that much more qualitative research is needed about the meaning of collaboration and its more informal side, including mentoring, ingrained biases, and balancing collaborations (Reardon, 2022). Further, many of these studies about women on teams were conducted with undergraduate students within curricular settings, not with real-world scientific teams. Fundamentally, to understand gender patterns in scientific collaborations, qualitative and mixed-methods research approaches are needed that study the process of scientific team development and not just team outcomes (Keyton et al., 2008; Wooten et al., 2014).

This study focused on 12 interdisciplinary university scientific teams that were part of an institutional team science program from 2015 to 2020 aimed at cultivating, integrating, and translating scientific expertise. Team science is research conducted collaboratively by small teams or larger groups (Cooke and Hilton, 2015 ). The program included multiple forms of evaluation, including participant observation, focus groups, interviews, and surveys at multiple time points. More specifically, gender diversity was explored by using mixed-methods data from team interactions to investigate two primary research questions: (1) what is the role of women on scientific teams? and (2) how do women impact team interactions?

Members of the 12 teams completed social network surveys about their relationships, including who they seek advice from, who is a mentor, who serves on student committees, who they learn from, and who they collaborate with. Social network analysis studies the behavior of the individual at the micro level, the pattern of relationships (network structure) at the macro level, and the interactions between the two (Stokman, 2001). In the context of team science, social network analysis provides insights into how interactions are related to team success and how the social processes teams use support the knowledge-creation process (Cravens et al., 2022; Giuffre, 2013; Granovetter, 1977; Love et al., 2021; Zhang et al., 2020). Utilizing these data, we calculated the indegree for each team member’s relationship with other team members. Indegree quantifies the number of other team members who stated they had the selected relationship with the given individual. For example, the advice indegree counts the number of other team members who reported receiving advice from that person. To compare results across the teams, the indegree measures were scaled by the number of respondents to account for the total number of possible connections for individuals. These social network measures allowed us to test five hypotheses, based on the current literature in team science and other disciplines, about how women impact team interaction and collaboration.

Hypothesis 1 : Women faculty will have a higher indegree than men faculty within the mentoring and student committee networks. Men faculty members will have a higher indegree than women faculty members in the advice and leadership networks.

Hypothesis 2 : Men at all career stages will be more likely to be considered a leader on the team than women, measured by having a higher average scaled indegree in the leadership network.

Hypothesis 3 : Various networks will be correlated as follows:

  • Leadership and advice networks will be positively correlated.
  • Mentoring networks will not be positively correlated with leadership or advice networks.
  • Mentoring and student committees will be correlated.

Hypothesis 4 : The social and collaboration relations will be more positively correlated for women than for men.

Hypothesis 5 : Non-faculty team members will have more social connections on teams with a senior woman relative to those on teams without a senior woman.

These hypotheses are grounded in the literature on the persistent, latent, and subtle ways gender inequality is reproduced within organizations (Acker, 1992; Benschop and Doorewaard, 1998; Cole, 2004; Fraser, 1989; Gaughan and Bozeman, 2016; Madlock-Brown and Eichmann, 2016; Sprague and Smith, 1989). Many theories regarding the impact of gender diversity assume that teams reproduce socialized patterns of behavior. West and Zimmerman (1987) wrote that gender is not a biological concept but a social construction that “involves a complex of socially guided perceptual, interactional, and micropolitical activities that cast particular pursuits as expressions of masculine and feminine ‘natures’”. Gender is thus created by social organization and performed in our everyday lives and the ways we interact with one another (Butler, 1988). Gender, albeit a social construct, is an influential schema that impacts behaviors and interactions in society (West and Zimmerman, 1987).

According to West and Zimmerman (1987) and Butler (1988), the process of gender socialization includes ideas about who is a leader, how leaders should act, and even what leaders should look like. Many studies have found that women may not be perceived as leaders even when their status or contributions to the team are high (Bunderson, 2003; DiTomaso et al., 2007; Humbert and Guenther, 2017; Joshi, 2014). Other studies have found that men were more influential in groups, even when they were in the minority (Craig and Sherif, 1986), and that teams with women and minorities were perceived to be less effective (Baugh and Graen, 1997). Furthermore, although leadership responsibilities often become attached to specific roles, they can also be conferred and performed based on the perception of the individual qualities or capabilities of team members (Butler, 1988). For example, if a woman is a principal investigator (PI), a man on the team may also be considered a leader, and vice versa. These conferred roles may impact individual responsibilities and further solidify the perception of who is the team leader.

Perceptions about the roles of women and men can also impact the responsibilities they are assigned during meetings and the duties they are expected to perform in the workplace. In academia, faculty are frequently expected to engage in service work to support the university, the discipline, and the community. Service work may include mentoring, advising, and serving on committees. Recent studies support what has long been perceived within academia: when controlling for rank, race/ethnicity, and discipline, women spend significantly more hours on service work than their male colleagues (Guarino and Borden, 2017; Misra et al., 2011; Urry, 2015). In STEM disciplines, women spend a higher percentage of their time on mentoring than their male counterparts (21% for women vs. 15% for men) (Misra et al., 2011). Researchers have not yet explored whether team science exacerbates or mitigates this disparity in service work.

The literature has documented that collaboration patterns are different for women and men. Women faculty and students participate in more interdisciplinary research in almost all fields at every career stage (Rhoten and Pfirman, 2007). In addition, women tend to have more collaborators than men (Bozeman and Gaughan, 2011), and studies have found that being well-connected correlates with success for women (Madlock-Brown and Eichmann, 2016). Is it possible that having a senior woman on the team creates a culture of collaboration, such that non-faculty, who might traditionally be marginalized on a team, are better connected? We evaluate this here by comparing the connectedness of non-faculty on teams with and without a senior woman.

In part, the lack of understanding about why gender diversity matters on scientific teams results from primarily studying member demographic profiles rather than studying how teams function, including exchanges of knowledge, power dynamics, and the team development process, which is critical to team success (Smith-Doerr et al., 2017). This study moves beyond team composition to examine real-world scientific teams through analysis of relational data in order to answer two questions: what is the role of women on scientific teams, and how do women impact team interactions?

This study was conducted at a land grant, R1 University in the western region of the United States. The primary sample for this study was 12 self-formed, interdisciplinary scientific teams with varied research foci, who were participants in a competitive university-funded team science program from 2015 to 2020. To apply for funding, each team submitted a written application and competed in a pitch fest (a brief oral presentation of their proposed project) that was followed by an intensive question and answer session by the review team. The topics for the interdisciplinary teams that were selected were broadly defined across STEM-related fields. The teams were expected to contribute to high-level program goals, which included:

  • Increase university interest in multi-dimensional, systems-based problems
  • Leverage the strengths and expertise of a range of disciplines and fields
  • Shift the funding landscape towards investing in team science/collaborative endeavors
  • Develop large-scale proposals; high caliber research and scholarly outputs; new, productive, and impactful collaborations

These overarching goals were measured by having the teams report on a variety of outcome metrics, including publications, proposals submitted, and awards received.

Participation in the team science program occurred through two cohorts and lasted 24–30 months for each cohort. However, a team in the second cohort left the program after 12 months. During the program, teams met with administrative leadership, the team science research team, and some external partners every 3–4 months to provide progress updates on stated milestones and receive feedback and mentorship. Additional support was provided through individualized trainings/workshops approximately every few months throughout the program. These sessions provided additional instruction on team science principles, social network analysis interpretation, marketing/branding, diversity and inclusion, opportunity identification, philanthropic fundraising, technology transfer, visioning, and team management/leadership. Some of the training was attended by multiple teams, but often these were specifically designed for the needs and developmental stage of each team. An additional team volunteered to participate in the study but was not part of the formal program. This team, also self-formed, was an interdisciplinary team that had received a large award through a federal grant. The 13 teams were randomly assigned a number from 1 to 13 to maintain anonymity and are referred to in this study by their team number. Team 2 was excluded from the study altogether because two of the authors were members of this team.

Data collection

Multiple types of evaluation data, at multiple points in time, were collected throughout the university-based team science program including participant observation, focus groups, turn-taking data, rosters, interviews, and surveys. This study utilized the resulting data from rosters, participant observation, field notes, and responses to a social network survey. Data for this article is from social network surveys at the conclusion of the program or the closest associated data point. Selecting data from a similar timepoint follows the recommendations of Wooten et al. ( 2014 ) who differentiated between development, process, and outcome metrics for scientific teams.

Teams submitted rosters with demographic information including name, email, self-identified gender, title, college, department, and role on the team (i.e., PI, member, graduate students, etc.). Rosters were updated annually during the program and provided the data to define senior woman and junior faculty and other demographic categories.

Social network survey

Each team member on the roster was sent an email after the program end date and was asked to complete an online social network survey that had two sections: demographic and social network relational questions (see Appendix Table 2 ). Following IRB protocol #19-8622H, participation was voluntary, and all subjects were identified by name on the social network survey to allow for the complete construction of social networks. Names were deleted prior to data analysis and result reporting.

To ensure that respondents had the option to select a self-identified gender, the social network survey included a demographic question that asked participants to self-identify their gender by filling in a blank space rather than choosing from a prescribed drop-down list. This was the gender attribute used for analysis in this article. Two respondents did not answer the gender demographic survey question, and the roster data was used for these participants. There was no variability in the level of missingness across questions. Respondents either completed the survey or did not.

The network survey’s relational questions asked about the presence and absence of interactional mentoring, advice, leadership, and collaborative relationships with other members of the team. The first set of questions was developed by the research team primarily to collect information about scientific collaborations since joining the team. The survey asked, who have you:

  • talked about possible joint research/ideas/concepts/connections
  • worked on research, collaborations, tech projects, or consulting projects
  • worked on joint publications, presentations, or conference proceedings
  • worked on or submitted a grant proposal
  • sat on a student’s committee together (or is a member of your thesis/dissertation committee)

The second set of questions focused on social relationships within the team, including:

  • I learn from [this person]
  • I seek advice from [this person]
  • I hang out with [this person] for fun
  • [this person] is a leader on the team
  • [this person] is a mentor to me
  • [this person] is a friend
  • [this person] energizes me

Participant observation and field notes

A researcher attended two to six team meetings for each team to collect observational data. There were two exceptions to this as Team 1 did not have face-to-face team meetings, precluding participant observation; and Team 5 did not consent to observation at their meetings. After the meetings, the researcher recorded field notes to provide qualitative insights into the progress of the team development, their patterns of collaboration, and gender interactions as suggested by Marvasti ( 2004 ). The field notes supported the development of the senior women classification (see Appendix Table 1 for classification definitions). In addition to roster information, many teams had separate leadership teams that met and determined the scientific direction of the team. If a team had a woman on the leadership team, as recorded in field notes, then they received the designation of having a senior woman .

Statistical analysis

RStudio (R Studio Team, 2020 ) was used to analyze the social network data. The data were summarized using outdegree, indegree, and average degree. The outdegree of an individual is a measure of how many other team members they indicated receiving advice, mentorship, etc. from on the team. Alternatively, the indegree of an individual is a measure of how many other team members reported receiving advice, or mentorship, from that person. Average degree is the average number of immediate connections (i.e., indegree plus outdegree) for a person in a network (Giuffre, 2013 ; Hanneman and Riddle, 2005 ). To compare results across the teams, the indegree and outdegree measures were scaled by the number of respondents to account for the total number of possible connections for individuals (which is a function of both team size and response rate). The scaled indegree is thus the proportion of the team that named that team member for a given category. For example, if a team member has a scaled mentor indegree of 0.10, then 10% of the responding team members consider this individual to be a mentor. Confidence intervals for scaled indegrees were calculated using a t -distribution due to limited sample size.
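The scaling and interval calculation described above can be illustrated with a minimal Python sketch. It is not the authors' R code. The function names, the toy team, and the choice to ignore self-nominations and divide by the number of respondents other than the person being rated are all assumptions made for illustration; the t critical value is supplied by the caller (for example, from a t-table) rather than computed.

```python
from math import sqrt
from statistics import mean, stdev


def scaled_indegree(edges, respondents, members):
    """Return {member: scaled indegree} for one relation (e.g. mentoring).

    edges       -- set of (rater, target) pairs: rater named target
    respondents -- set of team members who completed the survey
    members     -- set of everyone on the roster (respondents or not)

    Assumption: self-nominations are ignored, and the denominator is the
    number of respondents other than the target.
    """
    result = {}
    for person in members:
        raters = respondents - {person}
        named_by = {r for (r, t) in edges if t == person and r in raters}
        result[person] = len(named_by) / len(raters) if raters else 0.0
    return result


def t_confidence_interval(values, t_crit):
    """Return mean -/+ t_crit * s / sqrt(n); t_crit comes from the caller."""
    n = len(values)
    centre = mean(values)
    half_width = t_crit * stdev(values) / sqrt(n)
    return centre - half_width, centre + half_width


# Hypothetical five-person team; "e" did not respond to the survey.
members = {"a", "b", "c", "d", "e"}
respondents = {"a", "b", "c", "d"}
mentor_edges = {("a", "b"), ("c", "b"), ("d", "b"), ("a", "e"), ("b", "a")}

mentor_scaled = scaled_indegree(mentor_edges, respondents, members)
print(mentor_scaled["b"])  # 1.0: all three possible raters named b
print(mentor_scaled["e"])  # 0.25: one of four respondents named e

# 4.303 is the two-sided 95% critical value of t with 2 degrees of freedom.
low, high = t_confidence_interval([0.2, 0.4, 0.6], t_crit=4.303)
```

Dividing by the number of respondents rather than the roster size keeps teams with low response rates comparable to teams with high ones, which is the stated purpose of the scaling.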

Responses to the relational questions were first analyzed separately and then combined for further statistical analysis. Three measures were created: collaboration, social, and professional support. The collaboration measure combined three questions: worked on research, collaborations, tech projects, or consulting projects; worked on joint publications, presentations, or conference proceedings; and worked on or submitted a grant proposal. The social measure combined two questions: I hang out with [this person] for fun, and [this person] is a friend. Finally, the professional support measure combined three questions: I seek advice from [this person]; [this person] is a mentor to me; and sat on a student’s committee together (or is a member of your thesis/dissertation committee) (see Appendix Table 2 for Terms and Associated Survey Questions).
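As a rough illustration of how the combined measures could be built, the Python sketch below treats each survey question as a set of (rater, target) pairs and takes the union of the component questions. The text does not say how the questions were combined, so the union rule is an assumption, as are the short question keys and the toy responses.

```python
# Hypothetical short keys standing in for the survey questions in the text.
MEASURES = {
    "collaboration": ["research", "publications", "grant_proposal"],
    "social": ["hang_out", "friend"],
    "professional_support": ["advice", "mentor", "student_committee"],
}


def combine(responses, measure):
    """Union of the (rater, target) ties reported under any component question.

    Assumption: a tie exists in the combined measure if it was reported
    for at least one of the questions that make up that measure.
    """
    combined = set()
    for question in MEASURES[measure]:
        combined |= responses.get(question, set())
    return combined


# Toy responses: each question maps to a set of (rater, target) pairs.
responses = {
    "hang_out": {("a", "b")},
    "friend": {("a", "b"), ("b", "c")},
    "advice": {("c", "a")},
}

social = combine(responses, "social")
professional_support = combine(responses, "professional_support")
collaboration = combine(responses, "collaboration")
```

A union keeps the combined network binary, so degree-based summaries can be computed on it in the same way as on a single question.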

In addition, data from the social network relational questions were used to construct multiple social network diagrams, wherein nodes represent the team members, and an edge exists from participant A to participant B if A perceived a relation with B. For example, in the mentorship network, a link from A to B signified that A considered B to be a mentor.

Field notes were analyzed using a constant comparative method (Mathison, 2013 ) to provide qualitative insights into the progress of overall and individual team development, patterns of collaboration, and gender interactions as suggested by Marvasti ( 2004 ).

Classifications

For analysis purposes, three classifications were created from the demographic data. Senior woman indicates there was a woman PI or a woman on the leadership team. Faculty were defined as assistant, associate, and full professors. Non-faculty were defined as undergraduate students, graduate students, postdocs, research associates, community partners, and project managers. In the study, 78.5% of faculty and 77.6% of non-faculty completed the survey (see Appendix Fig. 1 for more details on response rate and Appendix Table 1 for terms and definitions).

Demographic data

Of the 204 team members, 160 (78.2%) completed the survey. By gender, 84% of women and 73% of men completed the survey. Table 1 provides demographic data by team number. Team size ranged from 6 to 30 members, and the average number of team members was 15. The university had seven colleges, and all teams had representation from three to seven colleges.

Hypotheses testing

Test results of the five study hypotheses are presented below.

Hypothesis 1 : Women faculty will have a higher indegree than men faculty within the mentoring and student committee networks, and men faculty members will have a higher indegree than women faculty members in the advice and leadership networks.

The first hypothesis was designed to investigate if women were perceived to be doing more service work and emotional labor (mentoring and student committee networks), and men were perceived as being leaders (leader and advice networks) (Guarino and Borden, 2017 ; Misra et al., 2011 ; Urry, 2015 ).

Figure 1 compares the average indegree values of men and women on each team in four social network diagrams (mentoring, student committees, leader, and advice). The data in Fig. 1 do not support the hypothesis that more team members went to women faculty for mentoring and for serving on student committees. Further, the data did not support that more team members went to men faculty for advice or reported viewing them as leaders.

Figure 1. The average scaled indegree values of men and women on each team are plotted against one another, where the size of the dot reflects the number of team members that completed the survey. When the number of respondents is low (a small dot), the scaled indegree is expected to be more variable, whereas when the number of respondents is high (a large dot), the scaled indegree is expected to be less variable and more representative of the whole team’s perceptions. Each graph reports a different social network question (mentor, student committee, advice, and leader).

The Fig. 1 mentoring network does, however, illustrate that teams in the study either engaged or did not engage in mentoring. On teams where women had a high mentoring indegree, men also had a high indegree in the mentoring network. This indicates that mentoring was team-specific rather than gender-specific. This aligns with other studies about team processes that found team norms (like mentoring) impact the behaviors and processes of teams (Duhigg, 2016 ; Winter et al., 2012 ).

Hypothesis 2 : Men at all career stages are more likely to be considered a leader on the team than women, measured by having a higher average scaled indegree in the leadership network (Table 2 ).

Literature in business, political science, and sociology reports that men are more likely to be perceived as leaders (Baugh and Graen, 1997; Bunderson, 2003; Craig and Sherif, 1986; DiTomaso et al., 2007; Humbert and Guenther, 2017; Joshi, 2014). Based on this, we hypothesized that these perceptions would also be present in scientific teams (Table 2, Fig. 2). In the study, both men faculty and men non-faculty were more likely to be reported as a leader on the team; however, this finding was not statistically significant based on a 95% confidence interval (CI) (Table 2).

Figure 2. The scaled leadership indegree values for men and women in each category (faculty and non-faculty) are plotted against one another. Faculty were more likely to be considered leaders than non-faculty, but there were no significant differences between reporting men or women as leaders on scientific teams.

Figure 2 illustrates the scaled indegree for women and men faculty and non-faculty, which shows faculty are more likely to be considered leaders than non-faculty. Nevertheless, there were no significant differences in whether team members reported men or women as leaders on scientific teams.

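Hypothesis 3 : Based on socialized gendered perceptions, various networks will be correlated as follows:

  • Leadership and advice networks will be positively correlated.
  • Mentoring networks will not be positively correlated with leadership or advice networks.
  • Mentoring and student committees will be correlated.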

The third hypothesis focused on whether gendered perceptions resulted in certain network diagrams being correlated. Previous studies have found that men are more likely to be perceived as leaders (Baugh and Graen, 1997 ; Bunderson, 2003 ; Butler, 1988 ; Craig and Sherif, 1986 ; DiTomaso et al., 2007 ; Humbert and Guenther, 2017 ; Joshi, 2014 ) and women are more likely to be perceived as mentors or caretakers (Guarino and Borden, 2017 ; Misra et al., 2011 ; Urry, 2015 ). These perceptions are sedimented in the language used to describe men and women (Sprague and Massoni, 2005 ).

Figure 3. The advice, leader, and mentor networks were highly correlated with one another but only weakly correlated with the student committee network.

Based on this literature, we hypothesized that the leadership and advice networks would be correlated because both leading and giving advice suggest a greater power differential. Second, the mentoring network would not be correlated with leadership or advice networks because mentoring is more closely aligned with caregiving activities, which are considered more feminine. Third, the mentor and student committee networks would be correlated because these acts are associated with caretaking. Here, we tested if the networks related to leadership were correlated and if networks related to mentorship and service work such as serving on student committees were correlated.

Figure 3 illustrates the correlations for four of the network diagrams (mentoring, student committee, advice, and leadership) and reports the significance. The first gendered perception, that the leadership and advice networks would be correlated, was validated by the data. In the study, the leadership and advice networks were correlated (0.83). However, the hypothesis that the mentoring network would not be correlated with leadership (0.82) and advice (0.84) was not supported. These network diagrams were correlated, indicating team members who reported other team members as being leaders also reported that they received advice and mentoring from them. Finally, the hypothesis that mentoring and student committee diagrams would be correlated was also not validated by the data (0.32). One factor that could be contributing to these results comes from studies that show perceived organizational support, as well as perceived leader support, correlate with creativity and satisfaction in the workplace (Handley et al., 2015 ; Moss-Racusin et al., 2012 ; Smith et al., 2015 ). On the teams, members that are perceived as leaders are likely to provide support to others on the team. Notably, these studies did not explicitly examine gender in their findings.
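The text does not state how the network correlations were computed. One plausible reading, sketched below in Python purely as an assumption, is a Pearson correlation between team members' scaled indegrees in two networks. The `pearson` helper and the toy values are hypothetical and are not taken from the study's data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / (spread_x * spread_y)


# Toy scaled indegrees for the same four people in two networks.
leader_indegree = [0.8, 0.1, 0.5, 0.0]
advice_indegree = [0.7, 0.2, 0.6, 0.1]

r_leader_advice = pearson(leader_indegree, advice_indegree)
print(round(r_leader_advice, 2))
```

Under this reading, a high value means the people most often named as leaders are also the people most often named as sources of advice.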

A growing body of literature seeks to understand the connection between interpersonal relationships and knowledge innovation (Reference Blinded). We investigate this by considering how three types of interactions (collaborative, social, and professional) are intertwined on scientific teams. The purpose of this hypothesis was to closely examine the collaboration patterns of men and women and the connection between interpersonal relationships and knowledge creation. To create the measures in this hypothesis, social network survey questions were combined. For example, the measure social is a combination of "I hang out with [this person] for fun" and "[this person] is a friend" (see the Analysis section for descriptions of all the measures).

To test what proportion of team members collaborate, given that they are also social with these individuals, we identified, for each person, the team members they were social with and then calculated the proportion of those members with whom they were also collaborating. The results for this measure are given in Table 3 as proportion collaboration given social. Other items in Table 3 were developed in a similar manner.
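The calculation described above can be sketched as a conditional proportion over sets of ties. This is a minimal illustration under stated assumptions: the function name `proportion_given`, the dictionary layout, and the people labelled A to D are hypothetical and do not come from the study data.

```python
def proportion_given(target_ties, given_ties):
    """For each person, the share of their `given` contacts (e.g. social)
    who are also `target` contacts (e.g. collaboration)."""
    result = {}
    for person, given in given_ties.items():
        if not given:
            # The proportion is undefined when a person reports no `given` ties.
            result[person] = None
            continue
        target = target_ties.get(person, set())
        result[person] = len(given & target) / len(given)
    return result

# Hypothetical ties reported by three respondents.
social = {"A": {"B", "C", "D"}, "B": {"A"}, "C": set()}
collab = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A"}}

# "Proportion collaboration given social" for each respondent.
props = proportion_given(collab, social)
```

Swapping the two arguments gives the reverse measure (proportion social given collaboration), which is why the two directions can differ for the same person.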

Although our results indicate no statistically significant differences between men and women, we found that both men and women have intertwined relationships. If a team member is in one of a person's networks (e.g., collaboration), it is likely that they are also in another of that person's networks (e.g., social). Furthermore, the proportions of intertwined relationships across the collaboration, social, and professional support networks were higher for men on every measure except proportion social given professional support (Table 3).

Numerous studies have attempted to tease apart gendered approaches to different collaboration styles and whether these have any impact on scientific collaborations (Bozeman et al., 2013; Madlock-Brown and Eichmann, 2016; Misra et al., 2017; Zeng et al., 2016). To build on this body of literature, this hypothesis tests the impact, if any, of senior women’s leadership on their collaborations and on the team network.

Figure 4 illustrates the scaled average indegree on the whole team when there are women in senior positions. A high average indegree for the team indicates that more team members are interacting and socializing on the team. The average scaled indegree on teams with a senior woman was 0.28 and without a senior woman was 0.20 (t-test p = 0.44; Cohen’s d effect size 0.51). The second graph in Fig. 4 illustrates the scaled average indegree of non-faculty when there are women in senior positions. The average scaled indegree on teams with a senior woman was 0.27 and without a senior woman was 0.16 (t-test p = 0.42; Cohen’s d effect size 0.55). Thus, there was no evidence to conclude that senior women influenced the social interactions on the team.
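For readers unfamiliar with these quantities, the sketch below shows one way to compute a scaled average indegree and a Cohen's d effect size. It rests on assumptions that this section does not confirm: indegree is scaled by the maximum possible indegree (team size minus one), and Cohen's d uses the pooled standard deviation. The function names and all numbers are hypothetical and are not the study's data.

```python
from statistics import mean, variance

def scaled_average_indegree(adjacency):
    """Mean indegree divided by the maximum possible indegree (n - 1).
    adjacency[i][j] = 1 means person i reports a tie to person j."""
    n = len(adjacency)
    indegrees = [sum(adjacency[i][j] for i in range(n) if i != j)
                 for j in range(n)]
    return mean(indegrees) / (n - 1)

def cohens_d(x, y):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / pooled_var ** 0.5

# Hypothetical 3-person team: 3 reported ties out of 6 possible.
team = [[0, 1, 1],
        [0, 0, 1],
        [0, 0, 0]]
team_score = scaled_average_indegree(team)

# Hypothetical team-level scores for the two groups being compared.
with_senior_woman = [0.35, 0.22, 0.30, 0.25]
without_senior_woman = [0.18, 0.26, 0.15, 0.21]
effect = cohens_d(with_senior_woman, without_senior_woman)
```

With only a dozen teams, a sizeable effect size can coexist with a non-significant p-value, which is consistent with the pattern reported above. A p-value for the group comparison could be obtained with a two-sample t-test such as `scipy.stats.ttest_ind`, which is left out here to keep the sketch dependency-free.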

Fig. 4: Scaled average indegree for teams with and without a woman in a senior position. These average scaled indegree measures were then separated based on whether there was a senior woman leader on the team, and the average across all teams was marked by a black horizontal bar. Based on these data, there appears to be no systematic difference in the social interactions of teams with a woman in a senior position and teams without a woman in a senior position. Average scaled indegree of non-faculty on teams without a senior woman = 0.16. Average scaled indegree of non-faculty on teams with a senior woman = 0.27 (t-test p-value = 0.42).

This study explored the impact of gender diversity on 12 scientific teams by analyzing team development and process data. It investigated two primary research questions: What is the role of women on scientific teams? and How do women impact team interactions? We initially believed that the primary reason previous research had been unable to adequately explain the role of women on scientific teams and how women impact team interactions was, in part, the lack of qualitative and mixed methods studies. We based our initial hypothesis on the assumption that scientific teams reproduce existing patterns of inequality (Butler, 1988; West and Zimmerman, 1987). However, it was through the development of the five hypotheses for this study and the subsequent analysis of relational data that we learned that our assumption was in large part not supported.

Numerous studies have found evidence of systematic discrimination and bias in awarding grants (Ginther et al., 2011), acceptance of publications (Lerback et al., 2020; Salerno et al., 2019), language to describe women (Ross et al., 2017), promotion decisions (Régner et al., 2019), rewards (Mitchneck et al., 2016), and access to resources for research (Misra et al., 2017) in addition to other obstacles and forms of marginalization that are invisible and unacknowledged (Rhoten and Pfirman, 2007; Urry, 2015). Why did our data not replicate these findings? We conclude with the following possible explanations.

Preliminary studies in the SciTS literature have found that team science principles may simultaneously support the advancement of women in scientific fields; and complementarily, the inclusion of women on scientific teams may increase the success of these teams (McKean, 2016; Woolley et al., 2010). Further, including women and underrepresented populations on scientific teams has the potential to “serve as a strong entry point into scientific studies for women” (Rhoten and Pfirman, 2007, p. 72). Similarly, in sociology, Soler-Gallart (2017) found positive benefits for the whole team when scientists engaged in dialogic relations and interaction with the intention of overcoming gender barriers and discrimination. Could team science advance women in their scientific careers? If high-functioning scientific teams disrupt rather than reproduce existing hierarchies and gendered patterns of interactions, it increases the possibility that team science is a tool not only for accelerating the creation of knowledge but for the advancement of a more empowered, just, and equitable profession.

Literature has documented how including historically underrepresented identities in the ingroup changes attitudes and behaviors (Soler-Gallart, 2017). Allport et al. (1954) found that when members of an ingroup were in close contact and built connections with members of an outgroup, prejudice decreased. Initially, the theory about ingroups and outgroups was devised to describe race and ethnic relations; however, recent studies have generalized the findings to other topics including gender bias and discrimination (Pettigrew and Tropp, 2006). Today, numerous studies have documented that intergroup contact and connections can improve intergroup attitudes (Allport et al., 1954; Brewer, 2007; Dovidio et al., 2012; Pettigrew and Tropp, 2006). Is it possible that scientific teams create ingroups that include rather than exclude women?

The teams in this study were neither created nor developed in isolation. These teams had access to team development resources such as SciTS literature, team science training, and administrative expertise and support. The promotion and tenure package of the university selected for this study allowed faculty to include interdisciplinary and team accomplishments. Structures were in place to fund, train, build, and reward these teams. Many of these resources, interventions, and structures were designed and led by a group of nine women and one man. The women, especially, emphasized diversity, equity, and inclusion from team formation to building and rewarding successes. In addition, many of the sessions were customized to meet the needs of individual teams. Did these facilitators create an ingroup? Although we did not test the impact of these interventions and structures, other studies have previously hypothesized that modifying existing and often outmoded structures will positively impact outcomes for women (Gibbons et al., 1994; Hansson, 1999; Rhoten and Pfirman, 2007). Another study found that when team members participated in dialogic relations and interactions instead of using prestige to gain power, they were more willing to rethink concepts when presented with new information (Soler-Gallart, 2017). Specifically, in terms of women in science, Rolison (2000, 2004) developed a hypothesis recommending explicitly applying Title IX principles to support women in academia. She posited that providing equal funding opportunities and resources for women would result in equal opportunities for success. Another study attributed its team’s success to the inclusion of women, the community, and other diverse perspectives from the community (Soler-Gallart, 2017). Our findings suggest that the handful of women on our teams may have joined the ingroup in academia, albeit only for a short time.

It is important to note that we do not believe our results accurately reflect the university of study as a whole or academia in general. Team observations and resulting field notes documented numerous accounts of gender inequality and inequity where women were disempowered and had limited opportunities to contribute to the team. Moreover, we are confident that women on these teams have had individual experiences that would contradict our findings. A lack of evidence does not indicate that there is equality. Nevertheless, these results do suggest that scientific teams, developed with intention, may provide greater opportunities for women to amplify their contributions to science (McKean, 2016; Rhoten and Pfirman, 2007; Woolley et al., 2010).

Limitations

Previous studies on gender and scientific teams have used bibliometric data to understand patterns of collaboration. Other studies on teams have created teams in the lab using students and other volunteers. Although this study is unique and contributes to the literature, as the data are based on real-world scientific teams, we identified six limitations.

First, several teams were apprehensive about participating in SciTS research, and one team left the program after year one, resulting in limited data from those teams. Second, teams may have experienced the so-called Hawthorne effect (K. Baxter et al., 2015) and performed differently because they were part of a research study and a researcher regularly attended team meetings. All participant observations related to the positionality of the researcher were well-documented in field notes (P. Baxter and Jack, 2008; Greenwood, 1993; Marvasti, 2004).

Third, we defined senior women in a manner that would be inclusive to women with and without formal titles. The senior woman designation was given based on both formal titles and field notes. Some of the teams in our study had women who were the PI or in a designated leadership position with formal titles, and other teams had women on the leadership team. It is possible that the women on these teams were seen as leaders because of their position on the team, but that their leadership came without titles, awards, and recognition that might have been associated with those titles.

Fourth, it is possible that study participants had varying definitions of mentor, advice, and leader. We anticipated different interpretations in our study plan and as a result combined data in hypothesis four to detect and account for potential differences in definitions. Nevertheless, we acknowledge that lived experiences, in general, give individuals different perspectives. Literature in political science has found that when people imagine a leader, many of the traits are more masculine (e.g., wearing a suit, being tall and bigger) (Butler, 1988). Fifth, we did not measure the success of the teams in this study; thus, we were unable to assess how different interaction patterns translated into team performance. Ongoing funding was, however, contingent on performance as measured by pre-determined metrics including numbers of grants, publications, invention patents, and other markers of success.

Finally, a limitation of all social network studies is that data are collected at a single point in time. Thus, temporal changes in team interactions cannot be accounted for in our sample. For example, we cannot discern whether social relationships or scientific collaborations came first. We only know that they were both happening at the time the survey was administered. Further, at the time the survey was completed, it is possible that a person had not yet established a relationship, or they had forgotten about a previous relationship.

Conclusion, recommendations, and future research

We offer three key recommendations for future research. First, scientific results that are statistically insignificant are rarely shared in the literature. Therefore, it is critical that all efforts to expand research be published to broaden and accelerate the understanding of the role of women in scientific teams (Bammer et al., 2020; Oliver and Boaz, 2019).

Second, the landscape of science is changing rapidly as a result of private and federal funders requiring the inclusion of science of team science experts as PIs in grant applications. We recommend that researchers expand their focus and examine how scientific teams change the culture of science. Research questions might include: How does support for diverse teams translate to culture changes in science and the academy? Do scientific interdisciplinary teams provide more access for historically marginalized and disenfranchised groups? Finally, to create a comprehensive understanding of elements that contribute to expertise in scientific teams, we recommend that research be conducted with a theoretical focus on team development and processes. This would include studies that explore science facilitation, learning-by-doing, and other tacit forms of expertise that lead to integration and implementation of knowledge (rather than a focus on recruitment and demographics).

Third, existing studies define gender as a binary (man/woman). This short-sighted perspective is no longer relevant in society. Gender is not a biological concept but a social construct: “It involves a complex of socially guided perceptual, interactional, and micropolitical activities that cast particular pursuits as expressions of masculine and feminine ‘natures.’” (West and Zimmerman, 1987, p. 125). Gender is thus created by the social organization of our everyday lives and the way we interact with one another. People often see this difference as natural, and society is structured as a response to these differences in terms of men and women. Because of this, researchers like us continue to expend time and resources asking research questions rooted in binary gender. Future research should broaden definitions of diversity and gender to include non-binary definitions of gender, expand how we measure inclusivity, explore how power imbalances block expertise, and study how a balance of power promotes expertise.

In conclusion, the lack of evidence for gender impacting team roles and behaviors in our study aligns with other SciTS studies that found team composition is not the silver bullet that automatically leads to knowledge creation and innovation (Duhigg, 2016; Oliver and Boaz, 2019). Numerous SciTS studies have documented the importance of processes over team composition and relationships to build successful teams (Boix Mansilla et al., 2016; Gaughan and Bozeman, 2016; Hall et al., 2018; Zhang et al., 2020). Perhaps the reason scientific teams produce more citations and have a greater impact than siloed investigators (Wuchty et al., 2007) is that they are leveraging the available expertise through the authentic integration of all members.

In the future, when scientists ask, “What proportion of women is ideal on a team?” consider responding with “It is not about the number of women, but rather how women on teams are integrated and empowered.”

Data availability

Data are available upon request to protect the privacy of our study participants. Parts of the larger data set have been made publicly available via the following links: https://doi.org/10.25675/10217/214187 and https://hdl.handle.net/10217/194364 .

Acker J (1992) Gendering organizational theory. Gendering Organ Anal 6(2):248–260. https://doi.org/10.1177/089124390004002002

Allport G, Clark K, Pettigrew T (1954) The nature of prejudice. http://althaschool.org/_cache/files/7/1/71f96bdb-d4c3-4514-bae2-9bf809ba9edc/97F5FE75CF9A120E7DC108EB1B0FF5EC.holocaust-the-nature-of-prejudice.doc

Badar K, Hite JM, Badir YF (2013) Examining the relationship of co-authorship network centrality and gender on academic research performance: the case of chemistry researchers in Pakistan. Scientometrics 94(2):755–775. https://doi.org/10.1007/s11192-012-0764-z

Bammer G, O’Rourke M, O’Connell D, Neuhauser L, Midgley G, Klein JT, Grigg NJ, Gadlin H, Elsum IR, Bursztyn M, Fulton EA, Pohl C, Smithson M, Vilsmaier U, Bergmann M, Jaeger J, Merkx F, Vienni Baptista B, Burgman MA, … Richardson GP (2020) Expertise in research integration and implementation for tackling complex problems: when is it needed, where can it be found and how can it be strengthened? Palgrave Commun 6(1) https://doi.org/10.1057/s41599-019-0380-0

Baugh SG, Graen GB (1997) Effects of team gender and racial composition on perceptions of team performance in cross-functional teams. Group Organ Manag 22(3):366–383. https://doi.org/10.1177/1059601197223004

Baxter K, Courage C, Caine K (2015) Understanding your users: a practical guide to user research methods. In: Understanding your users. Second edition. Morgan Kaufmann/Elsevier, Amsterdam

Baxter P, Jack S (2008) Qualitative case study methodology: study design and implementation for novice researchers. Qual Rep 13(2):544–559. https://nsuworks.nova.edu/tqr/vol13/iss4/2

Bear JB, Woolley AW (2011) The role of gender in team collaboration and performance. Interdiscip Sci Rev 36(2):146–153. https://doi.org/10.1179/030801811X13013181961473

Benschop Y, Doorewaard H (1998) Six of one and half a dozen of the other: the gender subtext of taylorism and team-based work. Gend Work Organ 5(1):5–18. https://doi.org/10.1111/1468-0432.00042

Boix Mansilla V, Lamont M, Sato K (2016) Shared cognitive emotional interactional platforms: markers and conditions for successful interdisciplinary collaborations. Sci Technol Hum Values 41(4):571–612. https://doi.org/10.1177/0162243915614103

Bozeman B, Fay D, Slade CP (2013) Research collaboration in universities and academic entrepreneurship: the-state-of-the-art. J Technol Transf 38(1):1–67 https://doi.org/10.1007/s10961-012-9281-8

Bozeman B, Gaughan M (2011) How do men and women differ in research collaborations? An analysis of the collaborative motives and strategies of academic researchers. Res Policy 40(10):1393–1402. https://doi.org/10.1016/j.respol.2011.07.002

Brewer MB (2007) The social psychology of intergroup relations: social categorization, ingroup bias and outgroup prejudice. In Kruglanski, AW & Higgins, ET (Eds.), Social psychology: Handbook of basic principles. The Guilford Press. pp. 695–715

Bunderson JS (2003) Recognizing and utilizing expertise in work groups: a status characteristics perspective. Adm Sci Q 48(4) https://doi.org/10.2307/3556637

Butler J (1988) Performative acts and gender constitution: an essay in phenomenology and feminist theory. Theatr J 40(4):519. https://doi.org/10.2307/3207893

Carli L (2001) Gender and social influence. J Soc Issues 57(4):725–741. https://doi.org/10.1111/0022-4537.00238

Cole S (2004) Merton’s contribution to the sociology of science. Soc Stud Sci 34(6):829–844. https://doi.org/10.1177/0306312704048600

Cooke NJ, Hilton ML (2015) Enhancing the effectiveness of team science. National Academies Press

Craig JM, Sherif CW (1986) The effectiveness of men and women in problem-solving groups as a function of group gender composition. Sex Roles 14(7–8):453–466. https://doi.org/10.1007/BF00288427

Cravens AE, Jones MS, Ngai C, Zarestky J, Love HB (2022) Science facilitation: navigating the intersection of intellectual and interpersonal expertise in scientific collaboration. Humanit Soc Sci Commun 9(1):1–13. https://doi.org/10.1057/s41599-022-01217-1

DiTomaso N, Post C, Smith DR, Farris GF, Cordero R (2007) Effects of structural position on allocation and evaluation decisions for scientists and engineers in industrial R&D. Adm Sci Q 52(2):175–207. https://doi.org/10.2189/asqu.52.2.175

Dovidio JF, Gaertner SL, Kawakami K (2012) Intergroup contact: the past, present, and the future. Group Process Intergroup Relat 6(1):5–21. https://doi.org/10.1177/1368430203006001009

Duhigg C (2016) What google learned from its quest to build the perfect team. N Y Times https://www.nytimes.com/2016/02/28/magazine/what-google-learned-from-its-quest-to-build-the-perfect-team.html

Fraser N (1989) Unruly practices: power, discourse, and gender in contemporary social theory. In: Feminist review, vol 40(10). University of Minnesota Press, pp. 107–108

Gaughan M, Bozeman B (2016) Using the prisms of gender and rank to interpret research collaboration power dynamics. Soc Stud Sci 46(4):536–558. https://doi.org/10.1177/0306312716652249

Gibbons M, Limoges C, Nowotny H, Schwartzman S (1994) The new production of knowledge: the dynamics of science and research in contemporary societies. In CI, Sage Publications, London. pp. ix+179

Ginther DK, Schaffer WT, Schnell J, Masimore B, Liu F, Haak LL, Kington R (2011) Race, ethnicity, and NIH research awards. Science 333(6045):1015–1019. https://doi.org/10.1126/science.1196783

Giuffre K (2013) Communities and networks: using social network analysis to rethink urban and community studies, 1st ed. Polity Press.

Granovetter MS (1973) The strength of weak ties. Am J Sociol 78(6):1360–1380

Greenwood RE (1993) The case study approach. Bus Commun Q 56(4):46–48. https://doi.org/10.1177/108056999305600409

Guarino CM, Borden VMH (2017) Faculty service loads and gender: are women taking care of the academic family. Res High Educ 58(6):672–694. https://doi.org/10.1007/s11162-017-9454-2

Guimerà R, Uzzi B, Spiro J, Amaral LAN (2005) Sociology: team assembly mechanisms determine collaboration network structure and team performance. Science 308(5722):697–702. https://doi.org/10.1126/science.1106340

Hall KL, Vogel AL, Huang GC, Serrano KJ, Rice EL, Tsakraklides SP, Fiore SM (2018) The science of team science: a review of the empirical evidence and research gaps on collaboration in science. Am Psychol 73(4):532–548. https://doi.org/10.1037/amp0000319

Handley IM, Brown ER, Moss-Racusin CA, Smith JL (2015) Quality of evidence revealing subtle gender biases in science is in the eye of the beholder. Proc Natl Acad Sci USA 112(43):13201–13206. https://doi.org/10.1073/pnas.1510649112

Hanneman R, Riddle M (2005) Introduction to social network methods. http://www.researchgate.net/profile/Robert_Hanneman/publication/235737492_Introduction_to_social_network_methods/links/0deec52261e1577e6c000000.pdf

Hansson B (1999) Interdisciplinarity: for what purpose? Policy Sci 32(4):339–343. https://doi.org/10.1023/A:1004718320735

Humbert AL, Guenther EA (2017) D3.1 The gender diversity index, preliminary considerations and results. March 1–31. https://doi.org/10.17862/cranfield.rd.5110978.v1

Joshi A (2014) By Whom and when is women’s expertise recognized? The interactive effects of gender and education in science and engineering teams. Adm Sci Q 59(2):202–239. https://doi.org/10.1177/0001839214528331

Keyton J, Ford DJ, Smith FL (2008) A mesolevel communicative model of collaboration. Commun Theory 18(3):376–406. https://doi.org/10.1111/j.1468-2885.2008.00327.x

Lee S (2005) The impact of research collaboration on scientific productivity. Soc Stud Sci 35(5):673–702. https://doi.org/10.1177/0306312705052359

Lerback JC, Hanson B, Wooden P (2020) Association between author diversity and acceptance rates and citations in peer-reviewed earth science manuscripts. Earth Space Sci 7(5) https://doi.org/10.1029/2019EA000946

Love HB, Cross JE, Fosdick B, Crooks KR, VandeWoude S, Fisher ER (2021) Interpersonal relationships drive successful team science: an exemplary case-based study. Humanit Soc Sci Commun 8(1):1–10. https://doi.org/10.1057/s41599-021-00789-8

Madlock-Brown C, Eichmann D (2016) The scientometrics of successful women in science. In: Proceedings of the 2016 IEEE/ACM international conference on Advances in Social Networks Analysis and Mining, ASONAM 2016. pp. 654–660. https://doi.org/10.1109/ASONAM.2016.7752307 .

Marvasti AB (2004) Qualitative research in sociology: an introduction. SAGE Publications.

Mathison S (2013) Constant comparative method. In: Encyclopedia of evaluation. SAGE Publications, Inc. https://colostate.primo.exlibrisgroup.com/discovery/fulldisplay?docid=cdi_askewsholts_vlebooks_9781452261447&context=PC&vid=01COLSU_INST:01COLSU&lang=en&search_scope=MyCampus_FC_CI_PU_P&adaptor=Primo%20Central&tab=Everyth

McKean V (2016). Evidence-based organizational change to support women’s careers in research. Science of Team Science (SciTS). https://sts.memberclicks.net/assets/2016_Conference_Images/scits 2016 conference program final 05may2016.pdf

Misra J, Lundquist JH, Holmes E, Agiomavritis S (2011) The ivory ceiling of service work. Academe 97(1):22–26. https://www.aaup.org/article/ivory-ceiling-service-work?wbc_purpose=basic&WBCMODE=presentationunpublished#.YSjyXo5KhPY

Misra J, Smith-Doerr L, Dasgupta N, Weaver G, Normanly J (2017) Collaboration and gender equity among academic scientists. Soc Sci 6(1):25. https://doi.org/10.3390/socsci6010025

Mitchneck B, Smith JL, Latimer M (2016) A recipe for change: Creating a more inclusive academy. Science 352(6282):148–149. https://doi.org/10.1126/science.aad8493

Moss-Racusin CA, Dovidio JF, Brescoll VL, Graham MJ, Handelsman J (2012) Science faculty’s subtle gender biases favor male students. Proc Natl Acad Sci USA 109(41):16474–16479. https://doi.org/10.1073/pnas.1211286109

Oliver K, Boaz A (2019) Transforming evidence for policy and practice: creating space for new conversations. Palgrave Commun 5(1):1–10

Pettigrew TF, Tropp LR (2006) A meta-analytic test of intergroup contact theory. J Pers Soc Psychol 90(5):751–783. https://doi.org/10.1037/0022-3514.90.5.751

Pezzoni M, Mairesse J, Stephan P, Lane J (2016) Gender and the publication output of graduate students: a case study. PLoS ONE 11(1):e0145146. https://doi.org/10.1371/journal.pone.0145146

RStudio Team (2020) RStudio: integrated development for R. RStudio, PBC

Reardon S (2022) Scientific collaborations are precarious territory for women. Nature 605(7908):179–181. https://doi.org/10.1038/D41586-022-01204-1

Régner I, Thinus-Blanc C, Netter A, Schmader T, Huguet P (2019) Committees with implicit biases promote fewer women when they do not believe gender bias exists. Nat Hum Behav 3(11):1171–1179. https://doi.org/10.1038/s41562-019-0686-3

Rhoten D, Pfirman S (2007) Women in interdisciplinary science: exploring preferences and consequences. Res Policy 36(1):56–75. https://doi.org/10.1016/j.respol.2006.08.001

Riedl C, Kim YJ, Gupta P, Malone TW, Woolley AW (2021) Quantifying collective intelligence in human groups. Proc Natl Acad Sci USA 118(21) https://doi.org/10.1073/PNAS.2005737118

Rolison DR (2000) Title IX for women in academic chemistry: isn’t a millennium of affirmative action for white men sufficient? Women in the chemical workforce— NCBI Bookshelf. National Research Council (US) Chemical Sciences Roundtable. https://www.ncbi.nlm.nih.gov/books/NBK44858/

Rolison DR (2004) Women, work and the academy: strategies for responding to “post-civil rights era” gender discrimination. The Barnard Center for Research on Women. http://www.barnard.edu/bcrw/womenandwork/rolison.htm

Ross DA, Boatright D, Nunez-Smith M, Jordan A, Chekroud A, Moore EZ (2017) Differences in words used to describe racial and gender groups in Medical Student Performance Evaluations. PLoS ONE 12(8). https://doi.org/10.1371/journal.pone.0181659

Salerno PE, Páez-Vacas M, Guayasamin JM, Stynoski JL (2019) Male principal investigators (almost) don’t publish with women in ecology and zoology. PLoS ONE 14(6):e0218598. https://doi.org/10.1371/JOURNAL.PONE.0218598

Smith-Doerr L, Alegria S, Sacco T (2017) How diversity matters in the US science and engineering workforce: a critical review considering integration in teams, fields, and organizational contexts. Engag Sci Technol Society 3:139. https://doi.org/10.17351/ests2017.142

Smith JL, Handley IM, Zale AV, Rushing S, Potvin MA (2015) Now hiring! Empirically testing a three-step intervention to increase faculty gender diversity in STEM. In: BioScience, vol. 65(11). Oxford Academic, pp. 1084–1087. https://doi.org/10.1093/biosci/biv138

Soler-Gallart M (2017) Achieving social impact sociology in the public sphere. Springer.

Sprague J, Massoni K (2005) Student evaluations and gendered expectations: What we can’t count can hurt us. Sex Roles 53(11–12):779–793 https://doi.org/10.1007/s11199-005-8292-4

Sprague J, Smith DE (1989) The everyday world as problematic: a feminist sociology. Contemp Sociol 18(4) https://doi.org/10.2307/2073155

Stokman FN (2001) Networks: social. In: Smelser NJ, Baltes PB (eds) International encyclopedia of the social & behavioral sciences. Pergamon, pp. 10509–10514 https://doi.org/10.1016/B0-08-043076-7/01934-3

Taps J, Martin PY (1990) Gender composition, attributional accounts, and women’s influence and likability in task groups. Small Group Res 21(4):471–491. https://doi.org/10.1177/1046496490214003

Ulibarri N, Cravens A, Kernbach S, Nabergoj A, Royalty A (2019) Creativity in Research. Cambridge University Press, pp. 1–317

Urry M (2015) Science and gender: scientists must work harder on equality. Nature 528(7583):471–473. https://doi.org/10.1038/528471a

Wagner C (2016) Rosalind’s ghost: biology, collaboration, and the female. PLoS Biol 14(11):e2001003. https://doi.org/10.1371/journal.pbio.2001003

West C, Zimmerman DH (1987) Doing gender. Gend Soc 1(2):125–151. https://doi.org/10.1177/0891243287001002002

Winter F, Rauhut H, Helbing D (2012) How norms can generate conflict: an experiment on the failure of cooperative micro-motives on the macro-level. Soc Forces 90(3):919–948. https://doi.org/10.1093/sf/sor028

Woolley AW, Chabris CF, Pentland A, Hashmi N, Malone TW (2010) Evidence for a collective intelligence factor in the performance of human groups. Science 330(6004):686–688. https://doi.org/10.1126/science.1193147

Wooten KC, Rose RM, Ostir GV, Calhoun WJ, Ameredes BT, Brasier AR (2014) Assessing and evaluating multidisciplinary translational teams: a mixed methods approach. Eval Health Prof 37(1):33–49. https://doi.org/10.1177/0163278713504433

Wuchty S, Jones BF, Uzzi B (2007) The increasing dominance of teams in production of knowledge. Science 316(5827):1036–1039. https://doi.org/10.1126/science.1136099

Zeng XHT, Duch J, Sales-Pardo M, Moreira JAGG, Radicchi F, Ribeiro HV, Woodruff TK, Amaral LANN (2016) Differences in collaboration patterns across discipline, career stage, and gender. PLoS Biol 14(11):e1002573. https://doi.org/10.1371/journal.pbio.1002573

Zhang HH, Ding C, Schutte NS, Li R (2020) How team emotional intelligence connects to task performance: a network approach. Small Group Res 51(4):492–516. https://doi.org/10.1177/1046496419889660

Acknowledgements

We thank Professor Jeni Cross, the Department of Sociology, and the Institute for Research in the Social Sciences (IRISS) at Colorado State University for helpful discussions and preliminary data collection. We also thank Professor Sue VandeWoude, College of Veterinary Medicine and Biomedical Sciences, Colorado State University, for helpful discussions and support. The research reported in this publication was supported by Colorado State University’s Office of the Vice President for Research Catalyst for Innovative Partnerships Program. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Office of the Vice President for Research. Additional funding and support were provided by grants from the National Science Foundation’s Ecology of Infectious Diseases Program (NSF EF-0723676 and NSF EF-1413925).

Author information

Authors and affiliations.

Divergent Science LLC, Fort Collins, CO, USA

Hannah B. Love & Ellen R. Fisher

Colorado State University, Fort Collins, CO, USA

Hannah B. Love, Alyssa Stephens, Bailey K. Fosdick, Elizabeth Tofany & Ellen R. Fisher

Cactus Consulting LLC, Denver, CO, USA

Alyssa Stephens

University of New Mexico, Albuquerque, NM, USA

Ellen R. Fisher

Corresponding author

Correspondence to Hannah B. Love.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Ethical approval

We confirm that we have complied with the American Psychological Association’s ethical standards in the treatment of their participants. All data collection methods were performed with the informed consent of the participants and followed Institutional Review Board protocol #19-8622H.

Informed consent

Following IRB protocol #19-8622H, participation was voluntary, and all subjects provided informed consent.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary tables
Appendix figure 1: social network survey
Appendix figure 2: response rate for women and men faculty and nonfaculty

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Love, H.B., Stephens, A., Fosdick, B.K. et al. The impact of gender diversity on scientific research teams: a need to broaden and accelerate future research. Humanit Soc Sci Commun 9 , 386 (2022). https://doi.org/10.1057/s41599-022-01389-w

Received : 02 June 2022

Accepted : 29 September 2022

Published : 22 October 2022

DOI : https://doi.org/10.1057/s41599-022-01389-w

Open Access

Peer-reviewed

Research Article

Twenty years of gender equality research: A scoping review based on a new semantic indicator

Contributed equally to this work with: Paola Belingheri, Filippo Chiarello, Andrea Fronzetti Colladon, Paola Rovelli

Roles Conceptualization, Formal analysis, Funding acquisition, Visualization, Writing – original draft, Writing – review & editing

Affiliation Dipartimento di Ingegneria dell’Energia, dei Sistemi, del Territorio e delle Costruzioni, Università degli Studi di Pisa, Largo L. Lazzarino, Pisa, Italy

Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Methodology, Visualization, Writing – original draft, Writing – review & editing

Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Methodology, Software, Visualization, Writing – original draft, Writing – review & editing

Affiliations Department of Engineering, University of Perugia, Perugia, Italy, Department of Management, Kozminski University, Warsaw, Poland

Roles Conceptualization, Formal analysis, Funding acquisition, Writing – original draft, Writing – review & editing

Affiliation Faculty of Economics and Management, Centre for Family Business Management, Free University of Bozen-Bolzano, Bozen-Bolzano, Italy

  • Paola Belingheri, 
  • Filippo Chiarello, 
  • Andrea Fronzetti Colladon, 
  • Paola Rovelli

  • Published: September 21, 2021
  • https://doi.org/10.1371/journal.pone.0256474

9 Nov 2021: The PLOS ONE Staff (2021) Correction: Twenty years of gender equality research: A scoping review based on a new semantic indicator. PLOS ONE 16(11): e0259930. https://doi.org/10.1371/journal.pone.0259930 View correction

Gender inequality is a major problem that places women at a disadvantage, thereby stymieing economic growth and societal advancement. In the last two decades, extensive research has been conducted on gender-related issues, studying both their antecedents and consequences. However, existing literature reviews fail to provide a comprehensive and clear picture of what has been studied so far, which could guide scholars in their future research. Our paper offers a scoping review of a large portion of the research that has been published over the last 22 years on gender equality and related issues, with a specific focus on business and economics studies. Combining innovative methods drawn from both network analysis and text mining, we provide a synthesis of 15,465 scientific articles. We identify 27 main research topics and measure their relevance from a semantic point of view and the relationships among them, highlighting the importance of each topic in the overall gender discourse. We find that prominent research topics mostly relate to women in the workforce–e.g., concerning compensation, role, education, decision-making and career progression. However, some of them are losing momentum, and some other research trends–for example related to female entrepreneurship, leadership and participation in the board of directors–are on the rise. Besides introducing a novel methodology to review broad literature streams, our paper offers a map of the main gender-research trends and presents the most popular and the emerging themes, as well as their intersections, outlining important avenues for future research.

Citation: Belingheri P, Chiarello F, Fronzetti Colladon A, Rovelli P (2021) Twenty years of gender equality research: A scoping review based on a new semantic indicator. PLoS ONE 16(9): e0256474. https://doi.org/10.1371/journal.pone.0256474

Editor: Elisa Ughetto, Politecnico di Torino, ITALY

Received: June 25, 2021; Accepted: August 6, 2021; Published: September 21, 2021

Copyright: © 2021 Belingheri et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the manuscript and its supporting information files. The only exception is the text of the abstracts (over 15,000) that we have downloaded from Scopus. These abstracts can be retrieved from Scopus, but we do not have permission to redistribute them.

Funding: P.B. and F.C.: Grant of the Department of Energy, Systems, Territory and Construction of the University of Pisa (DESTEC) for the project “Measuring Gender Bias with Semantic Analysis: The Development of an Assessment Tool and its Application in the European Space Industry”. P.B., F.C., A.F.C., P.R.: Grant of the Italian Association of Management Engineering (AiIG), “Misure di sostegno ai soci giovani AiIG” 2020, for the project “Gender Equality Through Data Intelligence (GEDI)”. F.C.: EU project ASSETs+ Project (Alliance for Strategic Skills addressing Emerging Technologies in Defence) EAC/A03/2018 - Erasmus+ programme, Sector Skills Alliances, Lot 3: Sector Skills Alliance for implementing a new strategic approach (Blueprint) to sectoral cooperation on skills G.A. NUMBER: 612678-EPP-1-2019-1-IT-EPPKA2-SSA-B.

Competing interests: The authors have declared that no competing interests exist.

Introduction

The persistent gender inequalities that currently exist across the developed and developing world are receiving increasing attention from economists, policymakers, and the general public [e.g., 1 – 3 ]. Economic studies have indicated that women’s education and entry into the workforce contributes to social and economic well-being [e.g., 4 , 5 ], while their exclusion from the labor market and from managerial positions has an impact on overall labor productivity and income per capita [ 6 , 7 ]. The United Nations selected gender equality, with an emphasis on female education, as part of the Millennium Development Goals [ 8 ], and gender equality at-large as one of the 17 Sustainable Development Goals (SDGs) to be achieved by 2030 [ 9 ]. These latter objectives involve not only developing nations, but rather all countries, to achieve economic, social and environmental well-being.

As is the case with many SDGs, gender equality is still far from being achieved, and inequality persists across education, access to opportunities, and presence in decision-making positions [ 7 , 10 , 11 ]. As we enter the last decade for the SDGs’ implementation, and while we are battling a global pandemic, effective and efficient action becomes paramount to reach this ambitious goal.

Scholars have dedicated a massive effort towards understanding gender equality, its determinants, its consequences for women and society, and the appropriate actions and policies to advance women’s equality. Many topics have been covered, ranging from women’s education and human capital [ 12 , 13 ] and their role in society [e.g., 14 , 15 ], to their appointment in firms’ top ranked positions [e.g., 16 , 17 ] and performance implications [e.g., 18 , 19 ]. Despite some attempts, extant literature reviews provide a narrow view on these issues, restricted to specific topics–e.g., female students’ presence in STEM fields [ 20 ], educational gender inequality [ 5 ], the gender pay gap [ 21 ], the glass ceiling effect [ 22 ], leadership [ 23 ], entrepreneurship [ 24 ], women’s presence on the board of directors [ 25 , 26 ], diversity management [ 27 ], gender stereotypes in advertisement [ 28 ], or specific professions [ 29 ]. A comprehensive view on gender-related research, taking stock of key findings and under-studied topics is thus lacking.

Extant literature has also highlighted that gender issues, and their economic and social ramifications, are complex topics that involve a large number of possible antecedents and outcomes [ 7 ]. Indeed, gender equality actions are most effective when implemented in unison with other SDGs (e.g., with SDG 8, see [ 30 ]) in a synergetic perspective [ 10 ]. Many bodies of literature (e.g., business, economics, development studies, sociology and psychology) approach the problem of achieving gender equality from different perspectives–often addressing specific and narrow aspects. This sometimes leads to a lack of clarity about how different issues, circumstances, and solutions may be related in precipitating or mitigating gender inequality or its effects. As the number of papers grows at an increasing pace, this issue is exacerbated and there is a need to step back and survey the body of gender equality literature as a whole. There is also a need to examine synergies between different topics and approaches, as well as gaps in our understanding of how different problems and solutions work together. Considering the important topic of women’s economic and social empowerment, this paper aims to fill this gap by answering the following research question: what are the most relevant findings in the literature on gender equality and how do they relate to each other ?

To do so, we conduct a scoping review [ 31 ], providing a synthesis of 15,465 articles dealing with gender equity related issues published in the last twenty-two years, covering both the periods of the MDGs and the SDGs (i.e., 2000 to mid 2021) in all the journals indexed in the Academic Journal Guide’s 2018 ranking of business and economics journals. Given the huge amount of research conducted on the topic, we adopt an innovative methodology, which relies on social network analysis and text mining. These techniques are increasingly adopted when surveying large bodies of text. Recently, they were applied to perform analysis of online gender communication differences [ 32 ] and gender behaviors in online technology communities [ 33 ], to identify and classify sexual harassment instances in academia [ 34 ], and to evaluate the gender inclusivity of disaster management policies [ 35 ].

Applied to the title, abstracts and keywords of the articles in our sample, this methodology allows us to identify a set of 27 recurrent topics within which we automatically classify the papers. Introducing additional novelty, by means of the Semantic Brand Score (SBS) indicator [ 36 ] and the SBS BI app [ 37 ], we assess the importance of each topic in the overall gender equality discourse and its relationships with the other topics, as well as trends over time, with a more accurate description than that offered by traditional literature reviews relying solely on the number of papers presented in each topic.

This methodology, applied to gender equality research spanning the past twenty-two years, enables two key contributions. First, we extract the main message that each document is conveying and how this is connected to other themes in literature, providing a rich picture of the topics that are at the center of the discourse, as well as of the emerging topics. Second, by examining the semantic relationship between topics and how tightly their discourses are linked, we can identify the key relationships and connections between different topics. This semi-automatic methodology is also highly reproducible with minimum effort.

This literature review is organized as follows. In the next section, we present how we selected relevant papers and how we analyzed them through text mining and social network analysis. We then illustrate the importance of 27 selected research topics, measured by means of the SBS indicator. In the results section, we present an overview of the literature based on the SBS results, followed by an in-depth narrative analysis of the top 10 topics (i.e., those with the highest SBS) and their connections. Subsequently, we highlight a series of under-studied connections between the topics where there is potential for future research. Through this analysis, we build a map of the main gender-research trends in the last twenty-two years, presenting the most popular themes. We conclude by highlighting key areas on which research should focus in the future.

Our aim is to map a broad topic, gender equality research, that has been approached through a host of different angles and through different disciplines. Scoping reviews are the most appropriate as they provide the freedom to map different themes and identify literature gaps, thereby guiding the recommendation of new research agendas [ 38 ].

Several practical approaches have been proposed to identify and assess the underlying topics of a specific field using big data [ 39 – 41 ], but many of them fail without proper paper retrieval and text preprocessing. This is specifically true for a research field such as the gender-related one, which comprises the work of scholars from different backgrounds. In this section, we illustrate a novel approach for the analysis of scientific (gender-related) papers that relies on methods and tools of social network analysis and text mining. Our procedure has four main steps: (1) data collection, (2) text preprocessing, (3) keywords extraction and classification, and (4) evaluation of semantic importance and image.

Data collection

In this study, we analyze 22 years of literature on gender-related research. Following established practice for scoping reviews [ 42 ], our data collection consisted of two main steps, which we summarize here below.

First, we retrieved from the Scopus database all the articles written in English that contained the term “gender” in their title, abstract or keywords and were published in a journal listed in the Academic Journal Guide 2018 ranking of the Chartered Association of Business Schools (CABS) (https://charteredabs.org/wp-content/uploads/2018/03/AJG2018-Methodology.pdf), considering the time period from Jan 2000 to May 2021. We used this information because abstracts, titles and keywords represent the most informative part of a paper, while using the full text would lower the signal-to-noise ratio for information extraction. Indeed, these textual elements have already proven to be reliable sources of information for the task of domain lexicon extraction [ 43 , 44 ]. We chose Scopus as the source of literature because of its popularity, its update rate, and because it offers an API to ease the querying process. While it does not allow retrieval of the full text of scientific articles, the Scopus API offers access to titles, abstracts, citation information and metadata for all its indexed scholarly journals. Moreover, we decided to focus on the journals listed in the AJG 2018 ranking because we were interested in reviewing business and economics related gender studies only. The AJG is widely used by universities and business schools as a reference point for journal and research rigor and quality. This first step, executed in June 2021, returned more than 55,000 papers.

In the second step–because a look at the papers showed very sparse results, many of which were not in line with the topic of this literature review (e.g., papers dealing with health care or medical issues, where the word gender indicates the gender of the patients)–we applied further inclusion criteria to make the sample more focused on the topic of this literature review (i.e., women’s gender equality issues). Specifically, we only retained those papers mentioning, in their title and/or abstract, both gender-related keywords (e.g., daughter, female, mother) and keywords referring to bias and equality issues (e.g., equality, bias, diversity, inclusion). After text pre-processing (see next section), keywords were first identified from a frequency-weighted list of words found in the titles, abstracts and keywords in the initial list of papers, extracted through text mining (following the same approach as [ 43 ]). They were selected by two of the co-authors independently, following respectively a bottom-up and a top-down approach. The bottom-up approach consisted of examining the words found in the frequency-weighted list and classifying those related to gender and equality. The top-down approach consisted of searching the word list for notable gender and equality-related words. Table 1 reports the sets of keywords we considered, together with some examples of words that were used to search for their presence in the dataset (a full list is provided in the S1 Text). At the end of this second step, we obtained a final sample of 15,465 relevant papers.
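The inclusion rule described above (a paper is retained only when its title or abstract contains both a gender-related keyword and an equality-related keyword) can be illustrated with a short Python sketch. It is not the authors' code: the two keyword sets are abbreviated to the examples quoted in the text (the full lists are in the S1 Text), and the function names and toy records are hypothetical.

```python
import re

# Abbreviated, illustrative keyword sets (assumption: the real lists are longer).
GENDER_TERMS = {"daughter", "female", "mother"}
EQUALITY_TERMS = {"equality", "bias", "diversity", "inclusion"}


def tokens(text):
    """Lowercase a text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def is_relevant(title, abstract):
    """Keep a paper only if title/abstract mention both kinds of keyword."""
    words = tokens(title) | tokens(abstract)
    return bool(words & GENDER_TERMS) and bool(words & EQUALITY_TERMS)


# Toy records standing in for Scopus metadata.
papers = [
    {"title": "Female directors and board diversity", "abstract": "..."},
    {"title": "Patient gender and recovery time", "abstract": "A clinical study."},
]
kept = [p for p in papers if is_relevant(p["title"], p["abstract"])]
print([p["title"] for p in kept])
```

The second record mentions “gender” but none of the keywords from either set, so it is dropped, which mirrors the medical-paper example given in the text.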

Table 1: https://doi.org/10.1371/journal.pone.0256474.t001

Text processing and keyword extraction

Text preprocessing aims at structuring text into a form that can be analyzed by statistical models. In the present section, we describe the preprocessing steps we applied to paper titles and abstracts, which, as explained below, partially follow a standard text preprocessing pipeline [ 45 ]. These activities have been performed using the R package udpipe [ 46 ].

The first step is the extraction of n-grams (i.e., sequences of words from a given text sample) to identify which n-grams are important in the analysis, since domain-specific lexicons are often composed of bi-grams and tri-grams [ 47 ]. Multi-word extraction is usually implemented with statistics and linguistic rules, thus using the statistical properties of n-grams or machine learning approaches [ 48 ]. However, for the present paper, we used Scopus metadata in order to have a more effective and efficient n-gram collection approach [ 49 ]. We used the keywords of each paper in order to tag n-grams with their associated keywords automatically. Using this greedy approach, it was possible to collect all the keywords listed by the authors of the papers. From this list, we extracted only keywords composed of two, three and four words, we removed all the acronyms and rare keywords (i.e., appearing in fewer than 1% of papers), and we clustered keywords showing a high orthographic similarity–measured using a Levenshtein distance [ 50 ] lower than 2–considering these groups of keywords as representing the same concepts, but expressed with different spellings. After tagging the n-grams in the abstracts, we followed a common data preparation pipeline that consists of the following steps: (i) tokenization, which splits the text into tokens (i.e., single words and previously tagged multi-words); (ii) removal of stop-words (i.e., those words that add little meaning to the text, usually being very common and short function words–such as “and”, “or”, or “of”); (iii) part-of-speech tagging, which provides information concerning the morphological role of a word and its morphosyntactic context (e.g., if the token is a determiner, the next token is a noun or an adjective with very high confidence [ 51 ]); and (iv) lemmatization, which consists of substituting each word with its dictionary form (or lemma). The output of the latter step allows grouping together the inflected forms of a word. For example, the verbs “am”, “are”, and “is” share the lemma “be”, and the nouns “cat” and “cats” both share the lemma “cat”. We preferred lemmatization over stemming [ 52 ] in order to obtain more interpretable results.
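To make the orthographic clustering step concrete, the following Python sketch groups keywords whose Levenshtein distance is lower than 2 (i.e., at most 1). The study itself used the R package udpipe for preprocessing and does not describe how clusters were formed, so the greedy grouping rule, the function names and the example keywords below are assumptions made purely for illustration.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cluster_keywords(keywords, max_distance=1):
    """Greedy grouping (assumption): a keyword joins the first cluster whose
    first member is within max_distance; otherwise it starts a new cluster."""
    clusters = []
    for kw in keywords:
        for cluster in clusters:
            if levenshtein(kw, cluster[0]) <= max_distance:
                cluster.append(kw)
                break
        else:
            clusters.append([kw])
    return clusters


groups = cluster_keywords([
    "gender pay gap", "gender pay gaps",
    "glass ceiling",
    "labour market", "labor market",
])
print(groups)
```

With these inputs the singular and plural forms, and the British and American spellings, end up in the same group, which is the kind of spelling variation the paragraph describes.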

In addition, we identified a further set of keywords (with respect to those listed in the “keywords” field) by applying a series of automatic word unification and removal steps, as suggested in past research [ 53 , 54 ]. We removed sparse terms (i.e., occurring in less than 0.1% of all documents) and common terms (i.e., occurring in more than 10% of all documents), and retained only nouns and adjectives. Note that no document was lost due to these steps. We then used the TF-IDF function [ 55 ] to produce a new list of keywords. We additionally tested other approaches for the identification and clustering of keywords–such as TextRank [ 56 ] or Latent Dirichlet Allocation [ 57 ]–without obtaining more informative results.
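The term filtering and TF-IDF weighting described above can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not specify which TF-IDF variant was used, so the sketch applies the plain form tf × log(N / df); the function names and the toy documents are hypothetical, and the toy example uses looser thresholds than the 0.1% and 10% reported in the paper because it contains only four documents.

```python
import math
from collections import Counter


def filter_vocabulary(docs, min_df=0.001, max_df=0.10):
    """Return terms whose share of documents lies within [min_df, max_df]."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t for t, c in df.items() if min_df <= c / n_docs <= max_df}


def tf_idf(docs, vocabulary):
    """Per-document tf * log(N / df) for the retained terms (one common variant)."""
    n_docs = len(docs)
    df = Counter(t for doc in docs for t in set(doc) if t in vocabulary)
    scores = []
    for doc in docs:
        counts = Counter(t for t in doc if t in vocabulary)
        total = sum(counts.values())
        scores.append({t: (c / total) * math.log(n_docs / df[t])
                       for t, c in counts.items()})
    return scores


# Toy lemmatized documents; "woman" appears everywhere and is filtered out.
docs = [
    ["wage", "gap", "woman"],
    ["board", "woman"],
    ["wage", "woman", "wage"],
    ["career", "woman"],
]
vocab = filter_vocabulary(docs, min_df=0.0, max_df=0.5)
scores = tf_idf(docs, vocab)
print(sorted(vocab), scores[2])
```

Terms that occur in nearly every document carry little discriminating information, which is why an upper document-frequency bound is applied before weighting.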

Classification of research topics

To guide the literature analysis, two experts met regularly to examine the sample of collected papers and to identify the main topics and trends in gender research. Initially, they conducted brainstorming sessions on the topics they expected to find, due to their knowledge of the literature. This led to an initial list of topics. Subsequently, the experts worked independently, also supported by the keywords in paper titles and abstracts extracted with the procedure described above.

Considering all this information, each expert identified and clustered relevant keywords into topics. At the end of the process, the two assignments were compared and exhibited a 92% agreement. Another meeting was held to discuss discordant cases and reach a consensus. This resulted in a list of 27 topics, briefly introduced in Table 2 and subsequently detailed in the following sections.
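The agreement figure reported above is a simple percent agreement between the two experts' keyword-to-topic assignments. As a hedged illustration (the assignments below are invented and the function names are hypothetical), it can be computed as follows, together with the list of discordant cases that would be taken to a consensus meeting.

```python
def percent_agreement(assignment_a, assignment_b):
    """Share of jointly coded keywords that both experts placed in the same topic."""
    shared = assignment_a.keys() & assignment_b.keys()
    same = sum(1 for k in shared if assignment_a[k] == assignment_b[k])
    return same / len(shared)


def discordant(assignment_a, assignment_b):
    """Keywords on which the two experts disagree, in sorted order."""
    shared = assignment_a.keys() & assignment_b.keys()
    return sorted(k for k in shared if assignment_a[k] != assignment_b[k])


# Invented example assignments (not data from the study).
expert_a = {"wage": "compensation", "salary": "compensation",
            "ceo": "leadership", "school": "education"}
expert_b = {"wage": "compensation", "salary": "compensation",
            "ceo": "management", "school": "education"}

print(percent_agreement(expert_a, expert_b), discordant(expert_a, expert_b))
```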

Table 2: https://doi.org/10.1371/journal.pone.0256474.t002

Evaluation of semantic importance

Working on the lemmatized corpus of the 15,465 papers included in our sample, we proceeded with the evaluation of semantic importance trends for each topic and with the analysis of their connections and prevalent textual associations. To this aim, we used the Semantic Brand Score indicator [ 36 ], calculated through the SBS BI webapp [ 37 ] that also produced a brand image report for each topic. For this study we relied on the computing resources of the ENEA/CRESCO infrastructure [ 58 ].

The Semantic Brand Score (SBS) is a measure of semantic importance that combines methods of social network analysis and text mining. It is usually applied for the analysis of (big) textual data to evaluate the importance of one or more brands, names, words, or sets of keywords [ 36 ]. Indeed, the concept of “brand” is intended in a flexible way and goes beyond products or commercial brands. In this study, we evaluate the SBS time-trends of the keywords defining the research topics discussed in the previous section. Semantic importance comprises the three dimensions of topic prevalence, diversity and connectivity. Prevalence measures how frequently a research topic is used in the discourse. The more a topic is mentioned by scientific articles, the more the research community will be aware of it, with possible increase of future studies; this construct is partly related to that of brand awareness [ 59 ]. This effect is even stronger, considering that we are analyzing the title, abstract and keywords of the papers, i.e. the parts that have the highest visibility. A very important characteristic of the SBS is that it considers the relationships among words in a text. Topic importance is not just a matter of how frequently a topic is mentioned, but also of the associations a topic has in the text. Specifically, texts are transformed into networks of co-occurring words, and relationships are studied through social network analysis [ 60 ]. This step is necessary to calculate the other two dimensions of our semantic importance indicator. Accordingly, a social network of words is generated for each time period considered in the analysis–i.e., a graph made of n nodes (words) and E edges weighted by co-occurrence frequency, with W being the set of edge weights. The keywords representing each topic were clustered into single nodes.
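The construction of a word co-occurrence network, with the keywords of each topic collapsed into a single node, can be sketched in Python as below. The text does not state the co-occurrence window used by the SBS BI app, so this sketch assumes that two words co-occur when they appear in the same document; the mapping `topic_of`, the node label `COMPENSATION` and the toy documents are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_network(docs, topic_of=None):
    """Return (nodes, weights); weights[(u, v)] counts the documents in which
    nodes u and v co-occur. Keywords listed in topic_of are merged into one node."""
    topic_of = topic_of or {}
    nodes = set()
    weights = Counter()
    for doc in docs:
        merged = sorted({topic_of.get(tok, tok) for tok in doc})
        nodes.update(merged)
        for u, v in combinations(merged, 2):
            weights[(u, v)] += 1
    return nodes, weights


# Toy lemmatized documents; "wage" and "salary" both define the same topic.
docs = [
    ["wage", "gap", "woman"],
    ["salary", "gap"],
    ["woman", "board"],
]
topic_of = {"wage": "COMPENSATION", "salary": "COMPENSATION"}
nodes, weights = build_network(docs, topic_of)
print(sorted(nodes), dict(weights))
```

Because pairs are generated from a sorted list, each undirected edge is stored under a single key, and merging keywords before pairing ensures that a topic's associations are pooled across all of its defining terms.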

The construct of diversity relates to that of brand image [ 59 ], in the sense that it considers the richness and distinctiveness of textual (topic) associations. Considering the above-mentioned networks, we calculated diversity using the distinctiveness centrality metric–as in the formula presented by Fronzetti Colladon and Naldi [ 61 ].

Lastly, connectivity was measured as the weighted betweenness centrality [ 62 , 63 ] of each research topic node. We used the formula presented by Wasserman and Faust [ 60 ]. The dimension of connectivity represents the “brokerage power” of each research topic–i.e., how much it can serve as a bridge to connect other terms (and ultimately topics) in the discourse [ 36 ].

The SBS is the final composite indicator obtained by summing the standardized scores of prevalence, diversity and connectivity. Standardization was carried out considering all the words in the corpus, for each specific timeframe.
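The composition step can be illustrated with a small Python sketch that standardizes each dimension across all words and sums the resulting z-scores. The raw prevalence, diversity and connectivity values below are invented, and the use of the population standard deviation is an assumption; the text does not specify which estimator was applied, and in the study the scores were computed through the SBS BI app rather than with code like this.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-scores of a dict of raw scores, computed over all entries."""
    data = list(values.values())
    mu = mean(data)
    sigma = pstdev(data)  # assumption: population standard deviation
    return {k: (v - mu) / sigma if sigma else 0.0 for k, v in values.items()}


def semantic_brand_score(prevalence, diversity, connectivity):
    """Sum of the standardized prevalence, diversity and connectivity scores."""
    zp = standardize(prevalence)
    zd = standardize(diversity)
    zc = standardize(connectivity)
    return {k: zp[k] + zd[k] + zc[k] for k in prevalence}


# Invented raw scores for three nodes within one time frame.
prevalence = {"a": 10, "b": 20, "c": 30}
diversity = {"a": 1.0, "b": 2.0, "c": 3.0}
connectivity = {"a": 0.0, "b": 5.0, "c": 10.0}

sbs = semantic_brand_score(prevalence, diversity, connectivity)
print(sbs)
```

Standardizing before summing puts the three dimensions on a common scale, so that none of them dominates the composite merely because of its units.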

This methodology, applied to a large and heterogeneous body of text, makes it possible to automatically identify two important sets of information that add value to the literature review. Firstly, the relevance of each topic in literature is measured through a composite indicator of semantic importance, rather than simply looking at word frequencies. This provides a much richer picture of the topics that are at the center of the discourse, as well as of the topics that are emerging in the literature. Secondly, it makes it possible to examine the extent of the semantic relationship between topics, looking at how tightly their discourses are linked. In a field such as gender equality, where many topics are closely linked to each other and present overlaps in issues and solutions, this methodology offers a novel perspective with respect to traditional literature reviews. In addition, it ensures reproducibility over time and the possibility to semi-automatically update the analysis, as new papers become available.

Overview of main topics

In terms of descriptive textual statistics, our corpus is made of 15,465 text documents, consisting of a total of 2,685,893 lemmatized tokens (words) and 32,279 types. As a result, the type-token ratio is 1.2%. The number of hapaxes is 12,141, with a hapax-type ratio of 37.61%.
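The ratios quoted above can be re-derived from the four reported counts. The short check below uses only those counts; the variable names are illustrative. Note that the 37.61% figure is obtained by dividing the number of hapaxes by the number of types, whereas dividing by the number of tokens gives a much smaller share.

```python
# Counts as reported in the text.
documents = 15_465
tokens = 2_685_893
types = 32_279
hapaxes = 12_141

type_token_ratio = types / tokens          # share of distinct words among all tokens
hapax_share_of_types = hapaxes / types     # share of types occurring only once
hapax_share_of_tokens = hapaxes / tokens   # for comparison only

print(f"type-token ratio: {type_token_ratio:.1%}")
print(f"hapaxes / types:  {hapax_share_of_types:.2%}")
print(f"hapaxes / tokens: {hapax_share_of_tokens:.2%}")
```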

Fig 1 shows the list of 27 topics by decreasing SBS. The most researched topic is compensation, exceeding all others in prevalence, diversity, and connectivity. This means it is not only mentioned more often than other topics, but it is also connected to a greater number of other topics and is central to the discourse on gender equality. The next four topics are, in order of SBS, role, education, decision-making, and career progression. These topics, except for education, all concern women in the workforce. Between these first five topics and the following ones there is a clear drop in SBS scores. In particular, the topics that follow have a lower connectivity than the first five. They are hiring, performance, behavior, organization, and human capital. Again, except for behavior and human capital, the other three topics are purely related to women in the workforce. After another drop-off, the following topics deal prevalently with women in society. This trend highlights that research on gender in business journals has so far mainly paid attention to the conditions that women experience in business contexts, while also devoting some attention to women in society.

Fig 1: https://doi.org/10.1371/journal.pone.0256474.g001

Fig 2 shows the SBS time series of the top 10 topics. While there has been a general increase in the number of Scopus-indexed publications in the last decade, we notice that some SBS trends remain steady, or even decrease. In particular, we observe that the main topic of the last twenty-two years, compensation, is losing momentum. Since 2016, it has been surpassed by decision-making, education and role, which may indicate that literature is increasingly attempting to identify root causes of compensation inequalities. Moreover, in the last two years, the topics of hiring, performance, and organization have experienced the largest increase in importance.

Fig 2: https://doi.org/10.1371/journal.pone.0256474.g002

Fig 3 shows the SBS time trends of the remaining 17 topics (i.e., those not in the top 10). As we can see from the graph, some topics maintain a steady trend–such as reputation, management, networks and governance–and also seem to have little importance. More relevant topics with average stationary trends (except for the last two years) are culture, family, and parenting. The feminine topic is among the most important here, and one of those that exhibit the largest variations over time (similarly to leadership). On the other hand, there are some topics that, even if not among the most important, show increasing SBS trends; therefore, they could be considered as emerging topics and could become popular in the near future. These are entrepreneurship, leadership, board of directors, and sustainability. These emerging topics are also interesting to anticipate future trends in gender equality research that are conducive to overall equality in society.

https://doi.org/10.1371/journal.pone.0256474.g003

In addition to the SBS scores of the different topics, the network of terms with which they are associated makes it possible to gauge the extent to which their images (textual associations) overlap or differ (Fig 4).

https://doi.org/10.1371/journal.pone.0256474.g004

There is a central cluster of topics with high similarity, which are all connected with women in the workforce. The cluster includes topics such as organization, decision-making, performance, hiring, human capital, education and compensation. In addition, the topic of well-being is found within this cluster, suggesting that women’s equality in the workforce is associated with well-being considerations. The emerging topics of entrepreneurship and leadership are also closely connected with each other, possibly implying that leadership is a much-researched quality in female entrepreneurship. Topics that are relatively more distant include personality, politics, feminine, empowerment, management, board of directors, reputation, governance, parenting, masculine and network.

The following sections describe the top 10 topics and their main associations in literature (see Table 3 ), while providing a brief overview of the emerging topics.

https://doi.org/10.1371/journal.pone.0256474.t003

Compensation.

The topic of compensation is related to the topics of role, hiring, education and career progression; however, it also shows a very high association with the words gap and inequality. Indeed, a well-known debate in degrowth economics centers around whether and how to adequately compensate women for their childbearing, childrearing, caregiver and household work [e.g., 30 ].

Even in paid work, women continue to be offered lower compensation than their male counterparts who have the same job or cover the same role [ 64 – 67 ]. This severe inequality has been widely studied by scholars over the last twenty-two years. Within this topic, some specific roles have been addressed. Specifically, research highlighted differences in compensation between female and male CEOs [e.g., 68 ], top executives [e.g., 69 ], and board directors [e.g., 70 ]. Scholars investigated the determinants of these gaps, such as the gender composition of the board [e.g., 71 – 73 ] or women’s individual characteristics [e.g., 71 , 74 ].

Among these individual characteristics, education plays a relevant role [ 75 ]. Education is indeed presented as the solution for women, not only to achieve top executive roles, but also to reduce wage inequality [e.g., 76 , 77 ]. Past research has highlighted education influences on gender wage gaps, specifically referring to gender differences in skills [e.g., 78 ], college majors [e.g., 79 ], and college selectivity [e.g., 80 ].

Finally, the wage gap issue is strictly interrelated with hiring –e.g., looking at whether being a mother affects hiring and compensation [e.g., 65 , 81 ] or relating compensation to unemployment [e.g., 82 ]–and career progression –for instance looking at meritocracy [ 83 , 84 ] or the characteristics of the boss for whom women work [e.g., 85 ].

Role.

The roles covered by women have been deeply investigated. Scholars have focused on the role of women in their families and society as a whole [e.g., 14 , 15 ], and, more widely, in business contexts [e.g., 18 , 81 ]. Indeed, despite still lagging behind their male counterparts [e.g., 86 , 87 ], in the last decade there has been an increase in top ranked positions achieved by women [e.g., 88 , 89 ]. Following this phenomenon, scholars have paid greater attention to the presence of women on boards of directors [e.g., 16 , 18 , 90 , 91 ], given the increasing pressure to appoint female directors that firms, especially listed ones, have experienced. Other scholars have focused on the presence of women covering the role of CEO [e.g., 17 , 92 ] or being part of the top management team [e.g., 93 ]. Irrespective of the level of analysis, all these studies tried to uncover the antecedents of women’s presence among top managers [e.g., 92 , 94 ] and the consequences of having them involved in the firm’s decision-making –e.g., on performance [e.g., 19 , 95 , 96 ], risk [e.g., 97 , 98 ], and corporate social responsibility [e.g., 99 , 100 ].

Besides studying the difficulties and discrimination faced by women in getting a job [ 81 , 101 ], and, more specifically, in the hiring, appointment, or career progression to these apical roles [e.g., 70 , 83 ], the majority of research on women’s roles dealt with compensation issues. Specifically, scholars highlight the pay gap that still exists between women and men, both in general [e.g., 64 , 65 ] and with reference to board directors [e.g., 70 , 102 ], CEOs and executives [e.g., 69 , 103 , 104 ].

Finally, other scholars focused on the behavior of women when dealing with business. In this sense, particular attention has been paid to leadership and entrepreneurial behaviors. The former largely overlaps with the roles discussed above, but also includes aspects such as leaders being stereotyped as masculine [e.g., 105 ], the need for greater exposure to female leaders to reduce biases [e.g., 106 ], or female leaders acting as queen bees [e.g., 107 ]. Regarding entrepreneurship, scholars mainly investigated women’s entrepreneurial entry [e.g., 108 , 109 ], differences between female and male entrepreneurs in the evaluations and funding received from investors [e.g., 110 , 111 ], and their performance gap [e.g., 112 , 113 ].

Education.

Education has long been recognized as key to social advancement, economic stability [ 114 ], and job progression, but also as a barrier to gender equality, especially in STEM-related fields. Research on education and gender equality is mostly linked with the topics of compensation, human capital, career progression, hiring, parenting and decision-making.

Education contributes to a higher human capital [ 115 ] and constitutes an investment on the part of women towards their future. In this context, literature points to the gender gap in educational attainment, and the consequences for women from a social, economic, personal and professional standpoint. Women are found to have less access to formal education and information, especially in emerging countries, which in turn may cause them to lose social and economic opportunities [e.g., 12 , 116 – 119 ]. Education in local and rural communities is also paramount to communicate the benefits of female empowerment , contributing to overall societal well-being [e.g., 120 ].

Once women access education, the image they have of the world and their place in society (i.e., habitus) affects their education performance [ 13 ] and is passed on to their children. These situations reinforce gender stereotypes, which become self-fulfilling prophecies that may negatively affect female students’ performance by lowering their confidence and heightening their anxiety [ 121 , 122 ]. Besides formal education, the information that women are exposed to on a daily basis also contributes to their human capital. Digital inequalities, for instance, stem from men spending more time online and acquiring higher digital skills than women [ 123 ].

Education is also a factor that should boost the employability of candidates and thus hiring, career progression and compensation; however, the relationship between these factors is not straightforward [ 115 ]. First, educational choices ( decision-making ) are influenced by variables such as self-efficacy and the presence of barriers, irrespective of the career opportunities they offer, especially in STEM [ 124 ]. This brings additional difficulties to women’s enrollment and persistence in scientific and technical fields of study due to stereotypes and biases [ 125 , 126 ]. Moreover, access to education does not automatically translate into job opportunities for women and minority groups [ 127 , 128 ] or into female access to managerial positions [ 129 ].

Finally, parenting is reported as an antecedent of education [e.g., 130 ], with much of the literature focusing on the role of parents’ education on the opportunities afforded to children to enroll in education [ 131 – 134 ] and the role of parenting in their offspring’s perception of study fields and attitudes towards learning [ 135 – 138 ]. Parental education is also a predictor of the other related topics, namely human capital and compensation [ 139 ].

Decision-making.

This literature mainly points to the fact that women are thought to make decisions differently than men. Women indeed have different priorities: they care more about people’s well-being, working with people or helping others than about maximizing their personal (or their firm’s) gain [ 140 ]. In other words, women typically present more communal than agentic behaviors, which are instead more frequent among men [ 141 ]. These different attitudes, behaviors and preferences in turn affect the decisions they make [e.g., 142 ] and the decision-making of the firm in which they work [e.g., 143 ].

At the individual level, gender affects, for instance, career aspirations [e.g., 144 ] and choices [e.g., 142 , 145 ], or the decision of creating a venture [e.g., 108 , 109 , 146 ]. Moreover, in everyday life, women and men make different decisions regarding partners [e.g., 147 ], childcare [e.g., 148 ], education [e.g., 149 ], attention to the environment [e.g., 150 ] and politics [e.g., 151 ].

At the firm level, scholars highlighted, for example, how the presence of women on the board affects corporate decisions [e.g., 152 , 153 ], that female CEOs are more conservative in accounting decisions [e.g., 154 ], or that female CFOs tend to make more conservative decisions regarding the firm’s financial reporting [e.g., 155 ]. Nevertheless, firm-level research also investigated decisions that, influenced by gender bias, affect women, such as those pertaining to hiring [e.g., 156 , 157 ], compensation [e.g., 73 , 158 ], or the empowerment of women once appointed [ 159 ].

Career progression.

Once women have entered the workforce, the key aspect to achieve gender equality becomes career progression , including efforts toward overcoming the glass ceiling. Indeed, according to the SBS analysis, career progression is highly related to words such as work, social issues and equality. The topic with which it has the highest semantic overlap is role , followed by decision-making , hiring , education , compensation , leadership , human capital , and family .

Career progression implies an advancement in the hierarchical ladder of the firm, assigning managerial roles to women. Accordingly, much of the literature has focused on identifying rationales for greater female participation in the top management team and board of directors [e.g., 95 ] as well as the best criteria to ensure that decision-makers promote the most valuable employees irrespective of their individual characteristics, such as gender [e.g., 84 ]. The link between career progression, role and compensation is often provided in practice by performance appraisal exercises, frequently rooted in a culture of meritocracy that guides bonuses, salary increases and promotions. However, performance appraisals can actually mask gender-biased decisions where women are held to higher standards than their male colleagues [e.g., 83 , 84 , 95 , 160 , 161 ]. Women often have fewer opportunities to gain leadership experience and are less visible than their male colleagues, which constitute barriers to career advancement [e.g., 162 ]. Therefore, transparency and accountability, together with procedures that discourage discretionary choices, are paramount to achieve a fair career progression [e.g., 84 ], together with the relaxation of strict job boundaries in favor of cross-functional and self-directed tasks [e.g., 163 ].

In addition, a series of stereotypes about the type of leadership characteristics that are required for top management positions, which fit better with typical male and agentic attributes, are another key barrier to career advancement for women [e.g., 92 , 160 ].

Hiring.

Hiring is the entrance gateway for women into the workforce. Therefore, it is related to other workforce topics such as compensation, role, career progression, decision-making, human capital, performance, organization and education.

A first stream of literature focuses on the process leading up to candidates’ job applications, demonstrating that bias exists before positions are even opened, and it is perpetuated both by men and women through networking and gatekeeping practices [e.g., 164 , 165 ].

The hiring process itself is also subject to biases [ 166 ], for example gender-congruity bias that leads to men being preferred candidates in male-dominated sectors [e.g., 167 ], women being hired in positions with higher risk of failure [e.g., 168 ] and limited transparency and accountability afforded by written processes and procedures [e.g., 164 ] that all contribute to ascriptive inequality. In addition, providing incentives for evaluators to hire women may actually work to this end; however, this is not the case when supporting female candidates endangers higher-ranking male ones [ 169 ].

Another interesting perspective, instead, looks at top management teams’ composition and the effects on hiring practices, indicating that firms with more women in top management are less likely to lay off staff [e.g., 152 ].

Performance.

Several scholars have turned their attention to women’s performance, its consequences [e.g., 170 , 171 ] and the implications of having women in decision-making positions [e.g., 18 , 19 ].

At the individual level, research focused on differences in educational and academic performance between women and men, especially referring to the gender gap in STEM fields [e.g., 171 ]. The presence of stereotype threats–that is the expectation that the members of a social group (e.g., women) “must deal with the possibility of being judged or treated stereotypically, or of doing something that would confirm the stereotype” [ 172 ]–affects women’s interest in STEM [e.g., 173 ], as well as their results in cognitive ability tests, penalizing them [e.g., 174 ]. A stronger gender identification enhances this gap [e.g., 175 ], whereas mentoring and role models can be used as solutions to this problem [e.g., 121 ]. Despite the negative effect of stereotype threats on girls’ performance [ 176 ], female and male students perform equally in mathematics and related subjects [e.g., 177 ]. Moreover, while individuals’ performance at school and university generally affects their achievements and the field in which they end up working, evidence reveals that performance in math or other scientific subjects does not explain why fewer women enter STEM working fields; rather this gap depends on other aspects, such as culture, past working experiences, or self-efficacy [e.g., 170 ]. Finally, scholars have highlighted the penalization that women face for their positive performance, for instance when they succeed in traditionally male areas [e.g., 178 ]. This penalization is explained by the violation of gender-stereotypic prescriptions [e.g., 179 , 180 ], that is, women performing well in agentic areas, which are typically associated with men. Performance penalization can thus be overcome by clearly conveying communal characteristics and behaviors [ 178 ].

Evidence has been provided on how the involvement of women in boards of directors and decision-making positions affects firms’ performance. Nevertheless, results are mixed, with some studies showing positive effects on financial [ 19 , 181 , 182 ] and corporate social performance [ 99 , 182 , 183 ]. Other studies report a negative association [e.g., 18 ], and others again a mixed [e.g., 184 ] or non-significant association [e.g., 185 ]. With respect to the presence of a female CEO, results so far are also mixed, with some studies demonstrating a positive effect on firm performance [e.g., 96 , 186 ], while others find only limited evidence of this relationship [e.g., 103 ] or a negative one [e.g., 187 ].

Finally, some studies have investigated whether and how women’s performance affects their hiring [e.g., 101 ] and career progression [e.g., 83 , 160 ]. For instance, academic performance leads to different returns in hiring for women and men. Specifically, high-achieving men are called back significantly more often than high-achieving women, who are penalized when they have a major in mathematics; this result depends on employers’ gendered standards for applicants [e.g., 101 ]. Once appointed, performance ratings are more strongly related to promotions for women than men, and promoted women typically show higher past performance ratings than those of promoted men. This suggests that women are subject to stricter standards for promotion [e.g., 160 ].

Behavior.

Behavioral aspects related to gender follow two main streams of literature. The first examines female personality and behavior in the workplace, and their alignment with cultural expectations or stereotypes [e.g., 188 ] as well as their impacts on equality. There is a common bias that depicts women as less agentic than men. Certain characteristics, such as those more congruent with male behaviors–e.g., self-promotion [e.g., 189 ], negotiation skills [e.g., 190 ] and general agentic behavior [e.g., 191 ]–are less accepted in women. However, characteristics such as individualism in women have been found to promote greater gender equality in society [ 192 ]. In addition, behaviors such as display of emotions [e.g., 193 ], which are stereotypically female, work against women’s acceptance in the workplace, requiring women to carefully moderate their behavior to avoid exclusion. A counter-intuitive result is that women and minorities, who are more marginalized in the workplace, tend to be better problem-solvers in innovation competitions due to their different knowledge bases [ 194 ].

The other side of the coin is examined in a parallel literature stream on behavior towards women in the workplace. As a result of biases, prejudices and stereotypes, women may experience adverse behavior from their colleagues, such as incivility and harassment, which undermine their well-being [e.g., 195 , 196 ]. Biases that go beyond gender, such as for overweight people, are also more strongly applied to women [ 197 ].

Organization.

The role of women and gender bias in organizations has been studied from different perspectives, which mirror those presented in detail in the following sections. Specifically, most research highlighted the stereotypical view of leaders [e.g., 105 ] and the roles played by women within firms, for instance referring to presence in the board of directors [e.g., 18 , 90 , 91 ], appointment as CEOs [e.g., 16 ], or top executives [e.g., 93 ].

Scholars have investigated antecedents and consequences of the presence of women in these apical roles. On the one side they looked at hiring and career progression [e.g., 83 , 92 , 160 , 168 , 198 ], finding women typically disadvantaged with respect to their male counterparts. On the other side, they studied women’s leadership styles and influence on the firm’s decision-making [e.g., 152 , 154 , 155 , 199 ], with implications for performance [e.g., 18 , 19 , 96 ].

Human capital.

Human capital is a transverse topic that touches upon many different aspects of female gender equality. As such, it has the most associations with other topics, starting with education as mentioned above, and continuing with career-related topics such as role, decision-making, hiring, career progression, performance, compensation, leadership and organization. Another topic with which there is a close connection is behavior. In general, human capital is approached both from the education standpoint and from the perspective of social capital.

The behavioral aspect in human capital comprises research related to gender differences for example in cultural and religious beliefs that influence women’s attitudes and perceptions towards STEM subjects [ 142 , 200 – 202 ], towards employment [ 203 ] or towards environmental issues [ 150 , 204 ]. These cultural differences also emerge in the context of globalization which may accelerate gender equality in the workforce [ 205 , 206 ]. Gender differences also appear in behaviors such as motivation [ 207 ], and in negotiation [ 190 ], and have repercussions on women’s decision-making related to their careers. The so-called gender equality paradox sees women in countries with lower gender equality more likely to pursue studies and careers in STEM fields, whereas the gap in STEM enrollment widens as countries achieve greater equality in society [ 171 ].

Career progression is modeled by literature as a choice-process where personal preferences, culture and decision-making affect the chosen path and the outcomes. Some literature highlights how women tend to self-select into different professions than men, often due to stereotypes rather than actual ability to perform in these professions [ 142 , 144 ]. These stereotypes also affect the perceptions of female performance or the amount of human capital required to equal male performance [ 110 , 193 , 208 ], particularly for mothers [ 81 ]. It is therefore often assumed that women are better suited to less visible and less leadership -oriented roles [ 209 ]. Women also express differing preferences towards work-family balance, which affect whether and how they pursue human capital gains [ 210 ], and ultimately their career progression and salary .

On the other hand, men are often unaware of gendered processes and behaviors that they carry forward in their interactions and decision-making [ 211 , 212 ]. Therefore, initiatives aimed at increasing managers’ human capital –by raising awareness of gender disparities in their organizations and engaging them in diversity promotion–are essential steps to counter gender bias and segregation [ 213 ].

Emerging topics: Leadership and entrepreneurship

Among the emerging topics, the most pervasive one is women reaching leadership positions in the workforce and in society. This is still a rare occurrence, owing to two main types of factors. On the one hand, bias and discrimination make it harder for women to access leadership positions [e.g., 214 – 216 ]; on the other hand, the competitive nature and high pressure associated with leadership positions, coupled with the lack of women currently represented, reduce women’s desire to achieve them [e.g., 209 , 217 ]. Women are more effective leaders when they have access to education, resources and a diverse environment with representation [e.g., 218 , 219 ].

One sector where there is potential for women to carve out a leadership role is entrepreneurship . Although at the start of the millennium the discourse on entrepreneurship was found to be “discriminatory, gender-biased, ethnocentrically determined and ideologically controlled” [ 220 ], an increasing body of literature is studying how to stimulate female entrepreneurship as an alternative pathway to wealth, leadership and empowerment [e.g., 221 ]. Many barriers exist for women to access entrepreneurship, including the institutional and legal environment, social and cultural factors, access to knowledge and resources, and individual behavior [e.g., 222 , 223 ]. Education has been found to raise women’s entrepreneurial intentions [e.g., 224 ], although this effect is smaller than for men [e.g., 109 ]. In addition, increasing self-efficacy and risk-taking behavior constitute important success factors [e.g., 225 ].

Finally, the topic of sustainability is worth mentioning, as it is the primary objective of the SDGs and is closely associated with societal well-being. As society grapples with the effects of climate change and increasing depletion of natural resources, a narrative has emerged on women and their greater link to the environment [ 226 ]. Studies in developed countries have found some support for women leaders’ attention to sustainability issues in firms [e.g., 227 – 229 ], and smaller resource consumption by women [ 230 ]. At the same time, women will likely be more affected by the consequences of climate change [e.g., 230 ] but often lack the decision-making power to influence local decision-making on resource management and environmental policies [e.g., 231 ].

Research gaps and conclusions

Research on gender equality has advanced rapidly in the past decades, with a steady increase in publications, both in mainstream topics related to women in education and the workforce, and in emerging topics. Through a novel approach combining methods of text mining and social network analysis, we examined a comprehensive body of literature comprising 15,465 papers published between 2000 and mid 2021 on topics related to gender equality. We identified a set of 27 topics addressed by the literature and examined their connections.

At the highest level of abstraction, it is worth noting that papers abound on the identification of issues related to gender inequalities and imbalances in the workforce and in society. The literature has thoroughly examined the (unconscious) biases, barriers, stereotypes, and discriminatory behaviors that women face as a result of their gender. By contrast, far fewer papers discuss or demonstrate effective solutions to overcome gender bias [e.g., 121 , 143 , 145 , 163 , 194 , 213 , 232 ]. This is partly due to the relative ease of studying the status quo, as opposed to studying changes in the status quo. However, we observed a shift in more recent years towards solution seeking in this domain, which we strongly encourage future researchers to focus on. In the future, we may focus on collecting and mapping pro-active contributions to gender studies, using additional Natural Language Processing techniques able to measure the sentiment of scientific papers [ 43 ].

All of the mainstream topics identified in our literature review are closely related, and there is a wealth of insights looking at the intersection between issues such as education and career progression or human capital and role. However, emerging topics deserve further exploration. It would be interesting to see more work on the topic of female entrepreneurship, exploring aspects such as education, personality, governance, management and leadership. For instance, how can education support female entrepreneurship? How can self-efficacy and risk-taking behaviors be taught or enhanced? What are the differences in managerial and governance styles of female entrepreneurs? Which personality traits are associated with successful entrepreneurs? Which traits are preferred by venture capitalists and funding bodies?

The emerging topic of sustainability also deserves further attention, as our society struggles with climate change and its consequences. It would be interesting to see more research on the intersection between sustainability and entrepreneurship, looking at how female entrepreneurs are tackling sustainability issues, examining both their business models and their company governance. In addition, scholars are encouraged to dig deeper into the relationship between family values and behaviors.

Moreover, it would be relevant to understand how women’s networks (social capital), or the composition and structure of social networks involving both women and men, enable them to increase their remuneration and reach top corporate positions, participate in key decision-making bodies, and have a voice in communities. Furthermore, the achievement of gender equality might significantly change firm networks and ecosystems, with important implications for their performance and survival.

Similarly, research at the nexus of (corporate) governance , career progression , compensation and female empowerment could yield useful insights–for example discussing how enterprises, institutions and countries are managed and the impact for women and other minorities. Are there specific governance structures that favor diversity and inclusion?

Lastly, we foresee an emerging stream of research pertaining to how the spread of the COVID-19 pandemic challenged women, especially in the workforce, by making gender biases more evident.

For our analysis, we considered a set of 15,465 articles downloaded from the Scopus database (which is the largest abstract and citation database of peer-reviewed literature). As we were interested in reviewing business and economics related gender studies, we only considered those papers published in journals listed in the Academic Journal Guide (AJG) 2018 ranking of the Chartered Association of Business Schools (CABS). All the journals listed in this ranking are also indexed by Scopus. Therefore, looking at a single database (i.e., Scopus) should not be considered a limitation of our study. However, future research could consider different databases and inclusion criteria.

With our literature review, we offer researchers a comprehensive map of major gender-related research trends over the past twenty-two years. This can serve as a lens to look to the future, contributing to the achievement of SDG5. Researchers may use our study as a starting point to identify key themes addressed in the literature. In addition, our methodological approach–based on the use of the Semantic Brand Score and its webapp–could support scholars interested in reviewing other areas of research.

Supporting information

S1 Text. Keywords used for paper selection.

https://doi.org/10.1371/journal.pone.0256474.s001

Acknowledgments

The computing resources and the related technical support used for this work have been provided by CRESCO/ENEAGRID High Performance Computing infrastructure and its staff. CRESCO/ENEAGRID High Performance Computing infrastructure is funded by ENEA, the Italian National Agency for New Technologies, Energy and Sustainable Economic Development and by Italian and European research programmes (see http://www.cresco.enea.it/english for information).

  • 9. UN. Transforming our world: The 2030 Agenda for Sustainable Development. General Assembly 70 Session; 2015.
  • 11. Nature. Get the Sustainable Development Goals back on track. Nature. 2020;577(January 2):7–8
  • 37. Fronzetti Colladon A, Grippa F. Brand intelligence analytics. In: Przegalinska A, Grippa F, Gloor PA, editors. Digital Transformation of Collaboration. Cham, Switzerland: Springer Nature Switzerland; 2020. p. 125–41. https://doi.org/10.1371/journal.pone.0233276 pmid:32442196
  • 39. Griffiths TL, Steyvers M, editors. Finding scientific topics. National academy of Sciences; 2004.
  • 40. Mimno D, Wallach H, Talley E, Leenders M, McCallum A, editors. Optimizing semantic coherence in topic models. 2011 Conference on Empirical Methods in Natural Language Processing; 2011.
  • 41. Wang C, Blei DM, editors. Collaborative topic modeling for recommending scientific articles. 17th ACM SIGKDD international conference on Knowledge discovery and data mining 2011.
  • 46. Straka M, Straková J, editors. Tokenizing, pos tagging, lemmatizing and parsing ud 2.0 with udpipe. CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies; 2017.
  • 49. Lu Y, Li, R., Wen K, Lu Z, editors. Automatic keyword extraction for scientific literatures using references. 2014 IEEE International Conference on Innovative Design and Manufacturing (ICIDM); 2014.
  • 55. Roelleke T, Wang J, editors. TF-IDF uncovered. 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval—SIGIR ‘08; 2008.
  • 56. Mihalcea R, Tarau P, editors. TextRank: Bringing order into text. 2004 Conference on Empirical Methods in Natural Language Processing; 2004.
  • 58. Iannone F, Ambrosino F, Bracco G, De Rosa M, Funel A, Guarnieri G, et al., editors. CRESCO ENEA HPC clusters: A working example of a multifabric GPFS Spectrum Scale layout. 2019 International Conference on High Performance Computing & Simulation (HPCS); 2019.
  • 60. Wasserman S, Faust K. Social network analysis: Methods and applications: Cambridge University Press; 1994.
  • 141. Williams JE, Best DL. Measuring sex stereotypes: A multination study, Rev: Sage Publications, Inc; 1990.
  • 172. Steele CM, Aronson J. Stereotype threat and the test performance of academically successful African Americans. In: Jencks C, Phillips M, editors. The Black–White test score gap. Washington, DC: Brookings; 1998. p. 401–27

Heliyon, v.7(4); 2021 Apr

Gendered stereotypes and norms: A systematic review of interventions designed to shift attitudes and behaviour

Rebecca Stewart

a BehaviourWorks Australia, Monash Sustainable Development Institute, Monash University, Melbourne, Victoria, Australia

Breanna Wright

Steven Roberts

b School of Social Sciences, Faculty of Arts, Monash University, Melbourne, Victoria, Australia

Natalie Russell

c Victorian Health Promotion Foundation (VicHealth), Melbourne, Victoria, Australia

Associated Data

Data included in article.

In the face of ongoing attempts to achieve gender equality, there is increasing focus on the need to address outdated and detrimental gendered stereotypes and norms, to support societal and cultural change through individual attitudinal and behaviour change. This article systematically reviews interventions aiming to address gendered stereotypes and norms across several outcomes of gender inequality such as violence against women and sexual and reproductive health, to draw out common theory and practice and identify success factors. Three databases were searched: ProQuest Central, PsycINFO and Web of Science. Articles were included if they used established public health intervention types (direct participation programs, community mobilisation or strengthening, organisational or workforce development, communications, social marketing and social media, advocacy, legislative or policy reform) to shift attitudes and/or behaviour in relation to rigid gender stereotypes and norms. A total of 71 studies were included addressing norms and/or stereotypes across a range of intervention types and gender inequality outcomes, 55 of which reported statistically significant or mixed outcomes. The implicit theory of change in most studies was to change participants' attitudes by increasing their knowledge/awareness of gendered stereotypes or norms. Five additional strategies were identified that appear to strengthen intervention impact: peer engagement, addressing multiple levels of the ecological framework, developing agents of change, modelling/role models and co-design of interventions with participants or target populations. Consideration of cohort sex, length of intervention (multi-session vs single-session) and need for follow up data collection were all identified as factors influencing success. When it comes to engaging men and boys in particular, interventions with greater success include interactive learning, co-design and peer leadership. Several recommendations are made for program design, including that practitioners need to be cognisant of breaking down stereotypes amongst men (not just between genders) and of avoiding the inadvertent reinforcement of outdated stereotypes and norms.

Keywords: Gender; Stereotypes; Social norms; Attitude change; Behaviour change; Men and masculinities

1. Introduction

Gender is a widely accepted social determinant of health [ 1 , 2 ], as evidenced by the inclusion of Gender Equality as a standalone goal in the United Nations Sustainable Development Goals [ 3 ]. In light of this, momentum is building around the need to invest in gender-transformative programs and initiatives designed to challenge harmful power and gender imbalances, in line with increasing acknowledgement that ‘restrictive gender norms harm health and limit life choices for all’ ([ 2 ] pe225, see also [ 1 , 4 ]).

Gender-transformative programs and interventions seek to critically examine gender related norms and expectations and increase gender equitable attitudes and behaviours, often with a focus on masculinity [ 5 , 6 ]. They are one of five approaches identified by Gupta [ 6 ] as part of a continuum that targets social change via efforts to address gender (in particular gender-based power imbalances), violence prevention and sexual and reproductive health rights. The approaches, in ascending progressive order, are: reinforcing damaging gender (and sexuality) stereotypes, gender neutral, gender sensitive, gender-transformative, and gender empowering. The emerging evidence pertaining to the effectiveness of gender-transformative interventions points to the importance of programs challenging the gender binary and related norms, as opposed to focusing only on specific behaviours or attitudes [ 1 , 7 , 8 ]. This understanding is in part derived from a growing appreciation of the need to address outdated and detrimental gendered stereotypes and norms in order to support societal and cultural change in relation to this issue [ 9 , 10 , 11 ]. In addition to this focus on gender-transformative interventions, there is an increasing call for the engagement of men and boys not only as allies but as participants, partners and agents of change in gender equality efforts [ 12 , 13 ].

When examining the issue of gender inequality, it is necessary to consider the underlying drivers that allow for the maintenance and ongoing repetition of sex-based disparities in access to resources, power and opportunities [ 14 ]. The drivers can largely be categorised as either ‘structural and systemic’ or ‘social norms and gendered stereotypes’ [ 15 ]. Extensive research and work has been, and continues to be, undertaken in relation to structural and systemic drivers. From this perspective, efforts to address inequalities have focused on areas where societal institutions exert influence over women's rights and access. One example (of many) is the paid workforce and attempts to address unequal gender representation through policies and practices around recruitment [ 16 , 17 ], retention via tactics such as flexible working arrangements [ 18 , 19 , 20 ] and promotion [ 16 ].

The focus of this review, however, is stereotypes and norms, incorporating the attitudes, behavioural intentions and enacted behaviours that are produced and reinforced as a result of structures and systems that support inequalities. Both categories of drivers (structural and systemic and social norms and gendered stereotypes) are influenced by and exert influence upon each other. Heise and colleagues [ 12 ] suggest that gendered norms uphold the gender system and are embedded in institutions (i.e. structurally), thus determining who occupies positions of leadership, whose voices are heard and listened to, and whose needs are prioritised [ 10 ]. As noted by Kågesten and Chandra-Mouli [ 1 ], addressing both categories of drivers is crucial to the broader strategy needed to meet the UN Sustainable Development Goals.

Stereotypes are widely held, generalised assumptions regarding common traits (including strengths and weaknesses), based on group categorisation [ 21 , 22 ]. Traditional gendered stereotypes see the attribution of agentic traits such as ambition, power and competitiveness as inherent in men, and communal traits such as nurturing, empathy and concern for others as characteristics of women [ 21 , 23 , 24 , 25 , 26 ]. In addition to these descriptive stereotypes (i.e. beliefs about specific characteristics a person possesses based on their gender) are prescriptive stereotypes, which are beliefs about specific characteristics that a person should possess based on their gender [ 21 , 25 ]. Gender-based stereotypes are informed by social norms relating to ideals and practices of masculinity and femininity (e.g. physical attributes, temperament, occupation/role suitability, etc.), which are subject to the influence of culture and time [ 15 , 21 , 26 ].

Social norms are informal (often unspoken) rules governing the behaviour of a group, emerging out of interactions with others and sanctioned by social networks [ 27 ]. Whilst stereotypes inform our assumptions about someone based on their gender [ 21 ], social norms govern the expected and accepted behaviour of women and men, often perpetuating gendered stereotypes (i.e. men as agentic, women as communal) [ 12 ]. Cialdini and Trost [ 27 ] delineate norms by suggesting that, in addition to these general societal behavioural expectations (see also [ 28 , 29 ]), there are personal norms (what we expect of ourselves) [ 30 ], and subjective norms (what we think others expect of us) [ 31 ]. Within subjective norms, there are injunctive norms (behaviours perceived as being approved by others) and descriptive norms (our observations and expectations of what most others are doing). Despite being malleable and subject to cultural and socio-historical influences, portrayals and perpetuation of these stereotypes and social norms restrict aspirations, expectations and participation of both women and men, with demonstrations of counter-stereotypical behaviours often met with resistance and backlash ([ 12 , 24 , 32 ], see also [ 27 , 33 ]). These limitations are evident both between and among women and men, demonstrative of the power hierarchies that gender inequality and its drivers produce and sustain [ 12 ].

There is an extensive literature that explores interventions targeting gendered stereotypes and norms, each focusing on specific outcomes of gender inequality, such as violence against women [ 13 ], gender-based violence and sexual and reproductive health (including HIV prevention, treatment, care and support) [ 5 , 8 ], parental involvement [ 34 ], sexual and reproductive health rights [ 23 , 35 ], and health and wellbeing [ 2 ]. Comparison of learnings across these focus areas remains difficult, however, because no synthesis of interventions across outcomes currently exists.

Despite this gap, one of the key findings to arise out of the literature relates to the common, and often implicit, theory of change around shifting participants' attitudes by increasing their knowledge/awareness of gendered stereotypes or norms, and the assumption that this will then lead to behaviour change. This was identified by Jewkes and colleagues [ 13 ] in their review of 67 intervention evaluations in relation to the prevention of violence against women, a finding they noted contradicts research across disciplines, which has consistently found this relationship to be complex and bidirectional [ 36 , 37 ]. Similarly, the International Centre for Research on Women identifies the ‘problematic assumption[s] regarding pathways to change’ ([ 7 ] p26) as one of the challenges to engaging men and boys in gender equality work, noting also that evaluation, when undertaken, focuses on changes in attitude rather than behaviour. Ruane-McAteer and colleagues [ 35 ] made the same observation when looking at interventions aimed at gender equality in sexual and reproductive health, highlighting the need for greater interrogation of the intended outcomes of interventions, including what the underlying theory of change is. These findings lend further support to the utilisation of the gender-transformative approach identified by Gupta [ 6 ] if fundamental and sustained shifts in understanding, attitudes and behaviour relating to gender inequality are the desired outcome.

In sum, much is known about gender stereotypes and norms and the contribution they make to perpetuating and sustaining gender inequality through the various outcomes discussed above. Less is known, however, about how to support and sustain more equitable attitudes and behaviours when it comes to addressing gender equality more broadly. This systematic review aims to address the question of which intervention characteristics support change in attitudes and behaviour in relation to rigid gender stereotypes and norms. It will do this by consolidating the literature to determine what has been done and what works. This includes querying which intervention types work for whom in terms of participant age and sex, as well as delivery style and duration. Additionally, it will consider the theories of change being used to address attitudes and behaviours and how these shifts are being measured, including for impact longevity. Finally, it will allow for insight into interventions specifically targeting men and boys in relation to rigid gender stereotypes and norms, seeking out particular characteristics that are supportive of work engaging this particular cohort. These questions are intentionally broad. Based on the framing of the above question, it is expected that the review will primarily capture interventions that address gender inequitable attitudes and behaviours, and through them the underlying societal factors that support a culture in which harmful power and gender imbalances exist. In asking these questions, this review consolidates the knowledge generated to date, to strengthen the design, development and implementation of future interventions, a synthesis that appears to be both absent and needed.

2.1. Data sources and search strategy

This review was undertaken in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [ 38 ]. A protocol was registered on the Open Science Framework (Title: Gendered norms: A systematic review of how to achieve change in rigid gender stereotypes, accessible at https://osf.io/gyk25/ ). Qualitative, quantitative and mixed method studies were identified through three electronic databases searched in February 2019 (ProQuest Central, PsycINFO and Web of Science). Four search strategies were developed in consultation with a subject librarian and tested across all three databases. The final strategy was confirmed by the lead author and a second reviewer (see Table 1 ).

Table 1

Search terms used.

There were no date or language exclusions; Title, Abstract & Keyword filters were applied where possible, and truncation was used in line with database specifications. The following intervention categories were included due to their standing in public health literature as being effective in creating population level impact and having proven effective in addressing other significant health and social issues [ 39 ]: direct participation programs (referred to also as education based interventions throughout this review), community mobilisation or strengthening, organisational or workforce development, communications, social marketing and social media, advocacy, legislative or policy reform. Table 2 provides descriptions of each of these intervention categories, which have been obtained from the actions outlined in the World Health Organisation's Ottawa Charter [ 40 ] and Jakarta Declaration [ 41 ] and are a comprehensive set of strategies grounded in prevention theory [ 42 ]. For the purposes of this review, legislative and policy reform within community, educational, organisational and workforce settings were included. Government legislation and policy reform were excluded.

Table 2

Public health intervention categories.

2.2. Screening

Initial search results were merged and duplicates removed using EndNote before transferring data management to Covidence for screening. Two researchers independently screened titles and abstracts excluding studies based on the criteria stipulated in Table 3 .

Table 3

Inclusion and exclusion criteria.

The University Library document request service was used to obtain articles otherwise inaccessible or in languages other than English. In cases where full-text or English versions were unable to be obtained, the study was excluded. Full-text screening was undertaken by the same two researchers independently and the final selection resulted in 71 included studies (see Figure 1 ).

Figure 1

PRISMA diagram of screening and study selection.

2.3. Data extraction

Data extraction was undertaken by the first author and checked for accuracy by the second author. Discrepancies were resolved by consensus with the remaining three authors. The extracted data included: citation, year and location of study, participant demographics (gender, age), study design, setting, theoretical underpinnings, motivation for study, measurement tools/instruments, primary outcomes and results. A formal meta-analysis was not conducted given heterogeneity of outcome variables and measures, due in part to the broad nature of the review question.

2.4. Quality appraisal

Three established quality appraisal tools were used to account for the different study designs included: the McMaster Critical Review Form – Qualitative Studies 2.0 [ 43 ], the McMaster Critical Review Form – Quantitative Studies [ 44 ], and the Mixed Methods Appraisal Tool (MMAT), version 2018 [ 45 ]. The first author completed quality appraisal for all studies, with the second author undertaking an accuracy check on ten percent of studies. The appraisal score represents the proportion of ‘yes’ responses out of the total number of criteria. ‘Not reported’ was treated as a ‘no’ response. A discussion of the outcomes is located under Results.
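The scoring rule described above can be illustrated with a minimal sketch. The function names and the example responses are hypothetical and are not taken from the review; the quality bands follow the cut-offs reported in the quality assessment results (poor below .50, moderate .50 to .79, high above .80), and the treatment of a score of exactly .80 as high is an assumption made here for illustration.

```python
from typing import Iterable


def appraisal_score(responses: Iterable[str]) -> float:
    """Proportion of 'yes' responses out of the total number of criteria.

    Any response other than 'yes' (including 'not reported') counts as 'no'.
    """
    answers = [r.strip().lower() for r in responses]
    if not answers:
        raise ValueError("at least one appraisal criterion is required")
    return sum(a == "yes" for a in answers) / len(answers)


def quality_band(score: float) -> str:
    """Map a score to a band; the boundary at exactly .80 is an assumption."""
    if score < 0.50:
        return "poor"
    if score < 0.80:
        return "moderate"
    return "high"


# Hypothetical responses for one study appraised against five criteria
example = ["yes", "yes", "no", "not reported", "yes"]
score = appraisal_score(example)
print(f"score = {score:.2f}, band = {quality_band(score)}")
```

Counting anything other than ‘yes’ as a ‘no’ keeps the score conservative, in line with the stated handling of ‘Not reported’.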

2.5. Data synthesis

Included studies were explored using a modified narrative synthesis approach comprising three elements: developing a theory of how interventions worked, why and with whom, developing a preliminary synthesis of findings of included studies, and exploring relationships in studies reporting statistically significant outcomes [ 46 ]. Preliminary analysis was conducted using groupings of studies based on intervention type and thematic analysis based on gender inequality outcomes driving the study and features of the studies including participant sex and age and intervention delivery style and duration [ 46 ]. A conceptual model was developed (see Theory of Change section under Results) as the method of relationship exploration amongst studies reporting significant results, using qualitative case descriptions [ 47 ]. The narrative synthesis was undertaken under the premise that the ‘evidence being synthesised in a systematic review does not necessarily offer a series of discrete answers to a specific question’, so much as ‘each piece of evidence offers a partial picture of the phenomenon of interest’ ([ 46 ] p21).

3. Results

3.1. Literature search

The literature search returned 4,050 references after the removal of duplicates (see Figure 1 ), from which 210 potentially relevant abstracts were identified. Full-text review resulted in a final list of 71 articles evaluating 69 distinct interventions aligned with the public health methodologies outlined in Table 2 . Table 4 provides a list of the included studies, categorised by intervention type. Studies fell into eight categories of interventions in total, with several combining two methodology types described in Table 2 .
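As a rough illustration of how selective the screening was, the counts reported above can be turned into retention proportions. This is a sketch only: the stage labels and the `retention` helper are hypothetical, and the three counts are the ones stated in the text.

```python
# Counts as reported in the literature search paragraph
stages = [
    ("references after duplicate removal", 4050),
    ("potentially relevant abstracts", 210),
    ("included articles", 71),
]


def retention(stages):
    """Return (label, count, share of previous stage, share of first stage)."""
    first = stages[0][1]
    previous = first
    rows = []
    for label, count in stages:
        rows.append((label, count, count / previous, count / first))
        previous = count
    return rows


for label, count, of_previous, of_first in retention(stages):
    print(f"{label}: {count} "
          f"({of_previous:.1%} of previous stage, {of_first:.1%} of all references)")
```

On these figures, roughly one in twenty references passed abstract screening and about a third of those were retained after full-text review.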

Table 4

Included articles categorised by intervention type.

3.2. Quality assessment

Overall, the results of the quality appraisal indicated a moderate level of confidence in the results. The appraisal scores for the 71 studies ranged from poor (.24) to excellent (.96). The median appraisal score was .71 for all included studies (n = 71) and .76 for studies reporting statistically significant positive results (n = 32). The majority of studies were rated moderate quality (n = 57, 80%), with moderate quality regarded as .50 - .79 [ 119 ]. Ten studies were regarded as high quality (14%, >.80), and four were rated as poor (6%, <.50) [ 119 ]. Of the studies with significant outcomes, one rated high quality (.82) and the remaining 31 were moderate quality, with 18 of these (58% of 31) rating >.70. For the 15 randomised control trials (including n = 13 cluster randomised trials), all articles provided clear study purposes and design, intervention details, reported statistical significance of results, reported appropriate analysis methods and drew appropriate conclusions. However, only four studies appropriately justified sampling process and selection. For the qualitative studies (n = 5), the lowest scoring criteria related to describing the process of purposeful selection (n = 1, 20%) and sampling until redundancy in data was reached (n = 2, 40%). For the quantitative studies (n = 47), the lowest scoring criteria related to sample size justification (n = 8, 17%), avoiding contamination (n = 1, 2%) and avoiding co-intervention (n = 0, none of the studies provided information on this) with regard to intervention participants. For the mixed method studies (n = 19), the lowest scoring criterion for the qualitative component of the research was whether the findings were adequately derived from the data (n = 9, 47%), and the lowest scoring mixed methods criterion was whether divergences and inconsistencies between quantitative and qualitative results were adequately addressed (n = 6, 32%).

3.3. Measures

Measures of stereotypes and norms varied across quantitative and mixed method studies, with 31 (47%) of the 66 articles reporting the use of 25 different psychometric evaluation tools. The remaining 35 (53%) quantitative and mixed methods studies reported developing measurement tools specific to the study, with inconsistencies in the description and provision of psychometric properties. Of the studies that used psychometric evaluation tools, the most frequently used were the Gender Equitable Men Scale (GEMS, n = 6, plus n = 2 used questions from the GEMS), followed by the Gender Role Conflict Scale I (GRCS-I, n = 5, plus n = 1 used a Short Form version) and the Gender-Stereotyped Attitude Scale for Children (GASC, n = 5). Whilst most studies used explicit measures as listed here, implicit measures were also used across several studies, including the Gender-Career Implicit Attitudes Test (n = 1). The twenty-four studies that undertook qualitative data collection used interviews (participant n = 15, key informant n = 3) as well as focus groups (n = 8), ethnographic observations (n = 5) and document analysis (n = 2). Twenty (28%) of the 71 studies measured behaviour and/or behavioural intentions, of which nine (45%) used self-report measures only, four (20%) used self-report and observational data, and two (10%) used observation only. Follow-up data was collected for four of the studies using self-report measures, two using observation measures, and one using both methods.

3.4. Study and intervention characteristics

Table 5 provides a summary of study and intervention characteristics. All included studies were published between 1990 and 2019; n = 8 (11%) between 1990 and 1999, n = 15 (21%) between 2000 and 2009, and the majority n = 48 (68%) from 2010 to 2019. Interventions were delivered in 23 countries (one study did not specify a location), with the majority conducted in the U.S. (n = 33, 46%), followed by India (n = 10, 14%). A further 15 studies (21%) were undertaken in Africa across East Africa (n = 7, Ethiopia, Malawi, Mozambique, Uganda), South Africa (n = 6), and West Africa (n = 2, Nigeria, Senegal). The remaining fifteen studies were conducted in Central and South America (n = 4, Mexico, Guatemala, El Salvador and Argentina), Europe (n = 3, Ireland, Spain and Turkey), Nepal (n = 2), and one study each in Australia, China, Oman, Pakistan, Sri Lanka and the United Kingdom. Forty-seven (66%) studies employed quantitative methods, 19 (27%) reported both quantitative and qualitative (mixed) methods, and the remaining five studies (7%) reported qualitative methods. Forty-two of the quantitative and mixed-method approaches were non-randomised control trials, 13 were cluster randomised control trials, two were randomised control trials, and eight were quantitative descriptive studies.

Table 5

Summarised study and intervention characteristics (n = 71).

Based on total study sample sizes, data was reported on 46,673 participants. Sample sizes ranged from 15 to 122 for qualitative, 7 to 2887 for mixed methods, and 21 to 6073 for quantitative studies. Of the 71 studies, 23 (32%) reported on children (<18 years old), 13 (18%) on adolescents/young adults (<30 years old), 29 (41%) on adults (>18 years old), and six (8%) studies did not provide details on participant age. Thirty-seven (52%) studies recruited participants from educational settings (i.e. kindergarten, primary, middle and secondary/high school, tertiary including college residential settings, and summer camps/schools), 32 (45%) from general community settings (including home and sports), three from therapy-based programs for offenders (i.e. substance abuse and partner abuse prevention), and one sourced participants from both an educational (vocational) setting and a workplace (factory).

As per Table 5 , the greatest proportion of all studies engaged mixed sex cohorts (n = 39, 55%), looked at norms (n = 34, 48%), were undertaken in community settings (n = 32, 45%), were education/direct participant interventions (n = 47, 66%) and undertook pre and post intervention evaluation (n = 49, 69%). Twenty-four studies reported on follow up data collection, with 10 reporting maintenance of outcomes.

Intervention lengths were varied, from individual sessions (90 min) to ongoing programs (up to 6 years) and were dependent on intervention type. Table 6 provides the duration range by intervention type.

Table 6

Intervention type and duration.

Of the 71 studies examined in this review, 10 (14%) stated a gender approach in relation to the continuum outlined at the start of this paper, utilising two of the five categories: gender-transformative and gender-sensitive [ 6 ]. Eight studies stated that they were gender-transformative, the definition of this strategy being to critically examine gender related norms and expectations and increase gender equitable attitudes and behaviours, often with a focus on masculinity [ 9 , 10 ]. An additional two stated they were gender-sensitive, the definition of which is to take into account and seek to address existing gender inequalities [ 10 ]. The remaining 61 (86%) studies did not state engagement with a specific gender approach. Interpretation of the gender approach was not undertaken in relation to these 61 studies due to insufficient available data and to avoid potential risk of error, mislabelling or misidentification.

3.5. Characteristics supporting success

Due to the broad inclusion criteria for this review, there is considerable variation in study designs and the measurement of attitudes and behaviours. With the exception of the five studies using qualitative methods, all included studies reported on p-values, and 13 reported on effect sizes [ 51 , 60 , 66 , 69 , 70 , 71 , 77 , 78 , 79 , 83 , 92 , 99 , 110 ]. In addition to this, the centrality of gender norms and/or stereotypes within studies meeting inclusion criteria varied from a primary outcome to a secondary one, and in some studies was a peripheral consideration only, with minimal data reported. This heterogeneity prevents comparisons based purely on whether the outcomes of the studies were statistically significant, and as such consideration was also given to the inclusion of effect sizes, author interpretation, qualitative insights and whether outcomes reported as statistically non-significant reported encouraging results, which allowed for the inclusion of those using qualitative methods only [ 53 , 73 , 81 , 82 , 98 ].

As outlined in Table 5 , the studies were grouped into three categories based on reporting of statistical significance using p-values. Two categories include studies reporting statistically significant outcomes (n = 25) and those reporting mixed outcomes including some statistically significant results (n = 30), specifically in relation to the measurement of gender norms and/or stereotypes. Disparate outcomes included negligible behavioural changes, a shift in some but not all norms (i.e. shifts in descriptive but not personal norms, or masculine but not feminine stereotypes), and effects seen in some but not all participants (i.e. shifts in female participant scores but not male). It is worth noting that out of the 71 studies reviewed, all but one reported positive or negligible intervention impacts on attitudes and/or behaviours relating to gender norms and/or stereotypes. The third category includes studies reporting non-significant results (n = 2) as well as those that reported non-significant but positive results in relation to attitude and/or behaviour change towards gender norms and/or stereotypes (n = 14). These studies include those with qualitative designs, several that reported descriptive statistics only, and several that did not reach statistical significance but demonstrated improvement in participant scores between baseline and endline and/or between intervention and control groups. The insights from the qualitative studies (n = 5) have been taken into consideration in the narrative synthesis of this review.
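The grouping above can be checked with a short tally. The sketch below uses only the counts stated in the paragraph; the dictionary keys and the `shares` helper are hypothetical labels introduced for illustration.

```python
# Counts as stated in the text; labels are illustrative only
outcomes = {
    "statistically significant": 25,
    "mixed (some significant results)": 30,
    "non-significant": 2,
    "non-significant but positive": 14,
}
TOTAL_STUDIES = 71


def shares(counts, total):
    """Return each group's share of the total, checking the counts add up."""
    if sum(counts.values()) != total:
        raise ValueError("group counts do not sum to the number of studies")
    return {label: n / total for label, n in counts.items()}


for label, share in shares(outcomes, TOTAL_STUDIES).items():
    print(f"{label}: {outcomes[label]} studies ({share:.0%})")

# Studies with significant or mixed outcomes, the subset examined in later paragraphs
significant_or_mixed = (outcomes["statistically significant"]
                        + outcomes["mixed (some significant results)"])
print(f"significant or mixed: {significant_or_mixed}")
```

The four counts sum to the 71 included studies, and the first two together give the 55 studies with statistically significant or mixed outcomes that the rest of this section refers to.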

Studies reporting statistically significant outcomes were represented across seven of the eight intervention types. The only intervention category not represented was advocacy and education [ 48 ], which reported non-significant but positive results. The remainder of this section will consider the study characteristics of the statistically significant and mixed results categories, as well as identifying similar trends observed in the qualitative studies which reported positive but non-significant intervention outcomes. When considering intervention type, direct participant education was the most common: 49 of the 55 studies reporting statistically significant or mixed outcomes contained a direct participant education component, as did all but one of the five qualitative studies.

The majority of interventions reporting achievement of intended outcomes involved delivery of multiple sessions, ranging from five 20-min sessions across one week to multiple sessions across six years. This included 48 of the 55 studies reporting statistically significant or mixed outcomes, and all five qualitative studies. Only one of the seven that utilised single/one-off sessions reported significant outcomes. The remaining six studies had varying results, including finding shifts in descriptive but not personal norms amongst a male-only cohort, shifts in acceptance of both genders performing masculine behaviours but no shift in acceptance of males performing feminine behaviours, and significant outcomes for participants already demonstrating more egalitarian attitudes at baseline but not for those holding more traditional ones, who are arguably the target audience.

When considering participant sex, the majority of studies reporting statistically significant or mixed results engaged mixed sex cohorts (n = 33 out of 55), with the remaining studies engaging male only (n = 13) and female only (n = 9) cohorts. Of the qualitative studies, three engaged mixed sex participant cohorts. Interestingly however, several studies reported disparate results, including significant outcomes for male but non-significant outcomes for female participants primarily in studies incorporating a community mobilisation element, and the reverse pattern in some studies that were education based. Additional discrepancies were found between several studies looking at individual and community level outcomes.

Finally, a quarter of studies worked with male only cohorts (n = 18). Of these, four reported significant results, nine reported mixed results, and the remaining five studies reported non-significant but positive outcomes, one of which was a qualitative study. Within these studies, two demonstrated shifts in more generalised descriptive norms and/or stereotypes relating to men, but not in relation to personal norms. Additionally, several studies demonstrated that shifts in male participant attitudes were not generalised, with discrepancies found in relation to attitudes shifting towards women but not men and in relation to some norms or stereotypes (for example men acting in ‘feminine’ ways) but not others that appeared to be more culturally entrenched. These studies are explored further in the Discussion.

In summary, interventions that used direct participant education, across multiple sessions, with mixed sex participant cohorts were associated with greater success in changing attitudes and, in a small number of studies, behaviour. Further to these characteristics, several strategies were identified that appear to enhance intervention impact; these are discussed in the next section.

3.6. Theory of change

One aim of this review was to draw out common theory and practice in order to strengthen future intervention development and delivery. Across all included studies, the implicit theory of change was raising knowledge/awareness for the purposes of shifting attitudes relating to gender norms and/or stereotypes. Direct participant education was the predominant method of delivery. In addition to this, 23 (32%) studies attempted to take this a step further to address behaviour and/or behavioural intentions. Of these, 10 looked at gender equality outcomes (including bystander action and behavioural intentions), whilst the remaining studies focused on gender-based violence (n = 9) or sexual and reproductive health (n = 2), or examined behaviours not related to the focus of this review (n = 2).

As highlighted in Figure 2 , this common theory of change was the same across all identified intervention categories, irrespective of the overarching focus of the study (gender equality, prevention of violence, sexual and reproductive health, mental health and wellbeing). Those examining gender equality more broadly did so in relation to female empowerment in relationships, communities and political participation, identifying and addressing stereotypes and normative attitudes with kindergarten and school aged children. Those considering prevention of violence did so specifically in relation to violence against women, including intimate partner violence, rape awareness and myths, and a number of studies looking at teen dating violence. Sexual and reproductive health studies primarily assessed prevention of HIV, but also men and women's involvement in family planning, with several exploring the interconnected issues of violence and sexual and reproductive health. Finally, those studies looking at mental health and wellbeing did so in relation to mental and physical health outcomes and associated help-seeking behaviours, including reducing stigma around mental health (particularly amongst men in terms of acceptance and help seeking) and emotional expression (in relationships).

Figure 2

Breakdown of study characteristics and strategies associated with achieving intended outcomes.

In addition to the implicit theory of change, the review process identified five additional strategies that appear to have strengthened interventions (regardless of intervention type). One or more of these strategies were utilised by 31 of the 55 studies that reported statistically significant results:

  • Addressing more than one level of the ecological framework (n = 17): which refers to different levels of personal and environmental factors, all of which influence and are influenced by each other to differing degrees [ 120 ]. The levels are categorised as individual, relational, community/organisational and societal, with the individual level being the most commonly addressed across studies in this review;
  • Peer engagement (n = 14): Using participant peers (for example people from the same geographical location, gender, life experience, etc.) to support or lead an intervention, including the use of older students to mentor younger students, or using peer interactions as part of the intervention to enhance learning. This included students putting on performances for the broader school community, facilitation of peer discussions via online platforms or face-to-face via direct participant education and group activities or assignments;
  • Use of role models and modelling of desired attitudes and/or behaviours by facilitators or persons of influence in participants' lives (n = 11);
  • Developing agents of change (n = 7): developing knowledge and skills for the specific purpose of participants using these to engage with their spheres of influence and further promote, educate and support the people and environments in which they interact; and
  • Co-design (n = 6): Use of formative research or participant feedback to develop the intervention or to allow flexibility in its evolution as it progresses.

Additionally, four of the five studies using qualitative methods utilised one or more of these strategies: ecological framework (n = 3), peer engagement (n = 1), role models (n = 2), agents of change (n = 2) and co-design (n = 1). Whilst only a small number of studies reported engaging the last two strategies (developing agents of change and co-design), these have been highlighted, along with the use of role models/modelling, due to their prominence in work with the sub-set of men and boys.

The remaining 24 studies that reported significant outcomes did not utilise any of these five strategies. Eight used a research/experimental design, the remaining 16 were all direct participant education interventions, and either did not provide enough detail about the intervention structure or delivery to determine if they engaged in any of these strategies (n = 13), were focused on testing a specific theory (n = 2) or in the case of one study used financial incentives.

Figure 2 provides a conceptual model exploring the relationship amongst studies reporting statistically significant outcomes. Utilising the common theory of change as well as the additional identified strategies, interventions were able to address factors that act as gender inequality enforcers including knowledge, attitudes, environmental factors and behaviour and behavioural intentions (see Table 7 ), to achieve statistically significant shifts in attitudes, and in a small number of cases behaviour (see Table 8 ).

Table 7

Factors supportive of gender inequality in studies reporting significant positive outcomes (n = 55).

Table 8

Changes observed in attitudes and behaviours in studies reporting significant positive outcomes (n = 55).

4. Discussion

This systematic review synthesises evidence on ‘which intervention characteristics support change in attitudes and behaviours in relation to rigid gender stereotypes and norms’, based on the seventy-one studies that met the review inclusion criteria. Eight intervention types were identified, seven of which achieved statistically significant outcomes. Patterns of effectiveness were found based on delivery style and duration, as well as participant sex, and several strategies (peer engagement, addressing multiple levels of the ecological framework, skilling participants as agents of change, use of role models and modelling of desired attitudes and behaviours, and intervention co-design with participants) were identified that enhanced shifts in attitudes and in a small number of studies, behaviour. Additionally, a common theory of change was identified (increasing knowledge and raising awareness to achieve shifts in attitudes) across all studies reporting statistically significant results.

The articles included in this review covered a range of intervention types, duration and focus, demonstrating relative heterogeneity across these elements. This is not an unexpected outcome given the aim of this review was to allow for comparisons to be drawn across interventions, regardless of the overarching focus of the study (gender equality, prevention of violence, sexual and reproductive health, mental health and wellbeing). As a result, one of the key findings of this review is that design, delivery and engagement strategies that feature in studies reporting successful outcomes, are successful regardless of the intervention focus thus widening the evidence base from which those researching and implementing interventions can draw. That said, the heterogeneity of studies limits the ability for definitive conclusions to be drawn based on the studies considered in this review. Instead this section provides a discussion of the characteristics and strategies observed based on the narrative synthesis undertaken.

4.1. Intervention characteristics that support success

4.1.1. Intervention type and participant demographics

The 71 included studies were categorised into eight intervention types (see Table 4 ): advocacy and education, advocacy and community mobilisation, community mobilisation, community mobilisation and education, education (direct participant), research and education, research, and two studies that utilised four or more intervention types (advocacy via campaigns and social media, community mobilisation, education and legislation; and advocacy, education, community mobilisation, policy and social marketing). With the exception of the individual study that utilised advocacy and education, all intervention types were captured in studies reporting statistically significant or mixed results.

Direct participant education was the most common intervention type across all studies (n = 47 out of 71, 66%). When considering those studies that included a component of direct participant education in their intervention (e.g. those studies which engaged education and community mobilisation) this figure rose to 63 of the 69 individual interventions looked at in this review, 54 of which reported outcomes that were either statistically significant (n = 23), mixed (n = 26) or were non-significant due to the qualitative research design, but reported positive outcomes (n = 5). These findings indicate that direct participant education is both a popular and an effective strategy for engaging participants in attitudinal (and in a small number of cases behaviour) change.
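The proportions quoted in the preceding paragraph can be re-derived from the reported counts. The short Python sketch below is illustrative only: the variable names are hypothetical, the counts are simply those stated in the paragraph, and the 91% figure is a derived value that is not itself reported in the review.

```python
# Counts as reported in the preceding paragraph (variable names are illustrative).
total_studies = 71
education_only = 47            # direct participant education as the intervention type
interventions = 69             # individual interventions examined in the review
with_education_component = 63  # interventions with any direct education component

# Outcome breakdown for education-containing interventions reporting positive results.
outcomes = {"significant": 23, "mixed": 26, "qualitative_positive": 5}


def pct(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


education_only_pct = pct(education_only, total_studies)       # share of all studies
component_pct = pct(with_education_component, interventions)  # derived, not in the source
positive_total = sum(outcomes.values())                       # should match the stated 54

print(education_only_pct, component_pct, positive_total)
```

Run as written, the three values come out as 66, 91 and 54, which agrees with the 66% and the total of 54 given in the text.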

Similarly, mixed sex participant cohorts were involved in over half of all studies (n = 39 out of 71, 55%), of which 33 reported statistically significant or mixed results, and a further three did not meet statistical significance due to the qualitative research design but reported positive outcomes. Across several studies, however, conflicting results were observed between male and female participants, with females showing greater improvement in interventions using education [ 85 , 89 , 114 ] and males showing greater improvement when community mobilisation was incorporated [ 51 , 60 ]. That is not to say that male participants do not respond well to education-based interventions, with 13 of the 18 studies engaging male only cohorts reporting intended outcomes using direct participant education. However, of these studies, nine also utilised one or more of the additional strategies identified, such as co-design or peer engagement, which, whilst different to community engagement, employ similar principles around participant engagement [ 77 , 79 , 87 , 91 , 92 , 96 , 97 , 99 , 105 , 107 , 111 , 115 ]. These findings suggest that participant sex may impact on how well participants engage with an intervention type and thus how successful it is.

There was a relatively even spread of studies reporting significant outcomes across all age groups, in line with the notion that the impact of rigid gender norms and stereotypes is not age-discriminant [ 10 ]. Whilst the broad nature of this review curtailed the possibility of determining the impact of age based on the studies synthesised, the profile of studies reporting statistically significant outcomes indicates that no patterns were found in relation to impact and participants' age.

The relatively small number of studies that observed the above differences in intervention design and delivery means definitive conclusions cannot be drawn based on the studies examined in this review. That said, all of these characteristics support an increase in personal buy-in. Interventions that incorporate community mobilisation engage with more than just the individual, often addressing community norms and creating environments supportive of change [ 49 , 50 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 , 117 , 118 ]. Similarly, education based programs that incorporate co-design and peer support do more than just raise knowledge and awareness in an individual participant, providing space for them to develop their competence and social networks [ 70 , 75 , 77 , 79 , 81 , 86 , 90 , 91 , 92 , 93 , 97 , 103 , 107 , 109 , 110 , 111 , 113 , 115 , 116 ]. When it comes to designing these interventions, it would appear that success may be influenced by which method is most engaging to the participants and that this is in turn influenced by the participants' sex. This finding is reinforced further when taking study quality into consideration: studies reporting on a mixed-sex cohort were generally lower in quality than those working with single sex groups. Whilst it appears mixed sex cohorts are both common and effective at obtaining significant results, these findings suggest that when addressing gendered stereotypes and norms, there is a need to consider and accommodate differences in how participants learn and respond when designing interventions, both to ensure the greatest chance of impacting on all participants, regardless of sex, and to ensure quality of study design.

4.1.2. Intervention delivery

The findings from this review suggest that multi-session interventions are both more common and more likely to deliver significant outcomes than single-session or one-off interventions. This is evidenced by the fact that only one [ 67 ] out of seven studies engaging the use of one-off sessions reported significant outcomes, with the remaining six reporting mixed results [ 63 , 66 , 68 , 69 , 78 , 90 ]. Additionally, all but two of the studies [ 78 , 90 ] used a research/experimental study design, indicating a current gap in the literature in terms of real-world application and effectiveness of single session interventions. This review highlights the lack of reported evidence of single session effectiveness, particularly in terms of maintaining attitudinal changes in the few instances in which follow-up data was collected. Additionally, this review only captured single sessions that ran to a maximum of 2.5 h; further investigation is needed into the impact of one-off intensive sessions, such as those run over the course of a weekend. While more evidence is needed to reach definitive conclusions, the review indicates that single-session or one-off interventions are sub-optimal, aligning with the finding by Barker and colleagues [ 5 ] in their review of interventions engaging men and boys in changing gender-based inequity in health. This is further reflected in the health promotion literature that points to the lack of demonstrated effectiveness of single-session direct participant interventions when it comes to addressing social determinants of health [ 121 , 122 , 123 ]. Studies that delivered multiple sessions demonstrate the ability to build rapport with and amongst the cohort (peer engagement, modelling, co-design) and allow the greater depth of learning and retention achievable through repeated touch points and revision. These are elements that can only happen through recurring and consistent exposure. Given these findings, practitioners should consider avoiding one-off or single-session delivery, in favour of multi-session or multi-touch point interventions allowing for greater engagement and impact.

4.1.3. Evaluation

Very few included studies collected follow-up data, with only one third of studies evaluating beyond immediate post-intervention data collection (n = 24). Of those that did, ten reported maintenance of their findings [ 55 , 56 , 64 , 70 , 79 , 93 , 95 , 103 , 113 , 116 ], eleven did not provide sufficient detail to determine [ 50 , 52 , 57 , 65 , 66 , 82 , 91 , 92 , 94 , 102 , 105 ] and two reported findings were not maintained [ 61 , 90 ]. The last study, a 90 min single session experiment with an education component, reported significant positive outcomes between base and end line scores, but saw a significant negative rebound in scores to worse than base line when they collected follow up data six weeks later [ 63 ]. This study supports the above argument for needing more than a single session in order to support change long term and highlights the importance of capturing follow up data not only to ensure longevity of significant outcomes, but also to capture reversion effects. The lack of standardised measures to capture shifts in norms is acknowledged empirically [ 11 , 13 ]. However, the outcomes of this review, including the lack of follow up data collection reported, are supportive of the need for increased investment in longitudinal follow-up, particularly in relation to measuring behaviour change and ensuring maintenance of observed changes to attitudes and behaviour over time (see also [ 124 ]).
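As a quick consistency check on the follow-up figures in the preceding paragraph, the four reported categories can be tallied against the stated total. This is a minimal sketch: the dictionary keys are hypothetical labels introduced here, and the counts are taken directly from the text.

```python
# Follow-up outcome counts as reported in the preceding paragraph
# (the category labels are illustrative, not terms used by the review).
total_studies = 71
follow_up = {
    "findings_maintained": 10,
    "insufficient_detail": 11,
    "findings_not_maintained": 2,
    "rebound_below_baseline": 1,  # the single-session study [ 63 ]
}

with_follow_up = sum(follow_up.values())  # expected to equal the stated n = 24
share = with_follow_up / total_studies    # proportion of all included studies

print(with_follow_up, f"{share:.1%}")
```

The categories sum to the stated 24 studies, which is roughly one third of the 71 included studies, consistent with the description in the text.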

4.1.4. Behaviour change

When it comes to behaviour change, definitive conclusions cannot be drawn due to the paucity of studies. The studies that did look at behaviour focused on the reduction of relational violence including the perpetration and experience of physical, psychological and sexual violence [ 50 , 54 , 55 , 56 , 59 , 60 , 105 , 115 ], as well as more equitable division of domestic labour [ 82 , 86 , 98 ] and responsibility for sexual and reproductive health [ 58 , 116 ], intention to take bystander action [ 65 , 102 , 117 ] and female political participation [ 81 ]. The lack of follow-up data, and of measurement tools other than self-report, however, makes it difficult to determine the permanency of the behaviour change and whether behavioural intentions transition to action. Models would suggest that interventions aimed at changing attitudes/norms would flow on to behaviour change, but that to support this change they need to address multiple levels of the ecological framework, not just the individual, and engage peer leadership and involvement. This supports findings from the literature discussed at the start of this paper, alerting practitioners to the danger of making incorrect assumptions about ‘pathways to change’ [ 7 ] and the need to be mindful of the intention-behaviour gap which has been shown to disrupt this flow from attitude and intention to actual behaviour change [ 6 , 13 , 35 , 36 , 37 ].

If studies are to evaluate the impact of an intervention on behaviour, this objective must be made clear in the intervention design and evaluation strategy, and studies must avoid relying solely on self-report data, which is subject to numerous types of bias such as social desirability. Use of participant observation as well as key informant feedback would strengthen evaluation. The quality of studies that measured behaviour change was varied, ranging from poor (n = 1 at <.5, looking at behavioural intentions) to high (n = 3 at >.85, looking at bystander action and gender equality). The majority of studies, however, were moderate in quality, ranging from lower (n = 4 at .57, looking at gender-based violence, domestic labour division and bystander intention, and n = 2 at .64, looking at gender-based violence) to higher (n = 11 at .71-.79, looking at gender-based violence, gender equality, sexual and reproductive health and behavioural intentions), further supporting the finding that consideration in study design and evaluation is crucial. It is worth noting that measuring behaviour change is difficult: it requires greater resources should more than just self-report measurements be used, as well as longitudinal follow-up to account for sustained change and to capture deterioration of behaviour post intervention should it occur.

4.2. Theory of change

Across all included studies, the implicit theory of change was knowledge/awareness raising for the purposes of shifting attitudes towards gender norms and/or stereotypes. This did not vary substantially across intervention type or study focus, whether it was norms, stereotypes or both being addressed, and for all participant cohorts. The conceptual framework developed (see Figure 2 ) shows that by increasing knowledge and raising awareness, the studies that reported statistically significant outcomes were able to address factors enforcing gender inequality in the form of knowledge, attitudes, environmental factors, and in a small number of cases behaviour.

Further to this common theory of change, several strategies were identified which appear to have enhanced the delivery and impact of these interventions. These included the use of participant peers to lead, support and heighten learning [ 49 , 77 , 79 , 81 , 86 , 90 , 92 , 93 , 103 , 109 , 110 , 111 , 113 , 115 , 116 , 117 ], involvement of multiple levels of the ecological framework (not just addressing the individual) [ 51 , 52 , 53 , 54 , 55 , 56 , 58 , 59 , 60 , 70 , 72 , 74 , 81 , 86 , 91 , 97 , 98 , 102 , 117 , 118 ], developing participants into agents of change [ 49 , 52 , 58 , 60 , 72 , 81 , 98 , 117 , 118 ], using modelling and role models [ 49 , 51 , 52 , 58 , 60 , 65 , 82 , 98 , 110 , 117 , 118 ], and the involvement of participants in co-designing the intervention [ 51 , 70 , 81 , 90 , 91 , 97 , 111 ]. As mentioned earlier, these strategies all contain principles designed to increase participant buy-in, creating a more personal and/or relatable experience.

One theory that can be used to consider this pattern is Petty and Cacioppo's [ 125 ] Elaboration Likelihood Model. The authors posit that attitudes changed through a central (deliberative processing) route, are more likely to show longevity, are greater predictors of behaviour change and are more resistant to a return to pre-intervention attitudes, than those that are the result of peripheral, or short cut, mental processing. Whether information is processed deliberately is dependent on a person's motivation and ability, both of which need to be present and both of which are influenced by external factors including context, message delivery and individual differences. In other words, the more accessible the message is and the more engaged a person is with the messaging they are exposed to, the stronger the attitude that is formed.

In the context of the studies in this review, the strategies found to enhance intervention impact all focus on creating a relationship and environment for the participant to engage in greater depth with the content of the intervention. This included not only the use of the five strategies discussed here, but also the use of multi-session delivery as well as use of delivery types aligned with participant responsiveness (community mobilisation and co-design elements when engaging men and boys, and education-focused interventions for engaging women and girls). With just under two thirds of studies reporting positive outcomes employing one or more of these strategies, practitioners should consider incorporating these into intervention design and delivery for existing interventions or initiatives as well as new ones.

4.3. Engaging men and boys

With male only cohorts represented in only a quarter of studies overall (n = 18 out of 71), this review further highlights the current dearth of research and formal evaluation of interventions working specifically with men and boys [ 124 ].

Across the 18 studies, four reported significant outcomes [ 59 , 79 , 97 , 111 ], nine reported mixed results with some but not all significant outcomes [ 49 , 63 , 68 , 77 , 91 , 92 , 99 , 105 , 115 ] and the remaining five reported non-significant but positive results [ 75 , 87 , 96 , 107 ], including one qualitative study [ 53 ]. Quality was reasonably high (n = 12 rated .71 - .86), and there were some interesting observations to be made about specific elements for this population.

The majority of the studies reporting positive significant or mixed results utilised one or more of the five additional strategies identified through this review (n = 10 out of 14) including the one qualitative study. Three studies used co-design principles to develop their intervention, which included formative research and evolution through group discussions across the duration of the intervention [ 91 , 97 , 111 ]. Four studies targeted more than just the individual participants including focusing on relational and community aspects [ 53 , 59 , 91 , 97 ]. Another six leveraged peer interaction in terms of group discussions and support, and leadership which included self-nominated peer leaders delivering sessions [ 49 , 77 , 79 , 92 , 111 , 115 ]. Finally, two studies incorporated role models [ 79 ] or role models and agents of change [ 49 ]. Similar to the overall profile of studies in this review, the majority in this group utilised direct participant education (n = 12 out of 14) either solely [ 77 , 79 , 91 , 92 , 97 , 99 , 105 , 111 , 115 ], or in conjunction with community mobilisation [ 53 , 59 ] or a research/experimental focus [ 63 ].

The use of the additional strategies in conjunction with direct participant education aligns with the earlier observation about male participants responding better in studies that incorporated a community or interpersonal element. A similar sentiment was observed by Burke and colleagues [ 79 ] in their study of men in relation to mental health and wellbeing, in which they surmised that a ‘peer-based group format’ appears to better support the psychosocial needs of men to allow them the space to ‘develop alternatives to traditional male gender role expectations and norms’ (p195).

When taken together, these findings suggest that feeling part of the process, being equipped with the information and skills, and having peer engagement, support and leadership/modelling, are all components that support the engagement of men and boys not only as allies but as participants, partners and agents of change when it comes to addressing gender inequality and the associated negative outcomes. This is reflective of the theory of change discussion outlining design principles that encourage and increase participant buy-in and the strength in creating a more personal and/or relatable learning experience.

Working with male only cohorts is another strategy used to create an environment that fosters participant buy-in [ 126 ]. Debate exists, however, around the efficacy of this approach, highlighted by the International Centre for Research on Women as an unsubstantiated assumption that the ‘best people to work with men are other men’ ([ 7 ] p26), which they identify as one of the key challenges to engaging men and boys in gender equality work [ 7 , 13 ]. Although acknowledging the success that has been observed in male-only education and the preference across cultures for male educators, they caution against the potential for this assumption to extend to one that men cannot change by working with women [ 7 , 13 ]. The findings from this review support the need for further exploration and evaluation into the efficacy of male only participant interventions given the relatively small number of studies examined in this review and the variance in outcomes observed.

4.3.1. One size does not fit all

In addition to intervention and engagement strategies, the outcomes of several studies indicate a need to consider the specifics of content when it comes to engaging men and boys in discussions of gendered stereotypes and norms. This was evident in Pulerwitz and colleagues' [ 59 ] study looking at male participants, which found an increase in egalitarian attitudes towards gendered stereotypes in relation to women, but a lack of corresponding acceptance and change when consideration was turned towards themselves and/or other males. Additionally, Brooks-Harris and colleagues [ 68 ] found significant shifts in male role attitudes broadly, but not in relation to personal gender roles or gender role conflict. Their findings suggest that targeted attention needs to be paid to addressing different types of stereotypes and norms, with attitudes towards one's own gender roles, and in the case of this study one's ‘fear of femininity’, being more resistant to change than attitudes towards more generalised stereotypes and norms. This is an important consideration for those working to engage men and boys, particularly around discussions of masculinity and what it means to be a man. Rigid gendered stereotypes and norms can cause harmful and restrictive outcomes for everyone [ 2 ] and it is crucial that interventions aimed at addressing them dismantle and avoid supporting these stereotypes; not just between sexes, but amongst them also [ 127 ]. Given the scarcity of evidence at present, further insight is required into how supportive spaces for exploration and growth are balanced with the avoidance of inadvertently reinforcing the very stereotypes and norms being addressed in relation to masculinity, particularly in the case of male only participant groups.

There is currently a gap in the research in relation to these findings, particularly outside of the U.S. and countries in Africa. Further research into how programs engaging men and boys in this space utilise these elements of intervention design and engagement strategies, content and the efficacy of single sex compared to mixed sex participant cohorts is needed.

4.4. Limitations and future directions

The broad approach taken in this review resulted in a large number of included studies (n = 71) and a resulting heterogeneity of study characteristics that restricted analysis options and assessment of publication bias. That said, the possibility of publication bias appears less apparent given that less than half of the 71 included studies reported statistically significant effects, with the remainder reporting mixed or non-significant outcomes. This may be in part due to the significant variance in evaluation approaches and selection of measurement tools used.

Heterogeneity of studies and intervention types limited the ability to draw statistical comparisons for specific outcomes, settings, and designs. Equally, minimal exclusion criteria in the study selection strategy also meant there was noteworthy variance in quality of studies observed across the entire sample of 71 papers. The authors acknowledge the limitations of using p-values as the primary measurement of significance and success. The lack of studies reporting on effect sizes (n = 13) in addition to the variance in study quality is a limitation of the review. However, the approach taken in this review, to include those studies with mixed outcomes and those reporting intended outcomes regardless of the p-value obtained, has allowed for an all-encompassing snapshot of the work happening and the extrapolation of strategies that have previously not been identified across such a broad spectrum of studies targeting gender norms and stereotypes.

An additional constraint was the inclusion of studies reported in English only. Despite being outside the scope of this review it is acknowledged that inclusion of non-English articles is necessary to obtain a comprehensive understanding of the literature.

The broad aim of the review and search strategy will have also inevitably resulted in some studies being missed. It was noted at the beginning of the paper that the framing of the research question was expected to impact the types of interventions captured. This was the case when considering the final list of included studies, in particular the relative absence of tertiary prevention interventions featured, such as those looking at men's behaviour change programs. This could in part account for the scarcity of interventions focused on behaviour change as opposed to the pre-cursors of attitudes and norms.

This review found that direct participant education was the most common approach to raising awareness, dismantling harmful gender stereotypes and norms, and shifting attitudes and beliefs towards more equitable gender norms. However, because little follow-up data was collected and reported, these changes can only be attributed to the short term, and further research into the longevity of these outcomes is needed. Future research in this area needs to ensure the use of sound and consistent measurement tools, avoid relying solely on self-report measures for behaviour change (e.g. by using observations or key informant interviews), and include more longitudinal data collection and follow-up.

When it comes to content design, as noted at the start of the paper, there is growing focus on the use and evaluation of gender-transformative interventions when engaging in gender equality efforts [1, 2, 6, 128]. This review, however, found a distinct lack of engagement with this targeted approach, which gives practitioners an opportunity to explore it to strengthen the engagement and impact of interventions (see [1] for a review of gender-transformative interventions working with young people). The scope of this review did not allow further investigation of the gender approaches taken in the 61 studies that did not state their gender approach. There is, however, scope for future investigation of this nature in consultation with study authors.

An all-encompassing review, such as this one, allows for comparisons across intervention types and focus, such as those targeted at reducing violence or improving sexual and reproductive health behaviours. This broad approach allowed for the key finding that the design, delivery and engagement strategies that feature in studies reporting successful outcomes are successful regardless of the intervention focus, thus widening the evidence base from which those researching and implementing interventions can draw. However, the establishment of this broad overview of interventions aimed at gendered stereotypes and norms highlights the current gap and opportunity for more targeted reviews in relation to these concepts.

5. Conclusion

Several characteristics supporting intervention success have been found based on the evidence examined in this review. The findings suggest that when planning, designing and developing interventions aimed at addressing rigid gender stereotypes and norms, participant sex should help inform the intervention type chosen. Multi-session interventions are more effective than single or one-off sessions, and the use of additional strengthening strategies such as peer engagement and leadership, addressing multiple levels of the ecological framework, skilling up agents of change, modelling/role models, and co-design with participants can support the achievement of intended outcomes. Longitudinal data collection is currently lacking but needed, and when seeking to extend the impact of an intervention to include behaviour change there is currently too much reliance on self-report data, which is subject to bias (e.g. social desirability).

When it comes to engaging men and boys, this review indicates that interventions have a greater chance of success when using peer-based learning in education programs, involving participants in the design and development, and the use of peer delivery and leadership. Ensuring clear learning objectives and outcomes in relation to specific types of norms, stereotypes and behaviours being addressed is crucial in making sure evaluation accurately captures these things. Practitioners need to be cognisant of breaking down stereotypes amongst men (not just between genders), as well as the need for extra attention to be paid in shifting some of the more deeply and culturally entrenched stereotypes and norms. More research is needed into the efficacy of working with male only cohorts, and care taken that rigid stereotypes and norms are not inadvertently reinforced when doing so.

Declarations

Author contribution statement.

Rebecca Stewart: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Wrote the paper.

Breanna Wright: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data.

Liam Smith, Steven Roberts, Natalie Russell: Conceived and designed the experiments.

Funding statement

This work was supported by Australian Government Research Training Program and the Victorian Health Promotion Foundation (VicHealth).

Data availability statement

Declaration of interests statement.

The authors declare no conflict of interest.

Additional information

No additional information is available for this paper.

Acknowledgements

This research was completed as part of a PhD undertaken at Monash University.

Gender inequality in childhood: toward a life course perspective

Gender Issues • vol/iss. 19 • published in 2001 • pages: 61-86 • by Baunach, Dawn Michelle

The lack of women's participation in politics/public life will be positively associated with adult gender inequality but not childhood gender inequality (66).

Adulthood gender inequality was significantly correlated with lack of women's groups (rho= .71) and lack of women's participation (rho= .87). Childhood gender inequality was not significantly correlated.


May 31, 2024


Math Can Help Solve Social Justice Problems

Mathematicians are working on ways to use their field to tackle major social issues, such as social inequality and the need for gender equity

By Rachel Crowell & Nature magazine


When Carrie Diaz Eaton trained as a mathematician, they didn’t expect their career to involve social-justice research. Growing up in Providence, Rhode Island, Diaz Eaton first saw social justice in action when their father, who’s from Peru, helped other Spanish-speaking immigrants to settle in the United States.

But it would be decades before Diaz Eaton would forge a professional path to use their mathematical expertise to study social-justice issues. Eventually, after years of moving around for education and training, that journey brought them back to Providence, where they collaborated with the Woonasquatucket River Watershed Council on projects focused on preserving the local environment of the river’s drainage basin, and bolstering resources for the surrounding, often underserved communities.

By “thinking like a mathematician” and leaning on data analysis, data science and visualization skills, they found that their expertise was needed in surprising ways, says Diaz Eaton, who is now executive director of the Institute for a Racially Just, Inclusive, and Open STEM Education at Bates College in Lewiston, Maine.


For example, the council identified a need to help local people to better connect with community resources. “Even though health care and education don’t seem to fall under the purview of a watershed council, these are all interrelated issues,” Diaz Eaton says. Air pollution can contribute to asthma attacks, for example. In one project, Diaz Eaton and their collaborators built a quiz to help community members to choose the right health-care option, depending on the nature of their illness or injury, immigration status and health-insurance coverage.

“One of the things that makes us mathematicians, is our skills in logic and the questioning of assumptions”, and creating that quiz “was an example of logic at play”, requiring a logic map of cases and all of the possible branches of decision-making to make an effective quiz, they say.

Maths might seem an unlikely bedfellow for social-justice research. But applying the rigour of the field is turning out to be a promising approach for identifying, and sometimes even implementing, fruitful solutions for social problems.

Mathematicians can experience first-hand the messiness and complexity — and satisfaction — of applying maths to problems that affect people and their communities. Trying to work out how to help people access much-needed resources, reduce violence in communities or boost gender equity requires different technical skills, ways of thinking and professional collaborations compared with breaking new ground in pure maths. Even for an applied mathematician like Diaz Eaton, transitioning to working on social-justice applications brings fresh challenges.

Mathematicians say that social-justice research is difficult yet fulfilling — these projects are worth taking on because of their tremendous potential for creating real-world solutions for people and the planet.

Data-driven research

Mathematicians are digging into issues that range from social inequality and health-care access to racial profiling and predictive policing. However, the scope of their research is limited by their access to the data, says Omayra Ortega, an applied mathematician and mathematical epidemiologist at Sonoma State University in Rohnert Park, California. “There has to be that measured information,” Ortega says.

Fortunately, data for social issues abound. “Our society is collecting data at a ridiculous pace,” Ortega notes. Her mathematical epidemiology work has examined which factors affect vaccine uptake in different communities. Her work has found, for example, that, in five years, a national rotavirus-vaccine programme in Egypt would reduce disease burden enough that the cost saving would offset 76% of the costs of the vaccine. “Whenever we’re talking about the distribution of resources, there’s that question of social justice: who gets the resources?” she says.

Lily Khadjavi’s journey with social-justice research began with an intriguing data set.

About 15 years ago, Khadjavi, a mathematician at Loyola Marymount University in Los Angeles, California, was “on the hunt for real-world data” for an undergraduate statistics class she was teaching. She wanted data that the students could crunch to “look at new information and pose their own questions”. She realized that Los Angeles Police Department (LAPD) traffic-stop data fit that description.

At that time, every time that LAPD officers stopped pedestrians or pulled over drivers, they were required to report stop data. Those data included “the perceived race or ethnicity of the person they had stopped”, Khadjavi notes.

When the students analysed the data, the results were memorable. “That was the first time I heard students do a computation absolutely correctly and then audibly gasp at their results,” she says. The data showed that one in every 5 or 6 police stops of Black male drivers resulted in a vehicle search — a rate that was more than triple the national average, which was about one out of every 20 stops for drivers of any race or ethnicity, says Khadjavi.
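The comparison quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration using the rounded figures given in the text (one in 5 or 6 stops versus about one in 20); it is not the students' actual computation, and the variable and function names are assumptions made for this example.

```python
from fractions import Fraction

# Rounded rates as quoted in the text (not the underlying LAPD data).
search_rate_high = Fraction(1, 5)   # "one in every 5" stops led to a search
search_rate_low = Fraction(1, 6)    # "one in every ... 6" stops led to a search
national_rate = Fraction(1, 20)     # "about one out of every 20 stops"


def rate_ratio(rate, baseline):
    """Return how many times larger `rate` is than `baseline`."""
    return rate / baseline


ratio_high = rate_ratio(search_rate_high, national_rate)
ratio_low = rate_ratio(search_rate_low, national_rate)

print(f"1 in 5 vs 1 in 20: {float(ratio_high):.2f} times the national rate")
print(f"1 in 6 vs 1 in 20: {float(ratio_low):.2f} times the national rate")
```

Both ends of the quoted range come out above three, which is consistent with the description of the rate as "more than triple" the national average.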

Her decision to incorporate that policing data into her class was a pivotal moment in Khadjavi’s career — it led to a key publication and years of building expertise in using maths to study racial profiling and police practice. She sits on California’s Racial Identity and Profiling Advisory Board, which makes policy recommendations to state and local agencies on how to eliminate racial profiling in law enforcement.

In 2023, she was awarded the Association for Women in Mathematics’ inaugural Mary & Alfie Gray Award for Social Justice, named after a mathematician couple who championed human rights and equity in maths and government.

Sometimes, gaining access to data is a matter of networking. One of Khadjavi’s colleagues shared Khadjavi’s pivotal article with specialists at the American Civil Liberties Union. In turn, these specialists shared key data obtained through public-records requests with Khadjavi and her colleague. “Getting access to that data really changed what we could analyse,” Khadjavi says. “[It] allowed us to shine a light on the experiences of civilians and police in hundreds of thousands of stops made every year in Los Angeles.”

The data-intensive nature of this research can be an adjustment for some mathematicians, requiring them to develop new skills and approach problems differently. Such was the case for Tian An Wong, a mathematician at the University of Michigan-Dearborn who trained in number theory and representation theory.

In 2020, Wong wanted to know more about the controversial issue of mathematicians collaborating with the police, which involves, in many cases, using mathematical modelling and data analysis to support policing activities. Some mathematicians were protesting about the practice as part of a larger wave of protests around systemic racism, following the killing of George Floyd by police in Minneapolis, Minnesota. Wong’s research led them to a technique called predictive policing, which Wong describes as “the use of historical crime and other data to predict where future crime will occur, and [to] allocate policing resources based on those predictions”.

Wong wanted to know whether the tactics that mathematicians use to support police work could instead be used to critique it. But first, they needed to gain some additional statistics and data analysis skills. To do so, Wong took an online introductory statistics course, re-familiarized themself with the Python programming language, and connected with colleagues trained in statistical methods. They also got used to reading research papers across several disciplines.

Currently, Wong applies those skills to investigating the policing effectiveness of a technology that automatically locates gunshots by sound. That technology has been deployed in parts of Detroit, Michigan, where community members and organizations have raised concerns about its multimillion-dollar cost and about whether such police surveillance makes a difference to public safety.

Getting the lay of the land

For some mathematicians, social-justice work is a natural extension of their career trajectories. “My choice of mathematical epidemiology was also partially born out of my love for social justice,” Ortega says. Mathematical epidemiologists apply maths to study disease occurrence in specific populations and how to mitigate disease spread. When Ortega’s PhD adviser mentioned that she could study the uptake of a then-new rotavirus vaccine in the mid-2000s, she was hooked.

Mathematicians who decide to jump into studying social-justice issues anew must do their homework and dedicate time to considering how best to collaborate with colleagues of diverse backgrounds.

Jonathan Dawes, an applied mathematician at the University of Bath, UK, investigates links between the United Nations’ Sustainable Development Goals (SDGs) and their associated target actions. Adopted in 2015, the SDGs are “a universal call to action to end poverty, protect the planet, and ensure that by 2030 all people enjoy peace and prosperity,” according to the United Nations, and each one has a number of targets.

“As a global agenda, it’s an invitation to everybody to get involved,” says Dawes. From a mathematical perspective, analysing connections in the complex system of SDGs “is a nice level of problem,” Dawes says. “You’ve got 17 Sustainable Development Goals. Between them, they have 169 targets. [That’s] an amount of data that isn’t very large in big-data terms, but just big enough that it’s quite hard to hold all of it in your head.”

Dawes’ interest in the SDGs was piqued when he read a 2015 review that focused on how making progress on individual goals could affect progress on the entire set. For instance, if progress is made on the goal to end poverty, how does that affect progress on the goal to achieve quality education for all, as well as the other 15 SDGs?

“If there’s a network and you can put some numbers on the strengths and signs of the edges, then you’ve got a mathematized version of the problem,” Dawes says. Some of his results describe how the properties of the network change if one or more of the links is perturbed, much like an ecological food web. His work aims to identify hierarchies in the SDG networks, pinpointing which SDGs should be prioritized for the health of the entire system.
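To make the idea of a "mathematized" network concrete, here is a minimal sketch of a signed, weighted network in which one link is perturbed. Everything in it is an assumption for illustration: the four node names, the edge weights, the `damping` and `steps` parameters and the truncated-walk influence measure are hypothetical and are not taken from Dawes' work, which may use quite different methods.

```python
import copy

# Hypothetical toy network. weights[src][dst] > 0 means progress on src
# reinforces dst; a negative value means progress on src hinders dst.
weights = {
    "poverty":   {"education": 0.6, "health": 0.5, "oceans": -0.2},
    "education": {"poverty": 0.4, "health": 0.3},
    "health":    {"education": 0.2},
    "oceans":    {},
}


def total_influence(weights, source, damping=0.5, steps=3):
    """Sum the signed influence spreading from `source` along walks of up to
    `steps` edges. Each extra edge is scaled by `damping`, so indirect routes
    count for less than direct ones."""
    reach = {source: 1.0}
    total = {node: 0.0 for node in weights}
    for step in range(steps):
        nxt = {}
        for node, amount in reach.items():
            for dst, w in weights[node].items():
                nxt[dst] = nxt.get(dst, 0.0) + amount * w
        scale = damping ** step
        for dst, amount in nxt.items():
            total[dst] += scale * amount
        reach = nxt
    return total


def perturb(weights, src, dst, new_weight):
    """Return a copy of the network with one edge weight changed."""
    changed = copy.deepcopy(weights)
    changed[src][dst] = new_weight
    return changed


baseline = total_influence(weights, "poverty")
perturbed = total_influence(perturb(weights, "poverty", "oceans", 0.0), "poverty")

for node in weights:
    print(f"{node:10s} baseline={baseline[node]:+.3f} perturbed={perturbed[node]:+.3f}")
```

In this toy setup, removing the single negative link from the hypothetical "poverty" node to "oceans" changes only the influence reaching "oceans", because no other path leads there. In a denser network, the same perturbation would ripple through several nodes.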

As Dawes dug into the SDGs, he realized that he needed to expand what he was reading to include different journals, including publications that were “written in very different ways”. That involved “trying to learn a new language”, he explains. He also kept up to date with the output of researchers and organizations doing important SDG-related work, such as the International Institute for Applied Systems Analysis in Laxenburg, Austria, and the Stockholm Environment Institute.

Dawes’ research showed that interactions between the SDGs mean that “there are lots of positive reinforcing effects between poverty, hunger, health care, education, gender equity and so on.” So, “it’s possible to lift all of those up” when progress is made on even one of the goals. With one exception: managing and protecting the oceans. Making progress on some of the other SDGs could, in some cases, stall progress for, or even harm, life below water.

Collaboration care

Because social-justice projects are often inherently cross-disciplinary, mathematicians studying social justice say it’s key in those cases to work with community leaders, activists or community members affected by the issues.

Getting acquainted with these stakeholders might not always feel comfortable or natural. For instance, when Dawes started his SDG research, he realized that he was entering a field in which researchers already knew each other, followed each other’s work and had decades of experience. “There’s a sense of being like an uninvited guest at a party,” Dawes says. He became more comfortable after talking with other researchers, who showed a genuine interest in what he brought to the discussion, and when his work was accepted by the field’s journals. Over time, he realized “the interdisciplinary space was big enough for all of us to contribute to”.

Even when mathematicians have been invited to join a team of social-justice researchers, they still must take care, because first impressions can set the tone.

Michael Small is an applied mathematician and director of the Data Institute at the University of Western Australia in Perth. For much of his career, Small focused on the behaviour of complex systems, or those with many simple interacting parts, and dynamical systems theory, which addresses physical and mechanical problems.

But when a former vice-chancellor at the university asked him whether he would meet with a group of psychiatrists and psychologists to discuss their research on mental health and suicide in young people, it transformed his research. After considering the potential social impact of better understanding the causes and risks of suicide in teenagers and younger children, and thinking about how the problem meshed well with his research in complex systems and ‘non-linear dynamics’, Small agreed to collaborate with the group.

The project has required Small to see beyond the numbers. For the children’s families, the young people are much more than a single data point. “If I go into the room [of mental-health professionals] just talking about mathematics, mathematics, mathematics, and how this is good because we can prove this really cool theorem, then I’m sure I will get push back,” he says. Instead, he notes, it’s important to be open to insights and potential solutions from other fields. Listening before talking can go a long way.

Small’s collaborative mindset has led him to other mental-health projects, such as the Transforming Indigenous Mental Health and Wellbeing project to establish culturally sensitive mental-health support for Indigenous Australians.

Career considerations

Mathematicians who engage in social-justice projects say that helping to create real-world change can be tremendously gratifying. Small wants “to work on problems that I think can do good” in the world. Spending time pursuing them “makes sense both as a technical challenge [and] as a social choice”, he says.

However, pursuing this line of maths research is not without career hurdles. “It can be very difficult to get [these kinds of] results published,” Small says. Although his university supports, and encourages, his mental-health research, most of his publications are related to his standard mathematics research. As such, he sees “a need for balance” between the two lines of research, because a paucity of publications can be a career deal breaker.

Diaz Eaton says that mathematicians pursuing social-justice research could experience varying degrees of support from their universities. “I’ve seen places where the work is supported, but it doesn’t count for tenure [or] it won’t help you on the job market,” they say.

Finding out whether social-justice research will be supported “is about having some really open and transparent conversations. Are the people who are going to write your recommendation letters going to see that work as scholarship?” Diaz Eaton notes.

All things considered, mathematicians should not feel daunted by wading into solving the world’s messy problems, Khadjavi says: “I would like people to follow their passions. It’s okay to start small.”

This article is reproduced with permission and was first published on May 22, 2024.

ScienceDaily

'Lean In' messages can lower women's motivation to protest gender inequality

Women in leadership are often told to "Lean In," a motivational message urging them to demonstrate that they are confident, strategic and resilient to setbacks. However, new research indicates that such "Lean In" messaging can hinder women's motivation to protest gender inequality.

Popularised in a book by American technology executive Sheryl Sandberg, the "Lean In" solution to gender inequality advises women that demonstrating personal resilience and perseverance in the face of setbacks is key to career advancement. Now, a new study led by the University of Exeter, Bath Spa University and the Australian National University has found that while such messages may provide inspiration for some, they can also reduce women's likelihood to protest gender discrimination. This effect could actually be hindering gender equality progress.

Published in Psychology of Women Quarterly, the study involved four experiments in which researchers examined women's motivation to protest gender inequality after exposure to "Lean In" messages promoting individual resilience. All the experiments were in the UK and involved more than 1,100 women who were either undergraduate students or employed women with university degrees. Women read about gender inequality, and then either read about resilience as key to promoting advancement (in line with "Lean In" messaging), or participated in activities to build their own resilience by learning how to set flexible goals and maintain confidence.

The research found:

  • In three of four experiments, women in "Lean In" conditions were less willing to be part of protest action over gender inequality compared to those in a control condition who were not exposed to "Lean In" messages.
  • In two of the experiments, this effect occurred because women in "Lean In" conditions were less likely to believe that gender discrimination would affect their career prospects.
  • In one, this effect occurred because women in "Lean In" conditions also felt less angry about ongoing gender inequality.

The authors say the findings of this research highlight an unintended consequence of "Lean In" messages and related individual resilience training for women that is offered as a remedy for gender inequality in the workplace: it can undermine women's recognition of, and willingness to protest about, the root cause of gender inequality, which is discrimination.

Lead author, Dr Renata Bongiorno, who conducted the studies while at the University of Exeter and is now Senior Lecturer in Psychology at Bath Spa University, said: "The popularity of the 'Lean In' movement speaks to the challenges women continue to face due to gender discrimination in the workplace.

"Women are understandably looking for ways to advance their careers despite the disproportionate setbacks they continue to experience compared to men.

"While the 'Lean In' solution offered by Sheryl Sandberg can feel empowering, a lack of individual resilience or perseverance is not the cause of women's poorer career progress.

"The messages lead to women assuming that gender discrimination will be less of a barrier to their career advancement. This false belief is concerning for progress because it is reducing women's willingness to protest the real causes of gender inequality.

"Progress and gains for women have historically been achieved through collective protest over gender discriminatory practices and policies, including pregnancy discrimination, a lack of affordable childcare, and workplace sexual harassment.

"Finding ways to effectively challenge these ongoing barriers should be a focus for feminism because they are the real causes of gender inequality in career outcomes."


Story Source:

Materials provided by University of Exeter. Original written by Louise Vennells. Note: Content may be edited for style and length.

Journal Reference:

  • Renata Bongiorno, Michelle K. Ryan, Olivier Gibson, Hannah Joyce. Neoliberal Feminism and Women's Protest Motivation. Psychology of Women Quarterly, 2024; DOI: 10.1177/03616843241238176


Jessica Grose

The gender pay gap is a culture problem.


By Jessica Grose

Opinion Writer

American women made significant progress toward closing the gender pay gap in the second half of the 20th century, but that gap has barely budged over the past two decades. In 2022, according to Pew Research, “American women typically earned 82 cents for every dollar earned by men. That was about the same as in 2002, when they earned 80 cents to the dollar.”

In a country where women are now a (slight) majority of the college-educated labor force and the annual earnings median for college degree holders is 55 percent more than that of those with high school diplomas, the stickiness of this gap is frustrating. While there are several factors at play, one of the key contributors to the gap is what’s known as the motherhood penalty and the corresponding fatherhood premium: Women’s pay decreases when they have children, while men’s pay increases.

This dynamic isn’t just an American phenomenon. “In general, women don’t recover. They don’t catch back up to men, even many years after first childbirth,” said Henrik Kleven, the lead author of a 2023 National Bureau of Economic Research working paper, “The Child Penalty Atlas,” in which he and his co-authors, Camille Landais and Gabriel Leite-Mariante, reviewed wage gap data from 134 countries. “Now, that basic pattern is true essentially everywhere, but the quantitative magnitudes of the effects vary greatly across countries,” he told me recently.

Somewhat surprisingly to me, his research, which builds on years of earlier scholarship, suggests that a country’s family policy has relatively little to do with how big the parenthood pay gap is. A society’s culture and norms seem to be much bigger factors in how big the motherhood penalty is: The more egalitarian the culture, the lower the gap.

Kleven told me that sometimes countries that seem superficially similar in terms of income levels, development, family policy and geography have very different pay gaps. (We see the same interplay in American states, with the child penalty 21 percent in Vermont and 61 percent in Utah.) Even countries right next to each other can have wildly different gaps. Spain’s child pay gap is much bigger than Portugal’s, and Germany’s is bigger than Denmark’s. Central European countries have “some of the highest child penalties we see anywhere in the world,” Kleven said. Scandinavian countries have some of the lowest.

Let’s look at Austria. It has generous family leave policies and child care subsidies, especially by American standards. But in a 2022 working paper, “Do Family Policies Reduce Gender Inequality? Evidence From 60 Years of Policy Experimentation,” Kleven and his co-authors’ analysis showed “that the enormous expansions of parental leave and child care subsidies have had virtually no impact on gender convergence.” Despite an influx of Austrian women into the work force in the past 50 or so years, the relatively large child penalty can be at least partly explained by gender attitudes and norms.

According to data from the 2012 wave of the International Social Survey Program that was analyzed in the paper, more than 60 percent of Austrians agreed that when a mother works for pay, her young children probably suffer. By comparison, those in the more egalitarian Scandinavian countries felt differently. Fewer than 20 percent of Danes agreed that children suffer when their mothers work outside the home. Though that data is more than a decade old, those kinds of attitudes die hard and are backed up by newer research.

Speaking of Danes: A new paper from economists at Lund University, the University of Amsterdam and Aarhus University found that for a subset of Danish women, the motherhood penalty disappeared in the long run and, in limited circumstances, turned into a premium. The paper followed the earnings trajectory of more than 18,000 child-free women who received in vitro fertilization treatment in Denmark, where, Time magazine reported in 2019, “the cost of three cycles of I.V.F. for a first child is covered by the tax-financed public health service” for women up to the age of 40. The study’s authors then compared the women who had successful first-round I.V.F. treatments and ended up with children and the women who didn’t.

The women who had successful I.V.F. treatments had a near-term child penalty, and their earnings dropped below those of the women with unsuccessful treatments. “By Year 10,” though, according to the study, “successfully treated women earn as much as unsuccessfully treated women. And by Year 15, successfully treated women earn slightly more. This earnings advantage persists throughout the remainder of the study period.” Men’s earnings weren’t affected, regardless of whether they became parents.

In my mind, one limitation in interpreting these findings is that I.V.F. pregnancies are planned, while over 40 percent of pregnancies in the United States, for example, are unplanned. One could imagine that the women pursuing I.V.F. at various income levels might be better set up to weather a career interruption than women who have surprise pregnancies. But it is still a thought-provoking finding that complicates previous child gap research.

In the United States, where gender norms are less progressive than in Scandinavia and the “costs for a single cycle of I.V.F. have recently been estimated to range from $15,000 to $20,000 and can exceed $30,000,” according to the Department of Health and Human Services, we find a very different experience with the motherhood pay gap than in Denmark. And it’s a much less happy picture.

A paper published last year in the scientific journal PNAS looked at 22 years of administrative data from the United States and found “surprisingly robust” motherhood penalties, even, unfortunately, in circumstances in which you might expect that the penalty would be slim, like in female-breadwinner families:

On average, women earn 57 percent more than men in these female-breadwinner families. Were couples simply seeking to maximize household income conditional on a certain amount of time investment in children, we would expect to see fatherhood penalties. Instead, we see one of the largest motherhood penalties in female-breadwinner families. Indeed, higher-earning women experience a 60 percent drop from prechildbirth earnings relative to their lower-earning male partner and the highest of our various sample stratifications. The pattern we find for the United States is the polar opposite of that for Sweden.

There was also no difference for mothers in companies that were female led or had a majority of female employees. “If anything,” according to the authors, “this motherhood penalty grows faster over time at firms headed by women. On the whole, our findings are discouraging even relative to the existing work on motherhood penalties.”

I asked one of the paper’s co-authors, Cecilia Machado, an economist at the Getulio Vargas Foundation, to summarize the state of the motherhood penalty in the United States. If we wanted to take steps to improve the pay gap as a society, what would we do? Via email, she said that there might be a limited scope of what public policy and workplace policy can do. But she added that federal and workplace policy that encouraged both men and women to take paid parental leave could help; creating the political conditions for involved fatherhood in a child’s first year can set egalitarian patterns that last a lifetime. Still, Machado said, “Both of these combined are important policies, but maybe them alone, by themselves, will not work if we don’t see culture and gender norms changing.”

My take is that we’re in a time when cultural norms around motherhood in the United States seem particularly contradictory and in flux. While a record high percentage of women with children under 5 work, a large subset of Americans still thinks society would be better off if they didn’t.

In an email, Jessica Calarco, a sociologist at the University of Wisconsin, Madison, and the author of “Holding It Together: How Women Became America’s Safety Net,” said:

I asked 2,000 parents from across the U.S., “Do you think children are better off if their mother is home and doesn’t hold a job, or are children just as well off if their mother works for pay?” Fifty-two percent of dads and 42 percent of moms said it’s better for kids if their moms aren’t working for pay. Those attitudes are somewhat more common among Republicans (60 percent of dads and 48 percent of moms), but they’re pretty common among Democrats, too (53 percent of dads and 41 percent of moms).

Until we reconcile our cultural ambivalence toward working mothers, I don’t think the gap is going to get any better. Maybe in another 20 years, we’ll get another two cents.

An earlier version of this article included a quotation from Jessica Calarco that misstated a finding from her work. The percentage of mothers surveyed who said it’s better for children if their mothers don’t work for pay is 42 percent, not 47 percent.

Jessica Grose is an Opinion writer for The Times, covering family, religion, education, culture and the way we live now.

The Income Inequality Hypothesis Revisited: Assessing the Hypothesis Using Four Methodological Approaches

  • Open access
  • Published: 04 March 2016
  • Volume 131, pages 1015–1033 (2017)

  • Nigel Kragten   ORCID: orcid.org/0000-0001-6347-5394 1 &
  • Jesper Rözer 1  

The income inequality hypothesis states that income inequality has a negative effect on individual’s health, partially because it reduces social trust. This article aims to critically assess the income inequality hypothesis by comparing several analytical strategies, namely OLS regression, multilevel regression, fixed effects models and fixed effects models using pseudo panel data. To test the hypothesis, data from two studies conducted between 1981 and 2014 were combined: the World Values Survey and the European Values Study. Three frequently used measures of health were taken into account. In the OLS and multilevel models, income inequality was often associated with better health, whereas in the fixed effects and pseudo panel data, income inequality was associated with poorer health, suggesting that the unexpected results of the OLS and multilevel methods might be explained by unobserved confounders. Furthermore, in almost all of the models, social trust mediates the relationship between income inequality and health, showing the importance of this mechanism. Interestingly, the pseudo panel data offer the strongest support for the income inequality hypothesis, suggesting that better controlling for confounding factors and/or more carefully monitoring cohort effects, may result in a better understanding whether and how income inequality can be harmful for people’s health.

1 Introduction

With rising levels of income inequality within numerous developed and developing countries, there is a growing body of literature seeking to better understand the implications of inequality for society. One particular issue that has received much attention is the relationship between income inequality and health. The income inequality hypothesis (IIH) states that high income inequality is detrimental to human health, in particular because it has negative psychosocial effects (Gold et al. 2002; Pickett and Wilkinson 2015; Wilkinson 1996) and reduces social trust (Elgar 2010; Gold et al. 2002; Kawachi and Kennedy 1997). Wilkinson and various other authors have made strong statements in favour of the IIH (Kawachi and Kennedy 1999; Wilkinson and Pickett 2006, 2009). Other authors are not as convinced by this relationship (e.g., Präg et al. 2014; Qi 2012; Zagorski et al. 2014). One possible reason for this controversy is that the relationship has been investigated with a wide variety of methods, which may produce different results and answers.

This study aims to move the literature forward by focusing specifically on empirical rigor. We compare three common methods used to test the IIH. First, we use OLS regression to compare levels of income inequality and health across countries (e.g., Gold et al. 2002; Kawachi and Kennedy 1997; Wilkinson and Pickett 2007). Second, we use multilevel models to compare the health of individual citizens across countries, while controlling for individual-level factors and modelling individuals not as independent observations but as observations nested within countries (e.g., Kondo et al. 2009; Layte 2012; Rözer and Volker 2015). Third, we use fixed effects models to compare changes in income inequality within countries (e.g., Pop et al. 2013; Avendano 2012; Babones 2008; Mellor and Milyo 2001). Restricting attention to within-country variation means that between-country variation can be set aside. This is useful because between-country variation is likely to be affected by unmeasured country characteristics, such as cultural differences between countries or differences in the interpretation of key questions about health and social trust (Dorling and Barford 2009).

In addition, we propose an alternative method to test the IIH by analysing pseudo panel data. In our case, we analyse subgroups of cohorts. By analysing cohort groups and adding country-level fixed effects, cohorts within countries are tracked over time. This has several advantages. First, pseudo panel data are the best substitute for longitudinal data that can be obtained from repeated cross-sections without actually following individuals. Hence, they move the focus from macro to micro data, unlike classical fixed effects models. Second, changes often occur over time within countries because cohorts are replaced over time (Mannheim 1952; Rözer and Volker 2015). Thus, changes might not be accounted for by period but rather by cohort effects. Pseudo panel data avoid this problem because cohorts are tracked over time. Third, following cohorts over time enables researchers to control for unmeasured confounders, such as wealth differences or motorization effects that differ across cohorts. Fourth, by making use of subgroups, more information from the original data is used than would be with standard fixed effects models. Thus, with this newly applied method we can test the IIH even more rigorously.

Following a series of papers by Verbeek (Verbeek and Vella 2005; Verbeek 2007), pseudo panels have become increasingly popular in economics-oriented studies in recent years. For example, they have been used to examine preferences for redistribution (Olivera 2012), vulnerability to poverty (Échevin 2013), price elasticities of alcohol demand (Meng et al. 2014), and poverty dynamics (Dang and Lanjouw 2013). However, as far as we know, this type of analysis has not yet been conducted in articles that test the IIH (nor in sociological journals in general).

Thus, OLS, multilevel, country fixed effects and fixed effects models with pseudo panel data were compared to determine the extent to which income inequality affects health. The measures of health that we considered were self-rated health, mortality rate and life expectancy.

2 The Income Inequality Hypothesis

Advocates of the income inequality hypothesis (IIH) have argued that income inequality has a direct effect on health by affecting people’s psychosocial well-being. In addition, social trust is thought to mediate the association between income inequality and health. In this section, we elaborate briefly on these arguments.

The most common explanation for the effect of income inequality on health is the psychosocial argument, which suggests that, as a result of income inequality, psychosocial stress rises, which reduces the state of health in society (e.g., Kawachi et al. 1997; Layte and Whelan 2014; Lynch et al. 2014; Subramanian and Kawachi 2004; Whelan and Maître 2013). Unequal societies show greater differences between individuals, which produces higher levels of competition. Higher levels of competition are thought to increase frustration and stress among people within the population. This affects all people in society because the competition flows through all strata (Wilkinson 1996). Individuals may try to cope with stress by seeking rewards from other sources as they become more susceptible to unhealthy food or addictions, which results in more health-related problems (Rözer and Kraaykamp 2012).

Social trust is expected to be one of the most important mediators of the relationship between income inequality and health (e.g., Kawachi and Kennedy 1997; Delhey and Dragolov 2013; Layte 2012; Rözer et al. 2016). Generally, social trust can be described as the state that exists between individuals where rational consideration is no longer possible (Lewis and Weigert 1985). Social trust makes people dare to step outside their own world, which enhances their interest in other people in general. This may lead to mutual understanding, feelings of solidarity and, consequently, philanthropic behaviour (Rothstein and Uslaner 2005). Therefore, it is regarded as one of the most important aspects of social capital, regardless of whether it is a property of individuals (e.g., Coleman 1988) or nations (e.g., Putnam 2001).

Income inequality can decrease social trust because it creates dissimilarities between people, which can lead to a society becoming more heterogeneous. Because people are more inclined to trust those who are similar to them, the ‘aversion to heterogeneity’ principle states that, when there is greater heterogeneity in society, people will trust others less (Zak and Knack 2001). Furthermore, the ‘social success and well-being’ theory explains that, in more unequal countries, there are more poor people who are inclined to trust others less. This is because they are, by necessity, less able to afford taking risks (Delhey and Newton 2003). Hence, poor people will be more inclined to seek certainty, and trust is, of course, never certain (Kahneman and Tversky 1979). In addition, when individuals are surrounded by non-trusting people, they may become less inclined to help others (Alesina and La Ferrara 2002). In this way, low trust may spread across society.

High levels of social trust are associated with better individual health (e.g., Jen et al. 2010). At the individual level, trust may create a social context or environment that is more peaceful and less stressful (Takahashi et al. 2005). In addition, social trust increases reciprocity and can therefore be an important resource and thus a form of social capital, which might help people live healthier (Coleman 1988). At a societal level, social trust facilitates cooperation by providing a safer social context (Kawachi and Kennedy 1999). In a society with high social trust, there will be more obligations and expectations that will be met. For instance, this may translate into social support and ‘diffusion of innovations’ (Kawachi and Kennedy 1999). This suggests that innovative behaviour (e.g., information channels or preventive services) diffuses much more rapidly in communities that are cohesive and in which members know and trust one another.

The compositional and policy arguments are well-known alternative explanations for inequality effects. These should be mentioned, but they can be regarded as not directly explaining the effect of income inequality on health. According to the compositional argument, income inequality logically implies that there are more poor people (in absolute terms) in societies, who have relatively poor health. Thus, the overall standards of health within a country with high inequality will be low. In addition, the poor might influence other people with unhealthy lifestyles, crime or other activities that are associated with having a low income (Kawachi and Kennedy 1997). According to the policy argument, unequal societies are associated with underinvestment in social policies and processes, which may result in lower overall health (Coburn 2000; Veenstra 2002; Subramanian and Kawachi 2004). In our analyses, we control for these two arguments.

3 Data, Measurements and Methods

3.1 Data Collection

To test our model, we combined data from the World Values Survey (WVS) and the European Values Study (EVS) (WVS 1981–2008; EVS 2011; WVS 2014). These datasets consist of national cross-sectional data collected over time, and they provide information on health, social trust and several other characteristics at the individual level. The surveys were designed to gather information from a representative random sample of the adult population within each country. However, the exact method of gathering the data was not standardized, so the samples were not always random and representative. For example, sampling techniques differ between countries, non-response varies considerably (from 25 to 95 percent), and in some cases non-response cannot be calculated because quota sampling was applied. Although there is a slight oversampling of Western countries, all continents are well represented. The third wave of the EVS did not include questions on individual health and is therefore excluded from the analysis. The combined dataset covers the period from 1981 to 2014 and contains information on 80 countries (see Supplementary Table). The size of the dataset is particularly important in this study because the key variables change little over time. Using smaller samples would lead to reduced variance over time, which would make it especially hard to estimate the effects in fixed effects models.

3.2 Measurements

3.2.1 Dependent Variable

Individual health was measured as self-rated health. Self-rated health has been the most frequently used measure of health in previous research (Kondo et al. 2009). It is measured using the following question: “All in all, how would you describe your state of health these days?” The respondents choose from ‘very poor’ (1), ‘poor’ (2), ‘fair’ (3), ‘good’ (4) and ‘very good’ (5). Previous studies have examined the retest reliability of self-rated health and compared it with other measurements of health, such as mortality rate (Clarke and Ryan 2006; Lundberg and Manderbacka 1996). The results of these studies indicate that self-rated health is a reliable measurement of health.

In addition to individual measurements of health, national characteristics of health were considered. The World Bank provides information on crude death rates (i.e., mortality rates) and life expectancy at birth. In previous research, these measurements of health have often been used to test the IIH because of their availability and generalizability (e.g., Judge 1995; Kawachi et al. 1997), which make them interesting as complements to self-rated health. Footnote 1

3.2.2 Independent Variables

Data on income inequality were obtained from the Standardized World Income Inequality Database (SWIID) (Solt 2009). This dataset provides comparable Gini coefficients for the largest possible sample of countries and years available. It provides us with Gini index data that were initially measured with several different methods and standardized for use in national comparisons, reducing many of the problems with previous data used in comparative studies (United Nations University (UNU)-WIDER 2008).
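For readers unfamiliar with the measure, the following is a minimal sketch of how a Gini coefficient can be computed from a list of incomes. It is illustrative only: the SWIID values used in the study are standardized estimates taken from the database, not computed this way, and the function name `gini` and the toy income lists are assumptions introduced here.

```python
from typing import Sequence


def gini(incomes: Sequence[float]) -> float:
    """Gini coefficient of non-negative incomes (0 means perfect equality)."""
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    # Rank-weighted formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


equal = gini([10, 10, 10, 10])   # everyone has the same income
unequal = gini([0, 0, 0, 40])    # one person holds all income
print(equal, unequal)
```

The coefficient lies between 0 and (n - 1)/n for n incomes; a Gini index reported on a 0 to 100 scale is this value multiplied by 100.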

The concept of social trust is measured with the same widely used standard question in the EVS and WVS (e.g., Delhey and Dragolov 2013; Layte 2012; Rözer and Kraaykamp 2012): “Generally speaking, would you say that most people can be trusted or that you can’t be too careful in dealing with people?” The respondents chose from two answers: “you cannot be too careful” (0) and “most people can be trusted” (1).

3.2.3 Control Variables

National characteristics of wealth were used to control for absolute income effects at the national level. Measures of Gross Domestic Product (GDP) per capita were taken from the extensive Penn World Table (Feenstra et al. 2013). The logarithm of GDP was used in the analyses because wealth becomes less important for health as wealth increases. Secondly, the human capital index from the Penn World Table was included to account for the expected association between a country’s stock of human capital and its income inequality. The stock of human capital in a country is clearly related to the health of its citizens. Additionally, for the aggregated analyses, age was aggregated as the mean age of a country wave in order to control for important demographic differences between countries (and cohorts), which may affect people’s health.

In the multilevel analysis, several individual-level variables were used as control variables. The first individual-level control variable was relative perceived income. Respondents were asked to grade their income position in their country with the following question: ‘On this card is an income scale on which 1 indicates the lowest income group and 10 the highest income group in your country. We would like to know in what group your household is. Please, specify the appropriate number, counting all wages, salaries, pensions and other incomes that come in.’ Answer categories were rescaled, ranging from (0) ‘lowest group’ to (9) ‘highest group’. Second, the level of education was measured as the highest educational level achieved, ranging from (0) primary education not finished, to (7) finished university. When respondents had been asked about their education in years instead of in categories, the missing values for the highest educational level achieved were replaced by using education scores measured in years. Footnote 2 Furthermore, age, gender, marital status, employment status and religious denomination were also used as control variables. Multiple imputation was used to fill in the missing values for these individual-level variables. There was a particularly large proportion of missing values for income (21.9 %).

3.3 Analytical Strategy

In line with previous research, we applied three often-used statistical methods to study the IIH: OLS, multilevel and fixed effects regression. These methods allowed us to determine the extent to which previous outcomes were affected by their choice of method. In addition, to build upon the standard fixed effects models, we used fixed effects models with pseudo panel data.

First, as a base model, OLS regressions were conducted on aggregated data at the country wave level. In the OLS and fixed effects models, the variables drawn from the EVS and WVS (i.e., age, self-rated health and social trust) were aggregated by taking the mean score for each country wave before the analysis. OLS regression can be used to determine whether there is an association between country-level health and income inequality. The well-known OLS model can be represented as follows:
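The displayed equation is not preserved in this copy. A form consistent with the symbol definitions that follow, assuming a coefficient vector \(\beta\) and an error term \(\varepsilon_{j}\) (neither is named in the surrounding text), would be:

\[
Y_{j} = \beta_{0} + \beta X_{j} + \varepsilon_{j}
\]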

The subscript j indicates countries, \(Y_{j}\) denotes the outcome, \(\beta_{0}\) represents the intercept, and \(X_{j}\) stands for the covariates income inequality and social trust, and the control variables. Among the control variables, dummy variables were added to control for effects across time, which can also be regarded as fixed effects.
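As a rough illustration of this base model, the sketch below fits an OLS regression with wave dummies to synthetic country-wave aggregates using NumPy. It is not the authors' code or data: the variable names, the simulated coefficients and the choice of three waves are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic country-wave aggregates (illustrative only, not the WVS/EVS data).
n_obs = 60
gini = rng.uniform(25, 55, n_obs)      # income inequality on a 0-100 scale
trust = rng.uniform(0.1, 0.7, n_obs)   # mean of the 0/1 trust item
wave = rng.integers(0, 3, n_obs)       # survey wave 0, 1 or 2
health = 3.0 - 0.01 * gini + 0.8 * trust + 0.1 * wave + rng.normal(0, 0.05, n_obs)

# Design matrix: intercept, covariates, and dummies for waves 1 and 2
# (wave 0 is the reference category).
wave_dummies = np.column_stack([(wave == w).astype(float) for w in (1, 2)])
X = np.column_stack([np.ones(n_obs), gini, trust, wave_dummies])

# Least-squares estimate of the intercept and slopes.
beta, *_ = np.linalg.lstsq(X, health, rcond=None)
for name, b in zip(["intercept", "gini", "trust", "wave1", "wave2"], beta):
    print(f"{name:>9}: {b: .3f}")
```

Because the data are simulated with a negative inequality slope, the fitted `gini` coefficient comes out negative; with real data the sign is an empirical question, which is the point of the comparison across methods.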

Secondly, multilevel regressions, or random effects models, were used (Hox 2010). In these models, information was based on the individual (i) and country (j) levels. An extra error term (\(\mu_{j}\)) separated the components of sampling variability attributable to individual-level and country-level effects, and the term accounted for the possibility that observations within countries may be correlated. This avoided violating the assumption of independence of observations, and correspondingly ensured that standard errors were correctly estimated. The extent to which individuals belonging to the same country resemble one another can be expressed by the intra-class correlation, which is the proportion of country-level variance out of the total variance \((\rho = \sigma^{2}_{\mu} /(\sigma^{2}_{\mu} + \sigma^{2}_{\varepsilon}))\), where \(\sigma^{2}_{\mu}\) and \(\sigma^{2}_{\varepsilon}\) are the variances of \(\mu_{j}\) and \(\varepsilon_{ij}\). Using country fixed effects to control for individuals being nested within countries is of course possible, but it implies that national-level variables, like the Gini, cannot be added to the model. Only self-rated health was used as a dependent variable because this is the only dependent variable that was measured at the individual level. Multilevel regression allows the researcher to study whether differences in income inequality at the country level are related to health outcomes at the individual level. The model is expressed as follows:
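The displayed equation is not preserved in this copy. A random-intercept form consistent with the definitions that follow, assuming separate coefficient vectors \(\beta_{1}\) and \(\beta_{2}\) for the individual-level and country-level covariates (this notation is an assumption), would be:

\[
Y_{ij} = \beta_{0} + \beta_{1} X_{ij} + \beta_{2} X_{j} + \mu_{j} + \varepsilon_{ij}
\]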

Here, \(X_{ij}\) represents individual-level covariates, \(X_{j}\) represents country-level covariates, \(\beta_{0}\) denotes the intercept, \(\mu_{j}\) is the error term for the country level, and \(\varepsilon_{ij}\) is the error term for the individual level. Again, dummy variables were used to simplify the model and to capture unmeasured time effects that vary across waves but not across countries or individuals in the same wave.
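The intra-class correlation can be illustrated with a small simulation. The sketch below is an assumption-laden example, not the estimation used in the study: it treats the intra-class correlation as the country-level share of total variance, uses hypothetical variance components of 0.09 and 0.91, and recovers them with a simple one-way ANOVA (method-of-moments) estimator for balanced groups rather than a fitted multilevel model.

```python
import numpy as np


def intraclass_correlation(var_country: float, var_individual: float) -> float:
    """Share of the total variance that lies between countries."""
    return var_country / (var_country + var_individual)


rng = np.random.default_rng(1)
n_countries, n_per_country = 200, 50
sigma2_mu, sigma2_eps = 0.09, 0.91   # hypothetical variance components

# Simulate a random intercept per country plus individual-level noise.
mu = rng.normal(0, np.sqrt(sigma2_mu), n_countries)
y = mu[:, None] + rng.normal(0, np.sqrt(sigma2_eps), (n_countries, n_per_country))

# One-way ANOVA estimates for balanced groups:
# the variance of group means overstates the country variance by within / n.
within = y.var(axis=1, ddof=1).mean()
between = y.mean(axis=1).var(ddof=1) - within / n_per_country
rho_hat = intraclass_correlation(between, within)
print(f"estimated intra-class correlation: {rho_hat:.3f}")
```

With these assumed components the true value is 0.09, so the estimate should land close to nine percent of the variance lying between countries.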

Third, fixed effects models were estimated. Data were again aggregated at the country level. In the fixed effects models, time series were nested within countries. As a result, only within-country variance was analysed. Additionally, time fixed effects were used to model time-specific effects that did not vary across countries. By using country fixed effects (\(\beta_{0j}\)), we examined changes within countries and were able to control for country-level unobserved heterogeneity. Estimators of fixed effects models are, thus, not contaminated with spurious effects of stable, unmeasured country characteristics (Verbeek 2004). Footnote 3 Effects that vary across time are still controlled by using dummy variables. These represent the average change in health across time. Fixed effects models can be used to determine whether changes within countries, with respect to levels of income inequality, are related to changes in health at the country level:
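The displayed equation is not preserved in this copy. A form consistent with the country fixed effect \(\beta_{0j}\) introduced above, assuming a wave index \(t\), a coefficient vector \(\beta\) and an error term \(\varepsilon_{jt}\), with the time dummies counted among the covariates \(X_{jt}\) as in the OLS model, would be:

\[
Y_{jt} = \beta_{0j} + \beta X_{jt} + \varepsilon_{jt}
\]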

Finally, pseudo panel datasets were created so that subgroups instead of populations become the unit of analysis (Deaton 1985; Moffitt 1993; Verbeek and Vella 2005; Verbeek 2007). We analysed subgroups of cohorts (c), but analysing other time-invariant subgroups is also possible (e.g., race or gender). By adding country (\(\beta_{0j}\)) and cohort (\(\beta_{0c}\)) fixed effects, we followed cohorts over time. Footnote 4, Footnote 5 We followed procedures used in previous studies (e.g., Meng et al. 2014) and divided the cohort groups into timeslots of 5 years. The first and the last cohorts included outliers on both ends. Periods of 5 years were used to ensure sufficient variety across cohorts at the levels of income inequality and health, and to allow the cohort averages to be based on sufficient sample sizes. Smaller and somewhat larger cohorts did not alter our results. As self-rated health was our only variable that varied across individuals and cohorts, it was the only dependent variable used for the multilevel and the cohort fixed effects models. The model can be written as follows:
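The displayed equation is not preserved in this copy. A form consistent with the country and cohort fixed effects defined above, assuming a wave index \(t\), a coefficient vector \(\beta\) and an error term \(\varepsilon_{cjt}\), would be:

\[
Y_{cjt} = \beta_{0j} + \beta_{0c} + \beta X_{cjt} + \varepsilon_{cjt}
\]

The construction of the pseudo panel itself can be sketched as follows. This is a minimal illustration with made-up respondents, not the authors' procedure: the tuple layout, the helper `cohort_start`, and the choice to align 5-year cohorts on birth years divisible by five are assumptions, and the open-ended first and last cohorts are not modelled.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical repeated cross-sections:
# (country, survey_year, birth_year, self_rated_health on the 1-5 scale).
respondents = [
    ("NL", 1990, 1952, 4), ("NL", 1990, 1954, 5), ("NL", 1990, 1961, 4),
    ("NL", 1999, 1951, 3), ("NL", 1999, 1953, 4), ("NL", 1999, 1963, 5),
    ("SE", 1990, 1950, 4), ("SE", 1999, 1954, 3),
]


def cohort_start(birth_year: int, width: int = 5) -> int:
    """First birth year of the fixed-width cohort containing birth_year."""
    return birth_year - birth_year % width


# Group individuals into (country, cohort, survey year) cells.
cells = defaultdict(list)
for country, year, born, health in respondents:
    cells[(country, cohort_start(born), year)].append(health)

# One pseudo-panel row per cell: the cell mean replaces the individuals,
# so the same cohort can be followed across waves without a true panel.
pseudo_panel = {key: mean(values) for key, values in sorted(cells.items())}
for (country, cohort, year), avg in pseudo_panel.items():
    print(country, f"{cohort}-{cohort + 4}", year, avg)
```

In this toy example the Dutch 1950-1954 cohort appears in both waves, with different respondents each time, which is what allows a cohort within a country to be tracked over time.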

To test for the mediating effect of trust in our models, we employed two strategies. In the multilevel setting, we used Mplus 7.0 with bootstrapping to account for the non-normal distribution of the indirect effect (Preacher and Hayes 2008). Because Mplus and other structural equation modelling programs have problems mimicking fixed effects models with unbalanced longitudinal datasets (Allison 2009), the mediation package for R (Tingley et al. 2013) was used to estimate the indirect effects of our OLS and fixed effects models. Again, bootstrapping was performed to test the indirect effect of trust.
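The logic of a bootstrapped indirect effect can be sketched in a few lines. The example below is not the Mplus or R `mediation` procedure used in the study: it swaps in a plain product-of-coefficients estimate with a percentile bootstrap in NumPy, on synthetic data whose coefficients are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data (illustrative): inequality lowers trust, trust raises health.
n = 300
inequality = rng.normal(0, 1, n)
trust = -0.5 * inequality + rng.normal(0, 1, n)
health = 0.4 * trust - 0.1 * inequality + rng.normal(0, 1, n)


def slopes(y: np.ndarray, *predictors: np.ndarray) -> np.ndarray:
    """OLS slopes of y on the predictors (intercept fitted, then dropped)."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]


def indirect_effect(x: np.ndarray, m: np.ndarray, y: np.ndarray) -> float:
    a = slopes(m, x)[0]       # path from x to the mediator
    b = slopes(y, m, x)[0]    # path from the mediator to y, holding x fixed
    return a * b


estimate = indirect_effect(inequality, trust, health)

# Percentile bootstrap: resample rows with replacement and recompute a * b.
draws = np.empty(2000)
for i in range(draws.size):
    idx = rng.integers(0, n, n)
    draws[i] = indirect_effect(inequality[idx], trust[idx], health[idx])
low, high = np.percentile(draws, [2.5, 97.5])
print(f"indirect effect {estimate:.3f}, 95% CI [{low:.3f}, {high:.3f}]")
```

The bootstrap is used because the product of two regression coefficients is generally not normally distributed, so an interval based on resampling is preferred to one based on a normal approximation.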

4 Results

To make the models more comparable, analyses that used self-rated health as the dependent variable were performed with the same data, consisting of 80 countries and 224 country-waves. Table 1 shows the descriptive statistics of the models.

4.1 OLS Regression

First, OLS regressions were performed. Table 2 shows that, unexpectedly, income inequality had a positive effect on self-rated health and a negative effect on mortality rate. The effect of income inequality on life expectancy was negative, as expected: the country with the lowest level of income inequality was predicted to have a life expectancy 10 years higher than the country with the highest level of income inequality. Social trust was associated with better health for all health measures, and it significantly mediated the effect of income inequality in all three models. GDP was positively associated with mortality rate and life expectancy. Average age showed a negative effect on self-rated health and mortality rate, but no significant effect on life expectancy. The human capital index showed a positive association only with life expectancy.

4.2 Multilevel Model

Table 3 shows the outcomes of the multilevel models. The intra-class correlation shows that about nine percent of the variance in self-rated health is accounted for by the clustering within countries. This justifies the use of a multilevel approach for this type of data. Again, income inequality was associated with better self-rated health; the country with the highest level of inequality was predicted to have a .215 higher average score for self-rated health compared to the country with the lowest level of inequality. This runs counter to the IIH. At both the individual and the country level, social trust was positively associated with self-rated health, although the effect of country-level social trust (b = .649, p < .001) was substantially larger than that of individual-level social trust (b = .131, p < .001). In addition, income inequality was associated with lower national social trust. Hence, in line with the IIH, the multilevel models suggest that social trust mediates the association between income inequality and health. This effect would not change if, instead of an indirect effect of national social trust, we tested the indirect effect of individual social trust.

Turning to the control variables, living in a wealthier country, with a higher GDP per capita, was again positively associated with better health. The other country-level control variables were not significant. Our individual-level control variables are not presented in order to make comparisons between models easier (but are available upon request). However, they show well-known effects. Education, being female, and income all show positive effects on self-rated health. The work activity variables all show negative effects compared to individuals with full-time jobs. Married people also have higher scores than people in all other marital statuses, except for singles, who score equally well. Moreover, the model demonstrates that Protestants and Buddhists score higher on self-rated health than people without religion, whereas Orthodox adherents score lower.

4.3 Fixed Effects Models

Table 4 shows the results for fixed effects models. In line with the OLS models, and as the IIH predicts, income inequality is still associated with lower life expectancy. Interestingly, compared to the OLS and multilevel models, the signs of the associations between income inequality and self-rated health, and between income inequality and mortality, are reversed and in line with the IIH: income inequality is associated with lower self-rated health and higher mortality rates. Hence, it seems that the better we control for between-country variance, the more support we find for the IIH. Social trust is still associated with good health (although not significantly with respect to life expectancy), and it still mediates the relationship between income inequality and health.

The control variables show more significant associations with our health measures. When we are able to control for between-country variation, associations with health become visible. Average age becomes negatively associated with self-rated health. Thus, older people report poorer health. In addition, human capital becomes positively associated with self-rated health and borderline significantly related to life expectancy. The more countries invest in human capital, for instance through schooling, the more the health of citizens improves. GDP remains negatively associated with health, but this effect is not significant with respect to self-rated health.

4.4 Fixed Effects Models with Pseudo Panel Data

Table 5 shows the results of the fixed effects models applied to the pseudo panel data. Whereas the effect of income inequality on health was positive in the OLS and multilevel models and negative but borderline significant in the fixed effects models, it became more negative and significant in the pseudo panel data. Hence, the better we control for between-country variation and the closer we follow groups through society, the more our models support the IIH. Furthermore, social trust still mediates the relationship between income inequality and health, as predicted by the IIH.

The control variables show effects similar to the fixed effects models. The human capital index has a positive effect on self-rated health, and people in older societies report lower health. GDP remains negatively, but not significantly, associated with self-rated health.

4.5 Robustness Checks

Much of the debate on income inequality focuses on the differences between developed and developing countries. In developed countries, the effects of wealth are expected to be less important; yet, other socio-psychological factors, such as stress induced by income inequality, are expected to be more important (Wilkinson and Pickett 2009 ). Therefore, separate analyses were conducted using a selection of high income countries only, as indicated by a GNP per capita of $12,746 or more (World Bank 2015 ). The results are summarized in Table  6 . In the OLS models, contrary to the IIH, income inequality was associated with lower mortality rates, and no significant effects were found for the other dependent variables. In the fixed effects models, income inequality was associated with higher self-rated health, lower mortality rate and higher life expectancy for high-income countries, and no significant effects were found for the non-high-income countries. Footnote 6 Thus, our main results appear to be robust even when tested on different sub-samples of the data.

Additionally, previous studies suggest that the effects of income inequality may be delayed up to 15 years and may account for the etiological period for income inequality to affect health (Blakely et al. 2000 ; Karlsdotter et al. 2012 ; Mellor and Milyo 2003 ). Therefore, lagged effects of income inequality for 5-, 10- and 15-years were tested in the models. However, no notable differences from our main results were observed. Several outliers were identified within the fixed effects models; these showed strong fluctuations within countries across time. Separate analyses were performed without outliers; however, the results did not differ greatly from the initial results. Overall, these findings increase our confidence in our results.
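As a rough illustration of how lagged predictors of this kind can be built, the sketch below looks up a country's inequality value 5, 10 and 15 years before the survey year. The series values and the names `add_lags` and `gini_by_year` are hypothetical; the exact procedure used for the models is not described here.

```python
def add_lags(series, year, lags=(5, 10, 15)):
    """Return lagged values of a yearly series; None where no value exists."""
    return {f"gini_lag{lag}": series.get(year - lag) for lag in lags}


# Hypothetical Gini values for one country, keyed by year.
gini_by_year = {1990: 0.30, 1995: 0.31, 2000: 0.33, 2005: 0.34}

row = add_lags(gini_by_year, 2005)
print(row)
```

Years without an observed value yield `None`, so such country-years would have to be dropped or imputed before estimation.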

Moreover, as argued in theory, income inequality may affect individuals' health through psychosocial and institutional mechanisms. According to the institutional mechanism, health expenditures and, more generally, government expenditures are lower in more unequal countries, which negatively affects health. Therefore, we included health expenditures per capita, from the OECD and WHO, in the models as a proxy for the institutional pathway. As information about health expenditures is not available for all countries considered in our previous analyses, we only used information from 74 countries. Because the effect of income inequality on health does not significantly change when we add health expenditure to the models, we find little evidence for the institutional pathway. This suggests, as the IIH predicts, that associations between income inequality and health can be attributed to the social–psychological pathway. We note that some unequal countries have notoriously costly but ineffective health care systems, particularly the United States. However, excluding the United States did not alter our results.

Furthermore, differences between the results and the predictions of the IIH may be caused by the use of different measurements and datasets across studies. Therefore, the models were replicated with European Social Survey (ESS) data, consisting of 114 country waves and 1140 pseudo panel observations. The ESS includes fewer countries than the WVS/EVS, but it has the advantage of measuring trust on a ten-point scale instead of as the binary question used in the WVS/EVS. Overall, quite similar results were found, although the effect of social trust was stronger, as could be expected with a more refined measurement.

Additionally, separate analyses with the EVS and WVS or with a dichotomous measure of self-rated health were performed. These showed very similar results to the previous analyses. Finally, estimating the multilevel data, in which individuals were nested within countries, with clustering correction and time fixed effects, yields results very similar to those of our multilevel models. Footnote 7

5 Conclusion and Discussion

The consequence of income inequality for individual health is still a strongly debated topic in the academic world. The main advantage of this study is its use of four different statistical methods to test the income inequality hypothesis (IIH). The methods used were OLS regression, multilevel models, fixed effects models and fixed effects models with pseudo panel data. The latter, used here for the first time to test the IIH, provides yet another rigorous way to test it.

As expected based on the literature (e.g., Kondo et al. 2009 ; Pickett and Wilkinson 2015 ), mixed effects were found for the relationship between income inequality and health. Around half of our results are in favour of the IIH. However, there are interesting differences between the models. They suggest that part of the evidence against the IIH can be attributed to confounding variables because the fixed effects and pseudo panel data, which can be used to control most rigorously for these confounding variables by looking at within country variance, show support for the IIH. Thus, the better we control for unobserved variables and biases, the more support we find for the IIH. Biases that might hamper OLS and multilevel studies include the different perceptions of self-rated health across countries and cultural/historical differences between countries (e.g., Dorling and Barford 2009 ; Pickett and Wilkinson 2015 ; Wilkinson and Pickett 2006 ).

Additionally, compared to the fixed effects models, the pseudo panel models show stronger significant effects, and they show support for the IIH, which could simply be a result of greater power. However, it might also indicate that the role of cohorts is crucial for understanding the IIH because pseudo panel data follow cohorts over time and avoid the data distortions that occur when cohorts are replaced. It may be that only certain (younger) cohorts are affected by income inequality and that society-wide effects are only notable when cohorts are replaced (Rözer and Volker 2015 ). Fixed effects models could not have detected this effect, which, in our opinion, demonstrates the strength of the pseudo panel data and illustrates a new method for further theoretical testing of the hypothesis.
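The idea of following cohorts through repeated cross-sections can be sketched as follows. This is a minimal illustration under assumed conventions: respondents are grouped into ten-year birth cohorts within country and survey wave, and each cell is summarised by its mean. The field names, the cohort width and the function `build_pseudo_panel` are hypothetical and need not match the construction used in this study.

```python
from collections import defaultdict


def build_pseudo_panel(respondents, cohort_width=10):
    """Average an outcome within (country, birth cohort, wave) cells."""
    cells = defaultdict(list)
    for r in respondents:
        # Assign each respondent to the start year of their birth cohort.
        cohort = (r["birth_year"] // cohort_width) * cohort_width
        cells[(r["country"], cohort, r["wave"])].append(r["health"])
    return {key: sum(vals) / len(vals) for key, vals in cells.items()}


# Toy respondents (assumption): different people are interviewed in each wave.
respondents = [
    {"country": "NL", "birth_year": 1962, "wave": 1, "health": 4},
    {"country": "NL", "birth_year": 1968, "wave": 1, "health": 3},
    {"country": "NL", "birth_year": 1965, "wave": 2, "health": 3},
    {"country": "NL", "birth_year": 1961, "wave": 2, "health": 2},
    {"country": "NL", "birth_year": 1985, "wave": 2, "health": 5},
]

panel = build_pseudo_panel(respondents)
for key in sorted(panel):
    print(key, panel[key])
```

Although no individual is observed twice, the 1960s cohort appears in both waves, so its cell means can be compared over time as if it were a panel unit.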

In line with the IIH, support was found for the harmful effect of income inequality on health because it reduces social trust (e.g., Kawachi et al. 1997 ). In our models, income inequality was associated with lower trust, except in the fixed effects models. Furthermore, social trust was consistently related to higher health, except for life expectancy in the fixed effects model. In all models, we found support for the mediating role of social trust. Additionally, when comparing the effect sizes of the direct effect of income inequality on individual health and the indirect effect mediated by social trust, we found that the effects of income inequality on social trust and of social trust on health were stronger than the direct effect of income inequality on health, emphasising the importance of the indirect pathway.
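The comparison of direct and indirect effects can be illustrated with the product-of-coefficients decomposition for linear models, in which the total effect equals the direct effect plus the product of the two indirect paths. The sketch below uses invented toy values and plain OLS; it is not the estimation procedure used in this study, and the variable names are hypothetical.

```python
def centered(values):
    """Return values with their mean subtracted."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(u, v):
    """Sum of element-wise products."""
    return sum(a * b for a, b in zip(u, v))


# Toy data (assumption): inequality X, trust M, health Y.
X = centered([1, 2, 3, 4, 5])
M = centered([5, 4, 4, 2, 1])
Y = centered([8, 7, 6, 4, 3])

sxx, smm, sxm = dot(X, X), dot(M, M), dot(X, M)
sxy, smy = dot(X, Y), dot(M, Y)

a = sxm / sxx        # path X -> M (simple regression of M on X)
total = sxy / sxx    # total effect of X on Y (simple regression of Y on X)

# Two-predictor OLS of Y on X and M, solved from the normal equations.
det = sxx * smm - sxm ** 2
direct = (smm * sxy - sxm * smy) / det   # effect of X on Y, holding M fixed
b = (sxx * smy - sxm * sxy) / det        # effect of M on Y, holding X fixed

indirect = a * b
print(f"a={a:.2f}, b={b:.2f}, direct={direct:.2f}, "
      f"indirect={indirect:.2f}, total={total:.2f}")
```

For OLS the identity total = direct + a × b holds exactly, which makes it easy to see how much of an association runs through the mediator.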

Our study has some drawbacks that may be addressed in future research. First, our analyses rely, as most studies do, solely on the Gini coefficient as a measure for income inequality. A disadvantage of the Gini coefficient is that it is an average measure of inequality; it does not measure where inequality appears. For example, a country with a small group of very wealthy individuals and a country with a small group of very poor people might have the same Gini coefficient, although the countries clearly differ. Furthermore, income is only one aspect of financial inequality; differences in wealth might be even larger and more important. Furthermore, within our multilevel models, we control for relative, instead of absolute, income, but one of the great strengths of multilevel modelling is that it can control for (absolute) income. Therefore, in contrast to OLS regression, it can distinguish effects of income inequality due to income and compositional differences (Gravelle et al. 2002 ). Related to this, we control for policy effects by using information about the health expenditures of countries. However, income inequality might be associated with numerous other policy effects, such as low social benefits, which future research may want to control for. Similarly, social trust was only measured with a binary question, which makes it hard to distinguish levels of trust between individuals. As our results using ESS data, in which social trust is measured on a ten-point scale, showed, the binary measure might have weakened its effects. In addition, social trust is a broad concept that is associated with a variety of more specific concepts, such as people’s openness to others, as well as their social network and social capital. Future research may seek to further disentangle why social trust is important in the association between income inequality and health.
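The point about the Gini coefficient can be made concrete with a small sketch. The two income distributions below are invented for illustration: one has a small richer group, the other a small group with no income, yet both give the same coefficient. The function `gini` uses the mean absolute difference definition and is a hypothetical helper, not code from this study.

```python
def gini(incomes):
    """Gini coefficient as mean absolute difference over twice the mean."""
    n = len(incomes)
    total = sum(incomes)
    if n == 0 or total == 0:
        raise ValueError("need at least one income and a positive total")
    abs_diff_sum = sum(abs(a - b) for a in incomes for b in incomes)
    # Equivalent to abs_diff_sum / (2 * n**2 * mean), since n * mean == total.
    return abs_diff_sum / (2 * n * total)


# Toy populations of ten people each (assumption).
few_rich = [4] * 9 + [9]   # nine people earn 4, one earns 9
few_poor = [1] * 9 + [0]   # nine people earn 1, one earns nothing

print(gini(few_rich), gini(few_poor))
```

Both populations have the same Gini coefficient even though inequality sits at opposite ends of the distribution, which is why the coefficient alone cannot show where inequality appears.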

Overall, half of our main results and robustness checks are in contrast with the prevailing hypothesis. However, as with the fixed effects and pseudo panel models, the unexpected results might be explained by unobserved confounders. Thus, if models are better able to control for these confounders, we may find that the income inequality hypothesis is supported. Moreover, as the income inequality hypothesis predicts, there is evidence that income inequality is associated with low trust and thus indirectly associated with health. Hence, our results indicate that, as the income inequality hypothesis predicts, income inequality is associated with ill health, but to a large extent this is because income inequality decreases social trust. The strong support that the pseudo panel data provide for these relationships suggests that cohort effects may play an important role in explaining why income inequality can be detrimental to health. Obviously, future research is needed to more fully understand the role of cohorts, but this study shows the usefulness of the newly applied method.

The correlation between the measures varies from moderate to weak; i.e. self-rated health with mortality rates (r = −.427), self-rated health with life expectancy (r = .248), and mortality rates with life expectancy (r = −.351).

We used repeated single imputation, in which education in years and our other variables (e.g. income, age, gender) were used to impute the missing values for the highest educational level achieved. The high correlation between education in terms of categories and years (r = .931) justifies this approach and simplifies further analyses.

To test whether a fixed or random effects model is appropriate for these data, the Hausman test was conducted (Hausman 1978 ). The Hausman test indicates that a fixed effects model, rather than a random-effects model, is most appropriate to examine our data because the assumption of no correlation between the country-specific effects and independent variables was not met: Chi squared(5) = 30.976, p < .001.
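The reported p-value can be checked from the test statistic alone. The sketch below evaluates the upper tail of a chi-squared distribution using the closed-form expression that exists for odd degrees of freedom, so only the standard library is needed. The function name `chi2_sf_odd_df` is hypothetical, and the sketch does not reproduce the Hausman test itself, only the conversion of the statistic into a p-value.

```python
import math


def chi2_sf_odd_df(x, df):
    """Upper-tail probability of a chi-squared variable with odd df."""
    if df < 1 or df % 2 == 0:
        raise ValueError("this closed form only covers odd degrees of freedom")
    root = math.sqrt(x)
    # Two-sided normal tail: 2 * (1 - Phi(sqrt(x))).
    tail = math.erfc(root / math.sqrt(2))
    # Standard normal density evaluated at sqrt(x).
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    series = 0.0
    denominator = 1.0
    for r in range(1, (df - 1) // 2 + 1):
        denominator *= 2 * r - 1          # 1, 1*3, 1*3*5, ...
        series += root ** (2 * r - 1) / denominator
    return tail + 2 * density * series


p_value = chi2_sf_odd_df(30.976, 5)
print(f"p = {p_value:.2e}")
```

The resulting value lies well below .001, in line with the reported result.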

Due to the high cost (in degrees of freedom) of modelling the interaction between country and cohort fixed effects, it was not possible to strictly follow cohorts over time. However, because the average age was used as a control variable, the results would most likely not differ from a more complex model in which cohorts were tracked over time.

Although this method does not follow individuals over time, it does not necessarily produce inferior results. Compared to individual data, repeated cross-sections suffer much less from typical panel data problems like attrition and non-response (Verbeek 2007 ).

Similar differences in previous results occurred when using mainly communistic or European countries (Jen, Jones, and Johnston 2009 ). Selecting these countries produced stronger significant results than their counterparts, but the results were in line with our conclusions.

In our application, the difference between the models is that multilevel models, instead of using a clustering correction, explicitly model the dependency in the data and therefore result in more accurate predictions than (most) clustering corrections (Hox 2010 : 4).

Alesina, A. F., & La Ferrara, E. L. (2002). Who trusts others? Journal of Public Economics, 85 (2), 207–234.

Allison, P. D. (2009). Fixed effects regression models . Los Angeles: Sage.

Avendano, M. (2012). Correlation or causation? Income inequality and infant mortality in fixed effects models in the period 1960–2008 in 34 OECD countries. Social Science and Medicine, 75 (4), 754–760.

Babones, S. J. (2008). Income inequality and population health: Correlation and causality. Social Science and Medicine, 66 (7), 1614–1626.

Blakely, T. A., Kennedy, B. P., Glass, R., & Kawachi, I. (2000). What is the lag time between income inequality and health status? Journal of Epidemiology and Community Health, 54 (4), 318–319.

Clarke, P. M., & Ryan, C. (2006). Self-reported health: Reliability and consequences for health inequality measurement. Health Economics, 15 (6), 645–652.

Coburn, D. (2000). Income inequality, social cohesion and the health status of populations: The role of neo-liberalism. Social Science and Medicine, 51 , 135–146.

Coleman, J. S. (1988). Social capital in the creation of human capital. American Journal of Sociology, 94 , 95–120.

Dang, H. A., & Lanjouw, P. (2013). Measuring poverty dynamics with synthetic panels based on cross-sections. World Bank Policy Research Working Paper No. 6504 . Washington DC: The World Bank.

Deaton, A. (1985). Panel data from time series of cross-sections. Journal of Econometrics, 30 (1–2), 109–126.

Delhey, J., & Dragolov, G. (2013). Why inequality makes Europeans less happy: The role of distrust, status anxiety, and perceived conflict. European Sociological Review, 30 (2), 151–165.

Delhey, J., & Newton, K. (2003). Who trusts?: The origins of social trust in seven societies. European Societies, 5 (2), 93–137.

Dorling, D., & Barford, A. (2009). The inequality hypothesis. Thesis, antithesis, and a synthesis? Health and Place, 15 (4), 1166–1169.

Échevin, D. (2013). Measuring vulnerability to asset-poverty in sub-Saharan Africa. World Development, 46 , 211–222.

Elgar, F. J. (2010). Income inequality, trust, and population health in 33 countries. American Journal of Public Health, 100 (11), 2311–2315.

EVS. (2011). European values study 1981 – 2008, longitudinal data file . GESIS Data Archive, Cologne, Germany, ZA4804 Data File Version 2.0.0.

Feenstra, R. C., Inklaar, R., & Timmer, M. P. (2013). The next generation of the Penn World Table . www.ggdc.net/pwt .

Gold, R., Kennedy, B., Connell, F., & Kawachi, I. (2002). Teen births, income inequality, and social capital: Developing an understanding of the causal pathway. Health and Place, 8 (2), 77–83.

Gravelle, H., Wildman, J., & Sutton, M. (2002). Income, income inequality and health: What can we learn from aggregate data? Social Science and Medicine, 54 (4), 577–589.

Hausman, J. A. (1978). Specification tests in econometrics. Econometrica, 46 (6), 1251–1271.

Hox, J. J. (2010). Multilevel analysis: Techniques and applications (2nd ed.). London: Psychology Press.

Jen, M. H., Jones, K., & Johnston, R. (2009). Global variations in health: Evaluating Wilkinson’s income inequality hypothesis using the World Values Survey. Social Science and Medicine, 68 (4), 643–653.

Jen, M. H., Sund, E. R., Johnston, R., & Jones, K. (2010). Trustful societies, trustful individuals, and health: An analysis of self-rated health and social trust using the World Value Survey. Health & Place, 16 (5), 1022–1029.

Judge, K. (1995). Income distribution and life expectancy: A critical appraisal. British Medical Journal, 311 , 1282–1285.

Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47 (2), 263–291.

Karlsdotter, K., Martín, J. J. M., & del Amo González, M. P. L. (2012). Multilevel analysis of income, income inequalities and health in Spain. Social Science and Medicine, 74 (7), 1099–1106.

Kawachi, I., & Kennedy, B. P. (1997). Socioeconomic determinants of health: Health and social cohesion: Why care about income inequality? British Medical Journal, 314 (7086), 1037.

Kawachi, I., & Kennedy, B. P. (1999). Income inequality and health: Pathways and mechanisms. Health Services Research, 34 (1), 215–227.

Kawachi, I., Kennedy, B. P., Lochner, K., & Prothrow-Stith, D. (1997). Social capital, income inequality, and mortality. American Journal of Public Health, 87 (9), 1491–1498.

Kondo, N., Sembajwe, G., Kawachi, I., Van Dam, R. M., Subramanian, S. V., & Yamagata, Z. (2009). Income inequality, mortality, and self rated health: Meta-analysis of multilevel studies. British Medical Journal, 339 , 4471.

Layte, R. (2012). The association between income inequality and mental health: Testing status anxiety, social capital, and neo-materialist explanations. European Sociological Review, 28 (4), 498–511.

Layte, R., & Whelan, C. T. (2014). Who feels inferior? A test of the status anxiety hypothesis of social inequalities in health. European Sociological Review , 30 (4), 525–535.

Lewis, J. D., & Weigert, A. (1985). Trust as a social reality. Social Forces, 63 (4), 967–985.

Lundberg, O., & Manderbacka, K. (1996). Assessing reliability of a measure of self-rated health. Scandinavian Journal of Public Health, 24 (3), 218–224.

Lynch, J., Smith, G. D., Harper, S., Hillemeier, M., Ross, N., Kaplan, G. A., & Wolfson, M. (2004). Is income inequality a determinant of population Health? Part 1. A systematic review. Milbank Quarterly, 82 (1), 5–99.

Mannheim, K. (1952). Das Problem der Generationen (The problem of generations). In P. Kecskemeti (Ed.), Essays on the sociology of knowledge by Karl Mannheim . New York: Routledge and Kegan Paul.

Mellor, J. M., & Milyo, J. (2001). Reexamining the evidence of an ecological association between income inequality and health. Journal of Health Politics, Policy and Law, 26 (3), 487–522.

Mellor, J. M., & Milyo, J. (2003). Is exposure to income inequality a public health concern? Lagged effects of income inequality on individual and population health. Health Services Research, 38 (11), 137–151.

Meng, Y., Brennan, A., Purshouse, R., Hill-McManus, D., Angus, C., Holmes, J., & Meier, P. S. (2014). Estimation of own and cross price elasticities of alcohol demand in the UK—A pseudo-panel approach using the living costs and food survey 2001–2009. Journal of Health Economics, 34 , 96–103.

Moffitt, R. (1993). Identification and estimation of dynamic models with a time series of repeated cross-sections. Journal of Econometrics, 59 , 99–123.

Olivera, J. (2012). Preferences for redistribution in Europe. IZA Journal of European Labor Studies, 4 (1), 1–18.

Pickett, K., & Wilkinson, R. G. (2015). Income inequality and health: A causal review. Social Science and Medicine, 128 , 316–326.

Pop, I. A., van Ingen, E., & van Oorschot, W. (2013). Inequality, wealth and health: Is decreasing income inequality the key to create healthier societies? Social Indicators Research, 113 (3), 1025–1043.

Präg, P., Mills, M., & Wittek, R. (2014). Income and income inequality as social determinants of health: Do social comparisons play a role? European Sociological Review, 30 (2), 218–229.

Preacher, K. J., & Hayes, A. F. (2008). Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behavior Research Methods, 40 (3), 879–891.

Putnam, R. D. (2001). Bowling alone: The collapse and revival of American community . New York: Simon and Schuster.

Qi, Y. (2012). The impact of income inequality on self-rated general health: Evidence from a cross-national study. Research in Social Stratification and Mobility, 30 (4), 451–471.

Rothstein, B., & Uslaner, E. M. (2005). All for all: Equality, corruption, and social trust. World Politics, 58 (1), 41–72.

Rözer, J., & Kraaykamp, G. (2012). Income inequality and subjective well-being: A cross-national study on the conditional effects of individual and national characteristics. Social Indicators Research, 113 (3), 1009–1023.

Rözer, J., Kraaykamp, G., & Huijts, T. (2016). National income inequality and self-rated health. The differing impact of individual social trust across 89 countries. European Societies . doi: 10.1080/14616696.2016.1153697 .

Rözer, J., & Volker, B. (2015). Does income inequality have lasting effects on health and trust? Social Science and Medicine . doi: 10.1016/j.socscimed.2015.11.047 .

Solt, F. (2009). Standardizing the world income inequality database. Social Science Quarterly, 90 (2), 231–242.

Subramanian, S. V., & Kawachi, I. (2004). Income inequality and health: What have we learned so far? Epidemiologic Reviews, 26 (1), 78–91.

Takahashi, T., Ikeda, K., Ishikawa, M., Kitamura, N., Tsukasaki, T., Nakama, D., & Kameda, T. (2005). Interpersonal trust and social stress-induced cortisol elevation. NeuroReport, 16 (2), 197–199.

Tingley, D., Yamamoto, T., Keele, L., & Imai, K. (2013). Mediation: R package for causal mediation analysis . R package version 4.2.3. http://CRAN.R-project.org/package=mediation .

United Nations University (UNU)-WIDER. (2008). World income inequality database, version 2.0c .

Veenstra, G. (2002). Social capital and health (plus wealth, income inequality and regional health governance). Social Science and Medicine, 54 (6), 849–868.

Verbeek, M. (2004). A guide to modern econometrics (2nd ed.). Chichester: Wiley.

Verbeek, M. (2007). Pseudo-panels and repeated cross-sections. In L. Mátyás & P. Sevestre (Eds.), The econometrics of panel data (Vol. 46, pp. 369–383). Berlin: Springer.

Verbeek, M., & Vella, F. (2005). Estimating dynamic models from repeated cross-sections. Journal of Econometrics, 127 (1), 83–102.

Whelan, C. T., & Maître, B. (2013). Material deprivation, economic stress, and reference groups in Europe: An analysis of EU-SILC 2009. European Sociological Review, 29 (6), 1162–1174.

Wilkinson, R. G. (1996). Unhealthy societies: The afflictions of inequality . London: Routledge.

Wilkinson, R. G., & Pickett, K. E. (2006). Income inequality and population health: A review and explanation of the evidence. Social Science and Medicine, 62 (7), 1768–1784.

Wilkinson, R. G., & Pickett, K. E. (2007). The problems of relative deprivation: Why some societies do better than others. Social Science and Medicine, 65 (9), 1965–1978.

Wilkinson, R. G., & Pickett, K. E. (2009). The spirit level: Why equality is better for everyone . London: Penguin Books.

World Bank. (2015). World Bank list of economies . Retrieved from http://data.worldbank.org/about/country-and-lending-groups#High_income .

WVS. (1981–2008). World value survey official aggregate v.20090901, 2009 . World Value Survey Association. www.worldvaluesurvey.org . Aggregate File Producer: ASEP/JDS, Madrid.

WVS. (2014). World value survey wave 6 2010 – 2014 official aggregate v.20140429 . World Values Survey Association. www.worldvaluessurvey.org .

Zagorski, K., Evans, M. D. R., Kelley, J., & Piotrowska, K. (2014). Does national income inequality affect individuals’ quality of life in Europe? Inequality, happiness, finances, and health. Social Indicators Research , 117 (3), 1089–1110.

Zak, P. J., & Knack, S. (2001). Trust and growth. The Economic Journal, 111 (470), 295–321.

Acknowledgments

We are grateful for Beate Völker’s, Alexander Grims’, and Zoltan Lippenyi’s feedback throughout the writing process.

Author information

Authors and affiliations.

Department of Sociology, University of Amsterdam, Nieuwe Achtergracht 166, Amsterdam, The Netherlands

Nigel Kragten & Jesper Rözer

Corresponding author

Correspondence to Nigel Kragten .

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 37 kb)

Rights and permissions.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Kragten, N., Rözer, J. The Income Inequality Hypothesis Revisited: Assessing the Hypothesis Using Four Methodological Approaches. Soc Indic Res 131 , 1015–1033 (2017). https://doi.org/10.1007/s11205-016-1283-8

Accepted : 26 February 2016

Published : 04 March 2016

Issue Date : April 2017

DOI : https://doi.org/10.1007/s11205-016-1283-8

  • Income inequality
  • Social trust
  • Pseudo panels
  • Multilevel regression
  • Fixed effects model

May 28, 2024

'Lean in' messages can lower women's motivation to protest gender inequality

by University of Exeter

Women in leadership are often told to "Lean In," motivational messaging intended to make them more confident, strategic and resilient to setbacks. However, new research indicates that such "Lean In" messaging can hinder women's motivation to protest gender inequality.

Popularized in a book by American technology executive Sheryl Sandberg, the "Lean In" solution to gender inequality advises women that demonstrating personal resilience and perseverance in the face of setbacks is key to career advancement.

Now, a new study led by the University of Exeter, Bath Spa University and the Australian National University has found that while such messages may provide inspiration for some, they can also reduce women's likelihood to protest gender discrimination. This effect could actually be hindering gender equality progress.

Published in Psychology of Women Quarterly , the study involved four experiments. Researchers examined women's motivation to protest gender inequality after exposure to "Lean In" messages promoting individual resilience.

All the experiments were in the UK and involved more than 1,100 women who were either undergraduate students or employed women with university degrees. Women read about gender inequality, and then either read about resilience as key to promoting advancement (in line with "lean in" messaging), or participated in activities to build their own resilience by learning how to set flexible goals and maintain confidence.

The research found:

  • In three of four experiments, women in "Lean In" conditions were less willing to be part of protest action over gender inequality compared to those in a control condition who were not exposed to "Lean In" messages.
  • In two of the experiments, this effect occurred because women in "Lean In" conditions were less likely to believe that gender discrimination would affect their career prospects.
  • In one, this effect occurred because women in "Lean In" conditions also felt less angry about ongoing gender inequality.

Authors say the findings of this research highlight an unintended consequence of "Lean In" messages and related individual resilience training for women that is offered as a remedy for gender inequality in the workplace—that it can undermine women's recognition of, and willingness to protest about, the root causes of gender inequality: discrimination.

Lead author, Dr. Renata Bongiorno, who conducted the studies while at the University of Exeter and is now Senior Lecturer in Psychology at Bath Spa University, said, "The popularity of the 'Lean In' movement speaks to the challenges women continue to face due to gender discrimination in the workplace.

"Women are understandably looking for ways to advance their careers despite the disproportionate setbacks they continue to experience compared to men.

"While the 'Lean In' solution offered by Sheryl Sandberg can feel empowering, a lack of individual resilience or perseverance is not the cause of women's poorer career progress.

"The messages lead to women assuming that gender discrimination will be less of a barrier to their career advancement. This false belief is concerning for progress because it is reducing women's willingness to protest the real causes of gender inequality.

"Progress and gains for women have historically been achieved through collective protest over gender discriminatory practices and policies, including pregnancy discrimination, a lack of affordable childcare, and workplace sexual harassment.

"Finding ways to effectively challenge these ongoing barriers should be a focus for feminism because they are the real causes of gender inequality in career outcomes."

Journal information: Psychology of Women Quarterly

Provided by University of Exeter





Artificial Intelligence and gender equality


The world has a gender equality problem, and Artificial Intelligence (AI) mirrors the gender bias in our society.

Although globally more women are accessing the internet every year, in low-income countries only 20 per cent are connected. The gender digital divide creates a data gap that is reflected in the gender bias in AI.

Who creates AI, and what biases are built into AI data (or not), can perpetuate, widen, or reduce gender equality gaps.

Young women participants work together on a laptop during an African Girls Can Code Initiative coding bootcamp held at the GIZ Digital Transformation Center in Kigali, Rwanda, in April 2024

What is AI gender bias? 

A study by the Berkeley Haas Center for Equity, Gender and Leadership analysed 133 AI systems across different industries and found that about 44 per cent of them showed gender bias, and 25 per cent exhibited both gender and racial bias.

Beyza Doğuç, an artist from Ankara, Turkey, encountered gender bias in Generative AI when she was researching for a novel and prompted it to write a story about a doctor and a nurse. Generative AI creates new content (text, images, video, etc.) inspired by similar content and data that it was trained on, often in response to questions or prompts by a user.

The AI made the doctor male and the nurse female. Doğuç continued to give it more prompts, and the AI always chose gender-stereotypical roles for the characters and associated certain qualities and skills with male or female characters. When she asked the AI about the gender bias it exhibited, the AI explained that the bias came from the data it had been trained on and, specifically, from “word embedding”: the way words are encoded in machine learning to reflect their meaning and their association with other words. This is how machines learn and work with human language. If the AI is trained on data that associates women and men with different and specific skills or interests, it will generate content reflecting that bias.
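To make the idea of word embedding more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular AI system works: the three-dimensional vectors are invented for this example (real embeddings are learned from large text collections and have hundreds of dimensions), and the helper names `cosine_similarity` and `gender_lean` are hypothetical. The sketch shows how an occupation word can end up closer to "he" than to "she" when the encoding reflects biased associations.

```python
import math

# Hypothetical 3-dimensional embeddings, invented for illustration only.
# Real embeddings are learned from text and are much larger.
TOY_EMBEDDINGS = {
    "he":     [1.0, 0.0, 0.2],
    "she":    [0.0, 1.0, 0.2],
    "doctor": [0.8, 0.3, 0.5],
    "nurse":  [0.2, 0.9, 0.5],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def gender_lean(word, embeddings=TOY_EMBEDDINGS):
    """Positive if `word` sits closer to "he", negative if closer to "she"."""
    vec = embeddings[word]
    return (cosine_similarity(vec, embeddings["he"])
            - cosine_similarity(vec, embeddings["she"]))


for word in ("doctor", "nurse"):
    lean = gender_lean(word)
    side = "he" if lean > 0 else "she"
    print(f"{word}: leans towards '{side}'")
```

With these made-up vectors, "doctor" leans towards "he" and "nurse" towards "she". A system that generates text from such encodings would tend to reproduce the same association.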

“Artificial intelligence mirrors the biases that are present in our society and that manifest in AI training data,” said Doğuç, in a recent interview with UN Women.

Who develops AI, and what kind of data it is trained on, has gender implications for AI-powered solutions.

Sola Mahfouz, a quantum computing researcher at Tufts University, is excited about AI, but also concerned. “Is it equitable? How much does it mirror our society’s patriarchal structures and inherent biases from its predominantly male creators?” she reflected.

Mahfouz was born in Afghanistan, where she was forced to leave school when the Taliban came to her home and threatened her family. She eventually escaped Afghanistan and immigrated to the U.S. in 2016 to attend college.

As companies are scrambling for more data to feed AI systems, researchers from Epoch claim that tech companies could run out of high-quality data used by AI by 2026.

Natacha Sangwa is a student from Rwanda who participated in the first coding camp organized under the African Girls Can Code Initiative last year. “I have noticed that [AI] is mostly developed by men and trained on datasets that are primarily based on men,” said Sangwa, who saw first-hand how that impacts women’s experience with the technology. “When women use some AI-powered systems to diagnose illnesses, they often receive inaccurate answers, because the AI is not aware of symptoms that may present differently in women.” 

If current trends continue, AI-powered technology and services will continue to lack diverse gender and racial perspectives, and that gap will result in lower-quality services and biased decisions about jobs, credit, health care and more.

How to avoid gender bias in AI?

Removing gender bias in AI starts with prioritizing gender equality as a goal, as AI systems are conceptualized and built. This includes assessing data for misrepresentation, providing data that is representative of diverse gender and racial experiences, and reshaping the teams developing AI to make them more diverse and inclusive.
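One simple way to start assessing data for misrepresentation is to count how each group is represented in a dataset. The Python sketch below is only an illustration: the record format, the `gender` field name, the 40 per cent threshold, and the function names are all assumptions made for this example, and a real audit would look at far more than raw counts.

```python
from collections import Counter


def representation_shares(records, field="gender"):
    """Return each group's share of the records, as a fraction of the total."""
    counts = Counter(record[field] for record in records)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def underrepresented(records, field="gender", min_share=0.4):
    """List groups whose share falls below `min_share` (an assumed threshold)."""
    shares = representation_shares(records, field)
    return sorted(group for group, share in shares.items() if share < min_share)


# Hypothetical sample: 7 records about men, 3 about women.
sample = [{"gender": "man"}] * 7 + [{"gender": "woman"}] * 3

print(representation_shares(sample))
print(underrepresented(sample))
```

In this invented sample, women make up 30 per cent of the records and would be flagged as underrepresented. The threshold is a design choice that depends on the population the system is meant to serve.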

According to the Global Gender Gap Report of 2023, women make up only 30 per cent of those currently working in AI.

“When technology is developed with just one perspective, it’s like looking at the world half-blind,” concurred Mahfouz. She is currently working on a project to create an AI-powered platform that would connect Afghan women with each other. 

“More women researchers are needed in the field. The unique lived experiences of women can profoundly shape the theoretical foundations of technology. It can also open new applications of the technology,” she added. 

“To prevent gender bias in AI, we must first address gender bias in our society,” said Doğuç from Turkey.

There is a critical need for drawing upon diverse fields of expertise when developing AI, including gender expertise, so that machine learning systems can serve us better and support the drive for a more equal and sustainable world.

In a rapidly advancing AI industry, the lack of gender perspectives, data, and decision-making can perpetuate profound inequality for years to come.

The AI field needs more women, and that requires enabling and increasing girls’ and women’s access to and leadership in STEM and ICT education and careers.

The World Economic Forum reported in 2023 that women accounted for just 29 per cent of all science, technology, engineering and math (STEM) workers. Although more women are graduating and entering STEM jobs today than ever before, they are concentrated in entry level jobs and less likely to hold leadership positions.

Detail from the mural painting "Titans" by Lumen Martin Winter as installed on the third floor of the UN General Assembly Building in New York

How can AI governance help accelerate progress towards gender equality?

International cooperation on digital technology has focused on technical and infrastructural issues and the digital economy, often at the expense of how technological developments were affecting society and generating disruption across all its layers – especially for the most vulnerable and historically excluded. There is a global governance deficit in addressing the challenges and risks of AI and harnessing its potential to leave no one behind.

“Right now, there is no mechanism to constrain developers from releasing AI systems before they are ready and safe. There’s a need for a global multistakeholder governance model that prevents and redresses when AI systems exhibit gender or racial bias, reinforce harmful stereotypes, or does not meet privacy and security standards,” said Helene Molinier, UN Women’s Advisor on Digital Gender Equality Cooperation in a recent interview with Devex.

In the current AI architecture, benefits and risks are not equitably distributed, with power concentrated in the hands of a few corporations, States and individuals, who control talent, data and computer resources. There is also no mechanism to look at broader considerations, like new forms of social vulnerability generated by AI, the disruption of industries and labour markets, the propensity for emerging technology to be used as a tool of oppression, the sustainability of the AI supply chain, or the impact of AI on future generations.

In 2024, the negotiation of the Global Digital Compact (GDC) offers a unique opportunity to build political momentum and place gender perspectives on digital technology at the core of a new digital governance framework. Without it, we face the risk of overlaying AI onto existing gender gaps, causing gender-based discrimination and harm to be left unchanged – and even amplified and perpetuated by AI systems.

UN Women’s position paper on the GDC provides concrete recommendations to harness the speed, scale, and scope of digital transformation for the empowerment of women and girls in all their diversity, and to trigger transformations that set countries on paths to an equitable digital future for all.


