Analyzing the gender bias in STEM biographies: The role of reliable information sources in Wikipedia


Wikipedia has a well-documented gender gap in both content and content creation (Collier & Bear, 2012; Hill & Shaw, 2013). For instance, women accounted for less than 18.6% of the scientist biographies in the English Wikipedia in 2022 (Wikimédia France & Le Hir, 2017), and less than 11.6% of the editors of the Spanish Wikipedia are women (Minguillón et al., 2021). This work presents a methodology for analyzing the bibliographic references used in Wikipedia biographies in order to study the systemic gender bias in its content. It also assesses new information resources from a gender perspective, in support of the diffusion of open science.

One obstacle to getting content accepted into Wikipedia is that any new article must be notable, that is, it must meet notability requirements defined by a set of rules: articles have to be based on third-party, non-affiliated sources with some degree of editorial oversight, which is seen as a guarantee of neutrality and quality. Wikipedia’s vision of knowledge conflates cultural significance with visibility in secondary media sources (Gauthier & Sawchuk, 2017). This is especially relevant to women, who attract less mainstream media interest and therefore appear less often. Only 24% of news sources are women; that is, women are a minority of the people seen, heard or read about in the media (Macharia, 2020). It is therefore relevant to study the information sources used to ground the reliability of Wikipedia biographies in relation to gender bias, and to examine notability as a criterion that conditions women’s visibility in the online encyclopedia, in this case for STEM scientists and their work.

In this paper, the reliable information sources from 100 biographies of scientists in STEM disciplines (science, technology, engineering and mathematics) in the English Wikipedia are selected and compared. Our tool for analyzing the references contains 8 categories of analysis, which can be organized into several sub-groups: the number of sources cited in each biography; the type of content, for example books, journals or media (Ford et al., 2013); the format, digital or physical; the type of access, that is, whether the source is open access and whether its link is broken or still active (Teplitskiy et al., 2016); and the section of the biography where it is cited: introduction, early years, career, honors and awards, or selected publications (Agarwal et al., 2020; Azer et al., 2015). The tool was built by reviewing the literature on information sources from the library and information science field and previous studies on the use of references in Wikipedia.
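
As a rough illustration of how such a coding scheme could be recorded, the Python sketch below models only the sub-groups named in the paragraph above. Every class name, field name and category value in it (including the OTHER content type and the summarize helper) is an assumption made for illustration. It is not the authors' actual instrument, and it does not attempt to reproduce the full set of 8 categories.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ContentType(Enum):
    BOOK = "book"
    JOURNAL = "journal"
    MEDIA = "media"
    OTHER = "other"  # assumption: placeholder for content types not listed in the text


class SourceFormat(Enum):
    DIGITAL = "digital"
    PHYSICAL = "physical"


class Section(Enum):
    INTRODUCTION = "introduction"
    EARLY_YEARS = "early years"
    CAREER = "career"
    HONORS_AND_AWARDS = "honors and awards"
    SELECTED_PUBLICATIONS = "selected publications"


@dataclass(frozen=True)
class Reference:
    """One coded reference from a biography (hypothetical record layout)."""
    content_type: ContentType
    source_format: SourceFormat
    open_access: bool
    link_active: Optional[bool]  # None when the source has no link (e.g. a print book)
    section: Section


def summarize(references):
    """Aggregate the coded references of a single biography."""
    return {
        "n_sources": len(references),
        "by_content_type": Counter(r.content_type for r in references),
        "by_format": Counter(r.source_format for r in references),
        "by_section": Counter(r.section for r in references),
        "open_access": sum(1 for r in references if r.open_access),
        "broken_links": sum(1 for r in references if r.link_active is False),
    }


if __name__ == "__main__":
    # Invented sample data, only to show the shape of the summary.
    sample = [
        Reference(ContentType.JOURNAL, SourceFormat.DIGITAL, True, True, Section.CAREER),
        Reference(ContentType.BOOK, SourceFormat.PHYSICAL, False, None, Section.EARLY_YEARS),
    ]
    print(summarize(sample))
```

Keeping one record per reference, rather than one per biography, would allow the same data to be aggregated by biography, by gender of the biographee, or by section without recoding.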

87th IFLA World Library and Information Congress: Inspire, Engage, Enable, Connect


David Ramírez-Ordóñez
Doctoral researcher

Predoctoral researcher in the Information and Knowledge Society programme at the UOC.

Julio Meneses
Associate professor

Professor of research methodology, director of Learning Analytics at the eLearning Innovation Center, and researcher at the Internet Interdisciplinary Institute of the Universitat Oberta de Catalunya.