Wikipedia is the self-described free encyclopedia, available in more than 300 languages and one of the most popular websites on the Internet. Despite its mission of collecting the sum of all knowledge, Wikipedia struggles with gender bias. In this paper we propose a corpus for analyzing how biographies are created in the English Wikipedia, in order to identify gender bias in the creation of new content intended to reflect the valid knowledge of all human beings. First, we identify a mechanism to access a corpus of deleted biographies and of biographies that have been listed in the category Articles for Deletion, where editors vote in an online debate to keep, merge, redirect or delete content. Then we access a second corpus, drawn from the category Scientists by field, in which we have chosen biographies marked as needing improvement due to their lack of bibliographic references and biographies that have never been marked for improvement. In both cases we focus on the area of science: from Articles for Deletion we selected scientists, and from Scientists by field we selected STEM scientists, in order to compare how gender affects the development of content in Wikipedia. Lastly, we propose a path to understanding how the gender gap is generated in the collaborative creation of shared content. This entails a close look at the policies and guidelines of the digital encyclopedia, such as notability and reliable sources, created by the community of editors to shape the type of content accepted as valid knowledge.