Conference paper, Year: 2022

The General Index of Software Engineering Papers

Abstract

We introduce the General Index of Software Engineering Papers, a dataset of full-text-indexed papers from the most prominent scientific venues in the field of software engineering. The dataset includes both complete bibliographic information and indexed n-grams of length 1 to 5 for 44 581 papers retrieved from 34 venues over the 1971-2020 period. An n-gram here is a sequence of contiguous words after removal of stopwords and non-words; this release contains 577 276 382 unique n-grams. The dataset serves use cases in meta-research, making it possible to introspect the output of software engineering research even when access to papers or scholarly search engines is not possible (e.g., for contractual reasons). It also helps make such analyses reproducible and independently verifiable, in contrast to analyses conducted using third-party, non-open scholarly indexing services. The dataset is available as a portable Postgres database dump and is released as open data.
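To make the n-gram definition in the abstract concrete, here is a minimal Python sketch of extracting n-grams of length 1 to 5 from a piece of text after stopword removal. It is an illustration only and not the pipeline used to build the dataset: the stopword list, the letters-only tokenization rule, and the names `STOPWORDS`, `tokenize`, and `ngrams` are all assumptions made for this example.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch;
# the list actually used to build the dataset is not given in the abstract).
STOPWORDS = frozenset({"a", "an", "and", "of", "the", "to", "in", "is", "for", "on", "with"})


def tokenize(text):
    """Lowercase the text, keep runs of ASCII letters only, and drop stopwords."""
    # Assumption: "non-words" are approximated as anything that is not a run of letters.
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def ngrams(tokens, min_n=1, max_n=5):
    """Return every contiguous n-gram of length min_n..max_n as a space-joined string."""
    result = []
    for n in range(min_n, max_n + 1):
        for i in range(len(tokens) - n + 1):
            result.append(" ".join(tokens[i:i + n]))
    return result


tokens = tokenize("The dataset of software engineering papers")
grams = ngrams(tokens)
```

Because stopwords are removed before the n-grams are formed, words that were separated by a stopword in the original text become adjacent: in this sketch "dataset of software" yields the 2-gram "dataset software".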
Main file
main.pdf (712.42 KB)
Origin: files produced by the author(s)

Dates and versions

hal-03623109, version 1 (06-04-2022)

Identifiers

Cite

Zeinab Abou Khalil, Stefano Zacchiroli. The General Index of Software Engineering Papers. MSR 2022 - The 2022 Mining Software Repositories Conference, May 2022, Pittsburgh, Pennsylvania, United States. ⟨10.1145/3524842.3528494⟩. ⟨hal-03623109⟩