Who Did What? Identifying Author Contributions in Biomedical Publications using Naïve Bayes

Published by Dominika Tkaczyk on

Our paper “Who Did What? Identifying Author Contributions in Biomedical Publications using Naïve Bayes” was recently accepted and will be presented at the Joint Conference on Digital Libraries 2018.

Abstract: Creating scientific publications is a complex process. It is composed of a number of different activities, such as designing the experiments, analyzing the data, and writing the manuscript. Information about the contributions of individual authors of a paper is important for assessing authors’ scientific achievements. Some biomedical publications contain a short section written in natural language, which describes the roles each author played in the process of preparing the article. In this paper, we present a study of authors’ roles commonly appearing in these sections, and propose an algorithm for automatic extraction of authors’ roles from them. In our study, we used co-clustering techniques, as well as Open Information Extraction, to semi-automatically discover the most popular roles within a corpus of contributions sections. In total 13 roles were discovered, three of which (paper revision, literature review, and interpretation) are not described by existing author role taxonomies. Discovered roles are then used to automatically build a training set for a supervised Naïve Bayes role extractor. The proposed role extractor is able to extract roles from the text with micro-averaged precision 0.68, recall 0.48 and F1 0.57.
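For readers unfamiliar with the micro-averaged scores quoted in the abstract, here is a minimal sketch of how such numbers are typically computed when each item can carry several role labels. The function name `micro_prf`, the role strings, and the toy data are illustrative assumptions, not the evaluation code or data from the paper.

```python
def micro_prf(gold, predicted):
    """Micro-averaged precision, recall and F1 over multi-label items.

    gold, predicted: equal-length lists of sets of role labels.
    Counts are pooled over all items and roles before dividing.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)   # roles predicted and present in gold
        fp += len(p - g)   # roles predicted but absent from gold
        fn += len(g - p)   # gold roles that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical toy example (not data from the paper).
gold = [
    {"experiment design", "writing"},
    {"data analysis"},
    {"writing", "paper revision"},
]
predicted = [
    {"writing"},
    {"data analysis", "interpretation"},
    {"writing"},
]

precision, recall, f1 = micro_prf(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Because the counts are pooled before dividing, frequent roles weigh more heavily in a micro-average than rare ones. In the toy data, precision is higher than recall because the predictions are mostly correct but miss some gold roles, which is the same pattern as the scores reported in the abstract.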

The workflow of author contributions extraction


