Structuring Scholarly NLP Contributions in the Open Research Knowledge Graph

A Scholarly Contribution Graph

Scientific literature is growing rapidly, and researchers faced with this deluge of publications find it increasingly tedious, if not practically impossible, to keep up with research progress even within their own narrow discipline. The Open Research Knowledge Graph (ORKG) is posited as a solution to the problem of keeping track of research progress without the cognitive overload that reading dozens of full papers imposes. It aims to build a comprehensive knowledge graph that publishes the research contributions of scholarly publications per paper, where the contributions are interconnected via the graph even across papers.

With the NLPContributionGraph Shared Task, we have formalized the building of such a scholarly contributions-focused graph over NLP scholarly articles as an automated task.

Task Overview

NLPContributionGraph is defined on a dataset of NLP scholarly articles with their contributions structured to be integrable within Knowledge Graph infrastructures such as the ORKG.

The structured contribution annotations are provided as:

  1. Contribution sentences: a set of sentences about the contribution in the article;
  2. Scientific terms and relations: a set of scientific terms and relational cue phrases extracted from the contribution sentences; and
  3. Triples: semantic statements that pair scientific terms with a relation, modeled toward subject-predicate-object RDF statements for KG building. The Triples are organized under three (mandatory) or more information units (viz., ResearchProblem, Approach, Model, Code, Dataset, ExperimentalSetup, Hyperparameters, Baselines, Results, Tasks, Experiments, and AblationAnalysis).

Thus, the extraction task is defined in terms of these three dataset annotation elements, where extracting element 3 (triples) relies on having extracted element 2 (scientific terms and relations), which in turn depends on having extracted element 1 (contribution sentences). The dataset is explained in detail on the Data Page.
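
To make the layering of the three annotation elements concrete, here is a minimal Python sketch of one way they could be held in memory. It is not the official dataset format (see the Data Page for that). The class and method names (ContributionAnnotation, Triple, add_sentence, add_phrases, add_triple) are hypothetical, the example sentence is invented, and the rule that a triple must reuse at least one extracted phrase is a simplifying assumption used only to illustrate the dependency between the elements. The information unit names are the ones listed above.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Information unit names as listed in the task description.
INFORMATION_UNITS = frozenset({
    "ResearchProblem", "Approach", "Model", "Code", "Dataset",
    "ExperimentalSetup", "Hyperparameters", "Baselines", "Results",
    "Tasks", "Experiments", "AblationAnalysis",
})


@dataclass(frozen=True)
class Triple:
    """A subject-predicate-object statement (hypothetical container)."""
    subject: str
    predicate: str
    obj: str


@dataclass
class ContributionAnnotation:
    """Holds the three annotation elements for one article (illustrative only)."""
    sentences: List[str] = field(default_factory=list)               # element 1
    phrases: Dict[int, List[str]] = field(default_factory=dict)      # element 2
    triples: Dict[str, List[Triple]] = field(default_factory=dict)   # element 3

    def add_sentence(self, text: str) -> int:
        self.sentences.append(text)
        return len(self.sentences) - 1

    def add_phrases(self, sentence_index: int, phrases: List[str]) -> None:
        # Element 2 depends on element 1: phrases come from a contribution sentence.
        if not 0 <= sentence_index < len(self.sentences):
            raise IndexError("phrases must belong to an extracted contribution sentence")
        sentence = self.sentences[sentence_index]
        for phrase in phrases:
            if phrase not in sentence:
                raise ValueError(f"phrase {phrase!r} does not occur in the sentence")
        self.phrases.setdefault(sentence_index, []).extend(phrases)

    def add_triple(self, unit: str, triple: Triple) -> None:
        if unit not in INFORMATION_UNITS:
            raise ValueError(f"unknown information unit: {unit}")
        # Element 3 depends on element 2. Assumption for this sketch: a triple
        # must reuse at least one previously extracted phrase.
        known = {p for ps in self.phrases.values() for p in ps}
        if not known & {triple.subject, triple.predicate, triple.obj}:
            raise ValueError("triple uses no extracted phrase")
        self.triples.setdefault(unit, []).append(triple)


# Toy usage with an invented sentence (not taken from the dataset).
annotation = ContributionAnnotation()
idx = annotation.add_sentence("We propose a toy model for sentence classification.")
annotation.add_phrases(idx, ["toy model", "for", "sentence classification"])
annotation.add_triple("Model", Triple("toy model", "for", "sentence classification"))
```

Enforcing the order in the container itself (sentences first, then phrases, then triples) mirrors the dependency described above: an error in an earlier element surfaces before it can propagate into the triples.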

More background information about the task can be found in our pilot dataset annotation description paper; more about the ORKG can be found at DOI.


ncg.task [at]