OSDG publishes a new SDG-labelled text dataset

July 1, 2022

The OSDG Community platform continues to deliver data-driven tools to researchers worldwide. The OSDG team has recently published the fourth quarterly OSDG Community Dataset (OSDG-CD).

More than a thousand citizen scientists have contributed to the exercise so far. The newest dataset contains 32,431 text excerpts and a total of 217,147 assigned labels.

Ever since its first publication in September 2021, the dataset has accelerated impactful research on the SDGs. For instance, the recent article “A Methodology for Linking the Energy-Related Policies of the European Green Deal to the 17 SDGs Using Machine Learning” used our dataset to fine-tune a pre-trained BERT model. The article was written by Prof. Dr. Phoebe Koundouri (School of Economics of the Athens University of Economics and Business), Prof. Nicolaos Theodossiou (Aristotle University of Thessaloniki), Charalampos Stavridis (Aristotle University of Thessaloniki), Stathis Devves (Athens University of Economics and Business), and Angelos Plataniotis.

Have you used the dataset in your own research? Share your results with us and help us spread the word about your SDG impact. If you’re looking for ways to contribute to this project, please visit the Community platform page to find out more.