![who uses babelnet](https://i.pinimg.com/originals/32/85/82/328582b54bb369b57b36efdaf4411698.png)
The Sapienza authors gratefully acknowledge the support of the ERC Starting Grant MultiJEDI No.

Annotating the MASC corpus with BabelNet

Abstract: In this paper we tackle the problem of automatically annotating, with both word senses and named entities, the MASC 3.0 corpus, a large English corpus covering a wide range of genres of written and spoken text. We use BabelNet 2.0, a multilingual semantic network which integrates both lexicographic and encyclopedic knowledge, as our sense/entity inventory together with its semantic structure, to perform the aforementioned annotation task. Word sense annotated corpora have been around for more than twenty years, helping the development of Word Sense Disambiguation algorithms by providing both training and testing grounds. More recently Entity Linking has followed the same path, with the creation of huge resources containing annotated named entities. However, to date, there has been no resource that contains both kinds of annotation. In this paper we present an automatic approach for performing this annotation, together with its output on the MASC corpus. We use this corpus because its goal of integrating different types of annotations goes exactly in our same direction. Our overall aim is to stimulate research on the joint exploitation and disambiguation of word senses and named entities. Finally, we estimate the quality of our annotations using both manually-tagged named entities and word senses, obtaining an accuracy of roughly 70% for both named entities and word sense annotations.
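The abstract reports accuracy separately for word senses and named entities, measured against manually tagged annotations. As a rough illustration of what such a per-type accuracy check could look like, here is a minimal Python sketch. It is not the authors' evaluation code: the span keys, the type names, the labels, and the function `accuracy_by_type` are all hypothetical, and the toy data is made up purely to exercise the function.

```python
from collections import defaultdict


def accuracy_by_type(gold, predicted):
    """Return {annotation_type: accuracy} computed over the gold spans.

    Both arguments map a span key to an (annotation_type, label) pair.
    A gold span that is missing from `predicted`, or that carries a
    different label there, counts as an error.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for span, (ann_type, gold_label) in gold.items():
        total[ann_type] += 1
        pred = predicted.get(span)
        if pred is not None and pred[1] == gold_label:
            correct[ann_type] += 1
    return {ann_type: correct[ann_type] / total[ann_type] for ann_type in total}


# Toy data (hypothetical): span key = (document id, token start, token end).
# Labels are placeholders, not real BabelNet identifiers.
gold = {
    ("doc1", 0, 1): ("word_sense", "sense_a"),
    ("doc1", 3, 4): ("word_sense", "sense_b"),
    ("doc1", 7, 8): ("word_sense", "sense_c"),
    ("doc2", 2, 3): ("word_sense", "sense_d"),
    ("doc1", 5, 7): ("named_entity", "entity_x"),
    ("doc2", 0, 2): ("named_entity", "entity_y"),
}
predicted = {
    ("doc1", 0, 1): ("word_sense", "sense_a"),
    ("doc1", 3, 4): ("word_sense", "sense_b"),
    ("doc1", 7, 8): ("word_sense", "sense_z"),    # wrong sense
    ("doc2", 2, 3): ("word_sense", "sense_d"),
    ("doc1", 5, 7): ("named_entity", "entity_x"),
    # ("doc2", 0, 2) is not annotated at all, so it counts as an error
}

scores = accuracy_by_type(gold, predicted)
print(scores)
```

Counting unannotated gold spans as errors is one possible design choice; an evaluation that scores only the spans the system actually annotated would report precision rather than accuracy over the gold set.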
![who uses babelnet](http://nlp.uniroma1.it/media/images/2021/02/babelnet_5_post.jpeg)
LIG - Laboratoire d'Informatique de Grenoble (UMR 5217), Laboratoire LIG, Bâtiment IMAG, 700 avenue Centrale, Domaine Universitaire de Saint-Martin-d'Hères. Postal address: CS 40700, 38058 Grenoble cedex 9, France.

Inria - Institut National de Recherche en Informatique et en Automatique (Domaine de Voluceau, 78153 Le Chesnay Cedex, France). StructId: 300009.

Inria Grenoble - Rhône-Alpes (Inovallée, 38330 Montbonnot, France). StructId: 2497.

EXMO - Computer mediated exchange of structured knowledge (Inria Grenoble - Rhône-Alpes, 655 avenue de l'Europe, Montbonnot, 38334 Saint Ismier Cedex, France). StructId: 44971.