The MEANTIME Corpus (the NewsReader Multilingual Event ANd TIME Corpus) consists of 480 news articles: 120 English Wikinews (http://en.wikinews.org/) articles on four topics (Airbus and Boeing; Apple Inc.; Stock market; and General Motors, Chrysler and Ford) and their translations into Spanish, Italian, and Dutch, giving 120 articles per language.
It has been annotated manually at multiple levels, including entities, events, temporal information, semantic roles, and intra-document and cross-document event and entity coreference.
For a more detailed description see the video (on YouTube) or the slides.
The NewsReader MEANTIME corpus is licensed under a Creative Commons Attribution 4.0 International License.
If you use this corpus, please cite the following paper:
Anne-Lyse Minard, Manuela Speranza, Ruben Urizar, Begona Altuna, Marieke van Erp, Anneleen Schoen, and Chantal van Son. 2016. MEANTIME, the NewsReader Multilingual Event and Time Corpus. In Proceedings of the 10th Language Resources and Evaluation Conference (LREC 2016), European Language Resources Association (ELRA), Portorož, Slovenia.
Download data
Manually annotated data (version 1.0)
- MEANTIME – English section: meantime_newsreader_english.zip
- MEANTIME – Spanish section: meantime_newsreader_spanish.zip
- MEANTIME – Dutch section: meantime_newsreader_dutch.zip
- MEANTIME – Italian section: meantime_newsreader_italian.zip
- MEANTIME – agreement data: agreement-MEANTIME-corpus.zip
Raw texts (NAF format)
- English articles: meantime_newsreader_english_raw_NAF.zip
- Spanish articles: meantime_newsreader_spanish_raw_NAF.zip
- Italian articles: meantime_newsreader_italian_raw_NAF.zip
- Dutch articles: meantime_newsreader_dutch_raw_NAF.zip
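The raw-text archives can be inspected with standard tooling, since NAF is an XML format. The following is a minimal sketch, not an official loader: it assumes each archive member is a NAF document whose root element contains a `raw` element with the article text and, optionally, a `text` layer of `wf` (word form) tokens. The function names `read_naf` and `iter_naf_archive`, and the file-extension filter, are illustrative assumptions; check them against the actual archive contents.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def read_naf(xml_bytes):
    """Return (raw_text, tokens) from one NAF document given as bytes.

    Assumes the NAF layout <NAF><raw>...</raw><text><wf>...</wf></text></NAF>.
    Missing layers yield an empty string or an empty list.
    """
    root = ET.fromstring(xml_bytes)
    raw = root.findtext("raw", default="")
    tokens = [wf.text or "" for wf in root.iterfind("text/wf")]
    return raw, tokens


def iter_naf_archive(zip_source):
    """Yield (member_name, raw_text, tokens) for each NAF/XML file in a zip.

    zip_source may be a path or a file-like object; the extension filter
    is an assumption about how the archive members are named.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for name in sorted(archive.namelist()):
            if name.lower().endswith((".naf", ".xml")):
                raw, tokens = read_naf(archive.read(name))
                yield name, raw, tokens
```

For example, `for name, raw, tokens in iter_naf_archive("meantime_newsreader_english_raw_NAF.zip"): print(name, len(raw))` would list each document with the length of its raw text, provided the archive has been downloaded to the working directory.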
Shared tasks
SemEval 2015
The English section has been used as trial and evaluation data for the Task “TimeLine: Cross-Document Event Ordering” at SemEval 2015.
In this context timelines have been created from the annotated articles.
For more information please visit the task’s website: http://alt.qcri.org/semeval2015/task4/.
CLIN26
The Dutch section of the MEANTIME corpus has been used for the CLIN26 Shared Task, the first collocated Shared Task for Dutch.
For more information please visit the task’s website: http://wordpress.let.vupr.nl/clin26/shared-task/.
FactA at EVALITA 2016
A revised version of the Italian section of the MEANTIME corpus has been used for the FactA task at EVALITA 2016. For more information please visit the task’s website: http://facta-evalita2016.fbk.eu.
Annotation Guidelines
- Sara Tonelli, Rachele Sprugnoli, Manuela Speranza and Anne-Lyse Minard (2014) NewsReader Guidelines for Annotation at Document Level. NWR-2014-2-2. Version FINAL (Aug 2014). Fondazione Bruno Kessler.
- Manuela Speranza, Rubén Urizar and Anne-Lyse Minard. NewsReader Italian and Spanish specific Guidelines for Annotation at Document Level. NWR-2014-6. DRAFT version. Fondazione Bruno Kessler.
- Anneleen Schoen, Chantal van Son, Marieke van Erp and Hennie van der Vliet. NewsReader Document-Level Annotation Guidelines – Dutch. NWR-2014-08. VU University Amsterdam.
- Manuela Speranza and Anne-Lyse Minard. Cross-Document Annotation Guidelines. NWR-2014-9. Fondazione Bruno Kessler.
Publications
- Anne-Lyse Minard, Manuela Speranza, Ruben Urizar, Begona Altuna, Marieke van Erp, Anneleen Schoen, and Chantal van Son. 2016. MEANTIME, the NewsReader Multilingual Event and Time Corpus. In Proceedings of the 10th Language Resources and Evaluation Conference (LREC 2016), European Language Resources Association (ELRA), Portorož, Slovenia.
- Manuela Speranza and Anne-Lyse Minard. 2015. Cross-language projection of multilayer semantic annotation in the NewsReader Wikinews Italian corpus (WItaC). In Proceedings of the Second Italian Conference on Computational Linguistics (CLiC-it 2015).
- Anne-Lyse Minard, Manuela Speranza, Eneko Agirre, Itziar Aldabe, Marieke van Erp, Bernardo Magnini, German Rigau and Ruben Urizar. 2015. SemEval-2015 Task 4: TimeLine: Cross-Document Event Ordering. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015). http://www.aclweb.org/anthology/S15-2132
Technical Report
- Marieke van Erp, Piek Vossen, Rodrigo Agerri, Anne-Lyse Minard, Manuela Speranza, Ruben Urizar, Egoitz Laparra, Itziar Aldabe, and German Rigau. 2015. Annotated Data, version 2. Technical Report D3-3-2, VU Amsterdam. http://www.newsreader-project.eu/files/2012/12/NWR-D3-3-2.pdf.