Together with Giuseppe Rizzo and Raphaël Troncy from EURECOM, NewsReader team member Marieke van Erp participated in the MSM2013 Concept Extraction Challenge with a system called NERD-ML, which was developed to identify and classify named entities in microposts (tweets). The system ranked 2nd in the challenge, with an F-score only 0.01 below that of the best system.
NERD-ML is built on top of the NERD framework, developed at EURECOM, which makes it easy to query different named entity extractors that are available on the web. NERD-ML extends this framework with an additional machine learning layer that takes the output of the individual extractors and learns which extractors perform best on which data. This improves the performance of the system over that of the individual extractors.
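To make the idea of a learning layer on top of several extractors more concrete, here is a minimal sketch in Python. It is not the NERD-ML implementation and does not use the NERD API: the extractor names, the labels, the training data, and the functions `learn_weights` and `combine` are all hypothetical, and the simple smoothed-accuracy weighting stands in for whatever classifier a real system would train. It only illustrates the general pattern of learning from labelled data how much to trust each extractor for each label, and then combining their outputs accordingly.

```python
from collections import defaultdict


def learn_weights(training_examples):
    """Estimate how reliable each (extractor, label) pair is.

    training_examples: list of (predictions, gold_label) tuples, where
    predictions maps a (hypothetical) extractor name to the label it assigned.
    Returns a dict mapping (extractor, label) to a smoothed accuracy.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for predictions, gold in training_examples:
        for extractor, label in predictions.items():
            total[(extractor, label)] += 1
            if label == gold:
                correct[(extractor, label)] += 1
    # Add-one smoothing so that rarely seen pairs are not trusted blindly.
    return {key: (correct[key] + 1) / (total[key] + 2) for key in total}


def combine(predictions, weights, default_weight=0.5):
    """Pick the label with the highest summed reliability weight."""
    scores = defaultdict(float)
    for extractor, label in predictions.items():
        scores[label] += weights.get((extractor, label), default_weight)
    # Sorting first makes tie-breaking deterministic (alphabetical order).
    return max(sorted(scores), key=scores.get)


# Toy training data (invented for illustration only).
training = [
    ({"extractor_a": "PER", "extractor_b": "PER", "extractor_c": "LOC"}, "PER"),
    ({"extractor_a": "LOC", "extractor_b": "ORG", "extractor_c": "LOC"}, "LOC"),
    ({"extractor_a": "PER", "extractor_b": "ORG", "extractor_c": "ORG"}, "ORG"),
    ({"extractor_a": "PER", "extractor_b": "PER", "extractor_c": "PER"}, "PER"),
]
weights = learn_weights(training)

# Combine the outputs of the three extractors for a new, unseen mention.
label = combine(
    {"extractor_a": "PER", "extractor_b": "ORG", "extractor_c": "ORG"}, weights
)
```

In this toy setup, an extractor that has often been wrong when predicting a given label contributes less to the final decision for that label, which is one simple way a combined system can outperform each individual extractor.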
Giuseppe Rizzo presented NERD-ML at the Making Sense of Microposts workshop at WWW2013 in Rio de Janeiro this week. You can find the preprint of the abstract describing the system here and the presentation below.