Information Extraction

Machine Learning vs. Rules and Out-of-the-Box vs. Retrained: An Evaluation of Open-Source Bibliographic Reference and Citation Parsers

Our paper “Machine Learning vs. Rules and Out-of-the-Box vs. Retrained: An Evaluation of Open-Source Bibliographic Reference and Citation Parsers” was recently accepted and will be presented at the Joint Conference on Digital Libraries 2018. Abstract: Bibliographic reference parsing refers to extracting machine-readable metadata, such as the names of the authors, the title, or journal name, from bibliographic reference strings. Many approaches to this problem have been proposed so far, including regular expressions, knowledge bases and supervised machine learning. Many open source reference parsers based on various algorithms are also available. In this paper, we apply, evaluate and compare ten reference parsing tools in a specific business use case. The tools are Anystyle-Parser, Biblio, CERMINE, Citation, Citation-Parser, GROBID, ParsCit, PDFSSA4MET, Reference Tagger Read more…

By Dominika Tkaczyk
Information Extraction

Who Did What? Identifying Author Contributions in Biomedical Publications using Naïve Bayes

Our paper “Who Did What? Identifying Author Contributions in Biomedical Publications using Naïve Bayes” was recently accepted and will be presented at the Joint Conference on Digital Libraries 2018. Abstract: Creating scientific publications is a complex process. It is composed of a number of different activities, such as designing the experiments, analyzing the data, and writing the manuscript. Information about the contributions of individual authors of a paper is important for assessing authors’ scientific achievements. Some biomedical publications contain a short section written in natural language, which describes the roles each author played in the process of preparing the article. In this paper, we present a study of authors’ roles commonly appearing in these sections, and propose an algorithm for automatic Read more…

By Dominika Tkaczyk
Machine Learning

Call for Marie Curie Individual Fellowships: We are open to supervising projects relating to recommender systems, machine learning, and NLP here at TCD Dublin

The European Union has published the call for Individual Marie Curie Fellowships (MSCA), with an application deadline of 12 September 2018. The goal of the Individual Fellowships is to enhance the creative and innovative potential of experienced researchers. Our group already has one Marie Curie fellow, i.e. a postdoctoral researcher, as part of the EU/SFI EDGE fellowship programme. However, we are open to supervising more postdoctoral researchers. If you are interested in applying for the Individual Marie Curie Fellowship and need a supervisor, please contact us. We are particularly interested in projects relating to machine learning (machine translation, machine-learning evaluation, novel machine-learning algorithms, curriculum learning), recommender systems, and natural language processing.

By Joeran Beel
Partnerships

Visiting Professorship: We intensify our collaboration with the NII in Tokyo

Today I was appointed Visiting Professor at the National Institute of Informatics (NII), effective 1 April 2018, for the next four years. I am very grateful for the generous support of the NII, and I am looking forward to visiting the NII approximately once or twice a year for a few weeks to collaborate on research relating to recommender systems, machine learning, natural language processing and our other research areas. I have been working closely with the NII in Tokyo for quite a while, in particular with the Digital Content and Media Division and Prof Dr Akiko Aizawa. As such, I am glad to continue and intensify the collaboration for at least four years.

By Joeran Beel
Conferences

AICS’2018: We Co-Organize the 26th Irish Conference on Artificial Intelligence and Cognitive Science

We are delighted to announce the 26th Irish Conference on Artificial Intelligence and Cognitive Science (AICS’2018), which we will co-organize with Rob Brennan, Ruth Byrne, Jeremy Debattista, and a renowned program committee. AICS 2018 takes place from December 6 to 7, 2018 at Trinity College Dublin, more precisely in the Long Room Hub. The deadline for submissions is 30 September 2018. There will be three tracks for submissions, namely full papers, NECTAR papers, and student papers. The call for papers particularly invites papers relating to machine learning, machine translation, neural networks, data mining, cognitive modelling, behaviour epistemology, evolutionary computation, recommender systems, collective intelligence, human learning, and several more. AICS 2018 is sponsored by the ADAPT Research Centre, Trinity Long Room Hub, and Trinity College Dublin. AICS dates Read more…

By Joeran Beel
Internal

Our new website is live!

Today, we launched our new website https://www.scss.tcd.ie/joeran.beel/. It provides lots of information about our research, publications, projects, and teaching relating to recommender systems, machine learning and more. The new website also combines the blog posts of our project websites Mr. DLib and Docear.  

By Joeran Beel
Jobs / Career

We Are Hiring: 1 Software/Machine-Learning Engineer & 1 Software Architect / Product Owner for a Recommender-System Business Start-up

UPDATE: We will soon advertise another position for this start-up. Please come back in a few days. The School of Computer Science and Statistics of Trinity College Dublin and the ADAPT Centre received funding to hire 2 employees for 2 years* to spin out a business start-up in the field of recommender systems as-a-service and machine learning in Dublin. The two positions are to be filled by one machine-learning engineer and one software architect/product manager, and both employees are expected to work together very closely. They will be responsible for developing a recommender system as-a-service that uses a unique technology, based on the research of Prof Dr Joeran Beel, who will be the project lead (read here for a brief outline of the Read more…

By Joeran Beel
Jobs & Internships

We welcome two DAAD interns in recommender systems and machine learning (Dublin & Tokyo)

As part of the DAAD RISE Worldwide program, we were awarded two funded internship positions for two undergraduate students from Germany. The two interns will conduct a research project as part of Mr. DLib in the fields of recommender systems, machine learning and natural language processing. Gordian (University of Munich / LMU) and Martin (University of Göttingen) will spend around three months with us over the summer: Gordian at the National Institute of Informatics in Tokyo, and Martin at the ADAPT Centre and School of Computer Science at Trinity College Dublin.

By Joeran Beel
Mr. DLib

Mr. DLib Recommendations-as-a-Service v1.3: “Word Embeddings” and Many Minor Improvements and Bug Fixes

We released version 1.3 of Mr. DLib’s Recommender-System as-a-Service. The major new feature is recommendations based on “word embeddings”. We are excited to see how the new recommendations will perform with our partners. In addition, we fixed many small bugs and added some minor improvements. A complete overview can be found in JIRA.

By Joeran Beel
Mr. DLib

Mr. DLib v1.2.1: Improved keyphrase recommendations and Apache Lucene query handling

The new version of our recommender system completes 104 issues and significantly improves the recommendations. The most notable improvements are:
- We improved the keyphrase extraction process in the recommender system, i.e. keyphrases are now stored differently in Lucene. We expect better recommendation effectiveness and are currently running an A/B test.
- More robust path encoding for search queries (special characters in a URL caused errors)
- Lucene’s eDismax function is A/B tested (together with Lucene’s standard query parser)
- Improved queries for the CORE recommender system (their system needs queries to be of a certain length; Mr. DLib now simply repeats the queries until they are at least 50 characters long)
- Abstracts and keywords in the XML response of Mr. DLib are enclosed in <![CDATA[
- HTML Snippet is improved Read more…
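
The query padding for CORE mentioned above can be illustrated with a minimal sketch. This is not Mr. DLib's actual code: the function name `pad_query`, the single-space separator, and the handling of empty queries are assumptions made for illustration. Only the 50-character minimum comes from the release notes.

```python
def pad_query(query: str, min_length: int = 50) -> str:
    """Repeat a query until it is at least min_length characters long.

    Hypothetical helper: the name, the space separator and the
    empty-query behaviour are assumptions, not Mr. DLib's implementation.
    """
    query = query.strip()
    if not query:
        # An empty query cannot be lengthened by repetition.
        return query
    padded = query
    while len(padded) < min_length:
        # Append another copy of the original query terms.
        padded = f"{padded} {query}"
    return padded


if __name__ == "__main__":
    print(pad_query("recommender systems"))
```

Queries that already meet the minimum length are returned unchanged, so the padding only affects short queries.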

By Joeran Beel