Mr. DLib v1.2.1: Improved keyphrase recommendations and Apache Lucene query handling

Published by Joeran Beel on

Mr. DLib: Recommendations-as-a-Service

The new version of our recommender system closes 104 issues and significantly improves the recommendations. The most notable improvements are:

  • We improved the keyphrase extraction process in the recommender system, i.e. keyphrases are now stored differently in Lucene. We expect better recommendation effectiveness and are currently running an A/B test.
  • Path encoding for search queries is more robust (previously, special characters in a URL caused errors)
  • The eDisMax query parser is being A/B tested against Lucene’s standard query parser
  • Improved queries for the CORE recommender system (their system needs queries to be of a certain length; Mr. DLib now simply repeats the query until it is at least 50 characters long)
  • Abstracts and keywords in the XML response of Mr. DLib are now enclosed in CDATA sections (<![CDATA[ ... ]]>)
  • The HTML snippet is improved (better layout for recommendations in JabRef), i.e. spaces were added, and “NULL” elements are no longer shown
  • For both queries and Lucene indexes, only lowercase is used (previously, we used upper and lower case inconsistently, so not all documents were considered for recommendations)
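
To make some of the changes above more concrete, here is a minimal Python sketch of how query padding, lowercasing, path encoding, and CDATA wrapping could look. It is an illustration only and not Mr. DLib's actual code: all function names are hypothetical, and joining the repeated query with a space is an assumption. Only the 50-character minimum comes from the list above.

```python
from urllib.parse import quote

# Minimum query length mentioned in the release notes (for the CORE recommender).
MIN_QUERY_LENGTH = 50


def pad_query(query: str, min_length: int = MIN_QUERY_LENGTH) -> str:
    """Repeat the query until it is at least min_length characters long.

    Assumption: repetitions are joined with a single space.
    """
    query = query.strip()
    if not query:
        return query  # nothing to repeat
    padded = query
    while len(padded) < min_length:
        padded += " " + query
    return padded


def normalize_case(text: str) -> str:
    """Lowercase text so that queries and indexed terms use the same case."""
    return text.lower()


def encode_path_segment(query: str) -> str:
    """Percent-encode a query so it can be used as one URL path segment.

    safe="" also encodes "/", which would otherwise split the path.
    """
    return quote(query, safe="")


def wrap_cdata(text: str) -> str:
    """Wrap text in an XML CDATA section.

    The sequence "]]>" cannot appear inside CDATA, so it is split
    across two adjacent CDATA sections.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


if __name__ == "__main__":
    print(pad_query("recommender systems"))
    print(encode_path_segment(normalize_case("C++ & Java/Lucene?")))
    print(wrap_cdata("Precision < recall & more"))
```

Applying the same `normalize_case` step at indexing time and at query time is what keeps the two sides consistent; lowercasing only one of them would reproduce the mismatch described in the last item.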

