Mr. DLib

RARD: The Related-Article Recommendation Dataset

We are proud to announce the release of ‘RARD’, the related-article recommendation dataset from the digital library Sowiport and the recommendation-as-a-service provider Mr. DLib. The dataset contains information about 57.4 million recommendations that were displayed to the users of Sowiport. Information includes details on which recommendation approaches were used (e.g. content-based filtering, stereotype, most popular), what types of features were used in content-based filtering (simple terms vs. keyphrases), where the features were extracted from (title or abstract), and the times at which recommendations were delivered and clicked. In addition, the dataset contains an implicit item-item rating matrix that was created based on the recommendation click logs. RARD enables researchers to train machine learning algorithms for research-paper recommendations, perform offline evaluations, and Read more…

By Joeran Beel
Mr. DLib

Several new publications: Mr. DLib, Lessons Learned, Choice Overload, Bibliometrics (Mendeley Readership Statistics), Apache Lucene, CC-IDF, TF-IDuF

In the past few weeks, we published (or received acceptance notices for) a number of papers related to Mr. DLib, research-paper recommender systems, and recommendations-as-a-service. Many of them were written during our time at the NII or in collaboration with the NII. Here is the list of publications: Beel, Joeran, Bela Gipp, and Akiko Aizawa. “Mr. DLib: Recommendations-as-a-Service (RaaS) for Academia.” In Proceedings of the ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL), 2017. Beel, Joeran. “Real-World Recommender Systems for Academia: The Gain and Pain in Developing, Operating, and Researching them.” In 5th International Workshop on Bibliometric-enhanced Information Retrieval (BIR) at the 39th European Conference on Information Retrieval (ECIR), 2017. [short version, official], [long version, arxiv] Beierle, Felix, Akiko Aizawa, and Joeran Beel. Read more…

By Joeran Beel
Machine Learning

Some numbers about Mr. DLib’s Recommendations-as-a-Service (RaaS)

Six months ago, we launched Mr. DLib’s recommendations-as-a-service for Academia. Time to look back and provide some numbers: Since September 2016, Mr. DLib’s recommender system has delivered 60,836,800 recommendations to our partner Sowiport, and Sowiport’s users have clicked 91,545 of the recommendations. This equals an overall click-through rate (CTR) of 0.15%. The figure shows the number of delivered recommendations and CTR by month (2016-09-08 to 2017-02-11). CTR is rather low and there is a notable variance among the months (e.g. 0.21% in September and 0.10% in December). The variance may be caused by the different algorithms we are experimenting with. In addition, recommendations are also delivered when web spiders such as Google Bot are crawling our partner website Sowiport.de. In contrast, clicks are Read more…
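The overall CTR quoted in this post follows directly from the two totals. As a minimal sketch (the function name `click_through_rate` is our own choice for illustration, not part of Mr. DLib), the calculation looks like this:

```python
def click_through_rate(clicks: int, delivered: int) -> float:
    """Return clicks divided by delivered recommendations, as a fraction."""
    if delivered <= 0:
        raise ValueError("delivered must be a positive number")
    return clicks / delivered


# Totals reported in the post for 2016-09-08 to 2017-02-11.
delivered = 60_836_800
clicks = 91_545

ctr = click_through_rate(clicks, delivered)
print(f"CTR: {ctr:.2%}")  # prints "CTR: 0.15%"
```

Note that this ratio counts every delivered recommendation in the denominator, including those delivered to web spiders.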

By Joeran Beel
Publications

Paper accepted at ISI conference in Berlin: “Stereotype and Most-Popular Recommendations in the Digital Library Sowiport”

Our paper titled “Stereotype and Most-Popular Recommendations in the Digital Library Sowiport” has been accepted for publication at the 15th International Symposium on Information Science (ISI) in Berlin. Abstract: Stereotype and most-popular recommendations are widely neglected in the research-paper recommender-system and digital-library community. In other domains such as movie recommendations and hotel search, however, these recommendation approaches have proven their effectiveness. We were interested to find out how stereotype and most-popular recommendations would perform in the scenario of a digital library. Therefore, we implemented the two approaches in the recommender system of GESIS’ digital library Sowiport, in cooperation with the recommendations-as-a-service provider Mr. DLib. We measured the effectiveness of most-popular and stereotype recommendations with click-through rate (CTR) based on 28 million delivered Read more…

By Joeran Beel
Machine Learning

Two of our papers about citation and term-weighting schemes got accepted at iConference 2017

Two of our papers about weighting citations and terms in the context of user modeling and recommender systems got accepted at the iConference 2017. Here are the abstracts, and links to the pre-print versions: Evaluating the CC-IDF citation-weighting scheme: How effectively can ‘Inverse Document Frequency’ (IDF) be applied to references? In the domain of academic search engines and research-paper recommender systems, CC-IDF is a common citation-weighting scheme that is used to calculate semantic relatedness between documents. CC-IDF adopts the principles of the popular term-weighting scheme TF-IDF and assumes that if a rare academic citation is shared by two documents then this occurrence should receive a higher weight than if the citation is shared among a large number of documents. Although CC-IDF Read more…
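The weighting idea described in the abstract (a rarely cited reference shared by two documents counts for more than a widely cited one) can be illustrated with a small sketch. It assumes the common `log(N / df)` form of IDF and a simple sum over shared references; the exact formula evaluated in the paper may differ, and all names and data below are hypothetical.

```python
import math
from collections import Counter


def citation_idf(references_by_doc):
    """Map each cited reference to log(N / number of documents citing it)."""
    n_docs = len(references_by_doc)
    doc_freq = Counter(
        ref for refs in references_by_doc.values() for ref in set(refs)
    )
    return {ref: math.log(n_docs / count) for ref, count in doc_freq.items()}


def cc_idf_similarity(doc_a, doc_b, references_by_doc, idf):
    """Sum the IDF weights of the references that both documents cite."""
    shared = set(references_by_doc[doc_a]) & set(references_by_doc[doc_b])
    return sum(idf[ref] for ref in shared)


# Toy corpus (made up): r2 is cited by every document, r1 by only two.
references_by_doc = {
    "d1": ["r1", "r2"],
    "d2": ["r1", "r2"],
    "d3": ["r2", "r3"],
    "d4": ["r2"],
}

idf = citation_idf(references_by_doc)
sim_d1_d2 = cc_idf_similarity("d1", "d2", references_by_doc, idf)
sim_d1_d3 = cc_idf_similarity("d1", "d3", references_by_doc, idf)
print(sim_d1_d2, sim_d1_d3)
```

In this toy corpus, sharing the ubiquitous reference `r2` contributes nothing, so `d1` and `d3` score zero, while `d1` and `d2` are related through the rarer `r1`.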

By Joeran Beel
Recommendations as-a-Service (RaaS)

Enhanced re-ranking in our recommender system based on Mendeley’s readership statistics

Content-based filtering recommendations suffer from the problem that no human quality assessments are taken into account. This means a poorly written paper p_poor would be considered as relevant for a given input paper p_input as a high-quality paper p_quality if p_quality and p_poor contain the same words. We alleviate this problem by using Mendeley’s readership data to re-rank Mr. DLib’s recommendations. This means that once we have a number of documents (e.g. 20) that are related to a requested input paper, we re-rank these documents based on the number of readers they have on Mendeley. The most-read papers are then recommended. More details will follow.
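A minimal sketch of this re-ranking step is shown below. It is an illustration under our own assumptions, not Mr. DLib’s implementation: the `Candidate` type, the tie-break on the text score, and the sample data are all hypothetical, and the reader counts are assumed to have been fetched from Mendeley beforehand.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    doc_id: str        # identifier of a content-based candidate
    text_score: float  # content-based relevance score (hypothetical)
    readers: int       # Mendeley reader count, assumed to be known already


def rerank_by_readership(candidates: List[Candidate], top_n: int) -> List[Candidate]:
    """Sort candidates by reader count (ties broken by text score), keep top_n."""
    ranked = sorted(
        candidates,
        key=lambda c: (c.readers, c.text_score),
        reverse=True,
    )
    return ranked[:top_n]


# Candidates as returned by content-based filtering (made-up values).
candidates = [
    Candidate("doc-a", 0.91, 3),
    Candidate("doc-b", 0.88, 120),
    Candidate("doc-c", 0.85, 47),
    Candidate("doc-d", 0.80, 120),
]

top = rerank_by_readership(candidates, top_n=2)
print([c.doc_id for c in top])  # prints ['doc-b', 'doc-d']
```

Here the candidate with the best text score (`doc-a`) drops out because it has few readers, which is exactly the trade-off this kind of re-ranking makes.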

By Joeran Beel
Academia

Various positions to work on research-paper recommender systems (Mr. DLib) and Docear (Bachelor/Master/PhD/Post-Doc)

Update on 2018-03-15: We Are Hiring 1 Software Engineer & 1 Software Architect / Product Owner for a Recommender-System Business Start-up. Update on 2017-08-14: Here at Docear and Mr. DLib we have many exciting projects in the field of recommender systems, user modelling, personalisation, and adaptive systems (primarily with a focus on digital libraries, but we are also open to domains such as health care, transportation, and tourism). If you are interested in pursuing any of the projects as part of a Bachelor, Master, or PhD thesis, as a post-doctoral researcher, or as a short-term research internship, read on. The projects: We have a number of interesting projects, but please consider the following list only as a suggestion. If you have Read more…

By Joeran Beel
Help Wanted

Students & PostDocs: We have open positions in Tokyo, Copenhagen, and Konstanz (2-24 months)

Update 2016-01-12: The salary in Tokyo would be around 1,600 US$ per month, not 1,400. 2015 has been a rather quiet year for Docear, but 2016 will be different. We have lots of ideas for new projects, and even better, we have funding to pay at least 1 Master or PhD student to help us implement the ideas. There is also a good chance that we will get more funding, maybe also for Bachelor students and postdoctoral researchers. The positions will be located in Tokyo, Copenhagen, or Konstanz (Germany). Below is a list of potential projects. If you are interested, please apply, and if you have your own ideas, do not hesitate to discuss them with us. What exactly you Read more…

By Joeran Beel