Call for Donation: (Automatic) PDF Metadata Extraction and Renaming
Done! We’ve got all the money we need. Thank you very much! Read on here…
One of Docear’s biggest disadvantages compared to other reference managers is its rather poor PDF metadata extraction. It is therefore no surprise that the second most popular feature request is to add decent PDF metadata extraction and file renaming to Docear. However, adding such a function is a lot of work, and we currently do not have the manpower for it. Fortunately, Christoph, one of our best students, who has already done a lot of work for us, wants a paid job for his semester break. If we could pay him 1,800 Euros, he would love to implement PDF metadata extraction during the break, and we have no doubt that he is capable of doing it. The problem is that we don’t have the funds to pay him.
Therefore, we would like to start a call for donations: if you want decent PDF metadata extraction in Docear, please donate before February 28, 2014. We need 1,800 Euros to pay Christoph for four weeks of almost full-time work, starting at the end of February.
During the four weeks, Christoph would (begin to) implement the following work packages (see also GitHub for more details, and to follow the development):
1. Improved PDF metadata extraction dialog
Right now, retrieving metadata in Docear is quite annoying. You need to select a PDF, select the entry in the menu and go through several dialogs. We want a single dialog in which all options are combined and that could look like this:
This means that when you want to create a new reference for a PDF, a dialog opens in which you can choose to a) create a blank entry, b) retrieve metadata, or c) create a new entry based on the PDF’s XMP metadata. For b), the PDF’s title is immediately extracted and shown in the dialog. Via the lookup button, additional metadata can be retrieved and selected from the list.
2. Request metadata from Google Scholar
Docear’s digital library is rather small and not always available. Therefore, we would love to additionally request metadata directly from Google Scholar. Docear could send a title, extracted from the PDF, as a search query to Google Scholar and show the search results so that you can import them into Docear as BibTeX. We would also need an option for users to enter a captcha when Google Scholar blocks someone’s IP.
3. Auto Retrieve metadata for PDF files
We would also like to auto-retrieve metadata for all your PDF files in the background. We know that Google Scholar only allows a few dozen requests per day, but we could limit Docear to requesting metadata via Google Scholar for, e.g., 50 PDF files per day.
4. Auto Rename PDF files based on BibTeX metadata
A function that renames all your PDF files according to your metadata. You could specify the pattern by which PDF files are renamed (e.g. [Author]_[Year].pdf).
5. Sort PDF files based on BibTeX metadata
A function that sorts your PDF files based on the metadata into folders like \year\author\filename.pdf in both the physical folder structure and the mind map structure.
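To make work packages 4 and 5 a little more concrete, here is a minimal sketch of how a rename pattern and a folder layout could be derived from one BibTeX entry. It is only an illustration, not Docear's implementation: the function names (`safe`, `render_name`, `target_path`), the lower-case field names and the "unknown" fallback are all assumptions, and Python is used purely for brevity. The sketch only computes names and does not touch any files.

```python
import re
from pathlib import PurePosixPath


def safe(value: str) -> str:
    """Replace whitespace and characters that are unsafe in file names."""
    return re.sub(r'[\\/:*?"<>|\s]+', "_", value.strip())


def render_name(pattern: str, entry: dict) -> str:
    """Fill placeholders such as [Author] or [Year] from a BibTeX-like entry.

    Assumption: placeholders map to lower-case field names, and a missing
    field is rendered as "unknown".
    """
    def fill(match: re.Match) -> str:
        field = match.group(1).lower()
        return safe(entry.get(field, "unknown"))

    return re.sub(r"\[(\w+)\]", fill, pattern)


def target_path(entry: dict, pattern: str = "[Author]_[Year].pdf") -> PurePosixPath:
    """Build a year/author/filename path for the entry (hypothetical layout)."""
    year = safe(entry.get("year", "unknown"))
    author = safe(entry.get("author", "unknown"))
    return PurePosixPath(year, author, render_name(pattern, entry))


# Hypothetical entry, used for illustration only.
entry = {"author": "Beel", "year": "2014", "title": "Docear"}
new_name = render_name("[Author]_[Year].pdf", entry)
new_path = target_path(entry)
```

With the example entry, `new_name` would be `Beel_2014.pdf` and `new_path` would be `2014/Beel/Beel_2014.pdf`. Forward slashes are used here only to keep the result identical on every platform; a real implementation would also have to handle name collisions and multiple authors.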
Implementing all five items is a lot of work, and we cannot promise that Christoph will be able to do all of them in four weeks. However, if the required 1,800 Euros are donated, we can promise to deliver 1. and 2., most likely also 3., and maybe even 4. within two months (if Christoph does not finish the job entirely, we will help out).
If you want these features, please donate! In the unlikely case that Christoph does not manage to implement at least 1. and 2. by the end of April, you will get your money back. Similarly, if we receive less than 1,800 Euros in donations and Christoph does not start the job, you will also get your money back if you want. And of course, we will ensure that everything Christoph develops is maintained by us in the long run. It’s probably needless to say, but all of Christoph’s work will be open source, so others can use it for their projects as well if they like.
If you live in the European Union, you may also use a bank transfer instead of PayPal. In this case, please transfer the money to Bank: Postbank Frankfurt, account owner: Joeran Beel, IBAN: DE51500100600853552606, BIC: PBNKDEFF.
For questions or suggestions, please use the comment function!