New Paper: On the Robustness of Google Scholar against Spam

I am currently in Toronto presenting our new paper, “On the Robustness of Google Scholar against Spam”, at Hypertext 2010. The paper describes experiments we ran on Google Scholar to find out how reliable its citation data and related information are. The paper will soon be downloadable from our publication page, but for now I am posting a pre-print version here in the blog:

Abstract

In this research-in-progress paper we present the current results of several experiments in which we analyzed whether spamming Google Scholar is possible. Our results show that it is possible: we ‘improved’ the ranking of articles by manipulating their citation counts, and we made articles appear in searches for keywords the articles did not originally contain by placing invisible text in modified versions of the articles.

1. Introduction

Researchers should have an interest in having their articles indexed by Google Scholar and other academic search engines such as CiteSeer(X). Inclusion in the index makes their articles more readily available to the academic community. In addition, authors should be concerned not only with whether their articles are indexed, but also with where they are displayed in the result list. As with all ranked search results, articles displayed in top positions are more likely to be read.

In recent studies we researched the ranking algorithm of Google Scholar [1-3] and gave advice to researchers on how to optimize their scholarly literature for Google Scholar [4]. However, there are reservations in the academic community about what we called “Academic Search Engine Optimization” [4]. The concern is that some researchers might use knowledge about ranking algorithms to ‘over-optimize’ their papers and push their articles’ rankings in illegitimate ways.

We conducted some experiments to find out how robust Google Scholar is against spamming. Not all of the experiments are complete yet, but those that are show interesting results, which we present in this paper. (more…)

Hypertext 2010 Security Hole: All papers downloadable and editable by anyone (2 months before the conference starts)

In June, ACM Hypertext 2010 will take place in Toronto. A few days ago I wanted to upload the camera-ready versions of three papers accepted at the conference. And… I was surprised. By email I received links to three web pages (namely

http://www.sheridanprinting.com/acm/sigweb-ht/sigweb-ht.cfm?id=ht104,

http://www.sheridanprinting.com/acm/sigweb-ht/sigweb-ht.cfm?id=ht105, and

http://www.sheridanprinting.com/acm/sigweb-ht/sigweb-ht.cfm?id=ht121)

on which I could upload my camera-ready papers and specify the authors, keywords, etc. No password or other kind of authorization had to be entered. Now, guess what. I played around with the URL and tried, for instance, to open the following URLs in my browser:

http://www.sheridanprinting.com/acm/sigweb-ht/sigweb-ht.cfm?id=ht100

http://www.sheridanprinting.com/acm/sigweb-ht/sigweb-ht.cfm?id=ht107

You can probably guess what happened: just by changing the URL, I could edit the details (and see the private email addresses the primary authors provided) and upload PDF files for the other papers accepted at Hypertext. That means I could have added to or modified the author list, changed the title, or uploaded a modified PDF.
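One common way to avoid this kind of ID guessing is to bind each upload link to a token that cannot be derived from the paper ID alone. The following is only a minimal Python sketch of that idea and says nothing about how the actual upload site is built: `SECRET_KEY`, `make_token`, `is_authorized`, `upload_link`, and the `example.org` URL are all hypothetical names chosen for illustration.

```python
import hashlib
import hmac

# Hypothetical server-side secret (an assumption for this sketch);
# a real service would load it from protected configuration.
SECRET_KEY = b"example-secret-do-not-reuse"


def make_token(paper_id: str) -> str:
    """Derive a token that is bound to exactly one paper ID."""
    return hmac.new(SECRET_KEY, paper_id.encode("utf-8"), hashlib.sha256).hexdigest()


def is_authorized(paper_id: str, token: str) -> bool:
    """Accept a request only if the token belongs to the requested paper ID."""
    expected = make_token(paper_id).encode("utf-8")
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, token.encode("utf-8"))


def upload_link(paper_id: str) -> str:
    """Build the link that would be emailed to the authors (hypothetical URL layout)."""
    return f"https://upload.example.org/paper?id={paper_id}&token={make_token(paper_id)}"


if __name__ == "__main__":
    my_token = make_token("ht104")
    print(is_authorized("ht104", my_token))  # own paper: accepted
    print(is_authorized("ht100", my_token))  # changed ID, same token: rejected
```

With a check like this, changing `id=ht104` to `id=ht100` in the URL would no longer be enough, because the token from the first link does not match the second paper.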

The screenshot shows the user interface on which I could have changed the data for the paper “Dealing with the Video Tidal Wave: The Relevance of Expertise for Video Tagging” by Sara Darvish and Alvin Chin (here is a list of all papers accepted at Hypertext 2010).

Academic Search Engine Optimization: What others think about it

In January we published our article about Academic Search Engine Optimization (ASEO). As expected, feedback varied strongly. Here are some of the opinions on ASEO:

Search engine optimization (SEO) has a golden age in this internet era, but to use it in academic research, it sounds quite strange for me. After reading this publication (pdf) focusing on this issue, my opinion changed.

[…] on first impressions it sounds like the stupidest idea I’ve ever heard.

ASEO sounds good to me. I think it’s a good idea.

Good Article..

As you have probably guessed from the above criticisms, I thought that the article was a piece of crap.

In my opinion, being interested in how (academic) search engines function and how scientific papers are indexed and, of course, responding to these… well… circumstances of the scientific citing business is just natural.

Check out the following blogs to read more about it (some are in German and Dutch) (more…)

How to write a thesis (Bachelor, Master, or PhD) and which software tools to use

Available translations: Chinese (thanks to Chen Feng) | Portuguese (thanks to Marcelo Cruz dos Santos) | Russian (thanks to Sergey Loy) | send us your translation

Writing a thesis is a complex task. You need to find related literature, take notes, draft the thesis, and eventually write the final document and create the bibliography. Many books explain how to Read more…

Academic Search Engine Optimization – make your articles easier to find

The Journal of Scholarly Publishing just published our article “Academic Search Engine Optimization (ASEO): Optimizing Scholarly Literature for Google Scholar and Co.” The article introduces and discusses the concept of what we call “academic search engine optimization” (ASEO), which we define as: “Academic search engine optimization is the creation, publication, and Read more…