About Me

02 October 2011

I'm a researcher in the Distributed Systems Group, School of Computer Science and Statistics, Trinity College Dublin.

The Research

The 17th of July, 2011 saw Twitter's previous highest Tweets Per Second (TPS) rate broken. A maximum of 7,196 TPS passed through Twitter's architecture that day, beating the previous record of 6,939 set at the turn of the 2011 Japanese New Year. The increase was attributed to the FIFA Women's World Cup final.

On the 23rd of August, 2011, 5,500 TPS passed through Twitter's systems for approximately 2 minutes as the East Coast of the United States was hit by an earthquake measuring 5.8 on the Richter scale. No fatalities were reported, but the quake caused an estimated $200-$300 million in damage.

With over 200 million active users, how does Twitter manage to deliver 230 million Tweets a day? How many of those Tweets go missing? If you Tweet now, when will someone else receive it, and how long will they have to wait? What if they never receive it?
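To put those figures side by side, here is a rough back-of-the-envelope sketch in Python. It assumes the 230 million Tweets a day are spread evenly over 24 hours, which is a simplification, and the daily figure and the peaks were reported at different times, so the ratios are only indicative. The names (`average_tps`, `burst_ratio`, the keys in `peaks`) are my own labels for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted in the text above.
tweets_per_day = 230_000_000
peaks = {
    "world_cup_final": 7196,
    "japanese_new_year": 6939,
    "east_coast_earthquake": 5500,
}

# Average rate, assuming (unrealistically) a perfectly even spread over the day.
average_tps = tweets_per_day / SECONDS_PER_DAY


def burst_ratio(peak_tps, baseline_tps):
    """How many times larger a peak rate is than the baseline rate."""
    return peak_tps / baseline_tps


ratios = {name: burst_ratio(tps, average_tps) for name, tps in peaks.items()}

if __name__ == "__main__":
    print(f"average: {average_tps:.0f} TPS")
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.2f}x the average")
```

Under these assumptions the average works out to roughly 2,660 TPS, so the recorded peaks are between two and three times the everyday rate. A system sized only for the average would fall behind during exactly the events people care about most.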

My research lies in the development of generic mechanisms for adaptation in stream computing frameworks, to guarantee that all data is processed in a timely fashion regardless of data rate or burst characteristics. Data streams flow like rivers: a single piece of data can easily pass through, never to be seen again. In such a paradigm, timely analysis of data is critical to providing temporally accurate results for any single computation.

Architectures need to adapt to meet the demands of an ever-changing data stream - or of multiple data streams. Current work on adaptation in this area is highly restrictive, with custom applications requiring custom frameworks for operation. The only alternative is load shedding - dropping data before it enters a system, thereby losing potentially valuable information. In many applications this is simply not acceptable - lives may depend on it.
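The trade-off can be seen in a toy simulation. The sketch below is only an illustration, not part of any real framework: `simulate`, its parameters, and the arrival schedule are all hypothetical. It pushes a bursty stream through a buffer that is served at a fixed rate, once with load shedding (arrivals are dropped when the buffer is full) and once without (nothing is dropped, but the backlog grows).

```python
from collections import deque


def simulate(arrivals, capacity, service_rate, shed=True):
    """Run a per-tick arrival schedule through a buffer.

    arrivals     -- number of items arriving in each tick
    capacity     -- buffer size enforced when shed=True
    service_rate -- maximum items processed per tick
    Returns (processed, dropped, backlog left at the end).
    """
    queue = deque()
    processed = 0
    dropped = 0
    for tick, count in enumerate(arrivals):
        for i in range(count):
            if shed and len(queue) >= capacity:
                dropped += 1              # load shedding: the item is lost
            else:
                queue.append((tick, i))   # buffered for later processing
        for _ in range(min(service_rate, len(queue))):
            queue.popleft()
            processed += 1
    return processed, dropped, len(queue)


if __name__ == "__main__":
    burst = [2, 2, 10, 2, 2]  # quiet, quiet, burst, quiet, quiet
    print(simulate(burst, capacity=4, service_rate=3, shed=True))
    print(simulate(burst, capacity=4, service_rate=3, shed=False))
```

With shedding, 6 of the 18 items are lost during the burst. Without it, nothing is lost but 5 items are still waiting at the end, so results arrive late. Neither outcome is satisfactory, which is why the processing capacity itself (here the fixed `service_rate`) needs to adapt to the stream.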

Publications

  • Performance evaluation of the 6LoWPAN protocol on MICAz and TelosB motes. In Proceedings of the 4th ACM Workshop on Performance Monitoring and Measurement of Heterogeneous Wireless and Wired Networks (Tenerife, Canary Islands, Spain, October 26, 2009). PM2HW2N '09. ACM, New York, NY, 25-30. Link to paper.

David Guerin dguerin-@-cs.tcd.ie

Interesting Software

S4 - Distributed Stream Computing Platform

Hadoop - Distributed Batch Processing Platform

Tornado Web Framework

Twisted - Event-driven networking engine

Interesting Sites

Twitter's highest Tweets per Second so far

Twitter Global Pulse of earthquake in Japan

Twitter's Tweet of 23rd August earthquake

My How to Blog

Academic

Distributed Systems Group

Grid Ireland

The Dimes Project

Science

CERN

NASA

ESA

Astronomy Ireland

Science Gallery

Interesting Topics

Net Neutrality

NoSQL

WebGL