My research is in the areas of
artificial intelligence and neural networks. A bibliography of my
research papers and four-odd books, together with a list of my grants
whilst I was at Surrey (until 2005) and now at Trinity
College, Dublin, is here.
In artificial
intelligence I have worked on terminology and
ontology systems and have shown the relevance of such ideas to knowledge management
and knowledge acquisition; here is an audio-visual presentation of my
thoughts on the elusive
subject of ontology and the more accessible notion of terminology. More recently, I have been
working on information extraction from special-language texts; a
special language is a subset of everyday language that has its own
distinctive vocabulary and a 'local grammar'. I have used special-language
techniques for:
- The analysis of
sentiment in financial and political news: this work
covers news published in English, Arabic and Chinese; the
focus is on financial news, but there is also a small study of ethnicity
and of tensions arising from identity crises. In this respect, a
year ago I gave a talk about the analysis
of a news collection (a 'corpus of texts'), and there is a paper published in
a technical analysis magazine on the topic of sentiment analysis
(carried out on a live news
wire on a data-and-compute grid).
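The core of such lexicon-based sentiment scoring can be sketched as follows; the word lists and headlines below are illustrative assumptions, not the actual financial lexicons or local-grammar patterns used in this work.

```python
# A minimal sketch of lexicon-based sentiment scoring over news headlines.
# POSITIVE/NEGATIVE are toy lexicons for illustration; real systems use
# large, domain-specific (e.g. financial) lexicons and grammar patterns.

POSITIVE = {"gain", "surge", "rally", "profit", "upbeat"}
NEGATIVE = {"loss", "slump", "crash", "fear", "downturn"}

def sentiment_score(text: str) -> int:
    """Count positive minus negative lexicon hits in a piece of text."""
    tokens = text.lower().replace(",", " ").split()
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)

headlines = [
    "Markets rally as profit forecasts surge",
    "Fear of downturn triggers slump in tech shares",
]
for h in headlines:
    print(h, "->", sentiment_score(h))   # 3 and -3 respectively
```

A production system would time-stamp these scores against the news wire and aggregate them per instrument or topic, rather than scoring isolated headlines.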
- The
assemblage and analysis of text
corpora now dominates work in linguistics, and increasingly such
analysis is being used in information extraction. I have
tried to examine whether one can understand changes in science and
technology by looking at the ontological commitment of scientists and
engineers through their published documents, including their research
output in learned journals and their contributions to patent documents.
The evolution of subjects like nuclear physics, economics, linguistics and cancer
care was studied by assembling and analysing text corpora in these
subject areas; I have looked at the use of metaphor in nuclear
physics and studied a recent case of plagiarism
in nanotechnology.
- The most recent project
in this area was that of autonomous
corpora: how can a system assemble a text corpus much in the way
an expert corpus linguist does? This project was developed
in conjunction with the late Prof. John Sinclair, Prof. Yorick Wilks
and Mr Lou Burnard. I gave two lectures at the invitation of the
Tuscan Word Centre, called respectively the First Siena Lecture
and the Second Siena
Lecture. These lectures attempt to relate work on crawlers
and page-ranking systems to work in text analysis and
corpus building.
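The selection step of such autonomous corpus assembly can be sketched as follows; crawled pages are simulated here as plain strings, and the seed terms and relevance threshold are illustrative assumptions rather than the project's actual criteria.

```python
# Sketch of the selection step in automatic corpus assembly: given pages
# fetched by a crawler (simulated here as strings), keep only those whose
# vocabulary overlaps sufficiently with a seed term list for the domain,
# roughly as an expert corpus linguist would filter candidate texts.

SEED_TERMS = {"neutron", "isotope", "decay", "reactor", "fission"}

def domain_relevance(text: str) -> float:
    """Fraction of seed terms that appear in the page."""
    tokens = set(text.lower().split())
    return len(tokens & SEED_TERMS) / len(SEED_TERMS)

def assemble_corpus(pages, threshold=0.4):
    """Retain pages whose relevance to the domain exceeds the threshold."""
    return [p for p in pages if domain_relevance(p) >= threshold]

pages = [
    "neutron capture and isotope decay in a research reactor",
    "recipes for tuscan bread and olive oil",
    "fission products and reactor decay heat",
]
print(len(assemble_corpus(pages)))  # prints 2: the off-topic page is dropped
```

In a full system the crawler's page-ranking scores and link structure would feed into the relevance measure alongside vocabulary overlap.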
In neural networks, I have worked
mainly on multi-net systems: systems of neural computing that
learn autonomously. The individual constituents of a
multi-net specialise in learning one aspect of a data set
generated by an object or (physical/biological) system. Multi-net
systems will play a major role in simulating and articulating the
complex notions of cross-modal processing and multi-sensory
information fusion. Cross-modal
processing, in which the stimulus is
in one modality and the response in another, is an important subset of
multi-modal
processing.
- I have been studying how to simulate and model multi-modal
processing in humans, especially infants
and impaired
human beings, over the last 15 years or so: the simulations were
carried out using three neural networks, one learning 'concepts',
another learning language, and the third learning the association
between the two. The novelty here is that all three networks are
'unsupervised' and are based on the principles of self-organisation.
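The three-network idea can be sketched in miniature as follows: two tiny self-organising maps (SOMs), one for 'concept' vectors and one for 'word' vectors, linked by a Hebbian co-activation table. The map sizes, learning rates and paired inputs are all illustrative assumptions, not the architecture used in the actual simulations.

```python
# Toy sketch: two 1-D self-organising maps plus a Hebbian association
# table between their winning units. All learning is unsupervised.
import random

def winner(w, x):
    """Index of the map unit closest to input x (Euclidean distance)."""
    return min(range(len(w)),
               key=lambda i: sum((w[i][j] - x[j]) ** 2 for j in range(len(x))))

def train_som(data, units=4, epochs=50, lr=0.3):
    """Winner-take-all SOM training (neighbourhood omitted for brevity)."""
    random.seed(0)
    dim = len(data[0])
    w = [[random.random() for _ in range(dim)] for _ in range(units)]
    for _ in range(epochs):
        for x in data:
            b = winner(w, x)
            for j in range(dim):            # move the winner towards the input
                w[b][j] += lr * (x[j] - w[b][j])
    return w

# paired 'concept' and 'word' inputs describing the same underlying objects
concepts = [[1, 0], [1, 0.1], [0, 1], [0.1, 1]]
words    = [[0, 1], [0.1, 1], [1, 0], [1, 0.1]]

som_c = train_som(concepts)
som_w = train_som(words)

# Hebbian association: count co-activations of the two maps' winners
assoc = {}
for c, t in zip(concepts, words):
    pair = (winner(som_c, c), winner(som_w, t))
    assoc[pair] = assoc.get(pair, 0) + 1

def recall(c):
    """Given a concept, return the word unit most associated with it."""
    bc = winner(som_c, c)
    cands = [(n, p[1]) for p, n in assoc.items() if p[0] == bc]
    return max(cands)[1] if cands else None

print(recall([1, 0]))
```

The point of the sketch is the division of labour: neither map ever sees a label, and the only supervision-like signal is the co-occurrence of the two modalities in time.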
- The development of numerosity, from the
ability of humans and other primates to 'visually enumerate' (or, to use
the correct term, to subitise)
to the ability to count a very large number of objects correctly, gives
us an important insight into this critical cognitive ability.
There is a degree of cross-modality in the execution of numerosity
tasks, in that areas of the brain involved in vision, language
and spatial attention appear to be involved in both subitisation
and counting. I have been involved in a study of small
numerosity using both supervised and unsupervised learning
algorithms.
- One
practical use of multi-net systems is in the area of automatic
annotation of images and automatic illustration of texts or
keywords. I have therefore been involved in building systems that learn
the visual features of images, learn about the linguistic
collateral texts or keywords, and also learn to associate the keywords
with the image features. This approach was used in annotating large image
collections and in illustrating keywords used in the collections.
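The association step of such an annotation system can be sketched as follows; image features are assumed to be already quantised into discrete codes, and the features, keywords and scoring rule are illustrative assumptions rather than the actual system's design.

```python
# Sketch of keyword/image-feature association for automatic annotation:
# features and collateral keywords are linked through co-occurrence
# counts, then unseen images are annotated with the keywords most
# strongly associated with their features.
from collections import Counter, defaultdict

# training set: each image = (quantised visual feature codes, keywords)
training = [
    ({"blue_patch", "ripple_texture"},  {"sea", "water"}),
    ({"blue_patch", "white_blob"},      {"sky", "cloud"}),
    ({"green_patch", "ripple_texture"}, {"grass", "field"}),
]

cooc = defaultdict(Counter)
for feats, kws in training:
    for f in feats:
        for k in kws:
            cooc[f][k] += 1          # count feature/keyword co-occurrences

def annotate(features, top=2):
    """Return the `top` keywords most associated with the given features."""
    score = Counter()
    for f in features:
        score.update(cooc[f])
    return [k for k, _ in score.most_common(top)]

print(annotate({"blue_patch", "ripple_texture"}))
```

Run in reverse (keywords to features), the same table supports the illustration task: retrieving images whose features are most associated with a given keyword.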
(More to follow)