Areas of scientific interest and research:
- text mining: information retrieval, information extraction, language resources
- machine learning: case-based reasoning, support vector machines
- databases: data modelling, data mining, user interface development
- ontologies: development, application, standardisation
- systems biology and bioinformatics applications of the above
I also participate in standardisation activities (Metabolomics Standards Initiative).

I became research-active in natural language processing as part
of my MSc studies and continued along these avenues of research during my
PhD, focusing more on
biomedical applications of text mining. At the same time, I worked as a researcher
on a Eureka-funded project coordinated by
LION BioScience, working on the tasks of automatic term extraction, clustering
and classification. As part of my information management duties in
MCISB, I am currently developing methods of
text mining for use in systems biology. These activities involve collaboration
with the National Centre for Text Mining,
which is co-located with MCISB in
the Manchester Interdisciplinary Biocentre (MIB).

In 2003 I undertook postdoctoral studies on metabolome bioinformatics with
Prof. Douglas Kell. I worked on a
BBSRC-funded project in which my role
involved the storage, analysis and visualisation of the metabolome data and
their integration with other post-genomic data. As a result,
MeMo has been developed as a
formal model for structuring and annotating metabolomic data and the associated
metadata in a machine-usable form in order to facilitate the exploration of the
hidden links between the genes and their functions. It is currently being
extended to handle other types of data produced within
MCISB.

I am a member of the
Metabolomics Standards Initiative
(MSI) Ontology Working Group (OWG).
The OWG supports the activities of other MSI working groups
by developing a common semantic framework (controlled vocabularies and
ontologies) to consistently interpret and seamlessly integrate information
scattered across public resources. Manual term acquisition approaches are time-consuming,
labour-intensive and error-prone, especially in the rapidly developing domain of
metabolomics, where new analytical techniques emerge regularly, thus often compelling
domain experts to use non-standardised terms. My role in this open effort is to
provide text mining support for efficient corpus-based term acquisition
as a way of rapidly expanding a set of controlled vocabularies with the terms used
in the scientific literature.