Dr Irena Spasić

Research areas


Areas of scientific interest and research:

  • text mining: information retrieval, information extraction, language resources
  • machine learning: case-based reasoning, support vector machines
  • databases: data modelling, data mining, user interface development
  • ontologies: development, application, standardisation
  • systems biology and bioinformatics applications of the above


I also participate in standardisation activities (Metabolomics Standards Initiative).




I became research active in the area of natural language processing as part of my MSc studies and continued this line of research during my PhD, focusing more on biomedical applications of text mining. At the same time, I worked as a researcher on a Eureka-funded project coordinated by LION BioScience, on the tasks of automatic term extraction, clustering and classification. As part of my information management duties in MCISB, I am currently developing text mining methods for use in systems biology. These activities involve collaboration with the National Centre for Text Mining, which is co-located with MCISB in the Manchester Interdisciplinary Biocentre (MIB).



In 2003 I undertook postdoctoral studies on metabolome bioinformatics with Prof. Douglas Kell. I worked on a BBSRC-funded project in which my role involved the storage, analysis and visualisation of metabolome data and their integration with other post-genomic data. As a result, MeMo has been developed as a formal model for structuring and annotating metabolomic data and the associated metadata in a machine-usable form, in order to facilitate the exploration of the hidden links between genes and their functions. It is currently being extended to handle other types of data produced within the MCISB.



I am a member of the Metabolomics Standards Initiative (MSI) Ontology Working Group (OWG). The OWG supports the activities of other MSI working groups by developing a common semantic framework (controlled vocabularies and ontologies) to consistently interpret and seamlessly integrate information scattered across public resources. Manual term acquisition approaches are time-consuming, labour-intensive and error-prone, especially in the rapidly developing domain of metabolomics, where new analytical techniques emerge regularly, thus often compelling domain experts to use non-standardised terms. My role in this open effort is to provide text mining support for efficient corpus-based term acquisition as a way of rapidly expanding a set of controlled vocabularies with the terms used in the scientific literature.



