Research Interests

home | research | papers | software | activities | students | teaching | cl group


My research interests are in Computational Linguistics and Natural Language Processing. Much of my work uses models of language derived from corpus data to develop language processing applications.

My main area of research is linguistically motivated Statistical Parsing, with a particular focus on the grammar formalism Combinatory Categorial Grammar. I also carry out research in areas such as data-driven Machine Translation, Question Answering, Information Extraction, and Lexical- and World-Knowledge Acquisition.

Some Recent Papers

The first paper below gives a detailed description of the natural language parser I have developed with James Curran. The parser and its associated tools are freely available for research use: click on the software link above. The second paper reflects an emerging interest in developing a compositional semantics for vector space models of meaning.
  • Wide-Coverage Efficient Statistical Parsing with CCG and Log-Linear Models
    Stephen Clark and James R. Curran
    to appear in Computational Linguistics
    [PDF] (preprint)
  • Combining Symbolic and Distributional Models of Meaning
    Stephen Clark and Stephen Pulman
    Proceedings of the AAAI Spring Symposium on Quantum Interaction, pp. 52-55, Stanford, CA, 2007
    [PDF]

Presentations

  • Linguistically Motivated Large-Scale Language Processing
    Invited talk at CLUK-07
    [PDF]

Grants

  • Accurate and Efficient Parsing of Biomedical Text. Funded by EPSRC. Starts October 2007