CLEF is an MRC-sponsored project in the E-Science programme that aims to establish policies and infrastructure for the next generation of integrated clinical and bioscience research.

One of the major goals of the project is to provide a pseudonymised repository of histories of cancer patients that can be accessed by researchers.

CLEF’s Goals:

• Enable secure and ethical collection of clinical information from multiple sites
• Analyse, structure, integrate, prioritise and disseminate that information
• Make resources available using GRID tools (e.g. myGrid)
• Provide authorised clinicians and scientists with access to clinical data


The Basic CLEF Information Flow
    The requirements and technologies are best understood in the context of the CLEF information flow that has emerged from the design process and is shown in the figure below.

    Starting with “Patient care” at the bottom left of the diagram, the flow is:
  • Capture of the information. Some information comes from dictated and transcribed text. Other information comes directly from hospital information systems – e.g. laboratory results, prescriptions, etc.
  • Pseudonymisation of all information at the originating hospital by removal of overt identifying items – name, date of birth, etc. – and by providing a CLEF Entry identifier that can only be reversed by the provider (or their nominated trusted third party).
  • Depersonalisation of the texts to remove any residual information that might risk identification – e.g. names of relatives, nicknames, place names, unusual occupations, etc. Hence the requirement for reliable, scalable techniques.
  • Information extraction of key information from the texts into predefined “templates”, possibly with the help of context provided by the information already in the repository hence the requirement for
  • Integration into the health record repository of all information, including laboratory results, radiology, and genomic analyses.
  • Constructing the chronicle to infer a coherent view of the patient’s history. Typically the same information occurs in many different documents at different levels of granularity and clarity, and sometimes with conflicts that must be reconciled.

  • From this point the information can go in two directions:
    Use for patient care – back to the clinicians in the form of summaries for patient care. Providing a concise, up-to-date summary of a patient’s condition is a prime request of clinicians for improving patient care. Because it requires re-identification of patients, this step can only occur at the hospital and after security controls have been stringently tested and agreed to be adequate.
    Enrichment for E-Science – with the results of researchers’ queries, their workflows, interpretations, curation and links to external information added to the repository so that it becomes the basis for virtual communities of researchers.
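
    As an illustration only, the pseudonymisation step in the flow above could be sketched as follows. The sketch assumes a keyed hash (HMAC-SHA256) to derive the CLEF Entry identifier and a lookup table kept at the originating hospital for reversal; neither choice is stated in the CLEF description, and the class name, field names and identifier length are all hypothetical.

```python
import hashlib
import hmac
import secrets

# Hypothetical field names: the description only says "name, date of birth, etc."
OVERT_IDENTIFIERS = {"name", "date_of_birth", "address"}


class Pseudonymiser:
    """Runs at the originating hospital; the key and lookup table never leave it."""

    def __init__(self, secret_key):
        self._key = secret_key
        self._lookup = {}  # entry identifier -> local patient id (for reversal)

    def entry_identifier(self, patient_id):
        # A keyed hash gives a stable identifier that cannot be recomputed
        # without the hospital's secret key (assumed scheme, not CLEF's own).
        digest = hmac.new(self._key, patient_id.encode("utf-8"), hashlib.sha256)
        entry_id = digest.hexdigest()[:16]
        self._lookup[entry_id] = patient_id
        return entry_id

    def pseudonymise(self, record):
        # Drop overt identifiers and replace the local id with the entry id.
        entry_id = self.entry_identifier(record["patient_id"])
        cleaned = {
            key: value
            for key, value in record.items()
            if key not in OVERT_IDENTIFIERS and key != "patient_id"
        }
        cleaned["entry_id"] = entry_id
        return cleaned

    def reidentify(self, entry_id):
        # Only the provider (or its trusted third party) holds this table.
        return self._lookup[entry_id]


if __name__ == "__main__":
    pseudonymiser = Pseudonymiser(secrets.token_bytes(32))
    record = {
        "patient_id": "H123",
        "name": "Jane Example",
        "date_of_birth": "1950-01-01",
        "lab_result": "Hb 13.2 g/dL",
    }
    print(pseudonymiser.pseudonymise(record))
```

    Note that this covers only the removal of structured identifiers; residual identifying details in free text (the depersonalisation step) need separate treatment.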

    CLEF Research

    CLEF aims to develop rigorous generic methods for capturing and managing clinical information in patient care and for integrating that information into clinical and basic bioscience research.
    CLEF will focus on cancer, but the goal is to produce a robust framework, based on emerging knowledge management techniques within the E-Science/Grids programme, that can be used in many areas of clinical medicine and research.

    CLEF Target Research text

    There is growing recognition that advances in health informatics are central both to the modernisation of health services and to successfully exploiting our rapidly expanding knowledge of genetics and molecular processes ('-omics'). Systems dealing with clinical information - what doctors, nurses and other healthcare professionals have heard, seen, thought, and done - face major barriers in capturing information and assuring its quality. A more unified health service and a more effective collaborative multidisciplinary research community both require significant progress in overcoming these barriers - as well as adapting the general e-Science issues of cooperation, integration, and dissemination to clinical needs. The scale, diversity and complexity of the problems require a coherent approach that draws on and contributes to the emerging common architectures and services of the e-Science/Grid communities.

    By the end of the project, the gap between our ability to manage and integrate clinical information and our ability to manage image and genomic information will have been significantly narrowed so that effective use can be made of the broader e-Science framework to benefit clinical research, patients, and public alike.
    CLEF is focusing on the specific technologies which are currently barriers to obtaining and integrating clinical information.
    What we do
    CLEF is developing technologies and IT applications that are of immediate use for healthcare service and medical research applications.
    Linking medical research to clinical practice
    Health informatics standards are essential to achieve the goals of e-health in Europe. Establishing interoperability between systems and patient information exchange between health care organizations is vital for improving the efficiency and quality of care. This, however, can only be achieved once the security of clinical data and the protection of citizens' privacy are ensured.
    National Programme for IT (NPfIT)
    Over the next 10 years, the NPfIT will connect over 30,000 GPs in England to almost 300 hospitals and give patients access to their personal health and care information, transforming the way the NHS works. It will help to improve the management of health care records, appointments and prescriptions, and to link up-to-date research to the treatments available to patients. It will also support patient choice.
