Collection, storage and management of datasets

Work package two

This work package focuses on the collection, storage and management of datasets. It provides the operational, legal and technical infrastructure needed to do this.

A major barrier to progress in AI in medical imaging is the lack of standardised and accessible imaging data for the development, evaluation and validation of AI algorithms.

The development of potential AI software solutions requires high-quality, labelled, curated and validated data.

This work package works with the network of Lung Health Checks to provide data from their systems in support of AI development. We have invested in the expertise needed to ensure that the data is high quality and consistent, and that it is annotated and curated as needed.

The de-identified data is held within our central databank to support specific research projects. Access is governed by our Data Access Committee, following the FAIR principles (findable, accessible, interoperable and reusable) for scientific data management and stewardship.

DART processes support the collection, storage, sharing and analysis of patient histories, low-dose CT and PET scans, biopsies, resections and bloods, so that AI algorithms can be trained and validated across various stages of development.

Work package two is led by Prof Jim Davies and Dr Charlie Crichton, supported by a Database Manager. Commercial partners are GEHC and Roche Diagnostics.