The Triangle of Interoperability: Metadata, Ontology and Vocabulary
This article was written with contributions from Francois Sherwood.
foundingGIDE is a project funded under the European Commission's Horizon Europe framework program and coordinated by Euro-BioImaging. It is dedicated to enabling the interoperability of image data resources through ontology mapping and harmonisation of metadata models. In this article, Maria Mirza, Scientific Project Manager of foundingGIDE, reflects on the importance of metadata, ontology, and vocabularies in enabling interoperability, and how foundingGIDE is addressing this challenge.
Interoperability is crucial for imaging resources, as it allows researchers to effectively share, compare, and analyze data from different sources, accelerating scientific discoveries and fostering collaboration across disciplines.
In imaging, metadata, ontology and vocabulary are interconnected concepts that play crucial roles in making data FAIR. Metadata uses ontology concepts to annotate imaging data. Vocabularies are used within ontologies to define concepts. Ontologies add structure and relationships to vocabularies.
Use of metadata schemas enables interoperability
Studies annotated with the same metadata schema can be seamlessly integrated with one another, enabling broader comparative analyses and new insights.
Metadata is data about the data. In imaging, metadata uses ontology terms to describe the context, content, and structure of image data. Metadata schemas with clear definitions ensure that imaging data is understandable, reproducible, and reusable. Terms from various ontologies are often re-used to define such a structured model.
Ontologies are quantifiable stores of knowledge in a specific domain. Ontologies reference controlled vocabulary terms (such as “image” or “tissue” or “microscope”) that represent concepts in the domain of that ontology. Most importantly, an ontology organizes and formalizes this knowledge by defining the relationships between the concepts.
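To make the difference between a list of terms and an ontology concrete, here is a minimal sketch in Python. The term identifiers (the "EX:" prefix), the labels, and the "is_a" links are invented for illustration and do not come from any real ontology.

```python
# Hypothetical toy ontology: IDs, labels and links are invented for illustration.
LABELS = {
    "EX:0001": "instrument",
    "EX:0002": "microscope",
    "EX:0003": "confocal microscope",
}

# "is_a" relationships: each term points to its broader concept.
IS_A = {
    "EX:0002": "EX:0001",
    "EX:0003": "EX:0002",
}


def ancestors(term_id):
    """Return the labels of all broader concepts, nearest first."""
    result = []
    current = IS_A.get(term_id)
    while current is not None:
        result.append(LABELS[current])
        current = IS_A.get(current)
    return result


print(ancestors("EX:0003"))  # ['microscope', 'instrument']
```

The labels alone would be a controlled vocabulary. It is the relationships that let a search for "microscope" also find data annotated with the narrower term "confocal microscope".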
A controlled vocabulary refers to the set of terms and definitions that are used within a specific domain. The purpose of a controlled vocabulary is to ensure consistency and clarity.
One metadata schema may reference multiple ontologies. One ontology may reference one or multiple vocabularies, or refer to one or multiple other ontologies.
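As a rough illustration of a schema that references several ontologies, the sketch below checks that each field of an annotation uses a term from an ontology the schema allows for that field. The field names and the ontology prefixes (ONTO_A, ONTO_B, ONTO_C) are hypothetical and are not taken from any existing schema.

```python
# Hypothetical schema: each field lists the ontologies its value may come from.
SCHEMA = {
    "imaging_method": {"ONTO_A"},
    "organism": {"ONTO_B"},
    "sample_type": {"ONTO_A", "ONTO_C"},
}


def validate(record):
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for field, allowed in SCHEMA.items():
        value = record.get(field)
        if value is None:
            problems.append(f"missing field: {field}")
            continue
        # Term IDs are assumed to look like "PREFIX:number".
        prefix = value.split(":", 1)[0]
        if prefix not in allowed:
            problems.append(f"{field}: {value} is not from {sorted(allowed)}")
    return problems


good = {"imaging_method": "ONTO_A:12", "organism": "ONTO_B:7", "sample_type": "ONTO_C:3"}
bad = {"imaging_method": "ONTO_B:12", "organism": "ONTO_B:7"}

print(validate(good))
print(validate(bad))
```

Two studies that both pass the same check describe their images with the same fields and the same families of terms, which is what makes them straightforward to combine.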
Globally standardised metadata schema
Amazing progress has been achieved in sharing imaging data. There have been numerous community initiatives to standardize the use of metadata schemas, and work is still ongoing. Sharing imaging data enables its reuse, but effective sharing depends on standardized metadata and ontologies, and achieving this remains a challenge.
To address the challenges posed by the growing volume of life science image data and to increase the interoperability of shared biological and preclinical image data, foundingGIDE will lay the foundation for a Global Image Data Ecosystem and coordinate global community initiatives to standardize the use of metadata schemas.
foundingGIDE will enable the seamless sharing of biological and preclinical image data across the globe by increasing the interoperability between major data repositories (BioImage Archive, IDR, SSBD) through ontology mapping and harmonisation of metadata models used by these data repositories.
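What harmonisation through mapping can look like in practice is sketched below. This is only an illustration under stated assumptions: the repository names (repo_a, repo_b), the field names, and the term IDs are hypothetical and do not reflect the actual metadata models of BioImage Archive, IDR, or SSBD, nor the method foundingGIDE will use.

```python
# Hypothetical mapping from each repository's field names to a common model.
FIELD_MAP = {
    "repo_a": {"species": "organism", "modality": "imaging_method"},
    "repo_b": {"organism_name": "organism", "method": "imaging_method"},
}

# Hypothetical mapping from free-text values to shared ontology term IDs.
TERM_MAP = {
    "mouse": "EX:0201",
    "mus musculus": "EX:0201",
    "confocal": "EX:0003",
    "confocal microscopy": "EX:0003",
}


def harmonise(source, record):
    """Translate one repository record into the common model."""
    out = {}
    for key, value in record.items():
        common_field = FIELD_MAP[source].get(key)
        if common_field is None:
            continue  # fields without a mapping are dropped in this sketch
        # Unknown values are kept as-is rather than guessed.
        out[common_field] = TERM_MAP.get(value.strip().lower(), value)
    return out


a = harmonise("repo_a", {"species": "Mouse", "modality": "Confocal"})
b = harmonise("repo_b", {"organism_name": "Mus musculus", "method": "confocal microscopy"})

print(a == b)  # True: both records now use the same fields and term IDs
```

Once records from different repositories share field names and term identifiers, a single query can search across all of them.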
The foundingGIDE Community Event will bring together imaging data stakeholders from all over the globe to Okazaki, Japan in an effort to establish the basis of a coordinated Global Image Data Ecosystem. Find out more about the event here.