The CURIE Knowledge Graph: Contextualizing evidence, building knowledge

Modern biomedical research is defined by its scale and heterogeneity. Across the life sciences, millions of datasets—from genomics to clinical trials—provide valuable fragments of information. Yet much of this information remains disconnected. To make full use of it, researchers need a framework that continuously integrates and contextualizes data across sources, modalities, and studies — a framework that connects information to build knowledge.
The CURIE Knowledge Graph (KG) provides that framework. It is the semantic and contextual layer of the Data4Cure Biomedical Intelligence® Cloud, transforming billions of heterogeneous data points into a connected, continuously updated model of biomedical knowledge.
From Data to Structured Knowledge
The CURIE Knowledge Graph integrates data-driven and literature-derived evidence spanning genes, diseases, drugs, cell types, pathways, and phenotypes. It currently comprises over 4 billion relationships across more than one million entities, providing an exceptionally broad and detailed structured representation of biomedical knowledge.
Unlike traditional databases and some other knowledge graphs in this space, CURIE is dynamic: it expands continuously as new datasets are ingested through the Data4Cure Data Hub and analyzed through the App Engine. Experimental results, pathway analyses, clinical associations, and literature-mined relations are organized within a common ontological framework, with contextual annotations and supporting references. The result is a living framework that moves beyond static, one-off analyses toward continuous, cumulative knowledge, where new findings are integrated and contextualized immediately rather than remaining isolated.
Part of the Architecture for Intelligence
The CURIE Knowledge Graph is the third layer of Data4Cure’s Architecture for Intelligence, a four-layer system that turns raw data into actionable insights. It is tightly integrated with the other three layers:
- Data Hub – Harmonizes and semantically annotates diverse omics and clinical datasets.
- Biomedical App Engine – Provides computational tools and machine learning pipelines to generate structured analytical outputs.
- CURIE Knowledge Graph – Integrates and contextualizes results and literature evidence into a unified semantic framework.
- AI & Insights Layer – Builds advanced AI models and tools that synthesize and reason over the integrated data and the graph to generate predictions and insights.
Knowledge Structure and Provenance
The CURIE Knowledge Graph is built on a semantically rigorous framework that integrates multiple biomedical ontologies—covering genes, diseases, pathways, drugs, phenotypes, and experimental models. Each entity in the graph is represented as a node connected by typed relations that describe biological or clinical associations such as “gene expressed in tissue”, “compound modulates target”, or “variant linked to phenotype.”
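The node-and-typed-relation structure described above can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the class names, field names, identifiers, and relation labels are hypothetical placeholders and do not reflect CURIE's actual schema or API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A node in the graph (hypothetical fields)."""
    id: str
    kind: str   # e.g. "gene", "tissue", "compound"
    name: str


@dataclass(frozen=True)
class Relation:
    """A typed, directed edge between two entity ids."""
    subject: str
    predicate: str   # e.g. "expressed_in", "modulates"
    object: str


class TypedGraph:
    def __init__(self):
        self.entities = {}
        self._outgoing = defaultdict(list)

    def add_entity(self, entity):
        self.entities[entity.id] = entity

    def add_relation(self, relation):
        # Reject edges that reference nodes the graph does not know about.
        for node_id in (relation.subject, relation.object):
            if node_id not in self.entities:
                raise KeyError(f"unknown entity: {node_id}")
        self._outgoing[relation.subject].append(relation)

    def neighbors(self, entity_id, predicate=None):
        """Entities reachable from entity_id, optionally filtered by relation type."""
        return [
            self.entities[r.object]
            for r in self._outgoing.get(entity_id, [])
            if predicate is None or r.predicate == predicate
        ]


# Placeholder entities and relations, not real biomedical assertions.
graph = TypedGraph()
graph.add_entity(Entity("gene:G1", "gene", "GENE_A"))
graph.add_entity(Entity("tissue:T1", "tissue", "TISSUE_X"))
graph.add_entity(Entity("compound:C1", "compound", "COMPOUND_Y"))
graph.add_relation(Relation("gene:G1", "expressed_in", "tissue:T1"))
graph.add_relation(Relation("compound:C1", "modulates", "gene:G1"))

targets = graph.neighbors("compound:C1", predicate="modulates")
```

Typing each edge with a predicate lets a query such as `neighbors(..., predicate="modulates")` distinguish between kinds of association rather than treating every link as equivalent.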
A defining feature of CURIE is its explicit provenance model. Every relationship is linked to its originating evidence, whether from experimental data, analyses, or literature-derived sources, and carries detailed metadata including provenance identifiers, data modality, evidence confidence, and publication references. This allows researchers to trace each assertion to its source and understand why it exists in the graph.
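One way to picture such a provenance model is a relationship record that carries a list of evidence items, each holding the metadata named above. The following Python sketch works under stated assumptions: the field names, the 0 to 1 confidence scale, and all identifiers are hypothetical, and none of it is drawn from CURIE's actual data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Evidence:
    provenance_id: str
    modality: str           # e.g. "literature", "experimental"
    confidence: float       # assumed scale: 0.0 (weak) to 1.0 (strong)
    references: tuple = ()  # publication references, if any


@dataclass
class Assertion:
    """A relationship plus the evidence that supports it."""
    subject: str
    predicate: str
    object: str
    evidence: list = field(default_factory=list)

    def add_evidence(self, item):
        if not 0.0 <= item.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")
        self.evidence.append(item)

    def trace(self, min_confidence=0.0):
        """Return supporting evidence at or above the threshold, strongest first."""
        kept = [e for e in self.evidence if e.confidence >= min_confidence]
        return sorted(kept, key=lambda e: e.confidence, reverse=True)


# Placeholder identifiers and citations, for illustration only.
claim = Assertion("compound:C1", "modulates", "target:T1")
claim.add_evidence(Evidence("prov-001", "literature", 0.6, ("placeholder-citation-1",)))
claim.add_evidence(Evidence("prov-002", "experimental", 0.9))

strongest = claim.trace()[0]
high_confidence = claim.trace(min_confidence=0.7)
```

Keeping evidence as separate records attached to the relationship, rather than collapsing it into a single score, is what makes it possible to answer "why does this edge exist?" for any given assertion.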
Provenance is not static; it is updated dynamically as new datasets, analyses, and publications are integrated through the Data4Cure Data Hub and the App Engine. This ensures that the graph remains both transparent and current—a reliable substrate for large-scale analysis and AI-driven discovery.
Visualization, Interpretation and Collaboration
Interpretability and visualization are central to CURIE’s design. Literature relations are supported by direct links and citations, while data-driven relations are supported by dynamically generated figures and detailed information on the specific data and analysis that produced the relation. This interactive representation allows researchers to inspect how each connection is supported, compare evidence from different modalities, and identify areas where data and literature evidence converge or diverge.
The platform also incorporates fine-grained permission controls, allowing organizations to manage access to specific subsets of the Knowledge Graph. Teams can securely define which data or knowledge components are visible to individual users or user groups, so that proprietary knowledge can reside within the same unified framework that serves each organization.
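As a rough sketch of how group-based visibility over graph subsets might work, the Python below filters relations by the groups a user belongs to. The `allowed_groups` field, the rule that an empty set means "visible to everyone", and the group names are all assumptions made for this example, not a description of the platform's actual access-control mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    subject: str
    predicate: str
    object: str
    # Hypothetical field: groups allowed to see this relation.
    # Assumption: an empty set means the relation is visible to all users.
    allowed_groups: frozenset = frozenset()


def visible_relations(relations, user_groups):
    """Return only the relations the given group memberships may see."""
    groups = set(user_groups)
    return [
        r for r in relations
        if not r.allowed_groups or (r.allowed_groups & groups)
    ]


# Placeholder relations and group names, for illustration only.
relations = [
    Relation("gene:G1", "associated_with", "disease:D1"),
    Relation("compound:C1", "modulates", "target:T1", frozenset({"team-a"})),
]

team_view = visible_relations(relations, ["team-a"])
guest_view = visible_relations(relations, ["guest"])
```

In this sketch a member of `team-a` sees both relations, while a user outside that group sees only the unrestricted one, so restricted and unrestricted knowledge share a single graph.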
These capabilities make the CURIE Knowledge Graph not only a repository of interconnected knowledge, but a collaborative, transparent environment for exploring, validating, and sharing biomedical insights at scale.
AI-Driven Knowledge Synthesis and Insights
The CURIE Knowledge Graph provides an ideal foundation for biomedical AI. Its structured, continuously updated representation of biological and clinical knowledge enables AI models to learn from context—linking genes, pathways, diseases, and phenotypes in ways that reflect real biological complexity.
This foundation supports a broad ecosystem of AI applications on the Data4Cure platform, from target and indication discovery to subtype identification and automated evidence synthesis. Tools such as Target Intelligence, Subtype Intelligence, and CURIE AI Entity Reports all build on CURIE’s integrated structure to deliver explainable, evidence-based results.
By training and operating directly on a living, semantically rich graph, these AI systems can continuously evolve with new data and insights—supporting diverse research goals across disease biology, biomarker discovery, and therapeutic development.
Recognition and Scientific Impact
The importance of this approach has been recognized in Gartner’s 2025 Hype Cycle for Life Science R&D, where Semantic Knowledge Graph Tools—including CURIE—were placed on the Slope of Enlightenment, reflecting growing maturity and impact across pharmaceutical R&D.
This recognition underscores a broader shift: as the life sciences adopt AI-driven discovery methods, structured and continuously updated knowledge graphs are becoming essential infrastructure for modern research.
Toward Continuous Knowledge
The CURIE Knowledge Graph embodies a simple but transformative principle: knowledge grows through connection and context. Each new dataset, analysis, or publication strengthens the graph, linking new evidence to the existing web of biological relationships.
By integrating and contextualizing data at scale, CURIE provides a shared foundation for both human and machine intelligence—enabling researchers to explore complex biological systems, discover new targets, and accelerate translation from data to insight.