The technical components of the MIRACUM data integration centres (DIC) are defined as parts of a modular architecture. They may interact with each other and exchange data through ETL processes (Extraction, Transformation, Loading) and standardized application programming interfaces (REST service interfaces).
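The ETL pattern named above can be illustrated with a minimal sketch. It does not describe the actual MIRACUM pipelines: the source field names (`pid`, `lab_code`, `result`), the `Observation` structure and the in-memory list standing in for a target repository are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical target structure; not a MIRACUM data model."""
    patient_id: str
    code: str
    value: float


def extract(rows):
    """Extract: yield raw records from a (hypothetical) source system."""
    for row in rows:
        yield row


def transform(raw):
    """Transform: normalise field names and types into the target structure."""
    return Observation(
        patient_id=str(raw["pid"]).strip(),
        code=raw["lab_code"].upper(),
        value=float(raw["result"]),
    )


def load(observations, target):
    """Load: append transformed records to the target store."""
    for obs in observations:
        target.append(obs)
    return len(target)


def run_etl(rows, target):
    """Chain the three steps; returns the number of records in the target."""
    return load((transform(r) for r in extract(rows)), target)


# Assumed example data, not taken from any real source system.
source_rows = [
    {"pid": " 001 ", "lab_code": "hb", "result": "13.5"},
    {"pid": "002", "lab_code": "crea", "result": "0.9"},
]
repository = []
run_etl(source_rows, repository)
```

Keeping the three steps as separate functions mirrors the modular idea of the architecture: each step could be replaced (for example, the load step by a call to a REST interface) without touching the others.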
The major components of the DIC architecture are built upon a Medical Informatics ReusAble eCO-system of open source Linkable and Interoperable software tools (MIRACOLIX).
MIRACOLIX is composed of a large set of scalable and interoperable software tools, which the MIRACUM partners will design, develop, refine, deploy and implement step by step. The DIC architecture will thus grow in five yearly phases during the development and funding phase and be continuously enhanced. Depending on their previous experience and competences, different MIRACUM competence centres, located at the MIRACUM partner sites, will provide the software tools, adapt them for use within a MIRACUM DIC, and deploy and support them.
For this ecosystem, we aim to reuse as many open source software tools as possible that have already been applied successfully in other international research projects.
MIRACUM starts with a basic architecture that was already implemented in the conceptual phase as a proof of concept and is based on the MIRACOLIX 0.9 release. Upcoming releases (MIRACOLIX 1.0, 2.0, 3.0, 4.0) may bring functional upgrades of established architecture components, moving them to a higher level of maturity, as well as the introduction of new components into the DIC architecture.
MIRACOLIX 0.9 comprises the following software modules:
The fully released MIRACOLIX 4.0 based DIC architecture (at the end of year 4) shall comprise at least the following further technical components:
Data Anonymization Tool:
Supports the anonymization of datasets by application of statistical disclosure methods adapted to the specific requirements of each use case; e.g. ARX: http://arx.deidentifier.org/; see Prasser et al. 2016abc, Kohlmayer et al. 2015, Prasser et al. 2014
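To make the idea of such disclosure methods concrete, the following sketch checks k-anonymity, one well-known privacy model, before and after generalising a quasi-identifier. It is an illustration only and does not use the ARX API; the record fields, the choice of quasi-identifiers and the helper names `generalize_age` and `k_anonymity` are assumptions.

```python
from collections import Counter


def generalize_age(age, width=10):
    """Generalise an exact age to a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share
    the same combination of quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


# Assumed example data; not taken from any real dataset.
records = [
    {"age": 34, "zip": "91054", "diagnosis": "I10"},
    {"age": 37, "zip": "91054", "diagnosis": "E11"},
    {"age": 52, "zip": "91052", "diagnosis": "I10"},
    {"age": 58, "zip": "91052", "diagnosis": "J45"},
]

# Replace exact ages by ten-year ranges, leaving other fields untouched.
generalized = [{**r, "age": generalize_age(r["age"])} for r in records]

k_before = k_anonymity(records, ["age", "zip"])
k_after = k_anonymity(generalized, ["age", "zip"])
```

With exact ages every record is unique on (`age`, `zip`), so `k_before` is 1; after generalisation each combination occurs twice, so `k_after` is 2. Which attributes count as quasi-identifiers and which value of k is acceptable would depend on the specific use case.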
Data Harmonisation/Data Mapping Tool:
Its functionality is closely linked to the MDR; it is intended to support the mapping processes between the different local data elements, with their respective values (value lists), and the centrally defined core data set; its development shall be based on MOLGENIS (BiobankConnect) and the Erlangen Ontology-based Mapping Tools (compare Pang et al. 2015a; Pang et al. 2015b; Pang et al. 2016; Swertz et al. 2010; Mate et al. 2011; Mate et al. 2015)
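The mapping of local data elements and value lists onto a core data set can be sketched as follows. This is not the MOLGENIS or Erlangen tooling; the local element names (`GESCHL`, `GEBDAT`), the core element names and the `ElementMapping` structure are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ElementMapping:
    """Maps one local data element to a core data element.
    An empty value_map means values are passed through unchanged."""
    core_element: str
    value_map: dict = field(default_factory=dict)


# Hypothetical mappings for a single site.
SITE_MAPPINGS = {
    "GESCHL": ElementMapping("gender", {"M": "male", "W": "female", "D": "diverse"}),
    "GEBDAT": ElementMapping("birth_date"),
}


def map_record(local_record, mappings):
    """Translate a local record into core data set terms.
    Returns the mapped record and a list of elements or values
    for which no mapping is defined."""
    mapped, unmapped = {}, []
    for local_name, local_value in local_record.items():
        mapping = mappings.get(local_name)
        if mapping is None:
            unmapped.append(local_name)
            continue
        if mapping.value_map:
            if local_value not in mapping.value_map:
                unmapped.append(f"{local_name}={local_value}")
                continue
            local_value = mapping.value_map[local_value]
        mapped[mapping.core_element] = local_value
    return mapped, unmapped


mapped, unmapped = map_record(
    {"GESCHL": "W", "GEBDAT": "1970-01-01", "STATION": "A3"}, SITE_MAPPINGS
)
```

Reporting unmapped elements and values separately, rather than dropping them silently, gives the people maintaining the mappings a worklist of gaps to close.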
Consent Management System:
shall support the electronic management and provision of graduated levels of patient consent in patient care as well as different research contexts; shall be based on gICS (generic informed consent service), which was developed and applied within the MOSAIC/MAGIC projects (compare Bialke et al. 2015)
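Graduated levels of consent can be pictured as a set of permitted uses per patient that is checked before data are released. The sketch below is not the gICS interface; the `Consent` structure and the use labels (`patient_care`, `local_research`, `cross_site_research`) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Consent:
    """Hypothetical consent record with graduated permissions."""
    patient_id: str
    permitted_uses: frozenset
    valid_until: date
    withdrawn: bool = False


def is_use_permitted(consent, use, on_date):
    """A use is permitted only if the consent is neither withdrawn
    nor expired and explicitly covers the requested use."""
    if consent.withdrawn or on_date > consent.valid_until:
        return False
    return use in consent.permitted_uses


# Assumed example: consent covers care and local research only.
consent = Consent(
    patient_id="001",
    permitted_uses=frozenset({"patient_care", "local_research"}),
    valid_until=date(2030, 12, 31),
)

local_ok = is_use_permitted(consent, "local_research", date(2025, 1, 1))
cross_site_ok = is_use_permitted(consent, "cross_site_research", date(2025, 1, 1))
```

Here `local_ok` is true and `cross_site_ok` is false, because the patient has not agreed to cross-site use. The check defaults to refusal whenever a use is not explicitly listed.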
Natural Language Processing Tool:
shall support the analysis of free-text information (typically provided in physician discharge letters, clinical notes, radiology/pathology reports and other documents of the EHR) and the annotation of such narrative text documents with structured data elements defined within the M-MDR; shall be based on the Averbis Information Discovery tool (compare e.g. Seuss et al. 2017, López-Garcia et al. 2016, Christoph et al. 2015, Kreuzthaler et al. 2015, Schulz et al. 2013, 2011)
Hospital-/Faculty-wide Trial-/Project Registry:
shall provide the single source of information about all research projects and clinical trials established at a medical faculty/university hospital; shall comprise all the relevant metadata of such research projects/trials, including e.g. the trial's eligibility criteria, which shall be further applied in the IT-supported patient recruitment tool; to be newly developed within the MIRACUM project based on prototypes existing at GUF, UKFr and UME (compare e.g. Trinczek et al. 2014, Schreiweis et al. 2014)
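How eligibility criteria stored in a registry might be reused by a recruitment tool can be sketched as follows. The `Criterion` structure, the supported operators and the patient attributes are assumptions; the actual registry and recruitment tool may represent criteria quite differently.

```python
from dataclasses import dataclass
import operator

# Comparison operators supported by this sketch.
OPERATORS = {">=": operator.ge, "<=": operator.le, "==": operator.eq}


@dataclass(frozen=True)
class Criterion:
    """One structured eligibility criterion of a trial (hypothetical)."""
    attribute: str
    op: str
    value: object
    exclusion: bool = False


def is_eligible(patient, criteria):
    """A patient is eligible if every inclusion criterion matches
    and no exclusion criterion matches."""
    for c in criteria:
        if c.attribute not in patient:
            return False  # missing data: do not propose the patient automatically
        matched = OPERATORS[c.op](patient[c.attribute], c.value)
        if matched == c.exclusion:
            # inclusion criterion not met, or exclusion criterion met
            return False
    return True


# Assumed example criteria for a fictitious trial.
criteria = [
    Criterion("age", ">=", 18),
    Criterion("diagnosis", "==", "I10"),
    Criterion("pregnant", "==", True, exclusion=True),
]

candidate = {"age": 54, "diagnosis": "I10", "pregnant": False}
eligible = is_eligible(candidate, criteria)
```

Storing criteria as data rather than as code is what allows one registry entry to serve both as documentation of the trial and as input to automated screening.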
Modules for innovative user-friendly and efficient patient care process visualization:
a set of generic and easily configurable patient care process visualization modules, such as sunburst diagrams, a triptych panel, patient timelines and geo visualizations, e.g. the R statistical package with the ggplot2 and Shiny add-ons; compare Icahn School of Medicine at Mount Sinai http://ehdviz.dudleylab.org/ (Badgeley et al. 2016)
Long-Term Research Data Archive:
shall support the long-term storage of data (but also respective analysis software artefacts/queries) which have been applied in a research project; requires a close integration with the hospital-/faculty-wide trial-/project registry and the project proposal management tool;
shall allow users to authenticate against central as well as local components; accounts issued by home institutions can be used via DFN-AAI; e.g. OAuth (compare Choi et al. 2016b, Rieger 2009); Samply.Auth (MAGIC project)
Tools for Development, Deployment and Monitoring of IT Solutions:
(Continuous Integration Test-Pipeline)
shall support the development and testing of dedicated MIRACOLIX software solutions at one MIRACUM site, their integration testing within the general DIC architecture, their deployment to all other MIRACUM sites, and continuous monitoring in operation;