The Big Data Center (BDC) of China Medical University Hospital (CMUH) established the Clinical Research Data Repository in 2016. BDC manages the only phenome-genome-environmental data platform in Asia, encompassing 20 years of electronic medical record (EMR) and environmental exposure data from 3.5 million patients and genetic information from 400,000 patients. This forms a solid foundation for generating high-resolution clinical data. To put this medical big data to use, BDC developed a smart data platform, the iHi Platform, in 2020 to ignite hyper-intelligent data applications. The iHi Platform is the only data platform in Taiwan that combines clinical, genetic, and environmental data (Fig. 1). It provides clean, integrated, and de-identified data to clinical researchers through a cloud-based system. This data architecture makes the data ecosystem interoperable and sustainable, solves the problems of cluttered data and low accessibility, and provides a foundation for artificial intelligence applications. Through the iHi Platform services, we aim to expand the use of multi-omics clinical data in education, research, and clinical or business applications. Ultimately, the insights generated on the iHi Platform feed back into clinical settings and improve medical quality and patient health.
The iHi Platform was designed as a patient-centered medical data ecosystem that provides clinical researchers with accessible, reliable, and diverse data. Several AI/data tools have received US and Taiwan patents, and one AI tool has been approved by the US FDA or Taiwan FDA, attesting to the validity and quality of the iHi Platform. The platform is built on an innovative data structure (data LEGO) and systematic data annotation workflows (data chip). We also established the iHi Genomics Analytic Platform to speed up translational research discovery. With deep-cleaned, comprehensive, and accessible data services, the iHi Platform can carry research through to intelligent applications (Fig. 2).
We aim to build a full-spectrum big data ecosystem that not only integrates EMR, health insurance, genomic, and environmental data, but also combines them with real-world data from patient-centered systems and multi-omics data such as microbiome and proteomic data. Most importantly, these diverse and heterogeneous data must be linkable and traceable so that they remain sustainable and reusable. Therefore, we process all data through the standardized data management pipeline, which provides users with high-quality and protected clinical datasets. Furthermore, we modularize multi-omics and multi-dimensional datasets into data LEGO bricks, which are deep-cleaned and sorted by their characteristics. The iHi data platform acts as a data LEGO pool that holds diverse data sources, such as EHR, medical images, and examination reports. Researchers can select the data bricks of interest, build their own unique data castles, and perform analyses on the iHi Platform (Fig. 3).
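The brick-and-castle idea above can be illustrated with a minimal sketch. It assumes each brick is a cleaned dataset keyed by a de-identified patient identifier and that a castle is the set of patients present in every selected brick. The names `DataBrick`, `build_castle`, and all field names are hypothetical and are not taken from the iHi Platform.

```python
from dataclasses import dataclass


@dataclass
class DataBrick:
    """A hypothetical data brick: one cleaned dataset keyed by patient ID."""
    name: str      # brick label, e.g. "labs" or "genomics"
    source: str    # provenance, kept so every value stays traceable
    records: dict  # de-identified patient ID -> dict of fields


def build_castle(bricks):
    """Link the selected bricks on patient ID (inner join).

    Each field is prefixed with its brick name so the origin of every
    value remains visible in the combined dataset.
    """
    if not bricks:
        return {}
    shared = set(bricks[0].records)
    for brick in bricks[1:]:
        shared &= set(brick.records)
    castle = {}
    for pid in sorted(shared):
        row = {}
        for brick in bricks:
            for field_name, value in brick.records[pid].items():
                row[f"{brick.name}.{field_name}"] = value
        castle[pid] = row
    return castle


# Illustrative, made-up records.
labs = DataBrick("labs", "EMR", {"P1": {"egfr": 52}, "P2": {"egfr": 88}})
genomics = DataBrick("genomics", "genotyping", {"P1": {"snp_a": "AG"}})
castle = build_castle([labs, genomics])
```

Only patient `P1` appears in both bricks, so the castle holds a single linked row. An inner join is one possible design choice; a real platform might instead keep patients with partial data.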
As noted above, all data provided on the iHi Platform pass through the standardized data management pipeline. To perform systematic data cleaning, validation, and integration, we established a unique smart data chip fabrication process that controls the quality of each processing step.
The process runs from data source acquisition and data architecture design, through data polishing, standardization, and refinement, to data validation and stacking, and yields a smart data chip of qualified and certified datasets (Fig. 4). Because this fabrication process is standardized and pre-built, each processing step in the iHi Platform can be managed and traced easily. In addition, the iHi Platform is the only data platform in Taiwan that provides ISO-certified de-identified data (Fig. 5). Applying continuous, large-scale flow production to data processing allows data to be deeply cleaned and supports high-quality AI solutions that fit into the real-world clinical workflow. In turn, high-performance AI can extract important new data features and insights, which increases data diversity and further enriches the smart data ecosystem (Fig. 6).
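The traceability of such a staged process can be sketched as follows. This is a minimal illustration, assuming each stage is a function applied in order and that an audit trail records how many rows enter and leave each stage. The three stages shown (`polish`, `standardize`, `validate`), the field names, and the rules they apply are hypothetical stand-ins, not the platform's actual fabrication steps.

```python
def polish(records):
    """Drop records that contain any missing (None) value."""
    return [r for r in records if all(v is not None for v in r.values())]


def standardize(records):
    """Normalize the hypothetical 'sex' field to a one-letter upper-case code."""
    return [{**r, "sex": str(r["sex"]).strip().upper()[:1]} for r in records]


def validate(records):
    """Keep only records whose hypothetical 'age' field is plausible."""
    return [r for r in records if 0 <= r["age"] <= 120]


def run_pipeline(records, steps):
    """Apply the steps in order and log row counts for each one."""
    trail = []
    for name, step in steps:
        rows_in = len(records)
        records = step(records)
        trail.append({"step": name, "rows_in": rows_in, "rows_out": len(records)})
    return records, trail


# Illustrative, made-up input records.
raw = [
    {"pid": "P1", "sex": " f", "age": 63},
    {"pid": "P2", "sex": "M", "age": None},
    {"pid": "P3", "sex": "male", "age": 430},
]
clean, trail = run_pipeline(
    raw,
    [("polish", polish), ("standardize", standardize), ("validate", validate)],
)
```

The `trail` list shows where records were lost (one at polishing, one at validation), which is the kind of per-step accounting that makes a processing chain auditable.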
In 2021, we launched the iHi Genomics analytic platform, an easy-to-use platform for exploring data and extracting insights from datasets of interest. iHi Genomics provides disease cohort selection and genome-wide association study (GWAS) analysis within a few clicks. It can generate a full GWAS report, including quality control details and a Manhattan plot of the SNPs significantly associated with a disease (Fig. 7). Using virtual desktop infrastructure (VDI), users can remotely access ISO-certified de-identified data in a highly secure environment (Fig. 8). The iHi Genomics analytic pipeline also provides a full report on each identified gene or SNP, with a detailed description and links to external information, which allows researchers without coding skills to perform basic GWAS analysis painlessly.
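For readers unfamiliar with what a GWAS computes for each SNP, the following is a minimal sketch of one common approach, an allelic chi-square test on a 2x2 table of allele counts in cases and controls. The source does not state which test iHi Genomics uses, so this is an assumption chosen for illustration; the function names, SNP labels, and counts are hypothetical. The value plotted on the vertical axis of a Manhattan plot is -log10(p), and 5e-8 is a commonly used genome-wide significance threshold.

```python
import math

GENOME_WIDE_P = 5e-8  # commonly used threshold; an assumption here


def allelic_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele table.

    Returns (chi2, p). For 1 degree of freedom the upper-tail probability
    of a chi-square value x equals erfc(sqrt(x / 2)).
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    denom = (
        (case_alt + case_ref)
        * (ctrl_alt + ctrl_ref)
        * (case_alt + ctrl_alt)
        * (case_ref + ctrl_ref)
    )
    if denom == 0:
        return 0.0, 1.0  # degenerate table: no evidence either way
    chi2 = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2 / denom
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def manhattan_points(counts_by_snp):
    """Return (snp, -log10 p, passes_threshold) for each SNP."""
    points = []
    for snp, counts in counts_by_snp.items():
        _, p = allelic_chi2(*counts)
        p = max(p, 1e-300)  # guard against log10(0) on underflow
        points.append((snp, -math.log10(p), p < GENOME_WIDE_P))
    return points


# Made-up allele counts: (case_alt, case_ref, ctrl_alt, ctrl_ref).
example = {"snp_a": (50, 50, 50, 50), "snp_b": (60, 40, 40, 60)}
points = manhattan_points(example)
```

A production GWAS would normally add genotype quality control and adjust for covariates such as ancestry, typically with regression models rather than a plain allelic test.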
Drawing on our extensive experience and the high-quality data of the iHi Platform, we have been collaborating with 19 international institutions, including universities, medical centers, and national institutes of health. In 2018 and 2019, we were invited by the American Society of Nephrology (ASN) to present our big data research and its applications in kidney disease (Fig. 9). Our solid research in chronic kidney disease genetics has attracted the attention of Dr. Anna Köttgen, co-director of the CKDGen Consortium, an international consortium on the genetic epidemiology of chronic kidney disease. Since 2023, we have been working with CKDGen on the global chronic kidney disease genetics project, contributing genetic information from Asian populations. The Radiological Society of North America (RSNA) has also invited us to join its world-renowned AI Challenges as a data contributor. The infrastructure and high-quality data of the iHi Platform have been highly recognized by many leading experts worldwide, such as Dr. Nick Bryan, former president of RSNA (Fig. 10).
Since 2022, our global cooperation on clinical and intelligent medicine projects has grown stronger, including collaborations with universities and hospitals in the United States, the United Arab Emirates, and Japan (Fig. 11). In the future, we will continue to nurture the iHi data ecosystem and to integrate these worldwide collaborations.
The iHi Platform supports 161 theme-based databases and has contributed to over 100 publications, 8 patented AI solutions, and regulatory approvals from both the FDA and TFDA for AI-driven medical applications. The platform's excellence has been recognized through multiple prestigious domestic and international awards, including those for digital innovation, healthcare quality, and patient safety, highlighting its impact on medical research and AI-driven healthcare advancements.