My primary research interest is translational biomedical and health informatics. Advances in computing and Web-based technology have produced a large and growing volume of biomedical and health data. The main challenge in the biomedical and health informatics fields is to leverage informatics approaches and emerging computational and Web technologies to put these data to use. My research goal is to facilitate meaningful use of biomedical and clinical data to better serve drug discovery and improve health care.

Keywords for research interests

  • Drug discovery
  • Precision medicine
  • Pharmacovigilance
  • Pharmacogenomics
  • Clinical decision support
  • Data integration
  • Data mining
  • Network analysis
  • Semantic Web

Focus areas

  • Computational drug discovery. This area employs large-scale data mining and semantic prediction to support drug discovery. The goals are to 1) develop biomedical applications for drug discovery, especially drug repositioning (discovering novel uses of “old” drugs, which offers a better risk-versus-reward trade-off than traditional drug development), by applying advanced informatics approaches to a large volume of heterogeneous biomedical data; and 2) leverage patient data, together with advanced informatics approaches and technologies, to identify alternative uses of “old” drugs.
  • Precision medicine. This area explores electronic health records (EHR) and relevant genomics data to develop intelligent applications that recommend individualized treatment plans, applying advanced data mining and machine learning approaches. I am also interested in leveraging high-throughput computational technologies to use EHR data efficiently in biomedical and clinical association studies, which will ultimately support epidemiological studies.
  • Pharmacovigilance. Also known as drug safety, this is the pharmacological science relating to the collection, detection, assessment, monitoring, and prevention of adverse effects of drugs. In this area, I am interested in adverse drug event prediction and drug-drug interaction detection and prediction, applying data mining and Natural Language Processing (NLP) techniques to a large volume of biomedical and literature data.
  • High-throughput phenotyping. This research aims to automatically identify a cohort of patients based on pre-defined phenotype algorithms that specify certain diseases, symptoms, or clinical findings. To achieve this automation, the phenotype algorithms need to be modeled and formatted into a machine-readable syntax. To that end, I explore different types of phenotype authoring tools, including the Measure Authoring Tool (MAT) developed by CMS, i2b2, and Eureka! Clinical Analytics.
  • Data integration and representation. This area mainly focuses on two aspects: 1) data integration by applying standardized biomedical vocabularies and standardized data models, such as clinical element models (CEMs) and FHIR; and 2) semantic data representation by leveraging emerging Semantic Web technologies.
  • Linked Drug Open Data. The aim is to discover drug linkages among diverse drug resources on the Web in support of translational biomedical research. Semantic Web technologies have played a key role in this work.