We are the Data Science & Artificial Intelligence Lab (DSAIL) at KAIST, led by Prof. Chanyoung Park.
Due to the recent expansion of social media and online communities, online platforms in the digital economy are inundated with vast amounts of user-generated multimodal (heterogeneous) data from various sources, which can be categorized into structured data (e.g., graphs such as social networks) and unstructured data (e.g., text, image, video, and audio). When properly analyzed, such multimodal data can be a valuable asset to companies, but the analysis is challenging: it is difficult not only to extract meaningful information from inherently sparse and noisy data, but also to combine and customize the knowledge extracted from different modalities with different statistical properties to facilitate various target applications.
Research Area
Our goal is to mine meaningful knowledge from multimodal data and to develop artificial intelligence solutions for various real-world applications across different disciplines.
Two underlying themes of our research are:
Representation: How can we extract knowledge from different modalities of data and represent them in a unified way such that the relations among different modalities are captured, and the synergy within the multimodality is facilitated?
Fusion: How can we combine the extracted knowledge and customize it to facilitate various underlying target applications?
Our main research interests include Data-centric AI, Machine Learning, Deep Learning, Multi-modal Data Mining, and their applications including but not limited to the following:
Recommender system
AI for Science (Chemistry/Bioinformatics/Materials Science)
Graph Neural Network and its Applications
Molecular design and Drug discovery
(Multi-modal) Representation learning
Large Language Models
Explainable AI
Robust machine learning
Scene understanding
Knowledge graphs
Continual learning
Causal learning
Social network analysis
Graph mining
Fraud/Anomaly detection
Sentiment analysis
Purchase/Click prediction
Time-series and spatio-temporal analysis
AI for finance
etc…
Retrieval-Retro: Retrieval-based Inorganic Retrosynthesis with Expert Knowledge, NeurIPS24
Vision Language Model is NOT All You Need: Augmentation Strategies for Molecule Language Models, CIKM24
Semantic Diversity-aware Prototype-based Learning for Unbiased Scene Graph Generation, ECCV24
Mew: Multiplexed Immunofluorescence Image Analysis through an Efficient Multiplex Network, ECCV24
Large Language Models meet Collaborative Filtering: An Efficient All-round LLM-based Recommender System, KDD24
Self-Explainable Temporal Graph Networks based on Graph Information Bottleneck, KDD24
Unsupervised Episode Generation for Graph Meta-learning, ICML24
LLM4SGG: Large Language Models for Weakly Supervised Scene Graph Generation, CVPR24
If you’re interested in joining our lab, send an email with your interests, CV, and transcript to cy.park (at) kaist.ac.kr.
News
November 2024
Two papers got accepted at KDD 2025.
October 2024
We are looking for interns to join our group during this winter break (8 weeks).
October 2024
Junseok Lee received the KAIST Graduate Student Outstanding Paper Award for his paper "Single-cell RNA Sequencing Data Imputation Using Bi-level Feature Propagation" published in Briefings in Bioinformatics. Congratulations!
October 2024
A paper got accepted at WSDM 2025.
October 2024
A paper got accepted at NeurIPS 2024 Workshop on AI for New Drug Modalities (AIDrugX).
September 2024
A paper got accepted at NeurIPS 2024.
September 2024
Namkyeong Lee started a research internship at Genentech, USA.
August 2024
Our paper "Subgraph Federated Learning for Local Generalization" received the Best Paper Award at KDD 2024 Workshop on Federated Learning for Data Mining and Graph Analytics (FedKDD).
August 2024
Our paper "Interpretable Graph Model with Prototype-Based Graph Information Bottleneck" received the Best Paper Award at KDD 2024 Workshop on Human-Interpretable AI.