
# Xiaoze Liu

ねだるな、勝ち取れ、さすれば与えられん ("Don't beg for it. Earn it, and it shall be granted to you.")

# About

I'm a PhD student at Purdue University, advised by Prof. Jing Gao and Prof. Xiaoqian Wang. I hold a master's degree with honors from Zhejiang University (China) and a bachelor's degree with honors from Northeastern University (China).

# Recent Publications (Full List · Google Scholar)

Evaluating the Factuality of Large Language Models using Large-Scale Knowledge Graphs

We propose GraphEval to evaluate an LLM's performance using a substantially large test dataset. Specifically, the test dataset is retrieved from a large knowledge graph with more than 10 million facts, without expensive human effort. Unlike conventional methods that evaluate LLMs based on generated responses, GraphEval streamlines the evaluation process by creating a judge model to estimate the correctness of the answers given by the LLM. Our experiments demonstrate that the judge model's factuality assessment aligns closely with the correctness of the LLM's generated outputs, while also substantially reducing evaluation costs. In addition, our findings offer valuable insights into LLM performance across different metrics and highlight the potential for future improvements in ensuring the factual integrity of LLM outputs.
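
For intuition, here is a minimal sketch of a GraphEval-style evaluation loop; the `llm` and `judge` callables, the KG triple format, and the question template are illustrative assumptions rather than the paper's actual implementation.

```python
# Sketch of a GraphEval-style factuality check
# (illustrative assumptions, not the paper's implementation).

def triples_to_questions(kg_triples):
    """Turn KG facts (head, relation, tail) into true/false questions."""
    return [f"Is it true that {h} {r} {t}?" for h, r, t in kg_triples]

def evaluate_factuality(llm, judge, kg_triples):
    """Estimate factuality without hand-grading free-form generations:
    a judge model predicts whether each LLM answer is correct."""
    questions = triples_to_questions(kg_triples)
    answers = [llm(q) for q in questions]                          # LLM answers each fact question
    verdicts = [judge(q, a) for q, a in zip(questions, answers)]   # 1 if judged correct, else 0
    return sum(verdicts) / len(verdicts)                           # fraction judged factual
```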

Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity

This survey addresses the crucial issue of factuality in Large Language Models (LLMs). As LLMs find applications across diverse domains, the reliability and accuracy of their outputs become vital. We define the Factuality Issue as the probability that LLMs produce content inconsistent with established facts. We first delve into the implications of these inaccuracies, highlighting the potential consequences and challenges posed by factual errors in LLM outputs. Subsequently, we analyze the mechanisms through which LLMs store and process facts, seeking the primary causes of factual errors. Our discussion then transitions to methodologies for evaluating LLM factuality, emphasizing key metrics, benchmarks, and studies. We further explore strategies for enhancing LLM factuality, including approaches tailored for specific domains. We focus on two primary LLM configurations, standalone LLMs and retrieval-augmented LLMs that utilize external data, and detail their unique challenges and potential enhancements. Our survey offers a structured guide for researchers aiming to fortify the factual reliability of LLMs.
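
To make the two configurations concrete, here is a minimal sketch contrasting them; the retriever interface and prompt template are assumptions for illustration only, not an API from the survey.

```python
# Sketch contrasting standalone vs. retrieval-augmented LLMs
# (retriever interface and prompt template are illustrative assumptions).

def standalone_answer(llm, question):
    """Standalone LLM: relies solely on parametric knowledge."""
    return llm(question)

def retrieval_augmented_answer(llm, retriever, question, k=3):
    """Retrieval-augmented LLM: grounds the answer in retrieved evidence,
    reducing reliance on possibly stale or wrong parametric memory."""
    passages = retriever(question, top_k=k)   # fetch k supporting passages
    prompt = "Context:\n" + "\n".join(passages) + f"\n\nQuestion: {question}\nAnswer:"
    return llm(prompt)
```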

Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

Knowledge Graphs (KGs) play a pivotal role in advancing various AI applications, with the semantic web community's exploration into multi-modal dimensions unlocking new avenues for innovation. In this survey, we carefully review over 300 articles, focusing on KG-aware research in two principal aspects: KG-driven Multi-Modal (KG4MM) learning, where KGs support multi-modal tasks, and Multi-Modal Knowledge Graph (MM4KG), which extends KG studies into the MMKG realm. We begin by defining KGs and MMKGs, then explore their construction progress. Our review includes two primary task categories: KG-aware multi-modal learning tasks, such as Image Classification and Visual Question Answering, and intrinsic MMKG tasks like Multi-modal Knowledge Graph Completion and Entity Alignment, highlighting specific research trajectories. For most of these tasks, we provide definitions, evaluation benchmarks, and additionally outline essential insights for conducting relevant research. Finally, we discuss current challenges and identify emerging trends, such as progress in Large Language Modeling and Multi-modal Pre-training strategies. This survey aims to serve as a comprehensive reference for researchers already involved in or considering delving into KG and multi-modal learning research, offering insights into the evolving landscape of MMKG research and supporting future work.
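
As a rough illustration of what the multi-modal setting adds to classic KG tasks such as entity alignment, here is a toy sketch of a multi-modal entity and a naive alignment score; the data layout, embeddings, and similarity weighting are assumptions for illustration, not a method from the survey.

```python
# Toy multi-modal KG entity and a naive entity-alignment score
# (layout and weighting are illustrative assumptions).
from dataclasses import dataclass

import numpy as np

@dataclass
class MMEntity:
    name: str
    text_emb: np.ndarray   # embedding of the entity's textual description
    image_emb: np.ndarray  # embedding of an associated image

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def alignment_score(e1: MMEntity, e2: MMEntity, w_text=0.5, w_image=0.5):
    """Score whether two entities from different MMKGs denote the same
    real-world object, combining textual and visual similarity."""
    return w_text * cosine(e1.text_emb, e2.text_emb) + w_image * cosine(e1.image_emb, e2.image_emb)
```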

# Services

Serve as a journal reviewer for:

Serve as a conference reviewer / PC member for:

# Internships

  • (2022/09-2023/04) Research Intern @ Language Technology Lab, Alibaba DAMO Academy
  • (2021/08-2022/07) Research Intern @ Alibaba Cloud Database Services
  • (2019/10-2020/01) Software Developer Intern @ TikTok Backend Services