May 18 – 21, 2026

AI Agents for Sinitic Texts
Summer Workshop

This intensive four-day workshop will teach the latest applications of large language models (LLMs) to Sinitic texts, with a focus on agentic approaches: AI-driven workflows that can support and automate multi-step research tasks.

Overview

Organized by the Digital China Initiative (Principal Investigators: Peter Bol and Michael Szonyi).

The workshop will be led by Dr. Kwok-leong Tang, Managing Director of the Digital China Initiative, with a group of tutors. By the end of the workshop, you will be able to use natural language to create programs that perform sophisticated research tasks at scale. The workshop will show how to bring AI to bear on every step of the research cycle, from data collection and processing to analysis and presentation, using both qualitative and quantitative approaches. No prior digital humanities experience is required; participants with limited digital skills are especially welcome.

Topics

01

Digitizing Materials

Digitizing materials that were not born digital

02

Data Collection

Collecting data online (web scraping; APIs)

03

Data Modeling

Turning unstructured data into structured data

04

Entity Extraction

Named entity extraction (people, places, dates, and other entities)

05

Knowledge Graphs

Networks and knowledge graphs

06

Custom Tools

Building custom digital tools for your research needs

07

Geospatial Analysis

LLM-assisted geospatial analysis

Schedule

May 18

Workshop Session 1

9:00 AM – 4:00 PM

May 19

Workshop Session 2

9:00 AM – 4:00 PM

May 20

Workshop Session 3

9:00 AM – 4:00 PM

May 21

Workshop Session 4

9:00 AM – 4:00 PM