AINS graduate course
AINS6006 · Big Data Management for AI Applications
Description
Data platforms, governance, and pipelines that feed reliable training and evaluation datasets for AI workloads.
Castalia LMS
Course shells on the Castalia LMS are provisioned per license; this link opens the LMS, where you can explore the guest demo or the landing page.
Open Castalia LMS
Buy license: continue on the purchase hub to request a license or institutional quote.
Syllabus outline
Modules 1–2 · Platforms
- Lakehouse concepts and query engines
- Batch vs streaming (intro)
- Schema evolution and contracts
Modules 3–4 · Quality
- Data validation and anomaly detection
- Labeling operations and inter-rater reliability
- Lineage and reproducibility
Modules 5–6 · Scale
- Partitioning and cost controls
- Access control patterns
- Lab: end-to-end pipeline slice