Course Catalog

Course Description

Rigorous foundations for understanding convergence, continuity, and approximation in machine learning. Metric spaces, topology, and compactness. Sequences and series: convergence criteria, dominated convergence. Differentiation and integration in multiple dimensions. Measure theory introduction: sigma-algebras, Lebesgue measure, measurable functions. Probability as measure theory. Applications: convergence of gradient descent, universal approximation theorem proofs.

Course Description

Optimization theory for deep learning. Convex optimization: definitions, gradient inequalities, convergence rates. First-order methods: gradient descent, projected gradient, proximal gradient. Stochastic gradient descent: convergence, learning rate schedules, noise as regularization. Adaptive methods (AdaGrad, RMSprop, Adam, AdamW): derivation and analysis. Non-convex optimization: saddle points, local minima, loss landscape visualization. Second-order methods and why they are mostly impractical at scale.

Course Description

Theoretical foundations of generalization. PAC learning framework: probably approximately correct guarantees. VC dimension: definition, examples, fundamental theorem of statistical learning. Rademacher complexity: data-dependent generalization bounds. PAC-Bayes bounds and their connection to Bayesian methods. Double descent and the interpolation threshold. Overparameterization and implicit regularization in neural networks. Connections to information theory: the minimum description length (MDL) principle.

Course Description

Reading seminar covering active research in deep learning theory. Topics change annually. Recent years: neural tangent kernel regime and its breakdown for finite-width networks; feature learning vs. lazy training; in-context learning as implicit Bayesian inference; mechanistic interpretability of circuits; emergent capabilities and phase transitions; grokking and delayed generalization. Students present papers and contribute original research directions.

Course Description

System design for production machine learning. The ML project lifecycle: problem framing, data collection, modeling, evaluation, deployment, monitoring. Common failure modes: training-serving skew, data drift, concept drift, silent failures. System design interviews for ML roles. Case studies: recommendation systems, search ranking, fraud detection, content moderation. Trade-offs: online vs. batch inference, model complexity vs. latency, precision vs. recall. Students design the architecture for a complete ML system end-to-end.

Course Description

Production inference for ML models. ONNX and model export. NVIDIA Triton Inference Server: model repository, batching, concurrency. vLLM for LLM serving: continuous batching, paged attention, quantization support. TGI (Text Generation Inference) by Hugging Face. BentoML for lightweight serving. Autoscaling: Kubernetes HPA, KEDA, GPU autoscaling. Latency optimization: caching, request batching, model distillation. Cost modeling and ROI calculations for inference infrastructure.

Course Description

Keeping ML systems healthy in production. Data drift: covariate shift, label shift, concept drift; detection with the Population Stability Index (PSI), KL divergence, and maximum mean discrepancy. Model degradation: performance metrics over time, shadow deployments, A/B testing. Logging: what to log, log aggregation (ELK stack, CloudWatch), structured logging for ML. Alerting strategies: avoiding alert fatigue while still catching real issues. Observability tools: Arize AI, Evidently AI, Fiddler, WhyLabs. LLM-specific monitoring: response quality, latency, token usage, hallucination detection.
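For orientation, the PSI drift check named above can be sketched in a few lines. This is a minimal illustration, not course material: the function name `psi`, the pre-binned count inputs, and the `eps` floor for empty bins are all assumptions made for the example.

```python
import math

def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index between two histograms over the same bins.

    expected_counts: per-bin counts from the reference (e.g. training) data.
    actual_counts:   per-bin counts from the current (e.g. serving) data.
    eps:             hypothetical floor that keeps empty bins out of log(0).
    """
    e_total = sum(expected_counts)
    a_total = sum(actual_counts)
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        p = max(e / e_total, eps)  # reference bin proportion
        q = max(a / a_total, eps)  # current bin proportion
        total += (q - p) * math.log(q / p)
    return total

# Identical histograms give zero; a shifted histogram gives a positive score.
no_drift = psi([50, 50], [50, 50])
drifted = psi([50, 50], [80, 20])
```

Every term in the sum is non-negative, so the score grows as the two histograms diverge. Any alerting threshold applied to it is a convention to be tuned per feature.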

Course Description

Mathematical foundations of sequential decision-making. MDPs: states, actions, rewards, transition dynamics, discount factor. Bellman equations: optimality conditions for value and Q functions. Dynamic programming: policy evaluation, policy iteration, value iteration. Finite vs. infinite horizon problems. Partial observability (POMDPs). Grid world and inventory control case studies. Students implement all DP algorithms from scratch.
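As an illustration of the value iteration topic listed above, here is a minimal sketch. The two-state MDP, the nested-list layout `P[s][a][s']` and `R[s][a]`, and the stopping tolerance are assumptions chosen for the example, not the course's assignment.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-10, max_iters=10_000):
    """Tabular value iteration.

    P[s][a][t]: probability of moving from state s to state t under action a.
    R[s][a]:    expected immediate reward for taking action a in state s.
    Returns the approximate optimal values and a greedy policy.
    """
    n_states = len(R)
    n_actions = len(R[0])
    V = [0.0] * n_states

    def q_value(s, a, values):
        # One-step lookahead: the right-hand side of the Bellman optimality equation.
        return R[s][a] + gamma * sum(P[s][a][t] * values[t] for t in range(n_states))

    for _ in range(max_iters):
        V_new = [max(q_value(s, a, V) for a in range(n_actions))
                 for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    policy = [max(range(n_actions), key=lambda a: q_value(s, a, V))
              for s in range(n_states)]
    return V, policy

# Hypothetical toy MDP: action 0 stays put, action 1 switches state.
# Only "stay in state 1" pays a reward of 1.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # from state 0: stay, switch
    [[0.0, 1.0], [1.0, 0.0]],  # from state 1: stay, switch
]
R = [
    [0.0, 0.0],
    [1.0, 0.0],
]
V, policy = value_iteration(P, R)
```

In this toy problem the greedy policy switches out of state 0 and then stays in state 1, so the values approach 9 and 10 with a discount factor of 0.9.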

Course Description

Neural network function approximation for RL. DQN: experience replay, target networks, Atari results. Rainbow: double DQN, dueling networks, prioritized replay, multi-step returns, distributional RL, noisy nets. Policy gradient theorem, REINFORCE, baseline variance reduction. Actor-critic: A3C, A2C, PPO, SAC. Offline RL: IQL, CQL, TD3+BC. Multi-agent RL: independent Q-learning, QMIX, MAPPO. OpenAI Gym, Gymnasium, and Brax environments.

Course Description

Learning and using models of the environment. Dyna architecture. World models: WM2, DreamerV3. MBPO and model rollouts for sample efficiency. Neural network dynamics models: aleatoric vs. epistemic uncertainty. Planning in latent space. MuZero: learning to plan without being given the rules of the environment. RSSM (Recurrent State Space Model) for partially observable environments. Students implement DreamerV3 on standard benchmarks.

Course Description

Sensing and understanding the robot's environment. Camera models (monocular, stereo, RGB-D). LiDAR point cloud processing: filtering, registration, object detection. IMU integration and state estimation. Sensor fusion: extended Kalman filter, unscented Kalman filter, particle filter. Simultaneous Localization and Mapping (SLAM): GMapping, Cartographer, ORB-SLAM3, LIO-SAM. 3D object detection: PointPillars, VoxelNet. Lab work on physical robots in the Threshold Robotics Lab.

Course Description

Technical methods for aligning language models with human values. RLHF pipeline: SFT, reward model training, PPO alignment. DPO (Direct Preference Optimization): derivation from RLHF, advantages, and limitations. Constitutional AI: Anthropic's approach to scalable oversight via AI feedback (RLAIF). Reward hacking examples and how to detect them. Scalable oversight: debate and amplification. Students run a DPO fine-tuning experiment and analyze alignment failures in public models.
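For readers unfamiliar with the DPO objective mentioned above, the per-example loss can be sketched as follows. This is an illustration under stated assumptions, not the course's experiment code: the function name, the scalar log-probability inputs, and the default `beta` are hypothetical, and real training would compute these quantities over batches of token sequences.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for a single preference pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    """
    # Implicit reward margin: how much more the policy favors the chosen
    # response over the rejected one, relative to the reference model.
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# When the policy equals the reference, the margin is zero and the loss is log 2.
at_reference = dpo_loss(-5.0, -7.0, -5.0, -7.0)
# Shifting probability toward the chosen response lowers the loss.
improved = dpo_loss(-4.0, -8.0, -5.0, -7.0)
```

The loss depends only on log-probability ratios against the reference model, which is why no separate reward model appears in the sketch.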