Course Catalog | Meridian AI

MATH-401: Real Analysis for Machine Learning

Course Description

Rigorous foundations for understanding convergence, continuity, and approximation in machine learning. Metric spaces, topology, and compactness. Sequences and series: convergence criteria, dominated convergence. Differentiation and integration in multiple dimensions. Measure theory introduction: sigma-algebras, Lebesgue measure, measurable functions. Probability as measure theory. Applications: convergence of gradient descent, universal approximation theorem proofs.

MATH-430: Optimization for Machine Learning

Course Description

Optimization theory for deep learning. Convex optimization: definitions, gradient inequalities, convergence rates. First-order methods: gradient descent, projected gradient, proximal gradient. Stochastic gradient descent: convergence, learning rate schedules, noise as regularization. Adaptive methods: AdaGrad, RMSprop, Adam, AdamW — derivation and analysis. Non-convex optimization: saddle points, local minima, loss landscape visualization. Second-order methods and why they're mostly impractical at scale.
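
As an illustration of the adaptive-methods unit, the sketch below shows a single AdamW update for a scalar parameter, assuming the usual formulation with bias-corrected moment estimates and weight decay applied directly to the parameter rather than folded into the gradient. The function name `adamw_step`, its default hyperparameters, and the toy quadratic objective are assumptions made for this example, not course materials.

```python
import math


def adamw_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.0):
    """One AdamW update for a scalar parameter (illustrative sketch).

    t is the 1-based step count used for bias correction.
    Returns the updated (theta, m, v).
    """
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction compensates for initializing m and v at zero.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Decoupled weight decay: shrink the parameter directly.
    theta = theta - lr * weight_decay * theta
    # Adaptive step scaled by the root of the second-moment estimate.
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Toy usage: minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x, m, v = 0.0, 0.0, 0.0
for step in range(1, 501):
    x, m, v = adamw_step(x, 2 * (x - 3), m, v, step, lr=0.1)
```

With `weight_decay=0.0` the update reduces to plain Adam, which makes the difference between the two methods easy to isolate.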

MATH-440: Statistical Learning Theory

Course Description

Theoretical foundations of generalization. PAC learning framework: probably approximately correct guarantees. VC dimension: definition, examples, fundamental theorem of statistical learning. Rademacher complexity: data-dependent generalization bounds. PAC-Bayes bounds and their connection to Bayesian methods. Double descent and the interpolation threshold. Overparameterization and implicit regularization in neural networks. Connections to information theory: MDL principle.

MATH-490: Seminar: Deep Learning Theory

Course Description

Reading seminar covering active research in deep learning theory. Topics change annually. Recent years: neural tangent kernel regime and its breakdown for finite-width networks; feature learning vs. lazy training; in-context learning as implicit Bayesian inference; mechanistic interpretability of circuits; emergent capabilities and phase transitions; grokking and delayed generalization. Students present papers and contribute original research directions.

MLE-401: ML Systems Design Fundamentals

Course Description

System design for production machine learning. The ML project lifecycle: problem framing, data collection, modeling, evaluation, deployment, monitoring. Common failure modes: training-serving skew, data drift, concept drift, silent failures. System design interviews for ML roles. Case studies: recommendation systems, search ranking, fraud detection, content moderation. Trade-offs: online vs. batch inference, model complexity vs. latency, precision vs. recall. Students design the architecture for a complete ML system end-to-end.

MLE-430: Model Serving and Inference Infrastructure

Course Description

Production inference for ML models. ONNX and model export. NVIDIA Triton Inference Server: model repository, batching, concurrency. vLLM for LLM serving: continuous batching, paged attention, quantization support. TGI (Text Generation Inference) by Hugging Face. BentoML for lightweight serving. Autoscaling: Kubernetes HPA, KEDA, GPU autoscaling. Latency optimization: caching, request batching, model distillation. Cost modeling and ROI calculations for inference infrastructure.

MLE-440: ML Monitoring, Observability, and Reliability

Course Description

Keeping ML systems healthy in production. Data drift: covariate shift, label shift, concept drift — detection with PSI, KL divergence, maximum mean discrepancy. Model degradation: performance metrics over time, shadow deployments, A/B testing. Logging: what to log, log aggregation (ELK stack, CloudWatch), structured logging for ML. Alerting strategies: avoid alert fatigue while catching real issues. Observability tools: Arize AI, Evidently AI, Fiddler, WhyLabs. LLM-specific monitoring: response quality, latency, token usage, hallucination detection.
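
As an illustration of the drift-detection unit, the sketch below computes the population stability index (PSI) between a reference sample and a production sample of one numeric feature. The function name `psi`, the bin edges, the `eps` floor for empty bins, and the sample values are all assumptions chosen for this example.

```python
import math


def psi(expected, actual, bin_edges, eps=1e-6):
    """Population stability index between two samples of one feature.

    bin_edges must be sorted; they define len(bin_edges) + 1 bins.
    Bin fractions are floored at eps so empty bins do not produce log(0).
    """
    def fractions(samples):
        counts = [0] * (len(bin_edges) + 1)
        for x in samples:
            # The bin index is the number of edges at or below x.
            counts[sum(x >= edge for edge in bin_edges)] += 1
        return [max(c / len(samples), eps) for c in counts]

    e = fractions(expected)
    a = fractions(actual)
    # Every term is non-negative; the sum is zero only when the fractions match.
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical feature values: a training-time baseline and a shifted sample.
baseline = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
shifted = [0.5, 0.6, 0.6, 0.7, 0.7, 0.8, 0.8, 0.9, 0.9]
edges = [0.25, 0.5, 0.75]

no_drift_score = psi(baseline, baseline, edges)
drift_score = psi(baseline, shifted, edges)
```

In practice the bin edges are usually fixed from the reference data (for example its quantiles) so that scores are comparable over time.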

RL-401: Markov Decision Processes and Dynamic Programming

Course Description

Mathematical foundations of sequential decision-making. MDPs: states, actions, rewards, transition dynamics, discount factor. Bellman equations: optimality conditions for value and Q functions. Dynamic programming: policy evaluation, policy iteration, value iteration. Finite vs. infinite horizon problems. Partial observability (POMDPs). Grid world and inventory control case studies. Students implement all DP algorithms from scratch.
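
As an illustration of the dynamic programming unit, the sketch below runs value iteration on a tiny tabular MDP by repeatedly applying the Bellman optimality backup. The function name `value_iteration`, the dictionary layout of the transition model, and the three-state chain are assumptions made for this example and do not reproduce the course's grid world or inventory control case studies.

```python
def value_iteration(transitions, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Value iteration for a finite MDP (illustrative sketch).

    transitions[s][a] is a list of (probability, next_state, reward) tuples.
    A state with no actions is treated as terminal with value 0.
    Returns (values, greedy_policy).
    """
    def q_value(outcomes, values):
        # Expected one-step reward plus discounted value of the next state.
        return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)

    values = {s: 0.0 for s in transitions}
    for _ in range(max_iters):
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:
                continue
            # Bellman optimality backup: take the best action value.
            best = max(q_value(o, values) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break

    policy = {
        s: max(actions, key=lambda a: q_value(actions[a], values))
        for s, actions in transitions.items()
        if actions
    }
    return values, policy


# Hypothetical chain: s0 -> s1 -> goal, with reward 1 for reaching the goal.
CHAIN = {
    "s0": {"stay": [(1.0, "s0", 0.0)], "go": [(1.0, "s1", 0.0)]},
    "s1": {"stay": [(1.0, "s1", 0.0)], "go": [(1.0, "goal", 1.0)]},
    "goal": {},
}
values, policy = value_iteration(CHAIN, gamma=0.9)
```

The sketch updates values in place as it sweeps the states, which is one common variant; keeping a separate copy of the previous values per sweep converges to the same fixed point.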

RL-420: Deep Reinforcement Learning

Course Description

Neural network function approximation for RL. DQN: experience replay, target networks, Atari results. Rainbow: double DQN, dueling networks, prioritized replay, distributional RL, noisy nets. Policy gradient theorem, REINFORCE, baseline variance reduction. Actor-critic: A3C, A2C, PPO, SAC. Offline RL: IQL, CQL, TD3+BC. Multi-agent RL: independent Q-learning, QMIX, MAPPO. OpenAI Gym, Gymnasium, and Brax environments.

RL-430: Model-Based RL and World Models

Course Description

Learning and using models of the environment. Dyna architecture. World models: WM2, DreamerV3. MBPO and model rollouts for sample efficiency. Neural network dynamics models: aleatoric vs. epistemic uncertainty. Planning in latent space. MuZero: planning with a learned model, without being given the environment's rules. RSSM (Recurrent State Space Model) for partially observable environments. Students implement DreamerV3 on standard benchmarks.

ROB-201: Robot Perception and Sensor Fusion

Course Description

Sensing and understanding the robot's environment. Camera models (monocular, stereo, RGB-D). LiDAR point cloud processing: filtering, registration, object detection. IMU integration and state estimation. Sensor fusion: extended Kalman filter, unscented Kalman filter, particle filter. Simultaneous Localization and Mapping (SLAM): GMapping, Cartographer, ORB-SLAM3, LIO-SAM. 3D object detection: PointPillars, VoxelNet. Lab work on physical robots in the Threshold Robotics Lab.

SAF-201: Technical Alignment: RLHF, DPO, and Constitutional AI

Course Description

Technical methods for aligning language models with human values. RLHF pipeline: SFT, reward model training, policy optimization with PPO. DPO (Direct Preference Optimization): derivation from RLHF, advantages, and limitations. Constitutional AI: Anthropic's approach to scalable oversight via AI feedback (RLAIF). Reward hacking examples and how to detect them. Scalable oversight: debate and amplification. Students run a DPO fine-tuning experiment and analyze alignment failures in public models.
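
As an illustration of the DPO unit, the sketch below evaluates the DPO loss for a single preference pair, assuming the summed log-probabilities of the chosen and rejected responses under the policy and the frozen reference model have already been computed. The function name `dpo_loss`, the argument names, the `beta` default, and the numeric values are assumptions made for this example.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair (illustrative sketch).

    Each argument is the log-probability of a full response given the prompt.
    Computes -log(sigmoid(beta * (chosen log-ratio - rejected log-ratio))).
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(x)) equals softplus(-x); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical log-probabilities for one prompt with two candidate responses.
loss_at_reference = dpo_loss(-12.0, -15.0, -12.0, -15.0)
loss_after_update = dpo_loss(-10.0, -17.0, -12.0, -15.0)
```

When the policy equals the reference model the margin is zero and the loss is log 2; the loss falls as the policy raises the chosen response's likelihood relative to the rejected one, measured against the reference.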