AI AND MACHINE LEARNING COURSE FOR MONITORING & EVALUATION

COURSE OUTLINE

Course title – AI & Machine Learning for Monitoring & Evaluation


Target audience:  M&E specialists, program managers, data analysts in government or donor-funded projects, and researchers who want to apply ML to improve monitoring, reporting, and learning.
Prerequisites
– Basic statistics (mean, variance, hypothesis testing, basic regression)
– Basic SQL and Excel
– Comfortable with data concepts (rows, columns, missing data)
– Recommended: introductory Python or R (if not, include a short pre-course module)

Course learning objectives
By the end of the course participants will be able to:
1. Explain how AI/ML methods can support M&E objectives and where they are NOT appropriate.
2. Design data collection and indicator schemes aligned with ML use (data needs, quality).
3. Prepare and clean M&E data for analysis and feature engineering.
4. Build, evaluate, and interpret classification/regression models for monitoring tasks.
5. Integrate ML outputs into dashboards and decision-making workflows.
6. Apply basic predictive monitoring (early warning), automated classification (text/imagery), and anomaly detection.
7. Understand and mitigate ethical, privacy, fairness, and governance risks.
8. Produce a small end‑to‑end ML-based M&E product (capstone) including model, evaluation, and an implementation plan.

Course format & time commitment
4 weeks, OR a compressed format of 2 weeks in person plus 2 weeks online.
Delivery: In-person or online (Zoom + GitHub/Colab + learning management system).


Introduction: M&E fundamentals + AI landscape

  1. Topics: M&E frameworks (logic model / logical framework, indicators, baselines, endlines), types of evaluation (process, outcome, impact). Where ML can add value (predictive monitoring, automated classification, program targeting, operational efficiency).
  2. Lab: Map ML use cases to an example program (education, health, livelihoods).
  3. Assignment: Write a 1-page M&E problem statement that could benefit from ML.

Data for M&E: collection, quality, and ethics

  1. Topics: data pipelines, survey vs administrative vs sensor data, sampling bias, missingness, measurement error; informed consent and privacy essentials for M&E.
  2. Lab: Explore and clean a small DHS or survey dataset: missing values, variable types, basic recoding (a cleaning sketch follows this list).
  3. Assignment: Data quality checklist for a chosen program.
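
  Sketch (illustrative): a minimal pandas version of the cleaning lab above. The file household_survey.csv and its column names are assumptions for demonstration only.

      import numpy as np
      import pandas as pd

      # Hypothetical survey extract; file and column names are illustrative only.
      df = pd.read_csv("household_survey.csv")

      # Inspect variable types and the share of missing values per column.
      print(df.dtypes)
      print(df.isna().mean().sort_values(ascending=False))

      # Recode a sentinel value (e.g. 99 = "don't know") as missing.
      df["household_size"] = df["household_size"].replace({99: np.nan})

      # Simple imputation: median for a numeric column, mode for a categorical one.
      df["household_size"] = df["household_size"].fillna(df["household_size"].median())
      df["region"] = df["region"].fillna(df["region"].mode()[0])

      # Store the recoded categorical variable with an explicit category dtype.
      df["region"] = df["region"].astype("category")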

Exploratory data analysis and feature engineering

  1. Topics: descriptive stats, visualizations, feature creation (time, geospatial, text transforms), handling categorical variables.
  2. Lab: Feature engineering on a time-series or program dataset; build baseline indicators (a sketch follows this list).
  3. Assignment: Produce an EDA report (plots + short interpretations).
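
  Sketch (illustrative): feature engineering for this lab with pandas. The file monthly_reports.csv and its columns (report_date, site_id, visits, district) are hypothetical.

      import pandas as pd

      df = pd.read_csv("monthly_reports.csv", parse_dates=["report_date"])

      # Time features derived from the reporting date.
      df["report_month"] = df["report_date"].dt.month
      df["report_quarter"] = df["report_date"].dt.quarter

      # Rolling baseline indicator: 3-month average of visits per site.
      df = df.sort_values(["site_id", "report_date"])
      df["visits_3m_avg"] = (
          df.groupby("site_id")["visits"]
            .transform(lambda s: s.rolling(3, min_periods=1).mean())
      )

      # One-hot encode a categorical variable for later modelling.
      df = pd.get_dummies(df, columns=["district"], drop_first=True)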

Supervised learning fundamentals (classification & regression)

  1. Topics: supervised model types, train/test split, cross-validation, performance metrics (accuracy, precision/recall, ROC-AUC, RMSE), baseline models.
  2. Lab: Build a logistic regression and a tree-based model to predict a monitoring outcome (e.g. dropout, attendance); a sketch follows this list.
  3. Assignment: Model comparison report + recommended baseline.
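
  Sketch (illustrative): the modelling lab above with scikit-learn, assuming a hypothetical student_monitoring.csv in which all features are already numeric and dropped_out is the 0/1 target.

      import pandas as pd
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics import classification_report, roc_auc_score
      from sklearn.model_selection import cross_val_score, train_test_split

      df = pd.read_csv("student_monitoring.csv")
      X, y = df.drop(columns=["dropped_out"]), df["dropped_out"]

      # Hold out a stratified test set for the final comparison.
      X_train, X_test, y_train, y_test = train_test_split(
          X, y, test_size=0.2, stratify=y, random_state=42
      )

      # Baseline: logistic regression scored with 5-fold cross-validated ROC-AUC.
      logreg = LogisticRegression(max_iter=1000)
      print(cross_val_score(logreg, X_train, y_train, cv=5, scoring="roc_auc").mean())

      # Tree-based comparison model, evaluated on the held-out test set.
      rf = RandomForestClassifier(n_estimators=300, random_state=42)
      rf.fit(X_train, y_train)
      print(classification_report(y_test, rf.predict(X_test)))
      print("ROC-AUC:", roc_auc_score(y_test, rf.predict_proba(X_test)[:, 1]))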

Advanced models & interpretability

  1. Topics: ensemble methods (random forest, XGBoost), model interpretability (SHAP, LIME), feature importance vs. causality.
  2. Lab: Train XGBoost, produce SHAP plots, and write interpretation notes (a sketch follows this list).
  3. Assignment: Explain model decisions for five individual cases using SHAP.
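
  Sketch (illustrative): XGBoost plus SHAP for the interpretability lab, reusing the hypothetical dropout data from the previous module; assumes the xgboost and shap packages are installed.

      import pandas as pd
      import shap
      from sklearn.model_selection import train_test_split
      from xgboost import XGBClassifier

      df = pd.read_csv("student_monitoring.csv")
      X, y = df.drop(columns=["dropped_out"]), df["dropped_out"]
      X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                          random_state=42)

      # Gradient-boosted trees; these hyperparameters are illustrative defaults.
      model = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1)
      model.fit(X_train, y_train)

      # SHAP values give global importance plus per-case explanations.
      explainer = shap.TreeExplainer(model)
      shap_values = explainer.shap_values(X_test)
      shap.summary_plot(shap_values, X_test)  # global feature importance
      # Single-case explanation (useful for the five-case assignment).
      shap.force_plot(explainer.expected_value, shap_values[0], X_test.iloc[0],
                      matplotlib=True)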

Time series & anomaly detection for monitoring

  1. Topics: time series basics, forecasting (ARIMA, Prophet), change-point detection, anomaly detection (isolation forest, seasonal decomposition), early warning systems.
  2. Lab: Use a program monitoring time series to detect anomalies and forecast the next period (an anomaly-detection sketch follows this list).
  3. Assignment: Build an early-warning rule and evaluate its timeliness/precision.
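
  Sketch (illustrative): the anomaly-detection half of this lab. The monthly series facility_visits.csv is hypothetical; forecasting the next period could follow the same preparation with ARIMA or Prophet.

      import pandas as pd
      from sklearn.ensemble import IsolationForest
      from statsmodels.tsa.seasonal import seasonal_decompose

      # Hypothetical monthly monitoring series indexed by month.
      ts = pd.read_csv("facility_visits.csv", parse_dates=["month"],
                       index_col="month")["visits"]

      # Strip trend and seasonality, then look for unusual residuals.
      decomp = seasonal_decompose(ts, model="additive", period=12)
      resid = decomp.resid.dropna()

      # Isolation forest flags roughly the most unusual 5% of months.
      iso = IsolationForest(contamination=0.05, random_state=42)
      flags = iso.fit_predict(resid.to_frame())  # -1 marks anomalies
      print(resid[flags == -1])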

Unstructured data: text and imagery for M&E

  1. Topics: NLP basics (tokenization, TF-IDF, embeddings), sentiment/topic classification for feedback data, remote sensing & satellite imagery basics, object detection for infrastructure/land use.
  2. Lab: Text classification of beneficiary feedback / sentiment; simple remote-sensing land-cover classification (Sentinel/NASA sample). A text-classification sketch follows this list.
  3. Assignment: Prototype model to classify text feedback or detect objects/features in an image dataset.
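
  Sketch (illustrative): the text-classification part of this lab, assuming a hypothetical beneficiary_feedback.csv with a free-text column (text) and a manually coded label (category).

      import pandas as pd
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics import classification_report
      from sklearn.model_selection import train_test_split
      from sklearn.pipeline import make_pipeline

      df = pd.read_csv("beneficiary_feedback.csv")
      X_train, X_test, y_train, y_test = train_test_split(
          df["text"], df["category"], test_size=0.2, random_state=42
      )

      # TF-IDF features (unigrams and bigrams) feeding a linear classifier.
      clf = make_pipeline(
          TfidfVectorizer(max_features=5000, ngram_range=(1, 2)),
          LogisticRegression(max_iter=1000),
      )
      clf.fit(X_train, y_train)
      print(classification_report(y_test, clf.predict(X_test)))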

Causal inference, impact evaluation & when not to use ML

  1. Topics: difference between predictive modeling and causal inference; RCTs, matching, IV, regression discontinuity, synthetic controls; limits of ML for causal claims; ML for covariate selection and heterogeneity analysis.
  2. Lab: Use propensity score matching or double machine learning for treatment-effect estimation (a matching sketch follows this list).
  3. Assignment: Write a plan for how ML could complement (not replace) an impact evaluation for a program.
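
  Sketch (illustrative): a bare-bones propensity-score matching workflow for this lab. The file program_participants.csv, the covariate names, and the treated/outcome columns are hypothetical, and this naive nearest-neighbour matching omits calipers and balance checks.

      import pandas as pd
      from sklearn.linear_model import LogisticRegression
      from sklearn.neighbors import NearestNeighbors

      df = pd.read_csv("program_participants.csv")
      covariates = ["age", "household_size", "baseline_score"]  # illustrative

      # Step 1: propensity score = P(treated | covariates) from a logistic model.
      ps_model = LogisticRegression(max_iter=1000).fit(df[covariates], df["treated"])
      df["pscore"] = ps_model.predict_proba(df[covariates])[:, 1]

      # Step 2: match each treated unit to its nearest control on the score.
      treated = df[df["treated"] == 1]
      control = df[df["treated"] == 0]
      nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
      _, idx = nn.kneighbors(treated[["pscore"]])
      matched_controls = control.iloc[idx.ravel()]

      # Step 3: naive matched estimate of the effect on the treated (ATT).
      att = treated["outcome"].mean() - matched_controls["outcome"].mean()
      print("ATT (matched):", round(att, 3))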

Deployment & dashboards, decision workflows, cost-benefit

  1. Topics: Model operationalization, monitoring model drift, APIs, dashboards (PowerBI/Tableau/Plotly), user adoption, cost-benefit & procurement considerations.
  2. Lab: Deploy a simple model inference script and build an interactive dashboard (Plotly Dash or Google Data Studio); a minimal API sketch follows this list.
  3. Assignment: Prepare an implementation brief (how model outputs will be used operationally).
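
  Sketch (illustrative): a minimal Flask inference API for this lab. The saved model file (model.pkl) and the JSON field names are assumptions carried over from the earlier modelling labs.

      import joblib
      import pandas as pd
      from flask import Flask, jsonify, request

      app = Flask(__name__)
      model = joblib.load("model.pkl")  # model trained and saved in earlier labs

      @app.route("/predict", methods=["POST"])
      def predict():
          # Expects one JSON record whose fields match the training features,
          # e.g. {"attendance_rate": 0.7, "age": 13, ...}
          record = pd.DataFrame([request.get_json()])
          prob = float(model.predict_proba(record)[0, 1])
          return jsonify({"risk_score": prob})

      if __name__ == "__main__":
          app.run(port=5000, debug=True)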

AI Ethics, governance, final presentations & capstone

  1. Topics: fairness metrics, privacy-preserving ML (differential privacy basics), data governance, consent, transparency to stakeholders, regulatory considerations (a minimal fairness check is sketched after this list).
  2. Capstone: Teams present end‑to‑end M&E ML project: problem, data, model, evaluation, implementation & ethics plan.
  3. Assessment: peer feedback + instructor scoring.
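
  Sketch (illustrative): a minimal demographic-parity check in the spirit of the fairness topics above. The file scored_cases.csv and its columns are hypothetical.

      import pandas as pd

      # Model predictions (0/1) plus a protected attribute such as sex.
      df = pd.read_csv("scored_cases.csv")

      # Demographic parity: compare positive-prediction rates across groups.
      rates = df.groupby("sex")["predicted_high_risk"].mean()
      print(rates)
      print("Parity gap:", round(rates.max() - rates.min(), 3))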

Hands‑on labs & tools

  1. Languages: Python (recommended), with pandas, scikit-learn, XGBoost, statsmodels, Prophet, SHAP, and TensorFlow/Keras or PyTorch for deep learning; OR R (tidyverse, tidymodels).
  2. Environments: Google Colab (no install), Jupyter notebooks, GitHub for submissions.
  3. Visualization: Plotly, Altair, Matplotlib; dashboards using Dash, Streamlit, Tableau or PowerBI.
  4. Data collection & mobile: Open Data Kit (ODK), KoBoToolbox, CommCare for field data.
  5. Cloud/Hosting: Google Cloud / AWS / Azure (optional) for deployment; Firebase or simple Flask APIs for prototypes.

Datasets (examples & sources)

  1. Demographic and Health Surveys (DHS) — household & health data
  2. World Bank Open Data — economic & development indicators
  3. Humanitarian Data Exchange (HDX)
  4. OpenStreetMap, Sentinel/Landsat imagery (Planet, Google Earth Engine)
  5. Gov open data portals (education, health)
  6. Simulated/cleaned program datasets for class exercises (prepare small CSVs)

Assignments & assessment

  1. Weekly assignments (50%): EDA, models, interpretation, implementation briefs.
  2. Capstone project (35%): group end‑to‑end project, presentation, code & report.
  3. Participation & quizzes (15%): attendance, short quizzes on readings, in-class participation.
  4. Rubrics: clarity of problem framing, appropriateness of ML method, data quality handling, model evaluation, ethics & implementation plan, reproducibility (code + README).

Sample capstone ideas

  1. Predict school dropout or attendance problems using administrative and survey data; design early-intervention triggers.
  2. Classify beneficiary complaints from free-text to route to program managers.
  3. Use satellite imagery to monitor crop loss or infrastructure construction progress.
  4. Anomaly detection on program expenditure or service delivery metrics to flag fraud/irregularities.

Suggested readings & resources

  1. Courses: Coursera “AI For Everyone” (for non-technical staff), fast.ai, DataCamp, edX/HarvardX.
  2. “The Hundred-Page Machine Learning Book” by A. Burkov (concise ML overview).
  3. “Causal Inference: What If” by M. Hernán & J. Robins (causal concepts).
  4. DHS Program tutorials and resources.
  5. Google AI for Social Good case studies.
  6. Papers & blogs on SHAP/LIME, and responsible AI toolkits (IBM, Google).