AI AND MACHINE LEARNING COURSE FOR MONITORING & EVALUATION
COURSE OUTLINE
Course title – AI & Machine Learning for Monitoring & Evaluation
Target audience: M&E specialists, program managers, data analysts in government- or donor-funded projects, and researchers who want to apply ML to improve monitoring, reporting, and learning.
Prerequisites
- Basic statistics (mean, variance, hypothesis testing, basic regression)
- Basic SQL and Excel
- Comfortable with data concepts (rows, columns, missing data)
- Recommended: introductory Python or R (if not, a short pre-course module is included)
Course learning objectives
By the end of the course participants will be able to:
1. Explain how AI/ML methods can support M&E objectives and where they are NOT appropriate.
2. Design data collection and indicator schemes aligned with ML use (data needs, quality).
3. Prepare and clean M&E data for analysis and feature engineering.
4. Build, evaluate, and interpret classification/regression models for monitoring tasks.
5. Integrate ML outputs into dashboards and decision-making workflows.
6. Apply basic predictive monitoring (early warning), automated classification (text/imagery) and anomaly detection.
7. Understand and mitigate ethical, privacy, fairness, and governance risks.
8. Produce a small end‑to‑end ML-based M&E product (capstone) including model, evaluation, and an implementation plan.
Course format & time commitment
4 weeks, or 2 weeks in-person plus 2 weeks online (compressed workshop format).
Delivery: In-person or online (Zoom + GitHub/Colab + learning management system).
Introduction: M&E fundamentals + AI landscape
- Topics: M&E frameworks (logic model, indicators, baselines, endlines), types of evaluation (process, outcome, impact). Where ML can add value (predictive monitoring, automated classification, program targeting, operational efficiency).
- Lab: Map ML use cases to an example program (education, health, livelihoods).
- Assignment: Write 1‑page M&E problem statement that could benefit from ML.
Data for M&E: collection, quality, and ethics
- Topics: data pipelines, survey vs administrative vs sensor data, sampling bias, missingness, measurement error; informed consent and privacy essentials for M&E.
- Lab: Explore and clean a small DHS or survey dataset: missing values, variable types, basic recoding (see the sketch below).
- Assignment: Data quality checklist for a chosen program.
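A minimal cleaning sketch in Python/pandas for this lab. The file name (survey_extract.csv) and the columns region, satisfaction_score, and hh_income are placeholders for whatever dataset the class uses:

```python
import pandas as pd

# Hypothetical survey extract; column names are illustrative only.
df = pd.read_csv("survey_extract.csv")

# 1. Inspect missingness per variable.
missing_share = df.isna().mean().sort_values(ascending=False)
print(missing_share.head(10))

# 2. Fix variable types (e.g., region codes read in as numbers).
df["region"] = df["region"].astype("category")

# 3. Basic recoding: collapse a 1-5 Likert item into a binary indicator.
df["satisfied"] = df["satisfaction_score"].map({1: 0, 2: 0, 3: 0, 4: 1, 5: 1})

# 4. Simple imputation for a numeric variable, flagging what was imputed.
df["hh_income_missing"] = df["hh_income"].isna()
df["hh_income"] = df["hh_income"].fillna(df["hh_income"].median())
```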
Exploratory data analysis and feature engineering
- Topics: descriptive stats, visualizations, feature creation (time, geospatial, text transforms), handling categorical variables.
- Lab: Feature engineering on a time series or program dataset; build baseline indicators (see the sketch below).
- Assignment: Produce an EDA report (plots + short interpretations).
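A minimal feature-engineering sketch for this lab, assuming a hypothetical monitoring file (program_monitoring.csv) with visit_date, site_id, district, and attendance_rate columns:

```python
import pandas as pd

# Illustrative monitoring dataset; names are placeholders.
df = pd.read_csv("program_monitoring.csv", parse_dates=["visit_date"])

# Time features from the visit date.
df["month"] = df["visit_date"].dt.month
df["quarter"] = df["visit_date"].dt.quarter

# One-hot encode a categorical variable for later modelling.
df = pd.get_dummies(df, columns=["district"], drop_first=True)

# A baseline indicator: rolling 3-period mean of an outcome per site.
df = df.sort_values(["site_id", "visit_date"])
df["attendance_rate_3p_mean"] = (
    df.groupby("site_id")["attendance_rate"]
      .transform(lambda s: s.rolling(3, min_periods=1).mean())
)
```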
Supervised learning fundamentals (classification & regression)
- Topics: supervised model types, train/test split, cross-validation, performance metrics (accuracy, precision/recall, ROC-AUC, RMSE), baseline models.
- Lab: Build a logistic regression and a tree-based model to predict a monitoring outcome (e.g., dropout, attendance); see the sketch below.
- Assignment: Model comparison report + recommended baseline.
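A minimal scikit-learn sketch of the lab workflow, assuming a hypothetical feature table (students_features.csv) with a binary dropped_out label; the metrics shown match the module topics (cross-validation, ROC-AUC, precision/recall):

```python
import pandas as pd
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, classification_report

# Hypothetical feature matrix and binary dropout label.
df = pd.read_csv("students_features.csv")
X = df.drop(columns=["dropped_out"])
y = df["dropped_out"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

for name, model in [
    ("logistic_regression", LogisticRegression(max_iter=1000)),
    ("random_forest", RandomForestClassifier(n_estimators=300, random_state=42)),
]:
    # 5-fold cross-validated ROC-AUC on the training set.
    cv_auc = cross_val_score(model, X_train, y_train, cv=5, scoring="roc_auc")
    model.fit(X_train, y_train)
    test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(name, "CV AUC:", cv_auc.mean().round(3), "Test AUC:", round(test_auc, 3))
    print(classification_report(y_test, model.predict(X_test)))
```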
Advanced models & interpretability
- Topics: ensemble methods (random forest, XGBoost), model interpretability (SHAP, LIME), feature importance vs. causality.
- Lab: Train an XGBoost model, produce SHAP plots, and write interpretation notes (see the sketch below).
- Assignment: Explain model decisions for five individual cases using SHAP.
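A minimal XGBoost + SHAP sketch for this lab, reusing the hypothetical students_features.csv table from the previous module; plot choices are one reasonable option, not a prescribed workflow:

```python
import pandas as pd
import shap
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split

df = pd.read_csv("students_features.csv")  # placeholder dataset
X = df.drop(columns=["dropped_out"])
y = df["dropped_out"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)

# SHAP values for the held-out set.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Global importance plot, then an explanation for a single case.
shap.summary_plot(shap_values, X_test)
shap.force_plot(explainer.expected_value, shap_values[0],
                X_test.iloc[0], matplotlib=True)
```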
Time series & anomaly detection for monitoring
- Topics: time series basics, forecasting (ARIMA, Prophet), change-point detection, anomaly detection (isolation forest, seasonal decomposition), early warning systems.
- Lab: Use a program monitoring time series to detect anomalies and forecast the next period (see the sketch below).
- Assignment: Build an early-warning rule and evaluate its timeliness/precision.
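A minimal anomaly-detection and forecasting sketch for this lab, assuming a hypothetical monthly series (monthly_visits.csv) already shaped into Prophet-style ds (date) and y (value) columns:

```python
import pandas as pd
from prophet import Prophet
from sklearn.ensemble import IsolationForest

# Hypothetical monthly monitoring series with columns 'ds' and 'y'.
ts = pd.read_csv("monthly_visits.csv", parse_dates=["ds"])

# Anomaly detection: flag months whose values look unusual.
iso = IsolationForest(contamination=0.05, random_state=0)
ts["anomaly"] = iso.fit_predict(ts[["y"]]) == -1
print(ts[ts["anomaly"]])

# Forecast the next month with Prophet.
m = Prophet()
m.fit(ts[["ds", "y"]])
future = m.make_future_dataframe(periods=1, freq="MS")
forecast = m.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail(1))
```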
Unstructured data: text and imagery for M&E
- Topics: NLP basics (tokenization, TF-IDF, embeddings), sentiment/topic classification for feedback data, remote sensing & satellite imagery basics, object detection for infrastructure/land use.
- Lab: Text classification of beneficiary feedback / sentiment (see the sketch below); simple remote-sensing land-cover classification using a Sentinel or NASA sample.
- Assignment: Prototype model to classify text feedback or detect objects/features in an image dataset.
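A minimal text-classification sketch for the feedback half of the lab, assuming a hypothetical beneficiary_feedback.csv with comment_text and category columns:

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical feedback data: free-text comment plus a manually coded category.
df = pd.read_csv("beneficiary_feedback.csv")
X_train, X_test, y_train, y_test = train_test_split(
    df["comment_text"], df["category"], test_size=0.2, random_state=42
)

# TF-IDF features feeding a linear classifier.
clf = make_pipeline(
    TfidfVectorizer(max_features=20000, ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```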
Causal inference, impact evaluation & when not to use ML
- Topics: difference between predictive modeling and causal inference; RCTs, matching, IV, regression discontinuity, synthetic controls; limits of ML for causal claims; ML for covariate selection and heterogeneity analysis.
- Lab: Use propensity score matching or double machine learning for treatment effect estimation (see the matching sketch below).
- Assignment: Write a plan for how ML could complement (not replace) an impact evaluation for a program.
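A minimal propensity-score-matching sketch (1-nearest-neighbour matching with replacement) for this lab. The file evaluation_data.csv, the treated and outcome columns, and the covariate list are all placeholders, and the ATT estimate is deliberately naive (no balance checks or standard errors):

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Hypothetical evaluation dataset: covariates, treatment flag, outcome.
df = pd.read_csv("evaluation_data.csv")
covariates = ["age", "hh_size", "baseline_score"]

# 1. Estimate propensity scores with a logistic regression.
ps_model = LogisticRegression(max_iter=1000).fit(df[covariates], df["treated"])
df["pscore"] = ps_model.predict_proba(df[covariates])[:, 1]

treated = df[df["treated"] == 1]
control = df[df["treated"] == 0]

# 2. Match each treated unit to its nearest control on the propensity score.
nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matched_control = control.iloc[idx.ravel()]

# 3. Naive ATT estimate: mean outcome difference over matched pairs.
att = (treated["outcome"].to_numpy() - matched_control["outcome"].to_numpy()).mean()
print("Estimated ATT:", round(att, 3))
```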
Deployment & dashboards, decision workflows, cost-benefit
- Topics: Model operationalization, monitoring model drift, APIs, dashboards (PowerBI/Tableau/Plotly), user adoption, cost-benefit & procurement considerations.
- Lab: Deploy a simple model inference script (see the sketch below) and build an interactive dashboard (Plotly Dash or Google Data Studio).
- Assignment: Prepare an implementation brief (how model outputs will be used operationally).
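A minimal inference-service sketch using Flask (listed under prototyping tools below), assuming a model trained earlier and saved with joblib; the file name dropout_model.joblib and the input fields are placeholders:

```python
import joblib
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)
# Model saved earlier with joblib.dump(model, "dropout_model.joblib").
model = joblib.load("dropout_model.joblib")

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON record whose keys match the training feature names,
    # e.g. {"attendance_rate": 0.8, "age": 14, ...}.
    record = pd.DataFrame([request.get_json()])
    prob = float(model.predict_proba(record)[0, 1])
    return jsonify({"dropout_risk": prob})

if __name__ == "__main__":
    app.run(port=5000, debug=True)
```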
AI Ethics, governance, final presentations & capstone
- Topics: fairness metrics (a short sketch follows this module), privacy-preserving ML (differential privacy basics), data governance, consent, transparency to stakeholders, regulatory considerations.
- Capstone: Teams present end‑to‑end M&E ML project: problem, data, model, evaluation, implementation & ethics plan.
- Assessment: peer feedback + instructor scoring.
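A minimal fairness-check sketch comparing model flag rates and recall across a protected attribute; the toy data and the attribute name (sex) are illustrative only:

```python
import pandas as pd

# Toy scored data: a protected attribute, the true outcome, and the model flag.
df = pd.DataFrame({
    "sex":       ["F", "F", "F", "M", "M", "M"],
    "actual":    [1,   0,   1,   1,   0,   1],
    "predicted": [1,   0,   1,   0,   0,   1],
})

# Demographic parity check: share of each group flagged by the model.
flag_rate = df.groupby("sex")["predicted"].mean()

# Equal opportunity check: recall (true positive rate) within each group.
recall = df[df["actual"] == 1].groupby("sex")["predicted"].mean()

print("Flag rate by group:\n", flag_rate)
print("Recall by group:\n", recall)
print("Flag-rate gap:", abs(flag_rate["F"] - flag_rate["M"]))
```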
Hands‑on labs & tools
- Languages: Python (recommended) with pandas, scikit-learn, XGBoost, statsmodels, Prophet, SHAP, and TensorFlow/Keras or PyTorch for deep learning; OR R (tidyverse, tidymodels).
- Environments: Google Colab (no install), Jupyter notebooks, GitHub for submissions.
- Visualization: Plotly, Altair, Matplotlib; dashboards using Dash, Streamlit, Tableau or PowerBI.
- Data collection & mobile: Open Data Kit (ODK), KoBoToolbox, CommCare for field data.
- Cloud/Hosting: Google Cloud / AWS / Azure (optional) for deployment; Firebase or simple Flask APIs for prototypes.
Datasets (examples & sources)
- Demographic and Health Surveys (DHS) — household & health data
- World Bank Open Data — economic & development indicators
- Humanitarian Data Exchange (HDX)
- OpenStreetMap, Sentinel/Landsat imagery (Planet, Google Earth Engine)
- Gov open data portals (education, health)
- Simulated/cleaned program datasets for class exercises (prepare small CSVs)
Assignments & assessment
- Weekly assignments (50%): EDA, models, interpretation, implementation briefs.
- Capstone project (35%): group end‑to‑end project, presentation, code & report.
- Participation & quizzes (15%): attendance, short quizzes on readings, in-class participation.
- Rubrics: clarity of problem framing, appropriateness of ML method, data quality handling, model evaluation, ethics & implementation plan, reproducibility (code + README).
Sample capstone ideas
- Predict school attendance dropouts using administrative and survey data; design early intervention triggers.
- Classify beneficiary complaints from free-text to route to program managers.
- Use satellite imagery to monitor crop loss or infrastructure construction progress.
- Anomaly detection on program expenditure or service delivery metrics to flag fraud/irregularities.
Suggested readings & resources
- Courses: Coursera “AI For Everyone” (for non-technical staff), fast.ai, DataCamp, edX/HarvardX.
- “The Hundred-Page Machine Learning Book” by A. Burkov (concise ML overview)
- “Causal Inference: What If” by M. Hernán & J. Robins (causal concepts)
- DHS Program tutorials and resources
- Google AI for Social Good case studies
- Papers & blogs on SHAP/LIME; responsible AI toolkits (IBM, Google)