AI AND MACHINE LEARNING COURSE FOR MONITORING & EVALUATION
COURSE OUTLINE
Course title – AI & Machine Learning for Monitoring & Evaluation
Target audience: M&E specialists, program managers, data analysts in government- or donor-funded projects, and researchers who want to apply ML to improve monitoring, reporting, and learning.
Prerequisites
- Basic statistics (mean, variance, hypothesis testing, basic regression)
- Basic SQL and Excel
- Comfortable with data concepts (rows, columns, missing data)
- Recommended: introductory Python or R (if not, a short pre-course module is included)
Course learning objectives
By the end of the course participants will be able to:
1. Explain how AI/ML methods can support M&E objectives and where they are NOT appropriate.
2. Design data collection and indicator schemes aligned with ML use (data needs, quality).
3. Prepare and clean M&E data for analysis and feature engineering.
4. Build, evaluate, and interpret classification/regression models for monitoring tasks.
5. Integrate ML outputs into dashboards and decision-making workflows.
6. Apply basic predictive monitoring (early warning), automated classification (text/imagery) and anomaly detection.
7. Understand and mitigate ethical, privacy, fairness, and governance risks.
8. Produce a small end‑to‑end ML-based M&E product (capstone) including model, evaluation, and an implementation plan.
Course format & time commitment
4 weeks, or 2 weeks in-person plus 2 weeks online (compressed workshop format).
Delivery: In-person or online (Zoom + GitHub/Colab + learning management system).
Introduction: M&E fundamentals + AI landscape
- Topics: M&E frameworks (logic model, indicators, baselines, endlines), types of evaluation (process, outcome, impact). Where ML can add value (predictive monitoring, automated classification, program targeting, operational efficiency).
- Lab: Map ML use cases to an example program (education, health, livelihoods).
- Assignment: Write 1‑page M&E problem statement that could benefit from ML.
Data for M&E: collection, quality, and ethics
- Topics: data pipelines, survey vs administrative vs sensor data, sampling bias, missingness, measurement error; informed consent and privacy essentials for M&E.
- Lab: Explore and clean a small DHS or survey dataset: missing values, variable types, basic recoding (see the sketch below).
- Assignment: Data quality checklist for a chosen program.
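A minimal cleaning sketch in Python/pandas for this lab. The file name (survey_extract.csv) and the columns region, satisfaction_score, and hh_income are placeholders for whatever dataset the class uses:

```python
import pandas as pd

# Hypothetical survey extract; column names are illustrative only.
df = pd.read_csv("survey_extract.csv")

# 1. Inspect missingness per variable.
missing_share = df.isna().mean().sort_values(ascending=False)
print(missing_share.head(10))

# 2. Fix variable types (e.g., region codes read in as numbers).
df["region"] = df["region"].astype("category")

# 3. Basic recoding: collapse a 1-5 Likert item into a binary indicator.
df["satisfied"] = df["satisfaction_score"].map({1: 0, 2: 0, 3: 0, 4: 1, 5: 1})

# 4. Simple imputation for a numeric variable, flagging what was imputed.
df["hh_income_missing"] = df["hh_income"].isna()
df["hh_income"] = df["hh_income"].fillna(df["hh_income"].median())
```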
Exploratory data analysis and feature engineering
- Topics: descriptive stats, visualizations, feature creation (time, geospatial, text transforms), handling categorical variables.
- Lab: Feature engineering on a time series or program dataset; build baseline indicators (see the sketch below).
- Assignment: Produce an EDA report (plots + short interpretations).
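A minimal feature-engineering sketch for this lab, assuming a hypothetical monitoring file (program_monitoring.csv) with visit_date, site_id, district, and attendance_rate columns:

```python
import pandas as pd

# Illustrative monitoring dataset; names are placeholders.
df = pd.read_csv("program_monitoring.csv", parse_dates=["visit_date"])

# Time features from the visit date.
df["month"] = df["visit_date"].dt.month
df["quarter"] = df["visit_date"].dt.quarter

# One-hot encode a categorical variable for later modelling.
df = pd.get_dummies(df, columns=["district"], drop_first=True)

# A baseline indicator: rolling 3-period mean of an outcome per site.
df = df.sort_values(["site_id", "visit_date"])
df["attendance_rate_3p_mean"] = (
    df.groupby("site_id")["attendance_rate"]
      .transform(lambda s: s.rolling(3, min_periods=1).mean())
)
```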
Supervised learning fundamentals (classification & regression)
- Topics: supervised model types, train/test split, cross-validation, performance metrics (accuracy, precision/recall, ROC-AUC, RMSE), baseline models.
- Lab: Build a logistic regression and a tree-based model to predict a monitoring outcome (e.g., dropout, attendance); see the sketch below.
- Assignment: Model comparison report + recommended baseline.
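A minimal scikit-learn sketch of the lab workflow, assuming a hypothetical feature table (students_features.csv) with a binary dropped_out label; the metrics shown match the module topics (cross-validation, ROC-AUC, precision/recall):

```python
import pandas as pd
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, classification_report

# Hypothetical feature matrix and binary dropout label.
df = pd.read_csv("students_features.csv")
X = df.drop(columns=["dropped_out"])
y = df["dropped_out"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

for name, model in [
    ("logistic_regression", LogisticRegression(max_iter=1000)),
    ("random_forest", RandomForestClassifier(n_estimators=300, random_state=42)),
]:
    # 5-fold cross-validated ROC-AUC on the training set.
    cv_auc = cross_val_score(model, X_train, y_train, cv=5, scoring="roc_auc")
    model.fit(X_train, y_train)
    test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(name, "CV AUC:", cv_auc.mean().round(3), "Test AUC:", round(test_auc, 3))
    print(classification_report(y_test, model.predict(X_test)))
```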
Advanced models & interpretability
- Topics: ensemble methods (random forest, XGBoost), model interpretability (SHAP, LIME), feature importance vs. causality.
- Lab: Train an XGBoost model, produce SHAP plots, and write interpretation notes (see the sketch below).
- Assignment: Explain model decisions for five individual cases using SHAP.
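A minimal XGBoost + SHAP sketch for this lab, reusing the hypothetical students_features.csv table from the previous module; plot choices are one reasonable option, not a prescribed workflow:

```python
import pandas as pd
import shap
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split

df = pd.read_csv("students_features.csv")  # placeholder dataset
X = df.drop(columns=["dropped_out"])
y = df["dropped_out"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)

# SHAP values for the held-out set.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Global importance plot, then an explanation for a single case.
shap.summary_plot(shap_values, X_test)
shap.force_plot(explainer.expected_value, shap_values[0],
                X_test.iloc[0], matplotlib=True)
```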
Time series & anomaly detection for monitoring
- Topics: time series basics, forecasting (ARIMA, Prophet), change-point detection, anomaly detection (isolation forest, seasonal decomposition), early warning systems.
- Lab: Use a program monitoring time series to detect anomalies and forecast the next period (see the sketch below).
- Assignment: Build an early-warning rule and evaluate its timeliness/precision.
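A minimal anomaly-detection and forecasting sketch for this lab, assuming a hypothetical monthly series (monthly_visits.csv) already shaped into Prophet-style ds (date) and y (value) columns:

```python
import pandas as pd
from prophet import Prophet
from sklearn.ensemble import IsolationForest

# Hypothetical monthly monitoring series with columns 'ds' and 'y'.
ts = pd.read_csv("monthly_visits.csv", parse_dates=["ds"])

# Anomaly detection: flag months whose values look unusual.
iso = IsolationForest(contamination=0.05, random_state=0)
ts["anomaly"] = iso.fit_predict(ts[["y"]]) == -1
print(ts[ts["anomaly"]])

# Forecast the next month with Prophet.
m = Prophet()
m.fit(ts[["ds", "y"]])
future = m.make_future_dataframe(periods=1, freq="MS")
forecast = m.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail(1))
```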
Unstructured data: text and imagery for M&E
- Topics: NLP basics (tokenization, TF-IDF, embeddings), sentiment/topic classification for feedback data, remote sensing & satellite imagery basics, object detection for infrastructure/land use.
- Lab: Text classification of beneficiary feedback / sentiment (see the sketch below); simple remote-sensing land-cover classification using a Sentinel or NASA sample.
- Assignment: Prototype model to classify text feedback or detect objects/features in an image dataset.
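A minimal text-classification sketch for the feedback half of the lab, assuming a hypothetical beneficiary_feedback.csv with comment_text and category columns:

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical feedback data: free-text comment plus a manually coded category.
df = pd.read_csv("beneficiary_feedback.csv")
X_train, X_test, y_train, y_test = train_test_split(
    df["comment_text"], df["category"], test_size=0.2, random_state=42
)

# TF-IDF features feeding a linear classifier.
clf = make_pipeline(
    TfidfVectorizer(max_features=20000, ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```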
Causal inference, impact evaluation & when not to use ML
- Topics: difference between predictive modeling and causal inference; RCTs, matching, IV, regression discontinuity, synthetic controls; limits of ML for causal claims; ML for covariate selection and heterogeneity analysis.
- Lab: Use propensity score matching or double machine learning for treatment effect estimation (see the matching sketch below).
- Assignment: Write a plan for how ML could complement (not replace) an impact evaluation for a program.
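A minimal propensity-score-matching sketch (1-nearest-neighbour matching with replacement) for this lab. The file evaluation_data.csv, the treated and outcome columns, and the covariate list are all placeholders, and the ATT estimate is deliberately naive (no balance checks or standard errors):

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Hypothetical evaluation dataset: covariates, treatment flag, outcome.
df = pd.read_csv("evaluation_data.csv")
covariates = ["age", "hh_size", "baseline_score"]

# 1. Estimate propensity scores with a logistic regression.
ps_model = LogisticRegression(max_iter=1000).fit(df[covariates], df["treated"])
df["pscore"] = ps_model.predict_proba(df[covariates])[:, 1]

treated = df[df["treated"] == 1]
control = df[df["treated"] == 0]

# 2. Match each treated unit to its nearest control on the propensity score.
nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matched_control = control.iloc[idx.ravel()]

# 3. Naive ATT estimate: mean outcome difference over matched pairs.
att = (treated["outcome"].to_numpy() - matched_control["outcome"].to_numpy()).mean()
print("Estimated ATT:", round(att, 3))
```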
Deployment & dashboards, decision workflows, cost-benefit
- Topics: Model operationalization, monitoring model drift, APIs, dashboards (PowerBI/Tableau/Plotly), user adoption, cost-benefit & procurement considerations.
- Lab: Deploy a simple model inference script (see the sketch below) and build an interactive dashboard (Plotly Dash or Google Data Studio).
- Assignment: Prepare an implementation brief (how model outputs will be used operationally).
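A minimal inference-service sketch using Flask (listed under prototyping tools below), assuming a model trained earlier and saved with joblib; the file name dropout_model.joblib and the input fields are placeholders:

```python
import joblib
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)
# Model saved earlier with joblib.dump(model, "dropout_model.joblib").
model = joblib.load("dropout_model.joblib")

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON record whose keys match the training feature names,
    # e.g. {"attendance_rate": 0.8, "age": 14, ...}.
    record = pd.DataFrame([request.get_json()])
    prob = float(model.predict_proba(record)[0, 1])
    return jsonify({"dropout_risk": prob})

if __name__ == "__main__":
    app.run(port=5000, debug=True)
```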
AI Ethics, governance, final presentations & capstone
- Topics: fairness metrics (a short sketch follows this module), privacy-preserving ML (differential privacy basics), data governance, consent, transparency to stakeholders, regulatory considerations.
- Capstone: Teams present end‑to‑end M&E ML project: problem, data, model, evaluation, implementation & ethics plan.
- Assessment: peer feedback + instructor scoring.
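A minimal fairness-check sketch comparing model flag rates and recall across a protected attribute; the toy data and the attribute name (sex) are illustrative only:

```python
import pandas as pd

# Toy scored data: a protected attribute, the true outcome, and the model flag.
df = pd.DataFrame({
    "sex":       ["F", "F", "F", "M", "M", "M"],
    "actual":    [1,   0,   1,   1,   0,   1],
    "predicted": [1,   0,   1,   0,   0,   1],
})

# Demographic parity check: share of each group flagged by the model.
flag_rate = df.groupby("sex")["predicted"].mean()

# Equal opportunity check: recall (true positive rate) within each group.
recall = df[df["actual"] == 1].groupby("sex")["predicted"].mean()

print("Flag rate by group:\n", flag_rate)
print("Recall by group:\n", recall)
print("Flag-rate gap:", abs(flag_rate["F"] - flag_rate["M"]))
```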
Hands‑on labs & tools
- Languages: Python (recommended) with pandas, scikit-learn, XGBoost, statsmodels, Prophet, SHAP, and TensorFlow/Keras or PyTorch for deep learning; OR R (tidyverse, tidymodels).
- Environments: Google Colab (no install), Jupyter notebooks, GitHub for submissions.
- Visualization: Plotly, Altair, Matplotlib; dashboards using Dash, Streamlit, Tableau or PowerBI.
- Data collection & mobile: Open Data Kit (ODK), KoBoToolbox, CommCare for field data.
- Cloud/Hosting: Google Cloud / AWS / Azure (optional) for deployment; Firebase or simple Flask APIs for prototypes.
Datasets (examples & sources)
- Demographic and Health Surveys (DHS) — household & health data
- World Bank Open Data — economic & development indicators
- Humanitarian Data Exchange (HDX)
- OpenStreetMap, Sentinel/Landsat imagery (Planet, Google Earth Engine)
- Gov open data portals (education, health)
- Simulated/cleaned program datasets for class exercises (prepare small CSVs)
Assignments & assessment
- Weekly assignments (50%): EDA, models, interpretation, implementation briefs.
- Capstone project (35%): group end‑to‑end project, presentation, code & report.
- Participation & quizzes (15%): attendance, short quizzes on readings, in-class participation.
- Rubrics: clarity of problem framing, appropriateness of ML method, data quality handling, model evaluation, ethics & implementation plan, reproducibility (code + README).
Sample capstone ideas
- Predict school attendance dropouts using administrative and survey data; design early intervention triggers.
- Classify beneficiary complaints from free-text to route to program managers.
- Use satellite imagery to monitor crop loss or infrastructure construction progress.
- Anomaly detection on program expenditure or service delivery metrics to flag fraud/irregularities.
Suggested readings & resources
- Courses: Coursera “AI For Everyone” (for non-technical staff), fast.ai, DataCamp, edX/HarvardX.
- “The Hundred-Page Machine Learning Book” by A. Burkov (concise ML overview)
- “Causal Inference: What If” by M. Hernán & J. Robins (causal concepts)
- DHS Program tutorials and resources
- Google AI for Social Good case studies
- Papers & blogs on SHAP/LIME; responsible AI toolkits (IBM, Google)