Case 02 | Pharmaceuticals | Cash Flow & Liquidity Planning

AI Cash Flow Forecasting Under COVID Shock

Company: Vertex Pharmaceuticals
Shock Event: COVID-19 (Q1 2020)
Method: AI + Real-Time Drivers
Legacy baseline MAE: $3.913M
AI MAE: $2.494M

When History Becomes Useless.

Vertex’s treasury view looked familiar: anchor on last year’s same week, layer on a few spreadsheet guardrails, and hope the pattern repeats. COVID-19 voided that contract: trials slipped, supply stretched, and cash timing stopped obeying the calendar. The classroom story is not “replace finance with AI”. It is: keep an auditable legacy anchor, then add a Python + scikit-learn driver model that learns from the same signals you already argue about in committee. The pilot notebook is the analytical source of truth: what you tune there is what this page plots (via the shared pipeline). Optional generative AI sits on top as narrative glue (scenario packs and plain-language bridges), so faster numbers travel with faster explainability, not a black box.

Legacy baseline — worst week
$12.24M
Largest single-week absolute miss vs the adjusted last-year baseline on the test hold-out (same definition as the notebook export).
ML forecast — worst week
$7.39M
Largest single-week absolute miss for the driver-aware GBM on that same hold-out: the stress-test week where the model still hurts most.
MAE uplift (ML vs legacy)
~1.6x
Mean absolute error on the hold-out: legacy MAE ($3.913M) is about 1.57 times the ML MAE ($2.494M), so ML is 36.3% lower than the legacy baseline. This is the headline number your committee would ask for.
Hold-out window
42 weeks
Train/test split matches the notebook (≈80/20 by row). MAE and peak errors use test weeks only; the charts prepend ~52 training weeks so you can see behaviour before the shock.
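The headline cards above can be reproduced with a few lines of arithmetic. The sketch below is illustrative and is not the notebook's code: the helper names (`mae`, `worst_week`, `uplift`, `chronological_split`) are hypothetical, and the 210-row series is an assumption chosen only so that a 20% test slice equals 42 weeks.

```python
from typing import List, Sequence, Tuple


def mae(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute error over paired weekly values."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def worst_week(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Largest single-week absolute miss."""
    return max(abs(a - f) for a, f in zip(actual, forecast))


def uplift(legacy_mae: float, ml_mae: float) -> Tuple[float, float]:
    """Return (legacy/ML ratio, percent by which ML MAE is lower)."""
    ratio = legacy_mae / ml_mae
    pct_lower = (1.0 - ml_mae / legacy_mae) * 100.0
    return ratio, pct_lower


def chronological_split(rows: Sequence, train_frac: float = 0.8) -> Tuple[List, List]:
    """Split by row order (no shuffling), so the test set is the latest weeks."""
    cut = int(len(rows) * train_frac)
    return list(rows[:cut]), list(rows[cut:])


# MAE figures taken from the cards on this page.
ratio, pct_lower = uplift(legacy_mae=3.913, ml_mae=2.494)
print(f"{ratio:.2f}x, {pct_lower:.1f}% lower")  # 1.57x, 36.3% lower

# Assumed row count: 210 weekly rows give a 168/42 split at 80/20.
train, test = chronological_split(list(range(210)))
print(len(train), len(test))  # 168 42
```

Splitting by row order rather than at random keeps every test week later than every training week, which is what makes the hold-out a fair out-of-time check.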

What you are looking at is the same synthetic weekly cash path you build in pharma_cash_pilot.ipynb: warmup from early 2018, then a stress regime aligned around Q1 2020. The legacy baseline is deliberately “spreadsheet-shaped”: forecast ≈ cash from 52 weeks ago (seasonal naive), then a small shock-state haircut when the stress flag is on — transparent, reproducible in Excel, and wired identically in cases/vertex_chart_pipeline.py so this page cannot drift from the class notebook. The forecast target is observed weekly cash, cash_actual_m; drivers (trials, procurement, collections, FX, promo) are the levers FP&A already debates.
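The legacy baseline described above is simple enough to write out in full. This is a minimal sketch and not the code in cases/vertex_chart_pipeline.py: the function names are hypothetical, the 5.0 ($M) haircut is an invented illustration value, and the adjustment follows the glossary formula (seasonal naive minus shock flag times adjustment).

```python
from typing import List, Optional, Sequence

SEASON_WEEKS = 52  # "same week last year"


def seasonal_naive(cash: Sequence[float],
                   season: int = SEASON_WEEKS) -> List[Optional[float]]:
    """Forecast for week t is the cash observed `season` weeks earlier."""
    return [None if t < season else cash[t - season] for t in range(len(cash))]


def legacy_adjusted(cash: Sequence[float],
                    shock_flag_baseline: Sequence[int],
                    shock_adjust_m: float,
                    season: int = SEASON_WEEKS) -> List[Optional[float]]:
    """Seasonal naive minus a fixed haircut in weeks where the flag is 1."""
    naive = seasonal_naive(cash, season)
    return [
        None if base is None else base - flag * shock_adjust_m
        for base, flag in zip(naive, shock_flag_baseline)
    ]


# Toy series: 60 weeks of cash, with the stress flag on for the last 4 weeks.
cash = [100.0 + (t % 52) for t in range(60)]
flags = [0] * 56 + [1] * 4
baseline = legacy_adjusted(cash, flags, shock_adjust_m=5.0)  # hypothetical haircut

print(baseline[51])  # None (no history 52 weeks back yet)
print(baseline[52])  # 100.0 (cash from week 0, flag off)
print(baseline[56])  # 99.0 (cash from week 4 is 104.0, minus the 5.0 haircut)
```

The first 52 weeks have no baseline at all, which is one reason the series needs a warmup period before any comparison starts.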

What you practise in Python is a clean separation of roles: the gradient boosting model sees drivers, short cash lags, calendar features, and a delayed shock indicator for realism. It does not ingest the legacy baseline as an input, so uplift is a fair “rules-based anchor vs learned pattern” story. Scenario bands and the optional GenAI cells mirror how you would brief treasury: envelope first, narrative second, audit trail always.

Adapted from: Vertex Pharma AI/ML FP&A Case, FP&A Trends Global AI Committee 2024
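To make the separation of roles concrete, here is a hedged sketch of a driver-aware gradient boosting fit with scikit-learn. It is not the notebook's model. The toy data generator, its coefficients, the 210-week length, the shock onset at week 110 and the default hyperparameters are all assumptions for illustration. Only the column names come from the field definitions on this page. The feature list deliberately contains no baseline column.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n_weeks = 210                      # assumed length of the weekly series
week = np.arange(n_weeks)

shock_flag = (week >= 110).astype(int)                  # assumed regime onset
# Model-side flag: same regime, recognised 4 weeks late.
shock_flag_ml = np.concatenate([np.zeros(4, dtype=int), shock_flag[:-4]])

week_of_year = week % 52 + 1
cols = {
    "trial_velocity": rng.normal(1.0, 0.2, n_weeks) - 0.3 * shock_flag,
    "procurement_delay_wk": rng.poisson(2, n_weeks) + 2 * shock_flag,
    "payer_lag_days": rng.normal(45, 5, n_weeks) + 10 * shock_flag,
    "fx_rate": rng.normal(1.0, 0.03, n_weeks),
    "promo_spend_m": rng.normal(3.0, 0.5, n_weeks),
    "month": np.minimum((week_of_year - 1) // 4 + 1, 12),  # rough toy month
    "week_of_year": week_of_year,
    "shock_flag_ml": shock_flag_ml,
}
# Toy target: invented linear mix of drivers plus reporting noise.
cash_actual_m = (
    40 + 10 * cols["trial_velocity"] - 0.2 * cols["payer_lag_days"]
    - 1.0 * cols["procurement_delay_wk"] - cols["promo_spend_m"]
    + rng.normal(0, 1.5, n_weeks)
)
cols["cash_lag_1"] = np.roll(cash_actual_m, 1)
cols["cash_lag_2"] = np.roll(cash_actual_m, 2)

# Inputs: drivers, lags, calendar, delayed shock flag. No legacy baseline.
FEATURES = [
    "trial_velocity", "procurement_delay_wk", "payer_lag_days", "fx_rate",
    "promo_spend_m", "cash_lag_1", "cash_lag_2", "month", "week_of_year",
    "shock_flag_ml",
]
# Drop the first 2 rows, whose lag values wrapped around in np.roll.
X = np.column_stack([cols[name] for name in FEATURES])[2:]
y = cash_actual_m[2:]

cut = int(len(X) * 0.8)            # chronological 80/20 split, no shuffling
model = GradientBoostingRegressor(random_state=0)
model.fit(X[:cut], y[:cut])
pred = model.predict(X[cut:])
ml_mae = mean_absolute_error(y[cut:], pred)
print(f"hold-out weeks: {len(pred)}, toy ML MAE: {ml_mae:.2f}")
```

Because the baseline never enters `FEATURES`, any gap between the two MAE figures reflects what the model learned from the drivers, not a restatement of the spreadsheet rule.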

Vertex Pharma cash-flow simulation with ML + GenAI implementation in Python. Similar case studies are used in investment banking solution workflows.

  • GenAI defaults: src/vertex_lab_ai_config.py
  • Table display: src/pandas_display_config.py
  • Plot style: src/chart_config.py
  • Matplotlib lab charts: under src/charts/; in src/charts/quick.py, call bind(raw_df), then trial_velocity(), row(…) (one row of subplots), or grid(…) (fixed 2×3 layout).
  • Synthetic pilot parameters and build_synthetic_pharma_data: src/pharma_cash_synthetic.py (the notebook loads data/vertex_pharma_cash_synthetic.csv when present to skip regeneration).

In JupyterLite the tree is driven by prebuilt static/jupyterlite/api/contents/*.json (not a live scan of files/), so new paths under static/jupyterlite/files/ must be listed there, e.g. src/ and data/ via api/contents/all.json plus api/contents/src/all.json and api/contents/data/all.json. The repo keeps the same src/ modules at the project root; the canonical CSV is also under cases/data/.


Field definitions used in the Vertex cash-flow case

  • payer_lag_days: average delay (in days) between invoicing and cash collection; higher = slower collections, tighter liquidity.
  • procurement_delay_wk: supply/procurement delay in weeks; higher = materials/services arrive later, operations and cash timing get disrupted.
  • trial_velocity: pace of clinical trial progress (synthetic signal); higher usually supports faster revenue/cash realization.
  • shock_flag: true regime in the synthetic generator (0/1). Charts use this timing for the shock overlay.
  • shock_flag_ml: feature fed to the GBM — same regime but delayed 4 weeks (recognition / reporting lag).
  • cash_actual_m: observed weekly cash ($M) after reporting noise — the forecast target (latent operational cash is cash_latent_m in the synthetic generator).
  • fx_rate: exchange-rate factor affecting cross-border costs/revenues; volatility can move cash unexpectedly.
  • promo_spend_m: weekly commercial/promotion spend in millions; more spend can pressure short-term cash.
  • cash_lag_1, cash_lag_2: prior-week cash values used as ML inputs (not the legacy statistical baseline).
  • baseline_seasonal_naive: raw seasonal comparator = same week’s cash from 52 weeks earlier.
  • baseline_legacy_adjusted: baseline used in reporting = baseline_seasonal_naive - shock_flag_baseline * baseline_shock_adjust_m (Excel-style shock-state adjustment).
  • month, week_of_year: calendar features for the ML model.
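Several of the fields above are derived columns, not raw inputs. The sketch below shows one way they could be built from a weekly series using only the standard library. The helper names (`delay_flag`, `lag`, `calendar_features`) are hypothetical, and the Monday 2018-01-01 start date is an assumption based on the "early 2018" warmup mentioned on this page.

```python
from datetime import date, timedelta
from typing import List, Optional, Sequence, Tuple


def delay_flag(flags: Sequence[int], delay_weeks: int = 4) -> List[int]:
    """shock_flag_ml: the true regime flag, recognised `delay_weeks` late."""
    pad = min(delay_weeks, len(flags))
    return [0] * pad + list(flags[: len(flags) - pad])


def lag(values: Sequence[float], k: int) -> List[Optional[float]]:
    """cash_lag_k: value from k weeks earlier (None where no history exists)."""
    pad = min(k, len(values))
    return [None] * pad + list(values[: len(values) - pad])


def calendar_features(start: date, n_weeks: int) -> List[Tuple[int, int]]:
    """(month, week_of_year) per weekly row; the week number is the ISO week."""
    out = []
    for i in range(n_weeks):
        d = start + timedelta(weeks=i)
        out.append((d.month, d.isocalendar()[1]))
    return out


shock_flag = [0, 0, 1, 1, 1, 1, 1, 1]
print(delay_flag(shock_flag))        # [0, 0, 0, 0, 0, 0, 1, 1]

cash = [10.0, 11.0, 12.0, 13.0]
print(lag(cash, 1))                  # [None, 10.0, 11.0, 12.0]
print(lag(cash, 2))                  # [None, None, 10.0, 11.0]

# Assumed start: Monday 2018-01-01.
print(calendar_features(date(2018, 1, 1), 3))  # [(1, 1), (1, 2), (1, 3)]
```

Delaying the flag means the model only learns about the shock after the fact, as a reporting process would, while the charts can still shade the true regime using shock_flag.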