Case 02 | Pharmaceuticals | Cash Flow & Liquidity Planning

AI Cash Flow Forecasting Under COVID Shock

Company

Vertex Pharmaceuticals

Shock Event

COVID-19 (Q1 2020)

Method

AI + Real-Time Drivers

Legacy baseline MAE

$3.913M

AI MAE

$2.494M

Problem Statement

When History Becomes Useless.

Vertex’s treasury view looked familiar: anchor on last year’s same week, layer on a few spreadsheet guardrails, and hope the pattern repeats. COVID-19 voided that contract: trials slipped, supply stretched, and cash timing stopped obeying the calendar. The classroom story is not “replace finance with AI”. It is: keep an auditable legacy anchor, then add a Python + scikit-learn driver model that learns from the same signals you already argue about in committee. The pilot notebook is the analytical source of truth: what you tune there is what this page plots (via the shared pipeline). Optional generative AI sits on top as narrative glue (scenario packs and plain-language bridges), so faster numbers travel with faster explainability rather than a black box.

Legacy baseline — worst week

$12.24M

Largest single-week absolute miss vs the adjusted last-year baseline on the test hold-out (same definition as the notebook export).

ML forecast — worst week

$7.39M

Largest single-week absolute miss for the driver-aware GBM on that same hold-out: a stress test of where the model still hurts.
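The two worst-week cards above both report a maximum of weekly absolute errors. A minimal sketch of that calculation follows, alongside the mean absolute error used elsewhere on this page. The four-week series is hypothetical and chosen only to make the arithmetic easy to follow; the function names are illustrative and not taken from the notebook.

```python
def absolute_errors(actual, forecast):
    """Per-week absolute miss |actual - forecast|, in $M."""
    return [abs(a - f) for a, f in zip(actual, forecast)]


def mean_absolute_error(actual, forecast):
    """Average of the weekly absolute misses."""
    errors = absolute_errors(actual, forecast)
    return sum(errors) / len(errors)


def worst_week(actual, forecast):
    """Return (week index, size) of the largest single-week absolute miss."""
    errors = absolute_errors(actual, forecast)
    idx = max(range(len(errors)), key=errors.__getitem__)
    return idx, errors[idx]


# Hypothetical hold-out values in $M (not the case data).
actual = [50.0, 42.0, 38.0, 45.0]
legacy = [52.0, 51.0, 49.0, 47.0]

mae = mean_absolute_error(actual, legacy)
worst_idx, worst_miss = worst_week(actual, legacy)
print(f"MAE: {mae:.2f}  worst week: index {worst_idx}, miss {worst_miss:.2f}")
```

Reporting both figures is useful because a model can have a low average error and still miss badly in the one week that matters for liquidity.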

MAE uplift (ML vs legacy)

~1.6x

Mean absolute error on the hold-out: ML is 36.3% lower than the legacy baseline — the headline number your committee would ask for.
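The two ways of quoting the uplift (a ratio of about 1.6x and a 36.3% reduction) both follow from the headline MAE figures on this page. A short worked check, using only those two published numbers:

```python
# Headline MAE figures quoted on this page, in $M.
legacy_mae = 3.913
ml_mae = 2.494

# Relative reduction: how much lower the ML error is than the legacy error.
reduction_pct = (legacy_mae - ml_mae) / legacy_mae * 100

# Ratio form: how many times larger the legacy error is than the ML error.
uplift_ratio = legacy_mae / ml_mae

print(f"MAE reduction: {reduction_pct:.1f}%")      # 36.3%
print(f"Legacy / ML ratio: {uplift_ratio:.2f}x")   # 1.57x, shown as ~1.6x
```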

Hold-out window

42 weeks

Train/test split matches the notebook (≈80/20 by row). MAE and peak errors use test weeks only; the charts prepend ~52 training weeks so you can see behaviour before the shock.
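A by-row 80/20 split on a time series is normally chronological: the last 20% of weeks form the hold-out, with no shuffling. The sketch below assumes that reading. The row count of 210 is hypothetical, picked only because its 20% tail is 42 weeks; the page does not state the notebook's actual row count, and `chronological_split` is an illustrative name.

```python
def chronological_split(rows, test_fraction=0.2):
    """Keep time order: the final `test_fraction` of rows is the hold-out."""
    n_test = round(len(rows) * test_fraction)
    split_at = len(rows) - n_test
    return rows[:split_at], rows[split_at:]


# Hypothetical weekly index (assumption: 210 rows).
weeks = list(range(210))
train, test = chronological_split(weeks)

# The charts prepend roughly a year of training weeks ahead of the hold-out.
chart_rows = train[-52:] + test

print(len(train), len(test), len(chart_rows))
```

Metrics are then computed on `test` only, while `chart_rows` is used purely for display.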

What you are looking at is the same synthetic weekly cash path you build in pharma_cash_pilot.ipynb: warmup from early 2018, then a stress regime aligned around Q1 2020. The legacy baseline is deliberately “spreadsheet-shaped”: forecast ≈ cash from 52 weeks ago (seasonal naive), then a small shock-state haircut when the stress flag is on — transparent, reproducible in Excel, and wired identically in cases/vertex_chart_pipeline.py so this page cannot drift from the class notebook. The forecast target is observed weekly cash, cash_actual_m; drivers (trials, procurement, collections, FX, promo) are the levers FP&A already debates.
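The legacy baseline described above can be sketched in a few lines of pandas. Column names follow the field cheatsheet on this page; the haircut value of 5.0 and the toy 60-week frame are assumptions for illustration, and `add_legacy_baseline` is a hypothetical helper, not a function from the notebook or from cases/vertex_chart_pipeline.py.

```python
import pandas as pd


def add_legacy_baseline(df, baseline_shock_adjust_m, season_weeks=52):
    """Seasonal naive forecast with a flat haircut while the shock flag is on."""
    out = df.copy()
    # Same week's cash from 52 weeks earlier (NaN until a full year exists).
    out["baseline_seasonal_naive"] = out["cash_actual_m"].shift(season_weeks)
    # Excel-style adjustment: subtract a fixed amount in shock-state weeks.
    out["baseline_legacy_adjusted"] = (
        out["baseline_seasonal_naive"]
        - out["shock_flag_baseline"] * baseline_shock_adjust_m
    )
    return out


# Toy data: 60 weeks of steadily rising cash, shock state from week 56 on.
n_weeks = 60
demo = pd.DataFrame({
    "cash_actual_m": [100.0 + i for i in range(n_weeks)],
    "shock_flag_baseline": [1 if i >= 56 else 0 for i in range(n_weeks)],
})
demo = add_legacy_baseline(demo, baseline_shock_adjust_m=5.0)  # assumed haircut
print(demo.loc[[52, 56], ["baseline_seasonal_naive", "baseline_legacy_adjusted"]])
```

Because the first 52 rows have no prior-year value, any error metric for this baseline can only start once a full year of history is available.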

What you practise in Python is a clean separation of roles: the gradient boosting model sees drivers, short cash lags, calendar features, and a delayed shock indicator for realism. It does not ingest the legacy baseline as an input, so uplift is a fair “rules-based anchor vs learned pattern” story. Scenario bands and the optional GenAI cells mirror how you would brief treasury: envelope first, narrative second, audit trail always.

Adapted from: Vertex Pharma AI/ML FP&A Case, FP&A Trends Global AI Committee 2024
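The separation of roles in the Python model can be illustrated with scikit-learn's `GradientBoostingRegressor`. This is a sketch under stated assumptions, not the notebook's model: the data are generated on the spot with made-up coefficients, the hyperparameters are library defaults, and calendar features are left out for brevity. The point it shows is that the feature list contains drivers, lags, and the delayed shock flag, and never the legacy baseline.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 200  # hypothetical number of weeks

# Made-up driver series; names follow the cheatsheet, values are illustrative.
df = pd.DataFrame({
    "trial_velocity": rng.normal(1.0, 0.1, n),
    "payer_lag_days": rng.normal(45, 5, n),
    "procurement_delay_wk": rng.normal(2, 0.5, n),
    "fx_rate": rng.normal(1.0, 0.02, n),
    "promo_spend_m": rng.normal(3, 0.5, n),
    "shock_flag_ml": (np.arange(n) >= 120).astype(int),
})
# Assumed toy relationship between drivers and weekly cash ($M).
df["cash_actual_m"] = (
    40 * df["trial_velocity"]
    - 0.3 * df["payer_lag_days"]
    - 8 * df["shock_flag_ml"]
    + rng.normal(0, 1, n)
)
df["cash_lag_1"] = df["cash_actual_m"].shift(1)
df["cash_lag_2"] = df["cash_actual_m"].shift(2)
df = df.dropna().reset_index(drop=True)

# Model inputs: drivers, lags, delayed shock flag. No legacy baseline column.
FEATURES = [
    "trial_velocity", "payer_lag_days", "procurement_delay_wk",
    "fx_rate", "promo_spend_m", "shock_flag_ml", "cash_lag_1", "cash_lag_2",
]

split_at = int(len(df) * 0.8)  # chronological 80/20 split by row
train, test = df.iloc[:split_at], df.iloc[split_at:]

model = GradientBoostingRegressor(random_state=0)  # default hyperparameters
model.fit(train[FEATURES], train["cash_actual_m"])
pred = model.predict(test[FEATURES])
ml_mae = mean_absolute_error(test["cash_actual_m"], pred)
print(f"Hold-out MAE on toy data: {ml_mae:.3f}")
```

Keeping the baseline out of `FEATURES` is what makes the comparison fair: if the model could see the rules-based forecast, any uplift would partly be the baseline's own information.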

Pilot Lab — Interactive Notebook Environment

Vertex Pharma cash-flow simulation with ML + GenAI implementation in Python. Similar case studies are used in investment banking solution workflows.

Where things live in the lab:

GenAI defaults: src/vertex_lab_ai_config.py
Table display: src/pandas_display_config.py
Plot style: src/chart_config.py
Matplotlib lab charts: under src/charts/; in src/charts/quick.py, call bind(raw_df), then trial_velocity(), row(…) (one row of subplots), or grid(…) (fixed 2×3 layout)
Synthetic pilot parameters and build_synthetic_pharma_data: src/pharma_cash_synthetic.py (the notebook loads data/vertex_pharma_cash_synthetic.csv when present to skip regeneration)

In JupyterLite the file tree is driven by the prebuilt static/jupyterlite/api/contents/*.json files, not by a live scan of files/, so new paths under static/jupyterlite/files/ must be listed there: for example, src/ and data/ via api/contents/all.json plus api/contents/src/all.json and api/contents/data/all.json. The repo keeps the same src/ modules at the project root, and the canonical CSV is also under cases/data/.


Post-Notebook Findings — Legacy baseline vs ML vs actuals

Cash Flow ($M) — Actuals vs. Legacy baseline vs. ML forecast

Absolute Forecast Error ($M) by Period

Cumulative Error Divergence

Consulting Report Addendum — Post-Pilot Performance Narrative

Executive interpretation of model outcomes and operating impact

Click Generate Executive Consulting Report below the lab to reveal Post-Notebook Findings charts and the full consulting addendum. Data mirror cases/vertex_chart_pipeline.py unless you export from the notebook.

Charts and KPIs stay hidden until you generate; then the same payload powers legacy vs ML series and the report below.

Primary result

Relative uplift versus the baseline forecast method across the evaluated test window.

Legacy baseline mean error

Reference baseline under historical-only assumptions.

AI mean error

Driver-aware model after shock-adjusted weighting.

Scenario Spread

Average downside / upside shift generated from the notebook scenario run.

Generate the report to convert notebook outputs into a board-ready interpretation.

Error Reduction by Model

Scenario Envelope — Predicted Cash (Illustrative)

Consulting Conclusion

The report will appear here once generated from the notebook run.

Executive Takeaways

Forecast reliability improves most in volatile periods, exactly where liquidity governance matters most.
Operational drivers should remain the steering controls rather than static historical averages.
Scenario spread is as important as the point estimate for executive decision quality.
A reusable template can scale this method across business units and portfolio entities.

Notebook Feature Cheatsheet

Field definitions used in the Vertex cash-flow case

payer_lag_days: average delay (in days) between invoicing and cash collection; higher = slower collections, tighter liquidity.
procurement_delay_wk: supply/procurement delay in weeks; higher = materials/services arrive later, operations and cash timing get disrupted.
trial_velocity: pace of clinical trial progress (synthetic signal); higher usually supports faster revenue/cash realization.
shock_flag: true regime in the synthetic generator (0/1). Charts use this timing for the shock overlay.
shock_flag_ml: feature fed to the GBM — same regime but delayed 4 weeks (recognition / reporting lag).
cash_actual_m: observed weekly cash ($M) after reporting noise — the forecast target (latent operational cash is cash_latent_m in the synthetic generator).
fx_rate: exchange-rate factor affecting cross-border costs/revenues; volatility can move cash unexpectedly.
promo_spend_m: weekly commercial/promotion spend in millions; more spend can pressure short-term cash.
cash_lag_1, cash_lag_2: prior-week cash values used as ML inputs (not the legacy statistical baseline).
baseline_seasonal_naive: raw seasonal comparator = same week’s cash from 52 weeks earlier.
baseline_legacy_adjusted: baseline used in reporting = baseline_seasonal_naive - shock_flag_baseline * baseline_shock_adjust_m (Excel-style shock-state adjustment).
month, week_of_year: calendar features for the ML model.
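The ML-side fields in the cheatsheet above (cash lags, the delayed shock flag, and the calendar features) can be derived from the raw columns with a few pandas operations. This is a minimal sketch: the date column name `week_ending` and the helper `add_ml_features` are hypothetical, and implementing the 4-week delay as a simple shift is an assumption about how the notebook does it.

```python
import pandas as pd


def add_ml_features(df, shock_delay_weeks=4):
    """Derive lag, delayed-shock, and calendar features for the GBM."""
    out = df.copy()
    # Prior-week cash values.
    out["cash_lag_1"] = out["cash_actual_m"].shift(1)
    out["cash_lag_2"] = out["cash_actual_m"].shift(2)
    # The model only "sees" the shock a few weeks after it truly starts.
    out["shock_flag_ml"] = out["shock_flag"].shift(shock_delay_weeks, fill_value=0)
    # Calendar features from the (hypothetical) week-ending date column.
    out["month"] = out["week_ending"].dt.month
    out["week_of_year"] = out["week_ending"].dt.isocalendar().week.astype("int64")
    return out


# Toy frame: 10 weeks, true shock regime starting in the fourth week.
raw = pd.DataFrame({
    "week_ending": pd.date_range("2020-01-05", periods=10, freq="W"),
    "cash_actual_m": [100.0 + i for i in range(10)],
    "shock_flag": [0, 0, 0, 1, 1, 1, 1, 1, 1, 1],
})
features = add_ml_features(raw)
print(features[["week_ending", "shock_flag", "shock_flag_ml", "cash_lag_1"]])
```

In this toy frame the true flag turns on at row 3 while `shock_flag_ml` turns on at row 7, which mirrors the recognition lag the cheatsheet describes.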
