Retirement choice speed benchmarks¶
Paper: Dobrescu and Shanker (2022), A Fast Upper Envelope Scan Method for Discrete-Continuous Dynamic Programming
This benchmarking notebook solves the discrete retirement choice model from Iskhakov et al. (2017) using FUES and benchmarks against other upper-envelope methods.
We benchmark FUES against three alternatives: MSS (monotone segment selection, Iskhakov et al. 2017, via HARK), LTM (local triangulation, Druedahl 2021, via ConSav), and RFC (rooftop cut, Dobrescu & Shanker 2024).
Note on the model files
This notebook loads the retirement model from a set of declarative YAML files written in
dolo+. In this declarative language, a stage can be thought of as a single self-contained Bellman operator of the dynamic problem, such as the worker's consumption choice, the retiree's consumption choice, or the labour-market decision. A full period is built by linking these stages together as a graph representing the choice structure of the model.
The dolo+ YAML format is simply a structured, machine-readable way of writing down the model's states, controls, transitions, and Bellman objects. You do not need to learn its syntax to follow this notebook.
Load the basic imports, file paths, and model solver. We refer to a model as a nest, which is essentially a directed graph built up from the stages.
The function solve_nest solves a model by running backward induction on the stages in the graph.
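To make the ordering idea concrete, here is a minimal, hypothetical sketch (not the repo's `solve_nest`) of how backward induction visits a stage graph: a depth-first post-order traversal yields a successors-first ordering, so each stage sees its continuation values before it is solved.

```python
# Minimal, hypothetical sketch of ordering a stage graph for backward
# induction: depth-first post-order gives a successors-first ordering.
def backward_order(graph):
    """Return stages so every stage appears after all of its successors."""
    order, seen = [], set()

    def visit(stage):
        if stage in seen:
            return
        seen.add(stage)
        for succ in graph.get(stage, []):
            visit(succ)
        order.append(stage)

    for stage in graph:
        visit(stage)
    return order

# The retirement period: the branching stage feeds both consumption stages.
period = {'labour_mkt_decision': ['work_cons', 'retire_cons'],
          'work_cons': [], 'retire_cons': []}
print(backward_order(period))
# -> ['work_cons', 'retire_cons', 'labour_mkt_decision']
```

Both consumption stages come before the branching stage that consumes their arrival values, which is exactly the order the solver needs.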
import numpy as np
import matplotlib.pyplot as plt
import time, sys, os, yaml, tempfile
from pathlib import Path
from IPython.display import Image, Markdown, HTML, display
# ── Repo paths (robust: walk up until we find pyproject.toml) ──
_here = Path(os.path.abspath('')).resolve()
REPO = _here
for _ in range(10):
    if (REPO / 'pyproject.toml').exists():
        break
    REPO = REPO.parent
sys.path.insert(0, str(REPO))
sys.path.insert(0, str(REPO / 'src'))
SYNTAX_DIR = REPO / 'examples' / 'retirement' / 'syntax'
# ── Imports ──
from examples.retirement.solve import solve_nest # main solver of a model
from examples.retirement.outputs import (
    get_policy, get_timing, get_solution_at_age,  # helpers that unpack and extract from the solution dict
    euler, consumption_deviation,
    # Straightforward plotters for this notebook (consistent Nord style)
    setup_nb_style,
    nb_plot_egrids, nb_plot_egm_interactive,
    nb_plot_cons_ages, nb_plot_scaling,
)
# ── Apply notebook style globally ──
setup_nb_style()
print(f'REPO: {REPO}')
REPO: /Users/akshayshanker/Research/Repos/FUES2026/FUES
1. Model¶
An agent lives for $T$ periods. Each period she holds assets $a \geq 0$ and makes two choices: a discrete choice $d \in \{\text{work}, \text{retire}\}$ and a continuous choice of consumption $c$. Working yields wage income $y$ but costs disutility $\tau$; retirement is absorbing. Assets earn gross return $(1{+}r)$.
The sequence of events within a period is:
- workers and retirees start with beginning-of-period assets $a$ and $a_{\text{ret}}$ respectively
- they earn returns and receive income, producing cash-on-hand $w = (1{+}r)a + y$ for workers or $w_{\text{ret}} = (1{+}r)a_{\text{ret}}$ for retirees
- they consume $c$, leaving end-of-period savings $b = w - c$ (or $b_{\text{ret}} = w_{\text{ret}} - c$)
- the inter-period transition maps $b \to a$ and $b_{\text{ret}} \to a_{\text{ret}}$ for the next period; retirement is absorbing
Stage decomposition¶
Each period can be translated to a directed graph of self-contained modular stages, following Carroll (2026); see Carroll and Shanker (2026) for the formal framework. The retirement model has three stages:
- labour_mkt_decision (branching) — discrete choice: $\max(\mathrm{v}_{\succ}^{\text{work}} - \tau,\; \mathrm{v}_{\succ}^{\text{retire}})$. Assets $a$ pass through unchanged.
- work_cons (continuous, EGM + FUES) — worker consumption: $a \to w \to b$. The continuation value $\mathrm{v}_{\succ}$ is non-concave; FUES recovers the correct envelope.
- retire_cons (continuous, EGM) — retiree consumption: $a_{\text{ret}} \to w_{\text{ret}} \to b_{\text{ret}}$. Standard concave problem.
Note that workers arrive at $a$ (into the branching stage); retirees arrive at $a_{\text{ret}}$ (directly into retire_cons). If a worker chooses to retire, she becomes a retiree and moves into the retiree consumption problem, retire_cons, just like those who entered the period already retired.
Stage operators¶
Within each stage, the state space is represented at three nodes: arrival ($\mathsf{X}_{\prec}$), decision ($\mathsf{X}$), and continuation ($\mathsf{X}_{\succ}$). Solving proceeds backward: given a continuation-value function $\mathrm{v}_{\succ}$ on $\mathsf{X}_{\succ}$, the decision mover $\mathbb{B}$ produces the decision-value function $\mathrm{v}$ on $\mathsf{X}$, and the arrival mover $\mathbb{I}$ passes $\mathrm{v}$ back to $\mathsf{X}_{\prec}$. Throughout, $\partial\mathrm{v}$ denotes the derivative of $\mathrm{v}$ with respect to the stage's own state variable.
The term mover here refers to operations that move from one node to the next (either forward or backward). Mathematically, a mover is simply a functional operator, but as an object it may also carry computational representations of how that operator is implemented on the computer.
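As a toy illustration of the idea (hypothetical names, not the repo's API), a mover can be written as a higher-order function: the worker stage's arrival mover $\mathbb{I}$ just precomposes a decision-value function with the transition $w = (1{+}r)a + y$.

```python
# Toy sketch of an arrival mover as a higher-order function (hypothetical
# names and numbers; not the repo's API).  The mover precomposes a
# decision-value function v(w) with the transition w = (1 + r) * a + y.
r, y = 0.02, 20.0

def arrival_mover(v):
    """Backward arrival mover I: return v_prec(a) = v((1 + r) * a + y)."""
    return lambda a: v((1 + r) * a + y)

v = lambda w: w ** 0.5          # stand-in decision-value function
v_prec = arrival_mover(v)
print(v_prec(10.0))             # v evaluated at w = 1.02 * 10 + 20 = 30.2
```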
work_cons — worker consumption (EGM + FUES)
Decision mover $\mathbb{B}$ (continuation $\to$ decision)
Let $\mathrm{v}_{\succ}(b)$ be the continuation value at end-of-period savings $b$, and $\partial\mathrm{v}_{\succ}(b)$ its derivative. The worker's cash-on-hand is $w$ and the budget constraint is $b = w - c$. The decision mover solves:
$$(\mathbb{B}\,\mathrm{v}_{\succ})(w) = \mathrm{v}(w) = \max_c\bigl\{\log(c) + \beta\,\mathrm{v}_{\succ}(w - c)\bigr\}$$
The first-order condition is $1/c = \beta\,\partial\mathrm{v}_{\succ}(b)$.
EGM. Given an exogenous grid $\{b_0^{\#},\dots,b_N^{\#}\}$ on the continuation state, the FOC yields optimal consumption $c_i^{\#} = \bigl(\beta\,\partial\mathrm{v}_{\succ}(b_i^{\#})\bigr)^{-1}$ and the budget constraint recovers the endogenous grid $w_i^{\#} = b_i^{\#} + c_i^{\#}$. Each pair $(w_i^{\#},\, c_i^{\#})$ satisfies the FOC, and the corresponding value is $q_i^{\#} = \log(c_i^{\#}) + \beta\,\mathrm{v}_{\succ}(b_i^{\#})$.
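The EGM step above takes only a few lines of NumPy. The grid and marginal continuation value below are toy stand-ins, not the model's calibration.

```python
import numpy as np

# EGM inversion step with log utility (toy grids, not the model's solver):
# on an exogenous savings grid b, invert the FOC 1/c = beta * dv(b) and
# recover the endogenous cash-on-hand grid w = b + c.
beta = 0.96
b = np.linspace(0.1, 10.0, 5)      # exogenous grid {b_i^#}
dv = 1.0 / b                       # stand-in marginal continuation value
c = 1.0 / (beta * dv)              # c_i^# from the inverted FOC
w = b + c                          # endogenous grid w_i^# = b_i^# + c_i^#
print(np.allclose(1.0 / c, beta * dv))   # every pair satisfies the FOC
```

No root-finding is needed: the FOC is inverted in closed form, which is the whole point of EGM.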
Non-concavity. The worker's $\mathrm{v}_{\succ}$ is the upper envelope of concave functions (one for each future discrete-choice sequence) and is not itself concave. As a result the endogenous grid $\{w_i^{\#}\}$ may be non-monotone, and the points $(w_i^{\#},\, q_i^{\#})$ define a correspondence rather than a function. An upper-envelope algorithm recovers the monotone upper envelope of $\{(w_i^{\#},\, q_i^{\#})\}$, thereby approximating $\mathrm{v}$. This is where FUES comes in.
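As a drastically simplified illustration of what an upper-envelope step does (this is not FUES itself), the scan below sorts the EGM points by cash-on-hand and discards any point whose value falls below that of a point with smaller $w$; since the true value function is increasing, such a point cannot lie on the upper envelope.

```python
import numpy as np

# Drastically simplified envelope scan (illustration only, not FUES):
# for a value function known to be increasing in w, any EGM point whose
# value is below that of a point with smaller w cannot be on the upper
# envelope, so a single monotone-value pass discards it.
def naive_envelope(w, q):
    order = np.argsort(w)
    keep, best = [], -np.inf
    for i in order:
        if q[i] > best:
            keep.append(i)
            best = q[i]
    return np.array(keep)

# A toy non-monotone endogenous grid: the point at w = 1.5 is dominated.
w = np.array([1.0, 2.0, 1.5, 3.0])
q = np.array([0.5, 1.2, 0.3, 2.0])
print(naive_envelope(w, q))   # index 2 is dropped
```

FUES does substantially more than this (it also detects jumps in the policy and estimates intersection points), but the core operation is the same: thin the EGM correspondence down to its upper envelope.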
Arrival mover $\mathbb{I}$ (decision $\to$ arrival)
The arrival transition is $w = (1{+}r)a + y$, so:
$$(\mathbb{I}\mathrm{v})(a) = \mathrm{v}\bigl((1{+}r)a + y\bigr)$$
The chain rule gives $\partial\mathrm{v}_{\prec}(a) = (1{+}r)\,\partial\mathrm{v}(w)$, and the envelope theorem yields $\partial\mathrm{v}(w) = 1/c$.
retire_cons — retiree consumption (EGM, concave)
Decision mover $\mathbb{B}$ (continuation $\to$ decision)
Let $\mathrm{v}_{\succ}(b_{\text{ret}})$ be the continuation value at retiree savings $b_{\text{ret}}$, and $\partial\mathrm{v}_{\succ}(b_{\text{ret}})$ the marginal value. The retiree's cash-on-hand is $w_{\text{ret}}$ and the budget constraint is $b_{\text{ret}} = w_{\text{ret}} - c$. The decision mover is:
$$(\mathbb{B}\mathrm{v}_{\succ})(w_{\text{ret}}) = \mathrm{v}(w_{\text{ret}}) = \max_c\bigl\{\log(c) + \beta\,\mathrm{v}_{\succ}(b_{\text{ret}})\bigr\}$$
such that $b_{\text{ret}} = w_{\text{ret}} - c$. The first-order condition is $1/c = \beta\,\partial\mathrm{v}_{\succ}(b_{\text{ret}})$.
EGM. Given a grid on $b_{\text{ret}}$, recover $c_i^{\#} = \bigl(\beta\,\partial\mathrm{v}_{\succ}(b_{\text{ret},i}^{\#})\bigr)^{-1}$ and $w_{\text{ret},i}^{\#} = b_{\text{ret},i}^{\#} + c_i^{\#}$. Here $\mathrm{v}_{\succ}$ is concave (retirement is absorbing), so EGM produces a monotone endogenous grid and no upper-envelope step is needed.
Arrival mover $\mathbb{I}$ (decision $\to$ arrival)
The arrival transition is $w_{\text{ret}} = (1{+}r)\,a_{\text{ret}}$ (no income), so:
$$(\mathbb{I}\mathrm{v})(a_{\text{ret}}) = \mathrm{v}\bigl((1{+}r)\,a_{\text{ret}}\bigr)$$
and $\partial\mathrm{v}_{\prec}(a_{\text{ret}}) = (1{+}r)\,\partial\mathrm{v}(w_{\text{ret}})$.
labour_mkt_decision — discrete branching
Decision mover $\mathbb{B}$ (continuation $\to$ decision)
The branching stage receives the arrival values from the two consumption stages: $\mathrm{v}_{\succ}^{\text{work}}(a)$ from work_cons and $\mathrm{v}_{\succ}^{\text{retire}}(a)$ from retire_cons. Assets $a$ pass through unchanged (identity transitions). The decision mover is the discrete-choice $\max$:
$$(\mathbb{B}\mathrm{v}_{\succ})(a) = \mathrm{v}(a) = \max\!\bigl(\mathrm{v}_{\succ}^{\text{work}}(a) - \tau,\;\; \mathrm{v}_{\succ}^{\text{retire}}(a)\bigr)$$
Arrival mover $\mathbb{I}$ (decision $\to$ arrival)
Identity: $(\mathbb{I}\mathrm{v})(a) = \mathrm{v}(a)$.
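On a grid, the entire branching stage is an elementwise maximum. The arrival values below are toy stand-ins (not the model's calibration) chosen so that low-wealth agents work and high-wealth agents retire, mirroring the threshold behaviour in the model.

```python
import numpy as np

# The branching stage as an elementwise max (toy arrival values, not the
# model's calibration): workers pay the disutility tau before comparing.
tau = 0.2
a = np.array([0.0, 10.0, 20.0, 30.0])
v_work = np.log(1.0 + a + 5.0)    # stand-in worker arrival value
v_retire = np.log(1.0 + a)        # stand-in retiree arrival value
v = np.maximum(v_work - tau, v_retire)
d = np.where(v_work - tau >= v_retire, 'work', 'retire')
print(list(d))   # low-wealth agents work, high-wealth agents retire
```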
Sequential form. Composing the three stage operators and substituting the transitions recovers the traditional sequential recursive Bellman equations. Writing $V_t^1$ for the worker's arrival value and $V_t^0$ for the retiree's:
$$V_t^1(a) = \max_{d}\; Q_t^d(a), \qquad Q_t^{\text{work}}(a) = \max_c \bigl\{ \log(c) - \tau + \beta\, V_{t+1}^1\bigl((1{+}r)a + y - c\bigr) \bigr\}$$
$$Q_t^{\text{retire}}(a) = \max_c \bigl\{ \log(c) + \beta\, V_{t+1}^0\bigl((1{+}r)a - c\bigr) \bigr\}, \qquad V_t^0(a) = \max_c \bigl\{ \log(c) + \beta\, V_{t+1}^0\bigl((1{+}r)a - c\bigr) \bigr\}$$
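The retiree block $V^0$ of these recursions is concave, so it can be solved with plain EGM and no envelope step. A self-contained toy sketch (illustrative parameters and grids, not the paper's calibration):

```python
import numpy as np

# Toy backward induction for the retiree recursion V^0 alone: log utility,
# gross return R = 1 + r, T periods, terminal rule "consume everything".
# Parameters and grids are illustrative, not the paper's calibration.
beta, R, T, N = 0.96, 1.02, 5, 200
b = np.linspace(1e-3, 20.0, N)        # exogenous end-of-period savings grid
w_grid = np.linspace(1e-3, 40.0, N)   # fixed cash-on-hand grid for dV
dV = 1.0 / w_grid                     # terminal period: c = w, so dV/dw = 1/w

for _ in range(T - 1):
    dv_w = np.interp(R * b, w_grid, dV)       # dV/dw' at next-period cash w' = R*b
    c = 1.0 / (beta * R * dv_w)               # FOC: 1/c = beta * R * dV/dw'(R*b)
    w_endog = b + c                           # endogenous cash-on-hand grid
    dV = np.interp(w_grid, w_endog, 1.0 / c)  # envelope: dV/dw = 1/c, re-gridded

# Concavity means the endogenous grid is monotone: no upper-envelope step.
print(bool(np.all(np.diff(w_endog) > 0)))
```

In the worker block the same recursion produces a non-monotone endogenous grid, which is exactly where the upper-envelope methods benchmarked below come in.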
Let us load the stage syntax templates from the syntax directory for each of the stages above.
# ── Display the stage YAML declarations (collapsible) ──
stage_names = ['work_cons', 'retire_cons', 'labour_mkt_decision']
_parts = []
for name in stage_names:
    path = SYNTAX_DIR / 'stages' / name / f'{name}.yaml'
    txt = path.read_text().rstrip()
    _parts.append(
        f'<details><summary><code>{name}</code>  '
        f'<span style="opacity:.45;font-size:.85em">{path.relative_to(REPO)}</span>'
        f'</summary><pre style="margin:4px 0;font-size:12px;line-height:1.45">{txt}</pre></details>'
    )
display(HTML('\n'.join(_parts)))
2. Calibration and settings¶
No parameters are hard-coded in Python. Let us load the baseline calibration and settings from the syntax directory to see what they look like.
# ── Display calibration and settings (collapsible) ──
_parts = []
for fname in ['calibration.yaml', 'settings.yaml']:
    path = SYNTAX_DIR / fname
    txt = path.read_text().rstrip()
    _parts.append(
        f'<details><summary><code>{fname}</code></summary>'
        f'<pre style="margin:4px 0;font-size:12px;line-height:1.45">{txt}</pre></details>'
    )
display(HTML('\n'.join(_parts)))
3. Solve¶
solve_nest is the single entry point. It loads the YAML files, applies the three functors (methodize, configure, calibrate) to each stage, builds the model, and runs backward induction according to the stage graph above.
We can override the baseline settings and calibrations by passing calib_overrides and config_overrides to solve_nest. We can also select the upper-envelope method via method='FUES', 'DCEGM', 'RFC', or 'CONSAV'.
Note: the dolo+ declarative syntax is solver-agnostic — solve_nest is built as part of this repo, not part of the dolo+ implementation.
# ── Warmup (JIT compile) then timed solve ──
# Solve for 50 periods (longer horizon shows more kinks)
_, model, stage_ops, waves = solve_nest(SYNTAX_DIR, method='FUES', config_overrides={'T': 50})
nest, model, stage_ops, waves = solve_nest(
    SYNTAX_DIR, method='FUES', config_overrides={'T': 50},
    model=model, stage_ops=stage_ops, waves=waves)
print(f'Model: T={model.T}, grid_size={model.grid_size}, '
      f'beta={model.beta}, delta={model.delta}, y={model.y}')
print(f'Euler error: {euler(model, get_policy(nest, "c")):.6f}')
# ── Override: lower beta and increase grid size ──
nest2, model2, _, _ = solve_nest(
    SYNTAX_DIR, method='FUES',
    calib_overrides={'beta': 0.91},
    config_overrides={'grid_size': 3000, 'T': 50})
print(f'\nWith overrides: beta={model2.beta}, grid_size={model2.grid_size}')
print(f'Euler error: {euler(model2, get_policy(nest2, "c")):.6f}')
Model: T=50, grid_size=3000, beta=0.96, delta=1, y=20
Euler error: -1.628838

With overrides: beta=0.91, grid_size=3000
Euler error: -1.029809
3.1 EGM grid plots¶
The plots below show the raw EGM correspondence (red circles), FUES-refined points (blue crosses), the value function line (black), and estimated intersection points (green stars). We plot the solution from the second solve above (lower $\beta$).
# ── EGM grid with FUES refinement (static, notebook style) ──
plot_age = 37
fig = nb_plot_egrids(nest2, model2, age=plot_age)
plt.show()
Interactive plots
Make interactive plots (only for local run).
# ── Interactive EGM grid plot (local run only — Plotly) ──
try:
    from examples.retirement.outputs import nb_plot_egm_interactive
    fig = nb_plot_egm_interactive(nest2, model2, age=plot_age, pad=10)
    fig.show()
except Exception:
    print('Interactive plot skipped (requires local Jupyter with Plotly).')
3.2 Consumption policy across ages¶
nb_plot_cons_ages shows consumption as a function of assets at three ages. The discontinuous jumps correspond to the retirement decision — agents with wealth above the threshold retire.
# ── Consumption policy at multiple ages ──
from examples.retirement.outputs import nb_plot_cons_ages
fig = nb_plot_cons_ages(nest, model, ages=(20, 30, 40))
plt.show()
4. Method comparison¶
At this calibration, all four upper-envelope methods deliver essentially the same solution quality: the reported Euler errors are extremely close, and the small differences are numerical rather than economically meaningful. The substantive comparison is therefore computation speed.
# ── Solve with all four methods (T=50) ──
methods = ['FUES', 'DCEGM', 'RFC', 'CONSAV']
# Display labels (API uses DCEGM/CONSAV internally)
_DISPLAY = {'FUES': 'FUES', 'DCEGM': 'MSS', 'RFC': 'RFC', 'CONSAV': 'LTM'}
results = {}
for method in methods:
    _, m_, ops_, w_ = solve_nest(SYNTAX_DIR, method=method, config_overrides={'T': 50})
    nest_m, model_m, _, _ = solve_nest(
        SYNTAX_DIR, method=method, config_overrides={'T': 50},
        model=m_, stage_ops=ops_, waves=w_)
    c_pol = get_policy(nest_m, 'c')
    ue_t, solve_t = get_timing(nest_m)
    results[method] = {
        'euler': euler(model_m, c_pol),
        'ue_ms': ue_t * 1000,
        'total_ms': solve_t * 1000,
    }
# ── Results table ──
print(f'{"Method":>8s} | {"Euler Error":>12s} | {"UE (ms)":>10s} | {"Total (ms)":>12s}')
print('-' * 50)
for m in methods:
    r = results[m]
    print(f'{_DISPLAY[m]:>8s} | {r["euler"]:>12.6f} | {r["ue_ms"]:>10.3f} | {r["total_ms"]:>12.3f}')
Method | Euler Error | UE (ms) | Total (ms)
--------------------------------------------------
FUES | -1.628838 | 0.149 | 0.244
MSS | -1.628767 | 5.825 | 5.971
RFC | -1.628718 | 5.008 | 5.158
LTM | -1.628846 | 5.946 | 6.102
5. Scaling: upper envelope time vs grid size¶
We sweep grid sizes from 1,000 to 15,000 and compare upper-envelope time across methods. FUES remains the fastest method throughout, with a much smaller constant factor than the alternatives.
MSS and RFC both scale approximately linearly in this experiment and stay relatively close to one another. LTM is competitive only on the very smallest grids; its cost rises much more quickly, and it falls behind the other methods well before the medium-sized grids used in the main comparison.
# ── Scaling sweep (equal steps for visual linearity check) ──
# Operators take the grid as an argument, so we build them once
# per method (JIT compiles once) and reuse across all grid sizes.
import gc
grid_sizes = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 11000, 12000, 13000, 14000, 15000]
scaling = {m: [] for m in methods}
# Build + JIT-warmup operators once per method (smallest grid).
# Operators take grid as argument so they work with any size.
_cached = {}
for method in methods:
    _, _, ops_, w_ = solve_nest(
        SYNTAX_DIR, method=method,
        config_overrides={'grid_size': grid_sizes[0], 'T': 50})
    _cached[method] = (ops_, w_)
for gs in grid_sizes:
    for method in methods:
        ops_, w_ = _cached[method]
        n, _, _, _ = solve_nest(
            SYNTAX_DIR, method=method,
            config_overrides={'grid_size': gs, 'T': 50},
            stage_ops=ops_, waves=w_)
        ue, _ = get_timing(n)
        scaling[method].append(ue * 1000)
        del n
        gc.collect()
    print(f'grid_size={gs:>6d} '
          + ' '.join(f'{m}: {scaling[m][-1]:.2f}ms' for m in methods))
grid_size=  1000 FUES: 0.08ms DCEGM: 1.87ms RFC: 1.61ms CONSAV: 0.68ms
grid_size=  2000 FUES: 0.11ms DCEGM: 3.99ms RFC: 3.98ms CONSAV: 2.67ms
grid_size=  3000 FUES: 0.14ms DCEGM: 5.65ms RFC: 5.05ms CONSAV: 5.94ms
grid_size=  4000 FUES: 0.18ms DCEGM: 7.51ms RFC: 6.82ms CONSAV: 10.57ms
grid_size=  5000 FUES: 0.27ms DCEGM: 11.43ms RFC: 9.16ms CONSAV: 16.70ms
grid_size=  6000 FUES: 0.34ms DCEGM: 11.86ms RFC: 11.45ms CONSAV: 23.72ms
grid_size=  7000 FUES: 0.40ms DCEGM: 13.86ms RFC: 15.47ms CONSAV: 32.11ms
grid_size=  8000 FUES: 0.44ms DCEGM: 15.69ms RFC: 16.30ms CONSAV: 41.68ms
grid_size=  9000 FUES: 0.48ms DCEGM: 17.19ms RFC: 19.43ms CONSAV: 53.36ms
grid_size= 10000 FUES: 0.49ms DCEGM: 19.42ms RFC: 20.24ms CONSAV: 65.14ms
grid_size= 11000 FUES: 0.60ms DCEGM: 23.24ms RFC: 22.46ms CONSAV: 78.73ms
grid_size= 12000 FUES: 0.61ms DCEGM: 23.16ms RFC: 24.51ms CONSAV: 93.90ms
grid_size= 13000 FUES: 0.61ms DCEGM: 24.74ms RFC: 25.12ms CONSAV: 110.92ms
grid_size= 14000 FUES: 0.72ms DCEGM: 28.63ms RFC: 30.07ms CONSAV: 132.52ms
grid_size= 15000 FUES: 0.84ms DCEGM: 29.93ms RFC: 32.88ms CONSAV: 146.54ms
# ── Scaling plot with O(√N) reference line ──
from examples.retirement.outputs import nb_plot_scaling
fig = nb_plot_scaling(grid_sizes, scaling, methods=methods)
ax = fig.axes[0]
gs0 = grid_sizes[0]
t0 = scaling['FUES'][0]
ref_sqrt = [t0 * (gs / gs0) ** 0.5 for gs in grid_sizes]
ax.plot(grid_sizes, ref_sqrt, '--', color='gray', linewidth=1,
        alpha=0.6, label=r'$O(\sqrt{N})$')
ax.legend(ncol=2, frameon=False)
plt.show()
checkpoints = [1000, 3000, 12000]
idx_by_grid = {gs: i for i, gs in enumerate(grid_sizes)}
lines = [
    "**Notebook scaling observations (live):**",
    "",
    "The table below reports the upper-envelope time ratio `other method / FUES`, so larger numbers mean a larger speed advantage for FUES.",
    "",
    "| Grid size | MSS / FUES | RFC / FUES | LTM / FUES |",
    "|-----------|------------|------------|------------|",
]
for gs in checkpoints:
    i = idx_by_grid[gs]
    fues_t = scaling['FUES'][i]
    mss_ratio = scaling['DCEGM'][i] / fues_t
    rfc_ratio = scaling['RFC'][i] / fues_t
    ltm_ratio = scaling['CONSAV'][i] / fues_t
    lines.append(f"| {gs:,} | {mss_ratio:.1f}x | {rfc_ratio:.1f}x | {ltm_ratio:.1f}x |")
lines.extend([
    "",
    "MSS and RFC scale linearly and remain close to one another across the sweep, while LTM scales quadratically and becomes much more expensive as the grid grows. FUES scales less than linearly.",
])
display(Markdown("\n".join(lines)))
Notebook scaling observations (live):
The table below reports the upper-envelope time ratio other method / FUES, so larger numbers mean a larger speed advantage for FUES.
| Grid size | MSS / FUES | RFC / FUES | LTM / FUES |
|---|---|---|---|
| 1,000 | 24.2x | 20.8x | 8.8x |
| 3,000 | 39.2x | 35.1x | 41.3x |
| 12,000 | 37.7x | 39.9x | 153.0x |
MSS and RFC scale linearly and remain close to one another across the sweep, while LTM scales quadratically and becomes much more expensive as the grid grows. FUES scales less than linearly.
Laptop and live notebook timings can be noisy due to thermal throttling, OS scheduling, background processes, and memory pressure. The PBS cluster results below provide cleaner measurements on dedicated hardware.
6. PBS cluster timings (Gadi)¶
Compare the notebook scaling sweep (above) with timings from a PBS batch run on NCI Gadi.
The table below is parsed from a timing markdown file produced by run.py --run-timings.
We average UE time across all $\delta$ values for each grid size and method.
# ── PBS cluster timings ──
from nb_pbs import plot_pbs_scaling
timing_path = REPO / 'examples' / 'retirement' / 'results' / \
    'retirement_20260305_125534' / 'retirement_timing.md'
fig = plot_pbs_scaling(timing_path)
plt.show()