# RGA-PDE: Relaxed Greedy Algorithm for PDEs

This repository implements the Relaxed Greedy Algorithm (RGA) for the numerical solution of partial differential equations (PDEs). The implementation is inspired by the approach of J. W. Siegel et al. (2023).
The current focus is the 1D Poisson problem

$$
u'' = f \quad \text{on } [0, \pi]
$$

with Dirichlet boundary conditions. The boundary conditions are not enforced strongly, but via a boundary penalty term in the energy functional ("weakly imposed Dirichlet BC"); a sketch of such an energy is given below.
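For orientation, a typical penalized energy of this kind (a sketch only, assuming the sign convention $u'' = f$ from above and a penalty parameter $\delta > 0$, which reappears in the convergence discussion below; the exact functional used in the code may differ):

$$
\mathcal{R}(u) = \frac{1}{2} \int_0^\pi u'(x)^2 \, dx + \int_0^\pi f(x)\, u(x)\, dx + \frac{1}{2\delta} \left( u(0)^2 + u(\pi)^2 \right).
$$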
Typical right-hand sides are, e.g., $f \equiv 1$ and $f(x) = \sin(x)$.
## Goals of this project
- Empirical study of the convergence rate of RGA for PDEs.
- Influence of the dictionary (set of admissible atoms) on the constants in the theoretical convergence bounds.
- Understanding and explaining the plateaus observed in the error curves.
## Installation & setup

To run the experiments locally, create a Python environment and install the dependencies from `requirements.txt`.
1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd RGA-PDE
   ```

2. Create and activate a virtual environment:

   ```bash
   # create the venv
   python -m venv .venv

   # activate it (macOS/Linux)
   source .venv/bin/activate

   # OR: activate it (Windows)
   .venv\Scripts\activate
   ```

3. Install the dependencies:

   ```bash
   pip install -r requirements.txt
   ```
## Theoretical background (short version)
The Relaxed Greedy Algorithm (RGA) is an iterative method for minimizing a smooth, convex energy functional $\mathcal{R} : H \to \mathbb{R}$ over the convex hull $\operatorname{conv}(\mathbb{D})$ of a symmetric dictionary $\mathbb{D} \subset H$.
The algorithm starts with $u_0 = 0$. For each iteration $n \ge 1$, the update is

$$
u_n = (1 - \alpha_n)\, u_{n-1} + M \alpha_n\, g_n.
$$
Here:

**Budget ($M$).** $M$ is a fixed upper bound on the variation norm; in particular,

$$
\|u_n\|_{\mathcal{K}_1} \le M \quad \text{for all } n.
$$
**Relaxation / learning rate ($\alpha_n$).** A typical choice is

$$
\alpha_n = \min(1, 2/n).
$$

This enforces a relaxed step rather than a full greedy jump to the new atom.
**Greedy atom ($g_n$).** At each step, pick a single atom from the dictionary that best decreases the energy. In practice, an approximate version of

$$
g_n \approx \underset{g \in \mathbb{D}}{\arg\max}\, \bigl| \langle g, \nabla \mathcal{R}(u_{n-1}) \rangle_H \bigr|
$$

is solved numerically.
Intuition:

- Measure how the energy would change if a given atom were added.
- Choose the atom with the largest directional derivative (in absolute value).
- Update in that direction, but only partially (via $\alpha_n$).

A minimal sketch of the resulting loop is shown below.
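The following is a minimal NumPy sketch of the iteration, not the repository code: `select_atom` is a hypothetical callback standing in for the argmax subproblem described above, and the grid `x` and defaults are illustrative.

```python
import numpy as np

def rga(select_atom, x, n_iters=1000, M=10.0):
    """Relaxed Greedy Algorithm: u_n = (1 - a_n) * u_{n-1} + M * a_n * g_n."""
    u = np.zeros_like(x)                        # u_0 = 0
    for n in range(1, n_iters + 1):
        alpha = min(1.0, 2.0 / n)               # relaxation a_n = min(1, 2/n)
        g = select_atom(u)                      # (signed) atom from the symmetric dictionary
        u = (1.0 - alpha) * u + M * alpha * g   # relaxed update; keeps the K_1 budget
    return u
```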
## Convergence rates

Let $u_n$ be the RGA iterate after $n$ steps, and let $u^*$ be an exact PDE solution. Under standard assumptions (smoothness, convexity, symmetric dictionary, bounded variation norm), one obtains:
**Energy / optimization error:**

$$
\mathcal{R}(u_n) - \inf_{\|v\|_{\mathcal{K}_1} \le M} \mathcal{R}(v) \le \mathcal{O}\!\left( \frac{1}{n} \right).
$$
**$L^2$ error:** Combining this with a Poincaré-type estimate (and standard elliptic theory for the Poisson problem),

$$
\|u_n - u^*\|_{L^2} \le \mathcal{O}\!\left(n^{-1/2}\right).
$$
So asymptotically, the approximation improves at rate $n^{-1/2}$ in $L^2$.
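Where the square root comes from, in one line (assuming $u^*$ lies in the model class $\{\|v\|_{\mathcal{K}_1} \le M\}$ and that the penalized energy is strongly convex with respect to the $L^2$ norm with some constant $c > 0$, which is the role the Poincaré-type estimate plays here):

$$
c\, \|u_n - u^*\|_{L^2}^2 \;\le\; \mathcal{R}(u_n) - \mathcal{R}(u^*) \;\le\; \mathcal{O}\!\left(\frac{1}{n}\right) \quad\Longrightarrow\quad \|u_n - u^*\|_{L^2} \le \mathcal{O}\!\left(n^{-1/2}\right).
$$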
## Key implementation ideas
### 1. Dictionaries ($\mathbb{D}$)

Unlike FEM, the algorithm does not use a fixed basis on a mesh. Instead, it constructs the solution as a convex combination of "atoms" drawn from a dictionary. Two dictionary types are currently implemented (a code sketch of both follows the definitions below):
**(a) Quadratic ReLU atoms**

$$
g(x) = \operatorname{ReLU}^2(w x + b) = (w x + b)_+^2
$$

- Implemented in `RGA_1d_exact_argmax.py` and `RGA_debug.py`.
- Smooth except at one kink.
- Strongly localized, depending on $(w, b)$.
**(b) Hyperbolic tangent atoms**

$$
g(x) = \tanh(w x + b)
$$

- Implemented in `rga_tanh1d.py`.
- Globally smooth.
- Much better behavior near the domain boundaries in practice.
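For concreteness, the two atom families as plain NumPy functions (a sketch; the repository files define their own versions):

```python
import numpy as np

def relu2_atom(x, w, b):
    """Quadratic ReLU atom: g(x) = (w x + b)_+^2."""
    return np.maximum(w * x + b, 0.0) ** 2

def tanh_atom(x, w, b):
    """Hyperbolic tangent atom: g(x) = tanh(w x + b)."""
    return np.tanh(w * x + b)
```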
Why this matters:
- The dictionary determines which shapes the method can "mix" to approximate the PDE solution.
- It directly affects stability, convergence speed, and how quickly the method escapes error plateaus.
### 2. Argmax subproblem (atom selection)

Selecting $g_n$ is numerically the most delicate step. Two solver strategies are implemented:
**(a) Closed-form / exact argmax for quadratic ReLU**

For the quadratic ReLU dictionary, define

$$
c(b) = \langle g_b, \nabla \mathcal{R} \rangle, \qquad g_b(x) = (w x + b)_+^2.
$$

It can be shown that $c(b)$ is piecewise quadratic in the offset parameter $b$. The exact solver in `RGA_1d_exact_argmax.py`:

- builds all breakpoint intervals in $b$, which come from the grid points $x_i$ via $b_i = -w x_i$;
- evaluates all candidate parabola maxima and interval boundary values;
- picks the global maximizer.

Result: a (numerically) exact argmax for ReLU$^2$ atoms. A sketch of this breakpoint logic follows below.
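A sketch of the idea for a fixed weight $w$ (hypothetical discretization: `r` holds the gradient $\nabla \mathcal{R}$ sampled on a uniform grid `x`, and the inner product is a simple quadrature; the repository's exact routine may differ in these details):

```python
import numpy as np

def exact_argmax_b(x, r, w, b_min, b_max):
    """Maximize |c(b)| for c(b) = sum_i r_i (w x_i + b)_+^2 h, exploiting that
    c is piecewise quadratic in b with breakpoints b_i = -w x_i."""
    h = x[1] - x[0]                                   # uniform grid spacing

    def c(b):
        return np.sum(r * np.maximum(w * x + b, 0.0) ** 2) * h

    # Knots: the parameter bounds plus all breakpoints that fall between them.
    breaks = np.sort(-w * x)
    knots = np.concatenate(([b_min], breaks[(breaks > b_min) & (breaks < b_max)], [b_max]))

    # Candidates: every knot, plus each interval's parabola vertex (if interior).
    candidates = list(knots)
    for lo, hi in zip(knots[:-1], knots[1:]):
        active = w * x + 0.5 * (lo + hi) > 0          # active set is constant on (lo, hi)
        A = np.sum(r[active]) * h                     # c(b) = A b^2 + B b + const here
        B = 2.0 * w * np.sum(r[active] * x[active]) * h
        if A != 0.0 and lo < -B / (2.0 * A) < hi:
            candidates.append(-B / (2.0 * A))         # vertex of this parabola piece

    return max(candidates, key=lambda b: abs(c(b)))   # global maximizer of |c|
```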
**(b) Numerical / fallback solver (grid search + local refinement)**

For $\tanh$ atoms (and as a fallback for ReLU$^2$):

- coarse grid search over the parameters $(w, b)$;
- optional L-BFGS-B refinement starting from the best coarse candidate.

This is implemented in `RGA_debug.py` and `rga_tanh1d.py`; a sketch follows below.
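A sketch of this strategy using SciPy (hypothetical interface: `score(w, b)` returns $\bigl| \langle g_{w,b}, \nabla \mathcal{R} \rangle \bigr|$; the repository code may be organized differently):

```python
import numpy as np
from scipy.optimize import minimize

def numerical_argmax(score, w_grid, b_grid, refine=True):
    """Coarse grid search over (w, b), optionally refined with L-BFGS-B."""
    # Coarse stage: evaluate the score on the full parameter grid.
    W, B = np.meshgrid(w_grid, b_grid, indexing="ij")
    scores = np.vectorize(score)(W, B)
    i, j = np.unravel_index(np.argmax(scores), scores.shape)
    w0, b0 = w_grid[i], b_grid[j]

    # Refinement stage: local maximization starting from the best coarse candidate.
    if refine:
        res = minimize(lambda p: -score(p[0], p[1]), x0=[w0, b0], method="L-BFGS-B")
        if res.success and -res.fun > scores[i, j]:   # keep only if it actually improved
            return res.x[0], res.x[1]
    return w0, b0
```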
### 3. Convergence constants, boundary penalty & plateaus

The theoretical rate $\mathcal{O}(1/n)$ hides constants, and those constants are everything in practice.
A generic upper bound for the optimization error after $n$ steps is of the form

$$
\text{error}(n) \lesssim \frac{32\, (C M)^2 K}{n},
$$

where

- $K$ is the smoothness constant of $\mathcal{R}$; for the penalized Dirichlet energy used here, $K = 1$;
- $C = \sup_{g \in \mathbb{D}} \|g\|_H$ is the maximal dictionary atom norm;
- $M$ is the budget / variation-norm bound.
Key observations from experiments:

- **Boundary penalty and $C(\delta)$.** The weak enforcement of the Dirichlet boundary conditions uses a penalty parameter $\delta > 0$. For the ReLU$^2$ dictionary, the effective constant $C$ can blow up as $C(\delta) \sim \delta^{-1/2}$. Intuition: a small $\delta$ means a "very stiff" boundary penalty; atoms that violate the boundary even slightly get a huge energy weight near the boundary, which makes their norm in the penalized inner product extremely large.
- **Huge prefactor.** Because the prefactor $32\, (C M)^2$ can become enormous, the asymptotic $1/n$ regime may only become visible after millions of iterations. Before that, the measured error curve looks almost flat. $\Rightarrow$ This explains the long error plateaus observed in experiments.
- **$\tanh$ improves the constants.** The $\tanh$ dictionary behaves much more tamely near the boundary. Empirically this leads to a smaller effective $C$, more reasonable constants in the bound, and noticeably faster visible decay (less plateauing).

The back-of-the-envelope computation below illustrates the scale of the problem.
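All numbers here are hypothetical, chosen only to show the mechanism $C(\delta) \sim \delta^{-1/2}$:

```python
# Illustrative only: how a stiff boundary penalty inflates the O(1/n) prefactor.
delta = 1e-4            # boundary penalty parameter (hypothetical value)
C = delta ** -0.5       # effective atom-norm constant, C(delta) ~ delta^{-1/2} -> 100
M, K = 10.0, 1.0        # budget (hypothetical) and smoothness constant (K = 1 here)

prefactor = 32 * (C * M) ** 2 * K          # = 3.2e7
n_needed = prefactor / 1e-2                # iterations until the bound drops below 1e-2
print(f"{prefactor:.1e}, {n_needed:.1e}")  # 3.2e+07, 3.2e+09
```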
## File overview (selected)
- `RGA_1d_exact_argmax.py`
  - RGA loop with an exact argmax routine for ReLU$^2$ atoms.
  - Breakpoint/interval logic for computing the maximizing shift $b$.
- `RGA_debug.py`
  - Debug / playground version.
  - Can switch between the exact and the numerical argmax.
  - Contains the coarse grid search + L-BFGS-B refinement.
- `rga_tanh1d.py`
  - RGA specialized to $\tanh(w x + b)$ atoms.
  - Uses the numerical argmax.
- `requirements.txt`
  - Python dependencies for reproducing the experiments.
## TL;DR / Why this matters
- RGA builds the PDE solution as a greedy convex combination of simple nonlinear atoms instead of using a fixed mesh basis.
- Theory promises energy decay like $1/n$ and an $L^2$ error like $n^{-1/2}$.
- In practice those rates can be hidden behind gigantic constants, especially when the boundary penalty is too stiff.
- Different atom dictionaries (ReLU$^2$ vs. $\tanh$) dramatically change those constants and therefore the visible convergence.
## Next steps / open questions
- Quantify the observed plateaus: how long until the asymptotic regime actually kicks in for realistic $\delta$?
- Better parameter search / argmax heuristics to escape local plateaus faster.
- Adaptive or data-driven choice of dictionary atoms.
- Extension from 1D Poisson to higher dimensions and to more general PDEs.