Flux bounding: enumeration and stochastic sampling#

What’s in this notebook? This notebook introduces the bounded_fluxes class from jaxvacua.flux_bounding, which implements the systematic flux-bounding and enumeration algorithm of arXiv:2501.03984. The algorithm provides provably complete enumeration of Type IIB SUSY flux vacua within a finite moduli-space region subject to a D3 tadpole bound, plus a stochastic variant that scales to large \(N_{\max}\) where enumeration is infeasible.

In this notebook, you will learn:

  • The eigenvalue-bound construction of the NS-NS flux box.

  • The six-step sample_bounded_fluxes walkthrough — ISD completion, tadpole and eigenvalue filters, scipy / Newton refinement.

  • When to use enumeration (small \(N_{\max}\), complete) vs. stochastic Gaussian-prior sampling (large \(N_{\max}\)).

  • How to reproduce Dataset B from arXiv:2501.03984 end-to-end.

Prerequisites: NB05 — Finding flux vacua, NB06 — ISD principle, NB07 — ISD sampling.

Outline#

  1. Setup

  2. Algorithm overview

  3. Eigenvalue bounds

  4. Step-by-step walkthrough

  5. Enumeration vs stochastic sampling

  6. Case study: recovering Dataset B

  7. Take-aways

  8. Further reading

Setup#

import warnings, time, math
import numpy as np
from tqdm.auto import tqdm
from scipy.optimize import root

import jax
from jax import jit, vmap
import jax.numpy as jnp
jax.config.update("jax_enable_x64", True)

import matplotlib.pyplot as plt
import seaborn as sn
cmap = sn.color_palette("viridis", as_cmap=True)

import jaxvacua as jvc
from jaxvacua.flux_bounding import bounded_fluxes

warnings.filterwarnings("ignore")

Canonical fixture. Throughout this notebook we use the degree-18 hypersurface in \(\mathbb{CP}^{1,1,1,6,9}\) (\(h^{2,1}=2\), model_ID=1), the standard test model from arXiv:2501.03984 (Dataset B). The data_sampler here restricts \(\operatorname{Im}(z) \in [2, 3]\) and \(s = \operatorname{Im}(\tau) \in [2, 5]\) for comparability with §6’s Dataset B reproduction.

h12 = 2
model = jvc.FluxVacuaFinder(h12=h12, model_ID=1)
print(model)
sampler = jvc.data_sampler(
    model,
    moduli_bounds=(2., 3.),
    axion_bounds=(-0.5, 0.5),
    dilaton_bounds=(2., 5.),
)
print(sampler)

Quick start: the 1D mirror octic#

Before the two-modulus walkthrough below, here is the entire flux-bounding workflow on the simplest geometry — the mirror octic, the one-parameter (\(h^{2,1}=1\)) mirror of the degree-8 hypersurface in \(\mathbb{WP}^4_{1,1,1,1,4}\) (\(\kappa=2\), \(c_2\cdot D=44\), \(\chi=-296\)).

The algorithm in one sentence: from the extremal eigenvalues of the ISD matrix \(\mathcal{M}\) and the gauge-kinetic matrix \(\mathcal{N}\) it builds a finite box of integer NS-NS fluxes \(h\), ISD-completes each to a full flux \([f\,|\,h]\), keeps those passing the tadpole and eigenvalue filters, and (optionally) Newton-refines to SUSY vacua. The call sequence is always the same:

model = FluxVacuaFinder(...)                    # the 4d EFT
bf    = bounded_fluxes(model, sampler=..., Nmax=...)
vacua = bf.enumerate_fluxes(...)                # or bf.sample_bounded_fluxes(...)
# Mirror octic: one-parameter model (h12 = 1), stored as KS model_ID = 1.
m_oct = jvc.FluxVacuaFinder(h12=1, model_ID=1)
print(m_oct)

sampler_oct = jvc.data_sampler(
    m_oct,
    moduli_bounds=(1.5, 7.0),     # Im(z) window in the LCS region
    axion_bounds=(-0.5, 0.5),
    dilaton_bounds=(1.0, 10.0),
    use_jax=True,
    seed=0,
)
bf_oct = bounded_fluxes(m_oct, sampler=sampler_oct, Nmax=4)

# Complete enumeration inside the eigenvalue box (provably complete at small Nmax).
vac_oct = bf_oct.enumerate_fluxes(
    n_sample=200,
    n_isd_per_h=20,
    refine=True,
    verbose=False,
    confirm_streaming=False,
)
print(f"Found {len(vac_oct)} octic SUSY flux vacua with N_flux <= 4.")
for v in vac_oct[:5]:
    nf = float(m_oct.tadpole(jnp.array(v["flux"])).real)
    print(f"  flux={np.round(np.array(v['flux'])).astype(int).tolist()}  "
          f"z={complex(v['moduli'][0]):.3f}  N_flux={nf:.0f}")

Cross-check against Plauschinn–Schlechter#

The mirror octic has a published set of flux vacua (arXiv:2310.06040, Plauschinn–Schlechter). Their fluxes use a different period/basis ordering; the integer map to the jaxvacua convention is \(P_B = B\,P\), where \(P\) swaps the two flux halves and \(B\) is the octic basis change (both from numerical_periods.octic_conventions). Applying \(P_B\) to a paper flux \((H, F)\) yields the jaxvacua \([f\,|\,h]\) vector.

We load their dataset from the HuggingFace vacua_vault repo (safeguarded — the check is skipped cleanly if the data or stringforge is unavailable) and confirm that, in the LCS region, their fluxes are genuine SUSY vacua of our EFT: the Newton residual \(\sum_I |D_I W|\) drops to machine precision.

# Convention map (Plauschinn-Schlechter -> jaxvacua), from
# numerical_periods.octic_conventions:
_P  = np.array([[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]])   # flux half-order swap
_B  = np.array([[-1, 0, 0, 0], [0, -1, 0, 1], [0, 0, 1, 0], [0, 0, 0, 1]]) # octic basis change
_PB = _B @ _P

def paper_to_jaxvacua_octic(H_paper, F_paper):
    """Map a Plauschinn-Schlechter (H, F) flux pair to the jaxvacua [f | h] vector."""
    return np.concatenate([_PB @ np.asarray(F_paper),
                           _PB @ np.asarray(H_paper)]).astype(float)

# Optional local CSV (raw paper columns H(4), F(4), N_flux, Re_t, Im_t, patch, ..., c, s).
# Leave as None to load from HuggingFace; set a path to run the cross-check offline.
OCTIC_PS_CSV = None

RUN_OCTIC_PS = True
_rows = None
try:
    if OCTIC_PS_CSV is not None:
        import csv as _csv
        _rows = []
        with open(OCTIC_PS_CSV) as _f:
            for _r in _csv.reader(_f):
                try:
                    _c, _s = float(_r[14]), float(_r[15])
                except (ValueError, IndexError):
                    continue
                _rows.append(dict(
                    flux=paper_to_jaxvacua_octic([int(_r[i]) for i in range(4)],
                                                 [int(_r[i]) for i in range(4, 8)]),
                    t=float(_r[9]) + 1j * float(_r[10]), patch=_r[11],
                    tau=_c + 1j * _s))
    else:
        from stringforge import LCSDatabase
        _ps = LCSDatabase(dataset="tdf").fetch_vacua_from_hub(model=m_oct, label="mirror_octic")
        if len(_ps) == 0:
            raise FileNotFoundError("no rows labelled 'mirror_octic' in the vacua_vault repo")
        _rows = [dict(flux=np.array(list(_row["flux"]), float),
                      t=complex(_row["moduli_re"][0] + 1j * _row["moduli_im"][0]),
                      patch="0", tau=complex(_row["tau_re"] + 1j * _row["tau_im"]))
                 for _, _row in _ps.iterrows()]
except Exception as _e:
    print(f"Plauschinn-Schlechter octic dataset unavailable ({type(_e).__name__}: {_e}); "
          "skipping the cross-check.")
    RUN_OCTIC_PS = False

if RUN_OCTIC_PS and _rows:
    # Restrict to the LCS region (large Im(t)) and Newton-refine the paper fluxes.
    _lcs = [r for r in _rows if r["t"].imag >= 1.8 and r.get("patch", "0") in ("0", "Infinity")][:50]
    _mod = jnp.array([[r["t"]] for r in _lcs])
    _tau = jnp.array([r["tau"] for r in _lcs])
    _flx = jnp.array([r["flux"] for r in _lcs])
    _mr, _tr, _res = bf_oct.newton_refine_batch(_mod, _tau, _flx,
                                                step_size=1.0, tol=1e-12, max_iters=50)
    _res = np.asarray(_res)
    _n_ok = int((_res < 1e-8).sum())
    print(f"Plauschinn-Schlechter cross-check (LCS region): {_n_ok}/{len(_res)} vacua "
          f"reproduced (Newton residual < 1e-8; median {np.median(_res):.1e}).")

Algorithm overview#

The flux-bounding algorithm derives rigorous upper bounds on the integer flux quanta \((f, h)\) from the eigenvalues of two period-matrix-dependent objects:

  • The ISD matrix \(\mathcal{M}(z,\bar{z})\): its largest eigenvalue \(\lambda_{\max}\) controls the size of the full flux vector.

  • The gauge kinetic matrix \(\mathcal{N}(z,\bar{z})\): the extreme eigenvalues of \(-\mathrm{Im}(\mathcal{N})\) and \(\mathrm{Im}(\mathcal{N}^{-1})\) bound the NSNS-flux sub-vectors \(h_1, h_2\) individually.

Given a sample of moduli points, global extrema of these eigenvalues are computed and used to construct a bounding box for the integer NSNS-flux vectors \(h = (h_1, h_2)\). A complete set of candidate \(h\) vectors is then enumerated inside this box; for each candidate, an ISD-projected RR-flux \(f\) is constructed via

\[ f \approx \bigl(s\,\mathcal{M}\,\Sigma + c_0\bigr)\,h, \]
and the resulting flux configuration is tested against the D3-tadpole constraint and all local eigenvalue bounds.

Eigenvalue bounds#

Initialising bounded_fluxes#

The bounded_fluxes class is the central object. It takes a model, an optional sampler, a D3-tadpole bound Nmax, and an optional lower bound dil_min on \(s = \mathrm{Im}(\tau)\).

If Nmax is not supplied it defaults to model.D3_tadpole. If dil_min is not supplied it defaults to \(\sqrt{3}/2\), the boundary of the \(\mathrm{SL}(2,\mathbb{Z})\) fundamental domain.

bf = bounded_fluxes(model, sampler=sampler, Nmax=50)
print(bf)
print(f"n_fluxes     = {bf.n_fluxes}   (= 2*(h12+1))")
print(f"dimension_H3 = {bf.dimension_H3}   (= h12+1)")
print(f"Nmax         = {bf.Nmax}")
print(f"dil_min      = {bf.dil_min:.4f}  (= sqrt(3)/2)")

Eigenvalue quantities at a single moduli point#

The method compute_evs(moduli) computes the five key eigenvalue quantities at a single point in moduli space:

Quantity

Definition

\(\lambda_{\max}\)

Largest eigenvalue of \(\mathcal{M}\) (ISD matrix)

\(\mu_{\min}\)

Smallest eigenvalue of \(-\mathrm{Im}(\mathcal{N})\)

\(\mu_{\max}\)

Largest eigenvalue of \(-\mathrm{Im}(\mathcal{N})\)

\(\tilde\mu_{\min}\)

Smallest eigenvalue of \(\mathrm{Im}(\mathcal{N}^{-1})\)

\(\tilde\mu_{\max}\)

Largest eigenvalue of \(\mathrm{Im}(\mathcal{N}^{-1})\)

This method is JIT-compiled and fast to evaluate repeatedly:

# A test point inside the LCS region
moduli_test = jnp.array([0.1 + 2.5j, -0.1 + 3.0j])

lam_max, mu_min, mu_max, tmu_min, tmu_max = bf.compute_evs(moduli_test)

print(f"lambda_max   = {lam_max:.6f}")
print(f"mu_min       = {mu_min:.6f}")
print(f"mu_max       = {mu_max:.6f}")
print(f"tmu_min      = {tmu_min:.6f}")
print(f"tmu_max      = {tmu_max:.6f}")

The batched version compute_evs_vmap(moduli_batch) evaluates these quantities over a whole array of moduli in one vectorised call:

moduli_batch = sampler.get_complex_moduli(200)
evs_batch = bf.compute_evs_vmap(jnp.array(moduli_batch))
lam_arr, mu_min_arr, mu_max_arr, tmu_min_arr, tmu_max_arr = evs_batch

print(f"lambda_max range : [{float(lam_arr.min()):.4f}, {float(lam_arr.max()):.4f}]")
print(f"mu_min    range  : [{float(mu_min_arr.min()):.4f}, {float(mu_min_arr.max()):.4f}]")
print(f"tmu_min   range  : [{float(tmu_min_arr.min()):.4f}, {float(tmu_min_arr.max()):.4f}]")

Computing the global bounding box#

compute_bounding_box(moduli_sample) iterates over the sample, updating the global eigenvalue extrema, and returns the \(L^2\) radii of the bounding box for the NSNS-flux vector:

\[ \|h_1\|^2 \leq \frac{N_{\max}}{s_{\min}\,\tilde\mu_{\min}^{\rm gl}}, \quad \|h_2\|^2 \leq \frac{N_{\max}}{s_{\min}\,\mu_{\min}^{\rm gl}}, \quad \|h\|^2 \leq \frac{2\,\lambda_{\max}^{\rm gl}\,N_{\max}}{s_{\min}}. \]

The global extrema are also accessible as attributes of bf after the call.

moduli_sample = sampler.get_complex_moduli(500)
h1_box, h2_box, h_box = bf.compute_bounding_box(moduli_sample)

print("Global eigenvalue extrema:")
print(f"  lambda_max_gl   = {bf.lambda_max_gl:.4f}")
print(f"  mu_min_gl       = {bf.mu_min_gl:.4f}")
print(f"  tilde_mu_min_gl = {bf.tilde_mu_min_gl:.4f}")
print()
print("Bounding box radii (L2 norms):")
print(f"  h1_box = {h1_box:.3f}   (|h1|^2 <= {h1_box**2:.2f})")
print(f"  h2_box = {h2_box:.3f}   (|h2|^2 <= {h2_box**2:.2f})")
print(f"  h_box  = {h_box:.3f}   (|h|^2  <= {h_box**2:.2f})")
print()
print(f"Dilaton upper bound: dil_max = {bf.dil_max:.4f}")

Pre-computing eigenvalue bounds#

For large-scale scans, computing eigenvalue bounds separately (once, with many moduli points) is more efficient than recomputing them inside each enumerate_fluxes or sample_bounded_fluxes call. The compute_eigenvalue_bounds method does this and stores the results as class attributes:

# Pre-compute once with a large sample — reused by all subsequent calls
bf.reset_eigenvalue_bounds()  # clear any previous bounds
h1_box, h2_box, h_box = bf.compute_eigenvalue_bounds(n_sample=10_000)
print(f"\nbounds_initialized = {bf.bounds_initialized}")

The box dimensions can always be retrieved later:

h1_box, h2_box, h_box = bf.get_h_box()
print(h1_box, h2_box, h_box)

Checking bounds for a given flux#

check_bounds(moduli, tau, flux) evaluates all eigenvalue inequalities of arXiv:2501.03984 for a specific flux configuration. It first calls update_local to compute the local eigenvalues and flux sub-vector norms, then evaluates every bound_* method.

Let us demonstrate this with a sample flux configuration:

# Draw a random initial guess from the sampler
seed = 42
rns_key = jvc.PRNGSequence(seed)

sampler.update_interior_points(num_pts=5000)
moduli_0, tau_0, flux_0 = sampler.initial_guesses(1, rns_key=rns_key)
moduli_0 = moduli_0[0]
tau_0    = complex(tau_0[0])
flux_0   = flux_0[0]

print("moduli =", moduli_0)
print("tau    =", tau_0)
print("flux   =", flux_0)
results = bf.check_bounds(moduli_0, tau_0, flux_0)

print(f"{'Bound':<20} {'Result'}")
print("-" * 40)
for result, label in results:
    print(f"{label:<20} {result}")

The local eigenvalue state is now stored in bf and can be inspected directly:

print(f"s          = {bf.s:.4f}")
print(f"c0         = {bf.c0:.4f}")
print(f"N_flux     = {bf.Nflux:.4f}")
print(f"lambda_max = {bf.lambda_max:.4f}")
print(f"mu_min     = {bf.mu_min:.4f}")
print(f"tmu_min    = {bf.tilde_mu_min:.4f}")
print()
print(f"||h||^2  = {bf.hnorm:.4f}")
print(f"||f||^2  = {bf.fnorm:.4f}")
print(f"||h1||^2 = {bf.h1norm:.4f}")
print(f"||h2||^2 = {bf.h2norm:.4f}")

check_bounds_flat() aggregates all bounds into a single pass/fail flag (no arguments — uses the current local state):

all_pass, detailed = bf.check_bounds_flat()
print(f"All bounds satisfied: {all_pass}")

Step-by-step walkthrough#

This section opens sample_bounded_fluxes at high resolution: each of the six internal stages — eigenvalue bounds, NS-NS box, ISD completion, tadpole + eigenvalue filters, ISD-condition sanity check, Newton/scipy refinement — gets its own code cell and a short prose introduction. The walkthrough uses a wider moduli/dilaton region than §4 so the box stays enumerable for cross-checks. The shadowed sampler and bf here are walkthrough-local; later sections build their own.

Model and sampler setup#

We use the same \(h^{1,2}=2\) Kreuzer–Skarke geometry as in the earlier flux-bounding examples. The moduli region is:

\[ \mathrm{Im}(z_i) \in [2, 3],\quad s = \mathrm{Im}(\tau) \in [2, 5],\quad c_0 = \mathrm{Re}(\tau) \in [-0.5, 0.5]. \]

We set \(N_{\max} = 5\) so that the bounding box is small enough to enumerate exhaustively as a cross-check.

DILATON_BOUNDS  = [2, 5]
MODULI_BOUNDS   = [2, 3]
AXION_BOUNDS    = [-0.5, 0.5]
NMAX            = 5

sampler = jvc.data_sampler(
    model,
    flux_bounds=[-5, 5],
    moduli_bounds=MODULI_BOUNDS,
    axion_bounds=AXION_BOUNDS,
    dilaton_bounds=DILATON_BOUNDS,
    seed=0,
)

print(f"s   ∈ [{sampler.s_lower}, {sampler.s_upper}]")
print(f"Im(z) ∈ [{sampler.moduli_lower}, {sampler.moduli_upper}]")
print(f"c₀  ∈ [{sampler.axion_lower}, {sampler.axion_upper}]")

Step 1 — Global eigenvalue bounds#

sample_bounded_fluxes first samples \(n_{\mathrm{sample}}\) moduli points, computes three eigenvalue quantities at each point, and takes global extrema.

Quantity

Matrix

Role

\(\lambda_{\max}\)

ISD matrix \(\mathcal{M}\)

bounds \(|h|^2\) and \(|f|^2\)

\(\mu_{\min/\max}\)

\(-\mathrm{Im}(\mathcal{N})\)

bounds \(|h_2|^2\) (B-cycle NSNS sector)

\(\tilde{\mu}_{\min/\max}\)

\(\mathrm{Im}(\mathcal{N}^{-1})\)

bounds \(|h_1|^2\) (A-cycle NSNS sector)

Below we reproduce this step manually so we can inspect every number.

N_SAMPLE = 200_000
np.random.seed(42)

moduli_sample = sampler.get_complex_moduli(N_SAMPLE)
tau_sample    = sampler.get_complex_tau(N_SAMPLE)

flag = np.all(moduli_sample.imag>MODULI_BOUNDS[0],axis=1)
print(np.where(flag)[0].shape[0], "points with Im(z) >", MODULI_BOUNDS[0])
moduli_sample = moduli_sample[flag]
tau_sample    = tau_sample[flag]

print(f"Sampled {len(moduli_sample)} moduli points.")
print(f"Im(z) range: [{moduli_sample.imag.min():.3f}, {moduli_sample.imag.max():.3f}]")
print(f"s range:     [{tau_sample.imag.min():.3f}, {tau_sample.imag.max():.3f}]")
# Compute eigenvalues at a single representative moduli point.
z0  = moduli_sample[0]
tau0 = tau_sample[0]
z0c = jnp.conj(z0)

# Gauge kinetic matrix N(z, z_bar)
N_mat = model.gauge_kinetic_matrix(z0, z0c)
print("N matrix (gauge kinetic):")
print(N_mat)

# Eigenvalues of -Im(N)
mu_evs = jnp.linalg.eigvalsh(-N_mat.imag)
print(f"\nEigenvalues of -Im(N): {np.array(mu_evs)}")
print(f"  μ_min = {float(mu_evs.min()):.6f},  μ_max = {float(mu_evs.max()):.6f}")

# Eigenvalues of Im(N^{-1})
tmu_evs = jnp.linalg.eigvalsh(jnp.linalg.inv(N_mat).imag)
print(f"\nEigenvalues of Im(N⁻¹): {np.array(tmu_evs)}")
print(f"  μ̃_min = {float(tmu_evs.min()):.6f},  μ̃_max = {float(tmu_evs.max()):.6f}")
# ISD matrix M(z, z_bar) — formula from periods.py:
#   Block1 = [Im(N) + Re(N) @ Im(N)^{-1} @ Re(N)  ,  Re(N) @ Im(N)^{-1}]
#   Block2 = [Im(N)^{-1} @ Re(N)                   ,  Im(N)^{-1}         ]
#   M = -[Block1; Block2]

M0 = model.ISD_matrix(z0, z0c)
lm_evs = jnp.linalg.eigvalsh(M0)
print("ISD matrix M shape:", M0.shape)
print(f"ISD matrix eigenvalues: {np.array(lm_evs)}")
print(f"  λ_max = {float(lm_evs.max()):.6f}")
print()
# Verify M is positive definite
print(f"All eigenvalues positive? {bool(jnp.all(lm_evs > 0))}")
# Now compute global extrema over the full moduli sample using the JIT-compiled vmap.
# This is what compute_bounding_box does internally.
bf = bounded_fluxes(model, sampler=sampler, Nmax=NMAX)

# The dil_min should tighten from sqrt(3)/2 to sampler.s_lower=2
# (done inside sample_bounded_fluxes / enumerate_fluxes automatically)
bf.dil_min = float(sampler.s_lower)  # manual tightening for this walkthrough
print(f"dil_min (used for bounds) = {bf.dil_min}  (= sampler.s_lower)")

# JIT-vmapped eigenvalue computation over all moduli points
evs_all, M_all = bf._compute_evs_and_M_vmap(jnp.array(moduli_sample, dtype=complex))
lm_arr, mu_min_arr, mu_max_arr, tmu_min_arr, tmu_max_arr = evs_all

lambda_max_gl   = float(jnp.max(lm_arr))
mu_min_gl       = float(jnp.min(mu_min_arr))
mu_max_gl       = float(jnp.max(mu_max_arr))
tmu_min_gl      = float(jnp.min(tmu_min_arr))
tmu_max_gl      = float(jnp.max(tmu_max_arr))

print(f"\nGlobal eigenvalue extrema over {N_SAMPLE} moduli points:")
print(f"  λ_max   = {lambda_max_gl:.4f}")
print(f"  μ_min   = {mu_min_gl:.4f}")
print(f"  μ_max   = {mu_max_gl:.4f}")
print(f"  μ̃_min  = {tmu_min_gl:.6f}")
print(f"  μ̃_max  = {tmu_max_gl:.4f}")
fig, axes = plt.subplots(1, 3, dpi=130, figsize=(13, 3.2))

axes[0].hist(np.array(lm_arr), bins=40, edgecolor='k', lw=0.4)
axes[0].axvline(lambda_max_gl, color='r', lw=1.5, label=f'λ_max={lambda_max_gl:.1f}')
axes[0].set_xlabel(r'$\lambda_{\max}(\mathcal{M})$', fontsize=11)
axes[0].set_ylabel('Count', fontsize=11)
axes[0].set_title('ISD matrix max eigenvalue', fontsize=10)
axes[0].legend(fontsize=8)
axes[0].set_yscale('log')

axes[1].hist(np.array(mu_min_arr), bins=40, edgecolor='k', lw=0.4)
axes[1].axvline(mu_min_gl, color='r', lw=1.5, label=f'μ_min={mu_min_gl:.3f}')
axes[1].set_xlabel(r'$\mu_{\min}(-\mathrm{Im}\,\mathcal{N})$', fontsize=11)
axes[1].set_title(r'Gauge kinetic $\mu_{\min}$', fontsize=10)
axes[1].legend(fontsize=8)
axes[1].set_yscale('log')

axes[2].hist(np.array(tmu_min_arr), bins=40, edgecolor='k', lw=0.4)
axes[2].axvline(tmu_min_gl, color='r', lw=1.5, label=f'μ̃_min={tmu_min_gl:.5f}')
axes[2].set_xlabel(r'$\tilde\mu_{\min}(\mathrm{Im}\,\mathcal{N}^{-1})$', fontsize=11)
axes[2].set_title(r'Inverse gauge kinetic $\tilde\mu_{\min}$', fontsize=10)
axes[2].legend(fontsize=8)
axes[2].set_yscale('log')

plt.tight_layout()
plt.show()

Step 2 — Bounding box for NSNS-flux \(h\)#

From the inequalities in arXiv:2501.03984 the allowed integer NSNS-fluxes \(h = (h_1, h_2)\) must satisfy:

\[ \tilde\mu_{\min}\,\|h_1\|^2 \leq \frac{N_{\max}}{s_{\min}},\qquad \mu_{\min}\,\|h_2\|^2 \leq \frac{N_{\max}}{s_{\min}},\qquad \|h\|^2 \leq \frac{2\,\lambda_{\max}\,N_{\max}}{s_{\min}}. \]

Here \(s_{\min}\) is dil_min, which we set to sampler.s_lower = 2.

The key insight is that \(s_{\min}\) must reflect the actual lower bound of the sampled region, not the SL(2,Z) fundamental domain floor \(\sqrt{3}/2 \approx 0.866\). Using the correct \(s_{\min} = 2\) tightens the box radii by a factor \(\sqrt{2/0.866} \approx 1.5\), reducing box volume by \(\sim 3.4\times\) for each component.

s_min = bf.dil_min  # = 2.0 after tightening

h1_box = np.sqrt(NMAX / (s_min * tmu_min_gl))
h2_box = np.sqrt(NMAX / (s_min * mu_min_gl))
h_box  = np.sqrt(2.0 * lambda_max_gl * NMAX / s_min)

print(f"Bounding box (with s_min = {s_min}):")
print(f"  h1_box = sqrt(Nmax / (s_min * μ̃_min)) = sqrt({NMAX} / ({s_min} * {tmu_min_gl:.5f})) = {h1_box:.3f}")
print(f"  h2_box = sqrt(Nmax / (s_min * μ_min))  = sqrt({NMAX} / ({s_min} * {mu_min_gl:.5f})) = {h2_box:.3f}")
print(f"  h_box  = sqrt(2*λ_max*Nmax / s_min)    = sqrt(2*{lambda_max_gl:.3f}*{NMAX} / {s_min})  = {h_box:.3f}")

# Cross-check against compute_bounding_box
h1_box_cc, h2_box_cc, h_box_cc = bf.compute_bounding_box(
    jnp.array(moduli_sample, dtype=complex)
)
print(f"\nFrom compute_bounding_box:")
print(f"  h1_box={h1_box_cc:.3f}, h2_box={h2_box_cc:.3f}, h_box={h_box_cc:.3f}")
print(f"  (matches: {np.allclose([h1_box, h2_box, h_box], [h1_box_cc, h2_box_cc, h_box_cc], rtol=1e-4)})")
# What would happen if we forgot to tighten dil_min?
s_min_wrong = np.sqrt(3) / 2  # SL(2,Z) fundamental domain floor
h1_box_wrong = np.sqrt(NMAX / (s_min_wrong * tmu_min_gl))
h2_box_wrong = np.sqrt(NMAX / (s_min_wrong * mu_min_gl))
h_box_wrong  = np.sqrt(2.0 * lambda_max_gl * NMAX / s_min_wrong)

print(f"Comparison of bounding box radii:")
print(f"  s_min = {s_min_wrong:.4f} (wrong, SL(2,Z) floor):  h_box = {h_box_wrong:.2f}")
print(f"  s_min = {s_min:.4f} (correct, sampler floor): h_box = {h_box:.2f}")
print(f"  Volume ratio: ({h_box_wrong:.2f}/{h_box:.2f})^4 ≈ {(h_box_wrong/h_box)**4:.1f}x more candidates with wrong s_min")
# Enumerate all integer h vectors inside the bounding box (for small Nmax=2 this is feasible)

h_candidates = bf.get_h_candidates()
print(f"Total integer h candidates inside bounding box: {len(h_candidates)}")
print(f"h vector shape: {h_candidates.shape}  (n_candidates x n_fluxes)")
print(f"n_fluxes = {model.n_fluxes}, dimension_H3 = {model.dimension_H3}")
print(f"  → h = [h₁ | h₂] where h₁,h₂ each have {model.dimension_H3} components")

Step 3 — ISD completion: \(h \mapsto f\)#

For each NSNS-flux \(h\) and each sampled moduli point \((z, \tau)\), we compute the ISD-projected RR-flux:

\[ f = s\,\mathcal{M}(z,\bar z)\,\Sigma\,h + c_0\,h, \qquad \tau = c_0 + \mathrm{i}s. \]

This is mode "H" in sampling.py’s _ISD_sampling_FH. The result \(f\) is continuous (not integer); we round to the nearest integer and later check the tadpole constraint.

Convention verification#

We verify that flux_bounding.py’s formula matches sampling.py’s formula exactly.

# Take the first moduli point as our working example
z_ex  = jnp.array(moduli_sample[0], dtype=complex)
tau_ex = jnp.array(tau_sample[0], dtype=complex)
z_ex_c = jnp.conj(z_ex)
tau_ex_c = jnp.conj(tau_ex)

c0_ex = float(jnp.real(tau_ex))
s_ex  = float(jnp.imag(tau_ex))

print(f"Working moduli point:")
print(f"  z   = {np.array(z_ex)}")
print(f"  τ   = {complex(tau_ex):.6f}  (c₀={c0_ex:.4f}, s={s_ex:.4f})")

# Symplectic form sigma from model
sigma = model.periods.sigma
print(f"\nSymplectic form σ:")
print(np.array(sigma).astype(int))
# Pick a sample h vector
h_test = jnp.array([0, 1, 2, 0, 0, 1], dtype=float)  # [h1_1, h1_2, h2_1, h2_2]

# Method A: flux_bounding.py formula
M0_ex = model.ISD_matrix(z_ex, z_ex_c)
M0_sigma = M0_ex @ sigma
f_bounding = s_ex * (M0_sigma @ h_test) + c0_ex * h_test

# Method B: sampling.py _ISD_sampling_FH mode="H" formula
SigmaFlux = sigma @ h_test              # σ @ h
M0_SigmaFlux = M0_ex @ SigmaFlux        # M₀ @ σ @ h
f_sampling = M0_SigmaFlux * s_ex + h_test * c0_ex

# Method C: sampling.py's ISD_sampling function (the official API)
#flux_full_test = jnp.concatenate([jnp.zeros(model.n_fluxes), h_test])  # [f=0 | h]
flux_full_test = h_test
f_official = sampler.ISD_sampling(
    z_ex, z_ex_c, tau_ex, tau_ex_c,
    flux_full_test,
    mode="H", output="half", return_integer_flux=False,
)

print("ISD completion: f = s * M₀ @ (σ @ h) + c₀ * h")
print(f"  f (flux_bounding formula) = {np.array(f_bounding)}")
print(f"  f (sampling.py formula)   = {np.array(f_sampling)}")
print(f"  f (ISD_sampling API)      = {np.array(f_official)}")
print()
print(f"  All agree: {np.allclose(f_bounding, f_sampling) and np.allclose(f_bounding, f_official)}")
# Batch ISD completion: apply to all h candidates in parallel
# This is the inner loop of _process_h_at_modulus_jit

h_chunk = jnp.array(h_candidates[:200], dtype=float)  # first 200 for display

# Batched formula: (M0_sigma @ h_chunk.T).T  applies M0_sigma to each h as a column
f_chunk = s_ex * (M0_sigma @ h_chunk.T).T + c0_ex * h_chunk
f_int_chunk = jnp.round(f_chunk)  # round to nearest integer

print(f"ISD-completed f for first 5 h candidates (before rounding):")
for i in range(5):
    print(f"  h = {np.array(h_candidates[i])}, f_cont = {np.array(f_chunk[i]).round(3)}, f_int = {np.array(f_int_chunk[i]).astype(int)}")

Step 4 — Tadpole and eigenvalue bound filters#

After ISD completion we apply two filters to each candidate flux \([f | h]\):

  1. Tadpole constraint: \(N_{\mathrm{flux}} = f^T \Sigma\, h \in (0, N_{\max}]\)

  2. Eigenvalue bounds: a set of local and global inequalities from arXiv:2501.03984 involving \(\|h\|^2, \|f\|^2\), and the eigenvalues at this moduli point.

We verify the tadpole formula first.

Sign convention. We use the signed tadpole \(N_{\mathrm{flux}} = f^T \Sigma\, h\) with \(\Sigma = \begin{pmatrix}0 & \mathbb{1}\\ -\mathbb{1} & 0\end{pmatrix}= \) model.periods.sigma. This equals model.tadpole(flux) exactly. Since \(\Sigma\) is antisymmetric, \(f^T\Sigma h = -\,h^T\Sigma f\); for physical (ISD-completed) fluxes \(N_{\mathrm{flux}}>0\), so no absolute value is needed. The same convention is cross-checked against arXiv:2501.03984 in the Dataset B case study (§6).

# D3-tadpole formula: N_flux = f @ sigma @ h  (antisymmetric bilinear form)
# For a full flux vector [f | h] of length 2*n_fluxes:
#   N_flux = flux[:n_fl] @ sigma @ flux[n_fl:]

n_fl = model.n_fluxes
dim  = model.dimension_H3

# Use the first ISD-completed example
h_ex = jnp.array(h_candidates[10], dtype=float)
f_ex = jnp.round(s_ex * (M0_sigma @ h_ex) + c0_ex * h_ex)
flux_ex = jnp.concatenate([f_ex, h_ex])

# Compute tadpole three ways:
N1 = float((f_ex @ sigma @ h_ex).real)
N2 = float((flux_ex[:n_fl] @ jnp.dot(sigma, flux_ex[n_fl:])).real)
N3 = float(model.tadpole(flux_ex).real)

print(f"Flux example: f = {np.array(f_ex).astype(int)}, h = {np.array(h_ex).astype(int)}")
print(f"Tadpole N_flux = f @ σ @ h:")
print(f"  Method 1 (direct):          {N1:.6f}")
print(f"  Method 2 (via full vector): {N2:.6f}")
print(f"  Method 3 (model.tadpole):   {N3:.6f}")
print(f"  All agree: {np.isclose(N1, N2) and np.isclose(N1, N3)}")
# For an exactly ISD flux (not rounded), the tadpole has a clean form:
#   N_flux = s * h^T @ M^{-1} @ h  (positive for positive-definite M)

h_ex_f = jnp.array(h_candidates[10], dtype=float)
f_ex_cont = s_ex * (M0_sigma @ h_ex_f) + c0_ex * h_ex_f  # continuous (not rounded)

N_continuous = float((f_ex_cont @ sigma @ h_ex_f).real)
N_formula    = (s_ex * h_ex_f @ jnp.linalg.inv(M0_ex) @ h_ex_f).real

print("For the CONTINUOUS (pre-rounding) ISD flux:")
print(f"  N_flux = f_cont @ σ @ h        = {N_continuous:.8f}")
print(f"  N_flux = s * h^T M^{{-1}} h     = {N_formula:.8f}")
print(f"  Match (ISD identity):          {np.isclose(N_continuous, N_formula)}")
print()
print("Note: after rounding f to integers, N_flux changes slightly.")
# Run the JIT-compiled filter on h candidates at this one moduli point.
# We process the candidates in chunks: this avoids a large one-shot compile/allocation
# while still exercising the same kernel used by the production bounded-flux search.
from jaxvacua.flux_bounding import _process_h_at_modulus_jit

h_candidates_arr = jnp.array(h_candidates, dtype=float)
evs_ex = (
    float(jnp.max(jnp.linalg.eigvalsh(M0_ex))),
    float(jnp.min(jnp.linalg.eigvalsh(-N_mat.imag))),
    float(jnp.max(jnp.linalg.eigvalsh(-N_mat.imag))),
    float(jnp.min(jnp.linalg.eigvalsh(jnp.linalg.inv(N_mat).imag))),
    float(jnp.max(jnp.linalg.eigvalsh(jnp.linalg.inv(N_mat).imag))),
)
dil_max = float(lambda_max_gl * NMAX)

chunk_size = min(1024, len(h_candidates_arr))
flux_chunks = []
valid_chunks = []
n_chunks = 0
for start in tqdm(range(0, len(h_candidates_arr), chunk_size)):
    h_chunk = h_candidates_arr[start:start + chunk_size]
    flux_chunk, valid_chunk = _process_h_at_modulus_jit(
        h_chunk,
        M0_sigma,
        n_fl, dim,
        jnp.array(s_ex),
        jnp.array(c0_ex),
        evs_ex,
        tau_ex,
        sigma,
        lambda_max_gl, mu_min_gl, mu_max_gl, tmu_min_gl, tmu_max_gl,
        float(s_min), dil_max, float(NMAX),
    )
    flux_chunks.append(flux_chunk)
    valid_chunks.append(valid_chunk)
    n_chunks += len(flux_chunk)
    
print(f"Processed {n_chunks} flux candidates.")
flux_all = jnp.concatenate(flux_chunks, axis=0)
valid_all = jnp.concatenate(valid_chunks, axis=0)
n_valid = int(jnp.sum(valid_all))
print(f"Candidates passing all filters (at moduli point 0): {n_valid} / {len(h_candidates)}")
print(f"Pass rate: {100*n_valid/len(h_candidates):.2f}%")
# Manually check the tadpole for all passing candidates.
valid_idx = np.where(np.array(valid_all))[0]
flux_valid = np.array(flux_all)[valid_idx]

if len(flux_valid) == 0:
    print("No candidates passed all filters at this illustrative moduli point.")
else:
    tadpoles = np.array([float(model.tadpole(jnp.array(fv)).real) for fv in flux_valid])
    print("Tadpoles of passing candidates:")
    print(f"  min={tadpoles.min():.1f}, max={tadpoles.max():.1f}")
    print(f"  All in (0, {NMAX}]: {bool(np.all((tadpoles > 0) & (tadpoles <= NMAX)))}")
    print()
    print("First few passing fluxes [f | h] (integer):")
    for i, (fv, tad) in enumerate(zip(flux_valid[:5], tadpoles[:5])):
        print(f"  [{i}] {fv.astype(int).tolist()}  N_flux={tad:.0f}")

Step 5 — ISD convention consistency check#

We perform a direct cross-check: take a single ISD flux constructed by sampling.py’s initial_guesses_ISD, and verify that:

  1. The ISD condition \(\star G_3 = \mathrm{i}\,G_3\) is approximately satisfied (up to the integrality rounding)

  2. The tadpole \(N_{\mathrm{flux}} = f^T \Sigma h\) is positive and within bounds

  3. The flux_bounding.py formula gives the same \(f\) for the same \((h, z, \tau)\)

# ISD condition: f = s * M(z) * sigma * h + c0 * h
# Equivalently: (f - c0*h) = s * M(z) * sigma * h
# Or: ftilde = f - c0*h satisfies ftilde = s * M * sigma * h
#
# Check the degree to which the integer-rounded ISD flux satisfies this.

f_int = jnp.round(f_ex_cont)  # integer-rounded ISD flux
ftilde = f_int - c0_ex * h_ex_f
rhs    = s_ex * (M0_sigma @ h_ex_f)

print("ISD residual check for integer-rounded flux:")
print(f"  f_tilde  = f - c₀ h = {np.array(ftilde).round(4)}")
print(f"  s M σ h             = {np.array(rhs).round(4)}")
print(f"  ||f_tilde - s M σ h|| = {float(jnp.linalg.norm(ftilde - rhs)):.4f}  (rounding error)")
# Full ISD condition in terms of G3 = f - tau*h (complex flux)
# Self-duality: G3 = (f - tau*h) should satisfy M * sigma * G3 = -G3
# (M is the ISD projector with eigenvalue -1 for ISD fluxes)

# For integer flux we check how well this is satisfied
G3 = f_int - tau_ex * h_ex_f
sigma_G3 = sigma @ G3
M_sigma_G3 = M0_ex @ sigma_G3

print("Self-duality check: M σ G₃ ≈ -G₃ ?")
print(f"  G₃         = {np.array(G3.real).round(3)} + i * {np.array(G3.imag).round(3)}")
print(f"  M σ G₃     = {np.array(M_sigma_G3.real).round(3)} + i * {np.array(M_sigma_G3.imag).round(3)}")
print(f"  -G₃        = {np.array(-G3.real).round(3)} + i * {np.array(-G3.imag).round(3)}")
print()
# For the continuous flux this should be exact
G3_cont = f_ex_cont - tau_ex * h_ex_f
sigma_G3_cont = sigma @ G3_cont
M_sigma_G3_cont = M0_ex @ sigma_G3_cont
print(f"  ||M σ G₃_cont - (-G₃_cont)|| = {float(jnp.linalg.norm(M_sigma_G3_cont + G3_cont)):.2e}  (continuous, should be ~0)")

Step 6 — Refinement with scipy.optimize.root#

The ISD-completed, integer-rounded fluxes are not exact SUSY vacua: after rounding, \(D_I W \neq 0\). The final step is to numerically solve

\[ D_I W(z, \tau) = 0 \]

for the moduli and axio-dilaton, holding the integer flux fixed.

Rather than using the Newton method built into jaxvacua, we use scipy.optimize.root (Hybr method by default) for transparency. We pack the real and imaginary parts of \((z_1, z_2, \tau)\) into a real vector, evaluate \(D_I W\) via the JAX model, and let SciPy find the root.

Initial guess strategy#

The moduli point at which ISD completion was performed is an excellent initial guess: the rounded integer flux is close to ISD at that point, so \(D_I W\) is small there.

# Run the full sample_bounded_fluxes with refine=False to get initial guesses
bf2 = bounded_fluxes(model, sampler=sampler, Nmax=NMAX)

candidates = bf2.sample_bounded_fluxes(
    n_target=5_000,
    n_batch=100_000,
    n_sample=1_000,
    n_mod=1_000,
    max_batches=20,
    verbose=True,
    seed=42,
    refine=False,
    return_moduli=True,
)
print(f"\nFound {len(candidates)} initial candidates.")
# Inspect the F-term residuals at the initial guess moduli points
print("F-term residuals |DW| at initial guess moduli:")
print(f"{'Flux':>12s}  {'|DW| pre-refinement':>52s}  N_flux")
print("-" * 60)
for r in candidates:
    z   = jnp.array(r["moduli"], dtype=complex)
    tau = jnp.array(r["tau"], dtype=complex)
    fl  = jnp.array(r["flux"])
    DW  = model.DW(z, jnp.conj(z), tau, jnp.conj(tau), fl)
    nfl = float(model.tadpole(fl).real)
    residual = float(jnp.linalg.norm(DW))
    if residual < 2:  # only show those with reasonably small residuals
        print(f"  {str(r['flux'].astype(int).tolist()).replace(" ",""):>12s}  {residual:>22.6e}  {nfl:.0f}")
# Define the root-finding problem for scipy.optimize.root
# Pack (Re(z1), Im(z1), Re(z2), Im(z2), Re(tau), Im(tau)) into a real 6-vector x

h12 = model.h12  # number of complex structure moduli

def pack(z, tau):
    """Complex moduli + tau -> real vector."""
    z_np = np.array(z)
    return np.concatenate([z_np.real, z_np.imag, [tau.real, tau.imag]])

def unpack(x):
    """Real vector -> (z_jax, tau_jax)."""
    z_re  = jnp.array(x[:h12])
    z_im  = jnp.array(x[h12:2*h12])
    z     = z_re + 1j * z_im
    tau   = x[2*h12] + 1j * x[2*h12 + 1]
    return z, tau

def residual_fn(x, flux):
    """F-term residual Re(DW) concatenated with Im(DW), as a real vector."""
    z, tau = unpack(x)
    DW = model.DW(z, jnp.conj(z), jnp.array(tau), jnp.conj(jnp.array(tau)), flux)
    return np.concatenate([np.array(DW.real), np.array(DW.imag)])

# Quick sanity check on a known initial guess
r0   = candidates[0]
fl0  = jnp.array(r0["flux"])
x0   = pack(r0["moduli"], complex(r0["tau"]))
res0 = residual_fn(x0, fl0)
print(f"Residual vector at initial guess: {res0.round(4)}")
print(f"||residual|| = {np.linalg.norm(res0):.4e}")
# Solve DW = 0 for each candidate using scipy.optimize.root
# DW has h12+1 = 3 complex components → 6 real equations
# Variables: (Re(z1), Im(z1), Re(z2), Im(z2), Re(tau), Im(tau)) → 6 real unknowns

TOL_SCIPY = 1e-10

scipy_results = []
for r in tqdm(candidates):
    fl   = jnp.array(r["flux"])
    x0   = pack(r["moduli"], complex(r["tau"]))

    sol  = root(residual_fn, x0, args=(fl,), method="hybr", tol=TOL_SCIPY)
    z_sol, tau_sol = unpack(sol.x)
    
    flag = np.any(z_sol.imag > MODULI_BOUNDS[1]) or np.any(z_sol.imag < MODULI_BOUNDS[0])
    
    if flag:
        continue
    
    X = model._convert_complex_to_real_nondif(z_sol, tau_sol) 
    model.is_physical(X)
    
    if not sol.success:
        continue
    else:
        print(f"Root found for {r['flux'].astype(int).tolist()}: {sol.message}")

    DW_final = residual_fn(sol.x, fl)
    res_final = np.linalg.norm(DW_final)
    converged = sol.success and res_final < TOL_SCIPY * 1e3  # slightly loose
    
    z_sol, tau_sol, fl = model.map_to_fd(z_sol, tau_sol, fl)

    scipy_results.append({
        "flux":       np.array(fl),
        "moduli":     np.array(z_sol),
        "tau":        complex(tau_sol),
        "residual":   res_final,
        "converged":  converged,
        "scipy_msg":  sol.message,
    })

n_conv = sum(r["converged"] for r in scipy_results)
print(f"scipy.optimize.root: {n_conv}/{len(scipy_results)} converged (tol={TOL_SCIPY:.0e})")
# Filter converged solutions and check whether they lie inside the sampler's moduli patch.
# Patch conditions: Im(z_i) ∈ [moduli_lower, moduli_upper], s ∈ [s_lower, s_upper], c₀ ∈ [axion_lower, axion_upper]

def in_patch(z, tau):
    im_z = np.imag(z)
    s    = np.imag(tau)
    c0   = np.real(tau)
    return (
        np.all(im_z >= sampler.moduli_lower) and
        np.all(im_z <= sampler.moduli_upper) and
        s >= sampler.s_lower and s <= sampler.s_upper and
        c0 >= sampler.axion_lower and c0 <= sampler.axion_upper
    )

print(f"{'Flux':>45s}  {'|DW|':>12s}  {'in patch':>10s}  {'N_flux':>8s}")
print("-" * 90)
n_in_patch = 0
vacua = []
seen  = set()
for r in scipy_results:
    if not r["converged"]:
        continue
    key = tuple(int(x) for x in r["flux"])
    if key in seen:
        continue
    seen.add(key)
    patch = in_patch(r["moduli"], r["tau"])
    if patch:
        n_in_patch += 1
        vacua.append(r)
    fl_str = str(r["flux"].astype(int).tolist())
    nfl    = float(model.tadpole(jnp.array(r["flux"])).real)
    print(f"{fl_str:>45s}  {r['residual']:>12.2e}  {'yes' if patch else 'no':>10s}  {nfl:>8.0f}")

print(f"\nConverged and in-patch: {n_in_patch} unique vacua")
# Print the refined vacua in detail
if vacua:
    print(f"{'='*80}")
    print(f"Found {len(vacua)} refined SUSY vacua:")
    print(f"{'='*80}")
    for i, r in enumerate(vacua):
        z   = jnp.array(r["moduli"], dtype=complex)
        tau = jnp.array(r["tau"], dtype=complex)
        fl  = jnp.array(r["flux"])
        DW  = model.DW(z, jnp.conj(z), tau, jnp.conj(tau), fl)
        nfl = float(model.tadpole(fl).real)

        print(f"\nVacuum {i+1}:")
        print(f"  flux    = {r['flux'].astype(int).tolist()}")
        print(f"  moduli  = {np.array(r['moduli'])}")
        print(f"  tau     = {r['tau']:.8f}")
        print(f"  N_flux  = {nfl:.0f}")
        print(f"  |DW|    = {r['residual']:.2e}")
        print(f"  DW_vec  = {np.array(DW).round(12)}")
else:
    print("No converged in-patch vacua found in this run. Try increasing n_target or n_batch.")

Comparison: scipy vs Newton#

Let us verify that scipy.optimize.root and the built-in Newton method converge to the same vacua.

# Re-run sample_bounded_fluxes with Newton refinement (default step_size=1.0)
bf3 = bounded_fluxes(model, sampler=sampler, Nmax=NMAX)

newton_vacua = bf3.sample_bounded_fluxes(
    n_target=50,
    n_batch=100_000,
    n_sample=5_000,
    n_mod=100,
    max_batches=10,
    verbose=True,
    seed=42,
    refine=True,
    newton_step_size=1,  # full Newton steps, quadratic convergence
    newton_tol=1e-10,
    newton_max_iters=200,
)
print(f"Newton refinement found {len(newton_vacua)} vacua.")
print(f"scipy.optimize.root found {len(vacua)} vacua.")
# Compare flux vectors found by both methods
scipy_flux_set  = {tuple(int(x) for x in r["flux"]) for r in vacua}
newton_flux_set = {tuple(int(x) for x in r["flux"]) for r in newton_vacua}

print(f"Flux vectors in scipy result:  {len(scipy_flux_set)}")
print(f"Flux vectors in Newton result: {len(newton_flux_set)}")
print(f"Intersection: {len(scipy_flux_set & newton_flux_set)}")
print(f"Only in scipy:  {scipy_flux_set - newton_flux_set}")
print(f"Only in Newton: {newton_flux_set - scipy_flux_set}")
# For fluxes found by both, compare the converged moduli
if scipy_flux_set & newton_flux_set:
    common_key = next(iter(scipy_flux_set & newton_flux_set))

    r_scipy  = next(r for r in vacua       if tuple(int(x) for x in r["flux"]) == common_key)
    r_newton = next(r for r in newton_vacua if tuple(int(x) for x in r["flux"]) == common_key)

    print(f"Flux: {list(common_key)}")
    print(f"  scipy  moduli = {r_scipy['moduli']}")
    print(f"  Newton moduli = {r_newton['moduli']}")
    print(f"  scipy  tau    = {r_scipy['tau']:.10f}")
    print(f"  Newton tau    = {r_newton['tau']:.10f}")
    print(f"  ||Δmoduli||   = {np.linalg.norm(r_scipy['moduli'] - r_newton['moduli']):.2e}")
    print(f"  |Δtau|        = {abs(r_scipy['tau'] - r_newton['tau']):.2e}")

Summary and consistency checks#

We now perform a final self-consistency check on each refined vacuum.

def check_vacuum(r, model, sampler, label=""):
    """Print a consistency table for one refined vacuum."""
    z   = jnp.array(r["moduli"], dtype=complex)
    tau = jnp.array(r["tau"], dtype=complex)
    fl  = jnp.array(r["flux"])

    DW   = model.DW(z, jnp.conj(z), tau, jnp.conj(tau), fl)
    nfl  = float(model.tadpole(fl).real)
    M0_v = model.ISD_matrix(z, jnp.conj(z))
    G3   = fl[:model.n_fluxes] - tau * fl[model.n_fluxes:]
    isd_res = float(jnp.linalg.norm(M0_v @ (model.periods.sigma @ G3) + G3))

    patch = in_patch(np.array(z), complex(tau))

    header = f"--- Vacuum {label} ---"
    print(header)
    print(f"  flux            = {np.array(fl).astype(int).tolist()}")
    print(f"  moduli Im(z)    = {np.imag(np.array(z))}")
    print(f"  moduli Re(z)    = {np.real(np.array(z))}")
    print(f"  tau             = {complex(tau):.8f}")
    print(f"  N_flux          = {nfl:.2f}  (should be integer in (0,{NMAX}])")
    print(f"  ||DW||          = {float(jnp.linalg.norm(DW)):.2e}  (should be < 1e-8)")
    print(f"  ||M σ G3 + G3|| = {isd_res:.2e}  (ISD residual; 0 = exact ISD)")
    print(f"  in sampler patch= {patch}")
    print()

if vacua:
    for i, r in enumerate(vacua[:3]):
        check_vacuum(r, model, sampler, label=str(i+1))
else:
    print("No vacua to check. Increase n_target or n_batch.")
# Tadpole and moduli distribution of all refined vacua
all_refined = vacua + newton_vacua  # combine both refinement methods
# Deduplicate by flux
seen_all = set()
all_refined_dedup = []
for r in all_refined:
    key = tuple(int(x) for x in r["flux"])
    if key not in seen_all:
        seen_all.add(key)
        all_refined_dedup.append(r)

print(f"Total unique refined vacua: {len(all_refined_dedup)}")

if len(all_refined_dedup) >= 2:
    tads = np.array([float(model.tadpole(jnp.array(r["flux"])).real) for r in all_refined_dedup])
    mods = np.array([r["moduli"] for r in all_refined_dedup])
    taus = np.array([r["tau"] for r in all_refined_dedup])

    fig, axes = plt.subplots(1, 3, dpi=130, figsize=(13, 3.5))

    axes[0].hist(tads, bins=range(0, NMAX + 2), edgecolor='k', lw=0.5)
    axes[0].set_xlabel(r'$N_{\rm flux}$', fontsize=12)
    axes[0].set_ylabel('Count', fontsize=12)
    axes[0].set_title('Tadpole distribution', fontsize=11)

    sc = axes[1].scatter(
        mods[:, 0].imag, mods[:, 1].imag,
        c=tads, cmap='viridis', s=20, alpha=0.8
    )
    plt.colorbar(sc, ax=axes[1], label=r'$N_{\rm flux}$')
    axes[1].set_xlabel(r'$\mathrm{Im}(z_1)$', fontsize=12)
    axes[1].set_ylabel(r'$\mathrm{Im}(z_2)$', fontsize=12)
    axes[1].set_title('Moduli distribution', fontsize=11)

    axes[2].scatter(taus.real, taus.imag, c=tads, cmap='viridis', s=20, alpha=0.8)
    axes[2].set_xlabel(r'$\mathrm{Re}(\tau)$', fontsize=12)
    axes[2].set_ylabel(r'$\mathrm{Im}(\tau) = s$', fontsize=12)
    axes[2].set_title(r'Axio-dilaton $\tau$', fontsize=11)

    plt.tight_layout()
    plt.show()
else:
    print("Not enough vacua to plot — try larger n_target.")

Enumeration vs stochastic sampling#

This section compares two complementary approaches to populating flux ensembles inside the eigenvalue-bounded box:

  • Systematic enumeration (bounded_fluxes.enumerate_fluxes): iterates every integer \(h\) in the bounding box, ISD-completes each, and applies tadpole + eigenvalue filters. Provably complete but scales as \(\sim N_{\max}^3\), so practical only for \(N_{\max} \lesssim 30\).

  • Stochastic Gaussian-prior sampling (bounded_fluxes.sample_bounded_fluxes): draws \(h \sim \mathcal{N}(0, \sigma^2 M)\) from the prior tuned to the ISD tadpole, then runs the same ISD/filter pipeline. Sublinear in \(N_{\max}\), gives up completeness in exchange. The Gaussian prior outperforms a uniform-box prior by ~50× yield (see NB07 §5).

Flux utility helpers#

The class provides several helpers to decompose flux vectors into their sub-components.

# Split flux = [f | h] into f and h (each of length n_fluxes = 2*(h12+1))
f, h = bf.get_fh(flux_0)
print("f =", f)
print("h =", h)

# Further split f = [f1 | f2] and h = [h1 | h2] (each of length dimension_H3 = h12+1)
h1, h2 = bf.get_flux_split(h)
f1, f2 = bf.get_flux_split(f)
print("h1 =", h1, "  h2 =", h2)
print("f1 =", f1, "  f2 =", f2)

# Or all at once:
h1, h2, f1, f2 = bf.get_subvector(flux_0)

# D3-tadpole charge
print(f"N_flux = {bf.get_nflux(flux_0):.4f}")

Full enumeration algorithm#

enumerate_fluxes() is the main entry point that combines all steps above. It:

  1. Samples n_sample moduli points from the sampler and computes the global bounding box. If compute_eigenvalue_bounds() was called beforehand, this step is skipped.

  2. Enumerates all integer \(h\) candidates inside the box.

  3. For each \(h\), computes an ISD-projected \(f\) at up to n_isd_per_h moduli points.

  4. Retains flux configurations that satisfy the D3-tadpole constraint and all local eigenvalue bounds.

New features (since v2):

  • compute_eigenvalue_bounds(n_sample) — pre-compute and cache eigenvalue bounds.

  • moduli_regions — stratified ISD starting points via Cartesian product across moduli dimensions.

  • use_linearised_shifts=True — iterative ISD refinement with flag-based early stopping (requires FluxVacuaFinder).

  • constraints — user-supplied constraint function (moduli, tau, flux) bool.

  • chunk_size — override the default streaming chunk size.

  • KeyboardInterrupt — gracefully returns partial results found so far.

This requires a sampler to have been passed at initialisation.

# Re-initialise with a tighter tadpole to keep the enumeration fast
bf_enum = bounded_fluxes(model, sampler=sampler, Nmax=4)

# Pre-compute eigenvalue bounds (reused inside enumerate_fluxes)
bf_enum.compute_eigenvalue_bounds(n_sample=5_000)

valid_fluxes = bf_enum.enumerate_fluxes(
                            n_sample=100,
                            n_isd_per_h=10,
                            verbose=True,
                            confirm_streaming=False,
                        )

# Package as dicts (flux only) for the comparison / tadpole cells below.
enum_results = [{"flux": np.asarray(fv)} for fv in valid_fluxes]
if len(valid_fluxes) > 0:
    flux_mat = np.stack(valid_fluxes)   # shape (N_valid, 2*n_fluxes)

    # D3-tadpole for each valid flux
    tadpoles = np.array([
        bf_enum.get_nflux(jnp.array(fv)) for fv in valid_fluxes
    ])

    print(f"Valid flux configurations found: {len(valid_fluxes)}")
    print(f"N_flux range : [{tadpoles.min():.0f}, {tadpoles.max():.0f}]")
    print(f"N_flux mean  : {tadpoles.mean():.2f}")
    print()
    print("First three valid flux vectors [f | h]:")
    for fv in valid_fluxes[:3]:
        print(" ", fv.astype(int))
else:
    print("No valid flux vectors found — try increasing Nmax or the moduli sample size.")
import matplotlib.pyplot as plt

if len(valid_fluxes) > 0:
    fig, axes = plt.subplots(1, 2, dpi=150, figsize=(8, 3))

    axes[0].hist(tadpoles, bins=range(0, int(tadpoles.max()) + 2), edgecolor='k', linewidth=0.5)
    axes[0].set_xlabel(r"$N_{\rm flux}$", fontsize=12)
    axes[0].set_ylabel("Count", fontsize=12)
    axes[0].set_title("D3-tadpole distribution", fontsize=11)

    # h1 vs h2 sub-vector norms
    h1norms = np.array([
        bf_enum.compute_norm(bf_enum.get_flux_split(bf_enum.get_fh(jnp.array(fv))[1])[0])
        for fv in valid_fluxes
    ])
    h2norms = np.array([
        bf_enum.compute_norm(bf_enum.get_flux_split(bf_enum.get_fh(jnp.array(fv))[1])[1])
        for fv in valid_fluxes
    ])

    axes[1].scatter(h1norms, h2norms, s=15, alpha=0.7)
    axes[1].set_xlabel(r"$\|h_1\|^2$", fontsize=12)
    axes[1].set_ylabel(r"$\|h_2\|^2$", fontsize=12)
    axes[1].set_title(r"NSNS sub-vector norms", fontsize=11)
    h1b, h2b, _ = bf_enum.get_h_box()
    axes[1].axvline(h1b**2, color='r', linestyle='--', linewidth=0.8, label=r"$h_1$ box")
    axes[1].axhline(h2b**2, color='b', linestyle='--', linewidth=0.8, label=r"$h_2$ box")
    axes[1].legend(fontsize=9)

    plt.tight_layout()
    plt.show()

Stochastic alternative#

sample_bounded_fluxes(n_target, n_batch, ...) Monte-Carlos the Gaussian prior described above, ISD-completes, and applies the filters until n_target vacua are found or max_batches is exhausted. Below we run it at \(N_{\max}=20\) — where systematic enumeration starts to slow down — and compare yields.

bf_stoch = bounded_fluxes(model, sampler=sampler, Nmax=20)
bf_stoch.compute_eigenvalue_bounds(n_sample=10_000)

stoch_results = bf_stoch.sample_bounded_fluxes(
    n_target=1_000,
    n_batch=50_000,
    n_sample=1_000,
    n_mod=20,
    max_batches=50,
    refine=True,
    newton_step_size=1.,
    verbose=True,
    seed=42,
    return_moduli=True,
    moduli_regions=[(2., 2.5), (2.5, 3.)],
)
print(f"\nStochastic search found {len(stoch_results)} unique valid flux candidates.")
# Each result is a dict with "flux", "moduli", "tau"
if stoch_results:
    r = stoch_results[0]
    print("flux   =", r["flux"].astype(int))
    print("moduli =", r["moduli"])
    print("tau    =", r["tau"])
    z = r["moduli"]
    t = r["tau"]
    flux = r["flux"].astype(int)
    DW = model.DW(z,jnp.conj(z),t,jnp.conj(t),flux)
    print("DW    =", DW)

Comparison#

We check how many of the stochastic results also appear in the exhaustive enumeration.

# Build set of flux tuples from enumeration
enum_set = set()
for r in enum_results:
    enum_set.add(tuple(int(x) for x in r["flux"]))

# Check overlap
n_overlap = 0
for r in stoch_results:
    key = tuple(int(x) for x in r["flux"])
    if key in enum_set:
        n_overlap += 1

print(f"Enumeration:  {len(enum_results)} flux candidates")
print(f"Stochastic:   {len(stoch_results)} flux candidates")
print(f"Overlap:      {n_overlap} / {len(stoch_results)} stochastic results also found by enumeration")
if len(enum_results) > 0:
    print(f"Coverage:     {n_overlap}/{len(enum_results)} = {100*n_overlap/len(enum_results):.1f}% of enumerated vacua recovered")

Tadpole distributions#

def compute_tadpoles(bf_obj, results):
    return np.array([bf_obj.get_nflux(jnp.array(r["flux"])) for r in results])

if len(enum_results) > 0 and len(stoch_results) > 0:
    tad_enum  = compute_tadpoles(bf, enum_results)
    tad_stoch = compute_tadpoles(bf_stoch, stoch_results)

    fig, axes = plt.subplots(1, 3, dpi=150, figsize=(13, 3.5))

    bins = range(0, int(max(tad_enum.max(), tad_stoch.max())) + 2)

    axes[0].hist(tad_enum, bins=bins, edgecolor='k', linewidth=0.5, alpha=0.7, label='Enumeration')
    axes[0].hist(tad_stoch, bins=bins, edgecolor='k', linewidth=0.5, alpha=0.5, label='Stochastic')
    axes[0].set_xlabel(r"$N_{\rm flux}$", fontsize=12)
    axes[0].set_ylabel("Count", fontsize=12)
    axes[0].set_title("D3-tadpole distribution", fontsize=11)
    axes[0].legend(fontsize=9)

    # h-norm distributions
    def h_norms(bf_obj, results):
        return np.array([bf_obj.compute_norm(bf_obj.get_fh(jnp.array(r["flux"]))[1]) for r in results])

    hn_enum  = h_norms(bf, enum_results)
    hn_stoch = h_norms(bf_stoch, stoch_results)

    axes[1].hist(hn_enum, bins=20, edgecolor='k', linewidth=0.5, alpha=0.7, label='Enumeration')
    axes[1].hist(hn_stoch, bins=20, edgecolor='k', linewidth=0.5, alpha=0.5, label='Stochastic')
    axes[1].set_xlabel(r"$\|h\|^2$", fontsize=12)
    axes[1].set_ylabel("Count", fontsize=12)
    axes[1].set_title(r"NSNS-flux norm distribution", fontsize=11)
    axes[1].legend(fontsize=9)

    # Moduli scatter: Im(z_1) vs Im(z_2) at which each stochastic flux was found valid
    mod_arr = np.array([r["moduli"] for r in stoch_results])
    sc = axes[2].scatter(mod_arr[:, 0].imag, mod_arr[:, 1].imag,
                         c=tad_stoch, cmap="viridis", s=15, alpha=0.7)
    plt.colorbar(sc, ax=axes[2], label=r"$N_{\rm flux}$")
    axes[2].set_xlabel(r"$\mathrm{Im}(z_1)$", fontsize=12)
    axes[2].set_ylabel(r"$\mathrm{Im}(z_2)$", fontsize=12)
    axes[2].set_title("Moduli of stochastic vacua", fontsize=11)

    plt.tight_layout()
    plt.show()
else:
    print("Not enough data to plot.")

Scaling to large \(N_{\max}\)#

At \(N_{\max} = 200\), the bounding box contains far too many \(h\)-vectors for exhaustive enumeration. But sample_bounded_fluxes handles this without difficulty.

bf_large = bounded_fluxes(model, sampler=sampler, Nmax=200)
bf_large.compute_eigenvalue_bounds(n_sample=10_000)

dim = bf_large.dimension_H3
h1_box, h2_box, h_box = bf_large._h1_box, bf_large._h2_box, bf_large._h_box
h1_max = int(np.ceil(h1_box))
h2_max = int(np.ceil(h2_box))
n_box = (2 * h1_max + 1) ** dim * (2 * h2_max + 1) ** dim

print(f"Nmax = {bf_large.Nmax}")
print(f"Bounding box: h1_box={h1_box:.2f}, h2_box={h2_box:.2f}, h_box={h_box:.2f}")
print(f"Unfiltered box size: {n_box:,} h-candidates")
print(f"  → Full enumeration is {'feasible' if n_box < 1_000_000 else 'INFEASIBLE'}")
%%time
large_results = bf_large.sample_bounded_fluxes(
    n_target=500,
    n_batch=100_000,
    n_sample=300,
    n_mod=15,
    max_batches=50,
    verbose=True,
    seed=123,
    return_moduli=True,
)
print(f"\nFound {len(large_results)} unique flux candidates at Nmax={bf_large.Nmax}.")
if len(large_results) > 0:
    tad_large = compute_tadpoles(bf_large, large_results)
    mod_large = np.array([r["moduli"] for r in large_results])

    fig, axes = plt.subplots(1, 2, dpi=150, figsize=(9, 3.5))

    axes[0].hist(tad_large, bins=30, edgecolor='k', linewidth=0.5)
    axes[0].set_xlabel(r"$N_{\rm flux}$", fontsize=12)
    axes[0].set_ylabel("Count", fontsize=12)
    axes[0].set_title(f"Tadpole distribution ($N_{{\\max}} = {bf_large.Nmax}$)", fontsize=11)

    sc = axes[1].scatter(mod_large[:, 0].imag, mod_large[:, 1].imag,
                         c=tad_large, cmap="viridis", s=15, alpha=0.7)
    plt.colorbar(sc, ax=axes[1], label=r"$N_{\rm flux}$")
    axes[1].set_xlabel(r"$\mathrm{Im}(z_1)$", fontsize=12)
    axes[1].set_ylabel(r"$\mathrm{Im}(z_2)$", fontsize=12)
    axes[1].set_title("Moduli locations", fontsize=11)

    plt.tight_layout()
    plt.show()

    print(f"N_flux range: [{tad_large.min():.0f}, {tad_large.max():.0f}]")
    print(f"N_flux mean:  {tad_large.mean():.1f}")
else:
    print("No flux candidates found.")

Newton refinement#

With refine=True, each candidate flux is Newton-refined to solve \(D_I W = 0\) exactly. The result is filtered by convergence and patch membership, then deduplicated.

Why refine=True may yield 0 converged vacua: The ISD completion uses n_mod randomly sampled moduli points as Newton starting guesses for each integer-rounded flux vector. The actual SUSY vacuum for a given flux typically lies at a moduli location different from these n_mod sampled points. With small n_mod (e.g. n_mod=15), coverage is too sparse and Newton diverges for essentially all candidates. Practical guidance:

  • Use n_mod 100 (ideally 300–500) to have a reasonable chance that at least one starting point is close to the vacuum.

  • Keep newton_step_size=1.0 (the default) — this gives quadratic convergence when the starting point is close to a root. Reducing the step size does not enlarge the basin of attraction and only slows convergence.

  • For large-scale searches, the recommended workflow is to run refine=False first to collect a pool of candidate fluxes, then refine them with the doubly-vmapped optimiser from NB07, which uses a much larger and better-distributed set of moduli starting points.

The example below illustrates the 0-convergence case with n_mod=15; the output is the expected behaviour, not a bug.

bf_refine = bounded_fluxes(model, sampler=sampler, Nmax=20)
bf_refine.compute_eigenvalue_bounds(n_sample=10_000)

refined = bf_refine.sample_bounded_fluxes(
    n_target=1000,
    n_batch=500000,
    n_sample=300,
    n_mod=15,
    max_batches=50,
    verbose=True,
    seed=42,
    refine=True,
    newton_tol=1e-10,
    newton_max_iters=100,
)
if len(refined) > 0:
    print(f"Found {len(refined)} refined SUSY vacua.\n")
    print(f"{'Flux':<45} {'|residual|':>12} {'Im(tau)':>10}")
    print("-" * 70)
    for r in refined[:10]:
        flux_str = str(r['flux'].astype(int).tolist())
        print(f"{flux_str:<45} {r['residual']:12.2e} {r['tau'].imag:10.4f}")
    if len(refined) > 10:
        print(f"  ... ({len(refined) - 10} more)")
else:
    print("No refined vacua found — try increasing n_target or loosening newton_tol.")

Case study: recovering Dataset B#

This section reproduces flux-vacuum Dataset B from arXiv:2501.03984 (Deep observations of the Type IIB flux landscape) end-to-end using bounded_fluxes. The pipeline:

  1. Load the model with the paper’s a_matrix calibration.

  2. Load the stored SUSY vacua and decode VEVs + fluxes.

  3. Compute the eigenvalue-based bounding box at the full-dataset \(N_{\max}\).

  4. Test scan at small \(N_{\max}\) + direct Newton verification of the stored solutions.

  5. Pipeline validation: recover stored vacua from \(h\)-flux alone via ISD completion + Newton refinement.

  6. (Optional) Full Dataset B scan with summary statistics.

Data source. Dataset B is published on the HuggingFace vacua_vault repo under the label "datasetB" and loaded below via stringforge. The load is safeguarded: if stringforge or the dataset is unavailable, the case study is skipped cleanly (a local dataset_B.p next to the notebook is used as a fallback if present).

Override the model’s a_matrix to match the paper’s calibration (arXiv:2501.03984, Table 2):

# Dataset B (arXiv:2501.03984) is published on the HuggingFace `vacua_vault`
# repo under the label "datasetB". We load it via stringforge if available,
# fall back to a local ./dataset_B.p, and skip the case study cleanly otherwise.
RUN_DATASET_B = True
z1 = z2 = tau_B = f_B = h_B = flux_B = Nflux_B = None

# Symplectic form for the tadpole:  N_flux = f^T sigma h,  sigma = [[0, I3], [-I3, 0]].
sigma_np = np.array([[0,0,0,1,0,0],[0,0,0,0,1,0],[0,0,0,0,0,1],
                     [-1,0,0,0,0,0],[0,-1,0,0,0,0],[0,0,-1,0,0,0]], float)

def _decode_dataset_B(arr):
    """Decode the raw [z1_re, z1_im, z2_re, z2_im, tau_re, tau_im, f(6), h(6)] layout."""
    arr = np.asarray(arr, dtype=float)
    z1 = arr[:, 0] + 1j * arr[:, 1]
    z2 = arr[:, 2] + 1j * arr[:, 3]
    tau_B = arr[:, 4] + 1j * arr[:, 5]
    flux_B = arr[:, 6:18]
    return z1, z2, tau_B, flux_B[:, :6], flux_B[:, 6:12], flux_B

try:
    from stringforge import LCSDatabase
    _vac = LCSDatabase(dataset="tdf").fetch_vacua_from_hub(model=model, label="datasetB")
    if len(_vac) == 0:
        raise FileNotFoundError("no rows labelled 'datasetB' in the vacua_vault repo")
    flux_B = np.array([list(fl) for fl in _vac["flux"]], dtype=float)
    _zall = np.array([[r + 1j * i for r, i in zip(mr, mi)]
                      for mr, mi in zip(_vac["moduli_re"], _vac["moduli_im"])])
    z1, z2 = _zall[:, 0], _zall[:, 1]
    tau_B = (_vac["tau_re"] + 1j * _vac["tau_im"]).to_numpy()
    f_B, h_B = flux_B[:, :6], flux_B[:, 6:12]
    print(f"Loaded {len(flux_B)} Dataset B vacua from HuggingFace (vacua_vault, label='datasetB').")
except Exception as _e:
    import os
    if os.path.exists("./dataset_B.p"):
        z1, z2, tau_B, f_B, h_B, flux_B = _decode_dataset_B(jvc.util.load_zipped_pickle("./dataset_B.p"))
        print(f"Loaded {len(flux_B)} Dataset B vacua from local ./dataset_B.p.")
    else:
        print(f"Dataset B unavailable ({type(_e).__name__}: {_e}).\n"
              "  -> Skipping the Dataset B case study. It runs automatically once the\n"
              "     'datasetB' vacua are on the HuggingFace vacua_vault repo, or place\n"
              "     a local dataset_B.p next to this notebook.")
        RUN_DATASET_B = False

Load Dataset B#

if RUN_DATASET_B:
    # Signed tadpole  N_flux = f^T sigma h  (positive for physical fluxes; no abs needed).
    Nflux_B = np.einsum('ki,ij,kj->k', f_B, sigma_np, h_B)
    # Paper a_matrix calibration for the degree-18 model.
    model.lcs_tree.a_matrix = jnp.array([[4.5, 1.5], [1.5, 0.]])
    print(f"Dataset B: {len(flux_B)} solutions")
    print(f"N_flux range:   {int(Nflux_B.min())} - {int(Nflux_B.max())}")
    print(f"Im(tau) range:  {np.imag(tau_B).min():.3f} - {np.imag(tau_B).max():.3f}")
    print(f"Im(z1) range:   {np.imag(z1).min():.3f} - {np.imag(z1).max():.3f}")
    print(f"Im(z2) range:   {np.imag(z2).min():.3f} - {np.imag(z2).max():.3f}")

Build a data_sampler and bounded_fluxes instance with the Dataset B region \(\operatorname{Im}(z) \in (2, 5)\), \(s \in (\sqrt 3 / 2, 10)\):

if RUN_DATASET_B:
    import math
    from jaxvacua.flux_bounding import bounded_fluxes

    # Full Dataset B parameters (arXiv:2501.03984, Table 2)
    # Used for bounding box analysis and the full scan.
    _dil_B = (math.sqrt(3.) / 2., 10.)
    sampler_B = jvc.data_sampler(
        model,
        moduli_bounds=(2., 5.),
        dilaton_bounds=_dil_B,
        axion_bounds=(-0.5, 0.5),
        seed=42,
    )
    bf_B = bounded_fluxes(model, sampler=sampler_B, Nmax=10)

    print(f"n_fluxes     = {bf_B.n_fluxes}  (= 2*(h12+1))")
    print(f"dimension_H3 = {bf_B.dimension_H3}  (= h12+1)")

Bounding box for full Dataset B#

We sample moduli points from the Dataset B region and compute the eigenvalue-based bounding box (§3.1 of arXiv:2501.03984).

Bound

Formula

Quantity

h1_box

\(\sqrt{N_{\max}/(s_{\min}\,\tilde\mu_{\min})}\)

max \(|h_1|\)

h2_box

\(\sqrt{N_{\max}/(s_{\min}\,\mu_{\min})}\)

max \(|h_2|\)

h_box

\(\sqrt{2\lambda_{\max} N_{\max}/s_{\min}}\)

max \(|h|\) (global)

The full Cartesian-product enumeration \((2h_{1,\max}+1)^3\times(2h_{2,\max}+1)^3\) is infeasible for Dataset B due to the small \(\tilde\mu_{\min}\) eigenvalue near \(\mathrm{Im}(z)\sim 2\).

if RUN_DATASET_B:
    # Pre-compute eigenvalue bounds once (reused by enumerate_fluxes)
    h1_box_B, h2_box_B, h_box_B = bf_B.compute_eigenvalue_bounds(n_sample=1_000)

    h1_max_B = int(np.ceil(h1_box_B))
    h2_max_B = int(np.ceil(h2_box_B))
    dim      = bf_B.dimension_H3
    n_box_B  = (2*h1_max_B + 1)**dim * (2*h2_max_B + 1)**dim

    print()
    print(f"  h1_box = {h1_box_B:.2f}  → h1_max = {h1_max_B}")
    print(f"  h2_box = {h2_box_B:.2f}  → h2_max = {h2_max_B}")
    print(f"  h_box  = {h_box_B:.2f}")
    print()
    print(f"  Cartesian product: ({2*h1_max_B+1})³ × ({2*h2_max_B+1})³ ≈ {n_box_B:,}")
    print(f"  → Streaming mode (h₂-outer) activates automatically in enumerate_fluxes")

Test scan: N_max = 4, s ≥ 2#

We restrict to N_flux ≤ 4 and Im(τ) ≥ 2 to get a small, feasible bounding box while keeping a non-trivial ground truth.

From dataset_B.p (precomputed):

  • N_flux ≤ 4, s ≥ 2, Im(z) ∈ [2,5] → 42 expected solutions (exact subset of Dataset B).

These are the solutions our scan must find. A 100 % match validates:

  • the linearised_shifts_H pipeline for moduli-adaptive ISD completion,

  • the Newton refinement convergence,

  • and the flux conventions against the paper.

if RUN_DATASET_B:
    @jit
    def constraints_model(moduli,tau,flux):
    
        #return ((jnp.all(jnp.imag(moduli)<3.,axis=0))&(jnp.all(jnp.imag(moduli)>2.,axis=0)))
        return ((jnp.all(jnp.imag(moduli)>=1.,axis=0)))
if RUN_DATASET_B:
    # Run enumerate_fluxes with ISD refinement via linearised_shifts_H.
    #
    # Key parameters:
    #   use_linearised_shifts=True  — moves moduli to be self-consistent with each h
    #                                 (Algorithm 1 of arXiv:2501.03984)
    #   n_isd_iters=5               — 5 linearised_shifts_H iterations per starting point
    #   moduli_regions              — Cartesian product across h12 moduli dimensions
    #
    # Eigenvalue bounds were pre-computed above → Step 1 is skipped.

    enum_results_test = bf_test.enumerate_fluxes(
        n_sample=500,
        n_isd_per_h=30,
        max_h_candidates=5_000_000,
        refine=True,
        return_moduli=True,
        newton_tol=1e-10,
        newton_max_iters=10,
        newton_step_size=1.0,
        confirm_streaming=False,
        #moduli_regions=[(2., 3.), (3., 5.)],
        moduli_regions=[(2., 3.)],
        use_linearised_shifts=True,
        chunk_size=20_000,
        n_isd_iters=8,
        n_moduli_batches=30,
        constraints=constraints_model,
        verbose=True,
    )
    print(f"\nFound {len(enum_results_test)} converged SUSY vacua in test region.")
if RUN_DATASET_B:
    # Display test scan results
    if len(enum_results_test) == 0:
        print("No converged vacua found. Check that the model is FluxVacuaFinder")
        print("and that a_matrix is set correctly.")
    else:
        print(f"{'#':>3}  {'Im(z1)':>8} {'Im(z2)':>8} {'Im(τ)':>8}  {'|W₀|':>12}  {'N_flux':>6}  flux [f|h]")
        print("-" * 100)
        for i, vac in enumerate(enum_results_test):
            fv = jnp.array(vac["flux"], dtype=float)
            mv = vac["moduli"]
            tv = vac["tau"]
            W0    = model.superpotential(mv, tv, fv)
            Nflux = abs(float(model.tadpole(fv).real))
            print(
                f"{i:>3}  "
                f"{float(jnp.imag(mv[0])):>8.4f} "
                f"{float(jnp.imag(mv[1])):>8.4f} "
                f"{float(jnp.imag(tv)):>8.4f}  "
                f"{abs(W0):>12.4e}  "
                f"{Nflux:>6.0f}  "
                f"{np.round(np.array(fv)).astype(int)}"
            )

Direct verification: Newton refinement of stored Dataset B solutions#

Newton-refine the first 500 stored solutions starting from the paper’s VEVs. All should converge with residual ≪ 10⁻¹⁰, confirming that:

  • the a_matrix and prepotential match the paper,

  • the symplectic conventions are consistent.

if RUN_DATASET_B:
    n_verify = min(500, len(datsB))

    moduli_0 = jnp.array(np.column_stack([z1[:n_verify], z2[:n_verify]]), dtype=complex)
    tau_0    = jnp.array(tau_B[:n_verify], dtype=complex)
    flux_0   = jnp.array(flux_B[:n_verify], dtype=float)

    # bf_B is already built above (full Dataset B sampler, Nmax=10)
    moduli_ref, tau_ref, residuals = bf_B.newton_refine_batch(
        moduli_0, tau_0, flux_0,
        step_size=1.0, tol=1e-12, max_iters=10,
    )

    res_np     = np.asarray(residuals)
    n_conv     = int(np.sum(res_np < 1e-10))
    dz1_drift  = np.abs(np.asarray(moduli_ref[:, 0]) - z1[:n_verify])
    dz2_drift  = np.abs(np.asarray(moduli_ref[:, 1]) - z2[:n_verify])
    dtau_drift = np.abs(np.asarray(tau_ref) - tau_B[:n_verify])

    print(f"Verified {n_verify}/{len(datsB)} Dataset B solutions")
    print(f"  Converged (res < 1e-10): {n_conv}/{n_verify}")
    print(f"  Max residual:            {res_np.max():.2e}")
    print(f"  Mean residual:           {res_np.mean():.2e}")
    print()
    print(f"  Max |Δz1| drift:  {dz1_drift.max():.2e}")
    print(f"  Max |Δz2| drift:  {dz2_drift.max():.2e}")
    print(f"  Max |Δτ|  drift:  {dtau_drift.max():.2e}")
    print()
    if n_conv == n_verify:
        print("✓  All solutions converged — model setup confirmed.")
    else:
        print(f"⚠  {n_verify - n_conv} solutions did NOT converge.")
        print("   Check a_matrix and symplectic conventions.")
if RUN_DATASET_B:
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, 3, figsize=(15, 4))

    # Residual histogram
    axes[0].hist(np.log10(res_np + 1e-16), bins=40, color="steelblue", edgecolor="white")
    axes[0].axvline(-10, color="red", linestyle="--", label="tol = 1e-10")
    axes[0].set_xlabel(r"$\log_{10}(\sum|D_I W|)$")
    axes[0].set_ylabel("count")
    axes[0].set_title(f"Newton residuals — {n_verify} Dataset B solutions")
    axes[0].legend()

    # Im(tau) distribution — full Dataset B
    axes[1].hist(np.imag(tau_B), bins=50, color="coral", edgecolor="white")
    axes[1].axvline(2.0, color="red", linestyle="--", label="test scan: s ≥ 2")
    axes[1].set_xlabel(r"$s = \mathrm{Im}(\tau)$")
    axes[1].set_ylabel("count")
    axes[1].set_title("Full Dataset B — dilaton distribution")
    axes[1].legend()
    axes[1].set_yscale("log")

    # N_flux distribution — full Dataset B
    axes[2].hist(Nflux_B, bins=np.arange(0.5, 11.5, 1), color="mediumseagreen", edgecolor="white")
    axes[2].axvline(4.5, color="red", linestyle="--", label="test scan: N_flux ≤ 4")
    axes[2].set_xlabel(r"$N_\mathrm{flux}$")
    axes[2].set_ylabel("count")
    axes[2].set_title("Full Dataset B — tadpole distribution")
    axes[2].legend()
    axes[2].set_yscale("log")

    plt.tight_layout()
    plt.show()

    # Count expected solutions in test region
    mask_test = (Nflux_B <= Nmax_test) & (np.imag(tau_B) >= dil_test[0])
    n_expected = int(mask_test.sum())
    print(f"\nDataset B solutions in test region (N_flux≤{Nmax_test}, s≥{dil_test[0]:.0f}):")
    print(f"  Expected: {n_expected}  |  Found: {len(enum_results_test)}")

A monodromy-aware cross-check would additionally quotient by integer-axion monodromies and sign flips. That check depends on extra monodromy utilities and is therefore not part of this public tutorial.

Pipeline validation: recover solutions from h-fluxes alone#

Take a subset of Dataset B solutions with \(N_{\rm flux} \leq 5\), extract only their \(h\)-fluxes, and verify that we can recover the full solution (f-fluxes, moduli, \(\tau\)) from scratch using randomly sampled starting points.

This validates the complete pipeline:

  1. ISD completion (isd_refine_batch): given \(h\) and random moduli/tau starting points, iteratively compute self-consistent \((z, \tau, f)\) via linearised_shifts.

  2. Newton refinement (newton_refine_batch): refine the ISD candidate to solve \(D_I W = 0\) exactly.

  3. Comparison: verify the converged solution matches the stored Dataset B entry.

if RUN_DATASET_B:
    # Select Dataset B solutions with N_flux <= 5, s >= 2
    mask_pipe = (Nflux_B <= 5) & (np.imag(tau_B) >= 2.)
    idx_pipe = np.where(mask_pipe)[0][:100]
    n_pipe = len(idx_pipe)

    h_pipe = jnp.array(h_B[idx_pipe], dtype=float)          # ONLY h-fluxes — no f, no moduli, no tau
    f_pipe_ref = f_B[idx_pipe]                                # reference (for comparison only)
    z_pipe_ref = np.column_stack([z1[idx_pipe], z2[idx_pipe]])
    tau_pipe_ref = tau_B[idx_pipe]

    print(f"Selected {n_pipe} h-flux vectors from Dataset B (N_flux ≤ 5, s ≥ 2)")
    print(f"N_flux values: {sorted(set(Nflux_B[idx_pipe].astype(int)))}")
    print(f"  → We use ONLY h_B — all moduli/tau starting points are random.")
if RUN_DATASET_B:
    # Step 1: ISD completion with RANDOM starting points
    # Multiple passes with different random moduli/tau to increase coverage,
    # following the multi-pass strategy from run_hscan_34.py.

    n_pts_per_pass = 200   # random starting points per pass
    n_passes = 10         # number of passes with fresh random points
    n_iters = 10          # linearised_shifts iterations per pass

    # Build lookup set from reference f-fluxes (for evaluation only)
    f_ref_int = np.round(f_pipe_ref).astype(int)

    all_recovered = {}  # h_idx → best result

    for p in range(n_passes):
        # Fresh random starting points each pass
        moduli_rand = jnp.array(sampler_B.get_complex_moduli(n_pts_per_pass), dtype=complex)
        tau_rand = jnp.array(sampler_B.get_complex_tau(n_pts_per_pass), dtype=complex)

        mod_out, tau_out, flux_out = bf_B.isd_refine_batch(
            h_pipe, moduli_rand, tau_rand, n_iters=n_iters,
        )

        # Check which h-vectors recovered the correct f-flux
        for i in range(n_pipe):
            if i in all_recovered:
                continue
            f_ref = f_ref_int[i]
            for j in range(mod_out.shape[1]):
                f_cand = np.round(flux_out[i, j, :6].real).astype(int)
                if np.array_equal(f_cand, f_ref) or np.array_equal(f_cand, -f_ref):
                    all_recovered[i] = {
                        'idx': idx_pipe[i], 'pass': p,
                        'flux': np.round(flux_out[i, j].real).astype(float),
                        'moduli': mod_out[i, j],
                        'tau': tau_out[i, j],
                    }
                    break

        print(f"  Pass {p+1}/{n_passes}: {len(all_recovered)}/{n_pipe} h-vectors recovered "
              f"({n_pts_per_pass} random starts, {n_iters} iters)")

    print(f"\nISD completion recovered correct f-flux: {len(all_recovered)}/{n_pipe} "
          f"using only random starting points")
if RUN_DATASET_B:
    # Step 2: Newton-refine the recovered candidates and compare to Dataset B
    recovered = list(all_recovered.values())

    if recovered:
        flux_arr_pipe = jnp.array([r['flux'] for r in recovered])
        mod_arr_pipe = jnp.array([r['moduli'] for r in recovered], dtype=complex)
        tau_arr_pipe = jnp.array([r['tau'] for r in recovered], dtype=complex)

        mod_newton, tau_newton, res_newton = bf_B.newton_refine_batch(
            mod_arr_pipe, tau_arr_pipe, flux_arr_pipe,
            step_size=1.0, tol=1e-12, max_iters=200,
        )

        res_np = np.asarray(res_newton)
        n_conv = int(np.sum(res_np < 1e-10))

        # Compare to Dataset B reference
        print(f"Newton converged: {n_conv}/{len(recovered)}")
        print(f"Residuals: max={res_np.max():.2e}, min={res_np.min():.2e}\n")

        print(f"{'#':>3} {'pass':>4} {'N_flux':>6} {'|ΣD_IW|':>10} {'|Δz|_max':>10} {'|Δτ|':>10}  match?")
        print("-" * 66)
        n_exact = 0
        for k, r in enumerate(recovered):
            hi = list(all_recovered.keys())[k]
            z_ref_k = z_pipe_ref[hi]
            t_ref_k = tau_pipe_ref[hi]
            Nfl = float(Nflux_B[r['idx']])

            dz = float(np.max(np.abs(np.asarray(mod_newton[k]) - z_ref_k)))
            dt = float(np.abs(np.asarray(tau_newton[k]) - t_ref_k))
            close = dz < 0.01 and dt < 0.01 and res_np[k] < 1e-10
            if close:
                n_exact += 1
            print(f"{k:>3} {r['pass']:>4} {Nfl:>6.0f} {res_np[k]:>10.2e} {dz:>10.6f} {dt:>10.6f}  "
                  f"{'YES' if close else 'no'}")

        print(f"\n{'='*66}")
        print(f"Pipeline validation: {n_exact}/{n_pipe} Dataset B solutions recovered")
        print(f"  from h-fluxes alone with random starting points")
        print(f"  ({n_passes} passes × {n_pts_per_pass} random moduli/tau per pass)")
        if n_exact == len(recovered):
            print(f"  All {n_exact} recovered solutions match Dataset B exactly.")
    else:
        print("No solutions recovered — try increasing n_passes or n_pts_per_pass.")

Full Dataset B scan#

Parameters: Im(zⁱ) ∈ [2,5], s ∈ [√3/2, 10], N_max=10, use_linearised_shifts=True

Step

Status

Notes

Model + a_matrix

Verified via direct Newton refinement

Bounding box

compute_bounding_box

h₂-outer streaming

_iter_h_chunks_streaming + _stream_fixed_chunks

linearised_shifts_H ISD completion

isd_refine_batch, use_linearised_shifts=True

Newton refinement

newton_refine_batch

Monodromy reduction

Low priority — small effect (~1% reduction)

Expected: ~12,196 SUSY vacua; ~4.64 billion h-flux candidates before filtering. The streaming mode keeps peak memory at ~2.4 MB (one 100K-row h-chunk at a time).

Runtime: hours to days on CPU; ~1 hour on a GPU cluster with h₂-batch sharding. Set confirm_streaming=False for non-interactive execution.

if RUN_DATASET_B:
    # Full Dataset B scan — set RUN_FULL_SCAN = True to execute
    # Expected runtime: several hours (CPU) or ~1 h (GPU cluster with h2-sharding)
    RUN_FULL_SCAN = False

    if RUN_FULL_SCAN:
        import math as _math
        from jaxvacua.flux_bounding import bounded_fluxes as _bf

        sampler_full = jvc.data_sampler(
            model,
            moduli_bounds=(2., 5.),
            dilaton_bounds=(_math.sqrt(3.) / 2., 10.),
            axion_bounds=(-0.5, 0.5),
            seed=42,
        )
        bf_full = _bf(model, sampler=sampler_full, Nmax=10)

        full_results = bf_full.enumerate_fluxes(
            n_sample=1000,
            n_isd_per_h=10,
            max_h_candidates=10_000_000,   # streaming triggered immediately for Dataset B
            refine=True,
            return_moduli=True,
            newton_tol=1e-10,
            newton_max_iters=100,
            newton_step_size=1.0,
            confirm_streaming=False,
            moduli_regions=[(2., 3.), (3., 4.), (4., 5.)],
            use_linearised_shifts=True,
            n_isd_iters=5,
            verbose=True,
        )
        print(f"\nFull scan: found {len(full_results)} vacua  (expected: 12,196)")
    else:
        print("RUN_FULL_SCAN = False — skipping full Dataset B scan.")
        print("Set RUN_FULL_SCAN = True to execute (long runtime).")
if RUN_DATASET_B:
    # Visual comparison: VEV distributions of test-scan results vs Dataset B subset
    import matplotlib.pyplot as plt

    mask_test = (Nflux_B <= Nmax_test) & (np.imag(tau_B) >= dil_test[0])
    z1_sub  = z1[mask_test]
    z2_sub  = z2[mask_test]
    tau_sub = tau_B[mask_test]

    fig, axes = plt.subplots(1, 3, figsize=(15, 4))

    # Im(z1) comparison
    axes[0].hist(np.imag(z1_sub),  bins=20, alpha=0.6, label="Dataset B subset", color="steelblue", edgecolor="white")
    if enum_results_test:
        imz1_found = [float(jnp.imag(v["moduli"][0])) for v in enum_results_test]
        axes[0].hist(imz1_found, bins=20, alpha=0.7, label="enumerate_fluxes", color="coral", edgecolor="white")
    axes[0].set_xlabel(r"$\mathrm{Im}(z_1)$"); axes[0].set_ylabel("count")
    axes[0].set_title(r"$\mathrm{Im}(z_1)$ distribution"); axes[0].legend()

    # Im(z2) comparison
    axes[1].hist(np.imag(z2_sub),  bins=20, alpha=0.6, label="Dataset B subset", color="steelblue", edgecolor="white")
    if enum_results_test:
        imz2_found = [float(jnp.imag(v["moduli"][1])) for v in enum_results_test]
        axes[1].hist(imz2_found, bins=20, alpha=0.7, label="enumerate_fluxes", color="coral", edgecolor="white")
    axes[1].set_xlabel(r"$\mathrm{Im}(z_2)$"); axes[1].set_ylabel("count")
    axes[1].set_title(r"$\mathrm{Im}(z_2)$ distribution"); axes[1].legend()

    # Im(tau) comparison
    axes[2].hist(np.imag(tau_sub), bins=20, alpha=0.6, label="Dataset B subset", color="steelblue", edgecolor="white")
    if enum_results_test:
        imtau_found = [float(jnp.imag(v["tau"])) for v in enum_results_test]
        axes[2].hist(imtau_found, bins=20, alpha=0.7, label="enumerate_fluxes", color="coral", edgecolor="white")
    axes[2].set_xlabel(r"$s = \mathrm{Im}(\tau)$"); axes[2].set_ylabel("count")
    axes[2].set_title(r"$s$ distribution"); axes[2].legend()

    plt.suptitle(f"Test region: N_flux ≤ {Nmax_test}, s ≥ {dil_test[0]:.0f}, Im(z) ∈ {moduli_bounds_B}", y=1.02)
    plt.tight_layout()
    plt.show()
if RUN_DATASET_B:
    # Final summary table
    print("=" * 60)
    print("RECOVERY STATUS — Dataset B (arXiv:2501.03984)")
    print("=" * 60)
    print()
    print(f"  Full Dataset B:         {len(datsB):>6} solutions")
    print()
    print(f"  Test region summary")
    print(f"  ├─ Parameters:          N_flux ≤ {Nmax_test}, s ≥ {dil_test[0]:.0f}, Im(z) ∈ {moduli_bounds_B}")
    print(f"  ├─ Expected (paper):    {int(mask_test.sum()):>6}")
    print(f"  ├─ Found (scan):        {len(enum_results_test):>6}")

    if len(enum_results_test) > 0:
        n_in_B = sum(
            1 for v in enum_results_test
            if np.round(np.array(v["flux"])).astype(int).tobytes() in flux_B_set
        )
        print(f"  ├─ Matched in Dataset B:{n_in_B:>6}")
        recall = len(enum_results_test) / int(mask_test.sum()) if mask_test.sum() > 0 else 0
        precision = n_in_B / len(enum_results_test) if enum_results_test else 0
        print(f"  ├─ Recall:              {recall*100:.1f}%")
        print(f"  └─ Precision:           {precision*100:.1f}%")
    else:
        print(f"  └─ (scan not yet run)")

    print()
    print("  Full scan status:       ", end="")
    if RUN_FULL_SCAN and 'full_results' in dir():
        n_full_in_B = sum(
            1 for v in full_results
            if np.round(np.array(v["flux"])).astype(int).tobytes() in flux_B_set
        )
        print(f"{len(full_results)} found, {n_full_in_B} matched in Dataset B")
    else:
        print("NOT RUN (set RUN_FULL_SCAN=True)")

Take-aways#

  • Eigenvalue bounds give a finite search region. From the extremal eigenvalues of \(\mathcal{M}\) (ISD matrix) and \(-\mathrm{Im}(\mathcal{N})\) / \(\mathrm{Im}(\mathcal{N}^{-1})\) (gauge kinetic matrix), the algorithm derives rigorous boxes on \(\|h_1\|\), \(\|h_2\|\), and \(\|h\|\) given an \(s_{\min}\) lower bound and a tadpole bound \(N_{\max}\).

  • bounded_fluxes packages the six-step pipeline (eigenvalue bounds → NS-NS box → ISD completion → tadpole + eigenvalue filters → ISD-condition check → Newton/scipy refinement) into one class with two entry points: enumerate_fluxes (complete) and sample_bounded_fluxes (stochastic).

  • Use enumerate_fluxes for \(N_{\max} \lesssim 30\) where the box is enumerable. The tadpole pre-filter (\(s_{\min}\,h^T M^{-1} h \leq N_{\max}\)) eliminates >99% of candidates before refinement, making this fast.

  • Switch to sample_bounded_fluxes for \(N_{\max} \gtrsim 50\). It Monte-Carlos a Gaussian prior \(h \sim \mathcal{N}(0, \sigma^2 M)\) tuned to the ISD tadpole. Sublinear in \(N_{\max}\), outperforms uniform-box sampling by ~50× yield, sacrifices completeness.

  • Pre-compute eigenvalue bounds once with compute_eigenvalue_bounds(n_sample=5000) and reuse them across many enumerate_fluxes / sample_bounded_fluxes calls — much cheaper than recomputing inside each call.

  • The Dataset B case study confirms end-to-end soundness. Test scans at \(N_{\max}=4\) recover Dataset B vacua exactly (up to monodromy + sign-flip equivalences); Newton refinement of stored solutions converges to \(\sum_I|D_I W|<10^{-10}\), validating both the period data and the symplectic conventions.

  • Newton vs scipy refinement. scipy’s hybrid Powell method is generally faster per-candidate; JAX-native Newton with step_size_Newton=1.0 is preferable when vmapping over many starts (single XLA kernel, no Python overhead per step).

Further reading#