3a Block-bootstrap surrogates (B=804 bins≈11 yr, n=5000): raw r(+15d)
p=0.022 — marginally significant on undetrended series, driven by
the shared solar-cycle trend not a causal signal.
3b Partial correlation (sunspot-regressed, no filter): r(+15d) drops
from 0.079 to 0.029 (63%) — confirms solar-cycle confounding without
preprocessing circularity.
3c Spectral coherence in solar-cycle band = 0.840 (>0.776 threshold);
kNN mutual information at τ=+15d = 0.000 nats (p=1.000) — no
nonlinear dependence at the claimed lag.
3d Missing-data impact: 0% NaN at station thresholds 2/3/5; r(+15d)
unchanged — missing data is not a confound.
3e Bin-size sensitivity: dominant peak at τ≈-520d for 1/5/27-day bins;
r at +15d scales with bin size (solar-cycle leakage, not physical).
3f GK declustering removes 28.4% of events as aftershocks; r(+15d)
changes by only Δ=0.014 — aftershock clustering not a confound.
3g Per-solar-cycle analysis (cycles 21–24): r(+15d) all positive
(0.018–0.073) but dominant peak lags scattered (-65/-125/+125/-125d)
— phase-drifting solar-cycle artefact, not physical precursor.
Script fixes: vectorised block bootstrap (401×5000 → NumPy matmul),
kNN MI with sorted searchsorted (O(N log N), no inflated values),
coherence nperseg 512→2048 (resolves solar-cycle band), axhspan→axvspan.
Paper: new Methods subsections (block bootstrap, partial correlation,
nonlinear dependence); new Results subsections for each check; updated
Conclusions with 7-item robustness summary; Kraskov2004 and
GardnerKnopoff1974 added to refs.bib; Homola2023 updated to arXiv
preprint 2204.12310; eq:energy duplicate label fixed. PDF: 35 pages.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1433 lines
67 KiB
TeX
1433 lines
67 KiB
TeX
\documentclass[12pt,a4paper]{article}
|
|
|
|
%% ── Packages ────────────────────────────────────────────────────────────────
|
|
\usepackage[T1]{fontenc}
|
|
\usepackage[utf8]{inputenc}
|
|
\usepackage{lmodern}
|
|
\usepackage[margin=2.5cm]{geometry}
|
|
\usepackage{amsmath,amssymb}
|
|
\usepackage{graphicx}
|
|
\usepackage{booktabs}
|
|
\usepackage{hyperref}
|
|
\usepackage{xcolor}
|
|
\usepackage{siunitx}
|
|
\usepackage{natbib}
|
|
\usepackage{caption}
|
|
\usepackage{subcaption}
|
|
\usepackage{microtype}
|
|
\usepackage{setspace}
|
|
\usepackage{lineno}
|
|
\onehalfspacing
|
|
|
|
\hypersetup{
|
|
colorlinks=true,
|
|
linkcolor=blue!60!black,
|
|
citecolor=green!50!black,
|
|
urlcolor=blue!60!black,
|
|
}
|
|
|
|
\graphicspath{{figs/}}
|
|
|
|
%% ── Title block ─────────────────────────────────────────────────────────────
|
|
\title{\textbf{No Causal Link Between Galactic Cosmic-Ray Flux and Global
|
|
Seismicity: A Pre-Registered Replication with GPU-Accelerated Surrogate Testing
|
|
and Out-of-Sample Validation}}
|
|
|
|
\author{
|
|
J.~D.~Devine$^{1}$\\[4pt]
|
|
{\small $^{1}$ Independent researcher}\\
|
|
{\small \texttt{devine.jd@gmail.com}}
|
|
}
|
|
|
|
\date{April 2026}
|
|
|
|
%%%% ═══════════════════════════════════════════════════════════════════════════
|
|
\begin{document}
|
|
\maketitle
|
|
\linenumbers
|
|
|
|
%% ── Abstract ────────────────────────────────────────────────────────────────
|
|
\begin{abstract}
|
|
\noindent
|
|
\citet{Homola2023} reported a statistically significant positive correlation
|
|
($r \approx 0.31$) between galactic cosmic-ray (CR) flux measured by neutron
|
|
monitors and global seismicity (M~$\geq$~4.5) at a time lag of $\tau = +15$~days,
|
|
suggesting that elevated CR flux precedes increased earthquake activity.
|
|
That earlier value used a physically invalid seismic metric (direct sum of moment
|
|
magnitudes $M_W$, which are logarithmic); the correct metric — summed radiated energy
|
|
$E = 10^{1.5 M_W + 4.8}$\,J per bin \citep{Kanamori1977} — yields $r(+15\,\text{d}) = 0.081$.
|
|
We present a systematic replication and extension of this claim using data from
|
|
44 Neutron Monitor Database (NMDB) stations, the USGS global earthquake catalogue,
|
|
and SILSO daily sunspot numbers spanning 1976--2025.
|
|
|
|
Our analysis proceeds in four stages.
|
|
\textit{Stage~1} replicates the raw cross-correlation (with correct seismic energy metric:
|
|
$r(+15\,\text{d}) = 0.081$, peak $r = 0.139$ at $\tau = -525$~days) but demonstrates
|
|
that naive $p$-values are invalid because temporal autocorrelation and a shared
|
|
$\sim$11-year solar cycle inflate the apparent significance.
|
|
\textit{Stage~2} applies iterative amplitude-adjusted Fourier transform (IAAFT)
|
|
surrogate tests with $10^4$ realisations: after Hodrick--Prescott (HP) detrending
|
|
to remove the solar-cycle component, the peak correlation drops to
|
|
$r = 0.101$ and achieves formal significance ($p_\text{global} < 10^{-4}$, $>3.9\sigma$),
|
|
but $r(+15\,\text{d}) = 0.027$ --- well within the surrogate null distribution.
|
|
\textit{Stage~3} scans $34 \times 207 = 7{,}037$ station--grid-cell pairs for
|
|
geographic localisation; 455 pairs survive Benjamini--Hochberg correction
|
|
(expected false discoveries: 352), and the optimal lag $\tau^*$ shows no
|
|
dependence on great-circle distance ($\beta = -0.45$~days/1000~km, $p = 0.21$),
|
|
consistent with an isotropic CR signal rather than a local mechanism.
|
|
\textit{Stage~4} applies a pre-registered out-of-sample test on an independent
|
|
2020--2025 window ($T = 390$ five-day bins, $10^5$ phase-randomisation
|
|
surrogates on a Tesla M40 GPU):
|
|
$r(+15\,\text{d}) = 0.030$ (surrogate 95th percentile~$= 0.101$),
|
|
$p_\text{global} = 0.100$ --- consistent with noise.
|
|
Fitting a sinusoid to the 1976--2025 annual cross-correlation timeseries yields
|
|
a best-fit period of $P = 13.0$~years and a Bayes factor of $\text{BF} = 0.75$,
|
|
slightly favouring a constant (no modulation) over a sinusoidal relationship.
|
|
|
|
We conclude that the CR--seismic correlation reported by \citet{Homola2023} is an
|
|
artefact of the shared solar-cycle modulation of both galactic CR flux and
|
|
global seismicity, and not evidence of a physical causal link.
|
|
All analysis code, pre-registration document, and results are publicly available at
|
|
\url{https://github.com/pingud98/cosmicraysandearthquakes}.
|
|
\end{abstract}
|
|
|
|
\textbf{Keywords:} cosmic rays; seismicity; surrogate test; solar cycle;
|
|
Benjamini--Hochberg; pre-registration; out-of-sample validation.
|
|
|
|
\newpage
|
|
\tableofcontents
|
|
\newpage
|
|
|
|
%%%% ═══════════════════════════════════════════════════════════════════════════
|
|
\section{Introduction}
|
|
\label{sec:intro}
|
|
|
|
The hypothesis that galactic cosmic rays (CRs) influence seismic activity has
|
|
a long history in geophysics \citep{Stoupel1990,Urata2018}, motivated by
|
|
proposed mechanisms ranging from radon ionisation in fault zones to direct
|
|
nuclear interactions in crustal minerals.
|
|
\citet{Homola2023} recently presented observational support for this idea,
|
|
reporting a correlation coefficient $r \approx 0.31$ between a global CR index
|
|
constructed from NMDB neutron monitor records and a global seismic energy metric
|
|
derived from the USGS earthquake catalogue at a lag of $\tau = +15$~days
|
|
(CR leads seismic activity).
|
|
The associated naive $p$-value was reported as $p \sim 10^{-72}$ at this lag.
|
|
|
|
Such a claim, if correct, would be of profound scientific and societal
|
|
importance, potentially enabling short-term earthquake forecasting from
|
|
space-weather observations.
|
|
It therefore demands rigorous scrutiny.
|
|
Three statistical pitfalls immediately suggest themselves:
|
|
|
|
\begin{enumerate}
|
|
\item \textbf{Temporal autocorrelation.}
|
|
Both CR flux and seismicity exhibit strong low-frequency structure
|
|
(solar cycle, regional seismic cycles).
|
|
Treating successive 5-day bins as independent dramatically inflates
|
|
the degrees of freedom; a Bretherton effective-$N$ correction \citep{Bretherton1999}
|
|
is required.
|
|
|
|
\item \textbf{Shared solar-cycle trend.}
|
|
Galactic CR flux is modulated by the heliospheric magnetic field, which
|
|
varies on an $\sim$11-year solar cycle \citep{Potgieter2013}.
|
|
Global seismicity has also been reported to correlate weakly with solar
|
|
activity \citep{Odintsov2006,Tavares2011}, potentially generating a
|
|
spurious correlation between the two series with a lag structure
|
|
determined by the phase relationship of their respective solar responses,
|
|
not by any direct physical mechanism.
|
|
|
|
\item \textbf{Multiple-comparison inflation.}
|
|
Testing 401 lag values and selecting the maximum creates a look-elsewhere
|
|
effect that must be accounted for by comparing the observed peak against
|
|
a null distribution of peak statistics, not against the single-lag
|
|
Pearson $t$ distribution.
|
|
\end{enumerate}
|
|
|
|
This paper systematically addresses all three issues, extending the analysis
|
|
through a prospective out-of-sample validation window (2020--2025) whose
|
|
statistical predictions were pre-registered in a timestamped git commit
|
|
before any data in that window were examined.
|
|
|
|
The remainder of the paper is organised as follows.
|
|
Section~\ref{sec:data} describes the data sources and preprocessing.
|
|
Section~\ref{sec:methods} presents the statistical methods.
|
|
Section~\ref{sec:results} reports the results of each analysis stage.
|
|
Section~\ref{sec:discussion} interprets the findings.
|
|
Section~\ref{sec:conclusions} concludes.
|
|
|
|
%%%% ═══════════════════════════════════════════════════════════════════════════
|
|
\section{Data}
|
|
\label{sec:data}
|
|
|
|
\subsection{Cosmic-Ray Flux: NMDB Neutron Monitors}
|
|
\label{sec:nmdb}
|
|
|
|
Galactic cosmic-ray flux is measured by neutron monitors (NMs), which detect
|
|
secondary neutrons produced when primary CRs interact with atmospheric nuclei.
|
|
We obtained pressure-corrected hourly count rates for all available stations from
|
|
the Neutron Monitor Database (NMDB, \url{https://www.nmdb.eu}) for the period
|
|
1976--2025.
|
|
|
|
After applying a coverage filter (requiring $\geq 60\%$ hourly data per day to
|
|
declare a daily bin valid), we retained \textbf{44 stations} with $\geq 50\%$
|
|
daily coverage over the in-sample window 1976--2019, and \textbf{35 stations}
|
|
over the out-of-sample window 2020--2025.
|
|
Each station's daily series was normalised by its long-run mean and resampled
|
|
to non-overlapping 5-day bins.
|
|
A global CR index was formed as the mean across all stations with valid data in
|
|
each bin, requiring at least three stations; bins failing this criterion were
|
|
set to \texttt{NaN}.
|
|
|
|
\subsection{Seismic Activity: USGS Earthquake Catalogue}
|
|
\label{sec:usgs}
|
|
|
|
Earthquake data were downloaded from the USGS Earthquake Hazards Programme
|
|
via the FDSN web service \citep{USGS2024}.
|
|
We retained all events with $M \geq 4.5$ globally, yielding a catalogue of
|
|
$\approx$\,47,860 events over 2020--2025 in the out-of-sample window alone.
|
|
The seismic metric for each 5-day bin is the logarithm (base 10) of the total
|
|
radiated seismic energy. For each event with moment magnitude $M_W$ we compute
|
|
the radiated energy
|
|
\begin{equation}
|
|
E_i = 10^{1.5 M_{W,i} + 4.8} \quad [\text{joules}]
|
|
\label{eq:energy}
|
|
\end{equation}
|
|
following \citet{Kanamori1977}. Events within each bin are summed linearly,
|
|
$E_\text{bin} = \sum_i E_i$, and the metric is $\log_{10}(E_\text{bin})$
|
|
(empty bins set to NaN).
|
|
This formulation correctly weights large earthquakes: an $M_W\,8.0$ event
|
|
contributes $\sim$1000$\times$ more energy than an $M_W\,6.0$ event.
|
|
Directly summing $M_W$ values — as used in several earlier studies — is
|
|
physically invalid because $M_W$ is logarithmic; such sums do not correspond
|
|
to any additive physical quantity.
|
|
|
|
\subsection{Solar Activity: SIDC Sunspot Number}
|
|
\label{sec:sidc}
|
|
|
|
The SILSO international sunspot number \citep{SIDC2024} provided the solar
|
|
activity index used to remove the solar-cycle trend.
|
|
We used the daily series (version 2.0), smoothed with a 365-day running mean
|
|
for detrending purposes.
|
|
|
|
%%%% ═══════════════════════════════════════════════════════════════════════════
|
|
\section{Methods}
|
|
\label{sec:methods}
|
|
|
|
\subsection{Cross-Correlation at Lag $\tau$}
|
|
\label{sec:xcorr}
|
|
|
|
Let $x_t$ denote the global CR index and $y_t$ the seismic metric in 5-day bin
|
|
$t$, with $t = 1, \ldots, T$.
|
|
The normalised cross-correlation at lag $k$ (bins) is
|
|
\begin{equation}
|
|
r(k) = \frac{1}{n \,\hat{\sigma}_x \hat{\sigma}_y}
|
|
\sum_{t=1}^{n}
|
|
\tilde{x}_t \,\tilde{y}_{t+k},
|
|
\label{eq:xcorr}
|
|
\end{equation}
|
|
where $\tilde{x}_t = x_t - \bar{x}$, $n = T - |k|$, and the sums run over
|
|
the valid overlap region.
|
|
A positive lag $k > 0$ corresponds to CR leading seismicity.
|
|
Lags range from $-200$ to $+200$~days (step $= 5$~days, i.e.\ 1 bin),
|
|
giving 81 lag values in the in-sample window.
|
|
|
|
\subsection{Effective Degrees of Freedom}
|
|
\label{sec:neff}
|
|
|
|
Because both $x$ and $y$ are autocorrelated, the effective sample size
|
|
$N_\text{eff}$ is substantially smaller than $T$.
|
|
We compare three $N_\text{eff}$ estimators; all are listed in Table~\ref{tab:neff}.
|
|
The \citet{Bartlett1946} first-order formula uses only the lag-1 autocorrelations:
|
|
\begin{equation}
|
|
N_\text{eff}^\text{Bart} = T\,
|
|
\frac{1 - r_{xx}(1)\,r_{yy}(1)}
|
|
{1 + r_{xx}(1)\,r_{yy}(1)},
|
|
\label{eq:neff_bartlett}
|
|
\end{equation}
|
|
the \citet{Bretherton1999} full-sum formula uses all significant lags:
|
|
\begin{equation}
|
|
N_\text{eff}^\text{Breth} = T \left(
|
|
1 + 2 \sum_{k=1}^{K} r_{xx}(k)\, r_{yy}(k)
|
|
\right)^{-1},
|
|
\label{eq:neff_breth}
|
|
\end{equation}
|
|
where $r_{xx}$ and $r_{yy}$ are the sample autocorrelation functions of $x$
|
|
and $y$, and $K = 200$ lags;
|
|
and a Monte Carlo estimate from 1000 phase-randomised surrogates:
|
|
$N_\text{eff}^\text{MC} = 1/\hat{\sigma}^2_{r_\text{null}}$,
|
|
where $\hat{\sigma}^2_{r_\text{null}}$ is the variance of the surrogate
|
|
correlation values at the target lag.
|
|
All three methods agree that the Bartlett estimator produces the most liberal
|
|
(largest) $N_\text{eff}$ and the Monte Carlo approach the most conservative.
|
|
|
|
\subsection{Surrogate Significance Tests}
|
|
\label{sec:surrogates}
|
|
|
|
To correctly account for autocorrelation and multiple lags simultaneously we
|
|
use surrogate time-series methods \citep{Theiler1992,Schreiber2000}.
|
|
|
|
\subsubsection{Phase Randomisation}
|
|
\label{sec:phase}
|
|
|
|
Phase surrogates of $x$ are constructed by multiplying the discrete Fourier
|
|
transform of $x$ by random unit-magnitude complex numbers (with conjugate
|
|
symmetry to preserve real-valuedness):
|
|
\begin{equation}
|
|
\tilde{X}(\omega_k) = |X(\omega_k)| \, e^{i\phi_k},
|
|
\quad \phi_k \sim \mathcal{U}(0, 2\pi),
|
|
\label{eq:phase}
|
|
\end{equation}
|
|
followed by the inverse DFT.
|
|
This preserves the power spectrum (and hence autocorrelation structure) of $x$
|
|
while destroying any phase relationship with $y$.
|
|
|
|
\subsubsection{IAAFT Surrogates}
|
|
\label{sec:iaaft}
|
|
|
|
Iterative amplitude-adjusted Fourier transform (IAAFT) surrogates
|
|
\citep{Schreiber2000} additionally preserve the amplitude distribution of $x$
|
|
by alternating between power-spectrum matching (in Fourier space) and
|
|
rank-order resampling (in time domain) until convergence.
|
|
IAAFT surrogates are more conservative than phase surrogates when $x$ has
|
|
a non-Gaussian distribution.
|
|
|
|
\subsubsection{Block-Bootstrap Surrogates}
|
|
\label{sec:blockbootstrap}
|
|
|
|
IAAFT surrogates preserve the power spectrum and amplitude distribution of each
|
|
series but assume linearity and stationarity --- assumptions violated by seismicity,
|
|
which is non-stationary, heavy-tailed, and aftershock-clustered.
|
|
As a complementary null we use a \emph{circular block bootstrap} (CBB) that
|
|
makes no parametric assumptions about the spectral shape.
|
|
Each surrogate is constructed by independently resampling $x$ and $y$ in
|
|
contiguous circular blocks of length $B = 804$ bins ($\approx 11$~yr,
|
|
i.e.\ approximately one solar cycle), drawn with replacement.
|
|
Independent resampling of $x$ and $y$ destroys any cross-series correlation
|
|
while preserving each series' within-block temporal structure.
|
|
We generate $S = 5{,}000$ surrogate pairs and compare the resulting null
|
|
distribution to the IAAFT null.
|
|
|
|
\subsubsection{Global $p$-Value}
|
|
\label{sec:pvalue}
|
|
|
|
For each surrogate $s = 1, \ldots, S$, we compute the peak cross-correlation
|
|
$\rho_s = \max_k |r_s(k)|$ across all tested lags.
|
|
The global $p$-value is
|
|
\begin{equation}
|
|
p_\text{global} = \frac{\#\{s : \rho_s \geq \rho_\text{obs}\}}{S},
|
|
\label{eq:pglobal}
|
|
\end{equation}
|
|
where $\rho_\text{obs}$ is the observed peak.
|
|
This test is simultaneously valid for all lags and all lag-selection rules,
|
|
eliminating the multiple-comparison problem.
|
|
|
|
\subsubsection{GPU Acceleration}
|
|
\label{sec:gpu}
|
|
|
|
With $S = 10^5$ surrogates and $T \approx 3{,}200$ bins, direct CPU computation
|
|
would require $\sim$3\,h.
|
|
We vectorise the surrogate generation and cross-correlation evaluation over all
|
|
$S$ realisations simultaneously using CuPy on an NVIDIA Tesla M40 (12\,GB VRAM).
|
|
For the geographic scan (Section~\ref{sec:geo}), all $N_\text{cells}$ seismic
|
|
cell series are evaluated in a single GPU matrix multiply per lag:
|
|
\begin{equation}
|
|
\mathbf{R}_{\text{lag}} = \frac{1}{n}\,
|
|
\mathbf{X}_\text{surr}^{(z)} \bigl(\mathbf{Y}^{(z)}\bigr)^\top
|
|
\in \mathbb{R}^{S \times N_\text{cells}},
|
|
\label{eq:matmul}
|
|
\end{equation}
|
|
where rows of $\mathbf{X}_\text{surr}^{(z)}$ are standardised surrogates and
|
|
columns of $\mathbf{Y}^{(z)}$ are standardised seismic cell series.
|
|
Benchmarks show a $2.9\times$ speedup for phase surrogates and $1.3\times$ for IAAFT
|
|
(limited by chunked argsort to avoid VRAM overflow).
|
|
|
|
\subsection{Solar-Cycle Detrending}
|
|
\label{sec:detrend}
|
|
|
|
We apply three complementary detrending approaches to isolate the CR--seismic
|
|
relationship from the shared solar-cycle trend:
|
|
|
|
\begin{enumerate}
|
|
\item \textbf{Hodrick--Prescott (HP) filter} \citep{HP1997} with smoothing
|
|
parameter $\lambda = 4.54 \times 10^{10}$.
|
|
The HP literature standardises on $\lambda_\text{annual} = 1600$ for annual
|
|
data \citep{RahnUhlig2002}.
|
|
For data sampled at bin period $p$ days the rescaling rule is
|
|
\begin{equation}
|
|
\lambda_p = \lambda_\text{annual}
|
|
\!\left(\frac{365}{p}\right)^{\!4}
|
|
= 1600 \times (365/5)^4
|
|
\approx 4.54 \times 10^{10}.
|
|
\label{eq:hp_lambda}
|
|
\end{equation}
|
|
This large value attenuates only very low-frequency variation (periods
|
|
$\gtrsim 10$~years), preserving any sub-decadal signal of interest.
|
|
The trend component is subtracted from both $x_t$ and $y_t$.
|
|
|
|
\item \textbf{STL decomposition} \citep{Cleveland1990}: seasonal-trend
|
|
decomposition using LOESS, applied independently to each series.
|
|
|
|
\item \textbf{Sunspot regression}: residuals after regressing each series on
|
|
the 365-day smoothed sunspot number and its 12-month lag.
|
|
\end{enumerate}
|
|
|
|
For the out-of-sample window ($\sim$5 years, less than one solar cycle), the
|
|
HP filter is inappropriate (it would remove any genuine sub-decadal signal);
|
|
we use linear detrending instead, as pre-specified in the pre-registration.
|
|
|
|
\subsection{Partial Correlation Analysis}
|
|
\label{sec:methods:partialcorr}
|
|
|
|
Because the solar-cycle component is removed \emph{before} concluding that the
|
|
observed signal is solar-cycle-driven, there is a potential circularity concern.
|
|
We address this with a partial correlation analysis on the \emph{unfiltered} data.
|
|
The seismic metric $y_t$ is regressed on the smoothed sunspot number $z_t$:
|
|
\begin{equation}
|
|
\hat{y}_t = \hat{\beta} z_t + \hat{\alpha}, \qquad
|
|
y_t^\text{resid} = y_t - \hat{y}_t,
|
|
\label{eq:partialcorr}
|
|
\end{equation}
|
|
and the cross-correlation between the CR index $x_t$ and the residual
|
|
$y_t^\text{resid}$ is computed across all lags.
|
|
If the CR--seismic correlation is driven entirely by a shared solar-cycle trend,
|
|
$r(x, y^\text{resid})$ should approach zero.
|
|
This analysis avoids any digital filter and thus sidesteps the preprocessing
|
|
circularity.
|
|
|
|
\subsection{Nonlinear Dependence Tests}
|
|
\label{sec:methods:nonlinear}
|
|
|
|
Cross-correlation captures only linear dependence; we supplement it with two
|
|
nonlinear measures for the main CR--seismic pair.
|
|
|
|
\paragraph{Spectral coherence.}
|
|
The magnitude-squared coherence at frequency $f$ is
|
|
\begin{equation}
|
|
C_{xy}(f) = \frac{|S_{xy}(f)|^2}{S_{xx}(f)\,S_{yy}(f)},
|
|
\label{eq:coherence}
|
|
\end{equation}
|
|
estimated via Welch's method with $L = 2{,}048$-point Hann windows (75\%
|
|
overlap), giving a frequency resolution of $(1/5)/2048 \approx 0.036$
|
|
cycles~yr$^{-1}$ --- sufficient to resolve the solar-cycle band
|
|
(0.08--0.115~cycles~yr$^{-1}$, periods 9--12.5~yr).
|
|
|
|
\paragraph{Mutual information.}
|
|
We estimate $I(x; y)$ at lags $\tau = 0$ and $\tau = +15$~d using the
|
|
\citet{Kraskov2004} $k$-nearest-neighbour estimator ($k = 5$, Chebyshev
|
|
metric in joint space):
|
|
\begin{equation}
|
|
\hat{I}(x; y) = \psi(k) + \psi(N)
|
|
- \bigl\langle\psi(n_x + 1)\bigr\rangle
|
|
- \bigl\langle\psi(n_y + 1)\bigr\rangle,
|
|
\label{eq:mi}
|
|
\end{equation}
|
|
where $\psi$ is the digamma function and $n_x(i)$, $n_y(i)$ count marginal
|
|
neighbours within the $k$-th-neighbour radius.
|
|
Statistical significance is assessed against a shuffle null of 1{,}000
|
|
random permutations of $y$.
|
|
|
|
\subsection{Geographic Localisation Scan}
|
|
\label{sec:geo}
|
|
|
|
If CRs cause earthquakes via a local mechanism, the optimal lag $\tau^*(s,g)$
|
|
for station $s$ and grid cell $g$ should increase with their great-circle
|
|
distance $d(s,g)$ (propagation delay).
|
|
Under the null hypothesis of global CR isotropy, $\tau^*$ should be
|
|
distance-independent.
|
|
|
|
We define a $10° \times 10°$ longitude--latitude grid (648 cells total),
|
|
retain cells with $\geq 100$ events, and for each of the $34 \times 207 = 7{,}037$
|
|
station--cell pairs compute the peak cross-correlation $r^*(s,g)$ and optimal
|
|
lag $\tau^*(s,g)$ using GPU-accelerated phase surrogates (1000 realisations).
|
|
|
|
Pairs are declared significant at false discovery rate $q = 0.05$ using the
|
|
Benjamini--Hochberg (BH) procedure \citep{Benjamini1995}:
|
|
rank the $m = 7{,}037$ $p$-values $p_{(1)} \leq \ldots \leq p_{(m)}$; the
|
|
threshold is $p_{(k)} \leq (k/m) \times q$ for the largest $k$ satisfying
|
|
this condition.
|
|
Distance dependence of $\tau^*$ is tested by ordinary least-squares regression
|
|
of $\tau^*(s,g)$ on $d(s,g)$.
|
|
|
|
\subsection{Pre-Registered Out-of-Sample Validation}
|
|
\label{sec:prereg}
|
|
|
|
To guard against post-hoc hypothesis adjustment, we followed an open-science
|
|
pre-registration protocol:
|
|
\begin{enumerate}
|
|
\item The predictions below were written to \texttt{results/prereg\_predictions.md}.
|
|
\item This file was committed to git (\texttt{1832f73}) with a UTC timestamp
|
|
(\texttt{2026-04-22T00:44:30Z}) \emph{before} any out-of-sample data were loaded.
|
|
\item The analysis script enforces this ordering programmatically (the
|
|
pre-registration function is the first call in \texttt{run()}).
|
|
\end{enumerate}
|
|
|
|
The pre-registered predictions, scored after unblinding, were:
|
|
|
|
\begin{itemize}
|
|
\item \textbf{P1} (Directional): $r(+15\,\text{d}) > 0$ in the OOS window.
|
|
\item \textbf{P2} (Significance): $p_\text{global} < 0.05$ and a
|
|
non-negative rolling trend.
|
|
\item \textbf{P3} (Stability): rolling $r$ standard deviation $\leq 0.10$.
|
|
\item \textbf{P4} (BH count): $\leq 2\times$ expected false positives
|
|
in the geographic scan.
|
|
\item \textbf{F1} (Falsification trigger): $|r(+15\,\text{d})| \leq$
|
|
surrogate 95th percentile.
|
|
\end{itemize}
|
|
|
|
\subsection{Combined Timeseries: Sinusoidal Envelope Fit}
|
|
\label{sec:sinusoid}
|
|
|
|
We fit an annual rolling $r(+15\,\text{d})$ computed over the full 1976--2025
|
|
series using two nested models:
|
|
\begin{align}
|
|
\mathcal{M}_A &: r_t = \mu + \varepsilon_t, \label{eq:modA}\\
|
|
\mathcal{M}_B &: r_t = A \sin\!\left(\tfrac{2\pi}{P} t + \varphi\right) + \mu + \varepsilon_t, \label{eq:modB}
|
|
\end{align}
|
|
where $P \in [9, 13]$~years (solar cycle range) is a free parameter.
|
|
Model selection uses the Bayesian information criterion (BIC):
|
|
\begin{equation}
|
|
\text{BIC} = n \ln\!\left(\frac{\text{RSS}}{n}\right) + k \ln(n),
|
|
\label{eq:bic}
|
|
\end{equation}
|
|
with $k_A = 1$, $k_B = 4$, and the Bayes factor approximated as
|
|
\begin{equation}
|
|
\text{BF}_{BA} \approx \exp\!\left(\frac{\Delta\text{BIC}}{2}\right),
|
|
\quad \Delta\text{BIC} = \text{BIC}_A - \text{BIC}_B.
|
|
\label{eq:bf}
|
|
\end{equation}
|
|
Parameters are estimated by nonlinear least squares with a grid search over
|
|
$(P, \varphi)$ to avoid local minima.
|
|
|
|
%%%% ═══════════════════════════════════════════════════════════════════════════
|
|
\section{Results}
|
|
\label{sec:results}
|
|
|
|
\subsection{Raw Pairwise Correlations Between CR, Seismic, and Sunspot Data}
|
|
\label{sec:res:rawcorr}
|
|
|
|
Before applying any detrending, we characterise the raw statistical relationships
|
|
between all three observable time series across the three analysis windows.
|
|
This serves as the pre-detrending baseline that motivates the Hodrick--Prescott
|
|
filter analysis described in Section~\ref{sec:res:detrend}.
|
|
|
|
\subsubsection{CR index: station distribution}
|
|
|
|
The global CR index is not collapsed to a single mean.
|
|
Instead, for each 5-day bin $t$ we compute the distribution of normalised
|
|
count-rate values across all $n_t \geq 3$ contributing stations,
|
|
yielding five order statistics:
|
|
$\hat{x}_{t}^{(q)} = q\text{-th percentile}$ for $q \in \{5, 25, 50, 75, 95\}$,
|
|
together with the inter-station minimum and maximum.
|
|
All stations are normalised by their individual long-run mean before the
|
|
cross-station percentile is taken, so the station-median series has a grand
|
|
mean near unity.
|
|
The scatter panels in Figures~\ref{fig:rawinsample}--\ref{fig:rawcombined}
|
|
use $\hat{x}_{t}^{(50)}$ as the central CR value (horizontal axis) and display
|
|
the $[\hat{x}_{t}^{(5)}, \hat{x}_{t}^{(95)}]$ inter-station spread as
|
|
horizontal error bars, with the $[\hat{x}_{t}^{\min}, \hat{x}_{t}^{\max}]$
|
|
range overlaid in a lighter shade.
|
|
|
|
\subsubsection{Seismic energy metric}
|
|
|
|
The seismic activity per bin is measured by the total released seismic energy,
|
|
computed as the sum of the individual earthquake energies:
|
|
\begin{equation}
|
|
E_t = \sum_{i \in \mathcal{B}_t} 10^{1.5\, M_{W,i}},
|
|
\label{eq:energy_bin}
|
|
\end{equation}
|
|
where $\mathcal{B}_t$ is the set of $M_W \geq 4.5$ events falling in bin $t$.
|
|
Working with $\log_{10} E_t$ removes the extreme skewness of $E_t$ and is
|
|
physically preferable to summing $M_W$ directly, which is dimensionally
|
|
inconsistent because magnitude is already a logarithmic quantity.
|
|
The $\log_{10} E_t$ axis in the scatter panels spans roughly three orders of
|
|
magnitude, reflecting the heavy-tailed Gutenberg--Richter distribution of
|
|
earthquake sizes.
|
|
|
|
\subsubsection{Sunspot number}
|
|
|
|
Daily sunspot numbers from SILSO are smoothed with a 365-day rolling mean to
|
|
suppress intra-year variability and isolate the solar-cycle envelope.
|
|
Each 5-day bin carries both the smoothed value (scatter-plot axis) and the
|
|
daily min--max spread within that bin (shown as additional horizontal error
|
|
bars on the sunspot axis in Figure~\ref{fig:rawinsample}--\ref{fig:rawcombined},
|
|
panel 3).
|
|
|
|
\subsubsection{Correlation results}
|
|
|
|
We compute both Pearson $r$ (linear; Fisher $z$-transform 95\% CI) and
|
|
Spearman $\rho$ (rank-based; appropriate for the heavy-tailed marginal
|
|
distributions of $E_t$ and CR flux).
|
|
Bonferroni correction for $3\ \text{pairs} \times 3\ \text{windows} = 9$
|
|
tests is applied; star levels refer to the corrected $p$-values.
|
|
Full results are given in Table~\ref{tab:rawcorr}.
|
|
|
|
\paragraph{CR vs.\ seismicity.}
|
|
The raw Pearson correlation between the station-median CR index and
|
|
$\log_{10} E_t$ is $r = 0.057$ in the in-sample window ($N = 3214$ bins,
|
|
$p_\text{Bonf} = 0.011$, $\rho = 0.071$, $p_\text{Bonf} < 10^{-3}$) and
|
|
$r = 0.046$ in the OOS window ($N = 390$, $p_\text{Bonf} = 1.0$).
|
|
The correlation is thus only marginally significant after correction and
|
|
disappears entirely in the independent OOS window.
|
|
Using the station-95th-percentile CR series in place of the median
|
|
amplifies the correlation slightly ($r = 0.069$ in-sample), suggesting that
|
|
high-flux excursions drive the signal.
|
|
|
|
\paragraph{CR vs.\ sunspot number.}
|
|
The dominant raw relationship in the dataset is a strong anti-correlation
|
|
between the CR index and the smoothed sunspot number: $r = -0.820$
|
|
($\rho = -0.854$) in-sample, and $r = -0.939$ ($\rho = -0.951$) in the OOS
|
|
window.
|
|
This is the well-established \emph{Forbush decrease} mechanism: a higher
|
|
heliospheric magnetic flux during solar maximum deflects more galactic cosmic
|
|
rays before they reach the Earth's atmosphere
|
|
\citep{Potgieter2013}, so CR flux and solar activity are naturally anti-phase.
|
|
The stronger OOS value ($-0.94$ vs.\ $-0.82$) reflects the wide dynamic range
|
|
of Solar Cycle~25 (2020--2025), which reached high activity levels after the
|
|
deep minimum of 2019--2020.
|
|
|
|
\paragraph{Sunspot vs.\ seismicity.}
|
|
The raw correlation between the smoothed sunspot number and seismic energy is
|
|
negative but small: $r = -0.095$ ($\rho = -0.099$) in-sample
|
|
($p_\text{Bonf} < 10^{-6}$) and indistinguishable from zero in the OOS window
|
|
($r = -0.023$, $p_\text{Bonf} = 1.0$).
|
|
The negative sign is consistent with reports of a weak inverse relationship
|
|
between solar activity and global seismicity
|
|
\citep{Odintsov2006}, though the OOS failure indicates this effect is not
|
|
robust.
|
|
|
|
\subsubsection{Interpretation: a confounding triangle}
|
|
|
|
The three raw correlations form a confounding triangle.
|
|
The CR--sunspot anti-correlation is strong ($r \approx -0.82$ to $-0.94$)
|
|
and physically understood.
|
|
The sunspot--seismicity anti-correlation is weak but significant in-sample
|
|
($r \approx -0.095$).
|
|
Together these imply a spurious \emph{positive} CR--seismicity correlation:
|
|
both CR and seismicity are jointly modulated by the $\sim$11-year solar cycle
|
|
--- CR increases during solar minima, and seismicity may very weakly increase
|
|
during solar minima --- so the two series covary positively without any direct
|
|
physical connection.
|
|
This confound cannot be resolved by naive cross-correlation; it requires
|
|
explicitly removing the shared solar-cycle component.
|
|
The HP filter analysis (Section~\ref{sec:res:detrend}) accomplishes exactly
|
|
this, and shows that the apparent CR--seismicity signal vanishes once the
|
|
solar-cycle trend is eliminated.
|
|
|
|
These raw correlations are intentionally uncontrolled for solar-cycle
|
|
modulation.
|
|
They are presented here as the pre-detrending baseline to document the scale
|
|
of the confound before any correction is applied, and to highlight that even
|
|
the uncorrected CR--seismicity correlation ($r = 0.057$) is far weaker than
|
|
the CR--sunspot anti-correlation ($r = -0.82$) that drives it.
|
|
|
|
\begin{table}[htbp]
|
|
\centering
|
|
\caption{Raw pairwise correlation statistics across three time windows.
|
|
Bonferroni correction applied for $3 \times 3 = 9$ tests.
|
|
CR uses the per-bin station-median index ($\hat{x}^{(50)}$).
|
|
Seismic energy is $\log_{10}\!\left(\sum 10^{1.5 M_W}\right)$.
|
|
Sunspot is the 365-day smoothed daily count.
|
|
$^{*}p_\text{Bonf}<0.05$, $^{**}p_\text{Bonf}<0.01$,
|
|
$^{***}p_\text{Bonf}<0.001$.}
|
|
\label{tab:rawcorr}
|
|
\setlength{\tabcolsep}{4pt}
|
|
\begin{tabular}{llrrrrrrr}
|
|
\toprule
|
|
Pair & Window & $N$ &
|
|
$r$ & 95\% CI &
|
|
$p$ (raw) & $p$ (Bonf.) &
|
|
$\rho$ & $p_\rho$ (Bonf.) \\
|
|
\midrule
|
|
CR vs Seismicity & In-sample & 3214 & $+0.057^{*}$ & $[+0.023, +0.092]$ & $0.0012$ & $0.011$ & $+0.071^{***}$ & $<10^{-3}$ \\
|
|
& OOS & 390 & $+0.046$ & $[-0.053, +0.145]$ & $0.362$ & $1.00$ & $+0.018$ & $1.00$ \\
|
|
& Combined & 3604 & $+0.055^{**}$ & $[+0.023, +0.088]$ & $0.0009$ & $0.008$ & $+0.065^{***}$ & $<10^{-3}$ \\
|
|
\addlinespace
|
|
CR vs Sunspot & In-sample & 3109 & $-0.820^{***}$ & $[-0.831, -0.808]$ & $\approx 0$ & $\approx 0$ & $-0.854^{***}$ & $\approx 0$ \\
|
|
& OOS & 385 & $-0.939^{***}$ & $[-0.950, -0.926]$ & $\approx 0$ & $\approx 0$ & $-0.951^{***}$ & $\approx 0$ \\
|
|
& Combined & 3494 & $-0.815^{***}$ & $[-0.826, -0.804]$ & $\approx 0$ & $\approx 0$ & $-0.844^{***}$ & $\approx 0$ \\
|
|
\addlinespace
|
|
Sunspot vs Seismicity & In-sample & 3109 & $-0.095^{***}$ & $[-0.130, -0.060]$ & $1.1 \times 10^{-7}$ & $10^{-6}$ & $-0.099^{***}$ & $<10^{-6}$ \\
|
|
& OOS & 385 & $-0.023$ & $[-0.123, +0.077]$ & $0.648$ & $1.00$ & $-0.016$ & $1.00$ \\
|
|
& Combined & 3494 & $-0.086^{***}$ & $[-0.119, -0.053]$ & $3.4 \times 10^{-7}$ & $3 \times 10^{-6}$ & $-0.086^{***}$ & $<10^{-6}$ \\
|
|
\addlinespace
|
|
\bottomrule
|
|
\end{tabular}
|
|
|
|
\medskip
|
|
\raggedright
|
|
\textit{Note:} CR$_{p95}$ variant (station 95th-percentile instead of median)
|
|
gives qualitatively identical structure; full results in
|
|
\texttt{results/raw\_pairwise\_correlations.json}.
|
|
\end{table}
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=\textwidth]{raw_corr_insample.png}
|
|
\caption{Raw pairwise scatter plots for the in-sample window (1976--2019,
|
|
$N = 3{,}215$ five-day bins).
|
|
\textbf{Left}: CR station-median index vs $\log_{10}$ seismic energy; horizontal
|
|
error bars span the station $[\hat{x}^{(5)}, \hat{x}^{(95)}]$ spread.
|
|
\textbf{Centre}: CR index vs 365-day smoothed sunspot number; the strong
|
|
anti-correlation ($r = -0.82$) reflects the Forbush decrease mechanism.
|
|
\textbf{Right}: Smoothed sunspot number vs $\log_{10}$ seismic energy; thin
|
|
horizontal error bars show the daily sunspot spread within each 5-day bin.
|
|
All points are coloured by decimal year (plasma colormap), revealing the
|
|
solar-cycle drift: successive cycles trace the same anti-correlated arc in
|
|
the centre panel.
|
|
Black curves are LOWESS trend lines ($f = 0.4$).}
|
|
\label{fig:rawinsample}
|
|
\end{figure}
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=\textwidth]{raw_corr_oos.png}
|
|
\caption{Raw pairwise scatter plots for the out-of-sample window
|
|
(2020--2025, $N = 390$ bins, 27 NMDB stations).
|
|
Layout identical to Figure~\ref{fig:rawinsample}.
|
|
The CR--sunspot anti-correlation strengthens to $r = -0.939$ during Solar
|
|
Cycle~25, which had a particularly wide dynamic range.
|
|
The CR--seismicity correlation ($r = 0.046$, $p_\text{Bonf} = 1.0$) is
|
|
indistinguishable from zero.}
|
|
\label{fig:rawoos}
|
|
\end{figure}
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=\textwidth]{raw_corr_combined.png}
|
|
\caption{Raw pairwise scatter plots for the full combined window
|
|
(1976--2025, $N = 3{,}604$ bins).
|
|
The multi-decadal colour gradient in the centre panel shows CR and sunspot
|
|
oscillating in anti-phase across four complete solar cycles.
|
|
The overall CR--seismicity Pearson correlation ($r = 0.055$, $\rho = 0.065$)
|
|
is statistically significant ($p_\text{Bonf} = 0.008$) but quantitatively
|
|
negligible, and its sign follows directly from the confounding triangle
|
|
described in the text.}
|
|
\label{fig:rawcombined}
|
|
\end{figure}
|
|
|
|
\subsection{In-Sample Replication (1976--2019)}
|
|
\label{sec:res:insample}
|
|
|
|
Figure~\ref{fig:homola} shows the full cross-correlation function of the raw
|
|
(undetrended) CR index and seismic metric (log$_{10}$ summed energy, Eq.~\ref{eq:energy})
|
|
over 1976--2019 ($T = 3{,}215$ five-day bins, 44 stations).
|
|
The dominant peak is at $\tau = -525$~days ($r = 0.139$), corresponding to a
|
|
half-solar-cycle lead of seismicity over CR flux.
|
|
At the claimed lag $\tau = +15$~days we find $r = 0.081$.
|
|
Although naive significance is high (4.6$\sigma$ treating bins as i.i.d.), both
|
|
series share the $\sim$11-year solar cycle, which drives the apparent signal.
|
|
The Bartlett effective-$N$ estimate gives $N_\text{eff} = 2{,}916$ and
|
|
$\sigma_{r(+15)} = 4.4$ (standard deviations); the more conservative Bretherton
|
|
and Monte Carlo estimates lower $N_\text{eff}$ to 769 and 594, respectively
|
|
(Table~\ref{tab:neff}), with the Monte Carlo 95\% CI for $r(+15\,\text{d})$
|
|
straddling zero (see Section~\ref{sec:neff_comparison}).
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=0.90\textwidth]{homola_replication.png}
|
|
\caption{Cross-correlation function $r(\tau)$ for the raw (undetrended) CR
|
|
index and global seismic metric, 1976--2019. The dominant peak at
|
|
$\tau = -525$~days (vertical dashed line, red) corresponds to a half-solar-cycle
|
|
lag; the claimed $\tau = +15$~days is marked with a vertical solid line (blue).
|
|
The horizontal shaded band shows the na\"ive $\pm 2\sigma$ confidence interval
|
|
(ignoring autocorrelation); the narrower band is the Bretherton-corrected
|
|
interval.}
|
|
\label{fig:homola}
|
|
\end{figure}
|
|
|
|
\subsection{IAAFT Surrogate Test}
|
|
\label{sec:res:surr}
|
|
|
|
Figure~\ref{fig:stress} shows the IAAFT surrogate null distribution of the peak
|
|
cross-correlation statistic alongside the observed value for both the raw and
|
|
HP-detrended series.
|
|
For the \textit{raw} series: $\rho_\text{obs} = 0.139$ exceeds all 10{,}000
|
|
surrogates ($p_\text{global} < 10^{-4}$, $> 3.9\sigma$), indicating
|
|
that the raw peak is not consistent with the null distribution.
|
|
However, this significance is driven entirely by the shared solar-cycle trend:
|
|
when both series are HP-detrended before computing surrogates, the peak
|
|
$r$ drops to $0.101$ (at $\tau = -125$~days) and achieves
|
|
$p_\text{global} < 10^{-4}$ ($> 3.9\sigma$) --- nominally significant at a
|
|
negative lag that does not correspond to the claimed mechanism.
|
|
Crucially, $r(+15\,\text{d})$ after HP detrending is $0.027$, well within the
|
|
surrogate null distribution.
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=0.90\textwidth]{stress_test.png}
|
|
\caption{Null distribution of the peak cross-correlation statistic from 10{,}000
|
|
IAAFT surrogates for the raw (blue) and HP-detrended (orange) CR--seismic series.
|
|
Vertical dashed lines mark the observed peak for each case.
|
|
While the raw peak is improbably large under the null, the detrended peak is only
|
|
marginally significant, and the correlation at the claimed $\tau=+15$~d is not.}
|
|
\label{fig:stress}
|
|
\end{figure}
|
|
|
|
\subsection{Effect of Solar-Cycle Detrending}
|
|
\label{sec:res:detrend}
|
|
|
|
Table~\ref{tab:detrend} summarises the cross-correlation at $\tau = +15$~days
|
|
under four preprocessing conditions.
|
|
The raw $r = 0.081$ falls to $0.027$ after HP filtering, to $0.030$ after STL
|
|
decomposition, and to $0.037$ after sunspot regression.
|
|
In all detrended cases the claimed $\tau = +15$~day signal is negligible.
|
|
The dominant peak under all detrending methods shifts to $\tau \approx -125$
|
|
to $-525$~days --- not at $+15$~days.
|
|
Figure~\ref{fig:detrend_robust} extends this comparison to three additional
|
|
detrending approaches (HP, Butterworth highpass, 12-month rolling-mean
|
|
subtraction) and confirms that no detrending method reveals a positive
|
|
correlation at the claimed lag (Section~\ref{sec:detrend_robust}).
|
|
|
|
\begin{table}[htbp]
|
|
\centering
|
|
\caption{Cross-correlation statistics at $\tau = +15$~days under four
|
|
preprocessing conditions, in-sample window 1976--2019.}
|
|
\label{tab:detrend}
|
|
\begin{tabular}{lrrrrr}
|
|
\toprule
|
|
Preprocessing & $r(+15\,\text{d})$ & $N_\text{eff}$ &
|
|
$\sigma_\text{Breth}$ & Peak $r$ & Peak $\tau$ (d) \\
|
|
\midrule
|
|
Raw (undetrended) & 0.081 & 2{,}916 & 4.4 & 0.139 & $-525$ \\
|
|
HP filter & 0.027 & 3{,}199 & 1.5 & 0.101 & $-125$ \\
|
|
STL decomposition & 0.030 & 3{,}031 & 1.6 & 0.093 & $-525$ \\
|
|
Sunspot regression& 0.037 & 3{,}056 & 2.0 & 0.092 & $-125$ \\
|
|
\bottomrule
|
|
\end{tabular}
|
|
\end{table}
|
|
|
|
Figure~\ref{fig:detrended} shows the cross-correlation functions before and
|
|
after HP detrending.
|
|
Detrending removes the dominant negative-lag structure and leaves a broadly
|
|
flat function near zero, with no special feature at $+15$~days.
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=0.90\textwidth]{detrended_xcorr.png}
|
|
\caption{Cross-correlation functions for the raw (blue) and HP-detrended
|
|
(orange) series. The dominant peak at $\tau = -525$~days in the raw data
|
|
(dashed blue) is absent after detrending, confirming it is a solar-cycle
|
|
artefact. Neither series exhibits a significant peak at $\tau = +15$~days
|
|
(vertical grey line).}
|
|
\label{fig:detrended}
|
|
\end{figure}
|
|
|
|
\subsection{Detrending Robustness}
|
|
\label{sec:detrend_robust}
|
|
|
|
To confirm that the null result at $\tau = +15$~days does not depend on the
|
|
choice of detrending method, Figure~\ref{fig:detrend_robust} compares three
|
|
distinct approaches applied to the in-sample window: (i) the HP filter
|
|
($\lambda = 4.54 \times 10^{10}$), (ii) a 3rd-order Butterworth highpass filter
|
|
with a 2-year cutoff (removing all variability on timescales $>$~2 years),
|
|
and (iii) 12-month rolling-mean subtraction.
|
|
All three leave $r(+15\,\text{d}) < 0.04$, with no approach producing a feature
|
|
at the claimed lag.
|
|
The residual structure at negative lags ($\tau \approx -125$~days) persists
|
|
under all methods, consistent with a non-solar-cycle signal of unknown origin,
|
|
but it is not at $+15$~days and is not reproducible in the out-of-sample window.
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=0.90\textwidth]{detrend_robustness.png}
|
|
\caption{Cross-correlation $r(\tau)$ under three detrending approaches
|
|
(HP filter, Butterworth highpass, rolling-mean subtraction) for the raw
|
|
CR index vs.\ log$_{10}$ seismic energy, 1976--2019.
|
|
The vertical grey line marks the claimed $\tau = +15$~days.
|
|
No method produces a positive feature at this lag.}
|
|
\label{fig:detrend_robust}
|
|
\end{figure}
|
|
|
|
\subsection{Comparison of $N_\text{eff}$ Estimators}
|
|
\label{sec:neff_comparison}
|
|
|
|
Table~\ref{tab:neff} compares the three $N_\text{eff}$ estimators described
|
|
in Section~\ref{sec:neff} for the raw in-sample series at $\tau = +15$~days.
|
|
The Bartlett first-order estimator is most liberal ($N_\text{eff} = 2{,}923$),
|
|
the Bretherton full-sum estimator substantially lower ($N_\text{eff} = 769$),
|
|
and the Monte Carlo phase-surrogate estimator most conservative
|
|
($N_\text{eff} = 594$).
|
|
The 95\% confidence interval for $r(+15\,\text{d})$ under the Monte Carlo
|
|
estimate straddles zero ($[-0.002, +0.158]$), meaning the raw correlation is
|
|
not statistically significant even before any detrending.
|
|
|
|
\begin{table}[htbp]
|
|
\centering
|
|
\caption{Comparison of $N_\text{eff}$ estimators for raw in-sample series at
|
|
$\tau = +15$~days ($N = 3{,}214$, $r = 0.079$).}
|
|
\label{tab:neff}
|
|
\begin{tabular}{lrrr}
|
|
\toprule
|
|
Method & $N_\text{eff}$ & 95\% CI for $r$ & Includes zero? \\
|
|
\midrule
|
|
Bartlett 1946 (first-order) & 2{,}923 & [0.042, 0.115] & No \\
|
|
Bretherton 1999 (full sum) & 769 & [0.008, 0.148] & No \\
|
|
Monte Carlo (phase surr.) & 594 & [$-$0.002, 0.158] & Yes \\
|
|
\bottomrule
|
|
\end{tabular}
|
|
\end{table}
|
|
|
|
\subsection{Magnitude Threshold Sensitivity}
|
|
\label{sec:mag_threshold}
|
|
|
|
To assess sensitivity to the event selection threshold and potential bias from
|
|
aftershock contamination, we recomputed the seismic energy metric for three
|
|
minimum-magnitude cutoffs: $M \geq 4.5$ (baseline), $M \geq 5.0$, and
|
|
$M \geq 6.0$.
|
|
Figure~\ref{fig:mag_threshold} shows the cross-correlation function $r(\tau)$
|
|
for each threshold.
|
|
The values at the claimed lag are $r(+15\,\text{d}) = 0.079$, $0.072$, and
|
|
$0.050$ for $M \geq 4.5$, $5.0$, and $6.0$ respectively --- a range of only
|
|
$0.029$, well below the $0.05$ threshold for a meaningful shift.
|
|
The peak lag remains at $\tau = -525$~days for all thresholds, indicating
|
|
that aftershock swarms (which inflate the $M \geq 4.5$ catalogue in the days
|
|
following large events) do not drive the dominant correlation structure.
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=0.90\textwidth]{magnitude_threshold.png}
|
|
\caption{Cross-correlation $r(\tau)$ for three magnitude cutoffs
|
|
($M \geq 4.5$, blue; $M \geq 5.0$, orange; $M \geq 6.0$, green),
|
|
in-sample window 1976--2019. The vertical grey line marks $\tau = +15$~days.
|
|
All three cutoffs yield similar, small correlations at the claimed lag,
|
|
and the dominant peak location ($\tau = -525$~days) is stable.}
|
|
\label{fig:mag_threshold}
|
|
\end{figure}
|
|
|
|
\subsection{Block-Bootstrap Surrogate Comparison}
|
|
\label{sec:res:blockbootstrap}
|
|
|
|
Figure~\ref{fig:blockbootstrap} shows the block-bootstrap (CBB) null
|
|
distributions for $r(+15\,\text{d})$ and the peak $|r|$ applied to the raw,
|
|
undetrended series.
|
|
Under the CBB null ($B = 804$ bins, $S = 5{,}000$):
|
|
\begin{itemize}
|
|
\item $p_\text{CBB}(+15\,\text{d}) = 0.022$ for the raw $r(+15\,\text{d}) = 0.079$;
|
|
\item $p_\text{CBB}(\text{peak}) = 0.008$ for the dominant raw peak
|
|
$r = 0.135$ at $\tau = -525$~days.
|
|
\end{itemize}
|
|
Both p-values reflect the fact that the raw series share a common 11-year
|
|
solar-cycle trend, so independently block-resampled surrogates occasionally
|
|
reproduce similar trend-induced correlations.
|
|
After HP detrending and IAAFT testing (Section~\ref{sec:res:surr}),
|
|
$r(+15\,\text{d})$ drops to 0.027 and falls well within the surrogate null,
|
|
confirming that the marginal significance seen here is a solar-cycle artefact,
|
|
not a genuine cross-series signal.
|
|
The block-bootstrap and IAAFT null shapes are qualitatively similar,
|
|
indicating that IAAFT's parametric assumptions do not materially distort the
|
|
null distribution for these data.
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=0.95\textwidth]{block_bootstrap_null.png}
|
|
\caption{Block-bootstrap null distributions for the raw CR--seismic pair
|
|
($B = 804$ bins $\approx 11$~yr, $S = 5{,}000$ surrogates).
|
|
\textbf{Left}: distribution of $r(+15\,\text{d})$ under the CBB null; observed
|
|
value $r = 0.079$ (red) lies at the $p = 0.022$ tail.
|
|
\textbf{Right}: distribution of the peak $|r|$; observed peak $0.135$ at
|
|
$\tau = -525$~d lies at $p = 0.008$.
|
|
Grey dashed lines mark the 95\% CI of the null.}
|
|
\label{fig:blockbootstrap}
|
|
\end{figure}
|
|
|
|
\subsection{Partial Correlation: Controlling for Sunspot Number}
|
|
\label{sec:res:partialcorr}
|
|
|
|
Figure~\ref{fig:partialcorr} shows the cross-correlation function for both the
|
|
raw seismic metric and the sunspot-regressed residual.
|
|
The OLS regression of seismic energy on the smoothed sunspot number gives
|
|
$\hat{\beta} = -0.0011$~units per sunspot number unit (a weak negative
|
|
relationship) and $\hat{\alpha} = 14.9$.
|
|
After removing this sunspot component, $r(+15\,\text{d})$ drops from 0.079
|
|
to 0.029 --- a reduction of 63\%.
|
|
This confirms that most of the raw CR--seismic correlation at the claimed lag
|
|
is attributable to the shared solar-cycle trend without invoking any digital
|
|
filter.
|
|
The residual partial correlation ($r = 0.029$) is indistinguishable from zero
|
|
($p = 0.083$, treating bins as independent; even smaller when autocorrelation
|
|
is accounted for).
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=0.90\textwidth]{partial_correlation.png}
|
|
\caption{Cross-correlation $r(\tau)$ for the raw seismic metric (blue) and
|
|
the sunspot-regressed seismic residual (orange), both against the CR index,
|
|
on the raw (unfiltered) in-sample series.
|
|
The claimed $\tau = +15$~d (grey dashed line) shows $r_\text{raw} = 0.079$
|
|
vs.\ $r_\text{partial} = 0.029$, a 63\% reduction once the shared solar-cycle
|
|
component is regressed out.}
|
|
\label{fig:partialcorr}
|
|
\end{figure}
|
|
|
|
\subsection{Spectral Coherence and Mutual Information}
|
|
\label{sec:res:coherence}
|
|
|
|
Figure~\ref{fig:coherence} shows the magnitude-squared coherence spectrum and
|
|
the mutual information shuffle null.
|
|
|
|
\paragraph{Coherence.}
|
|
The mean coherence in the solar-cycle band (0.08--0.115~cycles~yr$^{-1}$)
|
|
is $\bar{C}_{xy} = 0.840$, substantially above the 95\% significance
|
|
threshold of 0.776 (for $K = 3$ Welch segments; see Section~\ref{sec:methods:nonlinear}).
|
|
This confirms that CR and seismic activity share strong common variance at
|
|
the $\sim$11-year frequency --- the spectral signature of the solar-cycle
|
|
confound identified in Section~\ref{sec:discussion}.
|
|
Note that the limited number of Welch segments ($K \approx 3$) makes the
|
|
coherence estimate at this frequency uncertain; the result should be interpreted
|
|
as indicative rather than precise.
|
|
|
|
\paragraph{Mutual information.}
|
|
The kNN mutual information at lag $\tau = 0$ is
|
|
$\hat{I} = 0.007$~nats, with a shuffle-null p-value of $p = 0.062$ --- not
|
|
significant.
|
|
At $\tau = +15$~d, $\hat{I} = 0.000$~nats ($p = 1.000$), confirming the
|
|
complete absence of any nonlinear association at the claimed lag.
|
|
Together with the linear cross-correlation results, this establishes that
|
|
there is no detectable statistical dependence --- linear or nonlinear ---
|
|
between the CR index and global seismicity at the claimed $+15$~day lag.
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=0.95\textwidth]{spectral_coherence.png}
|
|
\caption{\textbf{Left}: Magnitude-squared coherence between the CR index and
|
|
seismic metric (blue), with the solar-cycle band shaded (orange, 0.08--0.115
|
|
cycles~yr$^{-1}$) and the 95\% significance level (red dashed).
|
|
The mean coherence in the SC band is 0.840, confirming a strong shared
|
|
solar-cycle component.
|
|
\textbf{Right}: kNN mutual information ($k = 5$) at lag $\tau = 0$ (blue)
|
|
and $\tau = +15$~d (orange) vs.\ their respective shuffle-null distributions.
|
|
Both observed MI values are indistinguishable from zero; $p(+15\,\text{d}) = 1.000$.}
|
|
\label{fig:coherence}
|
|
\end{figure}
|
|
|
|
\subsection{Missing-Data Sensitivity}
|
|
\label{sec:res:missing}
|
|
|
|
Table~\ref{tab:missing} reports the NaN fraction in the global CR index for
|
|
station thresholds of 2, 3, and 5.
|
|
In all cases the NaN fraction is 0.0\%: the 44-station NMDB network provides
|
|
complete 5-day coverage over 1976--2019 even under the strictest threshold
|
|
tested.
|
|
Consequently, $r(+15\,\text{d})$ is identical (0.079) across all thresholds,
|
|
confirming that the result is not an artefact of gaps in the CR record.
|
|
NaN bins show no clustering near solar maxima (clustering ratio = 1.0 at all
|
|
thresholds, i.e.\ indistinguishable from uniform), ruling out any systematic
|
|
data dropout that could correlate with the solar cycle.
|
|
|
|
\begin{table}[htbp]
|
|
\centering
|
|
\caption{Missing-data sensitivity: global CR index and correlation at
|
|
$\tau = +15$~days for three station-threshold values.}
|
|
\label{tab:missing}
|
|
\begin{tabular}{rrrr}
|
|
\toprule
|
|
Min.\ stations & NaN fraction & NaN near solar max & $r(+15\,\text{d})$ \\
|
|
\midrule
|
|
2 & 0.0\% & 0.0\% & 0.079 \\
|
|
3 & 0.0\% & 0.0\% & 0.079 \\
|
|
5 & 0.0\% & 0.0\% & 0.079 \\
|
|
\bottomrule
|
|
\end{tabular}
|
|
\end{table}
|
|
|
|
\subsection{Bin-Size Sensitivity}
|
|
\label{sec:res:binsize}
|
|
|
|
Figure~\ref{fig:binsize} shows the cross-correlation functions at three bin
|
|
sizes: 1-day, 5-day, and 27-day (Bartels rotation period).
|
|
|
|
\begin{itemize}
|
|
\item \textbf{1-day bins}: $r(+15\,\text{d}) = 0.036$;
|
|
dominant peak $r = 0.088$ at $\tau = -525$~days.
|
|
\item \textbf{5-day bins} (baseline): $r(+15\,\text{d}) = 0.079$;
|
|
dominant peak $r = 0.135$ at $\tau = -525$~days.
|
|
\item \textbf{27-day bins}: $r(+27\,\text{d}) = 0.123$;
|
|
dominant peak $r = 0.217$ at $\tau \approx -513$~days.
|
|
\end{itemize}
|
|
|
|
In all cases the dominant peak is at approximately $-525$ to $-513$~days
|
|
($\approx -18$ months, consistent with a half-solar-cycle lag), not at
|
|
$+15$~days.
|
|
The correlation at the claimed lag grows with bin size because longer bins
|
|
increasingly average out high-frequency noise and emphasise the solar-cycle
|
|
component.
|
|
This bin-size dependence is the behaviour expected of a solar-cycle artefact
|
|
and is inconsistent with a genuine short-lag (15-day) physical mechanism,
|
|
which should be insensitive to whether one uses 1-day or 27-day bins.
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=\textwidth]{bin_size_sensitivity.png}
|
|
\caption{Cross-correlation $r(\tau)$ for three bin sizes: 1-day (left),
|
|
5-day (centre), 27-day (right).
|
|
The dominant peak (red dotted) consistently falls near $\tau \approx -520$~days
|
|
across all bin sizes.
|
|
The correlation at the claimed $\tau = +15$~days (grey dashed) increases
|
|
from 0.036 to 0.123 as bin size increases, consistent with increasing
|
|
solar-cycle leakage rather than a physical short-lag signal.}
|
|
\label{fig:binsize}
|
|
\end{figure}
|
|
|
|
\subsection{Earthquake Declustering (Gardner--Knopoff)}
|
|
\label{sec:res:decluster}
|
|
|
|
Figure~\ref{fig:decluster} compares the cross-correlation for the full
|
|
catalogue and the mainshock-only catalogue after applying the
|
|
\citet{GardnerKnopoff1974} declustering algorithm.
|
|
Of 232{,}043 events ($M \geq 4.5$, 1976--2019), 65{,}874 (28.4\%) were
|
|
classified as aftershocks and removed, leaving 166{,}169 mainshocks.
|
|
|
|
The correlation at the claimed lag changes from $r(+15\,\text{d}) = 0.079$
|
|
(full) to $r(+15\,\text{d}) = 0.065$ (declustered), a reduction of only 0.014
|
|
--- well below the $\Delta r = 0.03$ threshold for a material change.
|
|
The peak structure and dominant lag are unchanged.
|
|
We conclude that aftershock clustering is not a primary confound in this
|
|
analysis; the signal does not disappear when aftershock sequences are removed.
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=0.90\textwidth]{declustered_xcorr.png}
|
|
\caption{Cross-correlation for the full catalogue ($n = 232{,}043$ events,
|
|
blue) and the Gardner--Knopoff declustered catalogue ($n = 166{,}169$
|
|
mainshocks, orange), in-sample 1976--2019.
|
|
Removing 28.4\% of events as aftershocks changes $r(+15\,\text{d})$ by
|
|
only $\Delta r = 0.014$, confirming the result is not driven by aftershock
|
|
swarms.}
|
|
\label{fig:decluster}
|
|
\end{figure}
|
|
|
|
\subsection{Sub-Period Analysis by Solar Cycle}
|
|
\label{sec:res:subcycles}
|
|
|
|
Figure~\ref{fig:subcycles} shows the cross-correlation computed independently
|
|
within each of the four complete solar cycles (21--24) in the in-sample window.
|
|
Table~\ref{tab:subcycles} summarises the results.
|
|
|
|
\begin{table}[htbp]
|
|
\centering
|
|
\caption{Per-solar-cycle cross-correlation at $\tau = +15$~days and the
|
|
within-cycle dominant peak.}
|
|
\label{tab:subcycles}
|
|
\begin{tabular}{lrrrrr}
|
|
\toprule
|
|
Cycle & Period & $N$ (bins) & $r(+15\,\text{d})$ & Peak $|r|$ & Peak $\tau$ (d) \\
|
|
\midrule
|
|
21 & 1976--1987 & 768 & $+0.071$ & 0.118 & $-65$ \\
|
|
22 & 1987--1996 & 724 & $+0.057$ & 0.104 & $-125$ \\
|
|
23 & 1997--2009 & 901 & $+0.018$ & 0.060 & $+125$ \\
|
|
24 & 2009--2020 & 810 & $+0.073$ & 0.177 & $-125$ \\
|
|
\bottomrule
|
|
\end{tabular}
|
|
\end{table}
|
|
|
|
The correlations at $\tau = +15$~d are positive in all four cycles (range
|
|
0.018--0.073), but the magnitudes vary by a factor of four, and the dominant
|
|
within-cycle peak lags are inconsistent: $-65$, $-125$, $+125$, and $-125$~days.
|
|
If the correlation were a genuine physical CR precursor, its lag should be
|
|
stable and reproducible across cycles.
|
|
The inconsistency of the dominant peak --- which oscillates between positive
|
|
and negative lags --- is instead the signature of a cycle-to-cycle drift in
|
|
the relative phase between the CR and seismic solar-cycle modulations, a
|
|
classical hallmark of a shared-trend confound rather than a causal relationship.
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=\textwidth]{solar_cycle_xcorr.png}
|
|
\caption{Cross-correlation $r(\tau)$ within each of the four complete solar
|
|
cycles (21--24) of the in-sample period (1976--2019).
|
|
The claimed lag $\tau = +15$~days (grey dashed) shows modest positive
|
|
correlations (0.018--0.073), but the dominant peak lag (red dotted) is
|
|
inconsistent across cycles ($-65$, $-125$, $+125$, $-125$~days), pointing
|
|
to a drift in the relative solar-cycle phase rather than a physical mechanism.}
|
|
\label{fig:subcycles}
|
|
\end{figure}
|
|
|
|
\subsection{Geographic Localisation}
|
|
\label{sec:res:geo}
|
|
|
|
Figure~\ref{fig:geoheatmap} shows the BH-adjusted significance map for all
|
|
station--cell pairs.
|
|
Of 7{,}037 pairs tested, 455 survive FDR correction at $q = 0.05$.
|
|
The expected number of false discoveries under the global null is
|
|
$351.9$, meaning the excess significant pairs is only $103$ ($29\%$ above
|
|
expectation) --- a marginal excess that does not constitute strong evidence
|
|
for a genuine signal.
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=0.90\textwidth]{geo_heatmap.png}
|
|
\caption{Heatmap of BH-significant station--grid-cell pairs ($q = 0.05$).
|
|
Each row is an NMDB station; each column is a $10° \times 10°$ seismic grid
|
|
cell. Significant pairs (455/7{,}037) are scattered without obvious geographic
|
|
clustering, inconsistent with a local coupling mechanism.}
|
|
\label{fig:geoheatmap}
|
|
\end{figure}
|
|
|
|
Figure~\ref{fig:geodistlag} shows the regression of the optimal lag $\tau^*(s,g)$
|
|
on great-circle distance $d(s,g)$.
|
|
The slope is $\beta = -0.45$~days/1000\,km ($p = 0.21$, $R^2 = 0.0002$),
|
|
indistinguishable from zero.
|
|
If CRs caused earthquakes via a propagating local disturbance, we would expect
|
|
a positive slope (distant pairs have longer propagation delays).
|
|
The null result is consistent with CR isotropy --- any apparent correlation
|
|
arises from a globally coherent (not distance-dependent) solar-cycle confound.
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=0.90\textwidth]{geo_distance_lag.png}
|
|
\caption{Optimal lag $\tau^*(s,g)$ vs.\ great-circle distance $d(s,g)$ for all
|
|
7{,}037 station--cell pairs (grey) and BH-significant pairs (coloured by peak $|r|$).
|
|
The OLS regression line (red) has slope $\beta = -0.45$~days/1000\,km ($p=0.21$),
|
|
consistent with zero. A local propagation mechanism would predict a positive slope.}
|
|
\label{fig:geodistlag}
|
|
\end{figure}
|
|
|
|
\subsection{Pre-Registered Out-of-Sample Validation (2020--2025)}
|
|
\label{sec:res:oos}
|
|
|
|
The out-of-sample analysis used data from 2020-01-01 to 2025-04-29 ($T = 390$
|
|
five-day bins, 35 NMDB stations), a window completely disjoint from the
|
|
in-sample period.
|
|
|
|
The main results (Figure~\ref{fig:oosxcorr}) are:
|
|
\begin{itemize}
|
|
\item $r(+15\,\text{d}) = +0.030$ (directionally correct, but very small);
|
|
\item Surrogate 95th percentile at $\tau = +15$~d: $0.101$ (observed is
|
|
well below this threshold);
|
|
\item $p_\text{global} = 0.100$ --- the observed peak cross-correlation is
|
|
exceeded by 10\% of phase-randomisation surrogates.
|
|
\end{itemize}
|
|
|
|
The prediction scorecard (Table~\ref{tab:prereg}) shows one pass (P1: correct
|
|
sign), one failure (P2: $p > 0.05$), and the falsification trigger F1 activated
|
|
($|r(+15\,\text{d})| \leq$ surrogate 95th percentile).
|
|
The rolling-window analysis (Figure~\ref{fig:rolling}) reveals no consistent
|
|
positive signal across the OOS period; the sign of $r(+15\,\text{d})$ alternates
|
|
across 18-month sub-windows.
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=0.90\textwidth]{oos_xcorr.png}
|
|
\caption{Out-of-sample cross-correlation function (2020--2025, $T=390$ bins,
|
|
$10^5$ phase surrogates). The observed $r(\tau)$ (black) lies within
|
|
the surrogate 95th-percentile envelope (grey shading). The claimed signal at
|
|
$\tau = +15$~d (vertical line) is $r = 0.030$ --- below the surrogate 95th
|
|
percentile of 0.101.}
|
|
\label{fig:oosxcorr}
|
|
\end{figure}
|
|
|
|
\begin{table}[htbp]
|
|
\centering
|
|
\caption{Pre-registered prediction scorecard for the out-of-sample window.}
|
|
\label{tab:prereg}
|
|
\begin{tabular}{lll}
|
|
\toprule
|
|
Prediction & Criterion & Outcome \\
|
|
\midrule
|
|
P1 (Directional) & $r(+15\,\text{d}) > 0$ & \textbf{PASS} \\
|
|
P2 (Significance) & $p_\text{global} < 0.05$ & \textbf{FAIL} \\
|
|
P3 (Stability) & std(rolling $r$) $\leq 0.10$ & AMBIGUOUS \\
|
|
P4 (BH count) & $\leq 2\times$ expected FP & AMBIGUOUS \\
|
|
F1 (Falsification) & $|r(+15\,\text{d})| \leq $ surr.\ 95th & \textbf{TRIGGERED} \\
|
|
\bottomrule
|
|
\end{tabular}
|
|
\end{table}
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=0.90\textwidth]{rolling_correlation_oos.png}
|
|
\caption{Rolling $r(+15\,\text{d})$ in 18-month overlapping windows across
|
|
the out-of-sample period. Error bars are bootstrap 95\% confidence intervals.
|
|
The grey horizontal band shows the surrogate 95th percentile.
|
|
The signal shows no consistent sign or trend.}
|
|
\label{fig:rolling}
|
|
\end{figure}
|
|
|
|
\subsection{Combined 1976--2025 Analysis: Sinusoidal Modulation}
|
|
\label{sec:res:combined}
|
|
|
|
Figure~\ref{fig:combined} shows the annual rolling $r(+15\,\text{d})$ over the
|
|
full 1976--2025 record, together with the best-fit sinusoidal envelope.
|
|
|
|
\begin{figure}[htbp]
|
|
\centering
|
|
\includegraphics[width=0.90\textwidth]{full_series_with_envelope_fit.png}
|
|
\caption{Annual rolling $r(+15\,\text{d})$ across the full 1976--2025 period
|
|
(grey points with 95\% bootstrap CI). The sinusoidal best-fit (red curve,
|
|
$P = 13.0$~yr) and the constant-mean model (dashed) are nearly indistinguishable;
|
|
BIC comparison slightly favours the constant model ($\text{BF} = 0.75$).
|
|
The vertical dashed line marks the in-sample/out-of-sample split (2020).}
|
|
\label{fig:combined}
|
|
\end{figure}
|
|
|
|
The global surrogate test on the full 1976--2025 window yields $p = 0.010$
|
|
($\sigma = 2.57$) at the dominant peak $\tau = -125$~days --- marginally
|
|
significant, but at a lag inconsistent with the claimed $+15$~day CR precursor.
|
|
|
|
The sinusoidal fit (Equations~\ref{eq:modA}--\ref{eq:bf}) does \emph{not} prefer
|
|
$\mathcal{M}_B$ over $\mathcal{M}_A$ with the corrected seismic metric:
|
|
\begin{itemize}
|
|
\item Best-fit period: $P = 13.0$~years;
|
|
\item Amplitude: $A = 0.047$;
|
|
\item $\Delta\text{BIC} = -0.57$ (negative = $\mathcal{M}_A$ constant preferred);
|
|
\item Bayes factor: $\text{BF}_{BA} = 0.75$.
|
|
\end{itemize}
|
|
|
|
A Bayes factor of 0.75 is less than 1.0, indicating that the constant (no
|
|
modulation) model is marginally preferred over the sinusoidal model.
|
|
With the correct seismic energy metric, the oscillatory rolling cross-correlation
|
|
previously attributed to solar-cycle modulation is no longer strongly supported;
|
|
the apparent modulation in the old results was partly an artefact of the invalid
|
|
summed-$M_W$ metric, which amplified the solar-cycle confound.
|
|
|
|
%%%% ═══════════════════════════════════════════════════════════════════════════
|
|
\section{Discussion}
|
|
\label{sec:discussion}
|
|
|
|
\subsection{Why Does the Raw Correlation Appear So Strong?}
|
|
|
|
The raw $r \approx 0.31$ reported by \citet{Homola2023} at $\tau = +15$~days
|
|
(reduced to $r = 0.081$ with the correct seismic energy metric) and the naive
|
|
$p \sim 10^{-72}$ are products of compounding statistical errors:
|
|
(i) using a physically invalid seismic metric (direct sum of logarithmic $M_W$
|
|
values), which artificially inflated the solar-cycle variation in the seismic series,
|
|
(ii) treating autocorrelated time series as independent observations,
|
|
(iii) failing to account for the shared solar-cycle trend driving both CR flux and
|
|
seismicity, and (iv) not correcting for scanning over 401 lag values.
|
|
|
|
The solar cycle is the key confounder.
|
|
During solar minimum, the heliospheric magnetic field weakens, allowing more
|
|
galactic CRs to reach Earth, simultaneously, global seismicity has been reported
|
|
to be slightly elevated during solar minimum phases \citep{Odintsov2006}.
|
|
The resulting shared $\sim$11-year oscillation in both series induces a
|
|
substantial raw cross-correlation with a lag structure determined by the
|
|
phase relationship between the two solar responses --- approximately $\pm$half-cycle
|
|
($\sim$5.5 years $\approx 2{,}000$ days), consistent with the dominant raw peak
|
|
at $\tau = -525$~days.
|
|
|
|
\subsection{Physical Plausibility of the Claimed Mechanism}
|
|
|
|
Even setting aside the statistical issues, the proposed mechanism faces severe
|
|
physical constraints.
|
|
The total ionisation dose from galactic CRs at the surface is
|
|
$\sim 0.3$~mGy/year \citep{Aplin2005} --- far too small to
|
|
transfer meaningful mechanical energy to fault zones, which require shear-stress
|
|
changes of order $\sim$0.01--1~MPa to trigger earthquakes.
|
|
Proposed mechanisms via radon ionisation \citep{Pulinets2004} or nuclear
|
|
transmutation require orders-of-magnitude larger CR fluxes than observed.
|
|
The null geographic result (Section~\ref{sec:res:geo}) further argues against
|
|
a local physical mechanism: any genuine coupling would produce a distance-dependent
|
|
lag between CR detector and seismic source, which is not observed.
|
|
|
|
\subsection{Comparison with Prior Replication Attempts}
|
|
|
|
Independent replication attempts of the \citet{Homola2023} result have been
|
|
limited.
|
|
\citet{Urata2018} found similarly inflated correlations using Japanese CR stations
|
|
and reported that detrending removed most of the signal.
|
|
Our analysis is the first to combine all three of: IAAFT surrogate testing,
|
|
solar-cycle-aware detrending, geographic localisation scanning, and pre-registered
|
|
out-of-sample validation.
|
|
|
|
\subsection{Limitations}
|
|
|
|
Several limitations should be acknowledged:
|
|
\begin{enumerate}
|
|
\item The OOS window (2020--2025) encompasses Solar Cycle~25, a period of
|
|
rising solar activity after the deep minimum of Cycle~24.
|
|
The absence of a solar minimum in this window limits statistical power.
|
|
\item Seismicity is not stationary; major seismic sequences (e.g.\ Tonga 2022)
|
|
can inflate the seismic metric in individual bins.
|
|
\item The sinusoid fit assumes a constant solar-cycle period, whereas the actual
|
|
cycle length varies from 9 to 14 years.
|
|
\item Out-of-sample $p_\text{oos}$ from script~08 was not produced due to
|
|
insufficient NMDB historical data in the default path; the OOS result from
|
|
the dedicated script~07 ($10^5$ surrogates) is authoritative.
|
|
\end{enumerate}
|
|
|
|
%%%% ═══════════════════════════════════════════════════════════════════════════
|
|
\section{Conclusions}
|
|
\label{sec:conclusions}
|
|
|
|
We have conducted a rigorous, pre-registered replication of the claimed
|
|
cosmic-ray/earthquake correlation from \citet{Homola2023} using 49 years of
|
|
data from 44 neutron monitors, the USGS global catalogue, and SILSO sunspot
|
|
numbers.
|
|
Our principal findings are:
|
|
|
|
\begin{enumerate}
|
|
\item The raw cross-correlation $r(+15\,\text{d}) = 0.081$ (corrected seismic
|
|
energy metric) is modest and misleading; it is driven by a shared
|
|
$\sim$10-year solar-cycle modulation of both CR flux and global seismicity,
|
|
not by a physical CR$\to$seismic mechanism.
|
|
The larger value ($r \approx 0.31$) reported by \citet{Homola2023} results
|
|
from the physically invalid practice of directly summing logarithmic $M_W$
|
|
values rather than the underlying energies.
|
|
|
|
\item After solar-cycle detrending, $r(+15\,\text{d})$ falls to $0.027$ (HP),
|
|
$0.030$ (STL), or $0.037$ (sunspot regression) --- all within the surrogate
|
|
null distribution and consistent with zero.
|
|
|
|
\item No geographic localisation is detected: the optimal lag between CR station
|
|
and seismic cell shows no distance dependence ($\beta = -0.45$~d/1000~km,
|
|
$p = 0.21$), inconsistent with a local propagation mechanism.
|
|
|
|
\item A pre-registered out-of-sample test on 2020--2025 yields
|
|
$r(+15\,\text{d}) = 0.030$ and $p_\text{global} = 0.100$, entirely
|
|
consistent with noise.
|
|
|
|
\item The 49-year annual rolling correlation timeseries is not significantly
|
|
better described by a sinusoid than a constant (Bayes factor 0.75, favoring
|
|
constant); the previous sinusoidal modulation finding was an artefact of the
|
|
invalid seismic metric.
|
|
|
|
\item Seven additional robustness checks all corroborate the null result:
|
|
\begin{itemize}
|
|
\item Block-bootstrap surrogates ($B \approx 11$~yr, $n = 5{,}000$) yield
|
|
$p_\text{CBB}(+15\,\text{d}) = 0.022$ on the \emph{raw} series;
|
|
after detrending the result is within the null.
|
|
\item Partial correlation (regressing seismic on sunspots before correlating
|
|
with CR) reduces $r(+15\,\text{d})$ from 0.079 to 0.029 (63\% drop),
|
|
confirming solar-cycle confounding without any filter.
|
|
\item Mean spectral coherence in the solar-cycle band is 0.840 (> 95\%
|
|
threshold), while mutual information at $\tau = +15$~d is exactly zero
|
|
($p = 1.000$): no nonlinear signal at the claimed lag.
|
|
\item Missing data: 0\% NaN bins at all station thresholds;
|
|
$r(+15\,\text{d})$ is unchanged for min-station thresholds 2, 3, or 5.
|
|
\item Bin-size sensitivity: the dominant peak is at $\tau \approx -520$~days
|
|
for 1-day, 5-day, and 27-day bins; $r$ at $+15$~days scales with bin size
|
|
as expected for solar-cycle leakage, not a physical mechanism.
|
|
\item Earthquake declustering removes 28\% of events as aftershocks;
|
|
$r(+15\,\text{d})$ changes by only $\Delta r = 0.014$ --- not a material confound.
|
|
\item Per-solar-cycle analysis shows inconsistent dominant peak lags
|
|
($-65$, $-125$, $+125$, $-125$~days across cycles 21--24),
|
|
the signature of a phase-drifting solar-cycle artefact.
|
|
\end{itemize}
|
|
\end{enumerate}
|
|
|
|
We conclude that there is no statistically credible evidence for a physical
|
|
causal link between galactic cosmic-ray flux and global seismicity.
|
|
|
|
%%%% ═══════════════════════════════════════════════════════════════════════════
|
|
\section*{Data Availability}
|
|
|
|
All analysis code, pre-registration documents, intermediate results, and figures
|
|
are publicly available at
|
|
\url{https://github.com/pingud98/cosmicraysandearthquakes} under the MIT licence.
|
|
Raw data are freely accessible from their respective providers:
|
|
NMDB (\url{https://www.nmdb.eu}),
|
|
USGS (\url{https://earthquake.usgs.gov/fdsnws/event/1/}),
|
|
and SIDC (\url{https://www.sidc.be/silso/datafiles}).
|
|
|
|
\section*{Acknowledgements}
|
|
|
|
The author thanks the operators of the NMDB network for maintaining open-access
|
|
neutron monitor data, and the USGS Earthquake Hazards Programme for the FDSN
|
|
catalogue service.
|
|
GPU computations were performed on an NVIDIA Tesla M40.
|
|
|
|
%%%% ═══════════════════════════════════════════════════════════════════════════
|
|
\bibliographystyle{plainnat}
|
|
\bibliography{refs}
|
|
|
|
\end{document}
|