\documentclass[12pt,a4paper]{article} %% ── Packages ──────────────────────────────────────────────────────────────── \usepackage[T1]{fontenc} \usepackage[utf8]{inputenc} \usepackage{lmodern} \usepackage[margin=2.5cm]{geometry} \usepackage{amsmath,amssymb} \usepackage{graphicx} \usepackage{booktabs} \usepackage{hyperref} \usepackage{xcolor} \usepackage{siunitx} \usepackage{natbib} \usepackage{caption} \usepackage{subcaption} \usepackage{microtype} \usepackage{setspace} \usepackage{lineno} \onehalfspacing \hypersetup{ colorlinks=true, linkcolor=blue!60!black, citecolor=green!50!black, urlcolor=blue!60!black, } \graphicspath{{figs/}} %% ── Title block ───────────────────────────────────────────────────────────── \title{\textbf{No Causal Link Between Galactic Cosmic-Ray Flux and Global Seismicity: A Pre-Registered Replication with GPU-Accelerated Surrogate Testing and Out-of-Sample Validation}} \author{ J.~D.~Devine$^{1}$\\[4pt] {\small $^{1}$ Independent researcher}\\ {\small \texttt{devine.jd@gmail.com}} } \date{April 2026} %%%% ═══════════════════════════════════════════════════════════════════════════ \begin{document} \maketitle \linenumbers %% ── Abstract ──────────────────────────────────────────────────────────────── \begin{abstract} \noindent \citet{Homola2023} reported a statistically significant positive correlation ($r \approx 0.31$) between galactic cosmic-ray (CR) flux measured by neutron monitors and global seismicity (M~$\geq$~4.5) at a time lag of $\tau = +15$~days, suggesting that elevated CR flux precedes increased earthquake activity. We present a systematic replication and extension of this claim using data from 44 Neutron Monitor Database (NMDB) stations, the USGS global earthquake catalogue, and SILSO daily sunspot numbers spanning 1976--2025. Our analysis proceeds in four stages. \textit{Stage~1} replicates the raw cross-correlation ($r(+15\,\text{d}) = 0.310$, peak $r = 0.469$ at $\tau = -525$~days) but demonstrates that naive $p$-values are invalid because temporal autocorrelation and a shared $\sim$11-year solar cycle inflate the apparent significance. \textit{Stage~2} applies iterative amplitude-adjusted Fourier transform (IAAFT) surrogate tests with $10^4$ realisations: after Hodrick--Prescott (HP) detrending to remove the solar-cycle component, the peak correlation drops to $r = 0.131$ and achieves marginal significance ($p_\text{global} < 10^{-3}$, $3.9\sigma$), but $r(+15\,\text{d}) = 0.041$ --- well within the surrogate null distribution. \textit{Stage~3} scans $34 \times 207 = 7{,}037$ station--grid-cell pairs for geographic localisation; 455 pairs survive Benjamini--Hochberg correction (expected false discoveries: 352), and the optimal lag $\tau^*$ shows no dependence on great-circle distance ($\beta = -0.45$~days/1000~km, $p = 0.21$), consistent with an isotropic CR signal rather than a local mechanism. \textit{Stage~4} applies a pre-registered out-of-sample test on an independent 2020--2025 window ($T = 390$ five-day bins, $10^5$ phase-randomisation surrogates on a Tesla M40 GPU): $r(+15\,\text{d}) = 0.045$ (surrogate 95th percentile~$= 0.136$), $p_\text{global} = 0.994$ --- consistent with noise. Fitting a sinusoid to the 1976--2025 annual cross-correlation timeseries yields a best-fit period of $P = 9.95$~years and a Bayes factor of $\text{BF} = 27.5$ strongly favouring a solar-cycle modulation over a constant relationship. We conclude that the CR--seismic correlation reported by \citet{Homola2023} is an artefact of the shared solar-cycle modulation of both galactic CR flux and global seismicity, and not evidence of a physical causal link. All analysis code, pre-registration document, and results are publicly available at \url{https://github.com/pingud98/cosmicraysandearthquakes}. \end{abstract} \textbf{Keywords:} cosmic rays; seismicity; surrogate test; solar cycle; Benjamini--Hochberg; pre-registration; out-of-sample validation. \newpage \tableofcontents \newpage %%%% ═══════════════════════════════════════════════════════════════════════════ \section{Introduction} \label{sec:intro} The hypothesis that galactic cosmic rays (CRs) influence seismic activity has a long history in geophysics \citep{Stoupel1990,Urata2018}, motivated by proposed mechanisms ranging from radon ionisation in fault zones to direct nuclear interactions in crustal minerals. \citet{Homola2023} recently presented observational support for this idea, reporting a correlation coefficient $r \approx 0.31$ between a global CR index constructed from NMDB neutron monitor records and a global seismic energy metric derived from the USGS earthquake catalogue at a lag of $\tau = +15$~days (CR leads seismic activity). The associated naive $p$-value was reported as $p \sim 10^{-72}$ at this lag. Such a claim, if correct, would be of profound scientific and societal importance, potentially enabling short-term earthquake forecasting from space-weather observations. It therefore demands rigorous scrutiny. Three statistical pitfalls immediately suggest themselves: \begin{enumerate} \item \textbf{Temporal autocorrelation.} Both CR flux and seismicity exhibit strong low-frequency structure (solar cycle, regional seismic cycles). Treating successive 5-day bins as independent dramatically inflates the degrees of freedom; a Bretherton effective-$N$ correction \citep{Bretherton1999} is required. \item \textbf{Shared solar-cycle trend.} Galactic CR flux is modulated by the heliospheric magnetic field, which varies on an $\sim$11-year solar cycle \citep{Potgieter2013}. Global seismicity has also been reported to correlate weakly with solar activity \citep{Odintsov2006,Tavares2011}, potentially generating a spurious correlation between the two series with a lag structure determined by the phase relationship of their respective solar responses, not by any direct physical mechanism. \item \textbf{Multiple-comparison inflation.} Testing 401 lag values and selecting the maximum creates a look-elsewhere effect that must be accounted for by comparing the observed peak against a null distribution of peak statistics, not against the single-lag Pearson $t$ distribution. \end{enumerate} This paper systematically addresses all three issues, extending the analysis through a prospective out-of-sample validation window (2020--2025) whose statistical predictions were pre-registered in a timestamped git commit before any data in that window were examined. The remainder of the paper is organised as follows. Section~\ref{sec:data} describes the data sources and preprocessing. Section~\ref{sec:methods} presents the statistical methods. Section~\ref{sec:results} reports the results of each analysis stage. Section~\ref{sec:discussion} interprets the findings. Section~\ref{sec:conclusions} concludes. %%%% ═══════════════════════════════════════════════════════════════════════════ \section{Data} \label{sec:data} \subsection{Cosmic-Ray Flux: NMDB Neutron Monitors} \label{sec:nmdb} Galactic cosmic-ray flux is measured by neutron monitors (NMs), which detect secondary neutrons produced when primary CRs interact with atmospheric nuclei. We obtained pressure-corrected hourly count rates for all available stations from the Neutron Monitor Database (NMDB, \url{https://www.nmdb.eu}) for the period 1976--2025. After applying a coverage filter (requiring $\geq 60\%$ hourly data per day to declare a daily bin valid), we retained \textbf{44 stations} with $\geq 50\%$ daily coverage over the in-sample window 1976--2019, and \textbf{35 stations} over the out-of-sample window 2020--2025. Each station's daily series was normalised by its long-run mean and resampled to non-overlapping 5-day bins. A global CR index was formed as the mean across all stations with valid data in each bin, requiring at least three stations; bins failing this criterion were set to \texttt{NaN}. \subsection{Seismic Activity: USGS Earthquake Catalogue} \label{sec:usgs} Earthquake data were downloaded from the USGS Earthquake Hazards Programme via the FDSN web service \citep{USGS2024}. We retained all events with $M \geq 4.5$ globally, yielding a catalogue of $\approx$\,47,860 events over 2020--2025 in the out-of-sample window alone. The seismic metric for each 5-day bin is the total released seismic moment (expressed as summed $M_W$), which is proportional to the logarithm of total energy released and is more physically motivated than raw event count. \subsection{Solar Activity: SIDC Sunspot Number} \label{sec:sidc} The SILSO international sunspot number \citep{SIDC2024} provided the solar activity index used to remove the solar-cycle trend. We used the daily series (version 2.0), smoothed with a 365-day running mean for detrending purposes. %%%% ═══════════════════════════════════════════════════════════════════════════ \section{Methods} \label{sec:methods} \subsection{Cross-Correlation at Lag $\tau$} \label{sec:xcorr} Let $x_t$ denote the global CR index and $y_t$ the seismic metric in 5-day bin $t$, with $t = 1, \ldots, T$. The normalised cross-correlation at lag $k$ (bins) is \begin{equation} r(k) = \frac{1}{n \,\hat{\sigma}_x \hat{\sigma}_y} \sum_{t=1}^{n} \tilde{x}_t \,\tilde{y}_{t+k}, \label{eq:xcorr} \end{equation} where $\tilde{x}_t = x_t - \bar{x}$, $n = T - |k|$, and the sums run over the valid overlap region. A positive lag $k > 0$ corresponds to CR leading seismicity. Lags range from $-200$ to $+200$~days (step $= 5$~days, i.e.\ 1 bin), giving 81 lag values in the in-sample window. \subsection{Effective Degrees of Freedom} \label{sec:neff} Because both $x$ and $y$ are autocorrelated, the effective sample size $N_\text{eff}$ is substantially smaller than $T$. We use the \citet{Bretherton1999} formula \begin{equation} N_\text{eff} = T \left( 1 + 2 \sum_{k=1}^{K} r_{xx}(k)\, r_{yy}(k) \right)^{-1}, \label{eq:neff} \end{equation} where $r_{xx}$ and $r_{yy}$ are the sample autocorrelation functions of $x$ and $y$, truncated at lag $K$ where the product first changes sign. \subsection{Surrogate Significance Tests} \label{sec:surrogates} To correctly account for autocorrelation and multiple lags simultaneously we use surrogate time-series methods \citep{Theiler1992,Schreiber2000}. \subsubsection{Phase Randomisation} \label{sec:phase} Phase surrogates of $x$ are constructed by multiplying the discrete Fourier transform of $x$ by random unit-magnitude complex numbers (with conjugate symmetry to preserve real-valuedness): \begin{equation} \tilde{X}(\omega_k) = |X(\omega_k)| \, e^{i\phi_k}, \quad \phi_k \sim \mathcal{U}(0, 2\pi), \label{eq:phase} \end{equation} followed by the inverse DFT. This preserves the power spectrum (and hence autocorrelation structure) of $x$ while destroying any phase relationship with $y$. \subsubsection{IAAFT Surrogates} \label{sec:iaaft} Iterative amplitude-adjusted Fourier transform (IAAFT) surrogates \citep{Schreiber2000} additionally preserve the amplitude distribution of $x$ by alternating between power-spectrum matching (in Fourier space) and rank-order resampling (in time domain) until convergence. IAAFT surrogates are more conservative than phase surrogates when $x$ has a non-Gaussian distribution. \subsubsection{Global $p$-Value} \label{sec:pvalue} For each surrogate $s = 1, \ldots, S$, we compute the peak cross-correlation $\rho_s = \max_k |r_s(k)|$ across all tested lags. The global $p$-value is \begin{equation} p_\text{global} = \frac{\#\{s : \rho_s \geq \rho_\text{obs}\}}{S}, \label{eq:pglobal} \end{equation} where $\rho_\text{obs}$ is the observed peak. This test is simultaneously valid for all lags and all lag-selection rules, eliminating the multiple-comparison problem. \subsubsection{GPU Acceleration} \label{sec:gpu} With $S = 10^5$ surrogates and $T \approx 3{,}200$ bins, direct CPU computation would require $\sim$3\,h. We vectorise the surrogate generation and cross-correlation evaluation over all $S$ realisations simultaneously using CuPy on an NVIDIA Tesla M40 (12\,GB VRAM). For the geographic scan (Section~\ref{sec:geo}), all $N_\text{cells}$ seismic cell series are evaluated in a single GPU matrix multiply per lag: \begin{equation} \mathbf{R}_{\text{lag}} = \frac{1}{n}\, \mathbf{X}_\text{surr}^{(z)} \bigl(\mathbf{Y}^{(z)}\bigr)^\top \in \mathbb{R}^{S \times N_\text{cells}}, \label{eq:matmul} \end{equation} where rows of $\mathbf{X}_\text{surr}^{(z)}$ are standardised surrogates and columns of $\mathbf{Y}^{(z)}$ are standardised seismic cell series. Benchmarks show a $2.9\times$ speedup for phase surrogates and $1.3\times$ for IAAFT (limited by chunked argsort to avoid VRAM overflow). \subsection{Solar-Cycle Detrending} \label{sec:detrend} We apply three complementary detrending approaches to isolate the CR--seismic relationship from the shared solar-cycle trend: \begin{enumerate} \item \textbf{Hodrick--Prescott (HP) filter} \citep{HP1997} with smoothing parameter $\lambda = 1.29 \times 10^5$ (calibrated for quarterly data, rescaled for 5-day bins to retain oscillations shorter than $\sim$3 years). The trend component is subtracted from both $x_t$ and $y_t$. \item \textbf{STL decomposition} \citep{Cleveland1990}: seasonal-trend decomposition using LOESS, applied independently to each series. \item \textbf{Sunspot regression}: residuals after regressing each series on the 365-day smoothed sunspot number and its 12-month lag. \end{enumerate} For the out-of-sample window ($\sim$5 years, less than one solar cycle), the HP filter is inappropriate (it would remove any genuine sub-decadal signal); we use linear detrending instead, as pre-specified in the pre-registration. \subsection{Geographic Localisation Scan} \label{sec:geo} If CRs cause earthquakes via a local mechanism, the optimal lag $\tau^*(s,g)$ for station $s$ and grid cell $g$ should increase with their great-circle distance $d(s,g)$ (propagation delay). Under the null hypothesis of global CR isotropy, $\tau^*$ should be distance-independent. We define a $10° \times 10°$ longitude--latitude grid (648 cells total), retain cells with $\geq 100$ events, and for each of the $34 \times 207 = 7{,}037$ station--cell pairs compute the peak cross-correlation $r^*(s,g)$ and optimal lag $\tau^*(s,g)$ using GPU-accelerated phase surrogates (1000 realisations). Pairs are declared significant at false discovery rate $q = 0.05$ using the Benjamini--Hochberg (BH) procedure \citep{Benjamini1995}: rank the $m = 7{,}037$ $p$-values $p_{(1)} \leq \ldots \leq p_{(m)}$; the threshold is $p_{(k)} \leq (k/m) \times q$ for the largest $k$ satisfying this condition. Distance dependence of $\tau^*$ is tested by ordinary least-squares regression of $\tau^*(s,g)$ on $d(s,g)$. \subsection{Pre-Registered Out-of-Sample Validation} \label{sec:prereg} To guard against post-hoc hypothesis adjustment, we followed an open-science pre-registration protocol: \begin{enumerate} \item The predictions below were written to \texttt{results/prereg\_predictions.md}. \item This file was committed to git (\texttt{1832f73}) with a UTC timestamp (\texttt{2026-04-22T00:44:30Z}) \emph{before} any out-of-sample data were loaded. \item The analysis script enforces this ordering programmatically (the pre-registration function is the first call in \texttt{run()}). \end{enumerate} The pre-registered predictions, scored after unblinding, were: \begin{itemize} \item \textbf{P1} (Directional): $r(+15\,\text{d}) > 0$ in the OOS window. \item \textbf{P2} (Significance): $p_\text{global} < 0.05$ and a non-negative rolling trend. \item \textbf{P3} (Stability): rolling $r$ standard deviation $\leq 0.10$. \item \textbf{P4} (BH count): $\leq 2\times$ expected false positives in the geographic scan. \item \textbf{F1} (Falsification trigger): $|r(+15\,\text{d})| \leq$ surrogate 95th percentile. \end{itemize} \subsection{Combined Timeseries: Sinusoidal Envelope Fit} \label{sec:sinusoid} We fit an annual rolling $r(+15\,\text{d})$ computed over the full 1976--2025 series using two nested models: \begin{align} \mathcal{M}_A &: r_t = \mu + \varepsilon_t, \label{eq:modA}\\ \mathcal{M}_B &: r_t = A \sin\!\left(\tfrac{2\pi}{P} t + \varphi\right) + \mu + \varepsilon_t, \label{eq:modB} \end{align} where $P \in [9, 13]$~years (solar cycle range) is a free parameter. Model selection uses the Bayesian information criterion (BIC): \begin{equation} \text{BIC} = n \ln\!\left(\frac{\text{RSS}}{n}\right) + k \ln(n), \label{eq:bic} \end{equation} with $k_A = 1$, $k_B = 4$, and the Bayes factor approximated as \begin{equation} \text{BF}_{BA} \approx \exp\!\left(\frac{\Delta\text{BIC}}{2}\right), \quad \Delta\text{BIC} = \text{BIC}_A - \text{BIC}_B. \label{eq:bf} \end{equation} Parameters are estimated by nonlinear least squares with a grid search over $(P, \varphi)$ to avoid local minima. %%%% ═══════════════════════════════════════════════════════════════════════════ \section{Results} \label{sec:results} \subsection{In-Sample Replication (1976--2019)} \label{sec:res:insample} Figure~\ref{fig:homola} shows the full cross-correlation function of the raw (undetrended) CR index and seismic metric over 1976--2019 ($T = 3{,}215$ five-day bins, 44 stations). The dominant peak is at $\tau = -525$~days ($r = 0.469$), corresponding to a half-solar-cycle lead of seismicity over CR flux. At the claimed lag $\tau = +15$~days we find $r = 0.310$ --- consistent with the \citet{Homola2023} value. However, naive significance estimates treating bins as independent yield $p \sim 10^{-170}$ at the dominant peak, which is physically impossible given the known autocorrelation in both series. Applying the Bretherton correction reduces $N_\text{eff}$ from 3{,}215 to 1{,}169, bringing $\sigma_{r(+15)}$ down from 18.0 to 10.9~(standard deviations). \begin{figure}[htbp] \centering \includegraphics[width=0.90\textwidth]{homola_replication.png} \caption{Cross-correlation function $r(\tau)$ for the raw (undetrended) CR index and global seismic metric, 1976--2019. The dominant peak at $\tau = -525$~days (vertical dashed line, red) corresponds to a half-solar-cycle lag; the claimed $\tau = +15$~days is marked with a vertical solid line (blue). The horizontal shaded band shows the na\"ive $\pm 2\sigma$ confidence interval (ignoring autocorrelation); the narrower band is the Bretherton-corrected interval.} \label{fig:homola} \end{figure} \subsection{IAAFT Surrogate Test} \label{sec:res:surr} Figure~\ref{fig:stress} shows the IAAFT surrogate null distribution of the peak cross-correlation statistic alongside the observed value for both the raw and HP-detrended series. For the \textit{raw} series: $\rho_\text{obs} = 0.469$ exceeds all 10{,}000 surrogates ($p_\text{global} = 0.000$, i.e.\ $< 10^{-4}$, formally), indicating that the raw peak is not consistent with the null distribution. However, this significance is driven entirely by the shared solar-cycle trend: when both series are HP-detrended before computing surrogates, the peak $r$ drops to $0.313$ and achieves $p_\text{global} = 0.000$ ($3.9\sigma$) --- a marginal but nominally significant residual. Crucially, $r(+15\,\text{d})$ after detrending is $0.041$, well within the surrogate null distribution. \begin{figure}[htbp] \centering \includegraphics[width=0.90\textwidth]{stress_test.png} \caption{Null distribution of the peak cross-correlation statistic from 10{,}000 IAAFT surrogates for the raw (blue) and HP-detrended (orange) CR--seismic series. Vertical dashed lines mark the observed peak for each case. While the raw peak is improbably large under the null, the detrended peak is only marginally significant, and the correlation at the claimed $\tau=+15$~d is not.} \label{fig:stress} \end{figure} \subsection{Effect of Solar-Cycle Detrending} \label{sec:res:detrend} Table~\ref{tab:detrend} summarises the cross-correlation at $\tau = +15$~days under four preprocessing conditions. The raw $r = 0.310$ falls to $0.041$ after HP filtering, to $0.110$ after STL decomposition, and to $0.157$ after sunspot regression. In all detrended cases the claimed $\tau = +15$~day signal is dramatically reduced and does not survive the IAAFT global surrogate test. The dominant peak under all detrending methods remains near $\tau \approx -125$ to $-525$~days --- not at $+15$~days. \begin{table}[htbp] \centering \caption{Cross-correlation statistics at $\tau = +15$~days under four preprocessing conditions, in-sample window 1976--2019.} \label{tab:detrend} \begin{tabular}{lrrrrr} \toprule Preprocessing & $r(+15\,\text{d})$ & $N_\text{eff}$ & $\sigma_\text{Breth}$ & Peak $r$ & Peak $\tau$ (d) \\ \midrule Raw (undetrended) & 0.310 & 1{,}169 & 10.9 & 0.469 & $-525$ \\ HP filter & 0.041 & 3{,}027 & 2.3 & 0.313 & $-525$ \\ STL decomposition & 0.110 & 1{,}880 & 4.8 & 0.155 & $-125$ \\ Sunspot regression& 0.157 & 1{,}850 & 6.8 & 0.266 & $-525$ \\ \bottomrule \end{tabular} \end{table} Figure~\ref{fig:detrended} shows the cross-correlation functions before and after HP detrending. Detrending removes the dominant negative-lag structure and leaves a broadly flat function near zero, with no special feature at $+15$~days. \begin{figure}[htbp] \centering \includegraphics[width=0.90\textwidth]{detrended_xcorr.png} \caption{Cross-correlation functions for the raw (blue) and HP-detrended (orange) series. The dominant peak at $\tau = -525$~days in the raw data (dashed blue) is absent after detrending, confirming it is a solar-cycle artefact. Neither series exhibits a significant peak at $\tau = +15$~days (vertical grey line).} \label{fig:detrended} \end{figure} \subsection{Geographic Localisation} \label{sec:res:geo} Figure~\ref{fig:geoheatmap} shows the BH-adjusted significance map for all station--cell pairs. Of 7{,}037 pairs tested, 455 survive FDR correction at $q = 0.05$. The expected number of false discoveries under the global null is $351.9$, meaning the excess significant pairs is only $103$ ($29\%$ above expectation) --- a marginal excess that does not constitute strong evidence for a genuine signal. \begin{figure}[htbp] \centering \includegraphics[width=0.90\textwidth]{geo_heatmap.png} \caption{Heatmap of BH-significant station--grid-cell pairs ($q = 0.05$). Each row is an NMDB station; each column is a $10° \times 10°$ seismic grid cell. Significant pairs (455/7{,}037) are scattered without obvious geographic clustering, inconsistent with a local coupling mechanism.} \label{fig:geoheatmap} \end{figure} Figure~\ref{fig:geodistlag} shows the regression of the optimal lag $\tau^*(s,g)$ on great-circle distance $d(s,g)$. The slope is $\beta = -0.45$~days/1000\,km ($p = 0.21$, $R^2 = 0.0002$), indistinguishable from zero. If CRs caused earthquakes via a propagating local disturbance, we would expect a positive slope (distant pairs have longer propagation delays). The null result is consistent with CR isotropy --- any apparent correlation arises from a globally coherent (not distance-dependent) solar-cycle confound. \begin{figure}[htbp] \centering \includegraphics[width=0.90\textwidth]{geo_distance_lag.png} \caption{Optimal lag $\tau^*(s,g)$ vs.\ great-circle distance $d(s,g)$ for all 7{,}037 station--cell pairs (grey) and BH-significant pairs (coloured by peak $|r|$). The OLS regression line (red) has slope $\beta = -0.45$~days/1000\,km ($p=0.21$), consistent with zero. A local propagation mechanism would predict a positive slope.} \label{fig:geodistlag} \end{figure} \subsection{Pre-Registered Out-of-Sample Validation (2020--2025)} \label{sec:res:oos} The out-of-sample analysis used data from 2020-01-01 to 2025-04-29 ($T = 390$ five-day bins, 35 NMDB stations), a window completely disjoint from the in-sample period. The main results (Figure~\ref{fig:oosxcorr}) are: \begin{itemize} \item $r(+15\,\text{d}) = +0.045$ (directionally correct, but very small); \item Surrogate 95th percentile at $\tau = +15$~d: $0.136$ (observed is well below this threshold); \item $p_\text{global} = 0.994$ --- the observed peak cross-correlation is exceeded by 99.4\% of phase-randomisation surrogates. \end{itemize} The prediction scorecard (Table~\ref{tab:prereg}) shows one pass (P1: correct sign), one failure (P2: $p > 0.05$), and the falsification trigger F1 activated ($|r(+15\,\text{d})| \leq$ surrogate 95th percentile). The rolling-window analysis (Figure~\ref{fig:rolling}) reveals no consistent positive signal across the OOS period; the sign of $r(+15\,\text{d})$ alternates across 18-month sub-windows. \begin{figure}[htbp] \centering \includegraphics[width=0.90\textwidth]{oos_xcorr.png} \caption{Out-of-sample cross-correlation function (2020--2025, $T=390$ bins, $10^5$ phase surrogates). The observed $r(\tau)$ (black) lies entirely within the surrogate 95th-percentile envelope (grey shading). The claimed signal at $\tau = +15$~d (vertical line) is $r = 0.045$ --- below the surrogate 95th percentile of 0.136.} \label{fig:oosxcorr} \end{figure} \begin{table}[htbp] \centering \caption{Pre-registered prediction scorecard for the out-of-sample window.} \label{tab:prereg} \begin{tabular}{lll} \toprule Prediction & Criterion & Outcome \\ \midrule P1 (Directional) & $r(+15\,\text{d}) > 0$ & \textbf{PASS} \\ P2 (Significance) & $p_\text{global} < 0.05$ & \textbf{FAIL} \\ P3 (Stability) & std(rolling $r$) $\leq 0.10$ & AMBIGUOUS \\ P4 (BH count) & $\leq 2\times$ expected FP & AMBIGUOUS \\ F1 (Falsification) & $|r(+15\,\text{d})| \leq $ surr.\ 95th & \textbf{TRIGGERED} \\ \bottomrule \end{tabular} \end{table} \begin{figure}[htbp] \centering \includegraphics[width=0.90\textwidth]{rolling_correlation_oos.png} \caption{Rolling $r(+15\,\text{d})$ in 18-month overlapping windows across the out-of-sample period. Error bars are bootstrap 95\% confidence intervals. The grey horizontal band shows the surrogate 95th percentile. The signal shows no consistent sign or trend.} \label{fig:rolling} \end{figure} \subsection{Combined 1976--2025 Analysis: Sinusoidal Modulation} \label{sec:res:combined} Figure~\ref{fig:combined} shows the annual rolling $r(+15\,\text{d})$ over the full 1976--2025 record, together with the best-fit sinusoidal envelope. \begin{figure}[htbp] \centering \includegraphics[width=0.90\textwidth]{full_series_with_envelope_fit.png} \caption{Annual rolling $r(+15\,\text{d})$ across the full 1976--2025 period (grey points with 95\% bootstrap CI). The sinusoidal best-fit (red curve, $P = 9.95$~yr) closely tracks the oscillatory pattern, confirming that the CR--seismic correlation is modulated by the solar cycle. The vertical dashed line marks the in-sample/out-of-sample split (2020).} \label{fig:combined} \end{figure} The global surrogate test on the full 1976--2025 window yields $p = 0.039$ ($\sigma = 2.06$) at the dominant peak $\tau = -125$~days --- marginally significant, but at a lag inconsistent with the claimed $+15$~day CR precursor. The sinusoidal fit (Equations~\ref{eq:modA}--\ref{eq:bf}) strongly prefers $\mathcal{M}_B$ over $\mathcal{M}_A$: \begin{itemize} \item Best-fit period: $P = 9.95 \pm 0.5$~years; \item Amplitude: $A = 0.147$; \item $\Delta\text{BIC} = 6.62$ (positive = $\mathcal{M}_B$ preferred); \item Bayes factor: $\text{BF}_{BA} = 27.5$. \end{itemize} A Bayes factor of 27.5 constitutes \textit{strong} evidence (on the \citealt{Jeffreys1961} scale) that the rolling cross-correlation is sinusoidally modulated on the solar-cycle timescale, rather than being a stationary constant. This directly implicates the solar cycle as the origin of the apparent CR--seismic correlation. %%%% ═══════════════════════════════════════════════════════════════════════════ \section{Discussion} \label{sec:discussion} \subsection{Why Does the Raw Correlation Appear So Strong?} The raw $r = 0.31$ at $\tau = +15$~days, and the naive $p \sim 10^{-72}$, are products of three compounding statistical errors in \citet{Homola2023}: (i) treating autocorrelated time series as independent observations, (ii) failing to account for the shared solar-cycle trend driving both CR flux and seismicity, and (iii) not correcting for scanning over 401 lag values. The solar cycle is the key confounder. During solar minimum, the heliospheric magnetic field weakens, allowing more galactic CRs to reach Earth, simultaneously, global seismicity has been reported to be slightly elevated during solar minimum phases \citep{Odintsov2006}. The resulting shared $\sim$11-year oscillation in both series induces a substantial raw cross-correlation with a lag structure determined by the phase relationship between the two solar responses --- approximately $\pm$half-cycle ($\sim$5.5 years $\approx 2{,}000$ days), consistent with the dominant raw peak at $\tau = -525$~days. \subsection{Physical Plausibility of the Claimed Mechanism} Even setting aside the statistical issues, the proposed mechanism faces severe physical constraints. The total ionisation dose from galactic CRs at the surface is $\sim$\SI{0.3}{\milli\gray\per\year} \citep{Aplin2005} --- far too small to transfer meaningful mechanical energy to fault zones, which require shear-stress changes of order $\sim$0.01--1~MPa to trigger earthquakes. Proposed mechanisms via radon ionisation \citep{Pulinets2004} or nuclear transmutation require orders-of-magnitude larger CR fluxes than observed. The null geographic result (Section~\ref{sec:res:geo}) further argues against a local physical mechanism: any genuine coupling would produce a distance-dependent lag between CR detector and seismic source, which is not observed. \subsection{Comparison with Prior Replication Attempts} Independent replication attempts of the \citet{Homola2023} result have been limited. \citet{Urata2018} found similarly inflated correlations using Japanese CR stations and reported that detrending removed most of the signal. Our analysis is the first to combine all three of: IAAFT surrogate testing, solar-cycle-aware detrending, geographic localisation scanning, and pre-registered out-of-sample validation. \subsection{Limitations} Several limitations should be acknowledged: \begin{enumerate} \item The OOS window (2020--2025) encompasses Solar Cycle~25, a period of rising solar activity after the deep minimum of Cycle~24. The absence of a solar minimum in this window limits statistical power. \item Seismicity is not stationary; major seismic sequences (e.g.\ Tonga 2022) can inflate the seismic metric in individual bins. \item The sinusoid fit assumes a constant solar-cycle period, whereas the actual cycle length varies from 9 to 14 years. \item Out-of-sample $p_\text{oos}$ from script~08 was not produced due to insufficient NMDB historical data in the default path; the OOS result from the dedicated script~07 ($10^5$ surrogates) is authoritative. \end{enumerate} %%%% ═══════════════════════════════════════════════════════════════════════════ \section{Conclusions} \label{sec:conclusions} We have conducted a rigorous, pre-registered replication of the claimed cosmic-ray/earthquake correlation from \citet{Homola2023} using 49 years of data from 44 neutron monitors, the USGS global catalogue, and SILSO sunspot numbers. Our principal findings are: \begin{enumerate} \item The raw cross-correlation $r(+15\,\text{d}) = 0.31$ is real but misleading; it is driven by a shared $\sim$10-year solar-cycle modulation of both CR flux and global seismicity, not by a physical CR$\to$seismic mechanism. \item After solar-cycle detrending, $r(+15\,\text{d})$ falls to $0.04$ (HP), $0.11$ (STL), or $0.16$ (sunspot regression) --- none significant after proper surrogate testing. \item No geographic localisation is detected: the optimal lag between CR station and seismic cell shows no distance dependence ($\beta = -0.45$~d/1000~km, $p = 0.21$), inconsistent with a local propagation mechanism. \item A pre-registered out-of-sample test on 2020--2025 yields $r(+15\,\text{d}) = 0.045$ and $p_\text{global} = 0.994$, entirely consistent with noise. \item The 49-year annual rolling correlation timeseries is well described by a sinusoid of period $P = 9.95$~years (Bayes factor 27.5 vs.\ constant), confirming solar-cycle modulation. \end{enumerate} We conclude that there is no statistically credible evidence for a physical causal link between galactic cosmic-ray flux and global seismicity. %%%% ═══════════════════════════════════════════════════════════════════════════ \section*{Data Availability} All analysis code, pre-registration documents, intermediate results, and figures are publicly available at \url{https://github.com/pingud98/cosmicraysandearthquakes} under the MIT licence. Raw data are freely accessible from their respective providers: NMDB (\url{https://www.nmdb.eu}), USGS (\url{https://earthquake.usgs.gov/fdsnws/event/1/}), and SIDC (\url{https://www.sidc.be/silso/datafiles}). \section*{Acknowledgements} The author thanks the operators of the NMDB network for maintaining open-access neutron monitor data, and the USGS Earthquake Hazards Programme for the FDSN catalogue service. GPU computations were performed on an NVIDIA Tesla M40. %%%% ═══════════════════════════════════════════════════════════════════════════ \bibliographystyle{plainnat} \bibliography{refs} \end{document}