
Appendix

Sources, terms, and a couple of the technical specifics — for the curious or for anyone who wants to dig deeper. Skip if you're skimming.

Glossary

RMSE	Root mean squared error — the standard accuracy metric for forecast predictions vs. truth. Lower is better.
GFS	Global Forecast System — NOAA's operational global weather model. The industry-standard baseline.
ERA5	ECMWF's gold-standard reanalysis dataset. Used as ground truth for validation.
CHIRPS	Climate Hazards Group InfraRed Precipitation with Station data. A blended satellite and station precipitation dataset. Second validation source.
EMOS-NGR	Ensemble Model Output Statistics — Non-homogeneous Gaussian Regression. The post-processing method that combines the three ensemble members and produces calibrated uncertainty.
FCN3	NVIDIA's spherical Fourier neural operator weather model. One of the three ensemble members.
GraphCast	Google DeepMind's graph neural network weather model. Another ensemble member.
Calibrated uncertainty	A probability distribution whose stated probabilities match observed frequencies: when the model says "80% chance of more than 5 mm," more than 5 mm should fall in about 80% of those cases.
DMH / DINAC	Dirección de Meteorología e Hidrología, Paraguay's national meteorological service, which sits inside the civil aviation authority DINAC (Dirección Nacional de Aeronáutica Civil).
MAG	Ministerio de Agricultura y Ganadería — Paraguay's agriculture ministry.
Itaipú Binacional	The binational (Paraguay + Brazil) hydroelectric dam authority. Operates a substantial gauge network in the Paraná basin — exactly the region where this project's biggest validation gap is.
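
To make the "Calibrated uncertainty" entry concrete, here is a minimal Python sketch. It is not the project's code. It assumes a plain Gaussian predictive distribution (the general form an NGR-style method produces), hypothetical forecast means and spreads in mm, and synthetic outcomes drawn from those same distributions, so the forecasts are calibrated by construction. The check gathers every case where the stated chance of more than 5 mm is close to 80% and counts how often that actually happened.

```python
import math
import random

def prob_exceed(mu: float, sigma: float, threshold: float) -> float:
    """P(X > threshold) for X ~ Normal(mu, sigma^2)."""
    z = (threshold - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * math.erfc(z)

def observed_frequency(forecasts, outcomes, threshold, lo, hi):
    """Among cases whose stated probability lies in [lo, hi), return
    (fraction where the event occurred, number of such cases)."""
    hits = 0
    total = 0
    for (mu, sigma), obs in zip(forecasts, outcomes):
        p = prob_exceed(mu, sigma, threshold)
        if lo <= p < hi:
            total += 1
            if obs > threshold:
                hits += 1
    if total == 0:
        return float("nan"), 0
    return hits / total, total

rng = random.Random(0)
# Hypothetical forecasts as (mean, spread) in mm; made-up values, not project data.
forecasts = [(rng.uniform(0.0, 12.0), rng.uniform(1.0, 4.0)) for _ in range(20000)]
# Synthetic "truth" drawn from each forecast distribution (calibrated by construction).
outcomes = [rng.gauss(mu, sigma) for mu, sigma in forecasts]

freq, n = observed_frequency(forecasts, outcomes, threshold=5.0, lo=0.75, hi=0.85)
print(f"stated about 80%, observed {freq:.1%} over {n} cases")
```

With a miscalibrated model (for example, spreads that are too narrow), the observed frequency in this bin would drift away from 80%.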

Source materials

Live technical demo — the canonical place to see results.
Project repository — private; available on request.
Validation methodology: 60 forecast initialization dates spanning October 2024 – March 2025, validated at lead times 24–240 hours in 6-hour steps. Bootstrap confidence intervals on all reported improvements.
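
The source does not say how the bootstrap intervals are computed, so the following is only a sketch of one common approach: a percentile bootstrap that resamples initialization dates with replacement. The resampling unit, the skill definition (1 minus the RMSE ratio), the function names, and the per-date error values are all assumptions for illustration.

```python
import math
import random

def skill(mse_model: float, mse_baseline: float) -> float:
    """Fractional RMSE improvement over the baseline (positive = better)."""
    return 1.0 - math.sqrt(mse_model / mse_baseline)

def bootstrap_skill_ci(model_mse, baseline_mse, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for skill, resampling dates with replacement."""
    rng = random.Random(seed)
    n = len(model_mse)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        m = sum(model_mse[i] for i in idx) / n
        b = sum(baseline_mse[i] for i in idx) / n
        stats.append(skill(m, b))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Lead times from the methodology line: 24 to 240 hours in 6-hour steps.
lead_times_h = list(range(24, 241, 6))

# Synthetic per-date mean squared errors for 60 initialization dates (made-up numbers).
rng = random.Random(1)
gfs_mse = [rng.uniform(40.0, 100.0) for _ in range(60)]
ens_mse = [g * rng.uniform(0.4, 0.8) for g in gfs_mse]

point = skill(sum(ens_mse) / 60, sum(gfs_mse) / 60)
ci_lo, ci_hi = bootstrap_skill_ci(ens_mse, gfs_mse)
print(f"{len(lead_times_h)} lead times; skill {point:+.1%}, 95% CI [{ci_lo:+.1%}, {ci_hi:+.1%}]")
```

Resampling whole dates rather than individual grid cells keeps the spatially correlated errors of one forecast together, which is why the date is used as the unit here.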

The validation regions

The Paraguayan domain used throughout:

Region	Lat range	Lon range	Use
Full domain	−27.5 to −19.0	−63.0 to −54.0	National-level reporting
East / soybean belt	−27.0 to −22.0	−57.0 to −54.0	Primary commercial relevance
Chaco (northwest)	−23.0 to −19.0	−63.0 to −58.0	Drier; secondary cattle/cotton region
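
The bounding boxes above translate directly into grid masks. The sketch below assumes a regular 0.25° grid with both edges included; the source never states the spacing, but that choice reproduces the 35 × 37 cells quoted in the stratified results. The dictionary keys and function names are hypothetical.

```python
def grid(start: float, stop: float, step: float = 0.25) -> list:
    """Inclusive regular grid from start to stop."""
    n = round((stop - start) / step) + 1
    return [start + i * step for i in range(n)]

# Bounding boxes copied from the table; the short keys are made up for this sketch.
REGIONS = {
    "full":  {"lat": (-27.5, -19.0), "lon": (-63.0, -54.0)},
    "east":  {"lat": (-27.0, -22.0), "lon": (-57.0, -54.0)},
    "chaco": {"lat": (-23.0, -19.0), "lon": (-63.0, -58.0)},
}

def cells_in(region: str, lats, lons):
    """Return the (lat, lon) grid points inside a region's box, edges included."""
    lat0, lat1 = REGIONS[region]["lat"]
    lon0, lon1 = REGIONS[region]["lon"]
    return [(la, ln)
            for la in lats if lat0 <= la <= lat1
            for ln in lons if lon0 <= ln <= lon1]

lats = grid(-27.5, -19.0)
lons = grid(-63.0, -54.0)
print(len(lats), len(lons), len(lats) * len(lons))  # 35 37 1295
for name in REGIONS:
    print(name, len(cells_in(name, lats, lons)))
```

Note that the east and Chaco boxes do not tile the full domain, so the regional cell counts are not expected to add up to the national total.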

Stratified results — the honest table

Per-bin RMSE on the full × ERA5 view (60 dates × 35 × 37 cells = 77,700 observations). This is the table that lives behind the +25.7% headline.

Truth bin	n obs	Mean truth (mm)	Ensemble RMSE (mm)	GFS RMSE (mm)	Skill vs. GFS
Dry (<1 mm)	48,904	0.17	1.86	3.20	+41.7%
Light (1–10 mm)	20,705	3.80	4.98	9.18	+45.8%
Moderate (10–25 mm)	6,056	15.48	10.85	14.00	+22.5%
Heavy (>25 mm)	2,035	40.57	33.98	30.68	−10.8%

The same pattern holds in all four validation views (full × ERA5, east × ERA5, full × CHIRPS, east × CHIRPS): the ensemble beats GFS in the dry, light, and moderate bins and loses to it in the heavy bin, by 6–19% depending on the view. The bias is structural at 25 km cell resolution.
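
For readers who want to reproduce the table's last column: it is consistent with skill = 1 − (ensemble RMSE / GFS RMSE), expressed as a percentage. That definition is inferred from the numbers rather than stated in the source. The sketch below shows the stratification step on a toy sample and recomputes two rows from the rounded RMSEs in the table. The half-open bin edges (a value of exactly 25 mm counts as heavy) are an assumption.

```python
import math

# Truth bins in mm, treated as half-open intervals [lo, hi); edge handling is assumed.
BINS = [
    ("Dry", 0.0, 1.0),
    ("Light", 1.0, 10.0),
    ("Moderate", 10.0, 25.0),
    ("Heavy", 25.0, math.inf),
]

def rmse(errors) -> float:
    """Root mean squared error of a non-empty list of errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def stratified_rmse(truth, forecast) -> dict:
    """RMSE of forecast vs. truth within each truth bin; empty bins are skipped."""
    out = {}
    for name, lo, hi in BINS:
        errs = [f - t for t, f in zip(truth, forecast) if lo <= t < hi]
        if errs:
            out[name] = rmse(errs)
    return out

def skill_pct(rmse_model: float, rmse_baseline: float) -> float:
    """Percent RMSE improvement over the baseline (negative = worse)."""
    return 100.0 * (1.0 - rmse_model / rmse_baseline)

# Toy sample (made-up values) to show the binning.
toy_truth = [0.0, 0.5, 3.0, 12.0, 30.0]
toy_forecast = [1.0, 0.5, 5.0, 9.0, 20.0]
print(stratified_rmse(toy_truth, toy_forecast))

# Recompute two rows from the table's rounded RMSE columns.
print(f"Moderate: {skill_pct(10.85, 14.00):+.1f}%")
print(f"Heavy:    {skill_pct(33.98, 30.68):+.1f}%")
```

The last two lines reproduce the +22.5% and −10.8% shown in the table. Recomputing from rounded inputs can differ in the final digit for other rows.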

What's running passively right now

EMA poller. A Modal cron job pulls Paraguay's public weather station feed every 5 minutes and archives a daily Parquet file. This builds a forward-looking observation archive with no permission required.
The deployed demo. Hosted on Netlify; regenerated on each repository push.
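
A rough sketch of the bookkeeping a poller like this needs, under several assumptions: readings carry a station ID and an observation timestamp, files are partitioned by UTC date, and consecutive polls can return overlapping readings that need de-duplicating. The field names, file naming, and functions are hypothetical. The actual scheduling (Modal) and the Parquet write are deliberately left out so the snippet runs with the standard library alone.

```python
from datetime import datetime, timezone

POLLS_PER_DAY = 24 * 60 // 5  # one poll every 5 minutes

def archive_name(observed_at: datetime) -> str:
    """Daily file name for a reading, partitioned by UTC date (an assumption)."""
    return f"ema_{observed_at.astimezone(timezone.utc):%Y-%m-%d}.parquet"

def merge_poll(day_store: dict, readings: list) -> int:
    """Add readings not seen before, keyed by (station_id, observed_at).
    Returns how many readings were new."""
    added = 0
    for r in readings:
        key = (r["station_id"], r["observed_at"])
        if key not in day_store:
            day_store[key] = r
            added += 1
    return added

# Two consecutive polls whose contents overlap (made-up station and values).
t0 = datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc)
t1 = datetime(2025, 1, 15, 12, 5, tzinfo=timezone.utc)
poll_a = [{"station_id": "S1", "observed_at": t0, "precip_mm": 0.2}]
poll_b = poll_a + [{"station_id": "S1", "observed_at": t1, "precip_mm": 0.0}]

store = {}
new_a = merge_poll(store, poll_a)
new_b = merge_poll(store, poll_b)
print(archive_name(t0), POLLS_PER_DAY, new_a, new_b, len(store))
```

In a real job, the day's store would be flushed to the file named by `archive_name` instead of being held in memory.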

Why I'm doing this

Two reasons.

First, I think AI weather models are interesting and the application to underserved regions is more impactful than another marginal improvement on a North American or European benchmark. Paraguay is a real economy with real exposure to weather risk and worse forecasting infrastructure than its neighbors.

Second, I wanted to find out whether one person with a laptop can take a credible technical artifact from "research" to "actually used by a stakeholder" without an institution behind them. The honest answer to that is "I don't know yet — that's what the next 90 days are about."

That's the end of the plan site. Head back to the overview if you want to reread anything, or to "how to help" if something specific came to mind.