Why field experiments will be necessary to understand stratospheric aerosol injection
What would actually happen if SO₂ were released into the stratosphere from an airplane? The answer to this question underlies all of our projections for the effects of stratospheric aerosol injection (SAI), and right now that answer is based entirely on modeling and what we have observed from volcanoes. While there is a lot we can learn from volcanoes, lab work, and modeling, significant uncertainties remain that are only resolvable with field experiments.
What volcanoes tell us and why they’re not enough
We know stratospheric aerosol injection (SAI) would cool the surface because of volcanoes. Consider Mt. Pinatubo. The eruption of Mt. Pinatubo in 1991 put somewhere around 20 million tons of SO₂ into the stratosphere. The SO₂ formed aerosols that reflected sunlight, cooling the planet by about 0.5°C for a couple of years or so. SAI is the idea that we could put SO₂ (or something else) into the stratosphere intentionally to reflect some of the incoming sunlight.
We know this would cool the surface, but there is a lot we don’t know. Note the approximations: Around 20 million tons. About 0.5°C. A couple of years or so. That is a disturbing amount of ‘ish’ for something that affects the whole planet.
Measurements in the aftermath of a volcanic eruption are genuinely difficult, and natural variability in the climate system makes precisely attributing temperature changes to the stratospheric aerosols challenging. The models that we use to simulate SAI deployment use the eruption of Mt. Pinatubo as a benchmark, and even with the large observational uncertainty, some models fail to reproduce the Pinatubo aerosol observations. Volcanoes have shown us the diversity of aerosol responses to different injections of material: some eruptions cause stratospheric aerosols to grow larger, while others cause them to shrink, and models do not capture the evolution of the smaller aerosols. The aerosol physics relevant to model results is not explicitly resolved but is represented by necessarily imperfect parameterizations. Using more volcanic eruption data to try to fix parameterizations is not helpful because models also have biases in stratospheric winds that compound the problem. A controlled experiment, where you know exactly what was released and where, would isolate the process-level uncertainties obscured in those parameterizations.
A deliberate release would also be a fundamentally different physical scenario from a volcanic eruption. An airplane plume is far smaller than Pinatubo’s globe-spanning ash cloud, and no one is proposing to put volcanic ash and halogen compounds into the stratosphere along with SO₂. Instead, any SAI deployment would involve a controlled release of a precursor gas from an airplane in known stratospheric conditions, and it is precisely that scenario that our models have never been tested against.
Model errors matter beyond the immediate scientific questions because these same models are the ones used to make predictions about what would happen if SAI were actually deployed. Making consequential decisions about deployment on the basis of models that cannot reproduce the observations we have is quite problematic. Conducting experiments that allow us to improve those models is not a provocation — it is a prerequisite for responsible reasoning about SAI.
The processes that matter and why they can’t only be studied in isolation
What would actually happen if SO₂ were released into the stratosphere from an airplane? We can describe the chain of processes in theory. The SO₂ would first mix in the wake of the aircraft, entraining surrounding air. It would then disperse to larger scales. Meanwhile, the SO₂ would oxidize to sulfuric acid (H₂SO₄), which would either form new particles through nucleation or condense onto existing aerosol particles. Those aerosols would coagulate over time and ultimately reflect sunlight back to space.
Of these seven processes (wake mixing, large-scale dispersion, oxidation, nucleation, condensation, coagulation, and reflection), four are relatively well understood. Oxidation chemistry, condensation onto existing aerosols, coagulation of particles, and reflection by aerosols are generally well known. The other three processes — how SO₂ mixes in the wake of an aircraft, how the resulting plume disperses at larger scales, and nucleation (the formation of new particles from the gas phase) — all have significant uncertainties.
One difficulty is that these processes are not independent of one another, and two of the ones we understand the least set the conditions for everything downstream. Nucleation, for example, depends on local SO₂ concentrations, which are determined by wake mixing and large-scale dispersion — the other two uncertain processes. Even for oxidation chemistry, which we understand well at a fundamental level, the rate depends on reactant concentrations that are set by the mixing and dispersion we do not yet understand. You cannot study the downstream processes in isolation and expect to understand the full system because the conditions under which they operate are uncertain. This is the core scientific case for a field experiment releasing SO₂ from an airplane: the things we do not understand sit at the beginning of the causal chain, and their effects propagate through everything that follows.
What we can resolve without releasing SO₂
That is not to say that every uncertainty requires releasing SO₂ from an airplane. Some can be reduced through analysis of existing data and new modeling efforts — wake mixing, for example, can be studied using computational fluid dynamics (CFD) tools that have previously been applied to contrail formation, though such models have rarely been used to assess stratospheric aircraft emissions in civilian contexts. Others can be addressed through field experiments that do not use an airplane to release any gas or substance into the atmosphere.
One field experiment we are actively exploring would help characterize large-scale dispersion in the stratosphere. The basic idea is to send up multiple pairs of small tethered balloons, cut the tether between the balloons at altitude, and then track their subsequent trajectories. The divergence of those trajectories over time gives direct information about how chaotic and variable the system is and, by extension, how a plume released from an aircraft would likely spread. This kind of experiment has deep scientific roots. The statistical theory of dispersion in turbulent flows goes back over a century to G. I. Taylor’s seminal 1921 paper on diffusion by continuous movements. The methodology has been applied extensively in the ocean: the DIMES experiment in the Southern Ocean, for example, deployed hundreds of subsurface floats and tracked their trajectories to derive lateral diffusivities, precisely the kind of measurement we would be making in the stratosphere. Two balloon experiments in the 1970s and 80s did similar calculations for the Southern Hemisphere stratosphere, but those balloons were tracked by satellite with positions accurate only to within 5 km, and they flew in the lowermost stratosphere.
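To make the measurement concrete, here is a minimal sketch of how tracked balloon pairs could be turned into a dispersion estimate. It is an illustration under stated assumptions, not our analysis pipeline. It uses one common convention for relative diffusivity, K_rel = 0.5 · d⟨D²⟩/dt, where D is the separation between the two balloons of a pair. Positions are treated as flat (x, y) coordinates in km. The "tracks" are synthetic random walks standing in for GPS data, whereas real stratospheric flow includes shear and large-scale structure that a random walk does not capture. All function names and parameter values are hypothetical.

```python
import random


def mean_square_separation(pairs):
    """Average squared separation <D^2> at each sample time.

    pairs: list of (track_a, track_b); each track is a list of (x_km, y_km)
    positions sampled at the same times for both balloons of a pair.
    """
    n_times = len(pairs[0][0])
    msd = []
    for k in range(n_times):
        total = 0.0
        for track_a, track_b in pairs:
            dx = track_a[k][0] - track_b[k][0]
            dy = track_a[k][1] - track_b[k][1]
            total += dx * dx + dy * dy
        msd.append(total / len(pairs))
    return msd


def relative_diffusivity(msd, dt_s):
    """Finite-difference estimate of K_rel = 0.5 * d<D^2>/dt, in km^2/s."""
    return [0.5 * (msd[k + 1] - msd[k]) / dt_s for k in range(len(msd) - 1)]


def synthetic_pair(n_steps, step_km, rng):
    """Hypothetical stand-in for GPS tracks: two independent random walks
    that start at the same point (the moment the tether is cut)."""
    a, b = [(0.0, 0.0)], [(0.0, 0.0)]
    for _ in range(n_steps):
        a.append((a[-1][0] + rng.gauss(0.0, step_km),
                  a[-1][1] + rng.gauss(0.0, step_km)))
        b.append((b[-1][0] + rng.gauss(0.0, step_km),
                  b[-1][1] + rng.gauss(0.0, step_km)))
    return a, b


# Assumed, illustrative values: 200 pairs, hourly fixes for 48 hours.
rng = random.Random(0)
pairs = [synthetic_pair(n_steps=48, step_km=5.0, rng=rng) for _ in range(200)]
msd = mean_square_separation(pairs)
k_rel = relative_diffusivity(msd, dt_s=3600.0)
print(f"<D^2> after 48 h: {msd[-1]:.0f} km^2")
print(f"mean K_rel: {sum(k_rel) / len(k_rel):.4f} km^2/s")
```

With real data, the interesting output is how ⟨D²⟩ grows with time (for example, linearly or faster), since that growth law is what would constrain how quickly an aircraft plume spreads.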
The balloons for a modern stratospheric experiment are available and routinely used: companies like WindBorne and Urban Sky launch stratospheric balloons with small payloads regularly. The scientific question is novel; the equipment and methods are not. We are looking into what we can learn from existing Project Loon data to see whether this experiment is worth pursuing.
Even a comprehensive program of modeling, data analysis, and narrowly targeted field experiments will leave residual uncertainties that can only be resolved by observing what actually happens when SO₂ is released from an aircraft in the stratosphere. The decision about how to sequence that work depends on what we learn as we move into detailed experiment design, but the need for it is not in question.
Scale, safety, and risk
A field experiment releasing SO₂ into the atmosphere can sound alarming in the abstract, but the abstract is precisely where the concern lives. A vague experiment is harder to evaluate and easier to fear. Detailing the specifics of potential experiments (quantities, logistics, measurement strategies) is what makes them evaluable on their actual merits and risks.
The experiments we are thinking about are small. Because we understand aerosol reflection well, a field experiment that injects 2% or less of the SO₂ emitted daily by global aviation would resolve much of the uncertainty about SAI by teaching us about the aerosol properties. At that scale, there are no measurable climate impacts: the quantity of aerosol produced would be far too small to affect surface temperatures, to cause acid rain, to affect ozone in any detectable way, or to alter precipitation patterns.¹ These are real concerns for large-scale SAI deployment, but the experiments we are describing are far too small to have these effects.
The genuine safety considerations are more prosaic: pure SO₂ is toxic, and so the main risks are associated with its handling. These are risks that we take seriously, and they are manageable through standard aviation and industrial safety protocols for working with toxic materials. The balloon experiments have an even simpler safety profile: the payloads are tiny, consisting of little more than GPS equipment, and stratospheric balloon launches are routine, much like the weather balloon launches that meteorologists carry out daily around the world.
Why this matters
The goal of this research is not deployment. The goal is to know enough to make an informed decision about whether deployment would ever be warranted and, if so, how to do it responsibly. Right now, the evidence is thin. Our understanding of what SAI would actually do rests on observations of volcanic eruptions filtered through models that cannot reliably reproduce even those imperfect observations.
It is tempting to avoid any field experiment that releases SO₂ into the atmosphere, but that is the reckless position, not the cautious one. Small, targeted experiments will not resolve every uncertainty about SAI. However, they will resolve the ones that sit at the foundation of everything else we need to know, and they will help us avoid making decisions about a planetary-scale intervention on the basis of models we know are inadequate.
¹ To arrive at quantitative estimates, we scale linearly from deliberately high-end published estimates of SO₂ impacts on each of these quantities. We consider an experiment that injects 10 t of SO₂ (10⁻⁵ Tg):
Surface temperature: We scale based on the high-end estimate of cooling per unit SO₂ from this study (https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2023EF003851), which is 0.1 K/Tg SO₂ injected. As we are considering 10⁻⁵ Tg SO₂, the temperature perturbation is 10⁻⁶ K. For a maximally pessimistic perspective, we compare with the local vs. global effects of reductions in tropospheric SO₂ emissions from shipping (https://esd.copernicus.org/articles/15/1527/2024/esd-15-1527-2024.html), recognizing that local surface effects due to tropospheric emission will be much larger than local surface effects due to stratospheric emissions. Local effects can be about 10x larger than global, so our highest estimate of temperature perturbation is 10⁻⁵ K. High-end temperature sensors for weather stations measure to within 0.3 K.
Acid rain: We compare to tropospheric emissions of SO₂ from Kilauea and use the highest H⁺ region in the study to arrive at ~40 nmol H⁺/m³ with daily emissions of 1600 t. Our study will span a larger area, but to get a worst-case scenario, we scale directly to arrive at 0.25 nmol H⁺/m³. Adding that to a neutral pH of 7 lowers the pH by 1.1×10⁻⁶, to approximately 6.999999. The most accurate laboratory pH meters measure to within 0.001 units. (https://www.sciencedirect.com/science/article/pii/S0160412016301052)
Stratospheric ozone: We scale relative to the La Soufrière eruption. Taking the more pessimistic case, 0.3 Tg resulted in a 0.6% decrease in polar ozone. That is 30,000 times the SO₂ we would release in an experiment, so we’d expect a 2×10⁻⁵% change in polar ozone. This is well below the daily variability in total column ozone (~1%). (https://acp.copernicus.org/articles/23/15351/2023/)
Precipitation: We use a dramatic scenario that injects 30–40 Tg SO₂/yr, which is enough to disrupt the monsoon circulation (https://www.nature.com/articles/s41612-024-00875-z). Clearly an experiment would do no such thing, but that model shows a decrease of 1.5 mm/day as the maximum change in monsoon rainfall. As a gross overestimate, we assume this effect applies equally over the entire duration of an experiment (about 1 month, or 30 days); scaled to the 10⁻⁵ Tg SO₂ under consideration, the resulting precipitation change would be 1.5×10⁻⁵ mm. The best rain gauges measure to within about 0.1 mm.
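The arithmetic behind these estimates can be checked with a short script. This is a sketch that reproduces the figures quoted in this note; the variable and function names are ours. Two steps are assumptions on our part rather than statements from the cited studies. First, the pH step treats the added H⁺ as going directly into neutral water at 10⁻⁷ mol/L, which is 10⁵ nmol/m³. Second, the precipitation step is one reading that reproduces the quoted number: 1.5 mm/day sustained for 30 days, scaled from the 30 Tg end of the scenario.

```python
import math

EXPERIMENT_SO2_T = 10.0                      # tonnes released in the experiment
EXPERIMENT_SO2_TG = EXPERIMENT_SO2_T / 1e6   # 1 Tg = 1e6 t, so 1e-5 Tg


def linear_scale(reference_effect, reference_so2, experiment_so2):
    """Scale a reference effect by the ratio of SO2 amounts (same units)."""
    return reference_effect * experiment_so2 / reference_so2


# Surface temperature: 0.1 K per Tg, then a 10x local-vs-global factor.
temp_global_k = linear_scale(0.1, 1.0, EXPERIMENT_SO2_TG)
temp_local_k = 10.0 * temp_global_k

# Acid rain: 40 nmol H+/m^3 at 1600 t/day, scaled to 10 t.
added_h_nmol_m3 = linear_scale(40.0, 1600.0, EXPERIMENT_SO2_T)
# Assumption: neutral water holds 1e-7 mol H+/L = 1e5 nmol H+/m^3.
neutral_h_nmol_m3 = 1e5
# pH change = -log10(1 + added/neutral); log1p keeps precision for tiny ratios.
delta_ph = -math.log1p(added_h_nmol_m3 / neutral_h_nmol_m3) / math.log(10.0)

# Stratospheric ozone: 0.6% polar ozone decrease for 0.3 Tg.
ozone_pct = linear_scale(0.6, 0.3, EXPERIMENT_SO2_TG)

# Precipitation (assumed reading): 1.5 mm/day for 30 days, scaled from 30 Tg.
precip_mm = linear_scale(1.5 * 30.0, 30.0, EXPERIMENT_SO2_TG)

print(f"temperature, global: {temp_global_k:.1e} K")
print(f"temperature, local:  {temp_local_k:.1e} K")
print(f"added H+:            {added_h_nmol_m3} nmol/m^3")
print(f"pH change:           {delta_ph:.2e}")
print(f"ozone change:        {ozone_pct:.1e} %")
print(f"precipitation:       {precip_mm:.1e} mm")
```

Every quantity sits several orders of magnitude below the measurement limits quoted above (0.3 K, 0.001 pH units, ~1% daily ozone variability, 0.1 mm of rain).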


