the Creative Commons Attribution 4.0 License.
Development of a Model Framework for Terrestrial Carbon Flux Prediction: the Regional Carbon and Climate Analytics Tool (RCCAT) Applied to Non-tidal Wetlands
Abstract. Wetlands play a pivotal role in carbon sequestration but emit methane (CH4), creating uncertainty in their net climate impact. Although process-based models offer mechanistic insights into wetland dynamics, they are computationally expensive, uncertain, and difficult to upscale. In contrast, data-driven models provide a scalable alternative by leveraging extensive datasets to identify patterns and relationships, making them more adaptable for large-scale applications. However, their performance can vary significantly depending on the quality and representativeness of the data, as well as the model design, which raises questions about their reliability and generalizability in complex wetland systems. To address these issues, we present a data-driven framework for upscaling wetland CO2 and CH4 emissions, across a range of machine learning models that vary in complexity, validated against an extensive observational dataset from the Sacramento-San Joaquin Delta. We show that artificial intelligence (AI) approaches, including Random Forests, gradient boosting methods (XGBoost, LightGBM), Support Vector Machines (SVM) and Recurrent Neural Networks (GRU, LSTM), outperform linear regression models, with RNNs standing out, achieving an R2 of 0.71 for daily CO2 flux predictions compared to 0.62 for linear regression, and an R2 of 0.60 for CH4 flux predictions compared to 0.54 for linear regression. Despite that, interannual variability is less well captured, with annual mean absolute error of 193 gC m-2 yr-1 for CO2 fluxes and 11 gC-CH4 m-2 yr-1 for CH4 fluxes. By integrating vertically-resolved atmospheric, subsurface, and spectral reflectance information from readily available sources, the model identifies key drivers of wetland CO2 and CH4 emissions and enables regional upscaling. These findings demonstrate the potential of AI methods for upscaling, providing practical tools for wetland management and restoration planning to support climate mitigation efforts.
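The abstract reports two kinds of score: an R2 on daily fluxes and a mean absolute error on annual fluxes. The sketch below shows how these two metrics can diverge on the same predictions. It assumes the annual error is computed from per-year sums of the daily fluxes, which the abstract does not state. The function names and all numbers are made up for illustration and are not taken from the manuscript.

```python
from collections import defaultdict


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def annual_mae(years, observed, predicted):
    """Sum daily fluxes per year, then average the absolute annual errors.

    Assumption: annual totals are plain sums of the daily values.
    """
    obs_sum = defaultdict(float)
    pred_sum = defaultdict(float)
    for year, o, p in zip(years, observed, predicted):
        obs_sum[year] += o
        pred_sum[year] += p
    errors = [abs(obs_sum[y] - pred_sum[y]) for y in obs_sum]
    return sum(errors) / len(errors)


# Made-up daily fluxes (gC m-2 d-1) for two hypothetical years.
years = [2020, 2020, 2020, 2021, 2021, 2021]
observed = [-1.0, -2.0, -3.0, -2.0, -4.0, -6.0]
predicted = [-1.5, -2.0, -2.5, -3.0, -4.0, -6.0]

daily_r2 = r_squared(observed, predicted)
yearly_mae = annual_mae(years, observed, predicted)
```

With these toy values the daily errors in the first year cancel in the annual sum, while a single biased day in the second year carries through to the annual total, so a high daily R2 (about 0.91 here) does not by itself guarantee a small annual error.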
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-361', Toni Viskari, 15 Apr 2025
This is a review for the manuscript "Development of a Model Framework for Terrestrial Carbon Flux Prediction: the Regional Carbon and Climate Analytics Tool (RCCAT) Applied to Non-tidal Wetlands" submitted by Brereton et al. In this work multiple machine learning methods are tested within an established framework using a long measurement dataset from three sites on the Sacramento-San Joaquin Delta. In the examination, not only is the performance evaluated, but also the practical benefit of additional complexity.
For me, this was a well-written manuscript that clearly explains the motivation for the work, how it was done and how the results should be interpreted. Overall, I thought the work here was so excellently presented that I almost feel guilty about the few minor notes I have below, as I do not wish them to come across as just looking for something to be critical of. My notes, though, are so simple to address that I feel comfortable listing this as a recommendation for minor revisions.
Line 381: "After selecting LSTM as the model of choice..."
This paragraph belongs in the Methods, as it explains how the work is done and has very little to do with the actual results.
Figure 3: The lines in the legends here need to be thicker; in the current presentation it is very difficult, at least for me, to gather at a quick glance which color represents which line. Additionally, I would recommend considering, for example, red and blue instead of blue and green, as the shades applied here are a bit too close to each other.
Figure 5: This figure should just be moved to the supplemental material. There is far too much empty space here, and some of the locations with data in them are so small that I had to look at the figure for a long while to be certain the data was even there. Note that while I am critical of this, I also cannot think of a better way to visually present this kind of map data.
Citation: https://doi.org/10.5194/egusphere-2025-361-RC1
RC2: 'Comment on egusphere-2025-361', Anonymous Referee #2, 05 May 2025
Brereton et al. describe a data-driven modeling framework for estimating terrestrial carbon fluxes, with a focus on its application to non-tidal wetlands. The study makes a valuable attempt at spatial upscaling, which is an important component in assessing carbon budgets in wetland ecosystems. The effort to scale predictions beyond site-level measurements highlights the potential utility of the dataset and model framework. The models were trained and validated using sufficient long-term observational data in the Sacramento Delta. However, I have several concerns, detailed below. While the manuscript is written in a clear and logical manner, it lacks sufficient detail to assess the validity of the upscaling approach. Substantial revisions to the manuscript are necessary for publication.
Major comments:
1. L19-21
Since this study focuses on daily-scale calculations and a specific region, computational cost is a weak argument as a limitation of process-based models. It would be more appropriate to discuss the constraints of the input parameters, as mentioned in the introduction.
2. L308-309 “tangible benefits over linear regression methods for upscaling flux predictions”
I assume that the scores presented in this paragraph were derived from LOSO cross-validation results across all three sites, as shown in the scatter plots in Figure 3. However, due to the lack of information on how these scores were calculated, it is difficult for readers to determine whether they support the validity of the upscaling and extrapolation.
3. The scatter plots in Fig. 3 show a patterned, line-like distribution. Is there a specific reason for this? While it might be due to regression to the mean, could this also occur with data that includes observational errors?
4. The R² values indicate good predictive performance of the model and suggest high extrapolation potential. Interpreting why the selected variables in this model effectively explain CO₂ and CH₄ fluxes would support broader application of the model. Can any explanations be drawn from observational evidence or insights from previous studies?
5. What does the gray grid in Figure 5 represent? It is difficult to understand the spatial extent of the region covered in Figure 5. Additionally, the location and relative distance to the training sites are unclear.
6. The manuscript discusses spatial variation based on Figure 5, but it is difficult to interpret this from the figure. Since upscaling is a key component of the study, it is important to clearly present the spatial distribution. While the general north-south differences can be understood, it is hard to assess spatial variation between adjacent areas. It may be helpful to either enlarge specific regions or refine the color scale to better illustrate the variation.
7. I understand the extrapolation capability assessed by cross-validation within the Sacramento Delta. However, in evaluating the validity of the upscaling, I believe it is also important to assess prediction accuracy in areas farther from the three nearby training sites. Even if full time series data are not available, is it possible to evaluate the model’s validity using datasets (observations or estimates in previous studies) from different locations or periods?
8. L446-450 “The difficulty in reproducing …”
This discussion is important for evaluating the model’s effectiveness and for guiding future improvements. Would it be possible to objectively and quantitatively explain the causes of the discrepancies in the early years using previous studies or available datasets?
9. Does this framework require any special considerations for determining the models' hyperparameters?
10. What are the temporal and spatial resolutions of the input features? How are 4-day MODIS products applied to daily scale predictions?
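Comment 2 turns on how leave-one-site-out (LOSO) scores are computed, which the manuscript reportedly does not spell out. As a point of reference only, the sketch below shows one possible convention: fit on all sites but one, predict the held-out site, and report both a per-site R2 and an R2 pooled over all held-out predictions. It is not the authors' procedure. The single-predictor linear model, the site names, and the data values are all hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def leave_one_site_out(data):
    """data maps site name -> (x values, y values).

    Returns the R2 on each held-out site and the R2 pooled over all
    held-out predictions.
    """
    per_site = {}
    pooled_obs, pooled_pred = [], []
    for held_out in data:
        # Train only on the sites that are not held out.
        train_x = [v for s in data if s != held_out for v in data[s][0]]
        train_y = [v for s in data if s != held_out for v in data[s][1]]
        intercept, slope = fit_line(train_x, train_y)
        test_x, test_y = data[held_out]
        pred = [intercept + slope * v for v in test_x]
        per_site[held_out] = r_squared(test_y, pred)
        pooled_obs.extend(test_y)
        pooled_pred.extend(pred)
    return per_site, r_squared(pooled_obs, pooled_pred)


# Hypothetical sites with made-up driver (x) and flux (y) values.
sites = {
    "site_a": ([0, 1, 2, 3], [1.0, 3.1, 4.9, 7.0]),
    "site_b": ([1, 2, 3, 4], [3.2, 5.0, 6.8, 9.1]),
    "site_c": ([0, 2, 4, 6], [0.8, 5.1, 9.0, 13.2]),
}
per_site_r2, pooled_r2 = leave_one_site_out(sites)
```

Pooled and per-site scores can differ noticeably when sites have different flux ranges, so stating which one is reported would help readers judge the extrapolation claim.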
Minor comments:
1. There are some cases where the formatting does not conform to the GMD template (e.g., line numbering). Additionally, there are several places where the font style unexpectedly changes in the middle of a sentence (e.g., L80).
2. Do the squares on the left plot in Figure 1 exactly match the area shown in Google Earth pictures?
3. L382 upscaling.. -> upscaling.
Citation: https://doi.org/10.5194/egusphere-2025-361-RC2
Viewed
HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
---|---|---|---|---|---|---|
180 | 42 | 11 | 233 | 30 | 10 | 12 |