The CommUnity near-Surface Permafrost (CUSP) dataset: a global compilation of permafrost observations and related properties to support AI/ML model development
Warming-driven permafrost loss has increased at alarming rates. Because permafrost is widespread across high-latitude regions, these changes may cause widespread shifts in vegetation, hydrology, landscape and infrastructure stability, and the global carbon cycle. Understanding the magnitude, extent and timing of these changes requires robust spatial coverage of permafrost observations and related data products. Yet over large portions of the high latitudes, datasets agree poorly on the occurrence of permafrost at scales finer than 1 km. We aim to address this gap by developing the CommUnity near-Surface Permafrost (CUSP) dataset, a community-driven compilation of more than 20,000 near-surface (< 150 cm) permafrost observations drawn from more than 20 published sources. CUSP includes point-source information on permafrost presence or absence, as well as active layer thickness. Data were derived from a variety of methods, including frost probes, boreholes, thaw tubes, ground-penetrating radar (GPR), electrical resistivity tomography (ERT) and interferometric synthetic aperture radar (InSAR). Developing the CUSP codebase within cloud data-hosting platforms has enabled automated extraction of auxiliary environmental features, such as topography, vegetation indices, soil texture and climate data, from remote sensing and data assimilation sources available on Google Earth Engine. CUSP is a living data repository: additional permafrost observations can be uploaded, and the accompanying environmental features extracted, as new information becomes available. We envision that CUSP will be an asset to the permafrost science community engaged in AI/ML analysis, model building and validation of permafrost dynamics under present and future warming.
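
To make the kind of record described above concrete, the following minimal Python sketch shows one way a single near-surface observation could be represented and screened before auxiliary features are extracted. It is illustrative only: the class name, field names and the `validate` helper are hypothetical and are not taken from the CUSP codebase. The only value drawn from the text is the 150 cm near-surface limit.

```python
from dataclasses import dataclass
from typing import List, Optional

# Near-surface depth limit stated in the abstract (< 150 cm).
NEAR_SURFACE_LIMIT_CM = 150.0


@dataclass(frozen=True)
class PermafrostObservation:
    """Hypothetical record layout; field names are assumptions, not the CUSP schema."""
    latitude: float                 # decimal degrees, north positive
    longitude: float                # decimal degrees, east positive
    method: str                     # e.g. "frost probe", "borehole", "GPR"
    permafrost_present: bool        # presence/absence at the point
    active_layer_thickness_cm: Optional[float] = None  # None when not measured
    source: str = ""                # citation key of the originating publication


def validate(obs: PermafrostObservation) -> List[str]:
    """Return a list of problems found in one record (empty list if none)."""
    problems = []
    if not -90.0 <= obs.latitude <= 90.0:
        problems.append("latitude out of range")
    if not -180.0 <= obs.longitude <= 180.0:
        problems.append("longitude out of range")
    alt = obs.active_layer_thickness_cm
    if alt is not None and not 0.0 < alt < NEAR_SURFACE_LIMIT_CM:
        problems.append("active layer thickness outside near-surface range")
    return problems


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    example = PermafrostObservation(
        latitude=64.7, longitude=-148.3, method="frost probe",
        permafrost_present=True, active_layer_thickness_cm=62.0,
    )
    print(validate(example))
```

Returning a list of problems rather than raising on the first one is a design choice that suits a community-contributed repository, since a submitter can see every issue with a record at once.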