Core Concepts for Beginners
If you are reading this documentation, there is a good chance that you are developing/evaluating an earthquake forecast or implementing an experiment at a CSEP testing center. This section will help you understand how we conceptualize forecasts, evaluations, and earthquake catalogs. These components make up the majority of the PyCSEP package. We also include some prewritten visualizations along with some utilities that might be useful in your work.
Catalogs
Earthquake catalogs are fundamental to both forecasts and evaluations and make up a core component of the PyCSEP package. At some point you will be working with catalogs if you are evaluating earthquake forecasts.
One major difference between PyCSEP and a project like ObsPy is that typical ‘CSEP’ calculations
operate on an entire catalog at once to perform methods like filtering and binning that are required to evaluate an earthquake
forecast. We provide earthquake catalog classes that follow the interface defined by
AbstractBaseCatalog
.
The catalog data are stored internally as a structured Numpy array which effectively treats events contiguously in memory like a c-style struct. This allows us to accelerate calculations using the vectorized operations provided by Numpy. The necessary attributes for an event to be used in an evaluation are the spatial location (lat, lon), magnitude, and origin time. Additionally, depth and other identifying characteristics can be used. The default storage format for an earthquake catalog is an ASCII/utf-8 text file with events stored in CSV format.
The AbstractBaseCatalog
can be extended to accommodate different catalog formats
or input and output routines. For example UCERF3Catalog
extends this class to deal
with the big-endian storage routine from the UCERF3-ETAS forecasting model. More
information will be included in the Catalogs section of the documentation.
Forecasts
PyCSEP provides objects for interacting with earthquake forecasts. PyCSEP supports two types of earthquake forecasts, and provides separate objects for interacting with both. The forecasts share similar characteristics, but, conceptually, they should be treated differently because they require different types of evaluations.
Both time-independent and time-dependent forecasts are represented using the same PyCSEP forecast objects. Typically, for time-dependent forecasts, one would create separate forecast objects for each time period. As the name suggests, time-independent forecasts do not change with time.
Grid-based forecast
Grid-based earthquake forecasts are specified by the expected rate of earthquakes within discrete, independent space-time-magnitude bins. Within each bin, the expected rate represents the parameter of a Poisson distribution. For, details about the forecast objects visit the Forecasts section of the documentation.
The forecast object contains three main components: (1) the expected earthquake rates, (2) the
spatial region
associated with the rates, and (3) the magnitude range
associated with the expected rates. The spatial bins are usually discretized according to the geographical coordinates
latitude and longitude with most previous CSEP spatial regions defining a spatial size of 0.1° x 0.1°. Magnitude bins are
also discretized similarly, with 0.1 magnitude units being a standard choice. PyCSEP does not enforce constraints on the
bin-sizes for both space and magnitude, but the discretion must be regular.
Catalog-based forecast
Catalog-based forecasts are specified by families of synthetic earthquake catalogs that are generated through simulation by probabilistic models. Each catalog represents a stochastic representation of seismicity consistent with the forecasting model. Probabilistic statements are made by computing statistics (usually by counting) within the family of synthetic catalogs, which can be as simple as counted the number of events in each catalog. These statistics represent the full-distribution of outcomes as specified by the forecasting models, thereby allowing for more direct assessments of the models that produce them.
Within PyCSEP catalog forecasts are effectively lists of earthquake catalogs, no different than those obtained from authoritative sources. Thus, any operation that can be performed on an observed earthquake catalog can be performed on a synthetic catalog from a catalog-based forecast.
It can be useful to count the numbers of forecasted earthquakes within discrete space-time bins (like those used for
grid-based forecasts). Therefore, it’s common to have a spatial region
and
set of magnitude bins associated with a forecast. Again, the only rules that PyCSEP enforces are that the space-magnitude
regions are regularly discretized.
Evaluations
PyCSEP provides implementations of statistical tests used to evaluate both grid-based and catalog-based earthquake forecasts. The former use parametric evaluations based on Poisson likelihood functions, while the latter use so-called ‘likelihood-free’ evaluations that are computed from empirical distributions provided by the forecasts. Details on the specific implementation of the evaluations will be provided in the Evaluations section.
Every evaluation can be different, but in general, the evaluations need the following information:
Earthquake forecast(s)
Spatial region
Magnitude range
Authoritative earthquake catalog
PyCSEP does not produce earthquake forecasts, but provides the ability to represent them using internal data models to facilitate their evaluation. General advice on how to administer the statistical tests will be provided in the Evaluations section.