Forecasts
pyCSEP supports two types of earthquake forecasts that can be evaluated using the tools provided in this package.
Grid-based forecasts
Catalog-based forecasts
These forecast types and the pyCSEP objects used to represent them will be explained in detail in this document.
Gridded forecasts
Grid-based forecasts assume that earthquakes occur in independent and discrete space-time-magnitude bins. The occurrence of these earthquakes are described only by their expected rates. This forecast format provides a general representation of seismicity that can accommodate forecasts without explicit likelihood functions, such as those created using smoothed seismicity models. Gridded forecasts can also be produced using simulation-based approaches like epidemic-type aftershock sequence models.
Currently, pyCSEP offers support for two types of grid-baesd forecasts, i.e. conventional gridded forecasts and quadtree-based gridded forecasts. Conventional grid-based forecasts define their spatial component using a 2D Cartesian (rectangular) grid, and their magnitude bins using a 1D Cartesian (rectangular) grid. The last bin (largest magnitude) bin is assumed to continue until infinity. Forecasts use latitude and longitude to define the bin edge of the spatial grid. Typical values for the are 0.1° x 0.1° (lat x lon) and 0.1 ΔMw units. These choices are not strictly enforced and can defined according the specifications of an experiment.
pyCSEP aso offers support to handle forecast using quadtree approach. Single or multi-resolution spatial grid can be generated based on the choice of modelers. Then that grid can used for generating earthquake forecast.
Working with conventional gridded forecasts
PyCSEP provides the GriddedForecast
class to handle working with
grid-based forecasts. Please see visit this example for an end-to-end tutorial on
how to evaluate a grid-based earthquake forecast.
Class to represent grid-based forecasts |
Default file format
The default file format of a gridded-forecast is a tab delimited ASCII file with the following columns (names are not included):
LON_0 LON_1 LAT_0 LAT_1 DEPTH_0 DEPTH_1 MAG_0 MAG_1 RATE FLAG
-125.4 -125.3 40.1 40.2 0.0 30.0 4.95 5.05 5.8499099999999998e-04 1
Each row represents a single space-magnitude bin and the entire forecast file contains the rate for a specified time-horizon. An example of a gridded forecast for the RELM testing region can be found here.
The coordinates (LON, LAT, DEPTH, MAG) describe the independent space-magnitude region of the forecast. The lower coordinates are inclusive and the upper coordinates are exclusive. Rates are incremental within the magnitude range defined by [MAG_0, MAG_1). The FLAG is a legacy value from CSEP testing centers that indicates whether a spatial cell should be considered by the forecast. Currently, the implementation does not allow for individual space-magnitude cells to be flagged. Thus, if a spatial cell is flagged then all corresponding magnitude cells are flagged.
Note
PyCSEP only supports regions that have a thickness of one layer. In the future, we plan to support more complex regions including those that are defined using multiple depth regions. Multiple depth layers can be collapsed into a single layer by summing. This operations does reduce the resolution of the forecast.
Custom file format
The GriddedForecast.from_custom
method allows you to provide
a function that can read custom formats. This can be helpful, because writing this function might be required to convert
the forecast into the appropriate format in the first place. This function has no requirements except that it returns the
expected data.
- classmethod GriddedForecast.from_custom(func, func_args=(), **kwargs)[source]
Creates MarkedGriddedDataSet class from custom parsing function.
- Parameters:
- Returns:
forecast object
- Return type:
Note
The loader function func needs to return a tuple that contains (data, region, magnitudes). data is a
numpy.ndarray
, region is aCartesianGrid2D
, and magnitudes are anumpy.ndarray
consisting of the magnitude bin edges. See the functionload_ascii
for an example.
Working with quadtree-gridded forecasts
The same forecast GriddedForecast
class also handles forecasts with
quadtree grids. Please see visit this example for an end-to-end tutorial on
how to evaluate a grid-based earthquake forecast.
Class to represent grid-based forecasts |
Default file format
The default file format of a quadtree gridded-forecast is also a tab delimited ASCII file with the following columns. Just one additional column is added to the file format, i.e. quadkey to identify the spatial cells. If quadkeys for each spatial cell are known, it is enough to compute lon/lat bounds. However, lon/lat bounds are still kept in the default format to make it look consistent with conventional forecast format.
(names are not included):
QUADKEY LON_0 LON_1 LAT_0 LAT_1 DEPTH_0 DEPTH_1 MAG_0 MAG_1 RATE FLAG
'01001' -125.4 -125.3 40.1 40.2 0.0 30.0 4.95 5.05 5.8499099999999998e-04 1
Each row represents a single space-magnitude bin and the entire forecast file contains the rate for a specified time-horizon.
The coordinates (LON, LAT, DEPTH, MAG) describe the independent space-magnitude region of the forecast. The lower coordinates are inclusive and the upper coordinates are exclusive. Rates are incremental within the magnitude range defined by [MAG_0, MAG_1). The FLAG is a legacy value from CSEP testing centers that indicates whether a spatial cell should be considered by the forecast. Please note that flagged functionality is not yet included for quadtree-gridded forecasts.
PyCSEP offers the load_quadtree_forecast
function to read quadtree forecast in default format.
Similarly, custom forecast can be defined and read into pyCSEP as explained for conventional gridded forecast.
Catalog-based forecasts
Catalog-based earthquake forecasts are issued as collections of synthetic earthquake catalogs. Every synthetic catalog represents a realization of the forecast that is representative the uncertainty present in the model that generated the forecast. Unlike grid-based forecasts, catalog-based forecasts retain the space-magnitude dependency of the events they are trying to model. A grid-based forecast can be easily computed from a catalog-based forecast by assuming a space-magnitude region and counting events within each bin from each catalog in the forecast. There can be issues with under sampling, especially for larger magnitude events.
Working with catalog-based forecasts
Catalog based forecast defined as a family of stochastic event sets. |
Please see visit this example for an end-to-end tutorial on how to evaluate a catalog-based earthquake forecast. An example of a catalog-based forecast stored in the default pyCSEP format can be found here.
The standard format for catalog-based forecasts a comma separated value ASCII format. This format was chosen to be human-readable and easy to implement in all programming languages. Information about the format is shown below.
Note
Custom formats can be supported by writing a custom function or sub-classing the AbstractBaseCatalog.
The event format matches the follow specfication:
LON, LAT, MAG, ORIGIN_TIME, DEPTH, CATALOG_ID, EVENT_ID
-125.4, 40.1, 3.96, 1992-01-05T0:40:3.1, 8, 0, 0
Each row in the catalog corresponds to an event. The catalogs are expected to be placed into the same file and are differentiated through their catalog_id. Catalogs with no events can be handled in a couple different ways intended to save storage.
The events within a catalog should be sorted in time, and the catalog_id should be increasing sequentially. Breaks in the catalog_id are interpreted as missing catalogs.
The following two examples show how you represent a forecast with 5 catalogs each containing zero events.
1. Including all events (verbose)
LON, LAT, MAG, ORIGIN_TIME, DEPTH, CATALOG_ID, EVENT_ID
,,,,,0,
,,,,,1,
,,,,,2,
,,,,,3,
,,,,,4,
2. Short-hand
LON, LAT, MAG, ORIGIN_TIME, DEPTH, CATALOG_ID, EVENT_ID
,,,,,4,
The following three example show how you could represent a forecast with 5 catalogs. Four of the catalogs contain zero events and one catalog contains one event.
3. Including all events (verbose)
LON, LAT, MAG, ORIGIN_TIME, DEPTH, CATALOG_ID, EVENT_ID
,,,,,0,
,,,,,1,
,,,,,2,
,,,,,3,
-125.4, 40.1, 3.96, 1992-01-05T0:40:3.1, 8, 4, 0
4. Short-hand
LON, LAT, MAG, ORIGIN_TIME, DEPTH, CATALOG_ID, EVENT_ID
-125.4, 40.1, 3.96, 1992-01-05T0:40:3.1, 8, 4, 0
The simplest way to orient the file follow (3) in the case where some catalogs contain zero events. The zero oriented catalog_id should be assigned to correspond with the total number of catalogs in the forecast. In the case where every catalog contains zero forecasted events, you would specify the forecasting using (2). The catalog_id should be assigned to correspond with the total number of catalogs in the forecast.