PyCSEP supports two types of earthquake forecasts that can be evaluated using the tools provided in this package.
These forecast types and the PyCSEP objects used to represent them will be explained in detail in this document.
Table of Contents
Grid-based forecasts assume that earthquakes occur in independent and discrete space-time-magnitude bins. The occurrence of these earthquakes are described only by their expected rates. This forecast format provides a general representation of seismicity that can accommodate forecasts without explicit likelihood functions, such as those created using smoothed seismicity models. Gridded forecasts can also be produced using simulation-based approaches like epidemic-type aftershock sequence models.
Currently, grid-based forecasts define their spatial component using a 2D Cartesian (rectangular) grid, and their magnitude bins using a 1D Cartesian (rectangular) grid. The last bin (largest magnitude) bin is assumed to continue until infinity. Forecasts use latitude and longitude to define the bin edge of the spatial grid. Typical values for the are 0.1° x 0.1° (lat x lon) and 0.1 ΔMw units. These choices are not strictly enforced and can defined according the specifications of an experiment.
Class to represent grid-based forecasts
Default file format
The default file format of a gridded-forecast is a tab delimited ASCII file with the following columns (names are not included):
LON_0 LON_1 LAT_0 LAT_1 DEPTH_0 DEPTH_1 MAG_0 MAG_1 RATE FLAG -125.4 -125.3 40.1 40.2 0.0 30.0 4.95 5.05 5.8499099999999998e-04 1
Each row represents a single space-magnitude bin and the entire forecast file contains the rate for a specified time-horizon. An example of a gridded forecast for the RELM testing region can be found here.
The coordinates (LON, LAT, DEPTH, MAG) describe the independent space-magnitude region of the forecast. The lower coordinates are inclusive and the upper coordinates are exclusive. Rates are incremental within the magnitude range defined by [MAG_0, MAG_1). The FLAG is a legacy value from CSEP testing centers that indicates whether a spatial cell should be considered by the forecast. Currently, the implementation does not allow for individual space-magnitude cells to be flagged. Thus, if a spatial cell is flagged then all corresponding magnitude cells are flagged.
PyCSEP only supports regions that have a thickness of one layer. In the future, we plan to support more complex regions including those that are defined using multiple depth regions. Multiple depth layers can be collapsed into a single layer by summing. This operations does reduce the resolution of the forecast.
Custom file format
GriddedForecast.from_custom method allows you to provide
a function that can read custom formats. This can be helpful, because writing this function might be required to convert
the forecast into the appropriate format in the first place. This function has no requirements except that it returns the
- classmethod GriddedForecast.from_custom(func, func_args=(), **kwargs)
Creates MarkedGriddedDataSet class from custom parsing function.
Catalog-based earthquake forecasts are issued as collections of synthetic earthquake catalogs. Every synthetic catalog represents a realization of the forecast that is representative the uncertainty present in the model that generated the forecast. Unlike grid-based forecasts, catalog-based forecasts retain the space-magnitude dependency of the events they are trying to model. A grid-based forecast can be easily computed from a catalog-based forecast by assuming a space-magnitude region and counting events within each bin from each catalog in the forecast. There can be issues with under sampling, especially for larger magnitude events.
Catalog based forecast defined as a family of stochastic event sets.
Please see visit this example for an end-to-end tutorial on how to evaluate a catalog-based earthquake forecast. An example of a catalog-based forecast stored in the default PyCSEP format can be found `here<https://github.com/SCECcode/pycsep/blob/dev/csep/artifacts/ExampleForecasts/CatalogForecasts/ucerf3-landers_1992-06-28T11-57-34-14.csv>`_.
The standard format for catalog-based forecasts a comma separated value ASCII format. This format was chosen to be human-readable and easy to implement in all programming languages. Information about the format is shown below.
Custom formats can be supported by writing a custom function or sub-classing the AbstractBaseCatalog.
The event format matches the follow specfication:
LON, LAT, MAG, ORIGIN_TIME, DEPTH, CATALOG_ID, EVENT_ID -125.4, 40.1, 3.96, 1992-01-05T0:40:3.1, 8, 0, 0
Each row in the catalog corresponds to an event. The catalogs are expected to be placed into the same file and are differentiated through their catalog_id. Catalogs with no events can be handled in a couple different ways intended to save storage.
The events within a catalog should be sorted in time, and the catalog_id should be increasing sequentially. Breaks in the catalog_id are interpreted as missing catalogs.
The following two examples show how you represent a forecast with 5 catalogs each containing zero events.
1. Including all events (verbose)
LON, LAT, MAG, ORIGIN_TIME, DEPTH, CATALOG_ID, EVENT_ID ,,,,,0, ,,,,,1, ,,,,,2, ,,,,,3, ,,,,,4,
LON, LAT, MAG, ORIGIN_TIME, DEPTH, CATALOG_ID, EVENT_ID ,,,,,4,
The following three example show how you could represent a forecast with 5 catalogs. Four of the catalogs contain zero events and one catalog contains one event.
3. Including all events (verbose)
LON, LAT, MAG, ORIGIN_TIME, DEPTH, CATALOG_ID, EVENT_ID ,,,,,0, ,,,,,1, ,,,,,2, ,,,,,3, -125.4, 40.1, 3.96, 1992-01-05T0:40:3.1, 8, 4, 0
LON, LAT, MAG, ORIGIN_TIME, DEPTH, CATALOG_ID, EVENT_ID -125.4, 40.1, 3.96, 1992-01-05T0:40:3.1, 8, 4, 0
The simplest way to orient the file follow (3) in the case where some catalogs contain zero events. The zero oriented catalog_id should be assigned to correspond with the total number of catalogs in the forecast. In the case where every catalog contains zero forecasted events, you would specify the forecasting using (2). The catalog_id should be assigned to correspond with the total number of catalogs in the forecast.