Regions

PyCSEP includes commonly used CSEP testing regions and classes that facilitate working with gridded data sets. This module is early in development and will be a focus of future development.

Practically speaking, earthquake forecasts, especially time-dependent forecasts, treat time differently than space and magnitude. If we consider a family of monthly forecasts for the state of California for earthquakes with M 3.95+, each of these forecasts would use the same space-magnitude region, even though the time periods are different. Because the time horizon is an implicit property of the forecast, we do not explicitly consider time in the region objects provided by pyCSEP. This module contains tools for working with gridded regions in both space and magnitude.

First, we will describe how the spatial regions are handled. Followed by magnitude regions, and how these two aspects interact with one another.

Currently, pyCSEP provides two different kinds of spatial gridding approaches to handle binning catalogs and defining regions for earthquake forecasting evaluations, i.e. CartesianGrid2D and QuadtreeGrid2D. The fruther details about spatial grids are given below.

Cartesian grid

This section contains information about using 2D cartesian grids.

CartesianGrid2D(polygons, dh[, name, mask])

Represents a 2D cartesian gridded region.

Note

We are planning to do some improvements to this module and to expand its capabilities. For example, we would like to handle non-regular grids such as a quad-tree. Also, a single Polygon should be able to act as the spatial component of the region. These additions will make this toolkit more useful for crafting bespoke experiments and for general catalog analysis. Feature requests are always welcome!

The CartesianGrid2D acts as a data structure that can associate a spatial location (eg., lon and lat) with its corresponding spatial bin. This class is optimized to work with regular grids, although they do not have to be complete (they can have holes) and they do not have to be rectangular (each row / column can have a different starting coordinate).

The CartesianGrid2D maintains a list of Polygon objects that represent the individual spatial bins from the overall region. The origin of each polygon is considered to be the lower-left corner (the minimum latitude and minimum longitude).

CartesianGrid2D.num_nodes

Number of polygons in region

CartesianGrid2D.get_index_of(lons, lats)

Returns the index of lons, lats in self.polygons

CartesianGrid2D.get_location_of(indices)

Returns the polygon associated with the index idx.

CartesianGrid2D.get_masked(lons, lats)

Returns bool array lons and lats are not included in the spatial region.

CartesianGrid2D.get_cartesian(data)

Returns 2d ndrray representation of the data set, corresponding to the bounding box.

CartesianGrid2D.get_bbox()

Returns rectangular bounding box around region.

CartesianGrid2D.midpoints()

Returns midpoints of rectangular polygons in region

CartesianGrid2D.origins()

Returns origins of rectangular polygons in region

CartesianGrid2D.from_origins(origins[, dh, ...])

Creates instance of class from 2d numpy.array of lon/lat origins.

Creating spatial regions

Here, we describe how the class works starting with the class constructors.

@classmethod
def from_origins(cls, origins, dh=None, magnitudes=None, name=None):
    """ Convenience function to create CartesianGrid2D from list of polygon origins """

For most applications, using the from_origins function will be the easiest way to create a new spatial region. The method accepts a 2D numpy.ndarray containing the x (lon) and y (lat) origins of the spatial bin polygons. These should be the complete set of origins. The function will attempt to compute the grid spacing by comparing the x and y values between adjacent origins. If this does not seem like a reliable approach for your region, you can explicitly provide the grid spacing (dh) to this method.

When a CartesianGrid2D is created the following steps occur:

  1. Compute the bounding box containing all polygons (2D array)

  2. Create a map between the index of the 2D bounding box and the list of polygons of the region.

  3. Store a boolean flag indicating whether a given cell in the 2D array is valid or not

Once these mapping have been created, we can now associate an arbitrary (lon, lat) point with a spatial cell using the mapping defined in (2). The get_index_of accepts a list of longitudes and latitudes and returns the index of the polygon they are associated with. For instance, this index can now be used to access a data value stored in another data structure.

Testing Regions

CSEP has defined testing regions that can be used for earthquake forecasting experiments. The following functions in the csep.core.regions module returns a CartesianGrid2D consistent with these regions.

california_relm_region([dh_scale, ...])

Returns class representing California testing region.

italy_csep_region([dh_scale, magnitudes, ...])

Returns class representing Italian testing region.

global_region([dh, name, magnitudes])

Creates a global region used for evaluating gridded forecasts on the global scale.

Region Utilities

PyCSEP also provides some utilities that can facilitate working with regions. As we expand this module, we will include functions to accommodate different use-cases.

magnitude_bins(start_magnitude, ...)

Returns array holding magnitude bin edges.

create_space_magnitude_region(region, magnitudes)

Simple wrapper to create space-magnitude region

parse_csep_template(xml_filename)

Reads CSEP XML template file and returns the lat/lon values for the forecast.

increase_grid_resolution(points, dh, factor)

Takes a set of origin points and returns a new set with higher grid resolution.

masked_region(region, polygon)

Build a new region based off the coordinates in the polygon.

generate_aftershock_region(mainshock_mw, ...)

Creates a spatial region around a given epicenter

Quadtree grid

We want to use gridded regions with less spatial cells and multi-resolutions grids for creating earthquake forecast models. We also want to test forecast models on different resolutions. But before we can do this, we need to have the capability to acquire such grids. There can be different possible options for creating multi-resolutions grids, such as voronoi cells or coarse grids, etc. The gridding approach needs to have certain properties before we choose it for CSEP experiments. We want an approach for gridding that is simple to implement, easy to understand and should come with intutive indexing. Most importantly, it should come with a coordinated mechanism for changing between different resolutions. It means that one can not simply choose to combine cells of its own choice and create a larger grid cell (low-resolution) and vice versa. This can potentially make the grid comparision process difficult. There must be a specific well-defined strategy to change between different resolutions of grids. We explored different gridding approaches and found quadtree to be a better solution for this task, despite a few drawbacks, such as quadtree does not work for global region beyond 85.05 degrees North and South.

The quadtree is a hierarchical tiling strategy for storing and indexing geospatial data. In start the global testing region is divided into 4 tiles, identified as ‘0’, ‘1’, ‘2’, ‘3’. Then each tile can be divided into four children tiles, until a final desired grid is acquired. Each tile is identified by a unique identifier called quadkey, which are ‘0’, ‘1’, ‘2’ or ‘3’. When a tile is divided further, the quadkey is also modified by appending the new identifier with the previous quadkey. Once a grid is acquired then we call each tile as a grid cell. The number of times a tile is divided is reffered as the zoom-level (L) and the length of quadkey denotes the number of times a tile has been divided. If a grid has same zoom-level for each tile, then it is referred as single-resolution grid.

A single-resolution grid is acquired at zoom-level L=5, provides 1024 spatial cells in whole globe. Increase in the value of L by one step leads to the increase in the number of grid cells by four times. Similary at L=11, the number of cells acquired in grid are 4.2 million approximately.

We can use quadtree in combination with any data to create a multi-resolution grid, in which the resolution is determined by the input data. In general quadtree can be used in combination with any type of input data. However, now we provide support of earthquake catalog to be used as input data for determining the grid-resolution. With time we intend to incorporate the support for other types of datasets, such as such as distance form the mainshock or rupture plance, etc.

Currently, for generating mult-resolution grids, we can choose two criteria to decide resolution, i.e. maximum number of earthquakes allowed per cell (Nmax) and maximum zoom-level (L) allowed for a cell. This means that only those cells (tiles) will be divided further into sub-cells that contain more earthquakes than Nmax and the cells will not be divided further after reaching L, even if number of earthquakes are more than Nmax. Thus, quadtree can provide high-resolution (smaller) grid cells in seismically active regions and a low-resolution (bigger) grid cells in seismically quiet regions. It offers earthquake forecast modellers the liberty of choosing a suitable spatial grid based on their choice.

This section contains information about using quadtree based grid.

QuadtreeGrid2D(polygons, quadkeys, bounds[, ...])

Respresents a 2D quadtree gridded region.

The QuadtreeGrid2D acts as a data structure that can associate a spatial location, identified by a quadkey (or lon and lat) with its corresponding spatial bin. This class allows to create a quadtree grid using three different methods, based on the choice of user. It also offers the conversion from quadtree cell to lon/lat bouds.

The QuadtreeGrid2D maintains a list of Polygon objects that represent the individual spatial bins from the overall region. The origin of each polygon is considered to be the lower-left corner (the minimum latitude and minimum longitude).

QuadtreeGrid2D.num_nodes

Number of polygons in region

QuadtreeGrid2D.get_cell_area()

Calls function geographical_area_from_bounds and computes area of each grid cell.

QuadtreeGrid2D.get_index_of(lons, lats)

Returns the index of lons, lats in self.polygons

QuadtreeGrid2D.get_location_of(indices)

Returns the polygon associated with the index idx.

QuadtreeGrid2D.get_bbox()

Returns rectangular bounding box around region.

QuadtreeGrid2D.midpoints()

Returns midpoints of rectangular polygons in region

QuadtreeGrid2D.origins()

Returns origins of rectangular polygons in region

QuadtreeGrid2D.save_quadtree(filename)

Saves the quadtree grid (quadkeys) in a text file

QuadtreeGrid2D.from_catalog(catalog, threshold)

Creates instance of class from 2d numpy.array of lon/lat of Catalog.

QuadtreeGrid2D.from_single_resolution(zoom)

Creates instance of class at single-resolution using provided zoom-level.

QuadtreeGrid2D.from_quadkeys(quadk[, ...])

Creates instance of class from available quadtree grid.

Creating spatial regions

Here, we describe how the class works starting with the class constructors and how users can create different types regions.

Multi-resolution grid based on earthquake catalog

Read a global earthquake catalog in CSEPCatalog format and use it to generate a multi-resolution quadtree-based grid constructors.

@classmethod
def from_catalog(cls, catalog, threshold, zoom=11, magnitudes=None, name=None):
    """ Convenience function to create a multi-resolution grid using earthquake catalog """

Single-resolution grid

Generate a single-resolution grid at the same zoom-level everywhere. This grid does not require a catalog. It only needs the zoom-level to determine the resolution of grid.:

@classmethod
def from_single_resolution(cls, zoom, magnitudes=None, name=None):
    """ Convenience function to create a single-resolution grid """

Grid loading from already created quadkeys

An already saved quadtree grid can also be loaded in the pyCSEP. Read the quadkeys and use the following function to instantiate the class.

@classmethod
def from_quadkeys(cls, quadk, magnitudes=None, name=None):
    """ Convenience function to create a grid using already generated quadkeys """

When a QuadtreeGrid2D is created the following steps occur:

  1. Compute the bounding box containing all polygons (2D array) corresponding to quadkeys

  2. Create a map between the index of the 2D bounding box and the list of polygons of the region.

Once these mapping have been created, we can now associate an arbitrary (lon, lat) point with a spatial cell using the mapping defined in (2). The get_index_of accepts a list of longitudes and latitudes and returns the index of the polygon they are associated with. For instance, this index can now be used to access a data value stored in another data structure.

Testing Regions

CSEP has defined testing regions that can be used for earthquake forecasting experiments. The above mentioned functions are used to create quadtree grids for global testing region. Once a grid However, a quadtree gridded region can be acquired for any geographical area and used for forecast generation and testing. For example, we have created a quadtree-gridded region at fixed zoom-level of 12 for California RELM testing region.

california_quadtree_region([magnitudes, name])

Returns object of QuadtreeGrid2D representing quadtree grid for California RELM testing region.