Track ARs at individual time steps to form tracks

The modified Hausdorff distance definition

After the "Detect AR appearances from THR output" step, assume we have detected

  • \(n\) ARs at time \(t\), and
  • \(m\) ARs at time \(t+1\).

In theory there are \(n \times m\) possible associations linking these two groups of ARs. Of course, not all of them are meaningful. The following rules are applied in the association process:

  1. The nearest neighbor link method: for any AR at time \(t\), the nearest AR at time \(t+1\) “wins” and is associated with it, subject to the conditions below.
  2. The inter-AR distance (H) must be \(\le 1200 \, km\).
  3. No merging or splitting is allowed: any AR at time \(t\) can be linked to only one AR at time \(t+1\), and likewise any AR at time \(t+1\) can be linked to only one AR at time \(t\).
  4. After all associations at a given time point have been created, any left-over AR at time \(t+1\) forms a track on its own and waits to be associated in the next iteration, between \(t+1\) and \(t+2\).
  5. Any track that does not get updated during the \(t\) to \(t+1\) step terminates. This assumes that no gap in the track is allowed.
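The rules above can be sketched as a small linking routine. This is only an illustration, not the ipart implementation: the function name `link_nearest` is hypothetical, the input is assumed to be a precomputed \(n \times m\) table of inter-AR distances in km, and conflicts are resolved greedily by accepting the shortest links first, which is one possible reading of the "nearest wins" rule.

```python
def link_nearest(dist, max_dist=1200.0):
    """Greedy one-to-one linking (illustrative sketch, not ipart code).

    dist[i][j] is the distance in km between AR i at time t and AR j at t+1.
    Returns (links, new_tracks, ended_tracks).
    """
    n = len(dist)
    m = len(dist[0]) if n else 0
    # Rule 2: keep only candidate pairs within the distance threshold,
    # sorted so that the closest pairs are considered first (rule 1).
    candidates = sorted(
        (dist[i][j], i, j)
        for i in range(n)
        for j in range(m)
        if dist[i][j] <= max_dist
    )
    links = {}     # index at t -> index at t+1
    taken = set()  # indices at t+1 that are already linked
    for _, i, j in candidates:
        # Rule 3: no merging or splitting, so each AR is used at most once.
        if i not in links and j not in taken:
            links[i] = j
            taken.add(j)
    new_tracks = [j for j in range(m) if j not in taken]    # rule 4
    ended_tracks = [i for i in range(n) if i not in links]  # rule 5
    return links, new_tracks, ended_tracks


# Two ARs at time t, three ARs at time t+1 (distances in km, made up).
example = [[300.0, 900.0, 2500.0],
           [400.0, 1500.0, 3000.0]]
links, new_tracks, ended_tracks = link_nearest(example)
print(links, new_tracks, ended_tracks)
```

In this made-up example both ARs at time \(t\) are closest to the first AR at \(t+1\); the closer one keeps the link, the other track terminates, and the two unlinked ARs at \(t+1\) start new tracks.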

The remaining important question is how to define that inter-AR distance (H). Here we adopt a modified Hausdorff distance definition:

\[H(A,B) \equiv \min \{ h_f(A,B), h_b(A,B) \}\]

where \(H(A, B)\) is the modified Hausdorff distance between A and B, \(h_f(A,B)\) is the forward Hausdorff distance from A to B, and \(h_b(A,B)\) is the backward Hausdorff distance from B to A. They are defined, respectively, as:

\[h_f(A, B) \equiv \operatorname*{max}_{a \in A} \{ \operatorname*{min}_{b \in B} \{ d_g(a,b) \} \}\]

namely, the largest of the great circle distances \(d_g(a,b)\) from each point in A to its closest point in B. The backward Hausdorff distance is:

\[h_b(A, B) \equiv \operatorname*{max}_{b \in B} \{ \operatorname*{min}_{a \in A} \{ d_g(a,b) \} \}\]

Note that in general \(h_f \neq h_b\). Unlike the standard definition of Hausdorff distance that takes the maximum of \(h_f\) and \(h_b\), we take the minimum of the two.
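As a minimal sketch of these definitions, the snippet below computes \(h_f\), \(h_b\) and \(H\) for two lists of (latitude, longitude) points in degrees. The function names, the haversine formula for \(d_g\) and the Earth radius of 6371 km are assumptions made for illustration; they are not taken from the ipart source.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(p, q):
    """Haversine great circle distance between (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # min() guards against rounding pushing h slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def directed_hausdorff(src, dst):
    """Largest distance from a point in src to its closest point in dst."""
    return max(min(great_circle_km(p, q) for q in dst) for p in src)


def modified_hausdorff(a, b):
    """Return (h_f, h_b, H), with H the minimum of the two directed distances."""
    h_f = directed_hausdorff(a, b)  # max over a of min over b
    h_b = directed_hausdorff(b, a)  # max over b of min over a
    return h_f, h_b, min(h_f, h_b)


# Made-up points: B contains every point of A plus one extra point.
A = [(30.0, 200.0), (35.0, 210.0)]
B = [(30.0, 200.0), (35.0, 210.0), (40.0, 220.0)]
h_f, h_b, H = modified_hausdorff(A, B)
print(h_f, h_b, H)
```

Because every point of A also appears in B, \(h_f\) is zero here while \(h_b\) is not, which illustrates that the two directed distances generally differ.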

The rationale behind this modification is that merging/splitting of ARs mostly happens in an end-to-end manner, during which a sudden increase/decrease in the length of the AR induces misalignment among the anchor points. Specifically, merging (splitting) tends to induce a large backward (forward) Hausdorff distance. Therefore \(\min \{ h_f(A,B), h_b(A,B) \}\) offers a more faithful description of the spatial closeness of ARs. For merging/splitting events in a side-to-side manner, this definition works just as well.
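A toy calculation may make the end-to-end merging case concrete. The setup is entirely made up: all anchor points are assumed to lie on a single meridian, so the great circle distance reduces to the latitude difference times about 111 km per degree (Earth radius assumed to be 6371 km). The AR at time \(t\) spans 20°N to 30°N, and at \(t+1\) it has merged with another AR to its north and spans 21°N to 45°N.

```python
import math

KM_PER_DEG = 6371.0 * math.pi / 180.0  # about 111.19 km per degree of latitude

# Latitudes (degrees north) of anchor points on a common meridian (made up).
ar_t = list(range(20, 31, 2))    # 20, 22, ..., 30: the AR at time t
ar_t1 = list(range(21, 46, 2))   # 21, 23, ..., 45: the merged AR at time t+1


def directed(src, dst):
    """Largest distance (km) from a point in src to its closest point in dst."""
    return max(min(abs(s - d) for d in dst) for s in src) * KM_PER_DEG


h_f = directed(ar_t, ar_t1)   # forward: about 111 km
h_b = directed(ar_t1, ar_t)   # backward: about 1668 km
print(round(h_f), round(h_b), round(min(h_f, h_b)))
```

With the 1200 km threshold, the standard Hausdorff distance (the maximum, about 1668 km) would reject this link, whereas the modified distance (the minimum, about 111 km) keeps it.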

Input data

This step takes as inputs the AR records detected at individual time steps as computed in Detect AR appearances from THR output.

The tracker parameters used:

# Int, hours, gap allowed to link 2 records. Should be the time resolution of
# the data.
TIME_GAP_ALLOW=6

# tracking scheme. 'simple': all tracks are simple paths.
# 'full': use the network scheme, tracks are connected by their joint points.
TRACK_SCHEME='simple'  # 'simple' | 'full'

# int, max Hausdorff distance in km to define a neighborhood relationship
MAX_DIST_ALLOW=1200  # km

# int, min duration in hrs to keep a track.
MIN_DURATION=24

# int, min number of non-relaxed records in a track to keep a track.
MIN_NONRELAX=1

# whether to plot linkage schematic plots or not
SCHEMATIC=True

Usage in Python scripts

The tracking process is handled by the AR_tracer.trackARs() function:

from ipart.AR_tracer import trackARs
from ipart.AR_tracer import readCSVRecord

ardf=readCSVRecord(RECORD_FILE)
track_list=trackARs(ardf, TIME_GAP_ALLOW, MAX_DIST_ALLOW,
    track_scheme=TRACK_SCHEME, isplot=SCHEMATIC, plot_dir=plot_dir)

where

  • RECORD_FILE is the path to the CSV file that stores the individual AR records. Refer to this notebook for more information on the creation of this file.
  • ardf is a pandas.DataFrame object containing the AR records at individual time points.
  • track_list is a list of AR objects, each storing a sequence of AR records that form a single track. The data attribute of the AR object is a pandas.DataFrame object, with the same columns as shown in AR records DataFrame.

After this, one can optionally filter the obtained tracks using AR_tracer.filterTracks() to remove, for instance, tracks that do not last for more than 24 hours:

from ipart.AR_tracer import filterTracks
track_list=filterTracks(track_list, MIN_DURATION, MIN_NONRELAX)

Example output

The resultant AR track can be visualized using the following snippet:

import numpy as np
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
from ipart.utils import plot

latax=np.arange(0, 90)
lonax=np.arange(80, 440)  # degree east, shifted by 80 to ensure monotonically increasing axis

plot_ar=track_list[6]  # plot the 7th track in list

figure=plt.figure(figsize=(12,6),dpi=100)
ax=figure.add_subplot(111, projection=ccrs.PlateCarree())
plot.plotARTrack(plot_ar,latax,lonax,ax,full=True)

The output figure looks like Fig. 7 below.

_images/ar_track_198424.png

Fig. 7 Locations of a track labelled “198424” found in year 1984. Black to yellow color scheme indicates the evolution.

Dedicated Python script

You can use the scripts/trace_ARs.py script to run the AR tracking process in production.

Note

Unlike the AR occurrence detection process, this tracking process is time-dependent and therefore cannot be parallelized. Also, if you divide the detection process into batches, e.g. one for each year, you may want to combine the output CSV records into one big data file and perform the tracking on this combined file. This prevents a track that lasts from the end of one year into the next from being broken into two.
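One way to combine per-batch record files is sketched below using only the Python standard library. The function name `combine_csv_records` and the file names in the usage comment are hypothetical, and the sketch assumes that all input files share exactly the same header row and are passed in chronological order. Rows are copied verbatim; no column is modified or re-indexed.

```python
import csv


def combine_csv_records(in_files, out_file):
    """Concatenate CSV files that share one header; return the row count."""
    header = None
    n_rows = 0
    with open(out_file, "w", newline="") as fout:
        writer = csv.writer(fout)
        for path in in_files:
            with open(path, newline="") as fin:
                reader = csv.reader(fin)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip completely empty files
                if header is None:
                    header = file_header
                    writer.writerow(header)  # write the header only once
                elif file_header != header:
                    raise ValueError("header mismatch in %s" % path)
                for row in reader:
                    writer.writerow(row)
                    n_rows += 1
    return n_rows


# Hypothetical usage, one record file per year:
# combine_csv_records(["records_1984.csv", "records_1985.csv"], "records_all.csv")
```

The combined file can then be read with readCSVRecord() and passed to trackARs() as usual.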

Notebook example

An example of this process is given in this notebook.