How SC_IN_DBMS Works from A High Level of View (based on Alg1)

Step 1: Load Links

Each tuple (row) in the DT_RDWY_LINK table of WisTransportal STN database is loaded into the memory as a Link object (hereafter called a link for short) in the program. A link has the following properties (as shown in Figure 1):

link ID,
link length (in ft),
valid period (the time span during which the link is current in the database),
from-reference-site, and
to-reference-site.

Figure 1 A typical Link object.

Additionally, in the second step, all crashes occurred on the same link will form a list and be added to that link.

Step 2: Load Crashes

Each tuple (row) in a crash table of WisTransportal is loaded into the memory as a Crash object (hereafter referred to as a crash). Only crashes within a user-specified analysis period are loaded. A crash has the following properties (as shown in Figure 2):

accident number,
accident time (date and time),
link ID of the link on which the crash happened, and
offset (in ft) from the from-reference-site of the crash containing link to the location of the crash.

Figure 2 A typical Crash object.

Crashes are not only loaded in a global list, but also added to the crash list of its containing link. So each link will have a list of crashes that happened in it. This list can be empty if there was not crash happening on this link during the user-specified analysis period.

Step 3: Construct A Site-to-Site (STS) Route Matrix

A STS route is a a sequence of connected links connecting one reference site (RS) to another. By saying connected links, no enforcement is made on following the directions of links. Two links are treated as connected if their from-reference-sites are the same, their to-reference-sites are the same, or one's from-reference-site is the other's to-reference-site. Since no direction is enforced, a STS route is different from some traditional notion of route. The reason why direction is not concerned for now is that only the route length will be used in future calculation. Although no direction is used, link distinction is enforced. In other words, a link cannot appear more than once in a STS route. Also, a STS route can be 0 ft long if its corresponding start RS and end RS are identical. Figure 3 shows some examples of STS routes. Each STS route has the following properties:

the list of connected links (or empty),
total length (in ft), which is the sum of all connected links' lengths (or 0 if the link list is empty), and
the valid period, which is the intersection of all connected links' valid periods (or the analysis period if the link list is empty).

Figure 3 Illustration of STS routes.

This step establishes a matrix of STS routes. The conceptual image of the matrix is illustrated in Figure 4. Each cell(i,j) contains a list of STS routes from RS_i to RS_j. There can be many STS routes from RS_i to RS_j, but only some of them are included in cell(i,j). The conditions for a STS route to be included are:

The length of the route ≤ (the user-specified distance filter + 2*the maximum length among all links)
There is no anther route that covers the current route in terms of valid period while has a shorter length than the current route does

A route will be excluded (removed) from the list if a new route causes it to violate the second condition above. The exploration of routes is similar to the Dijkstra's shortest path algorithm.
To save computing time and memory space, not all cells are calculated or stored. First, cell(i,j) is calculated only when RS_i is the from-reference-site or to-reference-site of a link that contains at least one crash. Second, cell(i,j) is only stored when it contains at least one STS route that satisfies the above two conditions.

	RS_1	RS_2	...	RS_i	...	RS_N
RS_1	cell(1,1)	cell(1,2)	...	cell(1,i)	...	cell(1,N)
RS_2	cell(2,1)	cell(2,2)	...	cell(2,i)	...	cell(2,N)
:	:	:	+	:	+	:
RS_j	cell(j,1)	cell(j,2)	...	cell(j,i)	...	cell(j,N)
:	:	:	+	:	+	:
RS_N	cell(N,1)	cell(N,2)	...	cell(N,i)	...	cell(N,N)

Figure 4 A STS route matrix.

Step 4: Identification

The following sub steps are performed on each link with at least one crash. For each of such links, we call it a base link.

Step 4.1: Find all candidate links

A candidate link is defined as a link any of whose crashes might be within the distance filter from any of the crashes on the base link. A candidate link is identified in the following way: Assume the from- and to- reference-sites of the base link (B) are RS_bi and RS_bj. Given a link L, with from- and to- reference-sites being RS_li and RS_lj, L is a candidate link of B iff cell(bi,li), cell(bi,lj), cell(bj,li), and cell(bj,lj) have been calculated and stored in the STS route matrix during Step 3.

Step 4.2: Calculate

For each crash prim in base link B and each crash scnd in a candidate link L, we determine whether scnd is a potential secondary crash of prim in two stages:

If scnd happened within the time filter after prim, process to the next stage. Otherwise, exclude scnd.
Calculate the distance between crash prim and crash scnd using the UMD algorithm. A UMD between RS_i and RS_j is the length of the shortest STS route between RS_i and RS_j, which was also valid when both prim and scnd occurred. If the calculated distance between prim and scnd is negative (upstream) and smaller than the distance filter in terms of its absolute value, scnd is considered a potential secondary crash of prim.