Fault Traces: Generation of Fault Segments and Estimation of Their Fractal Dimension

Fault damage zones have a higher upscaled permeability than the host rock because of a higher fracture intensity therein. Fracture distribution in the damage zone depends highly on the geometry of fault segments. However, precise images of architectural elements of large-scale faults at depth are di ﬃ cult to obtain by seismic acquisition and imaging techniques. We present a numerical method that generates fault segments at multiple scales from an imprecise fault trace based on the fractal properties of these segments. The generated fault segments demonstrate hierarchical self-similar architecture, and their lengths follow approximately a lognormal distribution. These characteristics are similar to real fault segments observed in outcrops and seismic surveys. An algorithm that covers fault segments accurately with the minimum number of circles is proposed to calculate the fractal dimensions for both natural and computer-generated faults. The fractal dimensions of natural and generated fault segments are similar and range between 1.2 and 1.4.


Introduction
Advancement in the understanding of slip along and fluid flow across faults requires a comprehensive knowledge of fault segments and their spatial distribution along the fault zone. Segmentation of different types of fault systems (strike-slip, normal, and reverse faults) plays an important role in finding the predominant locations and fractal dimension of fault splays [1][2][3]. These splays control properties of the damage zone around the parent fault and the geometrical and statistical properties of second-order fault networks around this fault [1,[4][5][6]. In general, the redistributions of slip and stress along fault systems and the resulting spatial distribution of damage zone are highly impacted by the fault architecture [1,4,7,8]. Stepovers between segments control the development of fault core because of the high strain levels therein [1]. Fault core distribution is critical for fluid flow in the subsurface because, in general, damage zones have a higher upscaled permeability than the host rock, whereas fault cores usually have a lower permeability [1,[9][10][11][12].
Given that fault segments are components of the main fault, they are parallel to subparallel to the main fault and are described by echelon geometry. Faults are made up of many segments in both vertical and lateral dimensions [13]. The segments develop by partitioning a fault into numerous fault branches that appear to be unlinked and separated on a horizon of interest, whether on a seismic time slice or as fault traces on an outcrop [14][15][16][17][18][19][20][21]. The discontinuous and segmented fault architecture has also been described by high-resolution seismic surveys [22][23][24][25][26][27][28].
It is important to note the segmentation of a growing fault is opposed by the newer continuous links of the small segments that form a larger fault segment. This phenomenon is called the growth of faults by linkage and coalescence [7]. Interacting fault segments link as fault displacement grows [14][15][16]29]. Linkage occurs when the segments overlap during a substantial fault interaction through their associated stress fields [30][31][32]. Such linkage and coalescence in a wide range of scales (mm-km) are observed from experiments [33] and outcrops [7,34] and also simulated numerically [35,36].
Through linkage and coalescence, small segments merge and form large segments. However, fault segments at a fine scale are still important, because damage zones show similar geometries across a wide range of scales (mm-km) [37,38], and their spatial distribution firmly depends on the segment geometries. Only considering damage zones on a large scale underestimates complex structures on a fine scale and neglects their impact on the fluid flow [39][40][41]. Precise images of the architectural elements of faults at depth are difficult to obtain by seismic acquisition and imaging techniques. Hence, seismic interpreters typically describe faults as single plane elements and fail to introduce the off-plane, closely spaced segments [1]. Therefore, the geometry of fault segments at depth is uncertain in general.
In this paper, we propose a method to generate detailed fault segments with only the trace length from an imprecise seismic map. Furthermore, we validate the correctness of generated fault segments from a fractal dimension perspective. We propose a modified algorithm to accurately calculate the fractal dimension of both natural fault segments observed from real seismic data or outcrops and generated fault segments. The generated fault segments are the premise for modelling associated damage zone around each fault segment and evaluating their impacts on the subsurface fluid flow.

Materials and Methods
This section introduces a method to generate fault segments from an imprecise seismic map and a modified algorithm to calculate their fractal dimensions.

A Method to Generate Fault Segments.
Mandelbrot [42] proposed the concept of fractals, which can be used to characterize irregular sets, regardless of the scale at which these sets are examined. Fractal dimension is a measure of a set's irregularity or complexity. Natural fault systems have fractal or multifractal characteristics [23,[43][44][45][46][47][48]. Different types of faults are discontinuous and are composed of fault segments. Adjacent discrete segments step aside and overlap slightly to form an echelon fault geometry [22]. The echelon fault geometry can be found at different scales, shown in Figure 1. This phenomenon depicts the fractal characteristic or the hierarchical self-similar architecture of fault systems [1,33]. Furthermore, from experiments [33] and outcrop observations [33,37], such self-similarity shows a wide range of scales from millimeters to kilometers. There are always two to five segments in any given hierarchical rank, and three or four segments are most observed.
Using outcrop images, de Joussineau and Aydin [1] investigated segmentation along strike-slip faults and found correlations between different parameters of the fault segments. These parameters are illustrated in Figure 2. According to the study of de Joussineau and Aydin [1], there exist power law relationships between the (1) Mean segment length ( L se ) and the maximum fault offset (S) L se = 2:16S 0:89 R 2 = 0:87 À Á ð1Þ (2) Mean step length ( L st ) and the maximum fault offset (S) L st = 0:30S 0:77 R 2 = 0:84 À Á ð2Þ (3) Step length x and step width y x = 2:69y 0:97 R 2 = 0:93 Generation of fine-scale fault segments from an imprecise seismic map is an inverse problem. The fault segments generated from a highly underdetermined inverse problem are nonunique. Our method incorporates minimum information (the trace length from an imprecise seismic map) to generate a detailed structure of fault segments. If more information is available, such as the slip distribution, it can help to constrain the results and make them closer to reality. To implement the algorithm, we need to make the following assumptions: (1) Fault segments are self-similar (2) Subfault segments of a younger generation are independent of each other (3) The relationships found in the study of de Joussineau and Aydin [1] are valid at each generation step of the subfault segments If these assumptions are valid, the fault segments of a younger generation should be constrained by Equation (4).
x = 0:30S 0:77 , L se = 2:16S 0:89 , where the length of the imprecise fault trace from a seismic map, L; the number of segments in each generation, n se ; the mean segment length, L se = 1/n se ∑ n se i=1 l i (l i is the length of the subfault i); the mean step length, x = 1/ðn se − 1Þ∑ is the step length of the i th step); and the step width of the i th step, y i , are all defined in Figure 2. S is the maximum fault offset. We assume that the step length is proportional to its width with a factor of 2.7.
The fault trace length, L, in Equation (4) can be obtained from the fault map, and we can solve for three unknowns (S, x, and L se ). We assume that there are three or four segments in each generation with an equal probability of 0.5, respectively. To make the generated fault segments closer to reality, we include both overlapping and underlapping between 2 Lithosphere segments. The overlapping structure is dominant with a probability of 0.9. However, to obtain the length of each subfault l i , we need to assume a relationship between the length of each subfault and the mean segment length. We assume that the subfault lengths are either equal, l i = l j = L se , or that their length ratio is l i / L se = randð0:8, 1:2Þ (a random number uniformly distributed between 0.8 and 1.2). These assumptions are implemented for individual step length (x i ) and the mean step length ( x). The fault segments generated with both assumptions are shown in Figure 3.
To the best of our knowledge, there are no field data reporting equal lengths of fault segments. de Joussineau and Aydin [1], Davy [52] found that the fault segment lengths obey roughly a lognormal distribution. We assume that (i) for a given subfault generation, the length of a fault segment is independent of its neighbours, (ii) the set of subfault lengths for a younger generation is obtained from an older one through a recursive division, and (iii) the length of the older segment is approximately equal to the summation of each segment in the younger generation. The length of the n th generation   Step width, y 1 Step width, Step length, x 1 Step length, x 2 3 Lithosphere subfault is then calculated in Equation (5).
where X 1 , X 2 , ⋯X n are the independent random variables from a uniform probability distribution. If we take the logarithm on both sides, log ðL n Þ is the summation of n independent random variables log ðX i Þ. According to the Central Limit Theorem, log ðL n Þ should follow a normal distribution and, consequently, the subfault lengths, L n , should follow a lognormal distribution (example in Figure 4(a)). However, n should be large to make the Central Limit Theorem hold. Also, the standard deviation of L n decreases with an increase of n. The lognormal distribution of segment lengths is found in one generation of fault segments. If the fracture lengths are measured in a target domain with many generations of fractures, it is more likely to follow a power-law distribution [53] because of the universal principle for fractality of solids [54].
The self-similar fault segments can have many subdivisions until reaching the scale of microfractures. Damage zones also show similar geometries across a wide range of scales (mm-km) [37,38]. Therefore, we can set a constraint on the mean segment length (which is widely used in geology) to a desirable scale for a specific engineering application, e.g., for reservoir engineering, a scale of hundreds of meters will be sufficient. The damage zones distributed at the relevant scale play an essential role in the corresponding fluid flow problem.
We assume the length of a fault trace from a seismic map to be L = 10 km. Figure 5 shows different generations of fault segments. They have the mean segment length of 1117 m, 353 m, and 116 m, respectively. For a desired value of the mean segment length, we repeat the simulation until the desired L se value is reached within a given tolerance. Real fault segments are not perfectly parallel. We can add randomness in the orientations of the segments and make the   Lithosphere geometry closer to reality. Figure 6 shows the tilted fault segments with the mean segment length of 1223 m, 399 m, and 136 m, respectively. Both parallel and tilted fault segments may have crossovers (Figure 7(a)) or intersections (Figure 7(b)) during the subdivision. It is not physically meaningful to have such patterns during the formation of fault segments, and it is also inappropriate to continue the subdivision with crossovers and intersections. Therefore, we implement an algorithm to merge the close and intersecting segments to prevent their    Figure 7. The merge algorithm naturally sets a limit on the mean length of fault segments, because the distance between subfault segments decreases with more subdivisions. Figure 4(b) shows a lognormal fit of the segment length before and after adding the merge procedure to the algorithm. The mean segment length is increased because of the merging. The segment lengths approximately follow the lognormal distribution as expected in Equation (5).

A Modified Algorithm to Calculate Fractal Dimension.
Fractal dimension is a measure of a set's complexity or its irregularity. Mandelbrot [42] defined a fractal as the following: "a set for which the Hausdorff-Besicovitch dimension strictly exceeds the topological dimension," where the topological dimension refers to the number of coordinates associated with each element in the set. The example chosen by Mandelbrot [42] deals with measuring the length of the coastline of Britain. In this example, the shorter the yardstick length, r, the longer the length of the coastline LðrÞ is, and L tends to increase without a limit. Mandelbrot [42] suggested that an empirical relationship between L and r is as follows: According to Mandelbrot [42], D in Equation (6) represents the fractal dimension of the coastline. The fractal dimension characterizes the complexity of the coastline, and a larger D reflects a higher complexity of the coastline.
Mandelbrot [42] also proposed other methods to calculate the fractal dimension of a set. One of the methods is to draw circles of a chosen radius r to cover the set completely using a minimum number N of such circles (to "cover a set" means that the set lies inside of the covering circles. A sketch map is shown in Figure 8(a)).
An example of the coastline of Scotland is shown in Figure 9. The image of Scotland has a size of 509 × 550 pixels, and the size of 1 pixel is about 1.6 km. In Figure 10(a), we plot the number of covering circles N versus circle radius r. They are plotted as a straight line on a double-logarithmic plot, with the slope, or the fractal dimension, D N = 1:24 (R 2 = 0:997). Figure 10(b) shows the dependence of the coastline length, LðrÞ = 2Nr, on the circle radius. It has a plateau for large radius values (>500 pixels), which can be related to skipping over the fine details of the coastline. For the radius values below 1 pixel, no finer image details can be resolved. The transitional part between the small and large radii can be fitted by a straight line, whose slope provides the fractal dimension. However, an arbitrary choice of data points for line fitting is subjective and may cause significant variations of the corresponding fractal dimension. Instead, we fit the L vs. r dependence with a scaled logistic function, Equation (7), and then, we find the derivative of this logistic function, Equation (8).
From the plot of the derivative, Figure 10(c), more accurate fractal dimension is determined as the maximum fractal dimension of the set, denoted as D L . Once we plot the tangent line with a slope of (1 − D L ) (the blue line in Figure 10(b)) and intersect the upper and lower bounds of the total length (green lines in Figure 10(b)), we get the upper and lower limits of the circle radii for the maximum fractal dimension.
Using this methodology, we find that the fractal dimension is 1.48 for the coastline of Scotland. The corresponding upper and lower limits on radius are 6 and 177 pixels, respectively. The two fractal dimensions, D N and D L , are different; however, D L reveals more information about the complexity of a given set and the applicable range of fractal characteristics, while D N averages over all data points.

Lithosphere
The circle-based method was adopted by Okubo and Aki [55] to calculate the fractal dimension of the San Andreas fault system. However, from their sketch map (Figure 2 in Okubo and Aki [55] or Figure 8(b)), we see that the circles do not cover their set completely, and the number of circles, NðrÞ, is underestimated. Therefore, we propose a modified algorithm and calculate the fractal dimension of parallel fault segments. The algorithm is accurate for parallel and vertically wellsorted fault segments. It can be extended to the case of subparallel and not well-sorted fault segments that are common in reality, but with lower accuracy. Figure 11 shows a set of parallel and well-sorted line segments to demonstrate the algorithm. The red circle is the circle used in the previous step to partially cover the line segments. We need to find the next circle and subsequent circles to cover the set with a minimum number of circles. The proposed algorithm minimizes the number of circles using the following two rules: (1) A single circle should partially cover as many line segments as possible, i.e., in Figure 11, a single circle should cover all n line segments (2) With the same number of line segments covered, the circle should cover as large a length of segment parts which lie inside that circle as possible, i.e., both the blue and green circles cover n line segments. But, we choose the green circle, because it covers a larger length, which is proved in the appendix In Figure 11, if the uppermost and lowermost line segments are covered completely, all other line segments in between will be covered automatically. Therefore, the only situation to be considered is a set composed of two line segments, with as many line segments in between as possible. Intuitively, the blue circle is the first option for covering these line segments, because it passes through the two intersection points, A and B. However, if we rotate the blue circle with respect to point B in a counter-clockwise direction, the coverage of line segment l n , BD, will decrease and the coverage of line segment l 1 , AC, will increase. If the rotation can make the covered length larger, the maximum length should emerge when point C is exactly on the line segment l 1 and BC is the diameter of the circle O 2 . At this critical point, the covered length is maximum, and beyond it, the covered lengths of both line segments l 1 and l n decrease. In the appendix, we demonstrate that the green circle (scenario 2) always covers a larger length than the blue circle (scenario 1). Figure 11 is a more general case. If AB is perpendicular to the line segments, the blue and green circle overlap with each other completely.
With the correct choice of the circle to cover the line segments, the pseudocode of our algorithm is shown in Algorithm 1. The complete program can be found online (doi:10 .4121/uuid:38a2ed3e-f456-429c-9fcd-704b392bc8a6).
In the initialization part of the algorithm, one rotates the set of line segments to make them parallel to the X-axis (please note that all segments are assumed to be parallel or subparallel). Afterwards, the start points (with smaller X -coordinates) of the line segments are sorted along the Y -direction and afterwards along the X-direction. The coverage of the line segments starts from the left-most position of the set and continues rightwards. The method used to find the first circle for each iteration is sketched in Figure 12. For a given radius r, (i) draw a circle with the left-most point, A, in l 1 as the center and 2 × r as the radius; (ii) find the intersection points on the right side of all intersecting line segments. If there is no other line segment (Figure 12(a)), use AB as the diameter of the first circle. Suppose now that there are more line segments that intersect the green circle, and the intersection points are B, B′, and B′′. Start from the farthest line segment (l 3 ), draw the red circle O, and calculate the covered length CB + AD. If the covered length is larger than or equal to 2 × r, circle O is the first circle (Figure 12(b)). Otherwise, find the next farthest line segment and repeat the procedure. This is also the method used to implement rule 1, which makes a single circle partially cover as many line segments as possible.
If the situation shown in Figure 13 happens where EC cannot be covered by the circle O 1 , we divide the line segment EB into two segments, EC and CB. Segment EC will be a new line segment mentioned in the pseudocode and will be covered in the next iteration. For the last circle in each iteration, circle O 2 in Figure 13, the line segments do not span the circle, and we count the number of circles with a fraction, the ratio between the maximum length covered (FN in Figure 13), and the diameter 2r instead of 1. By doing this, we can avoid an overestimation of the number of circles. To make the program more efficient and save computing    Lithosphere time, we also divide the domain into blocks, and only neighbouring segments that share the same blocks are checked for coverage in each iteration.

Results and Discussion
In this section, we implement the modified algorithm to calculate the fractal dimension of natural and generated fault segments.

Fractal Dimension of Real Fault
Maps. The discontinuous and segmented characteristics of faults can be analyzed using fault maps obtained from high-resolution seismic surveys or outcrops. In this paper, three fault maps are chosen to calculate the fractal dimensions using the proposed algorithm. The first one is the San Andreas fault system (Figure 1(b)). The northern part of the San Andreas fault system captures the geometric features that are common to all strike-slip faults with segmentations [22]. The fault segments that are mapped on the surface [50,56] are oriented NW-SE with the echelon patterns. A sketch map illustrating the process used to calculate the fractal dimension is shown in Figure 14  Considering the San Andreas fault, the fault length is plotted in Figure 15(a) versus the covering circle radii. This figure illustrates that the length of the fault and the circle radius have a linear relationship within a certain range on the log-log plot. After fitting with a scaled logistic function, we find that the fractal dimension is 1.29. The lower limit on the circle radius is about 66 meters, and the upper limit is about 226 meters. Below the lower limit, all details of fault segments are already captured due to the limited resolution of the map. Consequently, the total length of the fault does not change and yields a plateau in the figure. Above the upper limit, all details about the segmentation are ignored, and that makes the total length of the fault remain unchanged. In this case, the fault traces are relatively nonfragmented, simple, and nonfractal geometric sets. For the other two fault maps we chose, the results are shown in Figures 15(b) and 15(c). The fractal dimensions calculated from the fits are 1.29 and 1.34, respectively. The upper and lower limits for fault map 2 are 461 and 2597 meters, respectively. For fault map 3, the corresponding values are about 847 and 3054 meters. A detailed summary of the results for these three real fault maps is listed in Table 1.

Fractal Dimension of Generated Fault Segments.
In this section, the fractal dimensions of the fault segments  Figure 12: Two scenarios to find the first circle (red circle) in each iteration. Figure 13: A sketch map of the procedures used to cover the line segments. The coverage will start from the left-most position and continue rightwards. The left part, EC, will become a new segment and be covered in the next iteration. Circle O 2 will not be regarded as a whole circle but as a fraction of the circle (FN/2R). 9 Lithosphere generated with the algorithm described in Section 2.1 are calculated. We choose to generate fault segments with two segments in each generation, and no rotation of each segment is implemented because the fractal calculation is accurate for parallel and vertically well-sorted fault segments. The generated segments with L se =110, 311, and 1664 meters are shown in Figures 16(a)-16(c), respectively.
The calculated fractal dimension is shown in Figure 17 and summarized in Table 2. The fault length is a linear function of the circle radii within a certain range on a log-log plot. When the radius is small or large, the length of the fault is constant. The lower limit and upper limit of the radius yield an interval in which the fault segments show the fractal or hierarchical self-similar characteristic. The values of these limits are directly related to the resolution of the sample set. The upper and lower limits of the radius increase with an increase in the mean segment length. If we assume that there are two segments with equal length generated in each generation and the step length is k times of the segment length, we can have an exact fractal geometry with the fractal dimension: However, the nonuniform step length and width in the generated fault segment can change the fractal dimension, as well as the nonuniform segment length. Therefore, the fractal dimensions of the generated fault segments vary between 1.25 and 1.31. The fractal dimension is slightly smaller than the dimension of real fault segments, because generated fault segments are less complex in geometries than the real fault segments. If more complexities, such as more     11 Lithosphere structure of fault segments makes the modelling of associated damage zones around each fault segment possible, which is essential for subsurface fluid flow and will be discussed in future research.

Conclusions
In this paper, we have proposed a new method to generate fault segments from an imprecise fault trace obtained from a seismic survey or outcrop. The generated fault segments can be used to analyze the fault damage zone and the faultrelated flow problems. These segments have characteristics similar to those found in the natural fault segments observed from outcrops or seismic surveys. The generated segment lengths follow approximately a lognormal distribution. The segment geometry recovers the hierarchical self-similar architecture, which is consistent with the natural fault segments. We also propose a modified algorithm to calculate the fractal dimension of the fault segments. Our algorithm can cover the target set more accurately, and it improves on Okubo and Aki's method [1987]. The calculated fractal dimensions of the generated and real fault segments are similar. This finding validates our approach from a fractal dimension perspective.

Appendix Derivation of the Largest Covering Circle
Two cover scenarios are shown in Figures 18 and 19. The distance = h between the line segments l 1 and l 2 is known, the radius of the circle is R, and the length of O 1 M is unknown and denoted by a. Here, we prove that for a given h, no matter how a changes, scenario 2 will yield a longer covered length.
In Figure 18, the covered length is L 1 = AX + BY. It is simple to prove that ΔANO 2 is equal to ΔBMO 1 , because the following equations hold Therefore, NO 2 = MO 1 = a, and L 1 is ðA:2Þ In Figure 19, ∠ADN = π/2 because AN is the diameter of ðA:4Þ Since   Figure 19: A sketch map of scenario 2, where for a fixed red circle, the next circle is green, and it is obtained by rotating the blue circle with respect to the intersection point A until point N is on the segment l 2 (AN is the circle diameter).

Lithosphere
Denote ∠AEC = α and ∠O 1 EC = β. Then, ðA:8Þ From ΔO 1 CF and ΔAED, we also know that sin β = a R : 8 > > < > > : ðA:9Þ By denoting ∠O 2 AN = θ, we calculate the covered length in scenario 2: The difference between L 1 and L 2 is denoted as f . Take the derivative of f with respect to a.
It is easy to see then that if a < h/2, X 1 > 0. dL 2 /da is nontrivial.
: ðA:14Þ The first term is simple, because θ is independent of a.
If a < h/2, X 3 in Equation (A.23) is positive. Therefore, the derivative of f ðaÞ, Eq. (A.15), is df da = X 1 + X 2 × X 3 > 0: ðA:24Þ f ðaÞ is a monotonically increasing function for a ≤ h/2. When a = h/2, f has the maximum value, which is 0. If a > h/2, it is easy to see that X 1 < 0, X 2 > 0, and X 3 < 0. Therefore, f ðaÞ is a monotonically decreasing function with respect to a > h/2, and the maximum value is still 0 when a = h/2. When a ≠ h/2, f < 0, which means that scenario 2 always covers more segment length than scenario 1.