## Abstract

Most mountain belts on Earth show some degree of curvature in plan view, from a slight bend to horseshoe shapes. Such curvatures may occur on different scales, from individual thrust sheets to entire plate boundaries. Curvature may be acquired by vertical-axis rotation during or after orogenesis, or reflect primary lateral variations in shortening directions or physiographical features. Quantifying the amount of vertical-axis rotations of plan-view curvature is therefore helpful to our understanding of orogenesis, geodynamics, and paleogeography. The orocline test assesses to what extent vertical-axis rotations have played a role in the acquisition of an orogen’s curvature. The test quantifies through linear regression the relationships between changes in structural trends and the orientations of a geologic fabric. However, the current mathematical approaches to the orocline test show potential biases.

In this paper we aim to overcome such biases by developing a novel orocline test that applies total least squares (TLS) regression combined with a novel approach to bootstrapping. This bootstrap TLS orocline test can be used with all types of directional data acquired from structural geology, paleomagnetism, or sedimentology. It quantifies, for the first time, secondary curvature with confidence bands. We also provide several graphical and analytical tests to evaluate the statistical significance of the result. An open-source online application implementing this method is available for use on www.paleomagnetism.org. We illustrate the use of the methodology by reanalyzing published data sets from two well-known oroclines in the Cantabrian (northwest Iberia) and Aegean (Greece) regions.

## INTRODUCTION

Historically, orogenic systems have been described and characterized using transverse sections. Although some recognized plan-view curvature early on (e.g., Suess, 1892), it was Carey (1955) who noticed that most orogens on Earth show a certain degree of curvature in plan view; he emphasized the importance of horizontal deformation of Earth’s crust. Plan-view curvature is present at scales ranging from kilometers, such as in individual structures (e.g., Rodríguez-Pintó et al., 2016), to plate boundary scales (e.g., Johnston et al., 2013). Plan-view curvatures range from a few degrees to 180° bends. In addition, plan-view curvatures are observed as single bends, coupled, or multiple bends (Johnston et al., 2013).

Earth scientists have referred to plan-view curvature of mountain belts with different terms, including orocline, arc, syntaxis, curve, bend, virgation, salient, festoon, arcuate range, oroflex, or recess (Marshak, 2004). The term orocline (from Greek ορος, mountain, and κλινο, bend) is the most popular, and was coined by Warren Carey (1955, p. 257) to describe “…an orogenic system, which has been flexed in plan to a horse-shoe or elbow shape.” Note that the term orocline is sometimes used in the literature as a geometric description for any orogenic curvature, despite the kinematic implication of Carey’s (1955) definition, in which the bending occurs subsequent or coeval to the formation of a rectilinear mountain belt. In this paper we follow Carey’s (1955) original definition and use the term orocline only for map-scale bends that underwent vertical-axis rotations.

Multiple mechanisms are invoked to explain oroclines. Vertical-axis rotations require lateral strain gradients: laterally differential shortening or extension (e.g., block indentation, slab rollback, orogen-parallel shortening) or inherited structures (e.g., basin shape, reactivated structures) may explain the changes in trend in mountain belts (Marshak, 2004; Johnston et al., 2013). Sometimes the proposed mechanisms involve only the very top of the upper crust (Marshak, 2004) or the entire lithosphere (Gutiérrez-Alonso et al., 2004). Quantifying the kinematic evolution of plan-view curvature in orogens is therefore key to understanding the geodynamic evolution of orogenic belts in three dimensions.

Several classifications of orogenic curvature were proposed based on geometry (Marshak, 2004; Weil and Sussman, 2004). A widely used kinematic classification was proposed by Weil and Sussman (2004), and slightly modified by Johnston et al. (2013), and includes factors related to the inferred mechanism of formation. The kinematic classification distinguishes two end-member curvatures: (1) primary, which includes all orogens and thrust belts characterized by a curvature with an inherited physiographical feature present prior to the formation of the orogen, such as an oceanic embayment; and (2) secondary, which are plan-view curvatures that developed in response to bending or buckling of a preexisting linear orogenic belt about a vertical axis of rotation (Carey, 1955). All other curvatures between these end members are known as progressive (Fig. 1).

A first orocline test to evaluate the kinematics of curved mountain belts was proposed by Eldredge et al. (1985). Subsequent more advanced methods included sample uncertainties (Yonkee and Weil, 2010a), but introduced a mathematical bias. In this paper we propose a new quantitative methodology to evaluate the kinematic evolution of curved mountain belts using directional data from geological field observations or paleomagnetism. The method takes advantage of total least squares (TLS) regression (Golub and van Loan, 1980; Markovsky and Van Huffel, 2007) and a novel approach to bootstrapping data sets. We illustrate the applicability of our methodology using two case studies in Spain and Greece.

## PREVIOUS OROCLINE TESTS

Unraveling the kinematic evolution of oroclines requires analyzing the timing and rate of acquisition of curvature in the orogen and the extent to which the curvature was present before orogenesis started. To this end, we should collect accurate, statistically significant, and independent directional data sets. Traditionally, paleomagnetic data have been the most widely used to study vertical-axis rotations (e.g., Weil et al., 2013; Pueyo et al., 2007, 2016). Vertical-axis rotation patterns have also been studied using structural or sedimentological data, such as deformation fabrics and paleocurrent directions (e.g., Walcott and White, 1998; Kollmeier et al., 2000; Yonkee and Weil, 2010b; Pastor-Galán et al., 2011; Shaw et al., 2012; Li et al., 2012; van Hinsbergen and Schmid, 2012; Malandri et al., 2016). In the following we describe previous approaches for quantifying the kinematics of curved orogens and continents and discuss their strengths and weaknesses.

### Ordinary Least Squares Orocline Test

The first kinematic test was introduced by Schwartz and Van der Voo (1983) and Eldredge et al. (1985) and named orocline test (or strike test). This test evaluates the relationship between changes in regional structural trend (relative to a reference trend for an orogen) and the orientations of a given geologic fabric element (relative to a reference direction). The methodology was originally developed by paleomagnetists (e.g., Eldredge et al., 1985) to compare paleomagnetic declinations versus orogen strike. It has been adopted by geologists to test orogenic curvature using strain data (Yonkee and Weil, 2010b), fracture data (Pastor-Galán et al., 2011), calcite twin data (Kollmeier et al., 2000), anisotropy of magnetic susceptibility lineations (Weil and Yonkee, 2009), and paleocurrent directions (Shaw et al., 2012; Weil et al., 2013).

The original orocline test assumes that in case of vertical-axis rotations the orogenic trend and the chosen fabric rotate together, in which case the relationship between orientations is linear, i.e., the angle between the two linear features is constant throughout the orocline. The classical orocline test plots data on a Cartesian coordinate system with the strike (*S*) of the orogen (relative to a reference) along the horizontal axis, and the fabric azimuth (*F*, relative to a reference) along the vertical axis. This test uses the ordinary least squares (OLS) regression to estimate the slope (coded m in formulas), ideally between 0 and 1. Figure 1 illustrates some kinematic possibilities. Primary orogenic bends, i.e., those showing no vertical axis rotations (Fig. 1A), show no change of fabric orientations with varying structural trend, and therefore the slope is expected to be 0. In progressive oroclines, the investigated fabric develops progressively during the rotation. Accordingly, the slope in the test yields values between 0 and 1, depending on the amount of curvature present when the studied fabric formed. Secondary oroclines are those in which the investigated fabrics record 100% of the rotation, yielding slopes of 1, meaning that the studied fabric developed prior to any vertical-axis rotation. The slope obtained with the orocline test can only be confidently interpreted if the timing of fabric development and initial orientation are well constrained.

A problem associated with this classical test is that the OLS method assumes no variance in each data point, i.e., it considers all data to be 100% accurate and precise. Geological sampling and measurements, however, typically contain errors on individual orientations. Sources of error include, for example, compass and laboratory instrument precision, operating errors during measurements both in the field and laboratory, unidentified complexities in structure, stratigraphy, or metamorphism, and lack of control on the age of deposition, deformation, metamorphism, or magnetizations.

Sources of uncertainty propagate into errors associated with averaging groups of measurements. It would not be such an important issue if variance was present only in fabric orientation (the dependent variable, *F*, of the regression). The error in *F* would cause uncertainty in the estimated slope, but OLS would correctly calculate the slope located in the middle of the confidence interval. However, both the orientation of a fabric (*F*, dependent variable, vertical axis) and the orogen strike (*S*, independent variable, horizontal axis) are associated with uncertainties.

The presence of variance in the independent variable causes a problem known as regression dilution or regression attenuation (Draper and Smith, 1998), which tends to bias the regression slope toward lower values. In other words, uncertainty in *S* yields a systematic underestimation of the absolute value of the regression slope. In addition, the greater the variance in the *S* measurement, the more bias the estimated slope will have toward 0 instead of the true value. It may be counterintuitive that errors in *F* do not produce a bias but errors in *S* do. It is important to emphasize that OLS is not symmetric: the best fit regression for predicting *F* from *S* (the usual linear regression) is not the same as the best fit for predicting *S* from *F* (Fig. 2A; Frost and Thompson, 2000). Consequently, we establish that OLS is not a robust method to quantify the kinematic evolution of oroclines.
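The attenuation effect described above can be illustrated with a minimal numerical sketch. The data are entirely synthetic, and the noise level (15°), the number of points, and the helper name `ols_slope` are assumptions chosen for illustration only; they are not taken from any data set discussed in this paper. The sketch compares the OLS slope when noise is added only to *F* with the slope when the same noise is added only to *S*, and prints the classical attenuation factor var(*S*) / [var(*S*) + var(noise)] for reference.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic orocline: true fabric-versus-strike slope of 1.
true_slope = 1.0
s_true = rng.uniform(-45.0, 45.0, size=2000)   # true strike (degrees)
f_true = true_slope * s_true                   # true fabric azimuth (degrees)


def ols_slope(s, f):
    """Ordinary least squares slope of f regressed on s."""
    s_c = s - s.mean()
    return float(np.sum(s_c * (f - f.mean())) / np.sum(s_c ** 2))


sigma = 15.0  # assumed measurement noise in degrees, for illustration only

# Noise only in F: the estimate scatters around the true slope.
slope_noise_f = ols_slope(s_true, f_true + rng.normal(0.0, sigma, s_true.size))

# Noise only in S: the estimate is pulled toward 0 (regression dilution).
slope_noise_s = ols_slope(s_true + rng.normal(0.0, sigma, s_true.size), f_true)

# Classical attenuation factor expected for noise in the independent variable.
expected_attenuation = s_true.var() / (s_true.var() + sigma ** 2)

print(f"noise in F only: m = {slope_noise_f:.2f}")
print(f"noise in S only: m = {slope_noise_s:.2f}")
print(f"expected attenuation factor = {expected_attenuation:.2f}")
```

In this sketch the slope estimated with noisy *S* falls close to the true slope multiplied by the attenuation factor, whereas noise in *F* alone leaves the estimate centered on the true value.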

### Weighted Least Squares Orocline Test

Yonkee and Weil (2010a) suggested using a weighted least squares (WLS) regression in the orocline test to account for site uncertainties. WLS regression considers each observation to contain more or less information about the relationship between *S* and *F* than the other observations. In this method, each individual point is weighted depending on its uncertainty. Sites or localities exhibiting large uncertainties in fabric and/or strike orientations are assigned less weight in the regression and sites with more precision are assigned a higher weight.

The goal of the Yonkee and Weil (2010a) WLS regression was to develop a more accurate regression and provide confidence intervals based on the actual uncertainty of the input data set. We discuss two problems with this method.

First, the WLS method assumes that more precise measurements are more accurate, giving them more weight, but precision and accuracy are distinct properties. Precision concerns the consistency of the data set, describing how well an experiment repeats a previous result: if measurements show a small variance, the data are precise. Accuracy describes how closely the mean of the data represents the true value of the observed quantity. A typical example in geology is obtaining bedding measurements of a loose block. The data can be very precise but are not representative of the regional bedding. Results with a very high precision may even be, in some cases, suspicious. For example, in structural geology and sedimentology high precision may indicate that results are not representative, i.e., the data were collected from a single lithology, or in an area too small to reflect regional structural grain. Likewise, in paleomagnetism very low uncertainties may indicate undersampling of the paleosecular variation of the geomagnetic field (Deenen et al., 2011). The WLS method is biased toward the more precise values, regardless of whether they are more or less accurate. Giving extra weight to small uncertainties (high precision) may thus introduce an undesired bias.

Second, WLS regression was not developed to correct regression dilution (Draper and Smith, 1998). WLS regressions are a powerful method for fitting trends in data showing heteroskedasticity. Heteroskedasticity is a property of distributions in which the variance changes through the distribution, e.g., when the variance becomes larger as *S* takes larger values (Fig. 2B). Heteroskedasticity can be identified by a patterned or nonrandom distribution of the residuals after OLS regression. No evidence has been found for heteroskedastic behavior in the geological distributions analyzed for the orocline test. As we discuss herein, geological data are mostly homoskedastic, which is the property in which the variance is equal throughout the range of values and residuals are randomly distributed. Figure 3 shows that the residuals from three different orocline tests are homoskedastically distributed. The examples shown in Figure 3 are calculated using different sources of data: cleavage (Li et al., 2012), calcite twinning (Kollmeier et al., 2000), and paleomagnetic data (Meijers et al., 2016), all plotted against the strike of the orogen. Associated statistics for the TLS orocline tests are presented in Tables 1 and 2.

To show the potential control of biased data on the WLS regression method we give an extreme example from a synthetic data set. The data set consists of 20 points; 18 points are randomly distributed in *S* (strike) from –45 to 45. The *F* coordinate follows the relation *S* = 0.35*F* plus a small source of noise (a random number from –3 to 3; see the GSA Data Repository^{1} for the distributions). Applying OLS regression to those 18 points shows a linear relationship, *S*_{(OLS)} = 0.35*F* and R^{2} = 0.98 (Fig. 4). The 2 extra points are outliers situated at *S* = –15 and 15. The *F* coordinate is equal to *S* plus a small source of noise (random number between –1 and 1). OLS regression with all 20 points yields *S*_{(OLS)} = 0.38*F* and R^{2} = 0.88.

To evaluate the WLS regression, the first 18 points were associated with uncertainties at 95% confidence of 10° in both *S* and *F*, which represent a typical uncertainty in structural measurements. We have applied an unrealistically low uncertainty of 3° to the 2 outliers. Those points are still within the error limits of the surrounding points, making them difficult to identify as outliers. The WLS regression output with no outliers (n = 18) is *S*_{(WLS)} = 0.35*F* ± 0.11, exactly as the OLS. However, after adding the 2 outliers (n = 20), *S*_{(WLS)} = 0.62*F* ± 0.098 (Fig. 4B), i.e., very different from the OLS result. In typical Earth science data sets, outliers are not always easy to identify. To avoid the possibility of a few unidentified outliers significantly distorting parameter estimates, we propose a new and more robust method.
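The sensitivity of a weighted fit to a few overly precise outliers can be reproduced qualitatively with a short sketch. It makes several assumptions that should not be read as the procedure of Yonkee and Weil (2010a): the weights are simple inverse squares of the 95% uncertainties, only one uncertainty per point is used, the 0.35 relation is read as a fabric-versus-strike slope, and the random numbers differ from those in the Data Repository. The helper `weighted_slope` is hypothetical, and the resulting slopes will not match the values quoted above.

```python
import numpy as np

rng = np.random.default_rng(1)


def weighted_slope(s, f, w):
    """Slope of f on s minimizing sum(w * residual**2); w = 1 gives OLS."""
    s_mean = np.average(s, weights=w)
    f_mean = np.average(f, weights=w)
    return float(np.sum(w * (s - s_mean) * (f - f_mean))
                 / np.sum(w * (s - s_mean) ** 2))


# 18 regular points: slope 0.35 plus uniform noise in [-3, 3].
s_main = rng.uniform(-45.0, 45.0, size=18)
f_main = 0.35 * s_main + rng.uniform(-3.0, 3.0, size=18)

# 2 outliers at S = -15 and 15, lying close to a slope-1 line.
s_out = np.array([-15.0, 15.0])
f_out = s_out + rng.uniform(-1.0, 1.0, size=2)

s_all = np.concatenate([s_main, s_out])
f_all = np.concatenate([f_main, f_out])

# 95% uncertainties: 10 degrees for regular points, 3 degrees for outliers.
err95 = np.concatenate([np.full(18, 10.0), np.full(2, 3.0)])
weights = 1.0 / err95 ** 2   # simplified inverse-variance weights (assumption)

ols_main = weighted_slope(s_main, f_main, np.ones_like(s_main))
ols_all = weighted_slope(s_all, f_all, np.ones_like(s_all))
wls_all = weighted_slope(s_all, f_all, weights)

print(f"OLS, n = 18: m = {ols_main:.2f}")
print(f"OLS, n = 20: m = {ols_all:.2f}")
print(f"WLS, n = 20: m = {wls_all:.2f}")
```

Because each outlier carries roughly eleven times the weight of a regular point in this sketch, the weighted slope moves toward the outliers far more than the unweighted slope does.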

## THE TLS OROCLINE TEST

Here we suggest a robust method for quantifying vertical-axis rotation patterns in curved orogens that is capable of providing a reliable fit for the slope and confidence intervals both for the slope and intercept. The method follows the philosophy of the orocline test (Eldredge et al., 1985) and the assumption of linearity between the strike of the orogen and the studied fabrics. To solve the mathematical complications outlined here we propose using a TLS regression (Golub and van Loan, 1980; Markovsky and Van Huffel, 2007). TLS is capable of fitting a line to data where errors may occur in both the dependent and independent variables without inducing any bias or giving arbitrary weights to the measurements. It is important to note that the TLS method requires that both variables are measured in the same units, in degrees from north in the case for the orocline test.

Solving the quadratic for β_{1} leads to two solutions, one representing a line closely aligned with the data and the other representing a line at right angles. The solution closest to the data may be determined choosing the solution resulting in the lowest sum of squared errors.
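The selection between the two roots can be sketched in code. The quadratic used below is the standard one for orthogonal regression with equal error variances in both variables, s_SF·β_{1}² + (s_SS − s_FF)·β_{1} − s_SF = 0, where s_SS, s_FF, and s_SF are the centered sums of squares and cross products; we assume, without confirmation from the text above, that this is the form intended here. The function name `tls_line` and the example values are hypothetical.

```python
import numpy as np


def tls_line(s, f):
    """Orthogonal (total least squares) fit of f = b0 + b1 * s.

    Assumes equal error variance in s and f, both in degrees.
    """
    s = np.asarray(s, dtype=float)
    f = np.asarray(f, dtype=float)
    s_c, f_c = s - s.mean(), f - f.mean()
    s_ss, f_ss, sf = np.sum(s_c ** 2), np.sum(f_c ** 2), np.sum(s_c * f_c)

    # Stationary slopes satisfy: sf * b1**2 + (s_ss - f_ss) * b1 - sf = 0.
    if np.isclose(sf, 0.0):
        # Degenerate case with no covariance: fall back to a horizontal
        # line, because a vertical line cannot be written as a slope.
        candidates = [0.0]
    else:
        disc = np.sqrt((f_ss - s_ss) ** 2 + 4.0 * sf ** 2)
        candidates = [(f_ss - s_ss + disc) / (2.0 * sf),
                      (f_ss - s_ss - disc) / (2.0 * sf)]

    def orthogonal_sse(b1):
        # Sum of squared perpendicular distances to the line that passes
        # through the centroid with slope b1.
        return np.sum((f_c - b1 * s_c) ** 2) / (1.0 + b1 ** 2)

    # Keep the root with the lowest sum of squared errors.
    b1 = min(candidates, key=orthogonal_sse)
    b0 = f.mean() - b1 * s.mean()
    return float(b1), float(b0)


# Noise-free points on a line with slope 0.5 and intercept 10.
slope, intercept = tls_line([0, 10, 20, 30], [10, 15, 20, 25])
print(slope, intercept)
```

The two roots always multiply to −1, which is why one line follows the data and the other is perpendicular to it; comparing the orthogonal sums of squared errors is a simple way to tell them apart.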

### Sample Size

It is important to know in advance how many data points are typically required to estimate the slope of the best-fit line within a certain confidence interval (e.g., 95%). To help answer this question, we performed a simulation study. Orogenic curvatures were considered from 20° to 180° in 5° steps and incremental data sets of size 25, 50, 75, 100, 150, 200, 300, and 400 were analyzed. For each combination, 200 data sets were generated by randomly selecting points along a line of slope 1 through the origin. Each point then had its position randomly adjusted according to a normal distribution. Standard deviations (σ) of 5, 10, 15, and 20 were tested; σ = 5 (95% confidence interval of ±10) is a typical value for many structural or paleomagnetic sites, whereas σ = 10 (95% confidence interval of ±20 if normally distributed) corresponds with typical errors associated with paleoflow indicators or other dispersed geological markers (Fig. 6). Standard deviations of σ = 15 and 20 would correspond with sites with very scattered data (Data Repository). We performed simulations in which points were selected randomly (Fig. 6A) and evenly along the curvature (Fig. 6B).

Figure 6 shows the simulation results. With a typical σ = 5 for paleomagnetism or structural geology, 25 data points could be enough to evaluate orogenic bends >90°; for such curvatures we would expect m ± 0.2. In contrast, data sets with σ = 10 will need ∼100 measurements to achieve the same confidence interval. The results also show that curvatures <30° are difficult to quantify statistically. Note that no large differences are observed between evenly or randomly spaced sampling. However, a good sampling strategy may decrease the error, especially if points are carefully selected where the differences in strike are most extreme.
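A reduced version of such a simulation can be sketched as follows. This is not the code used to produce Figure 6: the way noise is applied (independent normal perturbations of equal σ in *S* and *F*), the single 90° curvature, the random rather than even sampling, and the helper names `tls_slope` and `slope_half_width` are all assumptions, so the printed half-widths should be read only as indicating the trend with sample size and σ.

```python
import numpy as np


def tls_slope(s, f):
    """Orthogonal-regression slope assuming equal variances in s and f.

    Returns the root of the orthogonal-regression quadratic that is aligned
    with the data; assumes the covariance of s and f is not zero.
    """
    s_c, f_c = s - s.mean(), f - f.mean()
    s_ss, f_ss, sf = np.sum(s_c ** 2), np.sum(f_c ** 2), np.sum(s_c * f_c)
    return (f_ss - s_ss + np.sqrt((f_ss - s_ss) ** 2 + 4.0 * sf ** 2)) / (2.0 * sf)


def slope_half_width(curvature, n_sites, sigma, n_sets=200, rng=None):
    """Half-width of the central 95% range of TLS slopes over n_sets data sets."""
    rng = np.random.default_rng() if rng is None else rng
    slopes = np.empty(n_sets)
    for i in range(n_sets):
        # Random positions along a slope-1 line spanning the curvature.
        base = rng.uniform(-curvature / 2.0, curvature / 2.0, n_sites)
        # Perturb both coordinates with normally distributed noise.
        s = base + rng.normal(0.0, sigma, n_sites)
        f = base + rng.normal(0.0, sigma, n_sites)
        slopes[i] = tls_slope(s, f)
    lo, hi = np.percentile(slopes, [2.5, 97.5])
    return float((hi - lo) / 2.0)


rng = np.random.default_rng(42)
for sigma in (5.0, 10.0):
    for n_sites in (25, 100):
        hw = slope_half_width(90.0, n_sites, sigma, rng=rng)
        print(f"curvature 90, sigma {sigma:>4}, n {n_sites:>3}: m +/- {hw:.2f}")
```

Extending the loops to the full range of curvatures, sample sizes, and σ values listed above would follow the same pattern.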

Yonkee and Weil (2010a) provided an equation to estimate the number of samples needed to achieve a certain degree of confidence in the slope. Our simulation results are less optimistic than this equation. The simulation suggests that it is necessary to systematically collect a slightly higher number of sites (between 10 and 20 more).

### Confidence Limits and Linearity Control through Bootstrap

Bootstrapping is a simple and reliable method to derive estimates of σ and confidence intervals for estimators of complex parameters of the distribution, such as percentile points, proportions, odds ratio, and correlation coefficients. Bootstrapping is also an appropriate way to control and check the stability of the results. Although for most problems it is impossible to determine the true confidence interval, bootstrapping is asymptotically more accurate than the standard intervals obtained using sample variance and assumptions of normality (Efron, 1987). To perform an accurate TLS orocline test, a minimum number of reliable sites are required, which provide estimates of strike and the fabric uncertainties. In other words, all sites with low-quality and/or dubious data ought to be discarded.

In the bootstrapped TLS orocline test, we use a novel approach of bootstrapping on the error margins of each data point to estimate the confidence interval for the regression. This procedure randomly creates a number of pseudosamples (we used a minimum of 1000) with the same size as the original data set, but now pseudosamples are created by resampling (with replacement) each data point within its confidence limits. This is in contrast to normal bootstrapping, where N data points are randomly resampled from a distribution of N without taking the individual error of each data point into account. A TLS regression is calculated for each pseudosample. Confidence intervals are calculated for the slope using percentiles and/or calculating the σ.

Our method of bootstrapping within the error bar of a data point first assumes the validity (or accuracy) of the data point and second takes into account the precision (or uncertainty) of that data point, but without giving weight to the actual value of the data point. Therefore, in our bootstrapping method every data point is always included, but the value of each data point varies according to its error bounds, simultaneously in *S* and in *F*. Consequently, differing variances in *S* and *F* need no special treatment; they are automatically taken into account.

In addition, this novel approach to bootstrapping provides a control on the assumption of linearity of linear regression. If TLS regression on actual data points fits in the center of the bootstraps, then the linear approach is a valid assumption for that particular data set. Our method, included in the online paleomagnetic analysis tool www.paleomagnetism.org, provides two options of resampling: (1) a nonparametric standard option, in which resampling is randomly homogeneous through the error bounds, and (2) a parametric Gaussian option in which resampling follows a Box-Muller transform (Box and Muller, 1958). From these simulations we calculate a 95% confidence interval on the linear regressions. The standard option will always provide larger uncertainties. We recommend using this option if confidence intervals are not known to be Gaussian. For example, it will be always better to use standard resampling when data are taken from maps and literature. In contrast, the parametric approach is preferred for data collected from the field in which the actual statistical parameters are known for each site.
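The resampling scheme can be sketched as follows. This is a minimal illustration and not the implementation available at www.paleomagnetism.org: the function names, the conversion of a 95% bound to a standard deviation by dividing by 1.96, and the synthetic example data are assumptions. The Gaussian option here uses NumPy's built-in normal sampler rather than a Box-Muller transform.

```python
import numpy as np


def tls_slope(s, f):
    """Orthogonal-regression slope assuming equal variances in s and f."""
    s_c, f_c = s - s.mean(), f - f.mean()
    s_ss, f_ss, sf = np.sum(s_c ** 2), np.sum(f_c ** 2), np.sum(s_c * f_c)
    # Root aligned with the data; assumes a nonzero covariance.
    return (f_ss - s_ss + np.sqrt((f_ss - s_ss) ** 2 + 4.0 * sf ** 2)) / (2.0 * sf)


def bootstrap_tls(s, f, s_err95, f_err95, n_boot=1000, gaussian=False, rng=None):
    """Resample every site within its own error bounds and refit TLS.

    Returns the TLS slope of the data, the mean bootstrap slope, and the
    2.5 and 97.5 percentiles of the bootstrap slopes.
    """
    rng = np.random.default_rng() if rng is None else rng
    s, f = np.asarray(s, dtype=float), np.asarray(f, dtype=float)
    s_err95 = np.asarray(s_err95, dtype=float)
    f_err95 = np.asarray(f_err95, dtype=float)
    shape = (n_boot, s.size)
    if gaussian:
        # Parametric option: treat the 95% bound as 1.96 standard deviations.
        s_star = rng.normal(s, s_err95 / 1.96, size=shape)
        f_star = rng.normal(f, f_err95 / 1.96, size=shape)
    else:
        # Standard option: uniform draw across the full error interval.
        s_star = rng.uniform(s - s_err95, s + s_err95, size=shape)
        f_star = rng.uniform(f - f_err95, f + f_err95, size=shape)
    # Every pseudosample keeps all sites; only their positions change.
    slopes = np.array([tls_slope(si, fi) for si, fi in zip(s_star, f_star)])
    lo, hi = np.percentile(slopes, [2.5, 97.5])
    return float(tls_slope(s, f)), float(slopes.mean()), (float(lo), float(hi))


# Hypothetical synthetic sites: true slope 0.8, 95% errors of 10 degrees.
rng = np.random.default_rng(7)
true_strike = np.linspace(-60.0, 60.0, 30)
s_obs = true_strike + rng.normal(0.0, 5.0, 30)
f_obs = 0.8 * true_strike + rng.normal(0.0, 5.0, 30)
err = np.full(30, 10.0)

slope, boot_mean, (lo, hi) = bootstrap_tls(s_obs, f_obs, err, err, rng=rng)
print(f"TLS m = {slope:.2f}, bootstrap mean = {boot_mean:.2f}, "
      f"95% interval = [{lo:.2f}, {hi:.2f}]")
```

Comparing `slope` with `boot_mean` gives the linearity check described above: a TLS slope near the center of the bootstrap distribution supports the linear assumption for that data set.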

To illustrate the results produced by the bootstrapped orocline test we have selected two data sets: paleoflows (*F*_{c}) versus orogen *S* from the Lachlan orogen of Australia (Musgrave, 2015) and paleomagnetic declinations (*F*_{d}) versus orogen *S* from the Central Taurides orocline of Turkey (Koç et al., 2016). Figure 7 shows the results for the TLS orocline tests and the confidence bands obtained after applying plain bootstrapping on the errors of each data point (full statistics for these orocline tests can be found in Table 1).

### Evaluation of TLS Orocline Test

The TLS orocline test assumes that variances in *S* and *F* are equal. In nature they are usually of comparable magnitude, but not necessarily similar. Data sets that show variances in one of the variables very different from the other may introduce a bias in the TLS. If this is the case, the average of the bootstrap will not coincide with the average of the TLS. We have performed a simulation to test if the assumption of equality in the variances introduces a bias in typical geological cases.

The simulation tested 2000 synthetic samples consisting of 4 parameters, *S*, *F*, and their respective 95% confidence intervals (in the files they are presented in the following order: *F*, *F*_95%, *S*, *S*_95%). All the synthetic samples were produced with a strong linear correlation (Pearson’s R > 0.75 when m was >0.2), with slopes randomly chosen between 0 and 1. The differential strike of the orogen was selected in each simulation randomly between 45° and 180°. In each case 100 sites were taken randomly around the orocline. In half of the simulations (1000) confidence intervals in *S* were one order of magnitude (values ∼10) larger than *F* (values ∼1) and vice versa. We calculated the slope of the line by using OLS, WLS, and TLS in each synthetic sample. In addition, we performed 1000 bootstraps with replacement on the data-point error bars in every synthetic data set to estimate all possible slopes. From the bootstraps we have obtained the average slope and the 95% confidence interval, after discarding the 2.5% most extreme values obtained from the top and the bottom of the ordered distribution.

The results of the simulation indicate that slope estimates in the TLS orocline test on data always plot (100%) within the 95% confidence interval obtained by bootstrapping the confidence intervals of each data point (Fig. 8); 30% of slope estimates from TLS coincide with the mean bootstrap value, and in the remaining 70% the difference between the mean bootstrap value and the TLS was <0.01. These results indicate the validity of TLS even in the special case of significantly different variances in *F* and *S*.

Simulation results, however, are not as positive for OLS or WLS regression. OLS yielded 54% of the estimates within the 95% confidence interval and only 9% coincided with the average of the bootstraps, which always occurred when the actual slope was close to 0. In addition, as expected, in 81% of the cases the calculated slope was below the mean bootstrap value, indicating regression dilution. In the WLS orocline test only 44% of estimates were within the 95% confidence interval, yet the results were not systematically above or below the average. In 2% of the cases, the WLS test yielded physically unrealistic estimates (m > 1.5, occasionally m > 7), even though all points show similar variances in the simulation, so no overweighting is expected. The bootstrapped TLS orocline test showed the best performance even assuming dissimilar variances.

## CASE STUDIES: THE CANTABRIAN AND AEGEAN OROCLINES

### Cantabrian Orocline

The West European Variscan belt resulted from the collision between the Gondwana and Laurussia continents and several microcontinents upon the Devonian–Carboniferous closure of the Rheic Ocean (e.g., Pastor-Galán et al., 2013a). The remnants of this mountain belt are today found in western Europe and define a sinuous shape through Iberia (Fig. 9; Martínez-Catalán, 2011). The core of the Cantabrian orocline (Fig. 9A), known as the Cantabrian Zone, represents the Gondwanan foreland fold-thrust belt of the Variscan orogen. Structurally, the foreland fold-thrust belt is characterized by tectonic transport toward the core of the orocline, i.e., the orocline has a contractional core, where low finite strain values and locally developed cleavage occur (Pérez-Estaún et al., 1988; Gutiérrez-Alonso, 1996; Pastor-Galán et al., 2009). Illite crystallinity and conodont color alteration indexes are consistent with diagenetic conditions to very low grade metamorphism (e.g., Colmenero et al., 2008; Pastor-Galán et al., 2013b; García-López et al., 2013). Deformation in the Cantabrian Zone occurred between 330 and 300 Ma.

Many have studied the Cantabrian orocline over the past few decades, resulting in a variety of hypotheses for its origin. A wealth of paleomagnetic (e.g., Weil et al., 2013; Pastor-Galán et al., 2015a, 2015b, 2016) and structural data (e.g., Julivert and Marcos, 1973; Kollmeier et al., 2000; Pastor-Galán et al., 2011, 2012, 2014; Shaw et al., 2016) have constrained the Cantabrian orocline to be bent or buckled around a vertical axis in a short period of time from 310 to 297 Ma. Being well constrained, the Cantabrian orocline is an appropriate example to test the TLS orocline test.

We have selected several data sets from the Cantabrian orocline (Table 2). Paleocurrents measured in Ordovician sedimentary rocks (Shaw et al., 2012) suggest a 100% secondary curvature when tested against the strike of the orogen. Although sedimentary structures are not necessarily parallel along an orogen, the result at least suggests that the observed curvature did not exist in Ordovician time, when the sedimentary structures recorded the paleoflows. Calcite twins that formed in the first stages of folding (Kollmeier et al., 2000) also suggest a 100% secondary curvature, constraining the rotation to be synfolding or postfolding (i.e., Late Carboniferous). Prefolding and postfolding paleomagnetic data of Moscovian age (310–307 Ma; e.g., Weil et al., 2013) also show 100% secondary curvature, which implies that orocline formation postdates the latest stages of folding (Pastor-Galán et al., 2014). Joint sets in Gzhelian basins (304–300 Ma) show between 50% and 80% of secondary vertical-axis rotations. This constrains the formation of ∼65% of the orocline during the period that spans from late Moscovian (310 Ma) to Gzhelian. Joint sets in Asselian basins (295 Ma) recorded 0% of the rotation, meaning that the full curvature of the Cantabrian orocline occurred from 310 to 295 Ma. Figure 9B illustrates the evolution of the core of the Cantabrian orocline and depicts some of the different data sets used.

### Aegean Orocline

In the eastern Mediterranean region, subduction of the African-Adriatic plate below Eurasia led to the accretion of a thin-skinned fold-thrust belt with upper crustal rocks derived mainly from now-subducted continental lithosphere of the African-Adriatic plate (van Hinsbergen et al., 2005a; Schmid et al., 2008; Jolivet and Brun, 2010). In the central segment of this fold-thrust belt, an extensional backarc basin opened, the Aegean Sea. Surrounding this backarc is the Aegean orocline (Fig. 10). Paleomagnetic research in the unmetamorphosed forearc domain of the Aegean–west Anatolian region has shown opposite vertical-axis rotations, predominantly in the past 15 m.y., of ∼50° clockwise in the west (Kissel and Laj, 1988; Horner and Freeman, 1983; Speranza et al., 1995; Duermeijer et al., 2000; van Hinsbergen et al., 2005b; Broadley et al., 2006) and ∼20° counterclockwise in the east (Kissel and Poisson, 1987; Morris and Robertson, 1993; van Hinsbergen et al., 2010a, 2010b), relative to an essentially unrotated hinterland to the north of the extensional backarc (van Hinsbergen et al., 2008, 2010a; Fig. 9). Stretching lineations associated with extensional detachments in the heart of the Aegean backarc region show a curved pattern from north-northeast trending in the northern, nonrotated domains, to east-west in the clockwise, and north-south in the counterclockwise rotated parts (Walcott and White, 1998; van Hinsbergen and Schmid, 2012; Brun and Sokoutis, 2007; van Hinsbergen et al., 2010a; Jolivet et al., 2015). It is generally interpreted that the extension in the Aegean domain kinematically accommodated the vertical-axis rotations of the forearc blocks (e.g., Walcott and White, 1998; van Hinsbergen et al., 2005b; Menant et al., 2016; Malandri et al., 2016).

We use the TLS orocline test to evaluate whether the curvature in the stretching lineations associated with extensional detachments in the Aegean–west Anatolian backarc is proportional to the amount of vertical-axis rotation of the block of which they are part. In addition, we test to what extent the curvature of the folds and thrusts in the Aegean and southwest Anatolian forearc resulted from Neogene vertical-axis rotations, and to what extent such curvature already existed prior to rotation, e.g., inherited from a curvature of the Africa-Eurasia plate boundary. We used a compilation of stretching lineations from the Aegean–west Anatolian metamorphic complexes of van Hinsbergen and Schmid (2012), based on Hetzel et al. (1995), Walcott and White (1998), Bozkurt and Satir (2000), Işık et al. (2003), Ring et al. (2003), Rimmelé et al. (2003), Brun and Sokoutis (2007), Marsellos and Kidd (2008), Tirel et al. (2009), and Jolivet et al. (2010a, 2015b). Paleomagnetic data were restricted to those collected in Oligocene and early Miocene volcanics and sediments that predate the onset of regional block rotations, and were compiled from Horner and Freeman (1983), Kondopoulou and Lauer (1984), Kondopoulou and Westphal (1986), Kissel and Poisson (1987), Kissel and Laj (1988), Speranza et al. (1992, 1995), Morris and Robertson (1993), Atzemoglou et al. (1994), Mauritsch et al. (1995), Morris and Anderson (1996), Avigad et al. (1998), van Hinsbergen et al. (2005b, 2008, 2010a, 2010b), and Broadley et al. (2006).

Performing the TLS orocline test on the stretching lineation azimuths versus paleomagnetic declinations shows that within uncertainty, a 1:1 correlation exists between the two (Table 2; Figs. 10B, 10C). In other words, the variation in stretching lineations can be explained by vertical-axis rotations, lending support to the interpretation that forearc block rotation accommodated the opening of the Aegean extensional backarc. The TLS orocline test performed on the regional strike of the fold-thrust belt in the Aegean–west Anatolian forearc shows that 40% of the modern angle between the opposite limbs of the orocline, corresponding to ∼30°, is a primary configuration (Fig. 10C). This may be explained by the fold-thrust belt wrapping around the Moesian platform that formed the southern margin of Eurasia during Cretaceous–Paleogene subduction, as widely shown in kinematic or paleogeographic reconstructions (Dercourt et al., 2000; Barrier and Vrielynck, 2008; Schmid et al., 2008; van Hinsbergen and Schmid, 2012; Menant et al., 2016).

## SUMMARY AND CONCLUSION

In this paper we provide a new methodology to quantify regional vertical-axis rotation patterns in curved orogens, or oroclines. To that end we compared angular relationships between directional data sets, such as strikes of folds or faults, stretching lineation azimuths, paleomagnetic declinations, fracture strikes, and paleocurrent directions, across the curved orogen, performing a so-called orocline test. We show how previous versions of the orocline test introduce bias. OLS suffers from an artifact known as regression dilution, a systematic underestimation of the slope when both *S* and *F* carry uncertainties. WLS regressions arbitrarily weight data points, biasing the slope toward the overweighted values. Instead, we propose an orocline test using a bootstrapped TLS linear regression, a method in which observational errors in both dependent and independent variables are taken into account. This method avoids regression dilution and considers every point equally. TLS assumes that the variances in *S* and *F* are equal; we have shown that the method remains valid under the dissimilarities in variance between *S* and *F* typically found in geological observations.
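
The contrast between the OLS and TLS slopes can be illustrated with a minimal JavaScript sketch. It is not the code distributed with Paleomagnetism.org: the function names (`olsFit`, `tlsFit`, `moments`) and the example values are hypothetical, and the closed-form TLS slope used here is the standard orthogonal-regression solution, which assumes equal error variances in *S* and *F*.

```javascript
// Centered sums of squares and cross products for paired samples x and y.
function moments(x, y) {
  const n = x.length;
  let xMean = 0, yMean = 0;
  for (let i = 0; i < n; i++) {
    xMean += x[i] / n;
    yMean += y[i] / n;
  }
  let sxx = 0, syy = 0, sxy = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - xMean;
    const dy = y[i] - yMean;
    sxx += dx * dx;
    syy += dy * dy;
    sxy += dx * dy;
  }
  return { xMean, yMean, sxx, syy, sxy };
}

// Ordinary least squares: minimizes vertical misfit only (errors in y).
function olsFit(x, y) {
  const { xMean, yMean, sxx, sxy } = moments(x, y);
  const slope = sxy / sxx;
  return { slope, intercept: yMean - slope * xMean };
}

// Total least squares: minimizes perpendicular misfit (errors in x and y).
// Undefined when sxy is exactly 0 (no linear covariation between x and y).
function tlsFit(x, y) {
  const { xMean, yMean, sxx, syy, sxy } = moments(x, y);
  const d = syy - sxx;
  const slope = (d + Math.sqrt(d * d + 4 * sxy * sxy)) / (2 * sxy);
  return { slope, intercept: yMean - slope * xMean };
}

// Hypothetical deviations (degrees) of strike (S) and fabric azimuth (F)
// from their reference directions; illustrative values only.
const strike = [-60, -30, 0, 30, 60];
const fabric = [-30, -24, 3, 12, 39];
console.log("OLS slope:", olsFit(strike, fabric).slope);
console.log("TLS slope:", tlsFit(strike, fabric).slope);
```

For points lying exactly on a line the two estimators coincide. Once scatter is present, the magnitude of the OLS slope never exceeds that of the TLS slope, which is the sense in which OLS is biased toward zero.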

Our method provides confidence limits and a test of linearity through bootstrapping. Pseudosamples are created by randomly choosing (with replacement) a single data point within the confidence limits of each studied locality, so that every locality contributes to the calculation of the confidence margin. The method therefore takes into account the uncertainty of each data point without giving weight to its actual value, and varying σ values in *S* and *F* are accounted for automatically.
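
The bootstrap procedure can be sketched as follows. This is an illustrative JavaScript outline under stated assumptions, not the published implementation: each locality is resampled from a Gaussian whose standard deviation is taken directly from the supplied uncertainty, angles are assumed to be deviations from a reference direction (so no wrap at 360° occurs), and all names (`makeRng`, `gaussian`, `tlsLine`, `bootstrapTls`) are hypothetical. The seeded generator is included only to make runs repeatable.

```javascript
// Small seeded pseudo-random generator returning values in [0, 1).
function makeRng(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// One Gaussian draw (Box-Muller transform).
function gaussian(rng, mean, sigma) {
  const u = 1 - rng(); // in (0, 1], avoids log(0)
  const v = rng();
  return mean + sigma * Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Total least squares line, assuming equal error variance in x and y.
function tlsLine(x, y) {
  const n = x.length;
  const xMean = x.reduce((acc, v) => acc + v, 0) / n;
  const yMean = y.reduce((acc, v) => acc + v, 0) / n;
  let sxx = 0, syy = 0, sxy = 0;
  for (let i = 0; i < n; i++) {
    sxx += (x[i] - xMean) ** 2;
    syy += (y[i] - yMean) ** 2;
    sxy += (x[i] - xMean) * (y[i] - yMean);
  }
  const d = syy - sxx;
  const slope = (d + Math.sqrt(d * d + 4 * sxy * sxy)) / (2 * sxy);
  return { slope, intercept: yMean - slope * xMean };
}

// Each pseudosample draws one (S, F) pair per locality from within that
// locality's uncertainty, then fits an unweighted TLS line.
function bootstrapTls(localities, nBoot, rng) {
  const slopes = [];
  for (let b = 0; b < nBoot; b++) {
    const s = localities.map(d => gaussian(rng, d.strike, d.strikeSigma));
    const f = localities.map(d => gaussian(rng, d.fabric, d.fabricSigma));
    slopes.push(tlsLine(s, f).slope);
  }
  slopes.sort((p, q) => p - q);
  const pick = q => slopes[Math.min(nBoot - 1, Math.floor(q * nBoot))];
  // Percentile-based 95% interval of the bootstrapped slopes.
  return { lower: pick(0.025), median: pick(0.5), upper: pick(0.975) };
}

// Hypothetical localities lying on F = 0.6 S, each with 1 degree uncertainty.
const localities = [-60, -40, -20, 0, 20, 40, 60].map(s => ({
  strike: s, strikeSigma: 1, fabric: 0.6 * s, fabricSigma: 1
}));
console.log(bootstrapTls(localities, 500, makeRng(42)));
```

Because every pseudosample is fitted without weights, a locality with a large uncertainty simply scatters more between pseudosamples and so widens the interval, rather than being down-weighted in the regression itself.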

The TLS orocline test gives slopes between 0 (no secondary rotation) and 1 (100% secondary rotation). We provide an online application to apply the TLS orocline test at Paleomagnetism.org (Koymans et al., 2016). The code, written in JavaScript and therefore platform independent, is open source and available at the public GitHub repository (https://github.com/Jollyfant/Paleomagnetism.org). The application is capable of performing the described analysis and exporting vector figures. We illustrate the use of the TLS orocline test by applying it to data sets from the Cantabrian and Aegean oroclines, and show how the method allows us to calculate to what extent oroclinal bending affected a primary curved (or linear) feature.

## ACKNOWLEDGMENTS

We thank Pengfei Li, Robert Musgrave, Marco Maffione, Ayten Koç, and Maud Meijers for sharing their data sets to produce several of the illustrations in the paper. Comments by Emilio Pueyo, two anonymous reviewers, and editors Arlo Weil and Kurt Stüwe significantly improved this paper. Pastor-Galán acknowledges support from an ISES grant; van Hinsbergen acknowledges financial support through European Research Council Starting Grant 306810 (SINK, Subduction Initiation reconstructed from Neotethyan Kinematics) and NWO (Netherlands Organisation for Scientific Research) Vidi grant 864.11.004.

### APPENDIX 1. TOTAL LEAST SQUARES OROCLINE TEST CODE

We developed a code written in JavaScript that performs the bootstrapped total least squares (TLS) orocline test. The code allows the user to choose standard or Gaussian sampling for each variable and also performs the standard ordinary least squares (OLS) and Yonkee and Weil’s (2010a) weighted least squares (WLS) regression orocline tests for comparison. The code provides the most relevant statistical information (slope, intercept, confidence intervals, and Pearson’s R) and produces graphs showing the orocline test with the specified uncertainties in each data point as well as the calculated 95% bootstrapped confidence interval. The code also provides the cumulative distribution functions of the bootstrapped slopes and intercepts to check whether they follow a normal distribution, as expected. It includes a graphical analysis of the residuals: (1) a plot showing the residuals in strike (*S*) and fabric azimuth (*F*), with lines marking 1σ, 2σ, and 3σ to identify outliers; (2) histograms of the residuals and the bootstraps to check how they are distributed; and (3) a normal probability plot, which is a graphical technique to identify substantive departures from normality. If the points in this plot fall close to a straight line, the data are approximately normally distributed. This graphical analysis permits the identification of outliers, skewness, heteroskedasticity, and kurtosis.

The required input file is an ASCII (American Standard Code for Information Interchange) file consisting of four columns either comma, space, or tab separated with no headers. The columns should appear in the following order: azimuth of a fabric; confidence interval of the azimuth; strike of the orogen; confidence interval of the strike. The code is included in the suite Paleomagnetism.org/oroclinal.html (Koymans et al., 2016) and can be used online or downloaded from a public GitHub repository (https://github.com/Jollyfant/Paleomagnetism.org).
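
As an illustration of the input format, a four-column file could be read along the following lines. This is a hedged sketch rather than the parser used by Paleomagnetism.org; the function name `parseOroclineInput` and the property names are hypothetical, and the sketch assumes plain decimal numbers with no header row.

```javascript
// Parse comma-, space-, or tab-separated text with four numeric columns:
// fabric azimuth, its confidence interval, strike, its confidence interval.
function parseOroclineInput(text) {
  const rows = [];
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue; // skip blank lines
    const fields = line.split(/[\s,]+/).map(Number);
    if (fields.length !== 4 || fields.some(v => Number.isNaN(v))) {
      throw new Error("Expected four numeric columns, got: " + rawLine);
    }
    const [fabric, fabricCI, strike, strikeCI] = fields;
    rows.push({ fabric, fabricCI, strike, strikeCI });
  }
  return rows;
}

// Hypothetical example: one comma-separated and one mixed space/tab line.
const sample = "10,5,20,3\n-15 4\t-30 6\n";
console.log(parseOroclineInput(sample));
```

Rejecting rows that do not contain exactly four numbers guards against a header line or a missing column being silently read as data.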

In this paper we provide several data sets (included in the Data Repository). All of them have been taken from the cited papers, and no special treatment has been applied to them. Where the original papers used a reference orientation for the strike and the studied fabric, we have used that reference (Pastor-Galán et al., 2011; Weil et al., 2013; Meijers et al., 2016; Musgrave et al., 2015; Koç et al., 2016). For the papers in which we used the raw data, we have used circular averages as a reference line (Li et al., 2012). For the TLS test on the Aegean orocline we compiled data from several papers. In this case we used the expected pole as a reference for paleomagnetic data (taken from Torsvik et al., 2012) and the circular average for the structural data sets.
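
A circular average and the deviation of each observation from it can be computed as in the following sketch. It is illustrative only: the function names are hypothetical, inputs are assumed to be azimuths in degrees within [0, 360), and the values are treated as directions. Strikes are axial data with a 180° ambiguity, so they would first need to be expressed with a consistent polarity.

```javascript
// Circular mean of azimuths in degrees, returned in [0, 360).
function circularMean(anglesDeg) {
  let sumSin = 0, sumCos = 0;
  for (const a of anglesDeg) {
    const r = (a * Math.PI) / 180;
    sumSin += Math.sin(r);
    sumCos += Math.cos(r);
  }
  const mean = (Math.atan2(sumSin, sumCos) * 180) / Math.PI;
  return (mean + 360) % 360;
}

// Signed deviation of an azimuth from a reference, wrapped to [-180, 180).
// Assumes both inputs lie in [0, 360).
function deviationFromReference(angleDeg, referenceDeg) {
  return ((angleDeg - referenceDeg + 540) % 360) - 180;
}

// Hypothetical strikes straddling north; an arithmetic mean would give 180.
const strikes = [350, 10, 5, 355];
const reference = circularMean(strikes);
console.log(strikes.map(s => deviationFromReference(s, reference)));
```

Working with deviations from a reference keeps the regression variables continuous across north, which a plain arithmetic mean of azimuths would not.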

^{1}GSA Data Repository Item 2017085, text files including: 1) the orogens described in the text formatted to run them with the online application www.paleomagnetism.org, and 2) the numerical simulations performed to test the differences between the three kinematics tests described in the text, is available at www.geosociety.org/datarepository/2017, or on request from editing@geosociety.org.