## Abstract

A tangent linear version of the complex soil–vegetation–atmosphere transfer model Community Land Model has been developed and its ability to reproduce relevant sensitivities of the modeled soil moisture and latent heat flux (LE) evolution to a number of different parameters on a short time scale (3 d) was analyzed. To this end, a series of idealized experiments for different soil moisture states and three different soil types was conducted. We found that the tangent linear model performs well for a large range of conditions. Situations in which the linear approximation potentially fails were connected with the occurrence of saturation and with the highly nonlinear parameterization of water availability to plants. The sensitivity of LE with respect to the model's initial soil moisture state was calculated and compared with the LE sensitivity to soil texture, leaf area index (LAI), and vegetation roughness length. These sensitivities were found to be highly variable with soil type and soil moisture. Our results confirm that soil texture and LAI are key parameters that have a dominant influence on modeled LE under specific environmental conditions. As a preparatory study to soil data assimilation developments, the results also serve to quantify model uncertainties due to limited knowledge of forcing processes and model parameters.

Many data assimilation algorithms rely on linearized models. The questions arise whether the inherent assumptions are viable for land surface data assimilation and which model parameters have a large impact on latent heat fluxes. These questions were addressed using a linearized version of the Community Land Model.

**Land surface state variables** like soil temperature and humidity are influenced by a large number of interacting processes, influencing the evolution of the atmosphere, biosphere, and soil on a wide variety of time and spatial scales (e.g., Entekhabi, 1995). The necessity to realistically represent land surface processes in climate modeling and weather prediction has led to increasingly complex models that require a large number of often insufficiently known input parameters and initial state variables. Most parameters that characterize the properties of soil and vegetation are measured, if at all, only locally, while for modeling purposes representative parameter values for the whole gridded model domain are needed. Therefore, the questions to what degree uncertain input parameters and initial state variables deteriorate the model output and how these model parameters can be calibrated have attracted much research ever since soil–vegetation–atmosphere transfer (SVAT) models were developed. A large number of sensitivity studies addressing these questions can be found in the literature. Methodologies ranging from one-factor-at-a-time experiments (e.g., Wilson et al., 1987; Mihailovic et al., 1992; Pitman, 1994; Gao et al., 1996; Dirmeyer et al., 2000) to factorial experiments (e.g., Henderson-Sellers, 1993; Liang and Guo, 2003) to complex Monte Carlo based statistical analysis (e.g., Bastidas et al., 1999; Demarty et al., 2004; Liu et al., 2004) have been applied to identify parameters to which the model output is most sensitive.

Despite the multitude of models tested in these studies under various experimental setups, no universally valid conclusions can be drawn from these studies. Rather, the identification of important parameters seems to depend on (i) the model setup and parameterizations themselves, (ii) the modeled climatic conditions (e.g., dry vs. wet soils, vegetation cover and kind of vegetation), (iii) the time scale under consideration (e.g., latent heat flux is insensitive to deep soil moisture below the root zone for short time intervals, yet sensitive at a monthly time scale). Nevertheless, there is evidence from the studies cited above that parameters with a large influence on the modeled fluxes include soil texture, LAI, and vegetation roughness length.

In this study, we conducted a sensitivity analysis with a linearized (tangent linear) version of a state-of-the-art SVAT model and a comparison with the corresponding nonlinear sensitivities. The tangent linear model development was performed in the framework of setting up a four-dimensional variational (4D-Var) land surface assimilation scheme, and is based on the Community Land Model (CLM) version 3.5 as described by Oleson et al. (2008). Tangent linear and adjoint models can be seen as a representation of the model Jacobian and its transpose, respectively. They have been widely used in meteorological (e.g., Courtier et al., 1994; Kalnay, 2003; Rabier, 2005) as well as air-quality and atmospheric chemistry data assimilation applications (e.g., Errera et al., 2008; Elbern et al., 2007, 2010). For example, the 4D-Var data assimilation technique makes use of an adjoint model to determine the gradient of an objective function, measuring the misfit between a model run and the available observations with respect to the model parameters or the initial state to be optimized (e.g., Kalnay, 2003). Also, because tangent linear and adjoint models offer the possibility to calculate derivatives of the model output with respect to its input, they have been frequently used to perform sensitivity analyses (e.g., Gustafsson and Huang, 1996; Margulis and Entekhabi, 2001a,b; Sandu et al., 2003).

In the area of meteorological data assimilation, the need to constrain soil moisture by indirect observations (e.g., temperature and humidity measured 2 m above the ground surface or brightness temperature) has led to the development of adjoint land surface models (Marais and Musson-Genon, 1992; Callies et al., 1998; Bouyssel et al., 1999; Margulis and Entekhabi, 2001a; Reichle et al., 2001). Because the dimension of the control vector is relatively small for land surface schemes used in numerical weather prediction models, derivatives have also been approximated by finite difference schemes in a number of studies (Rhodin et al., 1999; Hess, 2001; Seuffert et al., 2004; Balsamo et al., 2004; Mahfouf et al., 2009; Drusch et al., 2009). The growing complexity of SVAT schemes tends to increase the number of control parameters, however, and consequently the latter approach becomes less practicable. For example, the approximation of the CLM model Jacobian by simple one-sided finite differences with respect to initial soil moisture and soil temperature in 10 soil layers would require 21 model integrations.

In recent years, the ensemble Kalman filter (EnKF) has become increasingly popular for land data assimilation. Since EnKF methods do not require the use of linearized model equations, they are easy to implement and have been successfully applied to soil moisture data assimilation (e.g., Reichle et al., 2002; Zhou et al., 2006; De Lannoy et al., 2007; Sabater et al., 2007). Furthermore, the EnKF has the conceptual advantage of providing estimates of uncertainties represented by an ensemble of model states. There are also certain drawbacks of Kalman filter type assimilation schemes, however, depending on the specific objective of the data assimilation application (e.g., Talagrand, 1997): the strength of variational assimilation compared with Kalman filter methods is given by the fact that nonlinear dependencies between observations and state variables can be taken into account without any approximation. Another potential drawback of the EnKF (like all sequential data assimilation methods) is the interruption of the model trajectory each time an observation is encountered and assimilated. This hinders, for example, the calculation of energy and mass budgets from a sequence of analyzed fields. In contrast, a two- or 4D-Var analysis provides a smooth and physically consistent model trajectory across the time span of the assimilation window.

The purpose of this study was twofold. First, the tangent linear CLM was validated and the usefulness of the tangent linear hypothesis was investigated by comparison of linear sensitivities with the nonlinear response of the full model. Because SVAT models are highly nonlinear and often not continuously differentiable, the questions to what extent initial state variables and parameters act approximately linear on the evolution of the soil state and when feedback mechanisms with nonlinear effects degrade a linear approximation, were key issues of this study. This knowledge is of special value in view of data assimilation applications, because with most assimilation algorithms only observations from linearly or weakly nonlinearly evolving quantities can be useful. Second, the sensitivity of latent heat flux to the initial soil moisture and model parameters was investigated. The question of which input quantities control the model output, e.g., the modeled temporal soil moisture distribution or energy fluxes, is crucial from the data assimilation point of view. For numerical weather prediction purposes, assimilation schemes are primarily designed to obtain the correct energy fluxes at the lower boundary of the atmosphere. The soil moisture state itself, however, is often not improved (e.g., Hess, 2001; Drusch and Viterbo, 2007). Besides an imperfect model, this must be attributed to uncertainties in model input parameters like soil texture, LAI*,* etc. Generally, data assimilation should aim to correct those model parameters that have a large influence and a large uncertainty at the same time. Although we could not address this issue to its full extent in this study, we used the tangent linear CLM to create “maps” of the sensitivity of LE with respect to selected model parameters under various soil moisture conditions.

The perturbation approach applied in this study for both linearly and nonlinearly propagated parameters and forcing variations can also be interpreted as quantitative estimates of the model uncertainties. This may serve as a guideline for the formulation of forecast error covariance matrices, which exert a direct impact on data assimilation results.

## Theory

*M*, which calculates a vector of prognostic and diagnostic output variables

**y**(

*t*), from given initial values

**x**(

*t*

_{0}) ∈ ℝ

*. As the output at time*

^{n}*t*also depends on a set of

*m*model parameters

**p**∈ ℝ

*, we write*

^{m}**z**:= (

**x**

^{T},

**p**

^{T})

^{T}to combine the initial condition and parameter vectors. If the model

*M*is differentiable for a given input

**z**

_{0}:= (

**x**

_{0}

^{T},

**p**

^{T})

^{T}, an approximation of

**y**(

*t*) for small variations δ

**z**:=

**z**−

**z**

_{0}can be obtained by a first-order Taylor expansion:

**M**|

_{z}_{0,t}is the Jacobian of model

*M*about the state

**z**

_{0}at time

*t*and the last term subsumes nonlinear contributions.

**y**(

*t*) with respect to a perturbation δ

**z**in the input about a reference state

**z**(dropping the subscript 0 for convenience) is defined by the finite difference form

*linear sensitivity*:

**M**contain the partial derivatives of the model output with respect to the

*l*th component of the model input vector,

*z*

_{l}. These vectors of linear sensitivity coefficients

*S*

_{l}(

*t*) can be extracted by choosing δ

**z**=

**e**

_{l}, the

*l*th canonical unit vector:

*l*= 1, …,

*n*+

*m*.

### Tangent Linear Models

*M*not only opens a convenient way to calculate (linear) sensitivities according to Eq. [5] but also figures prominently in inverse modeling and data assimilation schemes. Therefore, an efficient method to calculate the model Jacobian is needed. It should be noted that there is a clear distinction between sensitivity analysis and data assimilation applications: while a sensitivity analysis of model

*M*focuses on the behavior of the full nonlinear model in the presence of uncertain inputs, convergence of minimization algorithms used in variational assimilation schemes is attained fastest (in terms of number of iterations) if the representation of partial model derivatives is exact. Therefore, partial derivatives have to be considered as an approximation in the context of sensitivity analysis. Contrarily, inexact calculation of

**M**|

_{z}_{,t}is likely to deteriorate the performance of a data assimilation scheme from an algorithmic point of view. An example for an inexact calculation of

**M**|

_{z}_{,t}is the “perturbation method,” where

**M**|

_{z}_{,t}, or equivalently

*S*

_{l}(

*t*), is approximated by applying a perturbation δ

**z**

_{l}:= (0, …, Δ

*z*

_{l}, …, 0)

^{T}to the

*l*th component of

**z**:

*S*

_{l}, at least two model integrations are necessary. Hence, the complete Jacobian can be calculated by integrating the model

*n*+

*m*+ 1 times. A higher order difference scheme can be selected, which will increase the accuracy as well as the number of required model integrations. In practice, the magnitude of Δ

*z*

_{l}has to be chosen with care, because too large perturbations will result in inexact derivatives due to model nonlinearities and, on the other hand, too small perturbations will produce inaccurate results as a consequence of numerical rounding off error.

An exact method for calculating **M**|_{z}_{,t} involves the application of a linearized model, which is usually called the *tangent linear model* of *M* (e.g., Kalnay, 2003). A representation of the matrix–vector multiplication in Eq. [4] can be developed starting from the original source code by means of automatic differentiation (Giering and Kaminski, 1998, Hascoët and Pascual, 2004). Automatic differentiation tools transform a given source code line by line into tangent linear source code, which calculates the model solution **y**(*t*) along with its derivatives (for details, see Giering and Kaminski, 1998).

In practice, the model *M* is not necessarily differentiable due to discontinuous model equations corresponding to conditional statements in the source code. For the purpose of constructing a tangent linear model, however, the model operator needs only be differentiable along the reference state **z**. This is often the case, even if the functional relationship encoded in the model may have discontinuities. Furthermore, for the same reason, the coded model equations may be continuous but not continuously differentiable. If it happens that the state **z** touches a point of discontinuity, the calculated derivative has to be interpreted as a one-sided derivative. In summary, the tangent linear model is always correct in the sense that it accurately (to rounding off error) describes the first-order Taylor series approximation of the behavior of perturbations in the nonlinear model. The tangent linear model may be of limited usefulness in the strongly nonlinear or discontinuous case, however; for example, a linear sensitivity

**y**even for relatively small δ

**z**. It was one of the primary goals of this study to investigate the usefulness of the linearized CLM.

### Community Land Model and its Tangent Linear Version

The CLM is a land surface model that parameterizes the mass and energy balances at the land surface. The tangent linear model was developed based on CLM3.5 in its standard version without any changes to the original source code. We use the generic abbreviation CLM to refer to the utilized model version. We emphasize that instead of repeating a large number of equations from the literature describing the CLM, only the most important concepts and parameterizations with respect to evapotranspiration and subsurface hydrology, which are the focus of the ensuing sensitivity study, are introduced here. For more information, see Oleson et al. (2004, 2008) and references therein for a comprehensive model description. Specific details on the selected parameterizations are presented together with the explanation and discussion of results of the sensitivity analysis where necessary.

*R*

_{n}, latent heat, LE, sensible heat,

*H*, and the ground heat flux,

*G*, summarized in the energy balance equation:

The CLM is based on a one-dimensional, isolated column approach, i.e., there is no lateral transport in mass and energy in any of the parameterizations at the land surface. Vertical transport of mass, energy, and momentum from the land surface into the lower atmosphere is calculated according to Monin–Obukhov similarity theory, which is a one-dimensional approach and has no horizontal scale a priori attached to it. The applicability of the theory depends on the spatial scale of the land surface heterogeneity and must be evaluated on a case by case basis. Furthermore, the CLM does not incorporate physics-based lateral moisture transport in the subsurface. Instead, the location of a free water table and base flow contribution to river runoff are approximated using mass balance considerations and an exponential law for the dependence of lateral drainage on the water table depth, respectively (Niu et al., 2007; Oleson et al., 2008). To account for land surface heterogeneity, the CLM utilizes a nested subgrid hierarchy, in which the vegetated part of each grid cell is subdivided into “columns” of specific soil properties. Each column may be occupied by up to four out of 17 so-called plant functional type (PFT) land cover classes, including a class for unvegetated (i.e., bare ground) surface patches. Fluxes of water and energy within the soil are calculated at the column subgrid level, while all fluxes from and to the surface are defined at the PFT level (for details, see Oleson et al., 2004).

*q*is the specific humidity of the soil pore space or canopy space,

*q*

_{a}is the specific humidity at the atmospheric reference level or the saturated specific humidity within the canopy, and ρ is the density of air. The resistance function

*r*and the scaling function

*C*are discussed in more detail below. Equation [8] is equivalent to a conductance expression parameterizing transport across an interface in various fields of physics and chemistry. Applied to the problem of soil evaporation, it is

*C*= 1 and

*r*=

*r*

_{a}+

*R*

_{soil}. The formulation of the aerodynamic resistance to water vapor transfer

*r*

_{a}is based on the aforementioned Monin–Obukhov theory and depends on the turbulence of the lower atmosphere, which makes Eq. [8] nonlinear because

*r*

_{a}becomes a function of LE and also

*H*, requiring an iterative solution approach. The quantity

*R*

_{soil}is an additional resistance describing the molecular diffusion of water vapor through large soil pores to the land surface (see Mahfouf and Noilhan, 1991, and references therein; Oleson et al., 2008).

The variable *C* is a scaling function that depends on the physics, geometry, and approximations made to represent the different exchange processes. For example, in the case of plant transpiration, *C* describes the limitation of moisture flux due to plant physiology and geometry (for details, see Oleson et al., 2004). Without going into much detail at this point, it is worthwhile mentioning that in this case *C* depends on a stress function *w*(θ) that limits the assimilation and transpiration rates under conditions of soil moisture scarcity. Here, the first link is established between the mass and energy balances at the land surface and the hydrology of variably saturated flow that determines the vertical soil moisture dynamics in the shallow subsurface.

*q*

_{s}, right at the land surface following the widely applied equation that relates the matric potential as a function of soil moisture, ψ(θ), to

*q*

_{s}:

*q*

_{sat}is the saturated specific humidity,

*g*is the gravitational acceleration,

*T*is the temperature, and

*R*is the gas constant for water vapor. This relationship demonstrates the direct dependence of the lower atmosphere on the soil moisture state right at the land surface, which in turn is influenced by vertical moisture flux in the shallow subsurface. It should be noted that the CLM sets

*q*

_{s}=

*q*

_{a}in cases where

*q*

_{sat}>

*q*

_{s}and

*q*

_{a}>

*q*

_{s}(leading to zero latent heat flux from the soil surface) to prevent numerical instabilities for very dry soils.

**q**is the moisture flux and

*e*is a general source/sink term (s

^{−1}). Note that

*e*includes fluxes from evaporation from the bare ground and depth-differentiated root water uptake by plants based on the parameterization of the vertical root distribution that decreases exponentially with depth.

*z*is the elevation above some datum and

*k*(θ) is the hydraulic conductivity. The constitutive relationships for ψ(θ) and

*k*(θ) are based on the empirical Clapp and Hornberger (1978) pedotransfer functions:

*k*

_{sat}(

*z*) is the saturated hydraulic conductivity that decreases with depth

*z*, ψ

_{sat}is the saturated soil matric potential, θ

_{sat}is volumetric soil moisture at saturation, and

*b*is an empirical exponent that depends on the clay content (%) of the soil:

*k*

_{sat}and ψ

_{sat}are also parameterized depending on soil texture (sand%), hence the latter relationships illustrate the nonlinear dependence of subsurface moisture transport on soil moisture and also soil texture. It should be mentioned that the CLM imposes a zero-flux boundary condition at the bottom of the soil column. Water from the lowest soil levels can be removed or added by interaction with a simple groundwater storage and base flow parameterization as described by Niu et al. (2007) (see also Oleson et al., 2008).

The tangent linear CLM was developed by use of the TAPENADE software (Hascoët and Pascual, 2004). A large number of model parameters and all model state variables have been considered as “active” for the tangent linear code generation (“active” means that the code has been differentiated with respect to those variables). In summary, the tangent linear code can be used to calculate derivatives with respect to the initial soil moisture and temperature, soil texture and related parameters, and plant physiological parameters (including roughness length). The code was extensively tested by comparing the tangent linear with finite-difference derivatives according to Eq. [6]. The run time ratio of the tangent linear code to the original code is about 2:1.

## Experimental Setup

The CLM and its tangent linear version were set up to run on four columns of identical soil composition, one soil column without vegetation (bare ground) and each of the remaining columns occupied by one PFT. Here, we concentrate on the simulations for bare ground and the PFT “crop.” Simulations were performed for the location of the Rur catchment area in western Germany (50°N, 7°E). To cover a wide range of soil conditions, experiments with three soil types including sandy loam (70% sand, 10% clay), loam (40% sand, 25% clay), and clay loam (25% sand, 40% clay) were conducted.

*T*and incident radiation

*R*

_{i}is approximated by the simple sinusoidal relationship

*t*in days,

*a*

_{R}= 700 W/m

^{2},

*T*

_{0}= 290 K, and

*a*

_{T}= 10 K. The atmospheric downward longwave radiation

*L*

_{atm}is calculated by the offline version of the CLM according to

*e*(

*t*) is the atmospheric vapor pressure (hPa) and σ is the Stefan–Boltzmann constant (Oleson et al., 2004). These meteorological conditions were chosen to obtain large latent heat fluxes, because we were primarily interested in situations in which the coupling between the soil–vegetation component and the atmosphere is strong.

For all experiments in this study, we initialized the model with a constant relative soil moisture profile, θ_{R,i} = θ_{i}/θ_{sat,i} = constant, where θ_{i} is the volumetric soil moisture in model level *i*, and θ_{sat,i} is the corresponding soil moisture at saturation. The layer node depths *z*_{i} and layer thicknesses *dz*_{i} for the 10 soil layers currently used in the CLM are listed in Table 1 . To cover a wide range of representative soil moisture conditions, model runs with initial soil moisture values between θ_{R,i} = 0.9 and θ_{R,i} = 0.3 in steps of 0.05 were performed. Initialization with a hydrostatic soil moisture profile was also tested, but no significant qualitative differences were found.

Latent heat fluxes obtained with our experimental setting for the three soil types are presented in Fig. 1 for bare ground and the CLM crop PFT, respectively. The variability of LE with time and under different soil moisture conditions was largest for the clay loam and smallest for the sandy loam. The presence of crops generally tends to attenuate the variability of fluxes with soil moisture. As shown in Fig. 2 , maximum fluxes were reduced for wet soils and increased for drier soils unless the availability of water for plants became limited below θ_{R,i} ≤ 0.55 (clay loam) and θ_{R,i} ≤ 0.45 (loam), leading to a steep decrease of fluxes over the crop PFT for these soil types. This issue is discussed further below.

## Validity of the Linear Approximation

**z**

_{l}:= (0, …, Δ

*z*

_{l}, …, 0)

^{T}in the

*l*th component of the model input vector

**z**with the corresponding tangent linear sensitivity

**z**

_{l}. In this study, we focused on key elements of the perturbed output vector δ

**y**

_{l}(

*t*), namely, the latent heat flux component δ

*L*

_{l}(

*t*), and the volumetric soil moisture δθ

_{l,i}(

*t*), at model layer

*i*. The performance of the tangent linear model was rated by the error of linearization, which is defined as

_{l,i}(

*t*).

*z*. The large number of sensitivity experiments conducted in the framework of this study was used to check that the tangent linear and nonlinear model results behaved in accordance with Eq. [16] for our test cases.

### Soil Moisture

As a starting point of this study, we addressed the question, how good is the time evolution of soil moisture approximated by the tangent linear model? This task is directly related to the impact of using a linearized model in a 4D-Var scheme for the assimilation of temporally scattered soil moisture observations. Strong nonlinearities in the model will slow down or, in the worst case, prohibit convergence of the assimilation algorithm.

For this test case, the perturbation of the input vector **z** was chosen as δ**z**_{l} := (0, …, Δθ_{0,i}, …, 0)^{T}, where θ_{0,i} is the initial volumetric soil moisture content in model layer *i*. As a typical result, we present a set of experiments that was conducted using a Δθ_{0,i} corresponding to a 20% reduction in initial soil moisture for *i* = 1, …, 10. The resulting nonlinear and tangent linear sensitivities δθ_{i}(*t*) and

_{i}(

*t*) due to a variation of initial soil moisture θ

_{0,j}in neighboring model layers (

*i*≠

*j*) are presented below.

As expected, the soil moisture perturbations δθ_{i}(*t*) and

The tangent linear approximation was generally able to reproduce the full sensitivity pattern (Fig. 3) very well. The deviation between tangent linear and nonlinear perturbations can be explained in terms of the nonlinear dependence of hydraulic conductivity and soil matric potential on soil water content (Eq. [10–11]). As an example, Fig. 4 displays the time evolution of a soil moisture perturbation for four different variations (±5 and ±20%) of the initial soil moisture in Model Layer 5. While the agreement between δθ_{i}(*t*) and

Larger linearization errors occurred solely for deeper soil layers (*z*_{i} ≥ 170 cm) under almost saturated (θ_{R} ≥ 0.85) conditions. This fact can be attributed to the occurrence of saturation, as illustrated in Fig. 5 . Because the hydraulic conductivity is large in wet soils, water was rapidly drained to the underlying model layer. Contrarily, the interaction with groundwater as implemented in the CLM (recharge and lateral drainage) was relatively slow. Therefore, the CLM bottom layer (Layer 10, remember that the CLM imposes a zero-flux boundary condition at the bottom of the soil column) reached saturation in the reference (unperturbed) model run after approximately 20 h and, somewhat delayed, in the case of a modest initial soil moisture reduction of 5%. Thereafter, Layer 9 soil moisture began to increase again. If the initial soil moisture was reduced by 20%, no saturation occurred at all, leading to a qualitatively different evolution of Layer 9 soil moisture. Naturally, the perturbation

_{9}(

*t*) in the latter case: from a coding point of view, the occurrence of saturation guides the program flow through dedicated blocks of conditional statements, which potentially leads to discontinuous derivatives. This must be taken into account if a linearized model is used in a data assimilation framework in the future. Nevertheless, it can be noted that the qualitative agreement between the tangent linear and nonlinear sensitivity was good for the perturbation magnitude of 5%. We also tested cases with a hydrostatic initial soil moisture distribution, that is, cases where the dynamics of soil water movement was small (not shown). The results were similar in that a perturbation of soil moisture applied in the layer directly above the water table led to a poor tangent linear approximation.

Note that additional experiments were conducted for the other two soil types in the presence and absence of vegetation. We found similar results as described above; therefore no additional analyses are presented here.

The sensitivity of soil moisture evolution with respect to the initial conditions in adjacent soil layers was also investigated. These sensitivities allowed a conclusion to be drawn about the effectiveness of indirect measurements in combination with advanced data assimilation methods. For example, satellite microwave instruments, while providing a superior spatial coverage, only probe a very shallow soil layer near the surface (about 5-cm depth). Soil moisture values beneath the surface layer can be retrieved by Kalman filter type or variational assimilation schemes, provided a reasonably strong sensitivity to upper level soil moisture exists.

As an example, we present the nonlinear and tangent linear sensitivities δθ_{i}(*t*) and

**z**

_{l}:= (0, Δθ

_{0,2}, …, 0)

^{T}along with the linearization error in Fig. 6 . Note that the second row is identical to the second row of Fig. 3 except for a change in units, which are in millimeter water column to account for differences in layer thicknesses (compare Table 1). It should be stressed that this choice overemphasizes the sensitivity of relatively thick model layers. For example, the observed maximum perturbation of about −0.5 mm in the CLM Layer 5 at the 21-cm depth due to a 20% reduction in the Layer 2 initial soil moisture (2.8-cm depth) corresponds to just 0.004 mm

^{3}/mm

^{3}. As in Fig. 3, it can be observed that perturbations in the initial soil moisture quickly vanished for wet soil (θ

_{R}> 0.7) due to a large hydraulic conductivity and corresponding fast gravitational drainage. As a consequence, under these conditions, the sensitivity tends to be larger in soil layers below the initial perturbation than in the surface layer. Contrarily, for dry soil (θ

_{R}< 0.5), the initial soil moisture perturbation Δθ

_{0,2}was largely retained within Layer 2, leading to small sensitivities in neighboring model layers. We also analyzed the simulation results for perturbations Δθ

_{0,i}applied in other model layers (not shown) and found the results to be qualitatively similar. It can be concluded that there is a range of soil moisture values (roughly 0.5 < θ

_{R}< 0.7) that favors a sensitivity of soil moisture with respect to the soil moisture in adjacent model layers. For drier soils (θ

_{R}< 0.5), there was practically no sensitivity of soil layers below 20 cm to the initial soil moisture of the two topmost model layers.

### Latent Heat Flux

#### Perturbation in Soil Moisture

*L*(

*t*) and

_{R}≤ 0.5, where the linearization error increased. Figure 8 shows more detail for θ

_{R}= 0.5 and a soil moisture perturbation applied in Layer 5. These differences can be mainly attributed to the soil moisture limitation function

*w*, which was mentioned above and is also shown in Fig. 8. This function restricts the availability of water to plants depending on the soil water content:

_{so}and ψ

_{sc}are PFT-specific parameters, the soil water potential at full stomatal opening and closure, respectively. The soil moisture thresholds θ

_{R,w = 0}and θ

_{R,w = 1}, which determine the matric potential ψ such that

*w*= 0 for θ

_{R}< θ

_{R,w = 0}and

*w*= 1 for θ

_{R,w = 1}< θ

_{R}are given in Table 2 for the three soil types considered in this study. Note that ψ(θ) also depends on soil texture via Eq. [11].

In the case of the clay loam, *w* = 0 for θ_{R} < 0.485. Therefore, the nonlinear sensitivity to a 20% reduction in soil moisture drastically decreased for initial θ_{R} < 0.5, as shown in Fig. 7. This indicates that a further reduction in soil moisture had no effect on canopy transpiration. Because the slope of *w* at θ_{R} = 0.5 was still large (as indicated in Fig. 8), however, the tangent linear model locally calculated a large sensitivity to soil moisture changes, which was valid only for very small perturbations in soil moisture. Interestingly, the tangent linear model also overestimated the sensitivity of LE to large (20%) positive soil moisture perturbations in this situation (see Fig. 8). Again, the tangent linear model uses a linear approximation (the slope of *w* at θ_{R} = 0.5) to calculate the response of LE to changes in soil moisture, whereas the actual graph of *w*(θ) is flatter. These simulations suggest that, in terms of uncertainty in θ_{0}, the function *w* dominates uncertainty in latent heat flux estimates for vegetated patches under dry conditions.

#### Perturbation in Soil Texture

We also studied how well the tangent linear model can reproduce LE perturbations caused by variations soil texture, which is spatially highly variable in real systems. A number of parameters and processes in the CLM are parameterized based on soil texture, which is specified in terms of sand (%) and clay (%). Parameters depending on soil texture include soil porosity, saturated hydraulic conductivity, the aforementioned functions *k*(θ) and ψ(θ) via the Clapp and Hornberger formulation, the saturated soil matric potential, and others.

In principle, the joint estimation of model parameters like soil texture and state variables can be accomplished by 4D-Var data assimilation if the derivatives of fluxes with respect to the respective parameters are accurately provided by the linearized model. Nevertheless, it should be mentioned that the resulting optimization problem may be ill posed (depending also on the nature and configuration of observations), given the large set of parameterizations in which soil texture is involved. Additional work will be necessary to clarify whether soil texture can be a viable control parameter in 4D-Var assimilation schemes. In this study, we solely addressed the question whether the derivatives calculated by the tangent linear model could be used to approximate the nonlinear relationship between soil texture and latent heat flux.

The tangent linear approximation of LE sensitivity due to the variation in sand or clay content is generally excellent. As an example, we show the latent heat flux perturbations over bare ground for a 20% reduction in clay content (applied to all model layers, the missing clay fraction was replaced by silt) in Fig. 9 .

We observe that even in this case, where the latent heat flux depended only indirectly on the perturbed parameter, the tangent linear approximation captured the relevant features of the nonlinear model sensitivity. The linearization error, which exhibited distinctive patterns, is difficult to explain, because clay content appears in a large number of parameters, which in turn are applied in various parameterizations in the CLM. The basic rationale, however, stays the same. Strong nonlinearities may result in a relatively poor approximation of the tangent linear model. Additionally, discontinuous parameterizations may result in a relatively large deviation between the nonlinear and tangent linear sensitivities if the results from the tangent linear model are extrapolated across a large uncertainty range.

#### Perturbation in Leaf Area Index

Leaf area index plays an important role in modeling plant evapotranspiration. Therefore, the sensitivity of latent heat flux with respect to variations in the LAI is expected to be high. On the other hand, LAI is also important for radiative transfer calculations and determination of canopy and ground temperature, which leads to a number of possible nonlinear feedbacks between LAI and LE. We found that the linear approximation of LE sensitivity was excellent in many cases but failed for some of the scenarios considered in this study. Two specific examples can be found in Fig. 10 , which displays the linear and nonlinear sensitivities of canopy transpiration, under-canopy ground evaporation, and total latent heat flux in response to perturbations (−5% and −20%) in LAI for a dry (θ_{R} = 0.5; Fig. 10, top) and a wet (θ_{R} = 0.9; Fig. 10, bottom) loam soil. Note that evaporation from the canopy space can be neglected here because there is only a small contribution to LE directly after sunrise due to evaporating dew. Generally, the response of canopy transpiration to a reduction in LAI was in reasonable agreement between the nonlinear simulations and the tangent linear approximation (see Fig. 10a). As expected, a smaller leaf area basically led to less transpiration and consequently to reduced latent heat flux from the canopy space. The opposite effect was observed for ground evaporation: a reduction in LAI leads to increased ground evaporation because more incident radiation reaches the soil surface. As shown in Fig. 10b, the change in under-canopy ground evaporation was considerably underestimated by the tangent linear model. For the dry soil (θ_{R} = 0.5), the evaporation from the ground constitutes a comparatively small fraction of total LE (about one third), while for the wet loam (θ_{R} = 0.9) this flux begins to dominate the total LE after 1200 h. Consequently, the LE response to changes in LAI was poorly represented by the tangent linear approximation in cases where the under-canopy ground evaporation was large. An inspection of the full set of simulations revealed that this behavior was found for θ_{R} > 0.7 in the loam and clay loam cases.

## Sensitivity of Latent Heat Fluxes: Results

*S*

_{l}

^{LE}(

*t*) with respect to various inputs

*z*

_{l}in a similar manner as the validation results above. To make the derivatives

*S*

_{l}

^{LE}(

*t*) comparable, we calculated the normalized quantities:

^{2}) resulting from a 1% change in the input

*z*

_{l}. Thus, to get an estimate of

*S*

_{l}

^{LE}(

*t*) for, e.g., an uncertainty of 10 or 20% in

*z*

_{l}, the results have to be extrapolated by a factor of 10 or 20, respectively.

### Sensitivity of Latent Heat Flux to Soil Moisture Uncertainty

For bare ground, the sensitivities

For the following quantitative discussion, we assumed 20% uncertainty in initial soil moisture θ_{0}. It should be noted that this number is a rough estimate, which we believe is reasonable in many situations found in the field. Using this estimate, a large sensitivity of

^{2}was observed at a depth of 21 cm in case of the clay loam soil. The corresponding maximum values for the loam and sandy loam were found to be 22 and 12 W/m

^{2}, respectively. In the shallowest layer at the 0.7-cm depth, we found that uncertainty in θ

_{0}(e.g., due to erroneous precipitation input) may lead to approximately 7 to 30 W/m

^{2}uncertainty in the modeled LE for intermediate moist (0.5 ≤ θ

_{R}< 0.7) soil. The largest sensitivities of up to 42 and 24 W/m

^{2}were found for dry clay loam and dry loam soils, respectively. These large sensitivities were due to the strong variation of the soil matric potential ψ

_{1}with soil moisture under dry conditions (see Eq. [9]). It is remarkable that uncertainty in θ

_{0}at intermediate depth appears to have a comparably strong influence on LE as uncertainty in θ

_{0}right at the land surface for a broad range of soil states. This can only be explained with soil moisture dynamics that redistributes moisture from deeper levels, where more water can be stored due to the larger layer thickness, throughout the soil column. We note that these results are consistent with previous sensitivity and assimilation studies (e.g., Jacquemin and Noilhan, 1990; Bouttier et al., 1993; Bouyssel et al., 1999), which highlighted the importance of deeper soil moisture for weather forecasting.

For the CLM crop PFT (Fig. 12 ), there was almost no sensitivity to θ_{0} in any model layer for wet conditions (θ_{R} ≥ 0.7). The large values of

_{R,w=0}< θ

_{R}< θ

_{R,w=1}(compare Table 2) were due to the aforementioned parameterization of water availability to plants in the CLM. As discussed above, the interpretation of these large sensitivities in terms of a reaction of the nonlinear model to variations in soil moisture is limited to small variation magnitudes due to the strong nonlinear dependence of the stomatal resistance formulation on soil moisture. Outside the range θ

_{R,w = 0}< θ

_{R}< θ

_{R,w = 1}, the presence of vegetation generally tended to reduce the sensitivity of LE with respect to soil moisture. This behavior can be explained by the fact that canopy transpiration is only marginally or not at all affected by a change in soil moisture once the stress function

*w*= 1 (θ

_{R,w = 1}< θ

_{R}) or

*w*= 0 (θ

_{R}< θ

_{R,w = 0}). Therefore, it is solely the response of under-canopy ground evaporation to changes in soil moisture that contributes to the LE sensitivity in the latter cases.

### Sensitivity of Latent Heat Flux to Soil Texture Uncertainty

The sensitivity of latent heat fluxes with respect to sand content,

Assuming an uncertainty of 20% in sand and clay contents for this study, we observe that the magnitude of maximum sensitivities is comparable to maximum sensitivities with respect to soil moisture for a large range of conditions. Especially, the sensitivity of

^{2}found for wet sandy loam is much larger than the sensitivity to soil moisture in any model layer (compare Fig. 11). A similar, but opposite, pattern was found for variations in clay content (not shown, but compare Fig. 9; note that this figure refers to a 20% reduction in clay content): in this study, fluxes were increased for wet soils and tended to be reduced for dry soil. Again, particularly for wet soils, LE was much more sensitive to uncertainties in soil texture than in soil moisture. This can be explained by the fact that under wet conditions, LE occurs at the potential rate and is not water limited. Soil texture, on the other hand, influences the drainage behavior of the soil and determines the transition from wet to dry conditions, under which LE may be limited.

Sensitivities

^{2}. Nevertheless, this sensitivity was still much larger than the sensitivity to the initial soil moisture (compare Fig. 12). Large sensitivities occurred for the clay loam and loam in a soil moisture range θ

_{R,w = 0}< θ

_{R}< θ

_{R,w = 1}, which again can be attributed to the stomatal resistance function

*w*. In this study, the connection between

*w*and sand content was established via the saturated soil matric potential ψ

_{sat}(Eq. [11] and [17]). Note, that the tangent linear approximation was found to work well in this case (not shown), hence the large sensitivities of up to 34 and 44 W/m

^{2}(taking 20% uncertainty in sand content as a basis) encountered for the clay loam and loam, respectively, can be considered realistic.

### Sensitivity of Latent Heat Flux to Uncertainty in Roughness Length and Leaf Area Index

The CLM calculates the fluxes of momentum, latent heat, and sensible heat from the soil to the atmosphere using Monin–Obukhov similarity theory (e.g., Brutsaert 2005). Numerical values for the corresponding roughness length parameters *z*_{0}, *z*_{0q}, and *z*_{0h} for momentum, moisture, and energy, respectively, are attributed with large uncertainties because comprehensive experimental data are typically unavailable. Therefore, literature values for certain surface types have to be adopted for modeling, which have uncertainties as large as ±50%. In the CLM, *z*_{0q} and *z*_{0h} are derived from *z*_{0} as *z*_{0q} = *z*_{0h} = *z*_{0}exp[*a*(*u*_{*}*z*_{0}/*v*)^{0.45}], where *u*_{*} is the friction velocity, *v* is the kinematic viscosity of air, and *a* = 0.13 (Oleson et al., 2004, and references therein). The tangent linear LE sensitivities were found to be in good agreement with nonlinear calculations for both vegetated and bare ground surfaces and perturbations in *z*_{0} as large as 30% (not shown). For larger perturbations, the tangent linear approximation was found to underestimate the actual nonlinear sensitivity. Hence, the sensitivities

*z*

_{0}. As expected, the sensitivity pattern of

*z*

_{0}is taken as a basis for the estimate of LE uncertainty, the resulting maximum value of ∼5 W/m

^{2}is relatively small. Although this finding is of limited scope because it was obtained for a single meteorological forcing scenario only, it is consistent with the results of previous studies. Jacquemin and Noilhan (1990) and Bouttier et al. (1993) reported a low sensitivity of surface fluxes with respect to variations in

*z*

_{0}as large as an order of magnitude. This weak sensitivity is due to the fact that a perturbation in

*z*

_{0}induces a small variation of the aerodynamic resistance

*r*(Eq. [8]) that depends on its logarithm (Oleson et al., 2004).

Above, the tangent linear approximation for the sensitivity of LE to variations in LAI was found to be relatively poor under wet conditions (θ_{R} > 0.7), especially for the loam and clay loam soils. Because the largest sensitivities occurred for θ_{R} < 0.7, however, where the tangent linear model performed well (compare Fig. 10), we present a sensitivity map of

_{R,w = 0}for the clay loam, loam, and sandy loam soils. For the clay loam and loam, the limitation of water availability led to vanishing LE sensitivity for θ

_{R}< 0.5 and θ

_{R}< 0.4, respectively. For an assumed 20% uncertainty in LAI, the resulting uncertainties in LE reached maximum values of about 30 W/m

^{2}for the sandy loam and only a little less for the other soil types. Hence, LAI must be considered as an important source of uncertainty for modeled latent heat flux under dry conditions.

## Conclusions

The tangent linear CLM developed in this study generally reproduced the processes related to soil moisture dynamics in the shallow subsurface and to latent heat flux very well. Perturbations in the initial soil moisture content evolved in time in good accordance with the full nonlinear model for realistic perturbation magnitudes of 20%. Because 4D-Var data assimilation utilizes an iterative approach, i.e., successively smaller corrections are applied to the initial soil moisture conditions, it can be concluded that the linear approximation is in general applicable. Problems may arise with the assimilation of soil moisture observations following strong precipitation events, when the soil becomes saturated, because the model is not continuously differentiable at the transition between saturated and unsaturated conditions. Additional 4D-Var data assimilation studies are necessary for clarification.

The approximation of latent heat flux sensitivity with respect to initial soil moisture content θ_{0}, soil texture (sand and clay contents), *z*_{0}, and LAI through the tangent linear model was found to be viable in most cases. As expected, model parameterizations with strong nonlinearities may invalidate the tangent linear model results when extrapolated across a large uncertainty range of the input parameter in question. The number of processes that we identified as significantly deteriorating the calculated linear LE sensitivities, however, is small. The plant stress parameterization *w*(θ), which is strongly nonlinearly dependent on θ, potentially leads to large differences between linear and nonlinear LE sensitivities with respect to soil moisture under dry conditions. Furthermore, the response of LE to variations in LAI was not reproduced well for wet soils because the tangent linear model underestimated the nonlinear response of under-canopy soil evaporation to changes in LAI.

Bearing these limitations of the tangent linear model in mind, we conclude that the tangent linear model is a useful tool for sensitivity studies. The linear sensitivity analysis presented for three generic soil types and a range of initial soil moisture conditions in this study provides a comprehensive survey about which model parameters control latent heat flux under which conditions. We found that soil moisture had only a minor impact on LE for wet soils; this was particularly true for the vegetated surfaces (covered with an agricultural crop). More importantly, the latent heat flux was sensitive to soil moisture under medium-wet and dry conditions, including a significant contribution of soil moisture at about the 20-cm depth. Hence, to model realistic latent heat fluxes, it is not sufficient, even at the relatively short time scale of 3 d considered here, to correct the initial soil moisture in the uppermost soil layers of some 5 cm. The moisture state of the subsurface must also be corrected at deeper levels, which requires depth-differentiated measurements. This must be kept in mind for the application of remotely sensed soil moisture fields in data assimilation approaches that only provide limited information on the skin soil moisture right at the land surface. These results are consistent with previous sensitivity and assimilation studies, which have highlighted the importance of deeper soil moisture for weather forecasting (e.g., Jacquemin and Noilhan, 1990; Bouttier et al., 1993; Bouyssel et al., 1999).

Additionally, soil texture was shown to play a pivotal role under wet conditions for unvegetated surface patches, in agreement with previous findings (e.g., Wilson et al., 1987; Jacquemin and Noilhan, 1990; Bouttier et al., 1993). On the other hand, LE tended to be most sensitive to LAI over dry soils. The influence of uncertainty in *z*_{0} on LE seems to be less important, albeit this statement might be revised if sensitivities under different meteorological forcings are studied.

This study was conducted using a single set of synthetic meteorological forcing data, which was intended to reflect the conditions on a clear sky day at the beginning of June. This choice was motivated by the fact that the coupling between the land surface and the atmosphere is strong under these conditions. It should be mentioned that our findings are not necessarily transferable to different meteorological situations.

The development presented in this study was done in the framework of setting up a 4D-Var land surface assimilation scheme. As a next step, the implementation of an adjoint version of the CLM into a variational scheme for the assimilation of soil moisture and soil temperature observations will be presented in a subsequent study. Because the CLM treats columns of soil separately (without any lateral interaction), the assimilation scheme remains two-dimensional variational as long as the model is driven by offline meteorology and as long as no horizontal correlations are encoded in the background error correlation matrix. Coupling to the Weather Research and Forecasting model (Skamarock and Klemp, 2008) as well as exploitation of horizontal correlations is planned for the future.

In summary, we conclude that a tangent linear model is able to capture many relevant processes encoded in a complex nonlinear SVAT model. Particularly in the context of data assimilation applications, it is useful to study the potential and limitations of the linearized model under varying conditions.

This work was funded by Deutsche Forschungsgemeinschaft (DFG) in the framework of the SFB/TR 32 “Patterns in Soil–Vegetation–Atmosphere Systems: Monitoring, Modeling and Data Assimilation.” The National Center for Atmospheric Research (NCAR) is gratefully acknowledged for providing the CLM source code and input data sets. The tangent linear CLM source code was generated using the TAPENADE software provided by the Institut National de Recherche en Informatique et en Automatique (INRIA). The authors thank two anonymous reviewers for their constructive comments and suggestions, which helped improve the manuscript.