A physically based, coupled and distributed hydrologic model has been set up for the Ringkøbing Fjord catchment, Denmark. This transient model, built with the MIKE SHE/MIKE 11 code, comprises all major components of the terrestrial water cycle, including a three-dimensional finite-difference groundwater model. The dynamic coupling of the hydrologic processes secures physically sound feedback and makes the model an ideal tool for evaluating the overall water balance and quantifying potential water balance issues. Historically, failure to obtain water balance closure has been a persistent and much debated problem in Denmark, presumably arising mainly from uncertainties in precipitation, actual evapotranspiration, and groundwater flow to the sea. In this study, the water balance issues were addressed within the modeling framework through analysis of different rain gauge catch corrections and potential evapotranspiration input. The analysis focused on the effect of different rain gauge catch corrections on the model performance, the optimized parameter sets, and state variables not included in the model calibration. The results suggest that water balance problems can be reduced by using a dynamic rain gauge catch correction based on daily wind speed and temperature fields. The model optimization and performance evaluation revealed, however, that several parameter sets gave similar performance compared with the observed groundwater head and river discharge data but still resulted in significant differences with respect to internal water fluxes such as evapotranspiration, groundwater recharge, and stream flow components. New observational data are needed to constrain the model further and thus reduce the water balance uncertainties convincingly.

By introducing different model input scenarios in a systematic way, the impact of precipitation correction on catchment-scale hydrologic modeling results has been elucidated probably for the very first time. The results show that great improvement in model performance can be achieved by applying a dynamic rain gauge catch correction.

A correct quantification of the water balance and its individual components (precipitation, evapotranspiration, river runoff, subsurface runoff, and storage change) is vital for water resources management (Wisser et al., 2010). Quantification of these components relies on the availability of accurate hydrologic data and on the application of reliable hydrologic models. Accurate precipitation measurements are of particular importance, especially in the northern latitudes, where uncertainties are large due to undercatch of solid precipitation (Adam and Lettenmaier, 2003; Tian et al., 2007; Yang et al., 2005). Recent studies have documented the large uncertainty in precipitation estimated at global (Fekete et al., 2004; Legates, 1995) and continental scales (Tian et al., 2007; Walsh et al., 1998). There are, however, few detailed modeling studies of the effect of precipitation accuracy and catch correction at local to regional scales.

Problems are often encountered when modeling the water balance at the catchment scale (Arnold et al., 2000; Henriksen et al., 2003). Hydrologic models are typically calibrated against existing field data, of which observed river discharge is the main water balance criterion. If the climate input data are moderately biased, this can often be compensated for during model calibration by tweaking parameter values so that the model performs well against discharge. There is, however, no guarantee that the individual water balance components are correctly described, especially if the model is constrained using only a few objective functions.

Even without erroneous input data, the internal state variables are uncertain. It is generally recognized that calibration cannot lead to a unique set of “true” parameter values. To the contrary, many alternative combinations of parameter values may result in similar performance against the available field data. Or in other words, a model that is performing well in a calibration test is likely to provide the right answer for the wrong reasons. This is denoted the equifinality problem (Beven, 2006). A consequence of equifinality is that when a hydrologic model is constrained by few field data, which is often the case, then the model simulations of the internal variables that have not been subject to calibration are often poor (Refsgaard, 1997). Hence, it appears that constraining a hydrologic model against multiple types of data such as time series of river discharge, groundwater heads, soil moisture, and evapotranspiration fluxes will be required to achieve a fully reliable water balance (Franks et al., 1998; Pauwels et al., 2002; Immerzeel and Droogers, 2008), and this requires the use of a coupled hydrologic model with physically based descriptions of all these components.

Biased climate input data represent a special case of the equifinality problem because models driven by different climate input can be calibrated to perform similarly, especially if the model is not well constrained. To examine these problems, a fully coupled modeling approach is needed that accounts for the entire water budget. Rainfall–runoff models with simplified descriptions of the evapotranspiration and groundwater components or more detailed energy balance models (soil–vegetation–atmosphere transfer [SVAT] or land surface models) focusing on surface processes will have difficulties revealing biases in the climate input because they do not account for the entire water budget. There are, at the moment, surprisingly few studies that have documented and evaluated the existence of biases in climate input through a modeling approach. We assume that this is mainly due to the fact that most models are either not fully coupled or not well enough constrained, and not because climate input biases are neglectable.

In a Danish context, water balance closure through hydrologic modeling at a national scale has not been achievable with realistic model parameters (Henriksen et al., 2003). Uncertainties in especially winter precipitation due to catch correction of solid precipitation has been considered a major obstacle in this regard. Recognizing these problems, Plauborg et al. (2002) prepared a set of recommendations for how to use the existing Danish precipitation and climate data to obtain a closure of the water balance. At the same time, they concluded that there is a severe knowledge gap on basic components of the hydrologic cycle. These recognized problems were the motivation for the formulation of the Hydrological Observatory and Exploratorium (HOBE) project (Jensen and Illangasekare, 2011), which aims at providing improved scientific knowledge on the individual hydrologic processes as well as the overall water balance at the catchment scale by combining new, sophisticated field measurement techniques and integrated hydrologic modeling.

The overall objective of our study was to evaluate the effect of applying different realistic climate forcing data sets through a coupled surface–subsurface modeling framework for a relatively well constrained, large-scale river catchment. The probable biases in some of the climate input data pose a major challenge for the closure of the water balance and simulation of seasonal dynamics. The specific scientific objectives were: (i) to construct and calibrate a coupled and distributed hydrologic model suitable for integrating new field data such as evaporative fluxes, soil moisture, stream–aquifer interaction, and subsurface groundwater flow at different spatial and temporal scales; and (ii) to evaluate the effect of applying different climate forcing data considering the reliability of model parameters, simulated fluxes, and alternative parameter optimization criteria.

The study area included the catchment area for the Ringkøbing Fjord lagoon situated in western Jutland, Denmark (Fig. 1 ). The catchment has an area of approximately 3500 km2 and is bounded by the North Sea to the west and the Jutland Ridge to the east. The present topography reflects the physical environment from the last glaciation with a maximum glaciation border trending north–south through central Jutland (Houmark-Nielsen and Kjaer, 2003) as a ridge with elevations up to approximately 130 m above sea level. The topography slopes gently to the west of the ridge, where land surface elevations close to sea level are found in the western part of the area. The northern and southern boundaries are specified by local water divides. Based on satellite data, the land use in the area has been estimated as grain and corn (Zea mays L.) (56%), grass (29%), heath (5%), forest (7%), urban (2%), and other (1%). The main grain crops are winter and summer barley (Hordeum vulgare L.) and wheat (Triticum aestivum L.). Most agricultural areas perform rotation shifts, while some are covered with multiannual grasses. The area covered by forest is dominated by conifer trees.

Geology and Hydrogeology

The surface geology is dominated by glacial material from the Quaternary age. In the central and eastern part, outwash material, mainly from the last glaciation, is found. The soil type is defined as coarse sandy soil, generally with high permeability. To the north and south, sedimentary remnants from earlier glaciations, primarily the Saale, exist locally. The soil type is characterized as clayey sandy soil; however, the permeability of these deposits is relatively high.

The thickness of the Quaternary deposits is generally <50 m in the central and northeastern part of the area and increases in the southern and western part; in some places, it reaches depths of approximately 250 m (Fig. 2 ). Alternating layers of marine, lacustrine, and fluvial deposits of Miocene age underlie the Quaternary deposits. The sequence is formed by layers of mica clay, silt, and sand, together with quartz sand and gravel (Scharling et al., 2009). Thick clay layers from the Paleogene underlie the Miocene deposits. The Quaternary and Miocene sand formations often form large interconnected aquifers. At depth, however, confined Miocene sand units are found on top of the Paleogene clay, which acts as an impermeable flow boundary.

The hydraulic conductivity of the sand formations is generally high, on the order of 10−4 to 10−3 m s−1, while the Quaternary and Miocene clay units have low conductivities on the order of 5 × 10−9 to 1 × 10−6 and 1 × 10−8 to 1 × 10−7 m s−1, respectively (Harrar et al., 2003; Sonnenborg et al., 2003). The dominant groundwater flow direction is from east to west. At the water divide to the east, the hydraulic head level is about 75 m above sea level, decreasing to sea level at the west coast. The average head gradient is 0.001; seasonal variation in head values is generally <1 m.

Climate and Hydrology

The weather in the area is highly dependent on the wind direction because of the proximity to both the sea and the European continent. The dominant westerly wind results in mild winters and chilly summers with variable weather and frequent rain. Winds from the south and east are influenced by continental weather systems characterized by low temperatures in winter and high temperatures in summer (van Roosmalen et al., 2007). The average annual temperature is 8.2°C, with a maximum of 16.5°C in August and a minimum of 1.4°C in January (Fig. 3 ). During the winter season, the dominant precipitation system is extratropical storms from directions between southwest and northwest. The frontal precipitation mechanism is enhanced by orographic effects caused by the moderate increase in surface elevation from west to east. In summer, convective rain events significantly influence the precipitation pattern, and the most intense rainfall events are observed from June to August with maximum daily rainfall of up to 50 to 60 mm. Using standard methods to correct the undercatch caused by turbulence effects in the vicinity of the observation gauges (Allerup et al., 1998), the mean annual precipitation in the area is estimated at approximately 1050 mm yr−1. The precipitation varies seasonally, with a maximum in autumn and winter and a minimum in spring (Fig. 3).

Due to the sandy soil and relatively low rainfall intensities, overland flow seldom occurs, except in wetlands, and almost all precipitation that is not intercepted by the vegetation infiltrates into the soil. The average reference evapotranspiration (ET) calculated using the Makkink (1957) formula is 570 mm yr−1. Hence, a large part of the precipitation leaves the catchment through the surface water and groundwater systems.

Some of the recharge is captured by the drainage system, composed of tile drains, ditches, and small creeks, and is transferred relatively rapidly to the streams. The remaining part of the recharge enters the aquifers and discharges to the streams downstream of the infiltration areas. The main river in the catchment is the Skjern River, which drains approximately 70% of the total area. The average discharge at the outlet of the catchment is almost 40 m3 s−1, corresponding to 470 mm yr−1. Evapotranspiration and groundwater discharge out of the catchment are responsible for the remaining components in the water balance. Airborne geophysical mapping performed by Kirkegaard et al. (2011) indicated that the groundwater discharge perpendicular to the western boundary of the model area is relatively low. Hence, the uncertainty related to submarine groundwater discharge to the sea is, based on current knowledge, expected to be relatively low.

Climate Data

Daily rain gauge data were available from 45 stations within and around the Ringkøbing Fjord catchment (Fig. 1). Minor gaps existed in the time series, which were filled by replacing the missing data by values from the nearest neighboring station. Regarding reference ET and temperature data, the daily 20-km national grid data set produced by the Danish Meteorological Institute was used. The location of the grid is illustrated in Fig. 1. Reference ET was calculated using the Makkink equation adjusted for Danish conditions (Scharling, 1999). All meteorological time series were available as daily data for the period 1990 to 2003.

Recommended Climate Input, Catch Correction, Crop Coefficient, and Reference Evapotranspiration

In light of the observed problems related to water balance closure for Danish catchments, Plauborg et al. (2002) prepared a set of recommendations for operational application of the existing Danish precipitation and climate data. They recommended the use of standard rain gauge catch correction factors based on the 1961 to 1990 period for modeling purposes. With regard to the reference ET, problems arising from inaccuracy in measurements of relative humidity led to the recommendation to utilize the Makkink equation instead of the Penman equation because the former depends on global radiation and temperature only. The simplifications inherent in the Makkink equation, however, contribute to some regional biases in the estimated ET, where the reference ET tends to be overestimated in western Denmark compared with the eastern part of the country, for which the Makkink equation was adjusted. Additionally, it was suggested that a crop coefficient, Kc, be used as a multiplication factor with the reference ET to account for seasonal vegetation dynamics (Allen et al., 1998). The recommended Kc values are presented in a simplified manner in Table 1 , along with the values used in the initial model setup. As can be seen, the initial values represent the upper envelope of the recommended range because experience has shown that a high ET rate is favorable in combination with the standard rain gauge catch correction.

This study primarily investigated the effect of applying different rain gauge catch corrections, but also investigated the use of alternative reference ET and Kc values. For Kc, the alternative values are given in Table 1, representing the lower envelope of the recommended range. For the reference ET, an alternative model input was applied, where the original reference ET time series was reduced by 5% due to the expected overestimation in western Denmark.

Correction of Precipitation Data

Although the Danish rain gauge network (based on the Hellmann rain gauge) is dense and of high quality, the estimated rainfall is associated with some uncertainty and possibly biased. The uncertainty is predominantly related to the precipitation catch deficiency (Allerup and Madsen, 1980). Rain gauges, which are projected above the ground (World Meterological Organization [WMO] standard 1.5 m), produce eddies that tend to reduce the catch of small raindrops and especially snowflakes. Catch deficiencies of 10 and 30% are typical for rain and snow, respectively. These have been found mainly to be a function of precipitation type, shielding, and wind speed (Larson and Peck, 1974).

In 1972, a study dealing with corrections of precipitation in Denmark was initiated and resulted in a correction model for liquid precipitation (Allerup and Madsen, 1980). Based on a WMO initiative from 1985 (Goodison et al., 1998), the Nordic Field Comparison project was performed in Jokioinen, Finland, during 1987 to 1993. The aim of the project was to compare different rain gauges, including the Danish Hellmann gauge, and to establish a comprehensive correction model for solid precipitation (Allerup et al., 1997). These studies are still believed to be the most comprehensive for the Nordic region and the simplicity of the correction model has enhanced its application for operational use.

The Allerup Model

The model by Allerup et al. (1997) was developed for the unshielded Danish Hellmann rain gauge and treats solid and liquid precipitation separately. Precipitation type is simplified to be a function of temperature only because precipitation falling at below 0°C is assumed to be snow, while precipitation falling at above 2°C is regarded as liquid and anything in between is a linear mix of the two. The correction factors (CFs) are given by the following equations:
where T is the air temperature at reference height (∼2 m above the ground), I is the rainfall intensity (mm h−1), and u is the wind speed (m s−1) at reference height. The wind speed is corrected for shelter effects by a simple notion that the wind speed at highly protected stations (A) is reduced to a factor of 0.57 and 0.44 for liquid and solid precipitation, respectively. Wind speed at intermediately protected stations (B) is corrected by factors of 0.78 and 0.70, while wind speed at unprotected stations (C) is not corrected.

Application of the Allerup Model

This study investigated three ways of applying the catch correction method for modeling purposes, all based on Eq. [1] and [2] and all applied to daily rain gauge measurements. The traditional and recommended way of applying the catch correction for operational purposes is to use the 30-yr statistical mean monthly correction factors. These are estimated for the most recent 30-yr period 1961 to 1990 and are calculated as a countrywide mean value for three different shelter categories (A, B, and C). Alternatively, a more recent (1989–1999) and more local set of correction factors could be used based on a station placed centrally in the Ringkøbing Fjord catchment. The strength of these two application methods is that the corrections are calculated at rain gauge stations where the required measurements are available (wind speed, intensity, and temperature), and once the correction factors are estimated, they are extremely easy to implement. The downside is that fixed monthly factors do not account for either day-to-day or interannual variations in climatic conditions that may influence the catch deficiency.

In the third approach, the catch correction factors are estimated on a daily basis using data on wind speed, precipitation intensity, and temperature. This ensures that short-term variations and especially interannual variations are captured. Problems may arise from data availability and, consequently, the interpolation of wind speed and temperature to the rain gauge locations. The three applications tested in the current study can be summarized as:

  • 30-yr statistical countrywide mean monthly correction factors (standard);

  • 10-yr statistical local mean monthly correction factors (local); and

  • daily correction based on the best available data on wind speed, precipitation intensity, and temperature (dynamic).

For all three applications, each rain gauge was classified according to its shelter level using the three shelter categories. Additionally for the dynamic correction, no data exist on wind speed, intensity, and temperature for the manual rain gauges, which only measure total daily rainfall. Regarding wind speed and temperature, a 20-km gridded data set of interpolated daily values was used, choosing the grid values co-located with each rain gauge station. For rainfall intensity, monthly mean values were used.

Monthly average correction factors for the 45 rain gauges used in the model setup for the calibration period 2000 to 2003 are listed in Table 2 . There were distinct differences across the three application methods, with consistently higher factors for the standard correction while the lowest values were found for the dynamic correction method. These differences were most pronounced for the winter months. Besides the difference in monthly mean correction values, the dynamic correction method allows for both day-to-day variation and interannual variations in the correction factor.

Geology and Land Use Data Sets

In the study model setup, both soil and land use data were represented in a rather simplified manner with little spatial variability. This was owing to a principle of parsimony by aiming at few model parameters (Hill, 2006) and to a simple lack of spatial information, which was the case for the parameters of drainage and river–aquifer interaction. In addition, the model was regarded as a large-scale model that mainly aimed at representing the overall water balance and ongoing work within the HOBE project deals specifically with the issues of spatial variability.

Geologic and Soil Data

The geologic model was developed during the recent update of the national water resources model (DK-Model) and is mainly based on the lithological information from water supply and oil exploration boreholes combined with geophysical surveys. Five different hydro-facies were used to describe the subsurface. The Quaternary was divided into a binary system of sand and clay, Sand1 and Clay1, respectively, whereas three pre-Quaternary units were used, Miocene quartz sand, Miocene mica sand, and clay–silt, here denoted as Sand2, Sand3, and Clay2, respectively (Fig. 2).

Topsoil data, corresponding with the root zone depths, were distributed according to a continuous soil map describing field capacity, wilting point, and saturated water content (Greve et al., 2007). The continuous map was classified into four classes based on the field capacity map, with each class occupying similar fractions of the total model area. Values on field capacity, wilting point, and saturated water content were estimated as average values from the continuous map for each class.

Land Use and Vegetation Data

The land use in the catchment is dominated by agriculture, which takes up around 90% of the land area. In the model setup, three land use types (grain crops, grasslands, and heath) were aggregated to represent agriculture, leaving around 10% for forest and urban areas. No subdivision of the agricultural area was made according to management or crop distribution. The agricultural area was, however, subdivided according to the four soil types because the root depth is known to vary more between soil types than between crop types (Olesen and Heidmann, 2002). Crops on sandy soils have less developed roots than crops on soils with higher clay content. The root depth was used as a calibration parameter, but the relation between root depths of different soils was assumed to be constant, as well as the ratio between summer and winter root depth. Given the conceptual model for ET simulations (Kristensen and Jensen, 1975), the root depth is the major controlling parameter, while the leaf area index (LAI) has only a secondary influence. Given the limited sensitivity of LAI, this parameter was described by a fixed annual development for each land use class.

Sinks and Sources

Within the model, a number of abstractions and wastewater point sources were included. In total, 5588 groundwater abstraction wells were included, 834 for domestic and industrial use and 4754 for irrigation. The groundwater abstractions for domestic and industrial use were specified by their location and filter depth as well as their annual abstraction. For the irrigation wells, an irrigation demand area surrounding each well was defined. The demand area prescribed for each well was a function of the density of wells and the fraction of land defined as agricultural within a buffer zone around each well. The size of the buffer zone was determined iteratively to match the registered total irrigated area of the local county. The actual irrigation amount was calculated internally in the model code (MIKE SHE) by a demand function and the simulated water deficit in the root zone, and therefore varied according to climate, root depth, and soil type.

The groundwater abstracted for irrigation was taken from the registered depth of the screen in each well and added as rainfall on the irrigation command area, making it an internal source–sink term. Unfortunately, no reliable data exists on actual irrigation amounts. Farmers are permitted to irrigate up to 100 mm yr−1 on irrigated land, which comprises 33% of the total model domain. The actual irrigation varies, however, according to climatic conditions and energy prices. Reasonable irrigation amounts for the period 2000 to 2003 were assumed to be between 15 and 30 mm yr−1 for the entire model domain (∼45–90 mm yr−1 for the irrigated areas). The domestic abstractions were treated as an external sink, which was removed from the model calculations. Most of this was captured in the wastewater point sources, however, of which there were 49. These point sources were registered outlets from wastewater treatment plants corrected for rainwater contribution and were specified as discharge into the stream system.

The Model Code

The MIKE SHE modeling system was selected as the hydrological modeling tool. The MIKE SHE system is a physical distributed modeling system covering the entire land phase of the hydrologic cycle (Abbott et al., 1986; Graham and Butts, 2005). This code was selected due to the comprehensive description of groundwater–surface water interactions and the ability to include various unsaturated zone and ET formulations. The model setup used in this study was based on a full, three-dimensional, finite difference groundwater module coupled with a simplified linear unsaturated zone module (Yan and Smith, 1994). The simulation of evaporative losses from interception and the root zone is described by the formulation of Kristensen and Jensen (1975). The dominating transpiration process is governed by a set of reduction functions expressed through the effect of LAI, moisture content in the root zone, root distribution, and reference ET. Stream flow is computed by the MIKE 11 code, coupled to MIKE SHE, applying a kinematic routing method.

In contrast to fully integrated model codes such as HydroGeoSphere (Therrien et al., 2005), MODHMS (Panday and Huyakorn, 2004), InHM (VanderKwaak and Loague, 2001), and ParFlow (Kollet and Maxwell, 2006; Maxwell et al., 2007), which solve surface–overland flow and subsurface–variably saturated equations simultaneously, MIKE SHE solves the groundwater, unsaturated zone, overland flow, and river flow in separate modules that are then coupled in a two-way coupling at each time step. Although less sophisticated than the fully integrated models, the transient coupling in MIKE SHE enables sound feedbacks between the different flow components combined with a flexible modeling tool applicable to large-scale catchments with reasonable simulation times. Specifically, the interaction between the variable groundwater table and the root zone and surface water is important in this catchment because the rivers and streams are primarily groundwater fed, although with large seasonal variation in the exchanges between surface and groundwater. In many parts of the catchment, there is also interaction between the root zone and the groundwater, which was described in a simplified manner by the two-layer unsaturated zone module (Yan and Smith, 1994).

Conceptual Model and Model Setup

The model domain covered the 3500-km2 Ringkøbing Fjord catchment bounded by the North Sea to the west. Along the coastline, a constant-head boundary was assumed, while the remaining boundary was a zero flux boundary coinciding with the topographic divide. To justify the no-flow boundary, a similar model was set up for a larger region, which revealed a convincing agreement between topographic and groundwater divides. A regular 500-m grid was used for the horizontal discretization. The geological model was pixel based, comprised of grid boxes of 1 by 1 km horizontally and 10 m in thickness. Each grid box was assigned a geological unit. In addition, a number of continuous and homogeneous geological lenses were defined. These replaced the “background” pixel geology wherever they were defined. The geological lenses corresponded roughly to the division of the 11 computational layers used in the numerical groundwater model.

Calibration Scheme

The calibration scheme was based on the PEST optimization tool (Doherty, 2004), which is a nonlinear estimator that has been widely used for groundwater–surface water optimization problems (Keating et al., 2003). The Gauss–Marquardt–Levenberg local search optimization method was used, which has similarly been used in the calibration of the Danish national water resources model.

The model was run for the period 1990 to 2003. Simulations were evaluated for a calibration period (2000–2003) and a validation period (1996–1999). This ensured a 10-yr (6 yr for validation) warm-up period, which was regarded as appropriate for the modeled hydrologic system because the initial conditions from the end of a previous model run were assigned. The mean annual discharge from a downstream station is given in Fig. 4 , showing the annual variation in runoff for the warm-up, validation, and calibration periods, the latter two including both dry, wet, and average years. The initial model parameters were based on initial model optimizations for a larger model domain encompassing the Ringkøbing Fjord basin.

Based on a simple sensitivity analysis using initial model parameters and the standard catch correction model (PEST1), the nine most sensitive parameters were selected for calibration. An extra 14 parameters were included in the calibration by tying these to the nine free parameters. Model parameters and initial values are listed in Table 3 . The tied parameters all have meaningful relations with the free parameters. For example, vertical hydraulic conductivity is tied to horizontal hydraulic conductivity for all geological units, while winter root depth is tied to summer root depth and the relation between root depths for different soil types is tied according to their field capacities. The root depth selected for calibration represents the soil type with the shallowest root depth, being half as deep as the soil type with the deepest root depth. No parameter bounds were put on any of the parameters, based on the philosophy that in case unrealistic parameter values were estimated, this was a valuable indication of conceptual model errors because the most sensitive parameters were selected for calibration. For all model calibrations with alternative climate input data, the same parameters were free, tied, or fixed.

To evaluate if the same parameters were sensitive for the alternative climate input and optimized parameters, the sensitivity analysis was repeated for the optimized parameters set using the dynamic catch correction and alternative reference ET (PEST3b, see Table 4 for the differences in objective functions and relative weights between PEST1 and PEST3b). The two models used in the sensitivity analysis covered the range of climate inputs and initial and optimized parameter values. The sensitivity analysis based on PEST3b with optimized values showed slightly different sensitivities; however, all nine calibration parameters remained sensitive (all were among the 13 most sensitive). Compared with PEST1, the sensitivities were particularly increased for two parameters: the horizontal saturated hydraulic conductivity of peat in the top 3 m (Kx_Top_Peat) and the field capacity for soil type 4 (Thfc4); however, Kx_Top_Peat is tied to the horizontal saturated hydraulic conductivity of till in the top 3 m (Kx_Top_Till) in the calibration while Thfc4 is believed to be well parameterized from field data. Correlations between calibration parameters were generally low, with the highest correlation found between the time constant for drain flow and Kx_Top_Till, with a correlation coefficient of −0.59, while the average absolute correlation was 0.24 for the optimized model PEST3b.

A multiobjective calibration was pursued to increase the constraint on the model (Gupta et al., 1999). In the study model optimization, however, the available calibration data were limited to stream discharge and hydraulic head observations. To exploit this data fully, a set of objective functions were designed. These are listed in Table 4, along with their respective weights in the PEST optimization. The eight objective functions are defined as follows (number of observation stations in parentheses):

  • R2NS, the Nash–Sutcliff coefficient based on daily data (dimensionless) (14)

  • Fbal, the total water balance error (%) (14)

  • Fbalsummer, the water balance error for the low-flow months of June, July, and August (%) (14)

  • MEHTS, the mean error of time series of hydraulic head (m) (12)

  • ErrAmplHTS, the error of the mean annual amplitude in time series of hydraulic head (m) (12)

  • RMSEHmean, the RMSE of the mean hydraulic head for the period 1990 to 2000 (m) (1714)

  • RMSEHdyn, the RMSE of individual hydraulic head observations in the period 2000 to 2003 (m) (1913)

  • MELayers, the mean hydraulic head error for each model layer (m) (11)

Although the optimization included eight objective functions, the model was still unconstrained with regard to important hydrologic processes such as ET. The problems of unconstrained model components was not solved in this study, but is a major issue under the HOBE project in which the described hydrologic model is considered a baseline model.

The weighting of objective functions can have a major effect on optimization results (Madsen, 2003), especially if the initial parameter values are far from optimum. The effect of objective function weight is addressed briefly below. For the initial model optimizations, a relatively balanced weighting between discharge (Q-weight) and hydraulic head (H-weight) related objective functions was utilized, giving 58 and 42% weights, respectively (see Table 4). For the alternative weighting analysis, altered weights of 90 and 10% were assigned to Q-weights and H-weights, respectively.

A common problem encountered when using a local optimization method is the risk of ending the optimization in a local minimum. To investigate whether a local rather than a global optimum was found, a test with three different initial parameter sets was performed. This test is described below and concluded that, for the major calibration parameters, the optimum was largely independent of the initial values, although the results for two parameters were less conclusive.

Model Calibrations using Standard, Local, and Dynamic Catch Corrections

Three model optimizations (PEST1, PEST2, and PEST3, see Table 4), one for each rainfall input, were performed according to the described calibration scheme based on the PEST optimization tool and the selected calibration parameters and objective functions. The most weight was put on the objective functions R2NS, Fbal, and MELayers. Hence, these targets are highlighted in the model evaluations. The overall water balance for the calibration period is given in Table 5 and reveals differences in annual precipitation amounts, with a 3% decrease between the standard and local corrections and 6% between the standard and dynamic corrections. In spite of the difference in rainfall amounts, the three models performed equally well regarding annual Fbal and hydraulic head simulations (Tables 6 and 7 ). The optimized parameter sets are illustrated in Fig. 5 . The similarity in water balance performance was attributed to the calibration, which allowed for differences in the simulated actual ET due to the lack of constraint on the evaporative flux in the calibration scheme. In this way, the major water balance controlling calibration parameter, the root depth, was utilized to adjust the overall water balance to produce different models with similar performance, illustrating the problem of input bias compensation and equifinality (Beven, 2006). When looking at the other major calibration target, however, the R2NS of discharge, the equifinality problem is reduced. Here, a significant improvement was achieved for all discharge stations when reducing the winter precipitation by using the local (PEST2) and, especially, the dynamic (PEST3) correction factors (Table 8 ).

Table 5 reveals a redistribution of internal flow paths in the modeling system because the dynamic model had a higher contribution to stream flow by overland flow while the base flow and drain flow components were reduced. Using the dynamic correction factors, less water is stored temporally from winter to summer, thereby allowing for a much better description of the recession period (March–April) by decreasing the drainage time constant and the summer flow dynamics by reducing the root depth (Fig. 5). This is illustrated by Fig. 6a and 6b , which show the observed and simulated hydrographs for a downstream and an upstream station for the hydrologic year 2002. There was a significant improvement in the simulations of discharge dynamics when applying the dynamic correction factor (PEST3), especially during the recession period.

The model performances for the validation period from 1996 to 1999 reveal a similar pattern of improving R2NS when applying the dynamic correction (Table 8). Additionally, the dynamic method (PEST3) performed much better on the water balance objective function Fbal (Table 7), where the area weighted mean error was 1%, compared with 11 and 7% for the standard (PEST1) and local (PEST2) factors, respectively. This indicates the strengths of the dynamic method, which distinguishes between warm and cold winters and thereby has much stronger predictive capabilities outside the calibration period.

A closer analysis of the simulated discharge dynamics reveals that the major improvement in model performance on R2NS was mainly due to the generally lower winter precipitation resulting in different model parameters when recalibrated using the dynamic correction (PEST3). This is illustrated in Fig. 7 , which shows the simulated hydrographs for PEST1 and PEST3 together with the difference in daily precipitation input for two different winters. The hydrographs behave very differently for these years. During the 1996–1997 winter (Fig. 7a), the dynamically corrected simulation shows higher discharges than the standard corrected simulation, while the opposite is the case for the 1999–2000 winter (Fig. 7b). For both winters, the simulation with the dynamically corrected precipitation performed significantly better than the standard corrected simulation. This could be attributed to the introduction of interannual variability through the dynamic correction; however, the red curves in Fig. 7 showing the daily difference in precipitation between the standard and dynamically corrected time series (standard − dynamic) reveal that even for the 1996–1997 winter, where PEST3 simulated higher discharge, the standard corrected precipitation was, on average, higher than the dynamically corrected precipitation. This leads to the conclusion that the increased dynamic and improved R2NS is mainly attributable to the change in parameters through the model optimization. The main factor in this regard is the change in buffering capacity through the root depth. Using the standard corrections resulting in higher winter precipitation led to a very large root depth, which acted as a buffer, holding back the water in the root zone to allow more ET during spring and summer. The drawback is that the large buffer capacity in PEST1 reduced the dynamics in the discharge simulations compared with PEST3, where the shallow root depth allowed much better discharge dynamics.

The introduction of daily and interannual variations in precipitation correction is still believed to have improved the discharge simulation, but the effect was secondary compared with the improved dynamic caused by the reduction in the model buffer capacity through parameter calibration.

Model Calibrations Using Reduced Crop Coefficient and Reference Evapotranspiration

Following the results presented above, the dynamic application of catch correction seemed to be favorable. During model optimization, however, an estimated summer root depth of only 74 mm was obtained. Although this was for the soil type with the shallowest root depth, the deepest being twice that, it was still much below what is regarded as a realistic value. Expected values fall in the range 400 to 600 mm for the given soil type. Therefore, new considerations were made to the crop coefficient Kc and the reference ET. As mentioned above, the initial values for Kc and reference ET were at the high end of the recommended ranges (Table 1), allowing for a reduction of these vegetation- and climate-specific inputs. Hence, two additional model optimizations were performed, one utilizing the lower values of the recommended Kc range (PEST3a) and a second where these lower Kc values were combined with a 5% reduction in reference ET (PEST3b). Both scenarios were combined with the dynamically corrected rainfall input (see Table 4). The aim was to produce model performance similar to PEST3 (dynamic) but at the same time obtain more realistic model parameters.

As can be seen from Fig. 8 , the optimized parameters were similar for the three optimizations, with the exception of Kx_Top_Till, which was not well determined, and the root depth. An expected increase in root depth was found with decreasing Kc and reference ET (PEST3a and PEST3b, respectively). This resulted in much more realistic parameters for the root depth for PEST3a and especially PEST3b.

Model performances for the three optimizations were very similar (Tables 6, 7, and 8) for both calibration and validation periods, although neither PEST3a nor PEST3b performed quite as well as PEST3 on discharge simulations. This followed the observations from PEST1 and PEST3 in Fig. 7 because the increased root zone (buffer capacity) reduced the hydrograph dynamics. The largest difference between model simulations in PEST3, PEST3a, and PEST3b was on the partitioning between base flow and drain flow (Table 5).

Alternative Weighting of Objective Functions and Initial Parameter Values

In an attempt to evaluate the robustness of the optimized hydrologic model, an alternative objective function weighting and initial parameter value exercise was performed. This was based on PEST3b, which was believed to represent the best combination of model performance and realistic parameter values. For the alternative weighting, the Q and H objective functions were weighted by 90 and 10% and vice versa. The new optimized parameters are presented in Fig. 9 , and model performances are listed in Tables 5 to 8,67.

The model performance on simulated discharge (R2NS and Fbal) and hydraulic head (RMSEHdyn and MELayers) was very similar for all optimizations. Table 5 reveals considerable redistributions, however, among the internal flow processes, base flow, drain flow, and overland flow. Especially the fraction of total stream flow originating from base flow and drain flow was shifted considerably between PEST3bQ and PEST3bH. This shift was primarily caused by a change in the leakage coefficient (Fig. 9), which was reduced in PEST3bQ, leading to lower base flow and an increase in the drainage time constant, which increases the drain flow. Apart from this redistribution, only minor changes occurred in spite of the major change in objective function weights. This was mainly attributed to the fact that the model was believed to have reasonable initial parameter values.

The second test addressed the issues of local vs. global optimums in parameter space. Two additional model calibrations were performed using initial parameter values that were double (PEST3b2nd) and half (PEST3b3rd) their original initial values. The results are presented in Fig. 10 and show that for most parameters very similar optimum values were reached regardless of the initial value. for the parameters Kx_Top_Till and the horizontal saturated hydraulic conductivity of the Miocene mica sand (Kx_Sand3), however, the results are less conclusive owing to larger uncertainties in these parameters, as illustrated by the 95% confidence limits. The overall model performance of the optimized models in Fig. 10 expressed through the sum of squared weighted residuals (phi) was very similar for PEST3b and PEST3b2nd (51.4 and 50.1, respectively), while phi was higher for PEST3b3rd at 60.9. This is, however, mainly attributed to the fact that the initial root depth was very shallow in PEST3b3rd because a fourth model optimization identical to PEST3b3rd but with a deeper initial root depth (237 mm) resulted in almost identical parameters and phi (51.3) as PEST3b.

The results of the model performance using different catch correction applications showed that much better dynamics in the simulation of discharge can be achieved when choosing a correction method that reduces the winter precipitation, a result similar to the findings of Tian et al. (2007), who investigated large river systems in northern latitudes. In our study, this was due to the existence of an imbalance in the dynamics between summer and winter discharge when using the standard and to some extent also the local correction factors. If winter precipitation is overestimated due to differences in climate between the current winters (1996–2003) and the reference period (1961–1990 or 1989–1999), then excess precipitation occurring during the winter needs to be evaporated during the summer, when the potential ET is high and the crop and root system is developed. This leads to a model optimization that exaggerates the root depth to remove water during the growing season and at the same time underestimates summer low flow and reduces flow variability. Likewise, as a consequence of the overestimation of winter discharge, the drainage time constant is increased, leading to a poor description of recession flow (March–April; Fig. 6).

Besides the generally lower winter precipitation using the dynamic correction factors, the ability of the dynamic correction method to distinguish cold and warm periods obviously improved the R2NS values and especially the water balance (Fbal) simulations during the validation period because a lower average winter correction factor was experienced for the calibration period (2000–2003) than for the validation period (1996–1999). This can only be attributed to either fewer snowy days (warmer winters) or lower wind speed during the calibration period.

Generally there was very little difference between the optimizations regarding the simulated hydraulic head (Table 6). This was assumed to be a result of the large number of head observations, distributed across the entire catchment, combined with areas of both large positive and negative biases and the almost complete coverage of the major geological units in the catchment. As a consequence, any adjustment of hydraulic conductivity during the calibration will contribute to both increases and decreases of the largest simulated biases and thereby constrain the parameters. This mechanism is evident by looking at the optimized parameters in Fig. 5 and 8, where the values of the saturated hydraulic conductivity of the major geological units (Sand1 and Sand2) varied little between model setups and are associated with narrow confidence bands.

The dynamic catch correction has been shown to be favorable, especially with regard to simulated discharge. Given the initial Kc and reference ET parameterization, however, the dynamic catch correction resulted in an unrealistically shallow root depth (PEST3). This was improved by reducing the Kc factor and the reference ET (PEST3a and PEST3b) within recommended values, and almost identical model performances were achieved as with PEST3.

Our findings of equifinality and the challenges of achieving a robust calibration are not new (Franks et al., 1997). Previous attempts to constrain hydrologic models by introducing additional data on soil moisture (Pauwels et al., 2002), ET (Immerzeel and Droogers, 2008), or saturated areas (Franks et al., 1998) have focused on improving the discharge simulations. Our results show that our model is quite robust for the simulation of discharge. Hence, the value of additional data for constraining the model lies in potentially improved reliability of model simulations of ET and the separation of flow components (overland flow, drain flow, and base flow).

The results highlight the need for increasing the constraints of the model, e.g., by including measurements of variables such as soil moisture or actual ET in the calibration scheme. But also more control on the internal variables contributing to stream flow is needed because the current discharge measurements exclude information on the balance between stream flow components. Additionally, by improving the ET description through incorporation of both a full Richards' equation for unsaturated flow and an energy-based SVAT model for ET simulations (Overgaard, 2005), new types of data, such as sensible heat and surface temperature from both point measurements and remote sensing, can be incorporated into the modeling framework by a scheme similar to that of McCabe et al. (2005).

This work was part of the “Center for Hydrology, HOBE–a Hydrological Observatory” funded by the Villum Kann Rasmussen Foundation.

Freely available online through the author-supported open access option.