ASSESSING HILLSLOPE-CHANNEL CONNECTIVITY IN AN AGRICULTURAL CATCHMENT USING RARE-EARTH OXIDE TRACERS AND RANDOM FORESTS MODELS

ABSTRACT. Soil erosion from agricultural areas is a large problem, because of off-site effects like the rapid filling of reservoirs. To mitigate the problem of sediments from agricultural areas reaching the channel, reservoirs and other surface waters, it is important to understand hillslope-channel connectivity and catchment connectivity. To determine the functioning of hillslope-channel connectivity and the continuation of transport of these sediments in the channel, it is necessary to obtain data on sediment transport from the hillslopes to the channels. Simultaneously, the factors that influence sediment export out of the catchment need to be studied. For measuring hillslope-channel sediment connectivity, Rare-Earth Oxide (REO) tracers were applied to a hillslope in an agricultural catchment in Navarre, Spain, preceding the winter of 2014-2015. The results showed that during the winter no sediment transport from the hillslope to the channel was detected. To test the implication of the REO results at the catchment scale, two contrasting conceptual models for sediment connectivity were assessed using a Random Forest (RF) machine learning method. The RF method was applied using a 15-year period of measured sediment output at the catchment scale. One model proposes that small events provide sediment for large events, while the other proposes that only large events cause sediment detachment and small events subsequently remove these sediments from near and in the channel. For sediment yield prediction of small events, variables related to large preceding events were the most important. The model for large events underperformed and, therefore, we could not draw any immediate conclusions whether small events influence the amount of sediment exported during large events. Both REO tracers and RF method showed that low intensity events do not contribute any sediments from the hillslopes to the channel in the Latxaga catchment. Sediment dynamics are dominated by sediment mobilisation during large (high intensity) events. Sediments are for a large part exported during those events, but the system shows a memory of the occurrence of these large events, suggesting that large amounts of sediments are deposited in and near the channel after these events. These sediments are gradually removed by small events. To better understand the delivery of sediments to the channel and how large and small events influence each other, more field data on hillslope-channel connectivity and within-channel sediment dynamics is necessary.

Introduction
Soil erosion in agricultural areas is a large problem worldwide, because of a loss of productivity (Cerdà et al., 2009; García-Orenes et al., 2009), but also because of off-site effects like the rapid filling of reservoirs (Ben Slimane et al., 2016; Mekonnen et al., 2017; Poesen and Hooke, 1997). To mitigate the problem of sediments from agricultural areas reaching the channel and, in a later stage, reaching reservoirs and other surface waters, it is important to understand the connectivity between hillslopes, channels and the outlet of the catchment.
Hillslope-channel connectivity depends on hillslope topography, soil types and structure, (riparian) vegetation and management practices (Harvey, 2001; Kirkby et al., 2002). Hillslopes can be directly connected to the channel, with steep slopes, no floodplain and no riparian vegetation, or be unconnected through floodplains and dense riparian vegetation. The structure of the hillslope-channel connection (structural connectivity) and the processes that act on that structure (functional connectivity) determine the existence and size of the hillslope-channel connections (Bracken and Croke, 2007; Brunsden, 1993).
Several concepts regarding sediment connectivity have been developed over the past years, some of which focus on hillslope-channel connectivity. One of the most recent concepts of connectivity suggests that small events "liberate" sediments which then concentrate on the lower parts of the hillslopes and channel, gradually increasing sediment connectivity (Bracken et al., 2015). Most of these sediments do not reach the outlet of the catchment during those small events. During a large event, when the previously deposited sediments have caused higher sediment connectivity, these sediments are remobilised. They are subsequently exported from the catchment, causing high sediment discharge at the outlet. This to some extent contrasts with the study of Cammeraat (2002), who showed that during small events only small pockets within a catchment are active and have only a very minor (or no) connection to the channels. Large events activate the entire catchment, making sediment transport from the hillslope to the channel and out of the catchment possible. At the end of such an event, large amounts of sediment might be deposited in the channel, which are then gradually removed by small events. Supporting this model, a study of Thompson et al. (2016) showed that smaller events are more effective in transporting sediments through a channel, because during large events, large amounts of sediments are deposited on the banks and floodplains. As a consequence, the amount of sediment exported out of the catchment during an event is an indirect result of hillslope-channel connectivity and the continuation of sediment transport within the channel. To determine the functioning of hillslope-channel connectivity, and the continuation of sediment transport within the channel, it is necessary to obtain data on sediment transport from the hillslopes to the channels. Furthermore, it is necessary to simultaneously look at factors that influence sediment export out of the catchment.
Sediment tracers can be used to determine which areas on a hillslope contribute sediments to the channel. Tracers have been increasingly used in studies looking at the redistribution of sediments on hillslopes (Guzmán et al., 2013). Rare-Earth oxides (REOs) are types of tracers that are actively applied to the soil. REOs occur naturally in soils in small concentrations, but are applied to the soil at 10-100 times the background concentrations by lawn spreaders or by spraying (Deasy and Quinton, 2010; Kimoto et al., 2006; Polyakov et al., 2004).
Principal component analyses or cluster analyses are often done to analyse factors that influence the hydrological behaviour and the sediment export of a catchment (García-Ruiz et al., 2005; Giménez et al., 2012; Zabaleta et al., 2007). Many of these methods can assess which factors are important for e.g. sediment export at the outlet of a catchment, but they do not always take the interaction between variables into account. Furthermore, they are often not able to take categorical and continuous variables into account side-by-side. Techniques that do take these interactions into account are machine learning techniques. Machine learning techniques are powerful tools that can be used for regression analysis, and moreover, can be used to assess the importance of (categorical) variables and the interaction between variables. One such machine learning algorithm is Random Forest (RF), which has already been successfully applied to improve the mapping of soil characteristics (Hengl et al., 2015) and to model suspended sediment concentrations (Francke et al., 2008). Determining key variables and their interaction for hillslope-channel connectivity has not been done yet using RF.
Hence, the objective of this study was to assess catchment sediment dynamics regarding hillslope-channel connectivity and within-channel sediment transport for a Mediterranean agricultural catchment. We assessed factors influencing hillslope-channel connectivity and resulting catchment sediment yield for varying event magnitudes. The connectivity behaviour of the catchment was compared to connectivity behaviour as described by several conceptual models. To assess hillslope-channel connectivity, transport of sediments from the hillslope to the channel was measured using sediment tracing, and influencing factors for catchment connectivity were assessed using the Random Forest regression method.

Study Area and data
The 'Latxaga' catchment (2.07 km²) is located in Northern Spain in the autonomous region of Navarre (Fig. 1). The climate is humid sub-Mediterranean, with mean annual precipitation of 835 mm, the majority of which falls from October to April (Gobierno de Navarra, 2001). Soils are a silty clay loam, with large, stable aggregates, and land use is predominantly agriculture, of which winter wheat is the most abundant crop (Chahor et al., 2014; Giménez et al., 2012). Slopes in the catchment can be steep, up to 30%, but towards the main channel the slope angles decrease to approximately 5-7%. Daily hydrological, meteorological and sediment data are available for the period 2002-2015; for details on collection and devices see Casalí et al. (2008, 2010), Giménez et al. (2012) and Chahor et al. (2014). Furthermore, a high-resolution (10 cm) digital terrain model of February 2015 was available for the interpretation of flow paths on the hillslope (Masselink et al., 2017).

Rare-Earth Oxides tracer application, sampling and interpolation
Rare-Earth Oxide (REO) tracers were used to assess whether or not sediment from the studied hillslope (Fig. 1) was transported to the channel during the winter of 2014-2015. The hillslope was selected because of its topography and vegetation arrangement. The hillslope is representative of the area, because like almost all hillslopes in the catchment, it contains a steep area with both agriculture and a semi-natural shrub area. It is connected to the channel via a relatively flat area and a densely vegetated riparian zone.
REO tracers strongly adhere to soil particles, without changing the behaviour of these particles and their aggregates (Zhang et al., 2001). Four REO types were used: erbium oxide (Er₂O₃), yttrium oxide (Y₂O₃), praseodymium oxide (Pr₂O₃) and samarium oxide (Sm₂O₃), later on referred to as Er, Y, Pr and Sm, respectively. Laboratory tests showed that these REO tracers penetrated to a maximum of 1 cm after a sequence of 3 simulated rain events, which confirmed the limited vertical mobility of the REO tracers within the soils of the study area and confirmed the utility of these tracers for assessing hillslope-channel connections.
To determine background concentrations of the four REOs, 20 samples were taken at different locations on the hillslope before the start of the experiment. These background concentrations were used to calculate the amount of tracer that needed to be applied to reach a concentration of at least 10 times the background concentration (Polyakov et al., 2004).
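The dosing calculation can be sketched as a simple mass balance over the tagged soil layer. The function name, the mixing depth, the bulk density and the example values below are illustrative assumptions and are not taken from the study; concentrations are assumed to be expressed per kg of dry soil.

```python
def tracer_mass_g(background_mg_per_kg, area_m2, depth_m,
                  bulk_density_kg_m3, factor=10.0):
    """Mass of tracer (g) needed to raise a soil layer to `factor` x background.

    All arguments are hypothetical inputs for illustration only.
    """
    # Mass of soil in the layer that the tracer is assumed to mix with
    soil_mass_kg = area_m2 * depth_m * bulk_density_kg_m3
    # Background is already present, so only (factor - 1) x background is added
    added_mg_per_kg = (factor - 1.0) * background_mg_per_kg
    return soil_mass_kg * added_mg_per_kg / 1000.0  # mg -> g


# Hypothetical example: 5 mg/kg background, 100 m2 strip,
# 1 cm tagged layer, bulk density 1300 kg/m3
mass = tracer_mass_g(5.0, 100.0, 0.01, 1300.0)
```

With these made-up inputs the layer holds 1300 kg of soil, so 45 mg/kg of added tracer corresponds to 58.5 g.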
REO tracers were mixed with soil from the hillslope, which was dried and ground before mixing. The REO tracers were applied to the hillslope sections on October 30, 2014, after tillage and seeding of the winter wheat. The REO tracers were applied using a standard lawn spreader that was calibrated to disperse 500 g of mixture on a stretch of 10 m on flat, even terrain (Table 1). In contrast with earlier studies using REOs spread with a lawn spreader (Kimoto et al., 2006; Polyakov et al., 2004), applied tracers were not mixed into the ground by disking or tillage because: (i) we wanted to keep the oxides within the applied areas, (ii) we did not want to make any changes to normal farming practices, and (iii) the main focus of this study was on the assessment of hillslope-channel connectivity and, therefore, not on calculating exact amounts of displacement of sediments. This choice is supported by Deasy and Quinton (2010), who found that the incorporation of the REO tracers through disking caused high sediment yields for the first events after disking.
Compound samples consisting of 5 randomly taken samples within the application areas were used to determine the final mean application concentrations. The concentrations were assumed to be uniformly spread within each of the application areas.
At the beginning of summer (June 30, 2015), samples were taken for the assessment of sediment movement. This was done just before harvest of the winter wheat to ensure minimum disturbance by machinery. In total, 102 compound samples, each consisting of 5 samples taken within a 1 m² area, were taken in a stratified random sampling approach. A set of tractor tracks parallel to the slope was additionally sampled to follow the erosion and sedimentation patterns within the tracks. The tractor tracks were caused by normal farming practices (fertiliser and pesticide application). Furthermore, 8 compound samples of sediment were taken in the channel bed, ranging from next to the hillslope to the outlet of the catchment. These channel samples were taken both in the thalweg and close to the banks, so that they represent the area in which they were taken. Grain size distributions of the channel sediments were assumed to be similar to the soils within the fields, although they might have been enriched by clay and/or silt. Samples were dried, sieved to 2 mm and colloid ground. Subsamples of 500 mg were digested using an aqua regia method. This entails subjecting the sample to 6 mL HCl and 2 mL HNO₃ and leaving this standing overnight. Afterwards the samples were heated to 103 °C for 2 hours. This completely dissolves the oxides. REO concentrations were measured using a high-resolution ICP-MS (Thermo Scientific Element 2).
Measured sample concentrations were compared with the normal background concentration range by using a 99% confidence interval of the measured background values. Samples where REO concentrations were above background concentrations for the confidence interval were interpolated using an adaptation of the standard inverse distance weighting interpolation method. The standard method, unrealistically, does not take into account that REOs can only move downslope. To solve this, the interpolated values were constrained by using flow paths of the high-resolution DTM.
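The two steps described above can be sketched as follows. This is a simplified illustration, not the procedure used in the study: the background bound assumes normally distributed background values (z = 2.576 for a two-sided 99% interval), and the downslope constraint is approximated by only letting samples at a higher elevation than the target contribute, whereas the study used flow paths derived from the DTM. All function names are hypothetical.

```python
import numpy as np


def background_upper_bound(background, z=2.576):
    """Upper bound of a two-sided 99% normal interval of background values."""
    b = np.asarray(background, dtype=float)
    return b.mean() + z * b.std(ddof=1)


def downslope_idw(target_xy, target_z, src_xy, src_z, src_val, power=2.0):
    """Inverse distance weighting that uses only sources upslope of the target.

    Elevation is used here as a crude stand-in for flow-path constraints.
    Returns 0.0 when no source lies upslope of the target.
    """
    src_xy = np.asarray(src_xy, dtype=float)
    src_z = np.asarray(src_z, dtype=float)
    src_val = np.asarray(src_val, dtype=float)
    upslope = src_z > target_z
    if not upslope.any():
        return 0.0
    dist = np.linalg.norm(src_xy[upslope] - np.asarray(target_xy, dtype=float), axis=1)
    dist = np.maximum(dist, 1e-9)  # avoid division by zero at sample locations
    weights = 1.0 / dist ** power
    return float(np.sum(weights * src_val[upslope]) / np.sum(weights))
```

In this sketch a sample lying below the target location is ignored regardless of how close it is, which is the behaviour the unconstrained method lacks.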

Random Forests
Random Forests (RF) is an ensemble machine learning method for classification or regression problems (Breiman, 2001). RF can deal with large datasets of observations, and also with a large number of predictor variables. RF is not restricted to normally distributed data, does not assume linear relations, and can incorporate categorical variables. RF combines an ensemble of models (classification or regression trees) into one prediction model. In contrast to single classification trees, adding more trees does not cause RF to overfit, because the generalisation error converges as the forest grows, which follows from the Strong Law of Large Numbers (Breiman, 2001; Feller, 1968). Overfitting refers to the problem that a model works (almost) perfectly on a training set but performs poorly on a test set.
RF has many advantages over other machine learning algorithms: it is not fully a black-box algorithm, it can calculate the model error internally so there is no strict need for a separate training and validation set and, finally, it determines for each variable its importance for the classification or regression of the target variable. Disadvantages of the method are that, for large datasets in combination with a large forest, the operations can become slow, and that the model does not perform well for predictions that are outside the range of the training samples.
RF draws random subsets of the observations (bootstrapping) and of the predictor variables to create many decision or regression trees. The final prediction value in the case of regression is the mean predicted value of all trees within the forest (Fig. 2).
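The averaging step can be illustrated with scikit-learn, where the prediction of a fitted RandomForestRegressor is the mean of the predictions of its individual trees. The data below are synthetic and the forest size is arbitrary; this is a sketch and not the configuration used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data: three predictors, target driven mainly by the first one
rng = np.random.RandomState(0)
X = rng.uniform(size=(200, 3))
y = 10.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

# Each tree is grown on a bootstrap sample of the observations
forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Predictions of every individual tree, shape (n_trees, n_samples)
per_tree = np.stack([tree.predict(X) for tree in forest.estimators_])

# The forest prediction is the mean over the trees
forest_mean = per_tree.mean(axis=0)
```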

Application of Random Forests for determining hillslope-channel connectivity
The objective of the application of the Random Forest (RF) method was to assess important factors for sediment export at the catchment scale and to determine whether these factors differ between small events and large events. We hypothesized that hillslope-channel connectivity would most likely occur in large events that, on average, occur a few times per year. Therefore, we chose a threshold for events with an event probability of 5% or lower, to represent an average of ~18 days per year with hillslope-channel connections. To determine which events (i.e. days) fall within those 5% and at which threshold this occurs, an empirical cumulative distribution function was created from all daily sediment export data with the ECDF function of the statsmodels library for Python 2.7.
The RF model (RF in Python 2.7 using sklearn.ensemble.RandomForestRegressor) was subsequently run for three datasets: the entire dataset, the dataset for large events (<5% probability) and the dataset for small events (≥5% probability). All datasets were split into two subsets: a random subset of 70% for training and 30% for validation.
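A minimal sketch of the event split and the 70/30 partition is given below. It uses NumPy quantiles on synthetic data instead of the statsmodels ECDF function named in the text, so the exact threshold convention may differ from the one used in the study; the function name and the log-normal test series are assumptions for illustration.

```python
import numpy as np


def split_by_exceedance(sediment, p_large=0.05):
    """Return the threshold exceeded on a fraction `p_large` of days and a mask."""
    s = np.asarray(sediment, dtype=float)
    threshold = np.quantile(s, 1.0 - p_large)  # empirical 95th percentile
    return threshold, s > threshold


# Synthetic daily sediment export (kg per day), skewed like many sediment records
rng = np.random.default_rng(42)
sediment = rng.lognormal(mean=3.0, sigma=2.0, size=2000)
threshold, is_large = split_by_exceedance(sediment)

# Random 70% / 30% split of the day indices for training and validation
idx = rng.permutation(sediment.size)
n_train = int(round(0.7 * sediment.size))
train_idx, test_idx = idx[:n_train], idx[n_train:]
```

The same mask can be used to build the large-event and small-event datasets before each is split into its own training and validation subsets.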

Input variables and prediction variable
The total amount of sediment discharge from the catchment (kg day⁻¹) was used as the variable to predict on the basis of daily discharge, meteorological data and derivatives from those data (Table 2). Other factors that might play a direct or indirect role for the amount of sediments discharged from the catchment were also taken into account. These factors are day of the year and season, as well as a vegetation index (Normalised Difference Vegetation Index; NDVI), extracted and interpolated on a daily basis (Masselink et al., 2016). Some of these input variables are collinear to a certain extent (e.g. cumulative rainfall for 1 and 2 days), but the use of sufficient trees in the random forest ensures that this collinearity does not affect model results (Breiman, 2001). The required number of trees for the forest was determined using the training set from the entire dataset where sediment discharge was larger than 0 kg (n=2451 days) and the base input variables (Table 2). The R² of variation for the test set was determined for random forests ranging from 1 to 3000 trees using:

\[ R^2 = 1 - \frac{\sum_{i} (y_i - f_i)^2}{\sum_{i} (y_i - \bar{y})^2} \qquad (1) \]

where y_i is the measured value at i, f_i the predicted value at i, and ȳ the mean of the measured values. The threshold at which the model stabilised was chosen as the number of trees necessary in the forest.

Cuadernos de Investigación Geográfica 43 (1), 2017, pp. 19-39
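The R² of variation can be written as a short function, assuming the usual definition of the coefficient of determination given the symbols defined in the text (measured values, predicted values and the mean of the measured values). The function name is hypothetical.

```python
import numpy as np


def r2_of_variation(measured, predicted):
    """1 minus the ratio of residual to total sum of squares."""
    y = np.asarray(measured, dtype=float)
    f = np.asarray(predicted, dtype=float)
    ss_res = np.sum((y - f) ** 2)          # squared errors of the model
    ss_tot = np.sum((y - y.mean()) ** 2)   # squared deviations from the mean
    return float(1.0 - ss_res / ss_tot)
```

Under this definition a model that always predicts the mean scores 0, and a model that does worse than the mean scores below 0, which is why negative R² values are possible on a test set.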

Determining important variables for small and large events
To determine differences between the behaviour of the catchment during large events and small events, the datasets for both types of events were modelled twice, once using the basic variables and once with additional variables that possibly influence the behaviour within the catchment regarding sediment transport (Table 2), either for small or for large events.
To test whether small events affect sediment export of large events, whether large events affect sediment export of small events, or whether the influence is mutual, the influence of several input variables on model performance and variable importance was tested.
In the connectivity concept of Bracken et al. (2015), small events gradually increase sediment connectivity through depositing sediments near or in the channel, which are then removed during large events. The ratio between the total amount of precipitation and the total amount of sediment discharge could be indicative of the amount of sediment that has accumulated near or in the channel in between large events; many small events sum up to large amounts of precipitation but low sediment export, while few large events might sum up to less precipitation but more sediment export. Furthermore, the amount of time that has passed since the last large event could play a large role in the amount of sediment accumulation in or near the channel. Therefore, the variables 'P_cum', 'S_cum' and 'D_Event' were added, which correspond to the sum of precipitation since the last large event, the sum of sediment export since the last large event and the time since the last large event.
In the alternative case, where large amounts of sediments are deposited at the end of a large event, the number of days after the event and the magnitude of the large event could influence the sediment export of subsequent small events. Therefore, to test if large events influence subsequent small events, the input variables 'D_Event' and 'Event_S' were introduced into the RF model, which are the days passed since the last large event and the magnitude of that event.
The number of large events is small compared to the number of small events (i.e. ~5% of the number of small events). This might lead to unbalanced results for the model performance, depending on which random subset is taken for training and validation. To be able to compare the results for both conceptual models, the models were trained and validated for 100 different random subsets.
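The additional variables can be derived from the daily series in a single pass, as sketched below. The exact bookkeeping used in the study is not described, so this sketch makes two assumptions: sums exclude the current day (so the target of the day being predicted does not leak into its own predictors), and days before the first large event get no values. The function name is hypothetical.

```python
def event_memory_features(precip, sediment, threshold):
    """Per-day D_Event, P_cum, S_cum and Event_S relative to the last large event.

    Returns one dict per day, or None for days before the first large event.
    """
    rows = []
    last = None  # running state since the most recent large event
    for p, s in zip(precip, sediment):
        if last is None:
            rows.append(None)
        else:
            last["D_Event"] += 1
            rows.append(dict(last))   # snapshot before adding today's values
            last["P_cum"] += p
            last["S_cum"] += s
        if s > threshold:             # today is a large event: reset the memory
            last = {"D_Event": 0, "P_cum": 0.0, "S_cum": 0.0, "Event_S": s}
    return rows


# Made-up five-day series; 1455 kg per day is the threshold reported in the text
rows = event_memory_features([20, 1, 2, 30, 0], [2000, 10, 5, 3000, 1], 1455)
```

In this example the third day is two days after a 2000 kg event and has seen 1 mm of rain and 10 kg of export since, while the fifth day restarts the count after the 3000 kg event.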
Model performance was tested by calculating the R² of variation (Eq. 1) and by calculating the root mean square error (RMSE in kg day⁻¹):

\[ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2} \qquad (2) \]

where ŷ are the predicted values, y the measured values and n the number of samples. Model performance (R² and RMSE) was calculated for all 100 runs for the RF models, and the sample medians of the models with the basic input variables and the models with the additional variables were compared using a non-parametric Mann-Whitney U test (α = 0.05).
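The comparison of the two sets of 100 runs can be sketched with NumPy and SciPy. The score samples below are synthetic stand-ins for the per-run performance values and do not reproduce any result of the study; the two-sided alternative is an assumption, since the text does not state which alternative was tested.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    y = np.asarray(measured, dtype=float)
    y_hat = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((y_hat - y) ** 2)))


# Synthetic scores for 100 runs of each model variant (arbitrary centres)
rng = np.random.default_rng(0)
r2_base = rng.normal(0.40, 0.03, size=100)
r2_extra = rng.normal(0.50, 0.03, size=100)

# Non-parametric comparison of the two samples at alpha = 0.05
stat, p_value = mannwhitneyu(r2_base, r2_extra, alternative="two-sided")
significant = p_value < 0.05
```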
Furthermore, the variable importance of all input variables was assessed, using the calculated variable importance from the RandomForestRegressor function in the sklearn.ensemble package in Python 2.7. A level of 5% importance was assumed as a threshold for valuable contribution for the model.
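Applying the 5% threshold can be sketched as follows, assuming the importance values are those exposed by the feature_importances_ attribute of a fitted RandomForestRegressor, which sum to 1. The predictor names are a hypothetical subset and the data are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

names = ["P", "P_5day", "NDVI", "Julian_day"]  # hypothetical predictor names
rng = np.random.RandomState(1)
X = rng.uniform(size=(300, len(names)))
# Synthetic target that depends on the second predictor only
y = 10.0 * X[:, 1] + rng.normal(scale=0.1, size=300)

forest = RandomForestRegressor(n_estimators=100, random_state=1).fit(X, y)

# Importance per variable and the subset at or above the 5% level
importance = dict(zip(names, forest.feature_importances_))
selected = [name for name, value in importance.items() if value >= 0.05]
```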

Hillslope-Channel coupling assessment for 2014-2015 winter using Rare-Earth Oxides
The interpolated Rare-Earth Oxide (REO) observations show that there was little sediment transport during winter (Fig. 3). None of the sediments that had been tagged with tracers reached the channel. The only location where REOs had moved significantly out of the application area was within the tractor tracks, albeit not much farther downslope (28 m from the sampling area). The samples taken in the channel (Fig. 3) showed no trace of any of the REOs that were applied on the hillslope. These results show that during the 2014-2015 winter the number of erosive rainfall events has not been enough to mobilise sediments and to transport these sediments to the foot of the hillslope and into the channel. This is probably related to the maximum precipitation intensities of the winter of 2014-2015, which are lower than those in three previous winters in which more hillslope-channel connectivity was observed (Table 3).
Table 3. The maximum cumulative amount of precipitation (mm) for 4 consecutive winters (October-June) for the Latxaga catchment for 10 minutes (P₁₀), 30 minutes (P₃₀), 60 minutes (P₆₀), 120 minutes (P₁₂₀), daily, weekly, monthly and total. Total yearly sediment discharge (S_total) and the number of days with sediment data (N) are also depicted for each year.

Long-term Hillslope-Channel coupling assessment using Random Forests
The cumulative probability function (Fig. 4) shows that the threshold for sediment discharge for those events (days) that occur less than 5% of the time is 1455 kg day⁻¹. This threshold resulted in splitting of the entire dataset (2451 days) into 132 large events and 2319 small events that were used for the Random Forest (RF) models.
The RF model on the entire dataset using only the basic variables starts to stabilise at around 1000 trees (Fig. 5). In order to ensure a stable model when including more variables, all subsequent model runs were done with 1500 trees in the forest.
The results of the RF model of the total dataset (Fig. 6) show that the model underperforms (R² = 0.01), which is mainly due to the presence of two outliers. When these outliers are removed the model explains 45% of the variation (R² = 0.45). RF is known to underperform when the test data are not within the range of the training data. The variable importance of the total dataset shows that mainly antecedent precipitation, precipitation on the day itself and precipitation intensity control the model, followed by the day of the year (Julian day). The vegetation index (NDVI) and the season are relatively unimportant variables in the model. The RF model results with the basic input variables for the large events (Fig. 6) show that the highest R² still only explains 1% of the variation (R² = 0.01) and the RMSE is 22568 kg. When including the additional variables, the maximum R² increases to 0.15 and the RMSE decreases to 20806 kg. The variable importance plots show that for the basic model the most important variables are (antecedent) precipitation and precipitation intensity, followed by the Julian day and the vegetation index. When the three extra variables (P_cum, S_cum, D_Event) are included, these become the most important variables for modelling sediment discharge.
The RF model results with the basic input variables for the small events (Fig. 6) show a higher R² value (R² = 0.42) than the model for the large events. The RMSE value (138 kg) is also lower than for the large events, but these values cannot be compared, because of the difference in magnitude of the events. When including the additional variables, the maximum R² of the model increases to 0.56 and the RMSE decreases to 109 kg. The variable importance shows that the most important of the basic input variables are 5-day antecedent precipitation, the Julian day and the vegetation index (NDVI). The precipitation on the day itself is of much less importance. When the three extra variables are included, the days since the last event becomes the most important parameter for the modelling of sediment discharge. The Spearman correlation coefficient between the days since the last event and the sediment output is -0.55 (p < 10⁻¹⁸¹), showing a negative correlation between the number of days since the event and the sediment output.
The results of the Mann-Whitney U test show that, for the large events, the medians of the R² and the RMSE of the model with base variables and the model with extra variables are not significantly different (p=0.65 and p=0.76, respectively). For the small events, the medians of the R² and the RMSE of the model with base variables and the model with extra variables are significantly different (p < 1e-15 for both). This means there is no significant improvement from adding the additional variables for the large events, whereas there is a significant improvement when adding the additional variables for the small events.

Factors controlling Hillslope-Channel connectivity
Using the Rare-Earth Oxide (REO) tracers we were able to demonstrate that hillslope-channel connectivity was low for the winter of 2014-2015, with low precipitation-intensity events (Fig. 3). This means that these events contributed little to no sediment to the channel. The only place where some more sediment transport took place was in the tractor tracks parallel to the slope. Even though the movement of sediments was still limited (<30 m), this shows the importance of linear features like tractor tracks, rills and drainage ditches for hillslope-channel connectivity (Basher and Ross, 2001; Collins and Davison, 2009; Heathwaite et al., 2005).
Even on days that have large total precipitation (i.e. 54.77 mm), precipitation intensity seems to be the controlling factor for sediment mobilisation and, therefore, for hillslope-channel connectivity. The p-values of the independent t-tests show that there is a large difference for precipitation intensities, especially between the winter of 2012-2013 and the winter in which the REOs were applied (2014-2015; Table 3). Precipitation intensity was also found to be the major determinant for hillslope-channel coupling in a modelling study done by Michaelides and Wainwright (2002). The REO findings on the studied slope, together with the fact that the slope is representative for other hillslopes within the catchment, indicate that the majority of the sediments leaving the catchment in the winter of 2014-2015 must have come from sources within the channel or the channel banks.
The results of the Random Forest (RF) method for the entire dataset and the large events partly agree with the findings of the REO tracers, in the sense that precipitation intensities are considered to be important variables for estimating sediment discharge. However, antecedent precipitation is considered to be more important than precipitation intensity (Fig. 6). It is likely that the combination of large amounts of antecedent precipitation and high precipitation intensities leads to a large amount of sediment detachment and enough overland flow to transport these sediments, as also shown in other studies in Spain (Baartman et al., 2012; Cantón et al., 2001; Giménez et al., 2012; Gómez-Plaza et al., 2001). Giménez et al. (2012), furthermore, argued that most of the sediments at the outlet of the Latxaga catchment seemed to have come from areas close to the drainage network, which agrees with our findings. Similarly, Casalí et al. (2008, 2010) suggested that, within the Latxaga catchment, summer storms provoke detachment and sediment movement due to highly erosive events. These sediments, however, do not make it to the outlet of the catchment, due to a lack of overland flow for sediment transport.
The RF model for the entire dataset and the RF model with the basic variables for large events showed that the factors determining sediment discharge were antecedent precipitation, total daily precipitation and precipitation intensity.Model results for these large events, however, were unsatisfactory (median R 2 <0) and, therefore, little can be said for the actual importance of any of these variables.The unsatisfactory results for the large events most likely originate from i) the use of fewer events for training and validation than for the small events and ii) the low correlation between any of the input variables and the sediment discharge.This means that some of the important variables that determine the amount of sediment discharge are missing from the RF model.
The RF model with the basic variables for small events showed that the factors determining sediment discharge were antecedent precipitation, vegetation cover and the Julian day (day of the year). This shows that seasonality plays a large role in the amount of sediment discharged out of the catchment during small events. In summer, more vegetation is present along and within the channel, retaining more sediments, while in winter this sediment retention is reduced. The variable 'season' was deemed less important, possibly because of its coarse temporal resolution (i.e. only 4 seasons), while the Julian day variable captures more variability over the year (i.e. 365 days). In another study that modelled sediment concentrations in the Spanish Pyrenees using a Random Forest model, the Julian day was also one of the most important factors in some catchments (Francke et al., 2008).
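The difference in resolution between the two seasonality variables can be made concrete with a small sketch. The meteorological season boundaries used below are an assumption for illustration; the season definition used for the RF input is not specified in this section, and the function names are hypothetical.

```python
from datetime import date

# Assumed mapping from month to meteorological season (illustrative only).
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def julian_day(d):
    """Day of the year, 1 to 365 (366 in leap years)."""
    return d.timetuple().tm_yday


def season(d):
    """Coarse seasonal class with only four possible values."""
    return SEASON_BY_MONTH[d.month]


# Two days in the same assumed season that lie far apart in the year.
early_winter = date(2014, 12, 1)
late_winter = date(2015, 2, 28)

same_season = season(early_winter) == season(late_winter)
day_values = (julian_day(early_winter), julian_day(late_winter))
```

Both dates fall in the same seasonal class, while their day-of-year values differ, which is the extra within-season variability available to the model through the Julian day variable.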

Conceptual models for hillslope-channel connectivity
The extra variables in the RF model did not significantly improve model results for the large events. Therefore, it is not possible with our dataset to assess a possible influence of small events on the sediment discharge of large events through the RF modelling procedure. There could still be a contribution of sediments from small events to the areas near the channel, but the signal of this sediment accumulation is small compared to the large amounts of sediment mobilised during large events.
Model efficiency for the RF models for small events significantly increases when including the additional variables (p < 1e-15), with a negative Spearman correlation of -0.55 (p < 10^-181). This shows that the amount of time that has passed since a large event and the size of that event have a large influence on sediment export on the days that follow with little or no rainfall. These modelling results, in combination with the sediment tracing experiments, indicate that large events not only export large amounts of sediment out of the catchment, but also provide the sediments that small subsequent events export out of the catchment. This indicates a catchment functioning similar to that shown by Cammeraat (2002; Fig. 7). Furthermore, this shows that, at least for Mediterranean settings like the Latxaga catchment, the conceptual model of Bracken et al. (2015; Fig. 7) does not adequately describe the catchment sediment dynamics.
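Variables describing the preceding large event, such as the time elapsed since that event and its size, can be derived from a daily series in a single pass. The sketch below is a minimal illustration and not the procedure used to build the additional variables of Table 2: the function name, the threshold value and the discharge values are hypothetical.

```python
def preceding_large_event_features(daily_discharge, threshold):
    """For each day, return (days_since_last_large_event, size_of_that_event).

    Only events strictly before the current day are considered. Days that
    precede the first large event get (None, None).
    """
    features = []
    last_index = None  # index of the most recent large event seen so far
    for i, discharge in enumerate(daily_discharge):
        if last_index is None:
            features.append((None, None))
        else:
            features.append((i - last_index, daily_discharge[last_index]))
        # Update after recording, so a large event does not describe itself.
        if discharge >= threshold:
            last_index = i
    return features


# Hypothetical daily sediment discharge (kg) and large-event threshold (kg).
series = [1.0, 50.0, 2.0, 3.0, 80.0, 1.0]
features = preceding_large_event_features(series, threshold=40.0)
```

Each small-event day then carries two extra predictors, so a tree-based model can split on how long ago the last large event occurred and how large it was.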
The actual sediment dynamics in the Latxaga catchment and many other Mediterranean catchments are more complex than either of the two conceptual models. In reality, a combination of the two is more likely, depending on the antecedent conditions in terms of e.g. vegetation and soil moisture. The results show that there is feedback of large events on small events and perhaps also vice versa. However, this hypothesis needs to be further validated using field data. The field data needed to test the hypothesis are sediment tracing data collected after every event and/or sediment tracking over multiple seasons and years (Kimoto et al., 2006). In addition, sediment volumes within the channel need to be quantified to obtain a closed sediment budget.

Conclusions
In this study, we looked at hillslope-channel connectivity, factors influencing sediment connectivity and sediment export out of the agricultural Latxaga catchment in Navarre, Spain. For measuring hillslope-channel sediment transport, Rare-Earth Oxide (REO) tracers were applied to a hillslope preceding the winter of 2014-2015. The results showed that during the winter no sediments were transported from the hillslope into the channel, most likely due to low precipitation intensities. The sediment connectivity of the catchment was assessed using a Random Forest (RF) machine learning method, which was applied to the entire dataset (N=2451 days) and two subsets of the whole dataset: small events (N=2319 days) and large events (N=132 days). The model for small events showed that there is a significant increase in model performance when variables related to preceding large events are included (p < 1e-15). Furthermore, the variables related to these large preceding events are the most important variables in the model for prediction of sediment export. The model for large events underperformed and we therefore cannot draw any immediate conclusions from the model results regarding variable importance. Because we cannot draw any conclusions regarding variable importance, we are not certain that small events influence the amount of sediment exported during large events. The large variability in sediment export for large events and the relatively small contribution during large events of sediments mobilised earlier during small events are the most likely causes of the underperformance of the RF models. The sediment dynamics in Latxaga are dominated by sediment mobilisation during large events. These sediments are for a large part exported during those large events. Large amounts of sediment are deposited in and near the channel at the end of the large events and are gradually removed by subsequent small events. To determine how exactly large and small events influence each other, we need to gather more sediment tracer data on hillslope-channel connectivity and within-channel sediment dynamics.

Figure 1. The study area of Latxaga, located in Navarre, Spain. The hillslope on which the tracer study was done is depicted on the right in 2D (top) and 2.5D (bottom). Contour lines depict the altitude of the hillslope.

Figure 2. Example of a random forest consisting of three regression trees. The prediction variables are passed through all the trees and the final prediction value is the mean of all individual tree outcomes.

Figure 3. Hillslope showing application areas of Rare-Earth Oxides, sample locations and interpolated concentrations. Depicted concentrations are measured concentrations minus the background concentrations. The figure shows contour lines and the location of the tractor wheel tracks. The right side of the figure shows a 2.5D representation of the hillslope with the concentrations of REOs. The top right figure shows the location of the hillslope within the catchment and the locations of the samples taken in the channel.

Figure 4. Cumulative probability of sediment discharge for a random day, based on ~15 years of daily sediment discharge measurements.The dotted line indicates the threshold at 5% and the associated sediment discharge.

Figure 5. Model performance as a function of the number of trees in a forest, for the base variables on the entire dataset (n=2451).

Figure 6. Results of the random forest models for the entire dataset (top), the large events (middle) and the small events (bottom). The scatter plots show the measured and predicted sediment discharge quantities (kg) of the best performing model out of 100 model runs and the bar plots show the mean variable importance of all 100 model runs. For the large and small events, the left side of the figure shows the model runs with the base variables and the right side those with the added extra variables. The variable importance bar plots show the 5% limit above which variables were considered to contribute significantly to the model results.

Figure 7. Illustration of catchment response for two different Conceptual Models (CM). The top plot shows a sequence of events, with an arbitrary "event threshold" of connectivity depicted. CM1 shows the conceptual model as proposed by Bracken et al. (2015), with sediments from the hillslopes gradually accumulating in and near the channel. These sediments are flushed out of those areas during large events, with which the system "resets". CM2 shows the model shown in this study. Sediments in and near the channel are gradually removed and are replenished during large, fully connecting events.

Table 1. Mean background concentrations, concentrations after application and the number of times the background concentration was multiplied according to measured concentrations.

Table 2. Basic input variables and additional variables for the datasets containing large and small events for the RF models.