Spatial variability in commercial orange groves. Part 2: relating canopy geometry to soil attributes and historical yield

Site-specific management strategies are usually dependant on the understanding of the underlying cause and effect relationships that occur at the within-field level. The assessment of canopy geometry of tree crops has been facilitated in recent years through the development of light detection and ranging sensors mounted on terrestrial platforms. The main objective of this study was to uncover the factors driving orange tree variability in commercial orange groves. Secondly, this study sought to investigate whether tree geometry information derived from a terrestrial sensing platform is useful information to guide management zones delineation in such groves. A database of soil physical attributes, elevation, historical yield and canopy geometry (canopy volume and height) was analysed in three commercial orange groves in São Paulo, Brazil. Canopy geometry and historical yield were correlated with soil attributes in two of the three groves evaluated; in these groves, the correlation coefficient between yield and soil/landscape information was often above 0.6, depending on the year. Zones of different tree sizes presented different historical yield and soil properties in all three groves. In conclusion, assessing canopy volume provides useful information to delineate management zones and guide enhanced site-specific management strategies.


Introduction
In precision agriculture (PA), diagnostics, recommendations and management actions are carried out locally, at the within-field level (i.e. site-specifically). Many site-specific applications can be quite straightforward and easy to understand, for example, the 'spot spraying' of herbicides (e.g. Esau et al. 2016), where the product is applied only on the weeds spots rather than to the entire field; or the variable rate application of plant protection products based on crop biomass or crop volume (Berk et al. 2016), where higher spraying volumes are needed to cover a crop with higher biomass or volume. Conversely, the sitespecific recommendations of other important inputs such as fertilizers, can be much more complex because it often involves the local estimation of the crop yield potential and crop response to the applied fertilizer. An accurate recommendation is dependant on a thorough understanding of cause and effect relationships that occur in the cropping system. Such an understanding involves the assessment of the many variables governing yield potential, including the identification of which one is the most yield limiting.
Identifying management zones in a field is a way forward to promote such an understanding. Management zones are regions within a field with particular soil and terrain characteristics governing yield potential. Typically, a database combining maps of soil physical properties, elevation, historical yield and other layers of information (e.g. satellite imagery) is used to identify the different zones within a field. The knowledge of the underlying characteristics of each zone should then help farmers to make enhanced site-specific decisions.
The management zone approach has been extensively reported in grain crops and its usefulness to site-specific management has been greatly confirmed by research (Nawar et al. 2017). However, studies on methods to generate and make use of management zones in the citrus crop, or in other important tree crops, are notably scarce. In Florida, USA, early studies on PA for citrus were mostly based on developing yield mapping techniques Tumbo et al. 2002b;Whitney et al. 2001Whitney et al. , 1999 and electronic sensors to measure canopy volume and guide variable rate application of inputs (Schumann et al. 2006b;Tumbo et al. 2002a;Zaman et al. 2005). Two studies that applied the concept of management zones were carried out by  and by Mann et al. (2011). They analysed multiple layers of soil and plant information in two variable citrus blocks uncovering the main causes for crop performance variability.
In São Paulo, Brazil, a series of studies has been conducted to map and characterize the spatial variability of citrus groves (Leão et al. 2010;Molin et al. 2012;Molin and Mascarin 2007;Siqueira et al. 2010) and to develop and test site-specific nutrient management Molin 2017, 2014).  reported a variable rate application method based on yield and soil fertility mapping but, without the actual implementation of management zones-historical yield variability patterns, terrain characteristics and soil electrical conductivity information were not part of the variable rate fertilizer decision. The authors evaluated the long term effects of such a strategy and found potential value in it given that less fertilizer was applied without much impact on fruit yield. However, their approach relied heavily on traditional, outdated, regional fertilizer equations that were often not adequate to the local soil condition found in the experimental groves-apparently because important landscape and physical soil properties were not part of the recommendation. Their study was also unable to clearly demonstrate whether the positive impacts on fertilizer use efficiency resulted from the site-specific strategy itself or simply because fertilizer rates were significantly lower than those used by the farmer. The authors concluded that, regardless of the positive results obtained from their strategy, recommendations based on local response and yield potential would probably outperform regional fertilizer equations. Such an approach would certainly be aligned with a proper delineation of management zones and a thorough understanding of the soil constraints and yield potential in each zone.
Soil and terrain attributes are typically the main type of information used to help explain stable variability patterns of crop performance and yield potential. In the context of grain crops, crop performance is typically assessed via yield maps and/or remote (or proximal) sensing. However, in the scope of horticultural tree crops, yield maps are less common, given that automatic data acquisition by yield monitors is usually not available and harvesting itself is not mechanical. The use of satellite imagery to provide information about crop development is also difficult given that it often does not provide sufficient spatial resolution to make accurate assessments on the canopies. In the past two decades, new sensing technologies based on terrestrial platforms have been developed, enabling the quick and accurate assessment of crop growth and development based on geometrical features of the tree canopy. Ranging sensors, especially light detection and ranging (LiDAR) sensors, have been regarded as some of the most promising solutions for that purpose.
In a recent review of ultrasonic and LiDAR sensors applied to horticultural tree crops, Colaço et al. (2018a) identified that the majority of studies in the past two decades have been focused on the development of data acquisition and processing systems to generate 3D models of the trees and derive geometric information such as canopy volume or leaf density. Studies on the application of such sensors to precision horticulture have usually been limited to the variable application of spraying rates based on the tree canopy variability (an example of simple, straightforward site-specific management strategy as described above). The authors pointed out the need for studies with a more holistic agronomic perspective where the information from sensors are combined with other layers of information for a proper understanding of cause and effect relationships to enable enhanced crop management decisions.  reported the development of a mobile terrestrial laser scanner (MTLS) based on a LiDAR sensor able to estimate citrus tree geometry at high spatial resolution and accuracy. Such an effort followed previous developments from Florida, USA (Lee andEhsani 2009) andCatalonia, Spain (Rosell-Polo et al. 2009a, b) with the addition that their data acquisition and processing systems were able to scan and map groves at large commercial scales. The first study (Part 1, Colaço et al. 2018b) of this two publication series reported the variability of canopy volume and height in five commercial orange groves in São Paulo, Brazil, measured with the developed MTLS system. Results indicated that the variability found in canopy volume should encourage variable rate application of inputs based on the sensor readings; sensor-based variable rate application might attain input savings of around 40%. However, when developing strategies to recommend input applications involving the understanding of crop response, the question of how to make use of canopy variability information remains; e.g. larger trees should receive more or less fertilizer? The straightforward answer to that and other questions is that it depends on which factors are constraining crop performance in each zone; in conclusion, more layers of information should be analysed.
In order to better interpret tree canopy variability and to fully explore the usefulness of sensor-based canopy geometry measurements to site-specific management, a comprehensive investigation of the factors driving tree canopy variability is needed. Thus, the objectives of the present study were twofold: (a) to investigate the possible causes of tree canopy variability in orange groves and (b) to investigate whether canopy geometry derived from a mobile terrestrial laser scanning system is useful information to delineate management zones in commercial orange groves.

Materials and methods
A spatial database of soil and historical yield information was available in only three of the five citrus groves analysed in Part 1, so this research was confined to these three groves. Groves were approximately 25 ha each, located in São Paulo, Brazil (Fig. 1). Trees were grown in a rain-fed system. Main soil types were Arenosol-loamy sand in groves 1 and 3 and Ferralsol-clay loam in grove 2, which are deep soils with uniform texture across their profile. The groves were scanned by a MTLS system to generate maps of canopy volume and canopy height. The system was based on a LiDAR sensor (LMS 200,Sick,Waldkirch,Germany) and an RTK-GNSS receiver (Real Time Kinematics-Global Navigation Satellite System, GR3, Topcon, Tokyo, Japan), which was operated along the alleys of the groves to take vertical scans of the side of the tree rows. A 3D point cloud was generated and a surface reconstruction algorithm was used to retrieve canopy geometry information for each row section (equivalent to one tree) across the entire grove. At the time of scanning (2015), groves were 6, 12 and 11 years old, respectively. A thorough description of the canopy variability is available in Part 1 (Colaço et al. 2018b). Details on the scanning system and processing steps (from the raw LiDAR data to the canopy geometry calculation) are available in .
In order to explore the possible causes of canopy variation and their relation with other agronomic parameters, a database of different map layers was assembled and analysed (Table 1). Soil and terrain layers were: elevation, soil apparent electrical conductivity (EC a ), soil texture and soil organic matter information. These parameters were chosen due to their temporal stability; i.e. their spatial variability patterns are not expected to change significantly over the years. A series of yield maps was also analysed. All maps were produced by interpolating the data using ordinary kriging. Local variograms were used for maps from high density data (EC a sensors and yield mapping) in order to better capture local variations. A global variogram was used for sparse data (soil sampling for soil texture and soil organic matter). A 5 m pixel grid was used in all maps. The Vesper 1.6 software Fig. 1 Top view of the commercial orange groves used in this study (Minasny et al. 2005) and QGIS 2.10 (QGIS Development Team 2018) software were used for interpolation and final editing of the maps, respectively.

Mapping of elevation and soil attributes
Georeferenced soil samples were collected throughout the groves in order to map soil texture and organic matter. These data were available from other experiments carried out in these groves Molin 2017, 2014). 25 composite soil samples (0-0.2 m depth) were collected in grove 1 (approximately one sample per hectare) and 50 in groves 2 and 3 (approximately two samples per hectare). Elevation data was derived from the RTK-GNSS track data derived from the LiDAR scanning.
Soil EC a was obtained through a Veris 3100 sensor (Veris Technology, Salina, USA). This instrument has six electrode discs which are inserted in the soil to collect on-the-go EC a data from two depth layers of approximately 0-0.3 m and 0-0.9 m. In this study, only the shallow EC a data was used for analysis. One might expect that because of the deep root system of citrus trees, information from the deep EC a would add important information. However, the spatial variability patterns between the two layers were similar, which was expected for these soil types. In addition, correlations between the 0.9 m layer and other crop and soil attributes were generally slightly lower than the shallow layer (data not shown). It was then considered that EC a information from the deeper layer would not add much value to the analysis. The sensor was pulled by a tractor along the alleys of the groves at 2.8 m s −1 . Data were recorded at 1 Hz frequency. Before generating the final map, discrepant values (exceeding two standard deviations from the average) were excluded. A local search for outliers was also carried out following the method of Spekken et al. (2013).

Yield mapping
Yield maps were also available from previous studies testing site-specific nutrient management practices Molin 2017, 2014). Yield maps were collected between 2008 and 2013 for groves 2 and 3, and from 2012 to 2015 for grove 1. The harvest of the fruits was manual so the yield mapping had to follow a specific technique developed for hand harvest. During the harvest, the pickers use 'big bags' to store the fruits while they carry out their harvesting work. To collect yield data, the location of these bags was georeferenced using a common navigation GNSS receiver (coarse acquisition code with accuracy of approximately 3 m). Yield values were calculated for each point based on the mass of the bag and the area it represented in the field. The mass of each bag was visually estimated by the harvest crew leader. In previous studies, this visual estimation showed errors below 4% (Molin et al. 2012;Molin and Mascarin 2007). The corresponding area of each point was computed using the 'Voronoi diagram' tool available in the QGIS software. This algorithm divides the field into smaller polygons, each corresponding to the coverage area of one point (Fig. 2). The boundaries of those polygons are given by halving the distances between the point and its neighbours. The final yield value was calculated by dividing the mass of the bag by its area. This value was assigned to a centroid point inside each polygon. Finally, the data were interpolated to produce the final yield map. The logic behind this yield mapping technique is that yield should be higher with higher concentration of points. Colaço et al. (2015) tested the Voronoi-based method against a modelled reference yield map and found that the yield mapping technique was able to represent spatial variability patterns. The correlation between reference and predicted yield maps was 0.75 (R 2 ), and the average yield error was 15%.

Data analysis
The initial analysis was based on descriptive statistics and geostatistics followed by a visual assessment over the maps of yield, soil attributes, elevation and canopy geometry in order to identify patterns in the spatial variability. Spatial dependence was calculated as the nugget variance (non-spatial variance) divided by sill variance (spatially dependent variance). Higher values of spatial dependence mean that variability was less spatially dependant, i.e. more random in space. According to Cambardella et al. (1994), spatial dependence can be interpreted as strong (< 0.25), moderate (between 0.25 and 0.75) or weak (> 0.75). A pixelbased correlation analysis was performed for each pair of maps to assess the relationships between the different variables.
To address the second objective of this study, to evaluate whether the canopy geometry is useful in guiding management zone delineation in an orange grove, the following procedure was conducted: the canopy volume maps from 2015 were classified into three classes, large, medium and small, by the fuzzy k-means clustering algorithm available in the software MZA (management zone analyst, Fridgen et al. 2004). Using R software (R Core Team 2018), the average values of the different variables (soil attributes, elevation and yield) were computed for each zone and the Tukey test was performed (p > 0.001) to assess the differences between the three zones. Similar analysis was conducted by Mann et al. (2011). Even though canopy geometry is available for 1 year only, comparing it with fruit yield from other years provide insight into the temporal stability of canopy geometry and whether historical yield (i.e. yield potential) can be informed by canopy geometry of any particular year. It is reasonable to expect that variability patterns of tree geometry should not change significantly over the years, as demonstrated by Escolà et al. (2017) in an olive grove.

Grove 1
The first noticeable fact about the maps of grove 1 is the little resemblance between them, especially for the yield maps of the different seasons (Fig. 3). This is evidenced Fig. 3 Maps of elevation, soil attributes, historical yield and canopy geometry in grove 1 by a generally low coefficient of correlation between maps (Table 3). The coefficients of variation of yield maps were also generally low, ranging from 7 to 13% (Table 2), with weak spatial dependence. The range (difference between minimum and maximum values) of soil organic matter, clay content and soil EC a were also very low (Table 2), despite the 12 m variation in elevation. Even though the resemblance between these maps was not very clear, the correlation between soil/terrain attributes reached 0.61 in the case of soil organic matter vs EC a , which indicates some consistency in their spatial variability (Table 3).
It is clear that soil properties were not the main factors driving crop performance variability in this grove. Higher yield usually occurred in the lower parts of the grove (see negative correlations between yield and elevation in Table 3). However, the yield spatial patterns were not consistent over the years. The variability found in the maps of canopy volume and height (Fig. 3) also did not clearly match with any of the other variables. In fact, the coefficient of correlation was usually low, reaching between 0.2 and 0.3 in the best cases. There are numerous factors, besides soil variables, that might have affected the variability of yield and canopy development in this grove.

Grove 2
In contrast to the results for grove 1, some similarities among maps were visible in groves 2 and 3. Yield and canopy geometry were more spatially dependent in grove 2 than in grove 1 (spatial dependence was usually moderate). The clay content in grove 2 was markedly variable (from 18 to 50%, Table 4). The spatial distributions of clay content and organic matter in this field are similar to the variation of elevation; clay content and organic matter are higher in the lower areas of the grove (Fig. 4). The soil EC a was also highly variable (values from 1.65 to 14.1 mS m −1 ). Generally, soil EC a also followed the variation pattern of elevation (higher values in the lowest part of the field). The correlation between soil variables and elevation ranged from − 0.31 up to − 0.75 (Table 5). The variability of the yield maps from 2008 until 2010 was similar to soil variability patterns (highest correlation was 0.79, between yield map of 2009 and clay content). This behaviour was not so clear in the subsequent yield maps. Canopy geometry variability was reasonably similar to soil EC a , clay content and to the yield maps of 2008, 2009 and 2012. Yet, correlations were not particularly high (highest correlation was 0.41, between canopy volume and 2012 yield map).

Grove 3
A consistent variability pattern was found in grove 3 for most soil and yield maps (Fig. 5), despite the small range in soil attributes (clay content varied from 11.5 to 16.1% and organic matter from 1.4 to 2%, Table 6). As in grove 2, spatial dependence values for yield and canopy geometry were usually classified as moderate. It was noticed that in the southern-most corner of the grove, where elevation is also lower, there is a lower amount of organic matter and clay. Also, this is the region where lower yield occurred in practically all the years evaluated, especially in the yield maps of 2011, 2012 and 2013 (the correlation between yield in these years, clay content and elevation varied between 0.50 and 0.60, Table 7). This same region presented the highest values of EC a . This small southern    portion of the field is known for having drainage problems, which explains higher levels of EC a and lower yields. This grove presented the highest correlations between canopy geometry and soil parameters and yield. As for the other groves, the canopy height generally yielded lower correlations with yield and soil variables than canopy volume. The maps of canopy volume and height showed that in the region with bad drainage, tree development was harmed. A negative correlation between canopy volume and soil EC a (r = − 0.51) was found. The map of canopy volume was also similar to most of the yield maps (excluding the yield maps from 2009 to 2010). Figure 6 shows the cluster classification of the groves into three classes of canopy volume (canopy volume was chosen over canopy height due to its higher correlations with other variables); it should be noted that these maps do not represent the final management zones for these groves, but only the canopy volume data classified into three clusters.  ( The classification in grove 3 produced continuous zones matching with spatial patterns observed in soil and yield maps. The zones were more diffuse in groves 1 and 2. The classification of canopy volume in grove 2 resembled the elevation, soil attributes and some yield maps. Regarding grove 1, the classification of the canopy volume did not clearly match any of the yield or soil maps.

Does canopy geometry information help in delineating management zones?
The mean values of each variable in the database (historical yield and soil attributes) were computed for each canopy volume zone (Table 8). As expected, zones with larger trees were found in sites with higher clay and organic matter content. The soil EC a was also higher in those regions, with the exception of grove 3, where the higher EC a was found in the zone with smaller trees (in that grove, high EC a was related to bad soil drainage). Historical yield was usually significantly different between the canopy volume zones. As expected, zones of larger trees had superior historical yield performance and vice versa. It can be suggested that canopy volume for a particular yield can give information about yield performance in past years, even though in a few cases some unexpected results were found where the highest yield occurred in the zones with medium (grove 2 in 2013 and grove 3 in 2010) or small trees (grove 3 in 2009). Generally, soil and crop variables differed significantly (p > 0.001) between tree size classes in the three groves. Therefore, the canopy volume map can be used to guide management zone delineation.

Discussion
A general evaluation of the spatial variability in the three groves indicates that the youngest grove, grove 1, presented less variability in yield, soil attributes and canopy volume. Spatial dependence was usually weaker than in the other groves. The variability found was not very consistent and there was little resemblance between maps. As discussed in Part 1 (Colaço et al. 2018b) the canopy volume showed a frequency distribution close to normal and a weak spatial dependence indicating a certain level of randomness of the spatial variability of canopy volume. Grove 3 showed higher variability in soil conditions, significant yield variability with reasonably consistent patterns between the years (lower yields in area with bad soil drainage). The canopy geometry variation usually matched the variability found in soil and yield. Grove 2 showed intermediate results. Significant variation was found in elevation and soil attributes, which affected canopy geometry and yield variability. Overall, it can be suggested that the effects of soil constraints on the crop development can accumulate over time leading to greater and more spatially structured spatial variability of canopy geometry in more mature groves then in younger ones.
Some studies in Florida showed strong correlations between orange fruit yield and ultrasonically measured canopy volume. Mann et al. (2011) found a correlation of 0.85 between the two parameters in a 10 ha orange grove.  divided an orange grove into forty 0.4 ha plots and calculated the total tree volume and fruit yield per ha for each plot. Using half of the plots (the other half were used as a validation data set), they got an R 2 of 0.80 between canopy volume and fruit yield. In the same grove but using all forty plots, Schumann et al. (2006a) reported an R 2 of 0.64. In their study, the canopy height was slightly less correlated (R 2 = 0.54) with yield than canopy volume; similar behaviour was found in this study. Those groves in Florida presented significantly higher canopy volume variability than the groves evaluated in this study. For instance, the coefficient of variation of canopy volume in the study of Mann et al. (2011) was 54%. Since the work of Whitney et al. (1999), studies on PA in Florida have demonstrated that tree canopy and yield Table 8 Mean values of canopy volume, yield and soil attributes in three zones delineated based on canopy volume variability can be significantly affected by differences in soil properties (e.g. soil texture and water table depth) and disease occurrence. Groves in Florida are usually kept for several decades before they are renewed, which helps increase canopy variability. In Brazil, groves are renewed every 15-20 years. The analysis of spatial and temporal variability demonstrated that the opportunity for site-specific management varies between different groves (see differences between groves in this study), agro-ecological regions and management systems (see differences between Florida, USA and São Paulo, Brazil). When patterns of spatial variability of soil and plant properties are consistent, site-specific strategies are much easier to develop and their usefulness is easily perceived by the grower; for example, instead of trying to increase yield with higher fertilizer rates in low canopy volume areas in grove 3, inputs should be kept to minimum levels given that the main constraint in that region is poor soil drainage. Conversely, because spatial variability was not as significant nor consistent in grove 1, strategies such as the above are not obvious and further investigation is needed before any enhanced fertilizer strategy can be applied with confidence. Meanwhile, straightforward strategies such as sensor-based variable rate spraying should be encouraged.
Generally, soil and crop variables differed significantly between tree size classes in the three groves. The long-term average yield in each zone behaved as expected, where higher average yields occurred in zones with larger trees and vice versa. Mann et al. (2011) obtained similar results by classifying an orange grove into five zones based on canopy volume. These results indicate that the canopy volume map from one particular year can provide zones where soil and historical yield are different. Long-term relative average yields between zones can be used as proxy to zone yield potential, then greatly enhancing site-specific (or zone-specific) fertilizer recommendations. Management zones are usually delineated based on the combination of several layers of information. In this case, all available variables could be considered as management zone indicators. However, in the absence of a large database, the canopy volume alone can help to reveal zones of different soil and yield performance.
The use of management zones derived from canopy volume information or from a comprehensive database of soil and yield maps does not mean that application rates must be kept constant inside each zone. Laser scanning systems are able to provide information about variability even within each plant . The optimum use of such technology should be a system that combines the sensor-based variable rate application with a base-map layer of the established management zones. In such a system, the sensor readings should adjust the application rate proportionally to the canopy volume variation along the row. Simultaneously, the management zone base map should guide the fertilization strategy (e.g. choosing the optimal fertilization algorithm) to achieve realistic yield goals for each specific zone. In conclusion, the base map of management zones could provide an idea of the 'macro' amount of input whilst the sensor reading could adjust it within each zone ('micro' adjustment).

Conclusions
Spatial variability of citrus canopy volume and fruit yield was related to soil physical properties in two of the three groves (groves 2 and 3) evaluated. Poor soil drainage was responsible for limited canopy growth and yield in a portion of grove 3. In grove 2, variability of soil texture and soil electrical conductivity (proxies to soil water availability) had an important effect on the variability of canopy growth and fruit yield. In the youngest grove, grove 1, canopy geometry and fruit yield were spatially variable but their variability patterns were not consistent. Soil parameters were less variable and their relationship with crop performance was not clear. Results in this study suggests that more mature groves might present more consistent and structured spatial variability patterns of crop performance, given that the effects of soil constraints on the crop can accumulate over time. In addition, opportunities for site-specific management are greater in groves with more variable soil properties and when the causes for crop variability are easily identified.
Canopy volume information is an important layer for delineating management zones in orange groves. Historical yield and soil properties were significantly different between zones of different canopy volume, i.e. canopy geometry information helped to identify zones with different yield potential and soil characteristics. Having reliable management zone information can greatly enhance site-specific management in orange groves given that growers are able to tailor input requirements to local yield goals and soil constraints.