Spatial variability in commercial orange groves. Part 1: canopy volume and height

Characterizing crop spatial variability is crucial for estimating the opportunities for site-specific management practices. In the context of tree crops, ranging sensor technology has been developed to assess tree canopy geometry and control real-time variable rate application of plant protection products and fertilizers. The objective of this study was to characterize the variability of canopy geometry attributes in commercial orange groves in Brazil and therefore estimate the potential impact of sensor-based site-specific management. Using a mobile terrestrial laser scanner, canopy volume and canopy height were measured in 0.25 m length transversal sections along the rows across five large scale commercial orange groves in São Paulo, Brazil. The coefficient of variation of canopy volume ranged from 30 to 40%. Canopy height was less variable, but closely related to canopy volume. Histograms of canopy volume and height were usually negatively skewed indicating regions of the groves with smaller plants and punctual plant resets. In scenarios where input application rates followed canopy volume variability, input savings were around 40% compared to constant rates based on the maximum canopy volume. Maps of canopy geometry derived from mobile terrestrial laser scanning revealed significant canopy spatial variability, suggesting that the groves would benefit from strategies based on management zones and other forms of site-specific management.


Introduction
Characterizing the spatial variability of crop performance is the first step for developing new site-specific management strategies. A proper understanding of the level of variability within cropping fields enables farmers and researchers to estimate the potential benefit and opportunities for precision agriculture practices (Pringle et al. 2003;Robertson et al. 2008;Tisseyre and McBratney 2008). Amongst available technologies, crop variability can be assessed by the use of plant sensors (from either terrestrial, airborne or satellite platforms) and/or by yield monitors during harvest (Fisher et al. 2009). In the context of precision horticulture, light detection and ranging (LiDAR) is one of the most promising sensing technologies as it can accurately measure structural tree parameters that are closely related to crop performance such as canopy volume and leaf density (Rosell-Polo and Sanz 2012).
Mobile terrestrial laser scanning (MTLS) systems based on LiDAR sensors have been reported by research groups around the world for different fruit and nut crops (Colaço et al. 2018). Typically, the sensor faces the side of the tree rows and takes vertical scans whilst carried through the grove alleyways. After data processing, the laser impacts on the canopies can be visualized in the form of a 3D point cloud from which parameters related to the shape and density of the canopies can be retrieved. Such a method has been described in detail by many authors (e.g. Escolà et al. 2017;Rosell-Polo et al. 2009) and demonstrated to be a viable option to map tree parameters at high spatial resolution and at the wholeblock scale .
Maps derived from MTLS can help identify how variable is the canopy size in commercial groves. It can also help to assess whether such variability occurs within short distances (from one tree to the next) or in large zones within the grove. Such characterization is crucial for the understanding of which factors might be causing tree canopy variability and therefore support a tailored management action-e.g. short distance variability is likely to be related to local pest or disease occurrence whereas variation in larger areas is more likely related to differences in soil type and landscape.
Besides enabling the characterization of spatial variability, ranging systems based on LiDAR or ultrasonic sensors can be mounted on sprayers and fertilizer spreaders to guide real time variable rate application of inputs at tree or within-tree scale, i.e. proportional rates of inputs are applied according to the tree size variability. Such technology has been reported for several tree crops such as apple, pear, olive, citrus, vineyards and pistachio (Colaço et al. 2018). LiDAR sensor has also been regarded as an effective tool to measure woody structures to guide pruning in vineyards (Tagarakis et al. 2018) and in other fruit trees (Méndez et al. 2016). Given that sensor-based variable rate application at tree or within-tree scale was demonstrated to be technically feasible, maps derived from MTLS can be used to estimate the potential input savings by such technology. Canopy geometry maps can also be used to guide non-real time variable rate applications.
In horticulture industries, citrus is of great importance. In the USA, where the early developments of precision agriculture (PA) applied to citrus were reported (research goes back to the late 1990s, Whitney et al. 1999), the use of ranging sensors, particularly ultrasonic sensors, has been reported for mapping orange grove variability (Mann et al. 2011;Schumann et al. 2006;Schumann and Zaman 2005; and also to control variable fertilizer rates (Schumann et al. 2006b;. In Brazil, the world's largest orange producer (about 17.2 Mt were produced in approximately 660,000 ha in 2016-FAO 2018), research has demonstrated that individual citrus groves can be significantly variable in terms of soil properties and landscape (Oliveira et al. 2009;Leão et al. 2010;Siqueira et al. 2010), disease occurrence (Molin et al. 2012) and fruit yield (Farias et al. 2003;Molin and Mascarin 2007). Research on site-specific management practices, such as variable rate fertilization, has also been reported in Brazil (Colaço et al. 2014;, however with no use of sensing technologies. Citrus in both São Paulo, Brazil and in Florida, USA has suffered lately from a severe bacterial disease which can increase tree size variability (the Huanglongbing-HLB disease is responsible for the cutting off, eradication and replacement of infected trees, worsening tree scale variability). However, the two growing regions could not be more different in terms of agronomic practices, landscape, soil and climate conditions. For example, in Brazilian groves, the trees in one grove are entirely replaced every 20 years, approximately, whereas, in Florida, the original trees are kept for an indeterminate time and the trees are individually replaced as they become diseased or unproductive. Such management strategies may affect the variability of canopy size, which is expected to be higher in Florida. Despite the fact that individual groves in both regions are roughly similar in size (the area of one grove is usually around 20-30 ha), the tree canopy variability reported by researchers in Florida cannot be generalized to Brazilian conditions. Even if such a generalization were acceptable, the above ultrasonic mapping carried out in Florida was limited to a few orange groves.
Despite research efforts, PA uptake by citrus growers has been apparently small. For example, in Brazil, sensors are not being used to aid spraying applications based on canopy volume information. PA in Brazilian citrus has been restricted to variable rate fertilization based on soil georeferenced sampling, with farmers often using questionable data processing and recommendation methods (e.g. generating interpolated soil fertility maps with a small number of sampling points). A possible explanation for the small technology adoption is the fact that research has not yet exhaustively explored spatial variability in commercial groves and thus the potential opportunities for PA practices are still not clear. The present study is published in two parts. In this first part the previously developed MTLS system described in  was used to characterize the spatial variability of tree canopy in commercial orange groves and then estimate the potential impact of sensorbased site-specific management. The second part (Colaço et al., in press) focuses on understanding the causes for such canopy variability and how that knowledge can be used for enhanced site-specific management.

Orange groves
Five commercial orange groves located in the state of São Paulo, Brazil, were selected for this study (Table 1). Grove 1 was the youngest one at the time of scanning. This grove was also used for the validation of the MTLS system reported by . Groves 2 and 3 were the oldest ones. Trees were fully developed, being mechanically pruned every 2 years. These groves were also used in other previous research projects . Finally, groves 4 and 5 were those that had the smallest surface. They were chosen due to the high occurrence of citrus trunk gummosis (Phytophthora parasitica), a fungal infection that can lead to a severe weakening of the tree. Gaps and variation in canopy size can be seen in the overview photo of these groves (Fig. 1).

Lidar data acquisition and processing
The orange groves were scanned with a MTLS based on a 2D LiDAR sensor (LMS 200,Sick,Waldkirch,Germany) and an RTK-GNSS receiver (Real Time Kinematics-Global Navigation Satellite System, GR3, Topcon, Tokyo, Japan) (Fig. 2). The development and validation of such a system is available in . During the data acquisition, the system was operated along the alleys of the groves at a constant speed of 3.3 m s −1 , scanning one side of each tree row at a time. The system performed 75 vertical scans per second; each scan was formed by 181 distance measurements taken in 1° angular steps along the vertical plane. Such a configuration resulted in point clouds of approximately 700 points per m 2 .
Prior to further data processing, a visual assessment of the point cloud, using the Cloud-Compare 2.6.1 software (CloudCompare [GPL software] v2.6.1 2018), was carried out in order to recognize undesired scanned targets (e.g. power poles, individual native trees inside the grove, etc.) and spots where the point cloud was not generated correctly due to rough and abrupt irregularities of the terrain. The canopy volume and height were then computed for every 0.25 m length transversal sections along the rows. The convex-hull  1 Overview of the orange groves 1-5 used in the study model was used to create the shape of the trees and retrieve canopy volume and height for each section. More details on the data processing steps, from the raw LiDAR data to canopy volume and height information, are available in .

Variability of canopy geometry
Histograms and descriptive statistical analysis were used to assess the distribution of canopy volume and height for the row section data. A regression between canopy volume and height was also carried out with data at the per tree scale (see below). In order to estimate the potential benefit of sensor-based variable rate applications, different scenarios of input application (e.g. fertilizer or plant protection products) were designed based on the canopy volume data. A variable rate application scenario was defined as if the application of plant protection products or fertilizers would employ proportional rates directly related to the canopy volume (i.e. larger plants receive higher amount of inputs and vice versa). Operational application errors were not considered. For the fixed rate scenarios, it was considered that farmers could calculate the application rate based on different methods: by using the maximum, the average, the median (2) point cloud from a portion of an orange grove; (3) 3D modelling of transversal sections of the row based on surface reconstruction; (4) 2D representation of canopy volume for each transversal section of the rows; (5) histogram and descriptive statistics of canopy volume for each section; (6) grouping of subsequent sections to prepare data for interpolation; (7) interpolated map of canopy volume and geostatistical analysis or the mode value of canopy volume from a set of sampled canopies in the grove. Adopting the maximum canopy volume as a guide for constant rate applications is usually recommended as a low-risk strategy because it ensures that all plants will receive enough product even though some or many of them will be over-dosed. In the Brazilian groves, the average canopy volume is sometimes adopted but often very few trees are sampled. The mode value of canopy volume reflects a strategy where the farmers adopt as reference, a tree size that appeared more representative of the grove, i.e. one that was frequently observed during the grove inspection. The median canopy volume was evaluated because canopy size data distributions could be asymmetrical, and in those cases would lead to different results from those of the average scenario. In order to exclude the random error of the sampling from the analysis, the average, median and mode canopy volume values were computed from the entire data set, not from sampling. The maximum value of canopy volume was given as two standard deviations above the average. The variable rate scenario, and conventional fixed rate scenarios were compared in terms of input consumption and input dose accuracy. A spatial variability analysis was conducted in order to assess whether possible variations of canopy geometry were random in space or spatially dependent, i.e. whether canopy geometry variations are spatially structured within each grove. Such analysis was carried out in two steps: (a) analysis of canopy volume and height variograms for each grove; (b) generation of maps of canopy volume and height to assess spatial patterns of variability in each grove. Prior to geostatistical analysis and interpolation, adjacent data points representing 0.25 m row sections were merged together to form groups equivalent to individual trees (Fig. 2); for mapping purposes and whole-of-grove spatial variability analysis, this is a more meaningful representation of canopy volume than the one based on small row section. The number of merged sections varied between groves, depending on the tree spacing. For example, if the tree spacing was 4 m, as in grove 2, the number of merged points was 16 (equivalent to one tree); if the tree spacing was not a multiple of 0.25 m, the number of merged sections was rounded up. The values of volume and height assigned to the central point of each group were the sum of the volume and the highest height of the merged sections. The objective of this step was to mask the within-plant variability of volume and height, even though the merged points did not necessarily match the centre of each tree individually. This step enabled better interpretation over the geostatistical analysis and the spatial variability. With the new set of points (of merged sections), canopy volume and height variograms were generated using Vesper 1.6 software (Minasny et al. 2005). Fitting models were chosen based on the Akaike Information Criteria (Akaike 1974). Finally, canopy volume and height maps were created using block kriging interpolation. Both the grid pixel size and the interpolation block size were set to 5 m. Block kriging was chosen over punctual kriging because it allowed a slight smoothing of the data along the tree rows, which was desirable given the fact that the centre of merged sections did not match exactly the centre of individual trees. Interpolation of canopy geometry data was conducted using the Vesper software and generation and final editing of the maps was done using the QGIS 2.10 (QGIS Development Team 2018) software.

Variability of canopy geometry and potential benefit of sensor-based variable rate applications
Canopy volume and height measured for 0.25 m sections along the rows were markedly variable within each grove. The coefficient of variation (CV) of canopy volume varied between 31 and 41% ( Table 2). The smallest variation was found in grove 1, which was the youngest grove. Highest variation occurred in grove 5, which was one of the groves with high disease infection. The variability of canopy height was lower than that of canopy volume in all groves. This is evidenced by the greater CV and smaller kurtosis in the canopy volume histograms.
The presence of relatively smaller trees caused by either tree replacements or regions in the grove with some sort of impediment to tree growth was evident through observation of the histograms in Fig. 3. The histograms of canopy volume and height were asymmetric, especially for canopy height, with negative skewness. The distribution of canopy volume and height got closer to a normal distribution in grove 1. Through the histograms of canopy height, a thin tale on the left side of the distribution was observable, indicating the presence of very small trees in the grove. This is also noticed in the canopy volume histograms of groves 4 and 5. This is probably due to tree replacements. As expected, those small trees are in a relatively higher frequency in groves 4 and 5, due to higher disease occurrence. In the canopy volume and height histograms in grove 4, a second smaller peak on the left side was observed. This suggests that these small trees were replaced at the same time, whereas in the other groves, trees were replaced continuously at a steadier rate, except for grove 1 where no replacements have been made so far. Although the canopy volume histograms were also negatively skewed, this thin left tale is not present as in the canopy height histograms, i.e. tree replacements are not very evident in canopy volume maps as they are in canopy height maps. The negative skewness of the canopy volume histograms in groves 2 and 3, and the relatively high frequency of volumes smaller but close to the average,    indicates the presence of regions in the field with smaller canopy volumes. A generalization of these results suggests that very small trees from replacements are easily recognized in the canopy height histograms, whereas the variation of canopy size due to different conditions in the field (e.g. regions with different soil types) are more evident in the canopy volume histograms. A strong relationship between canopy volume and canopy height was observed, which suggests that canopy height information may be useful in predicting canopy volume, which is a much more complex feature to measure in the field than canopy height. The relationship between height and volume (using data of merged sections, i.e. at the treebasis) ranged between 0.44 and 0.77 of R 2 , with a power regression fit (Fig. 4). A strong relationship was found when data from all groves were combined, resulting in an R 2 of 0.86. The mean square error of the regression was 5.4 m 3 . These results have a significant impact on the current method of canopy volume estimation and spraying practices in citrus groves. In Brazil, the most common form of estimating orange tree canopy volume is by calculating the volume of a hypothetical cube containing the tree. This method is based on Fig. 4 Regression analysis between canopy volume and height using data on a per tree basis (merged sections) adapting the tree-row-volume concept (Byers 1987;Byers et al. 1984;Unrath 1984, 1988) to individual trees. As demonstrated by , this approach is extremely simplistic and inaccurate. Besides, it is time consuming resulting in few plants being measured. The high correlation found between canopy volume and height indicates that a simple measurement of height can be used to predict the tree canopy volume more accurately even if the purpose of this measurement is for a constant rate application.
Since a significant variation in canopy volume and height was found in the groves evaluated, it is reasonable to expect that sensor-based variable rate applications might provide a more rational use of inputs. Indeed, from the simulations carried out in this study, it can be suggested that significant input savings and higher application accuracy would result from sensor-based variable rate applications. A variable rate application scenario was defined as if input rates were applied in accordance with tree canopy volume variation. The input savings obtained by this approach varied according to the method used to calculate the constant dose rate. When the constant rate was based on the average canopy volume, the total amount of input consumption was equal to the amount applied with variable rate technology. However, when the constant rate was determined based on the maximum canopy volume, significant input reduction resulted from variable rate applications. The over-application caused by constant rates were generally close to 40%, reaching up to 45% in grove 5 (Fig. 5), which is close to the results reported for trials to validate variable rate prototypes (Giles et al. 1987(Giles et al. , 1989Solanelles et al. 2006;. If the approach adopted by the conventional treatment was based on the canopy volume of higher frequency (the mode value), the input consumption would be lower, as well as the input savings by the variable rate application. These over-applications ranged from 6 up to 20%. Fixed rates based on the median canopy volume, which is not common in conventional practice, would result in 1-5% over-application. It is important to keep in mind that given the risk aversion of farmers, the usual approach is to adopt the maximum observed canopy volume to calculate input rates (e.g. spraying volumes). This approach is also often the official recommendation of rural extension services, because this will ensure that plants are well treated, reducing the risk of disease infection, for example. Of course, and as demonstrated in this study, the possible over-application resulting from the canopy size variation across the grove is being markedly overlooked by such strategy. It should be also reminded that, in this simulation, constant rates were estimated based on MTLS-derived volumes. In conventional practice, those rates are based on manual estimations of canopy volume, which are significantly less accurate than the ones based on MTLS. In addition, manual techniques are based on measuring a different type of volume ), for Fig. 5 Over application of inputs based on a constant rate prescription exemple, the volume of a cube that encloses the tree canopy, which means that, if MTLSderived volumes are adopted, new calibrations to transform canopy volume information into input dose rates would be required.
Besides the inputs savings, it was observed that sensor-based variable rate applications would minimize over-and under-dose deviations resulting from fixed rate input applications. The graphs in Fig. 6 show the percentage of row sections in the grove (y axis) for increasing levels of dose deviation (x axis) in each application method, i.e. a 'y' amount of rows are receiving at least a 'x' level of under-or over-dosage. Steep slopes in the curves means that the amount of sections with dose deviation quickly decreases as the magnitude of the error increases. It is observed that when the constant rate is defined based on the maximum canopy volume, nearly 100% of the trees receive more input than necessary. If the application rate is based on the mode canopy volume, the over-dosage is significantly reduced, however, under-dosage errors start to occur in the groves. For both approaches, the over-dosage would affect more trees than the under-dosage, which is not true for the constant rate based on the average canopy volume. The orange lines representing this approach start from the y axis with a larger percentage of canopies with under-dosage, which is a reflection of the distribution frequency of the canopy volume of these groves  (Fig. 3). Due to a large number of small plants in the grove, the average value of canopy volume is pulled down, getting further from the median value (which divides the distribution into two equal size number of sections). For that reason, under-dosage is more frequent than over-dosage when the constant rate is based on the average canopy volume. The mode and (obviously) the adopted maximum values were greater than the median value of canopy volume. For that reason, over-dosage was more frequent than under-dosage. As the analysis is carried out in more intensive dose deviations (increasing the value on the x axis) the over-dosage prevailed over the under-dosage in all scenarios of constant rate. This is because in the extremities of the canopy volume distribution, smaller plants were more frequent in all groves.

Spatial variability of canopy geometry
Geostatistical analysis (using merged section data) showed that canopy geometry variation was spatially dependent, i.e. it was not entirely random across the grove area. However, a significant portion of that variability was not spatially related. This is evidenced by the fact that, although spatial structure was observed in all variograms, the nugget variance (C0) occupied a significant portion of the sill variance (C0 + C1) (Fig. 7), which characterizes the weak spatial dependence of canopy geometry in these groves. Mann et al. (2011) found a moderate spatial dependence on the canopy volume variogram in an orange grove in Florida (about 50% nugget/sill ratio). In this study, the weak spatial dependence, found especially for grove 1 and for canopy height data in all groves, suggests that these geometrical attributes varied significantly within short distances. This high nugget variance could be also derived from errors during canopy geometry measurement. However, this is unlikely given that such errors were minimized by supervised and unsupervised filtering steps carried out previously during data processing . However, besides the spatially random error observed, a spatial structure is noticeable in all variograms. The range of canopy volume and height went from approx. 50-120 m, indicating that, besides short distance variability, there were also large regions in the groves with distinct canopy sizes.
Canopy volume maps showed that tree volume was markedly variable within each grove (Fig. 8). An even greater variability was reported by Schumann et al. (2006a) in a 17 ha grove in Florida. In their study, the variability in tree canopy volume ranged from less than 5 up to around 160 m 3 per tree. The grove was over 40 years old at the time of their study. Certainly, individual tree replacements during that period might have contributed to such a variability. In this study, the canopy volume and height maps were similar through visual assessment (Figs. 8,9). As pointed out before, the canopy height is a good estimator of tree canopy volume. The canopy height maps showed very similar variability patterns to the canopy volume maps. This fact indicates that the canopy height map might be used for spatial variability investigation when the measurement of canopy volume is not possible. Some disagreements between these maps were more evident in grove 1. It should be reminded that this was the youngest grove and the one that had the smallest trees amongst the five groves. The canopy volume and height data were closer to a normal distribution and the weakest spatial dependence was also found for this grove. As evidenced by the shape of the regression models between canopy volume and height (Fig. 4), the relationship between the canopy volume and height is weaker (the regression curve is less steep) for smaller trees. Regarding the shape and size of the variability zones in the canopy volume and height maps (Figs. 8,9), as previously inferred from the geostatistical results, variation in short distances as well as large regions with different canopy sizes was observed. The appearance of these regions indicates that each grove can be divided into different zones for sitespecific management. Zaman and Schumann (2006) and Mann et al. (2011) considered such a strategy, suggesting that tree growth indicators such as canopy volume or NDVI (normalized difference vegetation index) were good options to divide the grove into distinct management zones. In this study, it was observed that each grove could be divided into at least two management zones, one with larger trees and another with smaller trees. Zone transitions were usually smooth, indicating that they are probably related to natural variability of soil properties. The exception was grove 4 where a clear-cut transition between zones was observed, suggesting that past land transformation or changes in management practices could be the driver for canopy geometry variability-the study by Uribeetxebarria et al. (2018) demonstrated the importance of land transformation influencing spatial variability in fruit orchards. The delineation of management zones would enable different fertilization strategies for each zone. However, the establishment of different fertilizer recommendations and how the differences in canopy size should be interpreted depend on a deeper understanding of the soil spatial variability in each grove, which is the focus of Part 2 (Colaço et al., in press).

Conclusions
Five commercial oranges groves in Brazil were scanned using a mobile terrestrial laser scanner. Canopy geometry (canopy volume and height) was markedly variable within each grove. As a consequence, if input rates (e.g. plant protection products or fertilizers) were applied proportionally to the canopy volume variation, about 40% of product could be saved. In addition, over-and under-dose deviations could also be minimized by such an approach. There was a strong relationship between canopy volume and canopy height information, which means that in the absence of sensor-based systems to measure tree canopy volume, simple measurements of canopy height could be used to estimate canopy volume.
Geostatistical analysis showed that canopy geometry variation was spatially dependent, i.e. variability was not entirely random across the grove area, even though a significant portion of that variability was not spatially related. Overall, it is concluded that variability may occur at both short distances (from one tree to the next), which could be caused by local disease infection, and also across long distances within the grove, which could be related to variations in soil properties, or even to land use transformations or changes in management practices in the past.
Maps of canopy geometry showed that the groves could be divided into at least two zones of distinct tree sizes. However, treating such zones independently would require a deeper understanding of the drivers for such variability as well as new input recommendation formulae that incorporate sensor-based canopy volume information. Overall, this study demonstrated that there is a great potential for precision agriculture practices in orange groves based on canopy geometry information from laser scanners.