A comparative framework for broad-scale plot-based vegetation classification

Miquel De Cáceres, Milan Chytrý, Emiliano Agrillo, Fabio Attorre, Zoltán Botta-Dukát, Jorge Capelo, Bálint Czúcz, Jürgen Dengler, Jörg Ewald, Don Faber-Langendoen, Enrico Feoli, Scott B. Franklin, Rosario Gavilán, François Gillet, Florian Jansen, Borja Jiménez-Alfaro, Pavel Krestov, Flavia Landucci, Attila Lengyel, Javier Loidi, Ladislav Mucina, Robert K. Peet, David W. Roberts, Jan Roleček, Joop H.J. Schaminée, Sebastian Schmidtlein, Jean-Paul Theurillat, Lubomír Tichý, Donald A. Walker, Otto Wildi, Wolfgang Willner & Susan K. Wiser


Introduction
Humans have an inherent need to classify in order to make sense of the world around them. The term classification can refer to either the activity of defining classes of objects or the outcome of such activity (Everitt et al. 2011). Vegetation classification aims to summarize the spatial and temporal variation of vegetation using a limited number of abstract entities. These are often called vegetation types, and we will follow this convention here. The typologies produced by vegetation classification are useful for multiple purposes (Dengler et al. 2008), including: (1) communication about complex vegetation patterns; (2) formulation of hypotheses about the ecological and evolutionary processes shaping these patterns; (3) creation of maps to display the spatial variation of vegetation and related ecosystem properties and services; (4) surveying, monitoring and reporting plant and animal populations, communities and their habitats; and (5) development of coherent management and conservation strategies.
Vegetation changes over time and space as a result of ecological processes acting on plant populations and communities at different temporal and spatial scales. In addition, the quality and quantity of information available about vegetation patterns change as new vegetation data become available. These two facts have important implications for the stability of classifications. Far from being static or finished products, vegetation classifications need to be continually updated and refined in order to appropriately integrate and summarize all available information (Mucina 1997; Peet & Roberts 2013; Wiser & De Cáceres 2013). In other cases, the need to update vegetation classifications arises from changes in the taxonomy of the plants that sustain the classification. This dynamic perspective contrasts with the need to maintain descriptions and access to the vegetation types already in use (in vegetation maps, biodiversity reports, etc.), a requirement that is especially important for the conservation of habitats (e.g. Jennings et al. 2009; Neldner et al. 2012; European Commission 2013). Hence, a vegetation classification may be understood as a set of vegetation types where new types may be added if needed, but where previously defined types may be modified or discarded only after careful reflection (Jennings et al. 2009; Peet & Roberts 2013).
The beginnings of vegetation classification can be traced to the 19th century, with the pioneering, mainly qualitative, works of early plant geographers (e.g. von Humboldt 1807; Grisebach 1838; De Candolle 1855). However, the majority of conceptual and methodological developments were made during the 20th century. Different traditions were developed and pursued during this period (see Whittaker 1978a; Mucina 1997), including the spread of numerical approaches in the 1960s and 1970s (Mucina & van der Maarel 1989). The long history of vegetation classification has resulted in an extensive literature, with different approaches emphasizing different characteristics and often adopting different classification procedures (Mueller-Dombois & Ellenberg 1974; Whittaker 1978c; Dierschke 1994; Dengler et al. 2008; Kent 2012; Peet & Roberts 2013). Moreover, vegetation classifications, although often following similar principles, have usually evolved quite idiosyncratically and without reporting clear formal procedures regarding how to extend or modify them.
Recently, there has been a renewed interest in vegetation classification worldwide and efforts have been made at the national and international level to develop new classification systems using standardized procedures (e.g. Rodwell 1991-2000; Schaminée et al. 1995; ESCAVI 2003; Jennings et al. 2009; Faber-Langendoen et al. 2014). Moreover, there is growing interest in harmonizing approaches worldwide and standardizing the information content of classifications that serve similar purposes. This interest is motivated by the need both to increase the usefulness of vegetation typologies and to enhance the acceptance of their scientific underpinnings. In order to advance toward classification practices that enjoy broad international acceptance, it is first necessary to develop a general framework in which the concepts and criteria of classification approaches can be appropriately described and compared. Such a framework would be useful to those trying to integrate existing classifications and to those initiating new vegetation classification projects. This paper aims at developing such a framework, and represents an attempt towards crafting a global consensus perspective on this subject.
Because our framework cannot encompass all possible ways to classify vegetation, we focus on approaches dealing with data in the form of vegetation records, each of them describing a plant community occurring in a small and delimited area (a vegetation plot) at a given time. Moreover, our framework is mainly directed towards extensive regional, national or international classification initiatives, which are referred to here as broad-scale classification projects. These typically involve conducting many classification exercises, each focusing on a particular kind of vegetation, and integrating their results into a single classification system. In the following, we first present the main conceptual elements of our framework, where we distinguish between structural and procedural elements and describe those element properties that are essential for comparisons (section 'Comparative framework'). We then review critical decisions and alternative choices regarding classification approaches (section 'Critical decisions: classification approaches and protocols'), with a special emphasis on the procedures used to define vegetation types from plot records (section 'Critical decisions: plot-based class definition procedures'). After that, we illustrate our comparative framework by using it to briefly describe several classification approaches (section 'Examples'). We conclude by highlighting what we see as the most important future development needs in this field.

Structural and procedural elements
In our comparative framework we distinguish between procedural and structural elements of plot-based classification of vegetation (Table 1). Two structural elements, vegetation-plot record and vegetation type, are well known to vegetation scientists. The most comprehensive structural element is the classification system, which we define as an organized set of vegetation types used to describe the variation of vegetation within given spatial, temporal and ecological scopes. Examples of classification systems are the British National Vegetation Classification (Rodwell 1991-2000), the US National Vegetation Classification (USFGDC 2008) or the Vegetation of the Czech Republic (Chytrý 2007). Classification systems are often hierarchical, meaning that vegetation types are organized in hierarchical classification levels and qualified using ranks (e.g. association or alliance). In addition, hierarchical systems usually include nested relationships between vegetation types of different ranks.
Broad-scale classification systems often involve sets of vegetation types defined based on varying classification criteria. To account for this variation explicitly, we introduce a new concept called consistent classification section (CCS) and define it as a subset of a classification system where vegetation types are defined using the same criteria and procedures (i.e. using the same classification protocol; see below). For example, the vegetation types of a CCS may broadly summarize the woody vegetation of a given area on the basis of physiognomy, whereas another may classify the same vegetation based on detailed floristic composition; in this example, the set of vegetation types of each CCS might be placed at different hierarchical levels within the same classification system (e.g. CCSs A and B in Fig. 1a). Classification systems may allow vegetation types of the same hierarchical level, but corresponding to very different kinds of vegetation, to be defined using different criteria. For example, a classification system may allow forest associations to be defined based on the dominant species of the tree layer and species composition of the herb layer, while aquatic associations are defined focusing on the dominant species and its position in the water column; these will represent different CCSs of the same hierarchical level (e.g. CCSs B and C in Fig. 1a; or CCSs A and B in Fig. 1b). Now we turn our attention to procedural elements. We define classification protocol as the set of criteria and procedures that underlie the creation or modification of a consistent classification section. For example, the protocol for a set of floristically-based vegetation types may include specifications of field sampling design, plot size, taxonomic resolution, taxon abundance measure, plot resemblance measure, clustering algorithm, etc. 
Although the focus of our framework is on plot-based classification, we do not require all vegetation types to be defined directly as groups of plot records. Vegetation types of a given hierarchical rank may be explicitly defined as groups of vegetation types of a lower rank (e.g. CCS A in Fig. 1a). For example, one may define floristically-based alliances after grouping the constancy columns of a synoptic table of associations. Classification protocols of this kind will be qualified as type-based, whereas those dealing with plot records directly will be qualified as plot-based. The CCS and vegetation types resulting from the application of classification protocols will also be qualified as type-based or plot-based, accordingly. We will use the term classification exercise to denote the application of a classification protocol to a particular subset of the vegetation continuum.
Finally, we define classification approach as the set of concepts, criteria and procedures that underlie the creation or modification of a classification system. Examples of classification approaches are the Braun-Blanquet approach (Braun-Blanquet 1964; Westhoff & van der Maarel 1973), the Integrated Synusial approach (Gillet et al. 1991; Gillet & Gallandat 1996) or the EcoVeg approach (Jennings et al. 2009; Faber-Langendoen et al. 2014). Analogously to classification exercises, we will use the term classification project to denote the application of a classification approach to a particular subset of the vegetation continuum, an activity that creates or modifies a classification system.
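The relationships among the structural and procedural elements defined above can be made concrete with a small data-structure sketch. The following Python fragment is only an illustration under our own assumptions: the class names, field names and example vegetation types are hypothetical and are not part of the framework itself. It shows a classification system made of two consistent classification sections, one plot-based (associations defined as groups of plot records) and one type-based (an alliance defined as a group of associations).

```python
from dataclasses import dataclass, field


@dataclass
class VegetationType:
    name: str
    rank: str  # e.g. "association" or "alliance"
    # Extensive definition: plot IDs (plot-based) or names of lower-rank types (type-based)
    members: list = field(default_factory=list)


@dataclass
class ClassificationProtocol:
    name: str
    basis: str  # "plot-based" or "type-based"
    primary_attributes: tuple = ()


@dataclass
class ConsistentClassificationSection:
    label: str
    protocol: ClassificationProtocol  # one protocol per CCS, by definition
    types: list = field(default_factory=list)


@dataclass
class ClassificationSystem:
    name: str
    sections: list = field(default_factory=list)

    def types_at_rank(self, rank):
        """Names of all vegetation types holding a given rank, across all CCS."""
        return [t.name for s in self.sections for t in s.types if t.rank == rank]


# Hypothetical example: two associations defined from plots, grouped into one alliance.
beech = VegetationType("Beech forest association", "association", ["plot1", "plot2"])
oak = VegetationType("Oak forest association", "association", ["plot3"])
alliance = VegetationType("Deciduous forest alliance", "alliance", [beech.name, oak.name])

floristic = ClassificationProtocol("floristic clustering", "plot-based", ("species composition",))
grouping = ClassificationProtocol("synoptic table grouping", "type-based", ("species constancy",))

system = ClassificationSystem("Example system", [
    ConsistentClassificationSection("A", grouping, [alliance]),
    ConsistentClassificationSection("B", floristic, [beech, oak]),
])

print(system.types_at_rank("association"))
```

Keeping the protocol attached to the section, rather than to each vegetation type, mirrors the idea that all types of a CCS share the same criteria and procedures.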

Properties of structural and procedural elements
We provide definitions for the properties of structural and procedural elements in Table 2. These properties are meant to organize the comparison of classifications. For the sake of brevity, we omitted properties of plot records and other properties, such as nomenclatural rules, that are not essential for comparisons. In the following we detail the most important ones. The primary vegetation attributes of a plot-based classification protocol are the attributes consistently used to determine whether plot records are members of the same or different vegetation types. Analogously, the primary vegetation attributes of a type-based protocol are the attributes consistently used to determine which vegetation types of a lower rank are grouped to form a vegetation type of a higher rank. In both cases, these are attributes of the vegetation itself and not of its environment. Vegetation classifications are often required to describe, reflect or indicate other vegetation characteristics not included in the set of primary attributes, or external factors, such as climatic or edaphic conditions, anthropogenic disturbance regime or biogeographic history. We use secondary attributes to collectively refer to all those attributes (whether of vegetation or not) that are not primary vegetation attributes. A special situation arises when a subset of secondary attributes, without being explicitly used to determine membership, are used to constrain the definition of vegetation types. We refer to these as constraining attributes of the classification protocol. For example, although 'classes' of the Braun-Blanquet approach are defined using floristic composition, a specific subset of plant taxa may be selected as primary attributes in order to make classes distinct in terms of environmental conditions and biogeographic context (e.g. Pignatti et al. 1995). 
The presence or absence of those taxa is the only information needed to consistently determine membership, but climatic and biogeographic factors have indirectly influenced the definition of vegetation types.
The extensive class definition of a plot-based vegetation type is a list of the plot records that belong to it. This list will be enlarged every time new plot records are assigned to the type. Analogously, the extensive class definition of a type-based vegetation type is a list of the vegetation types of the lower rank that belong to it. The intensive class definition of a vegetation type is a statement about the values of primary vegetation attributes that are required to be a member (either plots or vegetation types of a lower rank). A broader property of a vegetation type is its primary characterization (or description), which includes all statements about primary vegetation attributes. Whereas intensive definitions impose limits to plot membership for a single vegetation type, they are often not sufficient to unambiguously determine the membership of a plot record among the set of vegetation types that constitute a CCS. We refer to the formal procedures used to determine the membership of new plot records to the predefined vegetation types of a CCS as assignment rules. For example, sets of assignment rules may be defined using diagnostic species or species combinations (e.g. Bruelheide 1997; Kočí et al. 2003; Willner 2011). Because different sets of assignment rules can produce different plot memberships, the definition of a CCS should include a preferred set of assignment rules. To preserve consistency, such a set of rules should be able to reproduce the extensive class definition of vegetation types when applied to the original plot records. We refer to these as consistent assignment rules. Additional sets of rules of a CCS are referred to as complementary assignment rules in our framework. While the attributes used in the consistent rules must be primary vegetation attributes, the attributes used in complementary rules may be either primary or secondary.

Fig. 1. Examples of two hypothetical classification systems. Vegetation types and plot records are indicated using shaded and empty boxes, respectively. Classification system (a) has two hierarchical levels, nested relationships between types and four consistent classification sections (CCS A-D). Classification system (b) has two classification levels and three CCS (A-C). In system (b) nested relationships between types are not always possible.
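The requirement that consistent assignment rules reproduce the extensive class definitions can be checked mechanically. The sketch below is a minimal illustration and makes several assumptions of our own: rules are simplified to sets of diagnostic species that must all be present, rules are tried in order and the first match wins, and the plot records, species lists and type names are invented for the example.

```python
def assign(plot_species, rules):
    """Return the first type whose diagnostic species are all present, else None."""
    for type_name, diagnostic_species in rules.items():
        if diagnostic_species <= plot_species:  # subset test
            return type_name
    return None


def is_consistent(rules, plots, extensive_definition):
    """True if the rules reproduce the recorded membership of every original plot."""
    return all(
        assign(plots[plot_id], rules) == type_name
        for type_name, members in extensive_definition.items()
        for plot_id in members
    )


# Hypothetical original plot records (species presence only).
plots = {
    "p1": {"Fagus sylvatica", "Galium odoratum"},
    "p2": {"Fagus sylvatica", "Luzula luzuloides"},
    "p3": {"Quercus robur", "Luzula luzuloides"},
}
# Extensive class definitions: which plots belong to which type.
extensive = {"Type A": ["p1", "p2"], "Type B": ["p3"]}

# Candidate rule sets (dictionaries keep insertion order, so rules are tried in order).
rules = {"Type A": {"Fagus sylvatica"}, "Type B": {"Quercus robur"}}
other_rules = {"Type B": {"Luzula luzuloides"}, "Type A": {"Fagus sylvatica"}}

print(is_consistent(rules, plots, extensive))        # rules reproduce the original membership
print(is_consistent(other_rules, plots, extensive))  # p2 would move to Type B
```

In this toy case the second rule set could still serve as a complementary set of rules, but it could not be the consistent one because it reassigns plot p2.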

Critical decisions: classification approaches and protocols
Following the terminology presented in the previous section, here we briefly review some of the most important decisions and alternative choices regarding the design of classification approaches and protocols.

General requirements
Guiding principles of classification approaches largely depend on the expected usage of classification systems. Although each stakeholder may tend to tailor a classification approach according to his/her specific needs, we list in Table 3 a set of characteristics that users commonly require from classification approaches.

Structural requirements
Depending on their purpose, classification approaches often specify several hierarchical levels, each describing vegetation using different primary attributes and/or typological resolution. To preserve nested relationships, classification approaches have to constrain the definition of the vegetation types of one hierarchical level using the types of the other, either in a bottom-up or top-down direction (Willner 2011). One possibility to achieve this is to use a single plot-based CCS encompassing several hierarchical levels (e.g. CCS D in Fig. 1a), for example by using hierarchical agglomerative or divisive clustering. A more common approach is to define the vegetation types of the lowest hierarchical level using plot-based classification protocols and then to progressively aggregate them into higher levels using type-based protocols (e.g. CCSs A, B and C in Fig. 1a).

Table 2. Properties of structural and procedural elements (the order of properties follows their appearance in Table 1).

Properties of structural elements
Extensive class definition: List of the plot records (or vegetation types of lower rank) that are members of the vegetation type
Intensive class definition: The primary attribute values that are required to be a member of the vegetation type
Primary characterization: All statements about the primary attributes of the vegetation type (includes intensive definition)
Secondary characterization: All statements about the secondary attributes of the vegetation type (e.g. altitudinal range)
Spatial characterization: All statements about the spatial dimensions of the vegetation type (e.g. spatial distribution)
Temporal characterization: Statements about the temporal aspects of the vegetation type (e.g. successional relationships)
Spatial scope: Geographical area of interest of a CCS or a classification system
Temporal scope: Time window during which the classification system (or a CCS) is intended to comprehensively represent the vegetation in the target geographical area
Ecological (thematic) scope: Range of ecosystems described in a classification system or a CCS. The ecological scope of a classification system (respectively, CCS) is limited by the corresponding scope of the approach (resp., protocol) used to create it
Classification level: The set of vegetation types that are given the same qualifier within a classification system. Classification levels often are hierarchically arranged and vegetation types are qualified using ranks
Assignment rules: Formal procedures used to determine the membership of plot records with respect to predefined vegetation types of a given CCS

Properties of procedural elements
Ecological (thematic) scope: Range of ecosystems where a given classification approach or classification protocol is applicable (e.g., a classification system may be restricted to natural vegetation and a classification approach may be valid for aquatic vegetation only)
Typological resolution: Amount of variation that is placed between, as opposed to within, vegetation types
Spatial resolution: Range of vegetation plot sizes that are allowed in a plot-based classification protocol
Temporal resolution: Temporal resolution required for plot records in a plot-based classification protocol (i.e., whether temporal variation is pooled or kept separately)
Primary vegetation attributes: Set of vegetation attributes that are used to determine whether plot records are considered as members of the same or different vegetation types
Secondary attributes: All those attributes (whether of vegetation or not) that are not primary vegetation attributes
Constraining attributes: Set of attributes (not necessarily of vegetation) used to constrain the definition of vegetation types. Constraining attributes are a subset of secondary attributes
Class definition procedures: Set of procedures used to define new vegetation types, sometimes accounting for pre-existing types of the same CCS
Purpose: Set of applications for which a given classification approach provides useful classification systems
General requirements: Requirements to accept the usefulness of classification systems obtained from the application of a given classification approach
Structural requirements: Specifications of a classification approach regarding the number of classification levels and the relationships between types belonging to different CCS

Applied Vegetation Science

Primary vegetation attributes
An important decision regarding the primary vegetation attributes concerns the subset of plants of interest. Plant communities are usually composed of multiple organisms, not all of which may be of interest (Barkman 1980). The choice of the subset of plants of interest may be influenced by the ecological scope of the classification protocol or by technical restrictions. For example, classifications of boreal forests, wherein vascular plant diversity is typically low, often place a high importance on bryophytes and lichens, whereas classifications of temperate forests are generally described in terms of vascular plants only, and tropical forests are often floristically described focusing on a small subset of plants (e.g. woody plants or ferns) owing to their high taxonomic diversity. If the classification is expected to be indicative of the prevailing environmental conditions, an important consideration is whether all plants or plant groups in the community are sensitive to the same environmental factors in the same ways. For example, some understorey plants may respond to the microclimatic and edaphic conditions created by canopy trees more strongly than to the external climatic conditions. To deal with this problem, classification approaches have been proposed that describe different synusiae (i.e. assemblages of plants having similar size and habitat use) and classify them using independent protocols (see subsection 'Synusial approaches').
Another decision concerns the attributes of the plants, which can be grouped into (a) structure: the spatial (horizontal and vertical) arrangement of plants within the plot and their size (e.g. height or trunk diameter); (b) taxonomy: the identity of plants (e.g. species or genus); and (c) morphology and function: a set of relevant morphological, physiological or phenological plant traits (e.g. life form, leaf size or reproductive strategy). Classification protocols normally combine more than one group of plant attributes. For example, physiognomic approaches often combine information about morphological (life form, leaf type and leaf longevity) and structural components (e.g. Fosberg 1961; UNESCO 1973). A focus on the taxonomy of plants has a great advantage in that it allows additional information to be obtained by linking the taxonomic composition of the vegetation type with taxon attributes or conservation status (e.g. Feoli 1984), hence increasing the value of the classification.
Finally, plant attributes can be considered at different levels of detail. For example, the horizontal structure of vegetation can be simply accounted for as open-versus-closed vegetation, but it can also be accounted for in more detail by using the percentage of ground surface covered by projection of the canopy. Similarly, different levels of resolution can be used for the taxonomic identity of plants (e.g. species level or family level).

Table 3. Common requirements for vegetation classification approaches.

Requirement: Explanation
Comprehensiveness: Classification systems should include vegetation types that encompass, as much as possible, the full range of vegetation variation within their spatial, temporal and ecological extents. This includes the need to appropriately summarize transitional and rare plant species assemblages
Consistency: A similar set of concepts and procedures should be consistently used for the definition of vegetation types. Because broad-scale classification projects may address the classification of vegetation with strikingly different features or be intended to satisfy many potential users, it is useful to explicitly define different CCS
Robustness: Minor changes in the input data (e.g. adding or deleting some plot records) should not considerably alter the result of plot-based class definition procedures
Simplicity: A vegetation classification may be difficult to understand and to apply by potential users when vegetation types do not have simple definitions or when assignment rules (or nomenclatural rules) are complex
Distinctiveness of units: Vegetation types should be distinct with respect to the values of the primary vegetation attributes. Distinctiveness may sometimes be artificially increased by the choice of class definition procedures (e.g. sampling design)
Identifiability of units: Vegetation types should be easy to identify in the landscape. This requires clear, reliable and simple assignment rules that may complement the possibly more complex consistent assignment rules
Indication of context: Vegetation types should preferably reflect and be predictive with respect to their context, such as soil conditions, climatic factors, management practices or biogeographic history
Compatibility: Vegetation types of a given classification system may be required to have clear relationships with the vegetation types of other classification systems (whether of vegetation or not) because this facilitates transferring information from one classification system to another

Spatial and temporal resolution
There are practical reasons for requiring a limited range of plot sizes, because the use of records from plots of very different sizes and shapes in a single analysis can introduce various artifacts (Otýpková & Chytrý 2006; Dengler et al. 2009). In general, plot size is decided in accordance with both the purpose and the scale of spatial variation of the factors that determine changes in the primary vegetation attributes (Reed et al. 1993). Sometimes the choice of plot size is adapted to the size of the largest plants in the vegetation considered (e.g. Barkman 1989; Peet et al. 1998; Chytrý & Otýpková 2003).
The temporal grain of a plot-based protocol is rarely made explicit. However, it is important to define whether a given temporal variation should be addressed using different plot records or not. For example, to address intra-annual (seasonal, phenological) variation of vegetation features, practitioners may sample vegetation at the time of its optimal phenological development only, pool observations from two or more observation dates within the same year (Dierschke 1994) or separate the information from plot records collected during different seasons (Vymazalová et al. 2012).

Class definition procedures
An important decision is the nature of extensive class definitions to be produced. Extensive class definitions can be hard or fuzzy, non-overlapping or overlapping, and some plots may be left unclassified. Users of vegetation classifications have different attitudes with respect to these decisions. For example, one may require every plot record to be assigned to a single vegetation type at each hierarchical level and allow no plot records to remain unclassified (Berg et al. 2004; Willner 2011). This strategy is needed for applications such as vegetation mapping, where crisp boundaries of the mapping units are often required. Alternatively, some outlying plots may be left unclassified and/or overlaps allowed (e.g. Wiser & De Cáceres 2013). This second approach might improve distinctiveness of vegetation types and thus help users understand the concepts represented in the classification, while simultaneously preserving the information on transitional or outlying character of some plots.
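The difference between these kinds of extensive class definitions can be illustrated with a short sketch. It assumes, purely for illustration, that each plot record already has fuzzy membership values for three hypothetical types (A, B and C); the membership values and the threshold of 0.4 are arbitrary choices of ours. The first function forces every plot into exactly one type, whereas the second allows overlaps and leaves weakly associated plots unclassified.

```python
def hard_assignment(memberships):
    """Assign every plot to the single type with the highest membership value."""
    return {plot: max(values, key=values.get) for plot, values in memberships.items()}


def thresholded_assignment(memberships, threshold=0.4):
    """Keep every type whose membership reaches the threshold.

    A plot may end up in several types (overlap) or in none (unclassified).
    """
    return {
        plot: sorted(t for t, value in values.items() if value >= threshold)
        for plot, values in memberships.items()
    }


# Hypothetical fuzzy memberships of three plot records in three types.
memberships = {
    "p1": {"A": 0.90, "B": 0.10, "C": 0.00},  # typical plot of type A
    "p2": {"A": 0.45, "B": 0.55, "C": 0.00},  # transitional between A and B
    "p3": {"A": 0.34, "B": 0.36, "C": 0.30},  # outlier, weakly related to all types
}

print(hard_assignment(memberships))
print(thresholded_assignment(memberships))
```

Under the hard strategy the transitional plot p2 and the outlier p3 are both forced into type B; under the thresholded strategy p2 is recorded as belonging to both A and B, and p3 remains unclassified.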
Our concept of vegetation type includes both the ideas of 'type' and 'class' (Möller 1993). Accordingly, there are two main perspectives regarding class definition procedures. The first emphasizes the boundaries between vegetation units, whereas the second emphasizes central tendencies or noda (Poore 1955). We will refer to vegetation types of the first and second kinds as boundary-based and node-based, respectively. For example, in a plot-based classification protocol the boundary-based perspective would specify a range of values in primary vegetation attributes, while the node-based perspective would specify the values of its most typical plot records. The choice of boundary-based vs node-based classification profoundly affects the definition of vegetation types and the treatment of intermediate or transitional plot records.
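A small numerical sketch may help to contrast the two perspectives. All values below are hypothetical: two types are described by percentage tree and herb cover, boundary-based definitions are given as attribute ranges, and node-based definitions are given as the attribute values of a typical plot, with assignment to the nearest node by Euclidean distance (one possible choice among many).

```python
def boundary_member(plot, bounds):
    """True if every bounded attribute of the plot lies within its (low, high) range."""
    return all(low <= plot[attr] <= high for attr, (low, high) in bounds.items())


def nearest_node(plot, nodes):
    """Name of the type whose typical plot (node) is closest in Euclidean distance."""
    def distance(node):
        return sum((plot[attr] - node[attr]) ** 2 for attr in node) ** 0.5
    return min(nodes, key=lambda name: distance(nodes[name]))


# Boundary-based definitions: ranges of primary attribute values (percent cover).
bounds = {
    "forest": {"tree_cover": (70, 100)},
    "grassland": {"tree_cover": (0, 10), "herb_cover": (50, 100)},
}
# Node-based definitions: attribute values of the most typical plot of each type.
nodes = {
    "forest": {"tree_cover": 85, "herb_cover": 20},
    "grassland": {"tree_cover": 5, "herb_cover": 80},
}

# A transitional plot record with intermediate tree cover.
plot = {"tree_cover": 40, "herb_cover": 50}

boundary_types = [name for name, b in bounds.items() if boundary_member(plot, b)]
print(boundary_types)             # the plot falls outside both sets of boundaries
print(nearest_node(plot, nodes))  # but it still has a nearest node
```

The transitional plot satisfies neither boundary-based definition, yet the node-based view can still relate it to its closest type, which is one way in which the two perspectives treat intermediate plot records differently.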
Vegetation types may be defined from expert knowledge, without an explicit use of plot records and/or formal procedures to group them. For example, an expert may produce a set of assignment rules in the form of dichotomous keys (e.g. Barkman 1990). In this approach, the expert is responsible for consistently applying the same set of guiding principles in the definition of vegetation types. In some cases, the expert defines a set of categories for each of the primary vegetation attributes and intensive class definitions are produced as a result of combining those categories (e.g. Dansereau 1951; Beard & Webb 1972; ESCAVI 2003; Gillison 2013). Formal procedures to define vegetation types from plot records often involve different steps (Peet & Roberts 2013; Lengyel & Podani 2015), including the acquisition and preparation of plot data, using a manual or a computer-based algorithm to group plot records, evaluating classification results and characterizing the vegetation types (see section 'Critical decisions: plot-based class definition procedures').
Most legacy classifications include the original type definitions but they do not include reports on class definition procedures. This hinders consistency when trying to modify or extend such classifications. Similarly, formal assignment rules are often not included in legacy classifications, or they are poorly specified. In the latter case, calibration of new assignment rules is required to enable assignments of new plot records to the original vegetation types. The calibration of assignment rules from training data and subsequent application of those rules for assignments is commonly referred to as supervised classification. Supervised classification sometimes involves modifying the original definition of vegetation types, because the assignment of the original plot records with the new assignment rules usually does not allow the original extensive class definition to be reproduced exactly (e.g. Kočí et al. 2003).

Application of constraining attributes
Restrictions coming from constraining attributes are often applied when selecting the primary vegetation attributes. For example, morpho-functional classifications of vegetation are often based on those morphological and physiological plant traits that are indicative of their adaptations to the environment in which they live (Gillison 2013). In the case of plot-based classification protocols, restrictions coming from constraining attributes may also be applied at different stages of the class definition procedures (see section 'Critical decisions: plot-based class definition procedures'). First, a restriction may be implemented by the sampling design. For example, if a set of plot records is collected to reflect some environmental gradient, the classification based on these data will tend to reflect this gradient (Knollová et al. 2005; Cooper et al. 2006). Second, the restriction can be implemented at the stage of grouping plot records, as in constraining groups of plot records to have similar environmental characteristics (e.g. Carleton et al. 1996). Finally, using additional attributes to evaluate the validity of the classification may also constrain the definition of vegetation types. For example, one might examine whether vegetation types can be separated in environmental space (Orlóci 1978; Hakes 1994; Willner 2006).

Acquisition of plot data
Plot records can be obtained by conducting field surveys or by drawing them from available vegetation-plot databases (Dengler et al. 2011). In both cases one has to specify a sampling design (or a re-sampling design in the case of using databases; De Gruijter et al. 2006). The advantages and drawbacks of different sampling (and re-sampling) designs for vegetation-plot data have been extensively discussed elsewhere (e.g. Kenkel et al. 1989; see Table 4). In practice, sampling (and re-sampling) designs may combine elements of different approaches (Roleček et al. 2007; Peet & Roberts 2013). It is important to emphasize that the statistical procedures used to group plot records are descriptive rather than inferential (i.e. they do not involve inference with respect to a larger population). This calls for ensuring comprehensiveness of the sample (i.e. that the selected plot records encompass the full range of vegetation variation within the scope of the classification), a less demanding requirement than ensuring its representativeness (i.e. that the proportions of plot records corresponding to distinct types are in concordance with their frequency in geographical/ecological space).
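One simple way to favour comprehensiveness over representativeness when drawing plot records from a database is to cap the number of records taken from each stratum, so that over-sampled strata do not swamp rare ones. The sketch below assumes that each record already carries a stratum label; the function name `stratified_resample`, the strata and the cap are hypothetical, and this is only one of many possible re-sampling designs.

```python
import random
from collections import defaultdict

def stratified_resample(plots, stratum_of, max_per_stratum, seed=0):
    """Keep at most `max_per_stratum` randomly chosen plots from each stratum."""
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    by_stratum = defaultdict(list)
    for plot in plots:
        by_stratum[stratum_of(plot)].append(plot)
    selected = []
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        if len(members) > max_per_stratum:
            members = rng.sample(members, max_per_stratum)
        selected.extend(members)
    return selected

# Hypothetical database: 50 lowland plots but only 3 alpine plots.
plots = [{"id": i, "stratum": "lowland"} for i in range(50)]
plots += [{"id": 50 + i, "stratum": "alpine"} for i in range(3)]

selected = stratified_resample(plots, lambda p: p["stratum"], max_per_stratum=5)
print(len(selected), "plots selected")
```

All three alpine plots are retained while the lowland plots are thinned, so the full range of variation stays in the sample even though the proportions no longer match those in the database.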

Preparation of plot data
Broad-scale classification often involves the compilation of plot records from very different sources. This may lead to inconsistencies between plot records included in the data set (see Table 5). Consequently, decisions have to be made to remove, or at least reduce, their effect on the classification (Peet & Roberts 2013).
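Two of these harmonization decisions, unifying taxonomic nomenclature and unifying plant abundance scales, can be sketched as follows. The synonym list and the percentage values attached to the ordinal cover codes are assumptions chosen for illustration (several conversion conventions exist), and keeping the larger value when two names collapse into one taxon is likewise only one possible choice.

```python
# Assumed conversion of ordinal cover codes to percentage cover.
CODE_TO_PERCENT = {"r": 0.1, "+": 0.5, "1": 2.5, "2": 15.0,
                   "3": 37.5, "4": 62.5, "5": 87.5}

# Illustrative synonym list mapping recorded names to one accepted name.
SYNONYMS = {
    "Agropyron repens": "Elymus repens",
    "Elytrigia repens": "Elymus repens",
}

def harmonize(record, scale):
    """Return a {accepted name: percentage cover} record.

    `scale` is "ordinal" for coded cover values or "percent" for values
    that are already percentages.
    """
    harmonized = {}
    for name, value in record.items():
        accepted = SYNONYMS.get(name, name)
        cover = CODE_TO_PERCENT[value] if scale == "ordinal" else float(value)
        # If two recorded names map to the same taxon, keep the larger cover.
        harmonized[accepted] = max(cover, harmonized.get(accepted, 0.0))
    return harmonized

old_plot = harmonize({"Agropyron repens": "2", "Poa pratensis": "+"}, "ordinal")
new_plot = harmonize({"Elymus repens": 20, "Poa pratensis": 1}, "percent")
print(old_plot)
print(new_plot)
```

After this step both records use the same names and the same measurement scale, so they can be compared directly.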

Grouping plot records
Plot-grouping algorithms produce extensive class definitions from plot records. When no prior information is used regarding membership, plot-grouping algorithms are commonly referred to as unsupervised classification or clustering (Everitt et al. 2011). There are also different ways to introduce previous information on the membership of plot records into clustering procedures.
Many plot-grouping algorithms require a resemblance coefficient to be chosen to quantify the similarity or dissimilarity in primary vegetation attributes between plot records, and the consequences of this decision should be understood. This choice will be partly constrained by previous choices of the primary vegetation attributes selected, the field measuring protocols used or abundance scales unified during data preparation. However, additional decisions are still required, such as the appropriateness of applying variable transformations, standardizations or variable weights; or the selection of a resemblance coefficient (e.g. Faith et al. 1987; Shaukat 1989; Legendre & De Cáceres 2013). Finally, resemblances between plot records may be transformed before clustering (e.g. De'ath 1999; Schmidtlein et al. 2010).
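The effect of a variable transformation on resemblance can be seen in a small worked example. The sketch below uses the Bray-Curtis coefficient as one possible choice and compares raw cover values, square-root transformed values and presence-absence data for the same two hypothetical plot records.

```python
import math

def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two equally ordered abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(a + b for a, b in zip(x, y))
    return num / den if den else 0.0

def transform(x, how):
    """Apply a variable transformation before computing resemblance."""
    if how == "sqrt":
        return [math.sqrt(v) for v in x]
    if how == "presence":
        return [1.0 if v > 0 else 0.0 for v in x]
    return list(x)  # "raw": no transformation

# Hypothetical cover values (%) of four species in two plot records.
plot_a = [80, 5, 0, 1]
plot_b = [20, 5, 3, 0]

results = {}
for how in ("raw", "sqrt", "presence"):
    results[how] = bray_curtis(transform(plot_a, how), transform(plot_b, how))
    print(f"{how:9s} {results[how]:.3f}")
```

With raw values the dissimilarity is driven by the single dominant species; the square-root and presence-absence versions give progressively more weight to the subordinate species, and the two plots appear more similar.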
Choosing a plot-grouping algorithm entails deciding on many characteristics of the vegetation types that will be defined. Providing a comprehensive review of methodological choices in plot-grouping algorithms is beyond the scope of this paper (but see Podani 1994; Everitt et al. 2011; Kent 2012; Peet & Roberts 2013; Wildi 2013). Nevertheless, we provide a brief overview of the main advantages and disadvantages of the most commonly used algorithm families (Table 6).

Table 5. Common sources of inconsistency when pooling plot data of different origin.

Source of inconsistency: Explanation
Spatial grain: Plot size affects species richness, within-plot homogeneity, species constancy and therefore comparisons of community composition and structure
Sampling season: The structure and composition of some plant communities can show strong seasonal variation
Subset of plants considered: When pooling plot records of different origin, one should check that the same subsets of plants have been considered in all of them. For example, non-vascular plants or tree seedlings may have been recorded in some plot records but not in others
Taxonomic nomenclature: Pooling plot records of different origin often results in different names for the same entity or identical names for different entities, depending on the taxonomic concepts and determination literature used in a particular region or period
Taxonomic resolution: The amount of detail in the taxonomic identification may vary within or across plot records, especially in regions where the flora is not completely known or where plants are difficult to identify down to the species level
Plant abundance scales: The lack of a common measurement scale is problematic for procedures requiring plant abundance measurements
Vegetation layers: The lack of a common definition of vegetation layers may be problematic for procedures requiring information about the vertical structure
Functional attributes: Class definition procedures explicitly using morphological or functional attributes will require common measurement scales
Observer bias: Differences in plot records can also partly result from variation in sampling accuracy among field observers (e.g. overlooked or misidentified species, biased cover estimates)

The number of vegetation types to define is a critical decision because it strongly influences typological resolution (e.g. a larger number of clusters leads to a finer typological resolution). Alternatively, specifying a priori the desired resolution for the classification protocol may help determine the number of clusters to be sought. Most nonhierarchical methods require the number of clusters to be specified before executing the algorithm. In hierarchical clustering the number of clusters is either decided a posteriori (when cutting the hierarchy) or is a function of a stopping rule (Roleček et al. 2009; Schmidtlein et al. 2010). Although one would be inclined to let the data 'speak' for themselves, the idea of a one and only 'natural' grouping is a myth (Dale 1988). Sometimes the groups resulting from a plot-grouping algorithm are modified a posteriori, with the aim to facilitate the calibration of assignment rules and achieve consistency between these and the definition of vegetation types (e.g. Li et al. 2013). For example, when diagnostic species are calculated from the results of clustering, re-assignment of the plots might be necessary in order to achieve a consistent classification (Willner 2011; Luther-Mosebach et al. 2012).
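The link between the number of clusters and typological resolution in hierarchical clustering can be made concrete with a minimal agglomerative sketch. It uses average linkage as an assumed algorithm and a hypothetical dissimilarity matrix for five plot records; stopping the merging when k clusters remain gives the same groups as cutting the full hierarchy at k clusters.

```python
def average_linkage(dist, k):
    """Agglomerate plots by average linkage until k clusters remain.

    `dist` is a full, symmetric matrix of dissimilarities between plots.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Mean dissimilarity over all between-cluster pairs of plots.
                d = sum(dist[a][b] for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical dissimilarities between five plot records.
dist = [
    [0.0, 0.1, 0.4, 0.9, 0.9],
    [0.1, 0.0, 0.4, 0.9, 0.9],
    [0.4, 0.4, 0.0, 0.8, 0.8],
    [0.9, 0.9, 0.8, 0.0, 0.15],
    [0.9, 0.9, 0.8, 0.15, 0.0],
]

coarse = average_linkage(dist, 2)
fine = average_linkage(dist, 3)
print("2 clusters:", coarse)
print("3 clusters:", fine)
```

Asking for three clusters instead of two splits plot 2 from plots 0 and 1, i.e. it yields a finer typological resolution from the same hierarchy. Neither solution is the 'natural' one; the choice depends on the resolution wanted.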

Evaluation of vegetation types
Following Gauch & Whittaker (1981), we distinguish internal and external evaluation criteria (Table 7). Internal criteria evaluate the appropriateness of the vegetation types by using the primary vegetation attributes. Internal evaluation is often used to choose among alternative grouping procedures, or to choose between alternative parameterizations of a given procedure, for example to decide on the number of clusters (Vendramin et al. 2010). External evaluation uses secondary attributes, or a previous classification of the same plot records, as a benchmark for comparison. In relation to the requirements of a classification (Table 3), external criteria often evaluate the ability of vegetation types to indicate external conditions (e.g. how well the site conditions or the geographic location of a plot can be predicted from its membership in a given unit). Alternatively, one may assess the degree to which vegetation types are identified using external attributes (e.g. whether plot membership can be predicted from environmental conditions).
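As an example of internal evaluation used to decide on the number of clusters, the sketch below compares two alternative partitions of the same plot records with the mean silhouette width. This index is not named in the text above and is used here only as one widely known measure of compactness and separation; the dissimilarity matrix and the two partitions are hypothetical.

```python
def mean_silhouette(dist, labels):
    """Mean silhouette width of a partition with at least two clusters.

    `dist` is a full dissimilarity matrix; `labels[i]` is the cluster of plot i.
    """
    groups = {}
    for i, label in enumerate(labels):
        groups.setdefault(label, []).append(i)
    total = 0.0
    for i, label in enumerate(labels):
        own = groups[label]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        # a: mean dissimilarity to the other plots of the same cluster.
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: mean dissimilarity to the nearest other cluster.
        b = min(sum(dist[i][j] for j in members) / len(members)
                for other, members in groups.items() if other != label)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)

# Hypothetical dissimilarities between five plot records.
dist = [
    [0.0, 0.1, 0.4, 0.9, 0.9],
    [0.1, 0.0, 0.4, 0.9, 0.9],
    [0.4, 0.4, 0.0, 0.8, 0.8],
    [0.9, 0.9, 0.8, 0.0, 0.15],
    [0.9, 0.9, 0.8, 0.15, 0.0],
]

two_types = mean_silhouette(dist, [0, 0, 0, 1, 1])
three_types = mean_silhouette(dist, [0, 0, 1, 2, 2])
print(f"2 types: {two_types:.3f}   3 types: {three_types:.3f}")
```

Under this particular criterion the two-type solution scores higher. Another internal criterion, or an external one, could favour the other partition.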

Characterization of vegetation types
Characterization should include the most important information about vegetation types that different end-users may require. Table 8 summarizes different kinds of information that the characterization of vegetation types may include. Additional information may be added to complement the characterization of vegetation types for particular applications. Examples include assessments of degree of conservation, protection status, vulnerability to invasions, animal habitat suitability, recommendations for management or ecosystem services provided (e.g. Berg et al. 2004, 2014).

Table 7. Evaluation criteria for plot-based classification protocols (compare to Table 3).

Criterion: Explanation
Internal criteria
Distinctiveness of units: Evaluates how distinct vegetation types are in terms of primary vegetation attributes. For example, one can evaluate the compactness and between-cluster separation in the multivariate space (e.g. Carranza et al. 1998; Aho et al. 2008; Roberts 2015)
Similar internal heterogeneity: Evaluation of the similarity of vegetation types in their internal heterogeneity (e.g. compositional variability)
Classification stability: Evaluates whether similar units are obtained (1) in a slightly modified data set (e.g. bootstrapped, or with a few plots added, deleted or replaced, or jittering abundance values; e.g. Tichý et al. 2011); or (2) in parallel non-overlapping subsets, selected randomly from the same data set or sampled independently in the same area (e.g. Botta-Dukát 2008)
Identifiability of units: Evaluates the ability to easily identify the vegetation types using a subset of the primary vegetation attributes, for example with diagnostic species (Willner 2006)
External criteria
Environmental evaluation: Evaluates the compactness and differentiation of vegetation units in environmental space, often by using multivariate statistics (e.g. Orlóci 1978; Hakes 1994)
Geographic evaluation: Evaluates the appropriateness of the vegetation type from its spatial distribution. For example, it may be important to assess whether the geographic extent of a given vegetation unit is too small; or whether the geographic ranges of vegetation units overlap or correspond to some meaningful biogeographic regions (e.g. Loidi et al. 2010)
Evaluation by using taxon traits: Evaluates the predictive value with respect to biogeography, population ecology or ecological requirements of their component taxa by examining taxon attributes such as distribution range, functional traits or life history
Comparison with an alternative classification: Evaluation by comparison to a previous classification of the same plots. For example, to determine the algorithm and parameterization that best fits the criteria used by experts in the definition of the legacy classification (e.g. Grabherr et al. 2003)

Examples
The following examples have been chosen to illustrate our comparative framework. Although we tried to include frequently used approaches, our selection is neither comprehensive nor meant as a recommendation of preferred approaches.

Physiognomic approaches
The first classification attempts ever made for large areas were physiognomic (Grisebach 1838). Most physiognomic classifications are not plot-based, in the sense that plot records are not used to define vegetation types and classification keys (e.g. UNESCO 1973). An example of a modern, plot-based, physiognomic system is that adopted for the Australian National Vegetation Information System (see Beard & Webb 1972;Walker & Hopkins 1990;ESCAVI 2003). This system has six hierarchical levels and is primarily physiognomic, although floristic composition also plays a role. Vegetation types in each level arise as combinations of predefined categories. Nested relationships between vegetation types are ensured because the sets of primary vegetation attributes used at coarser levels are a subset of those used at finer levels: 'Classes' (level I) are defined according to the dominant growth form of the dominant stratum, whereas 'structural formations' (level II) are defined as the combination of dominant growth form, cover class and height class for the dominant stratum.
Levels III and IV incorporate the dominant genus of the dominant stratum and of three strata, respectively, as classification criterion; additional floristic criteria are considered for levels V and VI. Whereas the system has a predefined set of vegetation types for the two uppermost levels, the vegetation types of the remaining levels are defined when using the set of predefined categories and a specific grammar to describe individual plot records, as in other descriptive physiognomic approaches (e.g. Dansereau 1951). The protocols in this system can be labelled as plot-based, but they are fundamentally distinct from floristic approaches, which typically use formal procedures to group plot records.
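The way nestedness follows from using a subset of attributes at the coarser level can be sketched with a small data structure. The cover and height thresholds and the category names below are hypothetical placeholders, not the categories of the Australian system; only the logic (level I uses growth form, level II adds cover class and height class) follows the description above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DominantStratum:
    growth_form: str      # e.g. "tree", "shrub", "tussock grass"
    cover_percent: float
    height_m: float

def cover_class(cover_percent):
    """Hypothetical cover classes (thresholds are illustrative only)."""
    if cover_percent >= 70:
        return "dense"
    if cover_percent >= 30:
        return "mid-dense"
    if cover_percent >= 10:
        return "sparse"
    return "isolated"

def height_class(height_m):
    """Hypothetical height classes (thresholds are illustrative only)."""
    if height_m >= 30:
        return "tall"
    if height_m >= 10:
        return "mid"
    return "low"

def level_one(stratum):
    """Level I: dominant growth form of the dominant stratum."""
    return stratum.growth_form

def level_two(stratum):
    """Level II: growth form combined with cover class and height class."""
    return (stratum.growth_form,
            cover_class(stratum.cover_percent),
            height_class(stratum.height_m))

plot = DominantStratum(growth_form="tree", cover_percent=45, height_m=18)
print(level_one(plot))
print(level_two(plot))
```

Because the level II description contains the level I attribute, every level II type belongs to exactly one level I type, which is what makes the hierarchy nested.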

Dominant species approaches
Although species dominance has long been used as a classification criterion to informally classify forest stands, there are formal classification approaches that use this as the main classification criterion of low-level units. The ecological scope of dominant species approaches is often limited to floristically poor areas, because the concept of species dominance is difficult to apply as a classification criterion to communities composed of large numbers of species, such as lowland tropical forests. One example of a dominant species approach is that proposed by Du Rietz (1930) and employed in Northern Europe, where the 'sociation' was the basic unit of vegetation classification (see Mueller-Dombois & Ellenberg 1974; Trass & Malmer 1978). The protocols for sociations were plot-based and used the dominant species of each vegetation layer as primary vegetation attribute. Another hierarchical level was that of 'consociations', which were type-based classes of sociations whose uppermost layer was dominated by the same species. Thus, in this case building definitions of vegetation types in the bottom-up direction ensured their nestedness. Another example of a species dominance approach is the one used for some time in British and North American ecology, where vegetation was classified according to 'dominance types' (Whittaker 1978b). Dominance types were defined by the dominance (in terms of importance values) of one or more species in the uppermost layer, thus resembling the notion of consociation. In Russia, the most successful classification approach, developed by Sukachev (1928), was similar to that of Du Rietz. The units from the 'association' (close to the 'sociation' of Du Rietz) to the 'formation' levels were defined by dominance criteria, while additional coarser classification levels were defined according to vegetation physiognomy (Aleksandrova 1978).
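The bottom-up logic of sociations and consociations can be sketched in a few lines. The layer names, their order and the plot records are assumptions made for illustration; the sketch simply takes the species with the highest cover in each layer to define the sociation and then groups sociations by the dominant of the uppermost layer.

```python
from collections import defaultdict

# Assumed layer names, ordered from the uppermost layer downwards.
LAYERS = ("tree", "shrub", "field", "bottom")

def sociation(record):
    """Dominant species of each layer (None where a layer is absent)."""
    return tuple(
        max(record[layer], key=record[layer].get) if record.get(layer) else None
        for layer in LAYERS
    )

def consociation(soc):
    """Dominant species of the uppermost layer present in the sociation."""
    return next(species for species in soc if species is not None)

# Hypothetical plot records: {layer: {species: cover in %}}.
plots = {
    "p1": {"tree": {"Pinus sylvestris": 60, "Betula pubescens": 10},
           "field": {"Vaccinium myrtillus": 40, "Vaccinium vitis-idaea": 15},
           "bottom": {"Pleurozium schreberi": 70}},
    "p2": {"tree": {"Pinus sylvestris": 50},
           "field": {"Calluna vulgaris": 50, "Vaccinium myrtillus": 5}},
    "p3": {"tree": {"Picea abies": 70},
           "field": {"Vaccinium myrtillus": 30}},
}

groups = defaultdict(list)
for plot_id, record in plots.items():
    groups[consociation(sociation(record))].append(plot_id)

print(sociation(plots["p1"]))
print(dict(groups))
```

Plots p1 and p2 belong to different sociations (different field-layer dominants) but to the same consociation, because the consociation is derived from the sociation and not defined independently.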

Floristic approaches
Under this label we include classification approaches whose lowest level units are defined according to the complete (or nearly so) taxonomic composition. These are often called phytosociological approaches, although the term phytosociology can also be used for plot-based vegetation classification in general (Dengler et al. 2008).

Traditional Braun-Blanquet approach
The Braun-Blanquet approach (Braun-Blanquet 1964) aims at producing a universal classification system including vegetation of any kind. The following description is based on Westhoff & van der Maarel (1973). Vegetation units in the traditional Braun-Blanquet approach are arranged into four main hierarchical levels, with 'association' being the basic one, followed by 'alliance', 'order' and 'class'. All vegetation types (called syntaxa) are defined by floristic composition as the primary vegetation attribute. The basic unit, association, is defined by a characteristic species combination, which includes diagnostic species (i.e. species that find their optimum within the vegetation type and/or that allow differentiation between the current and closely-related types), and constant companions (i.e. species with high frequency). In contrast, primary vegetation attributes at higher hierarchical levels (alliance up to class) are normally restricted to diagnostic species. In the case of associations, classification protocols are plot-based and class definition procedures include preferential sampling, the rearrangement of compositional tables according to groups of differentiating species and the comparison of preliminary plot groupings with the floristic composition of types already defined. Uniform physiognomy and environmental conditions can be regarded as validation criteria for new associations, in addition to the requirement of distinct species composition. Classification protocols for vegetation types of higher rank are type-based and, broadly speaking, class definition procedures include the identification of groups of species whose occurrence is restricted to a group of types of the lower rank.
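The two ingredients of a characteristic species combination can be quantified in several ways. The sketch below, which is not part of the traditional table-sorting procedure, computes constancy as the proportion of plots of a type in which a species occurs, and uses the phi coefficient of association as one possible measure of how strongly a species is concentrated in a type. The species labels and plot records are hypothetical.

```python
import math

def constancy(plots, species):
    """Proportion of plot records (sets of species) that contain the species."""
    return sum(species in plot for plot in plots) / len(plots)

def phi(records, species, target):
    """Phi coefficient between occurrence of a species and membership in a type.

    `records` is a list of (set of species, type label) pairs.
    """
    N = len(records)                                   # all plots
    Np = sum(t == target for _, t in records)          # plots in the target type
    n = sum(species in spp for spp, _ in records)      # occurrences of the species
    n_p = sum(species in spp and t == target for spp, t in records)
    den = math.sqrt(n * Np * (N - n) * (N - Np))
    return (N * n_p - n * Np) / den if den else 0.0

# Hypothetical plot records of two vegetation types.
records = [
    ({"A", "C"}, "type 1"),
    ({"A", "B", "C"}, "type 1"),
    ({"A", "C"}, "type 1"),
    ({"B", "C"}, "type 2"),
    ({"C"}, "type 2"),
    ({"C", "D"}, "type 2"),
]
type_1 = [spp for spp, t in records if t == "type 1"]

for sp in ("A", "B", "C"):
    print(sp, round(constancy(type_1, sp), 2), round(phi(records, sp, "type 1"), 2))
```

Species A behaves as a diagnostic species of type 1 (present in all of its plots and absent elsewhere), whereas species C is a constant companion: it has full constancy in type 1 but no diagnostic value because it occurs everywhere.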

Modern variants of the Braun-Blanquet approach
The Braun-Blanquet approach has followers in many parts of the world, although it has been most extensively applied in Europe. Due to the long tradition of this approach and the lack of a central coordination, many different variants have emerged and been applied in different countries and periods. This has led to classification systems that widely differ between regions and countries, which in extreme cases might share not much more than common naming conventions (syntaxonomy) and a similar typological resolution. Variations can be in the choice of primary vegetation attributes. In some cases, a complementary or prominent role is given to dominant species. In others, vegetation structure or physiognomy is considered in addition to floristic composition (e.g. Landucci et al. 2015). The use of constraining attributes also differs across applications of the method, particularly regarding types of high rank. Class definition procedures are varied, ranging from expert-based approaches to highly formalized node-based or boundary-based plot-grouping algorithms. In fact, most of the methodological alternatives listed in section 'Critical decisions: plot-based class definition procedures' have been used in modern applications of the Braun-Blanquet approach. The structural requirements for classification hierarchies, and the role that diagnostic species play, also vary widely between different variants (and are often not made explicit; Willner & Grabherr 2007).

British National Vegetation Classification
The British National Vegetation Classification (Rodwell 1991-2000) is an example of a classification system where a clear classification approach has been consistently followed. It can be considered either as one of the modern variants of the Braun-Blanquet approach or as an independent phytosociological approach. Four plot-based classification protocols can be distinguished, due to variation in spatial grain: four plot sizes were used to sample different vegetation types depending on the size of dominant plants. Primary vegetation attributes were the complete species list, including cryptogams, with cover being recorded using the Domin scale. Field sampling locations followed a preferential design; and data sets of new plots sampled in the field were complemented with additional plot records from previous studies. Sets of plots were grouped using the TWINSPAN algorithm (Hill 1979). Vegetation types, called 'communities', were the product of many rounds of analyses, with classification stability and expert-based assessment being used as validation criteria. Primary characterization included constancy classes and the range of cover values for all species. Although the classification system has one main classification level, vegetation types were presented in 12 major vegetation groups. Manual classification keys exist but an automated assignment procedure for new plots was developed based on the similarity of these plots with constancy columns of particular communities (Hill 1989).

Synusial approaches
The traditional Braun-Blanquet approach and its modern variants are restricted to the classification of phytocoenoses, i.e. assemblages that include all plants (or at least all vascular plants) of the community. However, other branches of phytosociology have focused on the classification of synusiae, i.e. one-layered, ecologically homogeneous assemblages (e.g. epiphytic or epilithic communities, herbaceous communities, shrubby fringe communities), using similar classification approaches (see Barkman 1980). A modern example is the Integrated Synusial approach, developed in Switzerland and France (Gillet et al. 1991; Gillet & Gallandat 1996; Julve 1998-2014). This approach implies having separate plot records and building separate CCS for each category of synusiae, i.e. tree, shrub, herb and cryptogam layers. Synusial vegetation types are called 'elementary syntaxa'. Class definition procedures for elementary syntaxa are very similar to those of the Braun-Blanquet approach, although with some notable differences in the sampling protocols (Gillet et al. 1991). After elementary syntaxa are defined, a type-based CCS can be created for the classification of complete phytocoenoses, based on their synusial composition. For this purpose, plot records are made of lists of elementary syntaxa and they are subsequently compared and grouped as plot records of taxa in the Braun-Blanquet approach.

The EcoVeg approach
EcoVeg (USFGDC 2008; Jennings et al. 2009; Faber-Langendoen et al. 2014) is an integrated physiognomic-floristic-ecological classification approach that aims to systematically classify all of the world's existing vegetation, preferably using vegetation plots. EcoVeg has broadly distinct protocols for natural/semi-natural vs cultural vegetation, including separate eight-level hierarchies. Within each hierarchy there are somewhat distinct protocols for three sets of levels (upper, mid and low levels). For natural and semi-natural vegetation, the upper levels (L1: 'Formation class'; L2: 'Formation subclass'; L3: 'Formation') use classification protocols based on growth forms as primary vegetation attributes; the mid levels (L4: 'Division'; L5: 'Macrogroup'; L6: 'Group') use protocols based on both growth forms and floristic composition; and the lower levels (L7: 'Alliance'; L8: 'Association') use protocols based on floristic composition only. In addition to the primary vegetation attributes, protocols also include the specification of constraining attributes. For example, 'Formation Subclasses' (L2) of natural vegetation are defined using combinations of dominant and diagnostic growth forms that are chosen to reflect specific global macro-climatic factors (e.g. tropical vs temperate) or macro-substrate factors (e.g. saltwater vs freshwater). In all cases, type definitions are boundary-based. Although not all levels are plot-based, the goal of this approach is to document all types at all levels from plot data, using a dynamic peer review process. The characterization of types includes the vegetation attributes, environment, dynamics, key diagnostic features, geographic range and synonymy. Levels L5-L8 of EcoVeg are similar to the 'class', 'order', 'alliance' and 'association' levels of the Braun-Blanquet approach.
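The level structure described above for natural and semi-natural vegetation can be written down as a small lookup table, which makes it easy to query which levels share a protocol. The representation, the variable names and the helper functions are only an illustrative sketch of the description in this section; the correspondence with Braun-Blanquet ranks is approximate, as stated in the text.

```python
GROWTH = "growth forms"
FLORISTIC = "floristic composition"

# Level number -> (level name, primary vegetation attributes).
ECOVEG_LEVELS = {
    1: ("Formation class", (GROWTH,)),
    2: ("Formation subclass", (GROWTH,)),
    3: ("Formation", (GROWTH,)),
    4: ("Division", (GROWTH, FLORISTIC)),
    5: ("Macrogroup", (GROWTH, FLORISTIC)),
    6: ("Group", (GROWTH, FLORISTIC)),
    7: ("Alliance", (FLORISTIC,)),
    8: ("Association", (FLORISTIC,)),
}

# Approximately similar ranks in the Braun-Blanquet approach.
BRAUN_BLANQUET_ANALOGUE = {5: "class", 6: "order", 7: "alliance", 8: "association"}

def tier(level):
    """Set of levels (upper, mid or low) to which a level belongs."""
    if level <= 3:
        return "upper"
    if level <= 6:
        return "mid"
    return "low"

def levels_using(attribute):
    """Levels whose protocols use the given primary vegetation attribute."""
    return [lvl for lvl, (_, attrs) in ECOVEG_LEVELS.items() if attribute in attrs]

print(levels_using(FLORISTIC))
print(tier(5), ECOVEG_LEVELS[5][0], "~", BRAUN_BLANQUET_ANALOGUE[5])
```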

Concluding remarks
The development of common concepts and terminology is essential for providing a global perspective to vegetation classification approaches. Working towards that end, the broad international authorship of this article extensively discussed various concepts, often specific to local and regional traditions, and finally was able to accept certain conventions. The framework presented here will be useful for describing and comparing both new and legacy classification approaches. We tried to avoid being overly prescriptive because our aim was not to compare the advantages and disadvantages of the different classification approaches and protocols. Nevertheless, we feel that our globalized world will sooner or later require international conventions with respect to vegetation classification practices. Because a single, universally valid, classification approach may not satisfy everybody, users and developers of vegetation classifications should work together to seek commonalities among the different approaches and, ultimately, promote a set of conventional, harmonized practices adapted for different situations. For example, standard guidelines could be recommended for the development of CCS conditioned on the choices made by the user regarding the ecological scope (e.g. temperate forest vegetation), primary vegetation attributes (e.g. floristic composition or morpho-functional attributes) and typological resolution (e.g. associations or formations). This huge task demands operative and shared definitions forming a common vocabulary, and the main goal of the framework in this paper was to provide direction for this process.
The need for broad-scale classification systems has recently driven European vegetation scientists to work hard on the integration of CCS and classification systems that the application of the different variants of the Braun-Blanquet approach has produced in different areas. This task is particularly challenging due to the multiplicity of approaches and because the validity of diagnostic species and floristic vegetation types is inherently geographically limited. Integration of CCS is usually done at the national or regional scale through compilation of national monographs or hierarchical lists of vegetation types (Jiménez-Alfaro et al. 2014). Only relatively recently have CCS been developed for all the vegetation types of whole countries or states, such as in the Netherlands (Schaminée et al. 1995 et seq.) and the Czech Republic (Chytrý 2007-2013); and initiatives exist for larger areas (e.g. Dengler et al. 2013; Walker et al. 2013). Establishing plot-based CCS for types of high hierarchical rank at sub-continental to continental scales is also a relatively new development (e.g. Zechmeister & Mucina 1994; Eliáš et al. 2013), and raises the question of how the types in these new CCS can be related to types of lower rank. We believe that the framework presented here will be useful for this integration task, as it will contribute to the understanding of the differences between the approaches employed to develop the different legacy classification systems. Moreover, it will force integrated systems to be explicit about the different CCS and the protocols used in each section.
In addition to the promotion of standard approaches and the integration of classification systems produced using similar approaches, it will be necessary to relate vegetation types defined in classification systems having the same scope but produced using very different approaches. Referencing across legacy classifications may facilitate their preservation and avoid the problems of forcing their integration into a single framework. In the particular case of classification approaches having similar protocols at fine typological resolution, as happens for associations and alliances of the Braun-Blanquet and EcoVeg approaches, another alternative may be the harmonization of vegetation type definitions (i.e. building cross-walks) at these levels of resolution, a strategy that would ensure both the compatibility of classification systems and the preservation of original classification criteria at coarser levels of resolution.