Quantifying sample completeness and comparing diversities among assemblages.
Peer reviewed, Journal article
Original version: Chao, A., Kubota, Y., Zeleny, D., Chiu, C.-H., Li, C.-F., Kusumoto, B., Yasuhara, M., Thorn, S., Wei, C.-L., Costello, M. J. & Colwell, R. K. (2020). Quantifying sample completeness and comparing diversities among assemblages. Ecological Research, 35(2), 292-314. doi: 10.1111/1440-1703.12102
We develop a novel class of measures to quantify sample completeness of a biological survey. The class of measures is parameterized by an order q ≥ 0 to control for sensitivity to species relative abundances. When q = 0, species abundances are disregarded and our measure reduces to the conventional measure of completeness, that is, the ratio of the observed species richness to the true richness (observed plus undetected). When q = 1, our measure reduces to the sample coverage (the proportion of the total number of individuals in the entire assemblage that belongs to detected species), a concept developed by Alan Turing in his cryptographic analysis. The sample completeness of a general order q ≥ 0 extends Turing's sample coverage and quantifies the proportion of the assemblage's individuals belonging to detected species, with each individual being proportionally weighted by the (q − 1)th power of its abundance. We propose the use of a continuous profile depicting our proposed measures with respect to q ≥ 0 to characterize the sample completeness of a survey. An analytic estimator of the diversity profile and its sampling uncertainty based on a bootstrap method are derived and tested by simulations. To compare diversity across multiple assemblages, we propose an integrated approach based on the framework of Hill numbers to assess (a) the sample completeness profile, (b) asymptotic diversity estimates to infer true diversities of entire assemblages, (c) non‐asymptotic standardization via rarefaction and extrapolation, and (d) an evenness profile. Our framework can be extended to incidence data. Empirical data sets from several research fields are used for illustration.
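The verbal definition of the order-q completeness measure can be illustrated with a small numerical sketch. It assumes one reading of the abstract: the measure is the sum of p_i^q over detected species divided by the sum of p_i^q over all species, where p_i is the true relative abundance of species i. The function name, its arguments, and the four-species assemblage are hypothetical and used for illustration only. The sketch needs the true relative abundances, which are known only in a simulation setting; it is not the analytic estimator derived in the paper.

```python
from typing import Sequence


def sample_completeness(rel_abundances: Sequence[float],
                        detected: Sequence[bool],
                        q: float) -> float:
    """Order-q completeness under the assumed form
    sum(p_i^q over detected species) / sum(p_i^q over all species).

    rel_abundances: true relative abundances of all species (hypothetical input).
    detected: True where the species appears in the sample.
    """
    if q < 0:
        raise ValueError("q must be non-negative")
    if len(rel_abundances) != len(detected):
        raise ValueError("inputs must have the same length")
    # Species with zero abundance are not part of the assemblage; skip them.
    total = sum(p ** q for p in rel_abundances if p > 0)
    found = sum(p ** q for p, d in zip(rel_abundances, detected) if d and p > 0)
    return found / total


# Hypothetical assemblage: four species, the rarest one undetected.
p = [0.5, 0.3, 0.15, 0.05]
seen = [True, True, True, False]

c0 = sample_completeness(p, seen, 0)  # observed richness / true richness
c1 = sample_completeness(p, seen, 1)  # sample coverage
c2 = sample_completeness(p, seen, 2)  # weights abundant species more heavily
print(c0, c1, c2)
```

With these inputs, q = 0 gives 3/4 (three of four species detected) and q = 1 gives 0.95 (the detected species hold 95% of individuals), matching the two special cases described above. Evaluating the function over a grid of q values would trace the kind of completeness profile the abstract proposes.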