dc.contributor.advisor | Wilson, Simon | en |
dc.contributor.author | Al-Ghamdi, Asmaa | en |
dc.date.accessioned | 2021-02-25T14:37:46Z | |
dc.date.available | 2021-02-25T14:37:46Z | |
dc.date.issued | 2021 | en |
dc.date.submitted | 2021 | en |
dc.identifier.citation | Al-Ghamdi, Asmaa, An Integrated Framework for Estimating the Number of Classes with Application for Species Estimation, Trinity College Dublin. School of Computer Science & Statistics, 2021 | en |
dc.identifier.other | Y | en |
dc.identifier.uri | http://hdl.handle.net/2262/95385 | |
dc.description | APPROVED | en |
dc.description.abstract | The two most common approaches for estimating the number of distinct classes
within a population are either to use sampling data directly with combinatorial
arguments or to extrapolate historical discovery data. However, in the former
case, such detailed sampling data is often unavailable, while the latter approach
makes assumptions about the form of the parametric curves used to fit the discovery
data, assumptions that often lack theoretical justification. Instead, we propose an
integrated transdisciplinary framework that dissolves the boundaries between the
above two approaches. This is achieved by directly describing the sampling-discovery
process in parallel with describing a co-variate latent effort process,
where we have historical discovery data for the former process and some proxy
data for the latent process. The linkage between these two processes allows one to
form data on sampling records by forcing some constraints on how many samples
were taken over time. Due to the nature of the constrained data, many inference
techniques become infeasible. However, simulation-based methods such as
Approximate Bayesian Computation remain available. Our proposed approach
is demonstrated and analysed through many simulation experiments, and finally
applied in the ecology field to estimate the number of species as an example of
the number of classes problem. | en |
dc.publisher | Trinity College Dublin. School of Computer Science & Statistics. Discipline of Statistics | en |
dc.rights | Y | en |
dc.subject | Reliability, Species Estimation, Latent Effort Process, Approximate Bayesian Computation | en |
dc.title | An Integrated Framework for Estimating the Number of Classes with Application for Species Estimation | en |
dc.type | Thesis | en |
dc.contributor.sponsor | Scholarship from King Abdulaziz University, Saudi Arabia | en |
dc.relation.references | The fit of the Poisson-inverse Gaussian distribution to the several well-known data sets examined here suggests that it may be an appropriate model for species abundance data under specific conditions but will not always be the clear choice | en |
dc.relation.references | The main finding is that both the method of computing the marginal likelihood as the expectation of the likelihood under the prior, as the method of computing the harmonic mean of likelihood values when sampling from the posterior, are not trustworthy.... | en |
dc.relation.references | The code to implement the logistic function approach of [64] is written in C. This is a pure curve-fitting approach and only uses the discovery times as its input. It was applied to the CoL data. This data set was too large to be handled by the code, which was solved by dividing the yearly discovery counts by 40 and rounding to an integer, then fitting this smaller data set, obtaining an estimate of the number of species remaining to be discovered and multiplying that back by 40. The WoRMS data had a similar problem and that was solved similarly, but by dividing the number of discoveries per year by 10. | en |
dc.type.supercollection | thesis_dissertations | en |
dc.type.supercollection | refereed_publications | en |
dc.type.qualificationlevel | Doctoral | en |
dc.identifier.peoplefinderurl | https://tcdlocalportal.tcd.ie/pls/EnterApex/f?p=800:71:0::::P71_USERNAME:AALGHAMD | en |
dc.identifier.rssinternalid | 224453 | en |
dc.rights.ecaccessrights | openAccess | |