13 research outputs found

    Sparse Topic Modeling: Computational Efficiency, Near-Optimal Algorithms, and Statistical Inference

    No full text
    Sparse topic modeling under the probabilistic latent semantic indexing (pLSI) model is studied. Novel and computationally fast algorithms for estimation and inference of both the word-topic matrix and the topic-document matrix are proposed, and their theoretical properties are investigated. Both minimax upper and lower bounds are established, and the results show that the proposed algorithms are rate-optimal up to a logarithmic factor. Moreover, a refitting algorithm is proposed to establish asymptotic normality and construct valid confidence intervals for the individual entries of the word-topic and topic-document matrices. Simulation studies are carried out to investigate the numerical performance of the proposed algorithms. The results show that the proposed algorithms perform well numerically and are more accurate than existing methods in a range of simulation settings. In addition, the methods are illustrated through an analysis of the COVID-19 Open Research Dataset (CORD-19).

    Relationship between extinction threshold (t) and robustness (R50) in 12 food webs in order of species richness.

    No full text
    The threshold t was varied from 5% to 95% in steps of 5%, and the robustness at each t was recorded for each removal approach. All algorithms were run 10 times on each food web, and the averages are reported.
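The threshold sweep described above can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the common topological convention that a consumer goes secondarily extinct once all of its prey are gone, and that robustness at threshold t is the fraction of species that must be removed directly before total losses (primary plus secondary) reach a fraction t of the web. The names `prey_of`, `basal`, `secondary_extinctions`, and `robustness`, and the four-species toy web, are hypothetical.

```python
def secondary_extinctions(prey_of, basal, alive):
    """Drop consumers with no surviving prey until nothing changes.

    prey_of: dict mapping each consumer to the set of its prey (assumed layout).
    basal:   set of species that need no prey.
    """
    alive = set(alive)
    changed = True
    while changed:
        changed = False
        for s in list(alive):
            if s in basal:
                continue
            if not (prey_of.get(s, set()) & alive):
                alive.discard(s)
                changed = True
    return alive


def robustness(prey_of, basal, order, t):
    """Fraction of species removed directly before total loss reaches t."""
    species = set(basal) | set(prey_of)
    n = len(species)
    alive = set(species)
    removed = 0
    for s in order:
        if s not in alive:
            continue  # already lost through secondary extinction
        alive.discard(s)
        removed += 1
        alive = secondary_extinctions(prey_of, basal, alive)
        if (n - len(alive)) / n >= t:
            break
    return removed / n


# Toy web (hypothetical): one plant, a herbivore, a carnivore, an omnivore.
basal = {"plant"}
prey_of = {"herb": {"plant"}, "carn": {"herb"}, "omni": {"plant", "herb"}}

# Thresholds from 5% to 95% in steps of 5%, as in the caption.
thresholds = [k / 100 for k in range(5, 100, 5)]
order = ["plant", "herb", "carn", "omni"]
curve = {t: robustness(prey_of, basal, order, t) for t in thresholds}
```

Removing the plant first collapses the whole toy web, so the curve is flat at 0.25; a removal order that starts at the top of the web gives larger values. Averaging over repeated runs, as the caption describes, only matters for removal strategies that involve randomness.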

    Relationship between primary and cumulative extinction in 12 food webs in order of species richness.

    No full text
    ID, OD, SD, PD, EIG and TS represent the keystone species identification strategies based on in-degree, out-degree, sum of in-degree and out-degree, product of in-degree and out-degree, eigenvector, and tabu search, respectively. All algorithms were tested 10 times on each food web, and averages were taken as results.
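The four degree-based strategies named above (ID, OD, SD, PD) can be illustrated with a short sketch. It assumes an unweighted edge list of `(source, target)` pairs; the function names `degree_scores` and `rank_by` and the example links are hypothetical, and the eigenvector and tabu search strategies are not covered here.

```python
from collections import defaultdict


def degree_scores(links):
    """Compute ID, OD, SD and PD for every node in a directed edge list."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    nodes = set()
    for src, dst in links:
        out_deg[src] += 1
        in_deg[dst] += 1
        nodes.update((src, dst))
    scores = {}
    for node in sorted(nodes):
        i, o = in_deg[node], out_deg[node]
        scores[node] = {"ID": i, "OD": o, "SD": i + o, "PD": i * o}
    return scores


def rank_by(scores, key):
    """Order nodes by descending score; ties are broken by node name."""
    return sorted(scores, key=lambda n: (-scores[n][key], n))


# Hypothetical four-node web.
links = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
scores = degree_scores(links)
removal_order = rank_by(scores, "PD")
```

Each ranking gives a candidate removal order: the species with the highest score is removed first. For a weighted (quantitative) food web, the counts could be replaced by sums of link weights, which is a design choice the source does not specify.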

    A quantitative food web example with 6 species and 19 weighted links.

    No full text
    Virtual node v1 represents the external environment. The link from v1 to v2 with a green weight represents the energy flow that the food web receives from the external environment, and the links from v2, …, v7 pointing to v1 with yellow weights indicate energy flows from the food web into the external environment.

    Major characteristics of food webs used in the present research.

    No full text

    Mean SEA values (mean ± SEM) of each removal algorithm.

    No full text

    Illustration of the tabu search-based food web disintegration strategy.

    No full text
    The blue row represents the current solution; X1, X2, X3 and X4 represent the four candidate solutions generated; and the rectangle on the right represents F(Xind). The red value is the optimal value of the objective function in one cycle, and its corresponding swap is marked in green.
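The cycle in this illustration (generate candidate solutions by swaps, evaluate the objective for each, keep the best swap) matches the general shape of tabu search, which can be sketched as below. This is a generic sketch under stated assumptions, not the authors' implementation: a solution is taken to be a set of `k` items, a move swaps one item in the set for one outside it, and reversing a recent swap is forbidden unless it improves on the best value found. The function name, parameters, and toy objective are all hypothetical, and `k` must be smaller than the number of items.

```python
import random
from collections import deque


def tabu_search(items, k, objective, iters=50, n_candidates=4, tenure=5, seed=0):
    """Maximize objective over k-subsets of items using swap moves."""
    rng = random.Random(seed)
    items = sorted(items)
    current = set(rng.sample(items, k))
    best, best_val = set(current), objective(current)
    tabu = deque(maxlen=tenure)  # recently applied swaps (out_item, in_item)
    for _ in range(iters):
        candidates = []
        for _ in range(n_candidates):
            out_item = rng.choice(sorted(current))
            in_item = rng.choice([x for x in items if x not in current])
            cand = (current - {out_item}) | {in_item}
            val = objective(cand)
            # Skip a move that undoes a recent swap, unless it beats the
            # best value so far (aspiration criterion).
            if (in_item, out_item) in tabu and val <= best_val:
                continue
            candidates.append((val, (out_item, in_item), cand))
        if not candidates:
            continue
        # Accept the best candidate of this cycle, even if it is worse
        # than the current solution; this lets the search leave local optima.
        val, move, cand = max(candidates, key=lambda c: c[0])
        current = cand
        tabu.append(move)
        if val > best_val:
            best, best_val = set(cand), val
    return best, best_val


# Toy objective (hypothetical): total weight of the chosen items.
weights = {i: i for i in range(10)}
best, best_val = tabu_search(weights, 3, lambda s: sum(weights[i] for i in s))
```

In the food web setting, the objective would instead score a set of removed species by the damage it causes, for example the number of secondary extinctions; that choice is an assumption here.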

    Dunnett post hoc test results of secondary extinction area.

    No full text

    Dunnett post hoc test results of robustness.

    No full text
    As species extinction accelerates globally and biodiversity declines dramatically, identifying keystone species becomes an effective way to conserve biodiversity. Traditional approaches assume that the extinction of species with high centrality poses the greatest risk of secondary extinction. However, they do not account for indirect effects, which are as important as local and direct effects. Here, we propose an optimized disintegration strategy model for quantitative food webs and introduce tabu search, a metaheuristic optimization algorithm, to identify keystone species. Topological simulations are used to record secondary extinctions during species removal and secondary extinction areas, as well as to evaluate food web robustness. The effectiveness of the proposed strategy is validated by comparison with traditional methods. Our experiments demonstrate that the strategy can optimize the effect of food web disintegration and, through global search, identify the species whose extinction is most destructive to the food web. The algorithm provides an innovative and efficient way for further development of keystone species identification in ecosystems.

    S1 Data -

    No full text