396 research outputs found

    Robust Cache System for Web Search Engine Yioop

    Caching is the most effective mechanism used by web search engines to optimize the performance of search queries. Search engines employ caching at multiple levels to improve their performance, for example, caching posting lists and caching result sets. Caching query results reduces the overhead of processing frequent queries and thus saves considerable time and computing power. Yioop is an open-source web search engine that uses a result cache to optimize searches. The current implementation uses a single dynamic cache based on Marker’s algorithm. The goal of the project is to improve the performance of the cache in Yioop. To choose a new caching system, the Static-Dynamic cache was evaluated along with its variations: the Machine Learning Static-Dynamic Cache, the Static-Semistatic-Dynamic Cache, and the Static-Topic-Dynamic Cache. Based on these experiments, the Static-Topic-Dynamic cache was implemented in Yioop. The Static-Dynamic cache exploits temporal locality by dividing the cache into a static part, which stores the most popular queries, and a dynamic part, which captures the bursty behavior of queries. The Static-Topic-Dynamic cache adds a topical section to the Static-Dynamic cache; it captures queries that are neither popular enough for the static cache nor frequent enough for the dynamic cache by creating a dedicated cache for each topic. To extract topics from the queries, the k-means algorithm was chosen as the topic model. The Static-Dynamic and Static-Topic-Dynamic caches showed improvements of 2.3% and 1%, respectively, over the initial performance of the cache.
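The static/dynamic split described in this abstract can be illustrated with a minimal sketch. It is not Yioop's implementation: the class name `StaticDynamicCache`, the `training_queries` log, and the `compute` callback are hypothetical, and the dynamic part uses plain LRU eviction as a stand-in rather than the Marker algorithm the abstract mentions.

```python
from collections import Counter, OrderedDict


class StaticDynamicCache:
    """Result cache split into a fixed static part and an LRU dynamic part."""

    def __init__(self, training_queries, static_size, dynamic_size):
        # Static part: reserved for the most frequent queries in a past
        # query log (the log itself is an assumed input).
        popular = Counter(training_queries).most_common(static_size)
        self.static_keys = {query for query, _ in popular}
        self.static = {}
        # Dynamic part: LRU eviction for brevity (assumption; not Marker).
        self.dynamic = OrderedDict()
        self.dynamic_size = dynamic_size
        self.hits = 0
        self.misses = 0

    def get(self, query, compute):
        """Return the cached result for query, calling compute on a miss."""
        if query in self.static:
            self.hits += 1
            return self.static[query]
        if query in self.dynamic:
            self.hits += 1
            self.dynamic.move_to_end(query)  # mark as most recently used
            return self.dynamic[query]
        self.misses += 1
        result = compute(query)
        if query in self.static_keys:
            # Popular queries are never evicted once stored.
            self.static[query] = result
        elif self.dynamic_size > 0:
            self.dynamic[query] = result
            if len(self.dynamic) > self.dynamic_size:
                self.dynamic.popitem(last=False)  # evict least recently used
        return result


if __name__ == "__main__":
    log = ["news", "news", "weather", "news", "maps"]
    cache = StaticDynamicCache(log, static_size=1, dynamic_size=1)
    for q in ["news", "weather", "news", "maps", "weather", "news"]:
        cache.get(q, lambda text: text.upper())
    print(cache.hits, cache.misses)
```

A topical section, as in the Static-Topic-Dynamic variant, would sit between the two parts as one small cache per topic. It is omitted here because the abstract does not specify how topics are assigned at lookup time.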

    Visualizing Digital Collections at Archive-It

    Archive-It, a subscription service from the Internet Archive, allows users to create, maintain, and view digital collections of web resources. The current interface of Archive-It is largely text-based, supporting drill-down navigation using lists of URIs. While this interface provides good searching capabilities, it is not efficient for browsing. In the absence of keywords, a user has to spend a large amount of time trying to locate a web page of interest. To provide a better visual experience, we have studied the underlying characteristics of Archive-It collections and implemented six different visualizations (treemap, time cloud, bubble chart, image plot, timeline, and wordle), each highlighting one or more of the underlying characteristics of the collection. Archive-It supports grouping web pages into categories; however, it does not enforce the use of this feature. As a result, many collections have missing or improper grouping. For such collections, we present a method of grouping web pages based on a set of pre-defined rules.

    KGCleaner : Identifying and Correcting Errors Produced by Information Extraction Systems

    KGCleaner is a framework to identify and correct errors in data produced and delivered by an information extraction system. These tasks have been understudied, and KGCleaner is the first to address both. We introduce a multi-task model that jointly learns to predict whether an extracted relation is credible and to repair it if not. We evaluate our approach and other models as instances of our framework on two collections: a Wikidata corpus of nearly 700K facts and 5M fact-relevant sentences, and a collection of 30K facts from the 2015 TAC Knowledge Base Population task. For credibility classification, a parameter-efficient, simple shallow neural network can achieve an absolute performance gain of 30 $F_1$ points on Wikidata and comparable performance on TAC. For the repair task, a significant performance gain (more than twofold) can be obtained, depending on the nature of the dataset and the models.

    The Effect of the Implementation of Regent Regulation Number 18 of 2011 on Improving Employee Work Discipline (Case Study of the Regional Development Planning Agency and the Regional Secretariat of Rokan Hulu Regency)

    Discipline is the starting point of all success in achieving an organization's objectives. Therefore, the local government of Rokan Hulu Regency issued Regent Regulation No. 18 of 2011 on compulsory congregational noon and Asr prayers at mosques in Pasir Pangarayan for Muslim employees on working days within the Rokan Hulu Regency administration, in order to increase employees' devotion to God Almighty, which in turn affects employee discipline. The Regional Development Planning Agency of Rokan Hulu and the Rokan Hulu Regional Secretariat are part of the Rokan Hulu Regency government environment. This study aims to determine how the implementation of Regent Regulation No. 18 of 2011 influences the work discipline of employees at the Regional Development Planning Agency and the Regional Secretariat of Rokan Hulu. The study uses quantitative research, with the required data collected at the study site by questionnaire. The respondents were 55 employees of the Rokan Hulu Regional Development Planning Agency and 119 employees of the Rokan Hulu Regional Secretariat. The questionnaire used a Likert scale, and the data were analyzed using simple linear regression and determination analysis with the help of SPSS version 17.0 for Windows. The results indicate that the implementation of Regent Regulation No. 18 of 2011 influences employee discipline by 23% at the Regional Development Planning Agency and by 29% at the Regional Secretariat. Key words: Implementation, Discipline

    A comparison of non-financial strategy disclosure in the annual reports of South African and Indian listed companies

    This study focuses on non-financial strategy disclosure in the annual reports of listed companies in South Africa and India. South Africa and India are both developing nations that face similar socioeconomic conditions, including the threats presented by the HIV/AIDS pandemic and affirmative action policies and regulations. The fact that integrated reporting is fast becoming a necessity for emerging markets to gain entrance to developed economies validates the contribution of this research. This study, which replicated the studies of Santema and Van de Rijt (2001) and Padia and Yasseen (2011), compared the top 40 listed companies in South Africa with the top 40 listed companies in India based on market capitalisation for the year 2012. The results were statistically analysed using principal component analysis and Hotelling’s t-square tests. The findings concluded that South African companies divulge more information in terms of their non-financial strategy disclosure than their Indian counterparts. In addition, the Hotelling’s t-square test found that there were no significant differences in terms of four of the variables when comparing South African companies with Indian companies. Overall, however, there are vast differences in the levels of non-financial strategy disclosure in both countries, which is attributed to stock exchange regulation in the respective countries. Key words: affirmative action, annual reports, HIV/AIDS, India, integrated reporting, nonfinancial disclosure, South Africa, strateg

    Comparative Analysis Using The Altman, Springate, Grover, Zmijewski, And Taffler Models In Assessing Financial Distress Before And During The Covid-19 Pandemic (Empirical Study On Transportation And Logistics Companies Listed On The Indonesia Stock Exchange 2017-2022)

    The purpose of this study was to find out whether there were major changes in the level of financial distress before and during the COVID-19 outbreak, using the Altman, Springate, Grover, Zmijewski, and Taffler assessment models for companies in the transportation and logistics industry listed on the IDX in 2017-2022, and to find out which of these models has the highest accuracy. Purposive sampling was used as the sampling technique, resulting in a total sample of 18 companies. The data analysis techniques were descriptive statistical analysis, normality tests, hypothesis testing consisting of paired sample t-tests and Wilcoxon signed-rank tests, and calculation of the accuracy level of each model. The research findings showed no major changes in the level of financial distress before and during the COVID-19 outbreak under the Altman, Springate, Grover, Zmijewski, and Taffler models. In addition, the model with the highest level of accuracy is the Zmijewski model, with an accuracy rate of 74.07%, exceeding the Springate model at 69.44%, the Taffler model at 69.44%, the Grover model at 67.59%, and the Altman model at 59.26%. Keywords: Financial Distress; Altman Model; Springate Model; Grover Model; Zmijewski Model; Taffler Model