641 research outputs found

    Filtering News from Document Streams: Evaluation Aspects and Modeled Stream Utility

    Events like hurricanes, earthquakes, or accidents can impact a large number of people. Not only are people in the immediate vicinity of the event affected, but concerns about their well-being are shared by the local government and well-wishers across the world. The latest information about news events could help government and aid agencies make informed decisions on providing necessary support, security and relief. The general public obtains news updates via dedicated news feeds or broadcasts and, lately, via social media services like Facebook or Twitter. Retrieving the latest information about newsworthy events from the world-wide web is thus of importance to a large section of society. As new content on a multitude of topics is continuously being published on the web, specific event-related information needs to be filtered from the resulting stream of documents. In this thesis, we present a user-centric measure for evaluating systems that filter news-related information from document streams. Our proposed evaluation measure, Modeled Stream Utility (MSU), models users accessing information from a stream of sentences produced by a news update filtering system. The user model allows for simulating a large number of users with different characteristic stream-browsing behaviors. Through simulation, MSU estimates the utility of a system for an average user browsing a stream of sentences. Our results show that system performance is sensitive to a user population's stream-browsing behavior and that existing evaluation metrics correspond to very specific types of user behavior. To evaluate systems that filter sentences from a document stream, we need a set of judged sentences. This judged set is a subset of all the sentences returned by all systems, and is typically constructed by pooling together the highest-quality sentences, as determined by the score each system assigns to each sentence. Sentences in the pool are manually assessed, and the resulting set of judged sentences is then used to compute system performance metrics. In this thesis, we investigate the effect that including duplicates of judged sentences in the judged set has on system performance evaluation. We also develop an alternative pooling methodology that, given the MSU user model, selects sentences for pooling based on the probability of a sentence being read by modeled users. Our research lays the foundation for future work on utilizing user models in different aspects of the evaluation of stream filtering systems. The MSU measure enables the incorporation of different user models. Furthermore, the applicability of MSU could be extended through calibration based on user behavior.

    Computational Intelligence for the Micro Learning

    Developments in Web technology and mobile devices have blurred the time and space boundaries of people's daily activities, enabling people to work, be entertained, and learn through a mobile device at almost any time and anywhere. Together with the requirement for life-long learning, these developments have given birth to a new learning style, micro learning. Micro learning aims to make effective use of learners' fragmented spare time to carry out personalised learning activities. However, the massive volume of users and online learning resources forces the micro learning system to be deployed in a context of enormous and ubiquitous data. Hence, manually managing the online resources or user information by traditional methods is no longer feasible. How to use computational intelligence based solutions to automatically manage and process different types of massive information is the biggest research challenge in realising the micro learning service. As a result, to support the micro learning service efficiently in the big data era, we need an intelligent system to manage the online learning resources and carry out different analysis tasks. To this end, an intelligent micro learning system is designed in this thesis. The design of this system is based on the service logic of the micro learning service. The micro learning system consists of three intelligent modules: the learning material pre-processing module, the learning resource delivery module and the intelligent assistant module. The pre-processing module interprets the content of the raw online learning resources and extracts key information from each resource. This pre-processing step makes the online resources ready to be used by the other intelligent components of the system. The learning resource delivery module aims to recommend personalised learning resources to the target user based on his/her implicit and explicit user profiles. The goal of the intelligent assistant module is to provide evaluation or assessment services (such as student dropout rate prediction and final grade prediction) to the educational resource providers or instructors. The educational resource providers can further refine or modify the learning materials based on these assessment results.

    Passive acoustic monitoring and audio subsampling: optimizing autonomous methods for avian biodiversity assessments

    Master's thesis, Conservation Biology, 2022, Universidade de Lisboa, Faculdade de Ciências. The global decline of bird populations has prompted the search for innovative tools to inventory and monitor their communities. While standard surveys require an observer in the field, autonomous sound recorders are an alternative that demands less expertise and is more scalable in time and space. There is no consensus in the literature on the efficiency of this method and the factors that influence it, particularly in multispecies studies, although recent attempts have yielded encouraging results regarding its applicability in practical monitoring situations. In this study, we conducted a set of observer-based and recorder-based bird point counts in cork oak woodlands in Portugal during the winter. We compared both methods in terms of richness values and species by species, and assessed the role of the observer and of sampling time in recorder performance. Additionally, we compared species richness values obtained through three types of intermittent audio file subsampling, and through intermittent and continuous approaches. We found that the observer detected significantly more species, but the observer's presence did not influence the recorder's results, and the pool of species detected by both was similar. We found time of sampling to be relevant for autonomous recorders. The degree of intermittence generated different cost/benefit scenarios for audio processing. Lastly, intermittent subsampling detected twice as many species as continuous subsampling. The results of this study showed that recorders tended to perform well in biodiversity surveys in winter, while being more flexible in scaling, especially when small portions of audio are analysed. However, they also suggest that the observer should not be dismissed a priori, and they highlight the complexity of the factors that may influence the recorder's performance. 
We encourage future studies to test this performance under a variety of temporal, spatial, and species-related constraints, to maximize the universality of a future all-year autonomous monitoring protocol.

    Top-k term publish/subscribe for geo-textual data streams


    Semantic Selection of Internet Sources through SWRL Enabled OWL Ontologies

    This research examines the problem of Information Overload (IO) and gives an overview of various attempts to resolve it. Furthermore, it argues that instead of fighting IO, it is advisable to start learning how to live with it. It is unlikely that in the modern information age, where users are both producers and consumers of information, the amount of data and information generated will decrease. Furthermore, when managing IO, users are confined to the algorithms and policies of commercial Search Engines and Recommender Systems (RSs), which create results that also add to IO. This research calls for a change in thinking: giving greater power to users when addressing the relevance and accuracy of internet searches, which helps in managing IO. However powerful search engines are, they do not process enough semantics at the moment when search queries are formulated. This research proposes a semantic selection of internet sources through SWRL enabled OWL ontologies. The research focuses on SWT and its Stack because they (a) secure the semantic interpretation of the environments where internet searches take place and (b) guarantee reasoning that results in the selection of suitable internet sources at a particular moment of an internet search. Therefore, it is important to model the behaviour of users through OWL concepts and reason upon them in order to address IO when searching the internet. Thus, user behaviour is itemized through user preferences, perceptions and expectations from internet searches. The approach proposed in this research is a Software Engineering (SE) solution which provides computations based on the semantics of the environment stored in the ontological model.

    Algorithms in E-recruitment Systems
