11 research outputs found

    User-Centric Operational Decision Making in Distributed Information Retrieval

    Information specialists in enterprises regularly use distributed information retrieval (DIR) systems that query a large number of information retrieval (IR) systems, merge the retrieved results, and display them to users. There can be considerable heterogeneity in the quality of results returned by different IR servers. Further, because different servers handle collections of different sizes and have different processing and bandwidth capacities, there can be considerable heterogeneity in their response times. The broker in the DIR system has to decide which servers to query, how long to wait for responses, and which retrieved results to display, based on the benefits and costs imposed on users. The benefit of querying more servers and waiting longer is the ability to retrieve more documents. The costs may take the form of access fees charged by IR servers or the users' cost of waiting for the servers to respond. We formulate the broker's decision problem as a stochastic mixed-integer program and present analytical solutions for the problem. Using data gathered from FedStats (a system that queries IR engines of several U.S. federal agencies), we demonstrate that the technique can significantly increase the utility obtained from DIR systems. Finally, simulations suggest that the technique can be applied to solve the broker's decision problem in more complex decision environments.
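
    The decision structure described above can be illustrated with a small sketch: choose a subset of servers to query and a deadline to wait for, trading the value of additional documents against access fees and waiting cost. The server names, benefits, fees, response times, and cost weights below are hypothetical, and the brute-force search is only a stand-in for the stochastic mixed-integer program and analytical solutions developed in the paper.

        # Illustrative broker decision: which IR servers to query and how long to wait.
        # All numbers and names are hypothetical; the paper's actual model is a
        # stochastic mixed-integer program with analytical solutions.
        from itertools import combinations

        servers = {
            # name: (expected relevant documents, access fee, expected response time in s)
            "agency_a": (12.0, 0.50, 1.2),
            "agency_b": (8.0, 0.20, 3.5),
            "agency_c": (15.0, 1.00, 0.8),
        }
        DOC_VALUE = 0.10      # utility per expected relevant document
        WAIT_COST = 0.30      # utility lost per second the user waits

        def utility(subset, deadline):
            """Value of documents returned before the deadline, minus fees and waiting cost."""
            benefit = sum(docs for docs, _, rt in (servers[s] for s in subset) if rt <= deadline)
            fees = sum(fee for _, fee, _ in (servers[s] for s in subset))
            return DOC_VALUE * benefit - fees - WAIT_COST * deadline

        best = max(
            ((subset, deadline)
             for r in range(1, len(servers) + 1)
             for subset in combinations(servers, r)
             for deadline in (0.5, 1.0, 2.0, 4.0)),
            key=lambda choice: utility(*choice),
        )
        print("query:", best[0], "wait:", best[1], "s, utility:", round(utility(*best), 3))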

    Towards Enhancing the Capability of IoT Applications by Utilizing Cloud Computing Concept

    The emergence of smart and innovative applications in diverse domains has influenced our lives by presenting many state-of-the-art applications, ranging from offline to smart online systems, from smart communication systems to tracking systems, and many others. The availability of smart Internet-enabled systems has made the world a global village where people can collaborate, communicate, and share information in a secure and timely manner. Innovation in information technology focuses on investigating the characteristics that make it easier for people to accept and distribute innovative IT-based processes or products. To provide elastic services and resources to the maximum number of users, Internet service providers developed cloud computing. Cloud computing is a subscription paradigm in which users do not buy resources permanently but purchase them through blockchain-driven payment schemes (credit cards). A flexible, on-demand, and dynamically scalable computing infrastructure is offered by cloud providers to their clients for a subscription fee. This research article provides an introduction to cloud computing and its integration with the IoT concept, its impact on crowds and organizations, the provision of various services, and the analysis and selection of appropriate features using a probability distribution function to enhance cloud-based IoT capabilities. In ambiguous and complex situations, decision makers combine quantitative techniques with traditional approaches to select the appropriate option among a group of features. A probability distribution function is used to evaluate the features that will enhance the capabilities of cloud-based IoT applications.
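
    As a rough illustration of using a probability distribution function to evaluate features, the sketch below scores hypothetical cloud-IoT features by their expected utility under an assumed probability distribution over usage scenarios. Every feature name, utility value, and probability is invented for illustration and is not taken from the article.

        # Illustrative feature evaluation: rank candidate cloud-IoT features by expected
        # utility under an assumed probability distribution over usage scenarios.
        # All features, utilities, and probabilities below are hypothetical.

        scenarios = {"low_load": 0.5, "peak_load": 0.3, "intermittent_connectivity": 0.2}

        # Utility of each candidate feature in each scenario (0..1, assumed values).
        feature_utility = {
            "auto_scaling":      {"low_load": 0.2, "peak_load": 0.9, "intermittent_connectivity": 0.3},
            "edge_caching":      {"low_load": 0.4, "peak_load": 0.5, "intermittent_connectivity": 0.9},
            "batch_data_upload": {"low_load": 0.6, "peak_load": 0.3, "intermittent_connectivity": 0.7},
        }

        def expected_utility(feature):
            """Weight each scenario's utility by its probability and sum."""
            return sum(p * feature_utility[feature][s] for s, p in scenarios.items())

        ranked = sorted(feature_utility, key=expected_utility, reverse=True)
        for f in ranked:
            print(f, round(expected_utility(f), 3))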

    Exploring foundations for using simulations in IS research

    Simulation has been adopted in many disciplines as a means for understanding the behavior of a system by imitating it through an artificial object that exhibits nearly identical behavior. Although simulation approaches have been widely adopted for theory building in disciplines such as engineering, computer science, management, and the social sciences, their potential in the IS field is often overlooked. The aim of this paper is to understand how different simulation approaches are used in IS research, thereby providing insights and methodological recommendations for future studies. A literature review of simulation studies published in top-tier IS journals leads to the definition of three classes of simulations, namely the self-organizing, the elementary, and the situated. A set of stylized facts is identified for characterizing the ways in which the premise, the inference, and the contribution are presented in IS simulation studies. As a result, this study provides guidance to future researchers in designing simulation studies and presenting their findings.

    Data Mining Algorithms for Internet Data: from Transport to Application Layer

    Nowadays we live in a data-driven world. Advances in data generation, collection, and storage technology have enabled organizations to gather data sets of massive size. Data mining is a discipline that blends traditional data analysis methods with sophisticated algorithms to handle the challenges posed by these new types of data sets. The Internet is a complex and dynamic system in which new protocols and applications arise at a constant pace. All these characteristics make the Internet a valuable and challenging data source and application domain for research activity, both at the transport layer, analyzing network traffic flows, and up at the application layer, focusing on the ever-growing next-generation web services: blogs, micro-blogs, online social networks, photo-sharing services, and many other applications (e.g., Twitter, Facebook, Flickr). In this thesis we focus on the study, design, and development of novel algorithms and frameworks to support large-scale data mining activities over huge and heterogeneous data volumes, with a particular focus on Internet data as the data source, targeting network traffic classification, online social network analysis, recommendation systems, cloud services, and Big Data.

    Segurança e privacidade em terminologia de rede

    Security and Privacy are now at the forefront of modern concerns and drive a significant part of the debate on the digital society. One particular aspect with significant bearing on both topics is the naming of resources in the network, because it directly impacts how networks work, but also affects how security mechanisms are implemented and what the privacy implications of metadata disclosure are. This issue is further exacerbated by interoperability mechanisms that make this information increasingly available beyond its intended scope. This work focuses on the implications of naming for security and privacy in the namespaces used in network protocols, in particular on the implementation of solutions that provide additional security through naming policies or that increase privacy. To achieve this, different techniques are used either to embed security information in existing namespaces or to minimise privacy exposure. The former allows bootstrapping secure transport protocols on top of insecure discovery protocols, while the latter introduces privacy policies as part of name assignment and resolution. The main vehicles for implementing these solutions are general-purpose protocols and services; however, there is a strong parallel with ongoing research topics that leverage name resolution systems for interoperability, such as the Internet of Things (IoT) and Information-Centric Networks (ICN), where these approaches are also applicable. Programa Doutoral em Informática.
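
    One way to read "embedding security information in existing namespaces" is to carry a fingerprint of a server's certificate inside the name announced over an insecure discovery protocol, so that the client can verify the peer when it later opens a secure channel. The sketch below is a generic, hypothetical illustration of that idea (the label format, fingerprint length, and service names are invented); it is not the specific mechanism defined in the thesis.

        # Illustrative name-based bootstrap: embed a certificate fingerprint in a service
        # name announced over an insecure discovery protocol, then check the fingerprint
        # against the certificate presented on connection. Names and formats are
        # hypothetical; the thesis defines its own naming policies.
        import base64
        import hashlib

        def fingerprint(cert_der: bytes, bits: int = 80) -> str:
            """Truncated SHA-256 of the certificate, base32-encoded for use as a name label."""
            digest = hashlib.sha256(cert_der).digest()[: bits // 8]
            return base64.b32encode(digest).decode().rstrip("=").lower()

        def announce_name(service: str, cert_der: bytes) -> str:
            """Name the instance so the fingerprint travels with the (insecure) announcement."""
            return f"{fingerprint(cert_der)}.{service}"

        def verify_peer(announced_name: str, presented_cert_der: bytes) -> bool:
            """Accept the connection only if the presented certificate matches the name."""
            expected = announced_name.split(".", 1)[0]
            return expected == fingerprint(presented_cert_der)

        # Toy demonstration with stand-in certificate bytes.
        cert = b"stand-in certificate bytes"
        name = announce_name("_printer._tcp.local", cert)
        print(name, verify_peer(name, cert), verify_peer(name, b"attacker certificate"))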

    Front-Line Physicians' Satisfaction with Information Systems in Hospitals

    Day-to-day operations management in hospital units is difficult due to continuously varying situations, the several actors involved, and the vast number of information systems in use. The aim of this study was to describe front-line physicians' satisfaction with the existing information systems needed to support day-to-day operations management in hospitals. A cross-sectional survey was used, and data selected by stratified random sampling were collected in nine hospitals. Data were analyzed with descriptive and inferential statistical methods. The response rate was 65% (n = 111). The physicians reported that information systems support their decision making to some extent, but they do not improve access to information, nor are they tailored for physicians. The respondents also reported that they need to use several information systems to support decision making and that they would prefer one information system to access important information. Improved information access would better support physicians' decision making and has the potential to improve the quality of decisions and speed up the decision-making process. Peer reviewed.

    LTE Optimization and Resource Management in Wireless Heterogeneous Networks

    Mobile communication technology is evolving at a great pace. The development of the Long Term Evolution (LTE) mobile system by 3GPP is one of the milestones in this direction. This work highlights a few areas in the LTE radio access network where the proposed innovative mechanisms can substantially improve overall LTE system performance. In order to further extend the capacity of LTE networks, an integration with non-3GPP networks (e.g., WLAN, WiMAX) is also proposed in this work. Moreover, it is discussed how bandwidth resources should be managed in such heterogeneous networks. The work proposes a comprehensive system architecture as an overlay of the 3GPP-defined SAE architecture, effective resource management mechanisms, and a Linear Programming based analytical solution for the optimal network resource allocation problem. In addition, alternative, computationally efficient heuristic algorithms have been designed to achieve near-optimal performance.
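
    A minimal sketch of a Linear Programming formulation in the spirit described above: allocate bandwidth from heterogeneous access networks (LTE, WLAN, WiMAX) to traffic classes so that weighted throughput is maximized under capacity and demand limits. The capacities, demands, and weights below are hypothetical, and the model is far simpler than the resource allocation problem actually solved in the thesis.

        # Illustrative LP: allocate bandwidth (Mbps) from heterogeneous networks to
        # traffic classes, maximizing weighted throughput. Numbers are hypothetical.
        import numpy as np
        from scipy.optimize import linprog

        networks = {"LTE": 100.0, "WLAN": 50.0, "WiMAX": 30.0}                    # capacities in Mbps
        classes = {"video": (90.0, 1.0), "web": (60.0, 0.6), "iot": (20.0, 0.3)}  # (demand Mbps, weight)

        net_names, cls_names = list(networks), list(classes)
        n, m = len(net_names), len(cls_names)

        # Decision variable x[i*m + j]: bandwidth that network i gives to class j.
        # linprog minimizes, so negate the weights to maximize weighted throughput.
        c = -np.array([classes[cj][1] for _ in net_names for cj in cls_names])

        A_ub, b_ub = [], []
        for i, ni in enumerate(net_names):      # each network cannot exceed its capacity
            row = np.zeros(n * m)
            row[i * m:(i + 1) * m] = 1.0
            A_ub.append(row)
            b_ub.append(networks[ni])
        for j, cj in enumerate(cls_names):      # each class cannot receive more than its demand
            row = np.zeros(n * m)
            row[j::m] = 1.0
            A_ub.append(row)
            b_ub.append(classes[cj][0])

        res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub, bounds=(0, None), method="highs")
        alloc = res.x.reshape(n, m)
        for i, ni in enumerate(net_names):
            print(ni, {cj: round(alloc[i, j], 1) for j, cj in enumerate(cls_names)})

    In the same spirit as the thesis, a simple greedy heuristic over the same data would serve as a computationally cheaper, near-optimal alternative to the LP.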

    Modelling the information needs of users in the electronic information environment.

    Abstract available in PDF file

    The segmentation issue: general stopping criteria and specific design considerations for practical application of evolutionary algorithms

    Segmentation is a tool for the representation and approximation of data according to a set of appropriate models. These procedures have applications in many different domains, such as time series analysis, polygonal approximation, and Air Traffic Control. Different heuristic and metaheuristic proposals have been introduced to deal with this issue. This thesis provides a novel multiobjective evolutionary method, analyzing the general tools required to apply evolutionary algorithms to real problems and the specific modifications required over the different steps of general proposals to adapt them to the segmentation domain. An introduction to the domain is presented by means of the design of a specific heuristic for the segmentation of Air Traffic Control (ATC) data. This domain has a series of characteristics that make it difficult to address with traditional techniques: noisy data and a large number of measurements. The proposal works in two phases, using a pre-segmentation that introduces available domain information and then applying a standard technique to the results of this initial step. Its results on the presented domain, tested with a set of eight representative trajectories, show competitive advantages compared to general approaches, which over-segment noisy data and, in some cases, exhibit poor scalability. This heuristic proposal illustrates the costly process of adapting available approaches and designing specific ones, along with the multi-objective nature of the problem, which requires the use of quality indicators for a proper comparison process. Applying evolutionary algorithms to segmentation provides several advantages, but the problem dependence of heuristics makes it costly to adapt them to new domains, as shown by the heuristic designed for ATC. However, the practical application of these algorithms requires the study of a topic that has received little research effort from the community: stopping criteria. An evolutionary approach should contain a dynamic procedure that can determine when stagnation has taken place and stop the algorithm accordingly (as opposed to the a-priori cost budgets, in either function evaluations or generations, that are usually applied for test datasets). Stopping criteria are addressed for both the single- and multi-objective cases in this thesis. Single-objective stopping criteria are approached by giving the stopping criterion an active role: it actively increases diversity in the variable space while tracking the updates in the fitness function. Thus, the algorithm reuses the information obtained for the stopping decision and feeds it to a stopping prevention mechanism in order to avoid problematic situations such as early convergence. The presented algorithm has been tested on a set of 27 functions with different characteristics regarding dimensionality, search space, and local minima. The results show that the introduced mechanisms enhance the robustness of the results, due to the improved exploration and the prevention of early convergence. Multi-objective stopping criteria are addressed with the use of progress indicators (comparison measures of the quality of the evolution results at different generations) and an associated data gathering tool. The final proposal uses three different progress indicators (hypervolume, epsilon, and Mutual Dominance Rate) and considers them jointly according to a decision fusion architecture.
The stagnation analysis is based on the least-squares regression parameters of the indicator values, including a normality analysis as well. The online nature of these algorithms is highlighted, avoiding the recomputation of indicator values that was present in other available alternatives, and the simplicity of the final proposal is emphasized in order to reduce the cost of introducing it into existing algorithms. The proposal has been tested with instances of the DTLZ problem family, obtaining satisfactory stops with a standard set of configuration values for the technique. However, there is a lack of quantitative measures to determine the objective quality of a stop and to properly compare it to other alternatives. The multi-objective nature of the segmentation problem is then analyzed in order to propose a multiobjective evolutionary algorithm (MOEA) to deal with it. This nature is examined according to a selection of available approaches, highlighting the difficulties faced in parameter configuration when guiding the processes towards the desired solution values. A multi-objective a-posteriori approach such as the one presented allows the decision maker to choose from the front of possible final solutions the one that suits them best, simplifying this process. The presented approach chooses SPEA2 as its underlying MOEA, analyzing different representation and initialization proposals. The results have been validated against a representative set of heuristic and metaheuristic techniques, using three widely used curves from the polygonal approximation domain (chromosome, leaf, and semicircle), obtaining statistically better results for almost all test cases. This initial MOEA approach had unresolved issues, such as the complexity order of the archiving technique, and it also lacked the specific design considerations needed to adapt it to the application domain. These issues have been addressed through different improvements. First, an alternative representation is proposed, including partial fitness information and associated fitness-aware transformation operators (transformation operators that compute children's fitness values from the changes made and the parents' partial values). A novel archiving procedure is introduced according to the bi-objective nature of the domain, one of the objectives being discrete. This leads to a relaxed Pareto dominance check, named epsilon glitches. Multi-objective local search versions of the traditional algorithms are proposed and tested for the initialization of the algorithm, along with the stopping criterion proposal, which has also been adapted to the problem characteristics. The archive size in this case is large enough to contain all the different individuals in the optimal front, so that quality assessment is simplified and a simpler mechanism can be introduced to detect stagnation, based on the improvements in each of the possible individuals. The final evolutionary proposal is scalable, requires few configuration parameters, and introduces an efficient dynamic stopping criterion. Its results have been tested against the original technique and the set of heuristic and metaheuristic techniques previously used, including the three original curves and also more complex versions of them (obtained with a generation mechanism introduced according to these original shapes).
Even though the stopping results are very satisfactory, the results obtained are slightly worse than those of the original MOEA for the three simpler problem instances with the established configuration parameters (as was expected, given the computational effort of the a-priori number of generations and population size, which were based on the analysis of the original algorithm's results). However, the comparison against the alternative techniques still shows the same statistically better results, and the reduced computational cost allows the approach to be applied to a wider set of problems.
Programa Oficial de Doctorado en Ciencia y Tecnología Informática. Committee: Pedro Isasi Viñuela (chair), Rafael Martínez Tomás (secretary), Javier Segovia Pérez (member).
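
    The multi-objective stopping criterion summarized in the abstract above (a least-squares regression over progress-indicator histories, fused across hypervolume, epsilon, and Mutual Dominance Rate) can be sketched as follows. The window size, threshold, simple "all indicators flat" fusion rule, and indicator histories are hypothetical stand-ins for the thesis's decision fusion architecture and normality analysis.

        # Illustrative multi-objective stopping check: least-squares slope of recent
        # progress-indicator values per generation; declare stagnation when every
        # indicator has flattened. Window, threshold, and data are hypothetical.
        import numpy as np

        def slope(values):
            """Least-squares regression slope of indicator values against generation index."""
            x = np.arange(len(values))
            return np.polyfit(x, np.asarray(values, dtype=float), 1)[0]

        def stagnated(histories, window=20, tol=1e-3):
            """Fuse per-indicator decisions: stop only if all indicators have flattened."""
            recent = {name: h[-window:] for name, h in histories.items() if len(h) >= window}
            if len(recent) < len(histories):
                return False              # not enough generations observed yet
            return all(abs(slope(h)) < tol for h in recent.values())

        # Hypothetical indicator histories collected while the MOEA runs.
        gen = np.arange(60)
        histories = {
            "hypervolume":      list(0.9 - 0.5 * np.exp(-gen / 10.0)),
            "epsilon":          list(0.05 + 0.4 * np.exp(-gen / 8.0)),
            "mutual_dominance": list(0.5 + 0.1 * np.exp(-gen / 12.0)),
        }
        print("stop now?", stagnated(histories))

    In an actual run, the histories would be appended to at each generation and the check evaluated online, so no indicator value is ever recomputed.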