14 research outputs found

    The segmentation issue: general stopping criteria and specific design considerations for practical application of evolutionary algorithms

    Segmentation is a tool for representing and approximating data according to a set of appropriate models. These procedures have applications in many different domains, such as time series analysis, polygonal approximation, and Air Traffic Control. Different heuristic and metaheuristic proposals have been introduced to deal with this issue. This thesis provides a novel multiobjective evolutionary method, analyzing the general tools required for the application of evolutionary algorithms to real problems and the specific modifications required over the different steps of general proposals to adapt them to the segmentation domain. The domain is introduced through the design of a specific heuristic for the segmentation of Air Traffic Control (ATC) data. This domain has a series of characteristics which make it difficult to handle with traditional techniques: noisy data and a large number of measurements. The proposal works in two phases, using a pre-segmentation which introduces available domain information and then applying a standard technique over the results of this initial step. Its results in the presented domain, tested with a set of eight different representative trajectories, show competitive advantages compared to general approaches, which oversegment noisy data and, in some cases, exhibit poor scalability. This heuristic proposal illustrates the costly process of adapting available approaches and designing specific ones, along with the multi-objective nature of the problem, which requires the use of quality indicators for a proper comparison process. Applying evolutionary algorithms to segmentation provides several advantages, particularly given that the problem dependence of heuristics makes them costly to adapt to new domains, as illustrated by the heuristic designed for ATC. 
However, the practical application of these algorithms requires the study of a topic which has received little research effort from the community: stopping criteria. An evolutionary approach should contain a dynamic procedure which can determine when stagnation has taken place and stop the algorithm accordingly (as opposed to a-priori cost budgets, either in function evaluations or generations, which are usually applied for test datasets). This thesis addresses stopping criteria for both the single-objective and the multi-objective case. Single-objective stopping criteria are approached by giving the stopping criterion an active role: it increases the diversity in the variable space while tracking the updates in the fitness function. Thus, the algorithm reuses the information obtained for the stopping decision and feeds it to a stopping prevention mechanism in order to prevent problematic situations such as early convergence. The presented algorithm has been tested on a set of 27 different functions, with different characteristics regarding their dimensionality, search space, and local minima. The results show that the introduced mechanisms enhance the robustness of the results, due to the improved exploration and the prevention of early convergence. Multi-objective stopping criteria are addressed with the use of progress indicators (comparison measures of the quality of the evolution results at different generations) and an associated data gathering tool. The final proposal uses three different progress indicators (hypervolume, epsilon and Mutual Dominance Rate) and considers them jointly according to a decision fusion architecture. The stagnation analysis is based on the least squares regression parameters of the indicator values, and also includes a normality analysis. 
The online nature of these algorithms is highlighted, avoiding the recomputation of indicator values that was present in other available alternatives, and also focusing on the simplicity of the final proposal, in order to reduce the cost of introducing it into available algorithms. The proposal has been tested with instances of the DTLZ test problem family, obtaining satisfactory stops with a standard set of configuration values for the technique. However, there is a lack of quantitative measures to determine the objective quality of a stop and to properly compare its value to other alternatives. The multi-objective nature of the segmentation problem is analyzed to propose a multiobjective evolutionary algorithm (MOEA) to deal with it. This nature is analyzed according to a selection of available approaches, highlighting the difficulties which had to be faced in the parameter configuration in order to guide the processes to the desired solution values. A multi-objective a-posteriori approach such as the one presented allows the decision maker to choose, from the front of possible final solutions, the one that suits them best, simplifying this process. The presented approach chooses SPEA2 as its underlying MOEA, analyzing different representation and initialization proposals. The results have been validated against a representative set of heuristic and metaheuristic techniques, using three widely used curves from the polygonal approximation domain (chromosome, leaf and semicircle), obtaining statistically better results for almost all the different test cases. This initial MOEA approach had unresolved issues, such as the complexity order of the archiving technique, and also lacked the specific design considerations needed to adapt it to the application domain. These issues have been addressed through different improvements. 
First of all, an alternative representation is proposed, including partial fitness information and associated fitness-aware transformation operators (transformation operators which compute the children's fitness values according to their changes and the parents' partial values). A novel archiving procedure is introduced according to the bi-objective nature of the domain, in which one of the objectives is discrete. This leads to a relaxed Pareto dominance check, named epsilon glitches. Multi-objective local search versions of the traditional algorithms are proposed and tested for the initialization of the algorithm, along with the stopping criterion proposal, which has also been adapted to the problem characteristics. The archive size in this case is large enough to contain all the different individuals in the optimal front, so that quality assessment is simplified and a simpler mechanism can be introduced to detect stagnation, according to the improvements in each of the possible individuals. The final evolutionary proposal is scalable, requires few configuration parameters and introduces an efficient dynamic stopping criterion. Its results have been tested against the original technique and the set of heuristic and metaheuristic techniques previously used, including the three original curves and also more complex versions of them (obtained with an introduced generation mechanism according to these original shapes). Even though the stopping results are very satisfactory, the obtained results are slightly worse than those of the original MOEA for the three simpler problem instances with the established configuration parameters (as was expected, due to the computational effort of the a-priori established number of generations and population size, based on the analysis of the algorithm's results). 
However, the comparison versus the alternative techniques still shows the same statistically better results, and its reduced computational cost allows its application to a wider set of problems.
Official Doctoral Program in Computer Science and Technology (Programa Oficial de Doctorado en Ciencia y Tecnología Informática). President: Pedro Isasi Viñuela. Secretary: Rafael Martínez Tomás. Member: Javier Segovia Pére
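The multi-objective stopping criterion described in this abstract (least squares regression over progress indicator values, combined through decision fusion) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the `window` and `slope_tol` parameters, and the use of a simple majority vote as the fusion rule are all assumptions made for illustration, and the normality analysis mentioned in the abstract is omitted.

```python
def least_squares_slope(values):
    """Slope of the least squares line fitted to values indexed 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx


def has_stagnated(history, window=10, slope_tol=1e-3):
    """True when the indicator trend over the last `window` generations is flat.

    `window` and `slope_tol` are hypothetical configuration values.
    """
    if len(history) < window:
        return False  # not enough generations observed yet
    return abs(least_squares_slope(history[-window:])) < slope_tol


def should_stop(histories, window=10, slope_tol=1e-3):
    """Fuse per-indicator decisions with a majority vote (an assumed rule)."""
    votes = sum(has_stagnated(h, window, slope_tol) for h in histories.values())
    return votes * 2 > len(histories)
```

In an evolutionary loop, each generation would append one new value per indicator to its history, so earlier values never have to be recomputed, and `should_stop` would be checked before starting the next generation.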

    Eight Biennial Report : April 2005 – March 2007

    No full text

    Psr1p interacts with SUN/sad1p and EB1/mal3p to establish the bipolar spindle

    Regular Abstracts - Sunday Poster Presentations: no. 382. During mitosis, interpolar microtubules from two spindle pole bodies (SPBs) interdigitate to create an antiparallel microtubule array that accommodates numerous regulatory proteins. Among these proteins, the kinesin-5 cut7p/Eg5 is the key player responsible for sliding apart antiparallel microtubules and thus helps in establishing the bipolar spindle. At the onset of mitosis, the two SPBs are adjacent to one another, with most microtubules running nearly parallel toward the nuclear envelope, creating an unfavorable microtubule configuration for kinesin-5. Therefore, how the cell first organizes the antiparallel microtubule array at mitotic onset remains enigmatic. Here, we show that a novel protein, psr1p, localizes to the SPB and plays a key role in organizing the antiparallel microtubule array. The absence of psr1+ leads to a transient monopolar spindle and massive chromosome loss. Further functional characterization demonstrates that psr1p is recruited to the SPB through interaction with the conserved SUN protein sad1p and that psr1p physically interacts with the conserved microtubule plus tip protein mal3p/EB1. These results suggest a model in which psr1p serves as a linking protein between sad1p/SUN and mal3p/EB1, allowing microtubule plus ends to be coupled to the SPBs for organization of an antiparallel microtubule array. Thus, we conclude that psr1p is involved in the initial organization of the antiparallel microtubule array at mitotic onset through interaction with SUN/sad1p and EB1/mal3p, thereby establishing the bipolar spindle.

    Removal of antagonistic spindle forces can rescue metaphase spindle length and reduce chromosome segregation defects

    Regular Abstracts - Tuesday Poster Presentations: no. 1925. Metaphase is the phase of mitosis in which chromosomes are attached and oriented on the bipolar spindle for subsequent segregation at anaphase. In diverse cell types, the metaphase spindle is maintained at a relatively constant length. Metaphase spindle length is proposed to be regulated by a balance of pushing and pulling forces generated by distinct sets of spindle microtubules and their interactions with motors and microtubule-associated proteins (MAPs). Spindle length appears important for chromosome segregation fidelity, as cells with shorter or longer than normal metaphase spindles, generated through deletion or inhibition of individual mitotic motors or MAPs, showed chromosome segregation defects. To test the force balance model of spindle length control and its effect on chromosome segregation, we applied fast microfluidic temperature control with live-cell imaging to monitor the effect of switching off different combinations of antagonistic forces in the fission yeast metaphase spindle. We show that the spindle midzone proteins kinesin-5 cut7p and microtubule bundler ase1p contribute to outward pushing forces, and the spindle kinetochore proteins kinesin-8 klp5/6p and dam1p contribute to inward pulling forces. Removing these proteins individually led to aberrant metaphase spindle length and chromosome segregation defects. Removing these proteins in antagonistic combination rescued the defective spindle length and, in some combinations, also partially rescued chromosome segregation defects. Our results stress the importance of proper chromosome-to-microtubule attachment over spindle length regulation for proper chromosome segregation.

    Shortest Route at Dynamic Location with Node Combination-Dijkstra Algorithm

    Abstract: Online transportation has become a basic need of the general public, supporting everyday activities such as travelling to work, to school, or to holiday destinations. Transportation services compete to provide the best service so that consumers feel comfortable using them, and every part of the service receives attention, including the search for the shortest route when picking up a customer or delivering to a destination. The Node Combination method can minimize memory usage and, like the Dijkstra algorithm, is reported to be more efficient than A* and Ant Colony in shortest route search, but it cannot store the history of the nodes that have been passed. The node combination algorithm is therefore well suited to finding the shortest distance, but not the shortest route itself. This paper modifies the node combination algorithm to solve the problem of finding the shortest route at a dynamic location obtained from the transport fleet, by displaying the nodes that have the shortest distance, and implements it in a geographic information system in the form of a map to make the system easier to use. Keywords: Shortest Path, Dijkstra Algorithm, Node Combination, Dynamic Location
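The distinction drawn in this abstract between the shortest distance and the shortest route can be made concrete with a short sketch. What follows is the standard Dijkstra algorithm with a predecessor map, not the paper's modified Node Combination algorithm; the function name `shortest_route` and the adjacency-dictionary graph format are assumptions chosen for illustration.

```python
import heapq


def shortest_route(graph, source, target):
    """Return (distance, route) from source to target.

    `graph` maps each node to a dict {neighbor: non-negative edge weight}.
    Returns (inf, []) when target is unreachable.
    """
    distances = {source: 0}
    previous = {}  # predecessor of each node on its best known path
    queue = [(0, source)]
    visited = set()

    while queue:
        dist, node = heapq.heappop(queue)
        if node in visited:
            continue  # stale queue entry
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            candidate = dist + weight
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))

    if target not in distances:
        return float("inf"), []

    # Walk the predecessor map backwards to recover the node sequence.
    route = [target]
    while route[-1] != source:
        route.append(previous[route[-1]])
    route.reverse()
    return distances[target], route
```

Keeping the `previous` map is what turns a distance computation into a route computation: without it the algorithm still yields the correct total cost but cannot report which nodes were traversed.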

    Smoking and Second Hand Smoking in Adolescents with Chronic Kidney Disease: A Report from the Chronic Kidney Disease in Children (CKiD) Cohort Study

    The goal of this study was to determine the prevalence of smoking and second hand smoking [SHS] in adolescents with CKD and their relationship to baseline parameters at enrollment in CKiD, an observational cohort study of 600 children (aged 1-16 yrs) with Schwartz estimated GFR of 30-90 ml/min/1.73 m². 239 adolescents had self-report survey data on smoking and SHS exposure: 21 [9%] subjects had “ever” smoked a cigarette. Among them, 4 were current and 17 were former smokers. Hypertension was more prevalent in those who had “ever” smoked a cigarette (42%) compared to non-smokers (9%), p<0.01. Among 218 non-smokers, 130 (59%) were male, 142 (65%) were Caucasian; 60 (28%) reported SHS exposure compared to 158 (72%) with no exposure. Non-smoker adolescents with SHS exposure were compared to those without SHS exposure. There were no racial, age, or gender differences between the two groups. Baseline creatinine, diastolic hypertension, C reactive protein, lipid profile, GFR and hemoglobin were not statistically different. A significantly higher protein to creatinine ratio (0.90 vs. 0.53, p<0.01) was observed in those exposed to SHS compared to those not exposed. Exposed adolescents were heavier than non-exposed adolescents (85th percentile vs. 55th percentile for BMI, p<0.01). Uncontrolled casual systolic hypertension was twice as prevalent among those exposed to SHS (16%) compared to those not exposed to SHS (7%), though the difference was not statistically significant (p=0.07). Adjusted multivariate regression analysis [OR (95% CI)] showed that increased protein to creatinine ratio [1.34 (1.03, 1.75)] and higher BMI [1.14 (1.02, 1.29)] were independently associated with exposure to SHS among non-smoker adolescents. These results reveal that among adolescents with CKD, cigarette use is low and SHS is highly prevalent. 
The association of smoking with hypertension and of SHS with increased proteinuria suggests a possible role of these factors in CKD progression and cardiovascular outcomes.