    Reading the news through its structure: new hybrid connectivity based approaches

    This thesis presents a solution to the problem of identifying the structure of news published by online newspapers. This problem requires new approaches and algorithms capable of dealing with the massive, and still growing, number of online publications. The high degree of interconnection between news documents makes this an interesting and hard problem to solve. The structure of the news is identified both by descriptive methods that expose the dimensionality of the relations between different news items and by clustering the news into topic groups. To this end, the system was studied as an integrated whole from different perspectives and with different approaches. After a preparatory data collection phase, in which several online newspapers from different parts of the globe were collected, two newspapers were chosen in particular: the Portuguese daily newspaper Público and the British newspaper The Guardian. Newspapers in different languages were chosen in order to find analysis strategies that do not depend on prior knowledge of these systems. In the first case, information theory (namely variation of information) combined with adaptive networks was shown to identify topic clusters in the news published by Público. In the second case, the structure of the news published by The Guardian is revealed through the construction of time series of news, which are then clustered by a k-means process. Following this, an unsupervised algorithm was developed that filters out irrelevant news published online by taking into account the connectivity of the news labels entered by the journalists. This novel hybrid technique is based on Q-analysis for the construction of the filtered network, followed by a clustering technique to identify the topical clusters. At present this work uses a modularity optimisation clustering technique, but this step is general enough that other hybrid approaches can be used without loss of generality.
A novel second-order swarm intelligence algorithm based on Ant Colony Systems was developed for the travelling salesman problem, and it is consistently better than the traditional benchmarks. This algorithm is used to construct Hamiltonian paths over the published news, using the eccentricity of the different documents as a measure of distance. This approach allows easy navigation between published stories that depends on the connectivity of the underlying structure. The results presented in this work show the importance of treating topic detection in large corpora as a multitude of relations and connectivities that are not static. They also influence the way multi-dimensional ensembles are looked at, by showing that including the high-dimensional connectivities gives better results when solving a particular problem, as was the case for the clustering of news published online.
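
    The abstract above names variation of information as the information-theoretic quantity used to compare groupings of news. As a minimal sketch only, and not the thesis's implementation, the standard definition VI(A, B) = H(A|B) + H(B|A) can be computed for two labelings of the same items as follows. The function name and the toy labelings are illustrative assumptions.

```python
from collections import Counter
from math import log


def variation_of_information(labels_a, labels_b):
    """Return VI(A, B) = H(A|B) + H(B|A) in nats for two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    n = len(labels_a)
    if n == 0:
        return 0.0
    count_a = Counter(labels_a)                # cluster sizes in labeling A
    count_b = Counter(labels_b)                # cluster sizes in labeling B
    joint = Counter(zip(labels_a, labels_b))   # sizes of cluster intersections
    vi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        # Each term contributes to H(B|A) and H(A|B) respectively.
        vi -= p_ab * (log(p_ab / p_a) + log(p_ab / p_b))
    return vi


# Hypothetical toy labelings of four news items.
same_up_to_names = variation_of_information([0, 0, 1, 1], ["x", "x", "y", "y"])
independent = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])
```

    Two labelings that differ only in cluster names give a distance of zero, while statistically independent labelings give H(A) + H(B), which is why the measure is convenient for comparing candidate topic clusterings.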

    Development of emotional intelligence in the education of English for the 8th grade students at the San José La Salle educative unit during the period of 2012-2013

    After a complete bibliographic investigation, no similar work was found, so the present investigation is original. This information can be checked at any time, anywhere.

    How Companies Are Seizing the Dialogic Opportunities Provided by Social Media to Communicate with their External Audiences

    This thesis assesses, in an integrated way, the level of dialogic communication developed by IBEX 35 companies and a selection of 20 Fortune 500 firms with their external audiences on blogs, Facebook and Twitter. With this aim, a dialogic conceptual tool based on Kent and Taylor's (1998) framework has been created and applied to the entire sample. The tool consists of a questionnaire which analyzes 61 variables and 39 sub-variables on three dimensions: presence, content and interactivity. Inter-method triangulation, i.e., virtual ethnography, critical discourse analysis (CDA) and interviews with experts, has been applied to carry out the research. The results show that the dialogic level of social media use is higher in the IBEX 35 companies than in the Fortune 500 firms. However, in both the IBEX 35 and the Fortune 500, the percentage of companies with low levels of dialogic communication exceeds the percentage of companies with high levels. Consistent with previous research, this study concludes that IBEX 35 and Fortune 500 companies are still not fully utilizing social media's dialogic potential.

    Uticaj klasifikacije teksta na primene u obradi prirodnih jezika

    The main goal of this dissertation is to put different text classification tasks in the same frame, by mapping the input data into the common vector space of linguistic attributes. Subsequently, several classification problems of great importance for natural language processing are solved by applying the appropriate classification algorithms. The dissertation deals with the problem of validation of bilingual translation pairs, so that the final goal is to construct a classifier which provides a substitute for human evaluation and which decides whether the pair is a proper translation between the appropriate languages by means of applying a variety of linguistic information and methods. In dictionaries it is useful to have a sentence that demonstrates use for a particular dictionary entry. This task is called the classification of good dictionary examples. In this thesis, a method is developed which automatically estimates whether an example is good or bad for a specific dictionary entry. Two cases of short message classification are also discussed in this dissertation. In the first case, classes are the authors of the messages, and the task is to assign each message to its author from that fixed set. This task is called authorship identification. The other observed classification of short messages is called opinion mining, or sentiment analysis. Starting from the assumption that a short message carries a positive or negative attitude about a thing, or is purely informative, classes can be: positive, negative and neutral. These tasks are of great importance in the field of natural language processing and the proposed solutions are language-independent, based on machine learning methods: support vector machines, decision trees and gradient boosting. 
For all of these tasks, the effectiveness of the proposed methods is demonstrated for the Serbian language.

    On Clustering and Evaluation of Narrow Domain Short-Test Corpora

    This doctoral thesis investigates the problem of clustering a special kind of document collection: narrow-domain short texts. To carry out this task, several corpora and clustering methods were analysed. Moreover, corpus evaluation measures, term selection techniques and cluster validity measures were introduced in order to study the following problems:
- Determining the relative difficulty of clustering a corpus and studying some of its features, such as text length, domain broadness, stylometry, class imbalance and structure.
- Contributing to the state of the art in the clustering of corpora composed of narrow-domain short texts.
The research carried out is partially focused on "short-text clustering". This topic is considered relevant given the current and future tendency of people to use a "reduced language" made up of short texts (for example, blogs, snippets, news and text messages such as email and chat). In addition, the domain broadness of corpora is studied. In this sense, a corpus can be considered narrow or broad if the degree of vocabulary overlap is high or low, respectively. In the categorization task, it is quite complex to deal with narrow-domain corpora such as scientific articles, technical reports, patents, etc. The main aim of this work is to study possible strategies for dealing with the following two problems: a) the low frequencies of vocabulary terms in short texts, and b) the high vocabulary overlap associated with narrow domains. Although each of these problems is a sufficiently hard challenge on its own, when dealing with narrow-domain short texts the complexity of the problem increases…
Pinto Avendaño, DE. (2008). On Clustering and Evaluation of Narrow Domain Short-Test Corpora [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/2641
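
    The abstract above characterises a corpus as narrow or broad according to whether its degree of vocabulary overlap is high or low. As a rough sketch of that idea, the snippet below scores a corpus by the mean pairwise Jaccard overlap of its documents' vocabularies. The Jaccard coefficient, the whitespace tokenisation and the function names are assumptions made for illustration and are not claimed to be the measures introduced in the thesis.

```python
from itertools import combinations


def vocabulary(text):
    """Lowercased whitespace tokens of a document, as a set (a deliberately crude tokeniser)."""
    return set(text.lower().split())


def mean_vocabulary_overlap(documents):
    """Mean pairwise Jaccard overlap between document vocabularies, in [0, 1]."""
    vocabs = [vocabulary(doc) for doc in documents]
    pairs = list(combinations(vocabs, 2))
    if not pairs:
        raise ValueError("need at least two documents")
    total = 0.0
    for a, b in pairs:
        union = a | b
        # Two empty documents share nothing measurable; count them as zero overlap.
        total += len(a & b) / len(union) if union else 0.0
    return total / len(pairs)


# Hypothetical toy corpora: the first stays within one topic, the second does not.
narrow = ["protein folding model", "protein folding energy"]
broad = ["protein folding model", "football match result"]
narrow_score = mean_vocabulary_overlap(narrow)
broad_score = mean_vocabulary_overlap(broad)
```

    A higher score indicates documents that reuse the same terms, which is the situation the abstract associates with narrow domains. Very short texts yield small vocabularies, so in practice the score is noisy for exactly the corpora of interest.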

    Congress UPV Proceedings of the 21ST International Conference on Science and Technology Indicators

    This is the book of proceedings of the 21st Science and Technology Indicators Conference, which took place in València (Spain) from 14 to 16 September 2016. The conference theme for this year, ‘Peripheries, frontiers and beyond’, aimed to study the development and use of Science, Technology and Innovation indicators in spaces that have not been the focus of current indicator development, for example the Global South or the Social Sciences and Humanities. The exploration of the margins and beyond proposed by the theme has brought to the STI Conference an interesting array of new contributors from a variety of fields and geographies. This year’s conference had a record 382 registered participants from 40 different countries: 23 European, 9 American, 4 Asia-Pacific and 4 from Africa and the Near East. About 26% of participants came from outside Europe. There were also many participants (17%) from organisations outside academia, including governments (8%), businesses (5%), foundations (2%) and international organisations (2%). This is particularly important in a field that is practice-oriented. The chapters of the proceedings attest to the breadth of issues discussed: infrastructure, benchmarking and use of innovation indicators, societal impact and mission-oriented research, mobility and careers, social sciences and the humanities, participation and culture, gender, and altmetrics, among others. We hope that the diversity of this conference has fostered productive dialogues and synergistic ideas and made a contribution, small as it may be, to the development and use of indicators that, being more inclusive, will foster a more inclusive and fair world.

    Spanish transitions: representation of transgender characters in Spanish film

    This dissertation examines the different representations of transgender characters in Spanish cinema since their appearance in the 1970s up until today. The history of Spain in the last forty years, with its radical political changes, yields extremely fertile transgender case studies, especially in Spanish films, which become sites of struggle and negotiation of meaning, definition and understanding of gender, sex and sexuality. The cases in which transgender characters have been protagonists of Spanish movies –and thus explored and portrayed in more depth- give us first-hand information on the different ways of thinking about gender, and the different ways of thinking and picturing a topic that was previously hidden from the public arena. Furthermore, the systems of codes and analogies that a culture uses and reproduces in its media are a perfect site to further investigate the sets of beliefs that a society holds true or privileges over others as defining traits. In order to do such investigation, this dissertation develops three archetypes of representation that classify and make sense of all the movies and their representations, highlighting the recurring tropes, narrative tools or privileged ideological discourses embedded in them. By organizing the titles in archetypes, but also paying attention to their temporality and social changes around transgender issues, this dissertation investigates the codification and representation of transgenderism in Spanish film as a site for discursive formation of the transgender identity, but also as a space of social struggle for the meaning of sex, gender and sexuality. 
The first two archetypes (the Criminal and the Patient) correspond to representations that rely heavily on the legal or medical situation of the character, respectively, whereas the third one (the Empowered) shows how representation can transcend medical and legal definitions and give the character the autonomy and voice that the previous two somehow negate. Furthermore, the three archetypes overlap in some of the movies, negating the possibility of fixed and monolithic categories and highlighting the limits of these archetypes, which are used as a tool to understand the different discourses rather than to classify and label each of the characters. This dissertation, then, explores the different representations or archetypes that are used to portray transgender people through the case studies found in contemporary Spanish cinema, with the goal of unpacking the continuities and ruptures of sex and gender politics in the last 40 years of political change in Spain.

    Historische Sozialforschung: Auswahlbibliographie 1975-2000

    Remote Sensing Applications to Support Locust Management and Research: Evaluating the Potential of Earth Observation for Locust Outbreaks in Different Regions

    This dissertation focuses on satellite remote sensing applications for locust management and additional contributions to locust research. Specifically, the remote sensing-based characterization and interpretation of land surface cover and its dynamics are addressed, with a special emphasis on the requirements of different locust species. First, the aim of this dissertation is to provide a holistic overview of the existing applications of satellite data for different locust species and thus to present current and new opportunities. Furthermore, remote sensing and geospatial datasets are used in a model to categorize areas with ideal and less than ideal conditions for locust outbreaks. The benefit of up-to-date remote sensing data for preventive locust management is demonstrated using a time-series-based Sentinel-2 land cover classification. Due to the diversity of the numerous locust species and their spatial distribution in different geographical locations, this research focuses mainly on two locust species, the Italian locust (Calliptamus italicus) and the Moroccan locust (Dociostaurus maroccanus), as well as on selected study areas within their extensive habitats. Both species have caused numerous outbreaks in the past, with damage in Europe, the Caucasus, Central Asia and North Africa. For both species, there is only a limited number of publications exploiting the capabilities of remote sensing methods. Therefore, this dissertation aims to explore the potential of Earth observation datasets to support preventive locust management and research for both species.

    Competitive Risaralda, generating research alliance for development

    This book is titled “Risaralda competitiva, generando alianzas en investigación para el desarrollo” and is the result of the 5th meeting of researchers of the department of Risaralda, held in November 2020. At the event, the latest research carried out at the department's various educational institutions, which form part of the Mesa de Investigaciones de Risaralda, was presented. The exercise is of great interest and yields research results in different areas, such as agricultural sciences, social sciences, health sciences, and technology and information sciences.