    Application of Channel Modeling for Indoor Localization Using TOA and RSS

    Recently, considerable attention has been paid to indoor geolocation using wireless local area network (WLAN) and wireless personal area network (WPAN) devices. As more applications using these technologies emerge in the market, the need for accurate and reliable localization increases. In response to this need, a number of technologies and associated algorithms have been introduced in the literature. These algorithms resolve the location either by using estimated distances between a mobile station (MS) and at least three reference points (via triangulation) or by pattern recognition through radio frequency (RF) fingerprinting. Since RF fingerprinting, which requires on-site measurements, is a time-consuming process, it is desirable to replace this procedure with results obtained from radio channel modeling techniques. Localization algorithms use either the received signal strength (RSS) or the time of arrival (TOA) of the received signal as their localization metric. TOA-based systems are sensitive to the available bandwidth and to the occurrence of undetected direct path (UDP) channel conditions, while RSS-based systems are less sensitive to the bandwidth and more resilient to UDP conditions. Therefore, the comparative performance evaluation of different positioning systems is a multifaceted and challenging problem. This dissertation demonstrates the viability of radio channel modeling techniques to eliminate the costly fingerprinting process in pattern recognition algorithms by introducing novel ray tracing (RT) assisted RSS- and TOA-based algorithms. Two sets of empirical data obtained by radio channel measurements are used to create a baseline for comparative performance evaluation of localization algorithms.
The first database is obtained by WiFi RSS measurements on the first floor of the Atwater Kent laboratory, an academic building on the campus of WPI, and the other by ultra wideband (UWB) channel measurements on the third floor of the same building. Using the results of the measurement campaign, we specifically analyze the comparative behavior of TOA- and RSS-based indoor localization algorithms employing triangulation or pattern recognition with different bandwidths adopted in WLAN and WPAN systems. Finally, we introduce a new RT-assisted hybrid RSS-TOA-based algorithm which employs neural networks. The resulting algorithm demonstrates superior performance compared to the conventional RSS- and TOA-based algorithms in wideband systems.
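
The distance-based approach mentioned above (ranges from a mobile station to at least three reference points) can be illustrated with a minimal sketch. This is not the dissertation's algorithm; it is a generic linearized least-squares position solver, and the names `toa_to_distance` and `trilaterate`, the 2-D geometry, and the noise-free ranges are all assumptions made for illustration.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def toa_to_distance(toa_seconds):
    """Convert a one-way time of arrival into a range estimate (hypothetical helper)."""
    return SPEED_OF_LIGHT * toa_seconds


def trilaterate(anchors, distances):
    """Estimate a 2-D position from >= 3 reference points and range estimates.

    Subtracting the first circle equation from the others gives a linear
    system in (x, y); it is solved in the least-squares sense through the
    2x2 normal equations.
    """
    if len(anchors) < 3 or len(anchors) != len(distances):
        raise ValueError("need at least three anchors and one distance per anchor")
    (x1, y1), d1 = anchors[0], distances[0]
    s_aa = s_ab = s_bb = s_ac = s_bc = 0.0
    for (xi, yi), di in zip(anchors[1:], distances[1:]):
        a = 2.0 * (xi - x1)
        b = 2.0 * (yi - y1)
        c = d1**2 - di**2 + xi**2 - x1**2 + yi**2 - y1**2
        # Accumulate the entries of A^T A and A^T c.
        s_aa += a * a
        s_ab += a * b
        s_bb += b * b
        s_ac += a * c
        s_bc += b * c
    det = s_aa * s_bb - s_ab * s_ab
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; the position is not unique")
    x = (s_bb * s_ac - s_ab * s_bc) / det
    y = (s_aa * s_bc - s_ab * s_ac) / det
    return x, y


if __name__ == "__main__":
    # Three reference points and exact ranges to the point (3, 4).
    anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    ranges = [5.0, math.sqrt(65.0), math.sqrt(45.0)]
    print(trilaterate(anchors, ranges))
```

With exact ranges the sketch recovers the true position up to rounding. With real TOA or RSS range estimates, errors such as those caused by UDP conditions propagate directly into the solution.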

    RF Localization in Indoor Environment

    This paper presents an indoor localization system based on RF power measurements of the received signal strength (RSS) in a WLAN environment. Today, the most viable solution for localization is the RSS fingerprinting-based approach, in which different machine learning approaches are used to establish a relationship between RSS values and location. The advantage of this approach based on WLAN technology is that it does not need new infrastructure (it reuses already widely deployed equipment), and the RSS measurement is part of the normal operating mode of wireless equipment. We derive the Cramer-Rao Lower Bound (CRLB) of localization accuracy for RSS measurements. Our analysis of the bound gives insight into the localization performance and deployment issues of a localization system, which could help in designing an efficient localization system. To compare different machine learning approaches, we developed a localization system based on an artificial neural network, k-nearest neighbors, a probabilistic method based on the Gaussian kernel, and the histogram method. We tested the developed system in a real-world WLAN indoor environment, where realistic RSS measurements were collected. An experimental comparison of the results was carried out, and an average location estimation error of around 2 meters was obtained.
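
Of the fingerprinting methods listed above, k-nearest neighbors is the simplest to sketch. The following is a minimal illustration, not the paper's implementation: the function name `knn_locate`, the radio-map layout (a list of position and RSS-vector pairs), the Euclidean distance in signal space, and the unweighted averaging of positions are all assumptions.

```python
import math


def knn_locate(radio_map, rss, k=3):
    """Estimate a position by k-nearest-neighbor matching in signal space.

    radio_map: list of ((x, y), [rss_ap1, rss_ap2, ...]) calibration entries.
    rss:       RSS vector measured at the unknown position, same AP order.
    Returns the unweighted mean of the k closest calibration positions.
    """
    if not radio_map:
        raise ValueError("radio map is empty")
    k = max(1, min(k, len(radio_map)))
    # Rank calibration points by Euclidean distance between RSS vectors (dBm).
    ranked = sorted(radio_map, key=lambda entry: math.dist(entry[1], rss))
    nearest = ranked[:k]
    x = sum(pos[0] for pos, _ in nearest) / k
    y = sum(pos[1] for pos, _ in nearest) / k
    return x, y


if __name__ == "__main__":
    # Hypothetical radio map: four calibration points, three access points.
    radio_map = [
        ((0, 0), [-40, -70, -70]),
        ((10, 0), [-70, -40, -70]),
        ((0, 10), [-70, -70, -40]),
        ((10, 10), [-60, -55, -55]),
    ]
    print(knn_locate(radio_map, [-65, -48, -62], k=2))
```

Choosing k trades noise averaging against spatial blurring. A weighted mean, for example by inverse signal-space distance, is a common variant that this sketch does not implement.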

    Transparent Location Fingerprinting for Wireless Services

    Detecting the user location is crucial in a wireless environment, not only for the choice of first-hop communication partners, but also for many auxiliary purposes: Quality of Service (availability of information in the right place for reduced congestion/delay, establishment of the optimal path), energy consumption, and automated insertion of location-dependent information into a web query issued by a user (for example, a tourist asking for information about a monument or a restaurant, or a fireman approaching a disaster area). The technique we propose in our investigation tries to meet two main goals: transparency to the network and independence from the environment. A user entering an environment (for instance, a wireless-networked building) shall be able to use his own portable equipment to build a personal map of the environment without the system even noticing it. Preliminary tests allow us to detect position on a map with an average uncertainty of two meters when using information gathered from three IEEE 802.11 access points in an indoor environment composed of many rooms on a 625 sqm area. Performance is expected to improve when more access points are exploited in the test area. Implementation of the same techniques on Bluetooth is also being studied.

    Report on the Information Retrieval Festival (IRFest2017)

    The Information Retrieval Festival took place in April 2017 in Glasgow. The focus of the workshop was to bring together IR researchers from the various Scottish universities and beyond in order to facilitate more awareness, increased interaction and reflection on the status of the field and its future. The program included an industry session, research talks, demos and posters as well as two keynotes. The first keynote was delivered by Prof. Jaana Kekalenien, who provided a historical, critical reflection on realism in Interactive Information Retrieval Experimentation, while the second keynote was delivered by Prof. Maarten de Rijke, who argued for more Artificial Intelligence usage in IR solutions and deployments. The workshop was followed by a "Tour de Scotland" where delegates were taken from Glasgow to Aberdeen for the European Conference in Information Retrieval (ECIR 2017).

    Detecting and locating trending places using multimodal social network data

    This paper presents a machine learning-based classifier for detecting points of interest through the combined use of images and text from social networks. This model exploits the transfer learning capabilities of the neural network architecture CLIP (Contrastive Language-Image Pre-Training) in multimodal environments using image and text. Different methodologies based on multimodal information are explored for the geolocation of the places detected. To this end, pre-trained neural network models are used for the classification of images and their associated texts. The result is a system that allows creating new synergies between images and texts in order to detect and geolocate trending places that have not been previously tagged by any other means, providing potentially relevant information for tasks such as cataloging specific types of places in a city for the tourism industry. The experiments carried out reveal that, in general, textual information is more accurate and relevant than visual cues in this multimodal setting. Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This research has been partially funded by project “Desarrollo de un ecosistema de datos abiertos para transformar el sector turístico” (GVA-COVID19/2021/103) funded by Conselleria de Innovación, Universidades, Ciencia y Sociedad Digital de la Generalitat Valenciana, “A way of making Europe” European Regional Development Fund (ERDF) and MCIN/AEI/10.13039/501100011033 for supporting this work under the “CHAN-TWIN” project (grant TED2021-130890B-C21) and the HORIZON-MSCA-2021-SE-0 action number: 101086387, REMARKABLE, Rural Environmental Monitoring via ultra wide-ARea networKs And distriButed federated Learning. We also would like to thank Nvidia for their generous hardware donations that made these experiments possible.

    Location-aware computing: a neural network model for determining location in wireless LANs

    The strengths of the RF signals arriving from multiple access points in a wireless LAN are related to the position of the mobile terminal and can be used to derive the location of the user. In a heterogeneous environment, e.g. inside a building or in a variegated urban geometry, the received power is a very complex function of the distance, the geometry, and the materials. The complexity of the inverse problem (deriving the position from the signals) and the lack of complete information motivate the use of flexible models based on a network of functions (neural networks). Specifying the value of the free parameters of the model requires a supervised learning strategy that starts from a set of labeled examples to construct a model that will then generalize in an appropriate manner when confronted with new data not present in the training set. The advantage of the method is that it does not require ad-hoc infrastructure in addition to the wireless LAN, while the flexible modeling and learning capabilities of neural networks achieve lower errors in determining the position, are amenable to incremental improvements, and do not require detailed knowledge of the access point locations and of the building characteristics. A user needs only a map of the working space and a small number of identified locations to train a system, as evidenced by the experimental results presented.

    Unsupervised quantification of entity consistency between photos and text in real-world news

    In today’s information age, the World Wide Web and social media are important sources for news and information. Different modalities (in the sense of information encoding) such as photos and text are typically used to communicate news more effectively or to attract attention.
Communication scientists, linguists, and semioticians have studied the complex interplay between modalities for decades and investigated, e.g., how their combination can carry additional information or add a new level of meaning. The number of shared concepts or entities (e.g., persons, locations, and events) between photos and text is an important aspect to evaluate the overall message and meaning of an article. Computational models for the quantification of image-text relations can enable many applications. For example, they allow for more efficient exploration of news, facilitate semantic search and multimedia retrieval in large (web) archives, or assist human assessors in evaluating news for credibility. To date, only a few approaches have been suggested that quantify relations between photos and text. However, they either do not explicitly consider the cross-modal relations of entities – which are important in the news – or rely on supervised deep learning approaches that can only detect the cross-modal presence of entities covered in the labeled training data. To address this research gap, this thesis proposes an unsupervised approach that can quantify entity consistency between photos and text in multimodal real-world news articles. The first part of this thesis presents novel approaches based on deep learning for information extraction from photos to recognize events, locations, dates, and persons. These approaches are an important prerequisite to measure the cross-modal presence of entities in text and photos. First, an ontology-driven event classification approach that leverages new loss functions and weighting schemes is presented. It is trained on a novel dataset of 570,540 photos and an ontology with 148 event types. The proposed system outperforms approaches that do not use structured ontology information. 
Second, a novel deep learning approach for geolocation estimation is proposed that uses additional contextual information on the environmental setting (indoor, urban, natural) and from earth partitions of different granularity. The proposed solution outperforms state-of-the-art approaches, which are trained with significantly more photos. Third, we introduce the first large-scale dataset for date estimation with more than one million photos taken between 1930 and 1999, along with two deep learning approaches that treat date estimation as a classification and regression problem. Both approaches achieve very good results that are superior to human annotations. Finally, a novel approach is presented that identifies public persons and their co-occurrences in news photos extracted from the Internet Archive, which collects time-versioned snapshots of web pages that are rarely enriched with metadata relevant to multimedia retrieval. Experimental results confirm the effectiveness of the deep learning approach for person identification. The second part of this thesis introduces an unsupervised approach capable of quantifying image-text relations in real-world news. Unlike related work, the proposed solution automatically provides novel measures of cross-modal consistency for different entity types (persons, locations, and events) as well as the overall context. The approach does not rely on any predefined datasets to cope with the large amount and diversity of entities and topics covered in the news. State-of-the-art tools for natural language processing are applied to extract named entities from the text. Example photos for these entities are automatically crawled from the Web. The proposed methods for information extraction from photos are applied to both news images and example photos to quantify the cross-modal consistency of entities. Two tasks are introduced to assess the quality of the proposed approach in real-world applications. 
Experimental results for document verification and retrieval of news with either low (potential misinformation) or high cross-modal similarities demonstrate the feasibility of the approach and its potential to support human assessors in studying news.

    Modeling the Behavior of Multipath Components Pertinent to Indoor Geolocation

    Recently, a number of empirical models have been introduced in the literature for the behavior of the direct path used in the design of algorithms for RF-based indoor geolocation. Frequent absence of the direct path has been a major burden on the performance of these algorithms, directing researchers to discover algorithms using multipath diversity. However, there is no reliable model for the behavior of multipath components pertinent to precise indoor geolocation. In this dissertation, we first examine the absence of the direct path by statistical analysis of empirical data. Then we show how the concept of path persistency can be exploited to obtain accurate ranging using multipath diversity. We analyze the effects of building architecture on the multipath structure by demonstrating the effects of wall length and wall density on the path persistency. Finally, we introduce a comprehensive model for the spatial behavior of multipath components. We use statistical analysis of empirical data obtained by a measurement-calibrated ray-tracing tool to model the time-of-arrival, angle-of-arrival and path gains. The relationship between the transmitter-receiver separation and the number of paths is also incorporated in our model. In addition, principles of ray optics are applied to explain the spatial evolution of path gains, time-of-arrival and angle-of-arrival of individual multipath components as a mobile terminal moves inside a typical indoor environment. We also use statistical modeling for the persistency and birth/death rate of the paths.

    A ubiquitous service-oriented automatic optical inspection platform for textile industry

    Within a highly competitive market context, quality standards are vital for the textile industry, in which the procedures used to assess manufactured products still rely mainly on human-based visual inspection. Factors such as ergonomics, analytical subjectivity, tiredness and error susceptibility therefore affect employees' performance and comfort in particular, and the economic health of each company operating in this industry in general. In this paper, a defect detection-oriented platform for quality control in the textile industry is proposed to tackle these issues and their impacts, combining computer vision, deep learning, geolocation and communication technologies. The system under development can integrate and improve the production ecosystem of a textile company through a properly adapted information technology setup and associated functionalities such as automatic defect detection and classification and real-time monitoring of operators, among others. This work was financed by the project “Smart Production Process” (No. POCI-01-0247-FEDER-045366), supported under the Incentive System for Research and Technological Development - Business R&DT (Individual Projects).