Deep generative models for network data synthesis and monitoring
Measurement and monitoring are fundamental tasks in all networks, enabling the downstream management and optimization of the network. Although networks inherently produce abundant monitoring data, accessing and measuring that data effectively is another story. The challenges are manifold. First, network monitoring data is inaccessible to external users, and it is hard to provide a high-fidelity dataset without leaking commercially sensitive information. Second, effective data collection across a large-scale network system can be very expensive given the growing size of networks, e.g., the number of cells in a radio network and the number of flows in an Internet Service Provider (ISP) network. Third, it is difficult to ensure fidelity and efficiency simultaneously in network monitoring, as the resources available in a network element to support the measurement function are too limited to implement sophisticated mechanisms. Finally, understanding and explaining the behavior of the network is challenging due to its size and complex structure. Various emerging optimization-based solutions (e.g., compressive sensing) and data-driven solutions (e.g., deep learning) have been proposed for these challenges. However, the fidelity and efficiency of existing methods cannot yet meet current network requirements.
The contributions made in this thesis significantly advance the state of the art in network measurement and monitoring techniques. Overall, we leverage a cutting-edge machine learning technology, deep generative modeling, throughout the thesis. First, we design and realize APPSHOT, an efficient city-scale network traffic sharing system based on a conditional generative model, which requires only open-source contextual data during inference (e.g., land use information and population distribution). Second, we develop GENDT, an efficient drive testing system based on a generative model, which combines graph neural networks, conditional generation, and quantified model uncertainty to enhance the efficiency of mobile drive testing. Third, we design and implement DISTILGAN, a high-fidelity, efficient, versatile, and real-time network telemetry system with latent GANs and spectral-temporal networks. Finally, we propose SPOTLIGHT, an accurate, explainable, and efficient anomaly detection system for the Open RAN (Radio Access Network) system. The lessons learned through this research are summarized, and interesting topics are discussed for future work in this domain. All proposed solutions have been evaluated with real-world datasets and applied to support different applications in real systems.
Federated Learning for Predictive Healthcare Analytics: From theory to real world applications
In the contemporary landscape, machine learning has a pervasive impact across virtually all industries. However, the success of these systems hinges on the accessibility of training data. In today's world, every device generates data, which can serve as the building blocks for future technologies. Conventional machine learning methods rely on centralized data for training, but the availability of sufficient and valid data is often hindered by privacy concerns. Data privacy is the main concern when developing a healthcare system. One technique that allows decentralized learning is Federated Learning. Researchers have been actively applying this approach in various domains and have received a positive response. This paper underscores the significance of employing Federated Learning in the healthcare sector, emphasizing the wealth of data present in hospitals and electronic health records that could be used to train medical systems.
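To make the idea of decentralized training concrete, the following is a minimal sketch of federated averaging (FedAvg), a common aggregation scheme that the abstract itself does not name. Everything here is an assumption for illustration: the two "hospitals", their toy data drawn from y = 2x + 1, the one-dimensional linear model, and the function names `local_update`, `federated_average`, and `run_fedavg`. Each client trains on its own samples and only the model weights, never the raw data, are sent for aggregation.

```python
def local_update(weights, data, lr=0.1, epochs=5):
    """Train locally on one client's private data (1-D linear model y = w*x + b)."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error over this client's samples only.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(client_weights, client_sizes):
    """Average client models, weighting each by its number of samples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(cw[i] * n for cw, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_fedavg(clients, rounds=50):
    """Server loop: broadcast the global model, collect updates, aggregate."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Hypothetical clients: each holds private (x, y) samples of y = 2x + 1.
hospital_a = [(-1.0, -1.0), (0.0, 1.0)]
hospital_b = [(0.5, 2.0), (1.0, 3.0), (-0.5, 0.0)]
model = run_fedavg([hospital_a, hospital_b])
print(model)
```

Weighting by sample count keeps a small client from dominating the shared model. A real deployment would add secure aggregation and handling for clients that drop out, which this sketch omits.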
Improving Cross-Lingual Transfer Learning for Event Detection
The widespread adoption of applications powered by Artificial Intelligence (AI) backbones has unquestionably changed the way we interact with the world around us. Applications such as automated personal assistants, automatic question answering, and machine-based translation systems have become mainstays of modern culture thanks to the recent considerable advances in Natural Language Processing (NLP) research. Nonetheless, with over 7000 spoken languages in the world, there still remain a considerable number of marginalized communities that are unable to benefit from these technological advancements largely due to the language they speak. Cross-Lingual Learning (CLL) looks to address this issue by transferring the knowledge acquired from a popular, high-resource source language (e.g., English, Chinese, or Spanish) to a less favored, lower-resourced target language (e.g., Urdu or Swahili). This dissertation leverages the Event Detection (ED) sub-task of Information Extraction (IE) as a testbed and presents three novel approaches that improve cross-lingual transfer learning from distinct perspectives: (1) direct knowledge transfer, (2) hybrid knowledge transfer, and (3) few-shot learning.
Neuroimaging investigations of cortical specialisation for different types of semantic knowledge
Embodied theories propose that semantic knowledge is grounded in motor and perceptual experiences. This leads to two questions: (1) whether the neural underpinnings of perception are also necessary for semantic cognition; (2) how biases towards different sensorimotor experiences cause brain regions to specialise for particular types of semantic information. This thesis tackles these questions in a series of neuroimaging and behavioural investigations.
Regarding question 1, strong embodiment theory holds that semantic representation is a reenactment of corresponding experiences, and that brain regions for perception are necessary for comprehending modality-specific concepts. However, the weak embodiment view argues that reenactment may not be necessary, and that areas near perceptual regions may be sufficient to support semantic representation.
In the particular case of motion concepts, lateral occipital temporal cortex (LOTC) has long been identified as an important area, but the roles of its different subregions are still uncertain. Chapter 3 examined how different parts of LOTC reacted to written descriptions of motion and static events, using multiple analysis methods. A series of anterior to posterior sub-regions were analyzed through univariate, multivariate pattern analysis (MVPA), and psychophysical interaction (PPI) analyses. MVPA revealed the strongest decoding effects for motion vs. static events in the posterior parts of LOTC, including both visual motion area (V5) and posterior middle temporal gyrus (pMTG). In contrast, only the middle portion of LOTC showed increased activation for motion sentences in univariate analyses. PPI analyses showed increased functional connectivity between posterior LOTC and the multiple demand network for motion events. These findings suggest that posterior LOTC, which overlapped with the motion perception V5 region, is selectively involved in comprehending motion events, while the anterior part of LOTC contributes to general semantic processing.
Regarding question 2, the hub-and-spoke theory suggests that anterior temporal lobe (ATL) acts as a hub, using inputs from modality-specific regions to construct multimodal concepts. However, some researchers propose temporal parietal cortex (TPC) as an additional hub, specialised in processing and integrating interaction and contextual information (e.g., for actions and locations). These hypotheses are summarized as the "dual-hub theory", and different aspects of this theory were investigated in Chapters 4 and 5.
Chapter 4 focuses on taxonomic and thematic relations. Taxonomic relations (or categorical relations) occur when two concepts belong to the same category (e.g., "dog" and "wolf" are both canines). In contrast, thematic relations (or associative relations) refer to situations in which two concepts co-occur in events or scenes (e.g., "dog" and "bone"), focusing on the interaction or association between concepts. Some studies have indicated ATL specialization for taxonomic relations and TPC specialization for thematic relations, but others have reported inconsistent or even converse results. Thus Chapter 4 first conducted an activation likelihood estimation (ALE) meta-analysis of neuroimaging studies contrasting taxonomic and thematic relations. This found that thematic relations reliably engage action and location processing regions (left pMTG and SMG), while taxonomic relations only showed consistent effects in the right occipital lobe. A primed semantic judgement task was then used to test the dual-hub theory's prediction that taxonomic relations are heavily reliant on colour and shape knowledge, while thematic relations rely on action and location knowledge. This behavioural experiment revealed that action or location priming facilitated thematic relation processing, but colour and shape did not lead to priming effects for taxonomic relations. This indicates that thematic relations rely more on action and location knowledge, which may explain why they preferentially engage TPC, whereas taxonomic relations are not specifically linked to shape and colour features. This may explain why they did not preferentially engage left ATL.
Chapter 5 concentrates on event and object concepts. Previous studies suggest ATL specialization for coding similarity of objects' semantics, and angular gyrus (AG) specialization for sentence and event structure representation. In addition, in neuroimaging studies, event semantics are usually investigated using complex temporally extended stimuli, unlike the single-concept stimuli used to investigate object semantics. Thus Chapter 5 used representational similarity analysis (RSA), univariate analysis, and PPI analysis to explore neural activation patterns for event and object concepts presented as static images. Bilateral AGs encoded semantic similarity for event concepts, with the left AG also coding object similarity. Bilateral ATLs encoded semantic similarity for object concepts but also for events. Left ATL exhibited stronger coding for events than objects. PPI analysis revealed stronger connections between left ATL and right pMTG, and between right AG and bilateral inferior temporal gyrus (ITG) and middle occipital gyrus, for event concepts compared to object concepts. Consistent with the meta-analysis in Chapter 4, the results in Chapter 5 support the idea of partial specialization in AG for event semantics but do not support ATL specialization for object semantics. In fact, both the meta-analysis and Chapter 5 findings suggest greater ATL involvement in coding objects' associations compared to their similarity.
To conclude, the thesis provides support for the idea that perceptual brain regions are engaged in conceptual processing, in the case of motion concepts. It also provides evidence for a specialised role for TPC regions in processing thematic relations (pMTG) and event concepts (AG). There was mixed evidence for specialisation within the ATLs, and this remains an important target for future research.
Loop closure detection of visual SLAM based on variational autoencoder
Loop closure detection is an important module for simultaneous localization and mapping (SLAM). Correct detection of loops can reduce the cumulative drift in positioning. Because traditional detection methods rely on handcrafted features, false positive detections can occur when the environment changes, resulting in incorrect estimates and an inability to obtain accurate maps. In this research paper, a loop closure detection method based on a variational autoencoder (VAE) is proposed. The VAE serves as a feature extractor, replacing the handcrafted features used in traditional methods with image features extracted by a neural network. This method extracts a low-dimensional vector as the representation of the image. In addition, an attention mechanism is added to the network and constraints are added to the loss function to improve the image representation. In the back-end feature matching process, geometric checking is used to filter out wrong matches and so reduce false positives. Finally, numerical experiments show that the proposed method achieves a better precision-recall curve than the traditional bag-of-words model and other deep learning methods, and that it is highly robust to environmental changes. In addition, experiments on datasets from three different scenarios demonstrate that the method can be applied in real-world scenarios and performs well.
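The matching step described above, comparing low-dimensional image descriptors to find revisited places, can be sketched as follows. This is an illustrative assumption, not the paper's implementation: the abstract does not state the similarity measure or thresholds, so cosine similarity, the `threshold` and `min_gap` parameters, and the function names are all hypothetical. The hand-written vectors stand in for the encoder output, and the geometric check is omitted.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two descriptor vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_loop_candidates(descriptors, threshold=0.9, min_gap=2):
    """Return (current_frame, matched_frame, score) for likely revisits.

    Frames closer than `min_gap` are skipped, since consecutive frames
    look alike without being loop closures.
    """
    candidates = []
    for i, current in enumerate(descriptors):
        best_j, best_score = None, threshold
        # Only compare against frames j with i - j >= min_gap.
        for j in range(0, i - min_gap + 1):
            score = cosine_similarity(current, descriptors[j])
            if score >= best_score:
                best_j, best_score = j, score
        if best_j is not None:
            candidates.append((i, best_j, best_score))
    return candidates


# Stand-in descriptors; frame 3 is nearly identical to frame 0.
frames = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.99, 0.05, 0.0],
]
candidates = find_loop_candidates(frames)
print(candidates)
```

In a full pipeline, each candidate pair would then go through geometric verification before being accepted as a loop closure.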
OpenAGI: When LLM Meets Domain Experts
Human intelligence has the remarkable ability to assemble basic skills into
complex ones so as to solve complex tasks. This ability is equally important
for Artificial Intelligence (AI), and thus, we assert that in addition to the
development of large, comprehensive intelligent models, it is equally crucial
to equip such models with the capability to harness various domain-specific
expert models for complex task-solving in the pursuit of Artificial General
Intelligence (AGI). Recent developments in Large Language Models (LLMs) have
demonstrated remarkable learning and reasoning abilities, making them promising
as a controller to select, synthesize, and execute external models to solve
complex tasks. In this project, we develop OpenAGI, an open-source AGI research
platform, specifically designed to offer complex, multi-step tasks and
accompanied by task-specific datasets, evaluation metrics, and a diverse range
of extensible models. OpenAGI formulates complex tasks as natural language
queries, serving as input to the LLM. The LLM subsequently selects,
synthesizes, and executes models provided by OpenAGI to address the task.
Furthermore, we propose a Reinforcement Learning from Task Feedback (RLTF)
mechanism, which uses the task-solving result as feedback to improve the LLM's
task-solving ability. Thus, the LLM is responsible for synthesizing various
external models for solving complex tasks, while RLTF provides feedback to
improve its task-solving ability, enabling a feedback loop for self-improving
AI. We believe that the paradigm of LLMs operating various expert models for
complex task-solving is a promising approach towards AGI. To facilitate the
community's long-term improvement and evaluation of AGI's ability, we
open-source the code, benchmark, and evaluation methods of the OpenAGI project
at https://github.com/agiresearch/OpenAGI.
Comment: 18 pages, 6 figures, 7 tables
Medical Image Analysis using Deep Relational Learning
In the past ten years, with the help of deep learning, especially the rapid
development of deep neural networks, medical image analysis has made remarkable
progress. However, how to effectively use the relational information between
various tissues or organs in medical images is still a very challenging
problem, and it has not been fully studied. In this thesis, we propose two
novel solutions to this problem based on deep relational learning. First, we
propose a context-aware fully convolutional network that effectively models
implicit relation information between features to perform medical image
segmentation. The network achieves state-of-the-art segmentation results on
the Multi Modal Brain Tumor Segmentation 2017 (BraTS2017) and Multi Modal Brain
Tumor Segmentation 2018 (BraTS2018) data sets. Subsequently, we propose a new
hierarchical homography estimation network to achieve accurate medical image
mosaicing by learning the explicit spatial relationship between adjacent
frames. We use the UCL Fetoscopy Placenta dataset to conduct experiments and
our hierarchical homography estimation network outperforms the other
state-of-the-art mosaicing methods while generating robust and meaningful
mosaicing results on unseen frames.
Comment: arXiv admin note: substantial text overlap with arXiv:2007.0778
Paste, Inpaint and Harmonize via Denoising: Subject-Driven Image Editing with Pre-Trained Diffusion Model
Text-to-image generative models have attracted rising attention for flexible
image editing via user-specified descriptions. However, text descriptions alone
are not enough to elaborate the details of subjects, often compromising the
subjects' identity or requiring additional per-subject fine-tuning. We
introduce a new framework called Paste, Inpaint and Harmonize via
Denoising (PhD), which leverages an exemplar image in addition to text
descriptions to specify user intentions. In the pasting step, an off-the-shelf
segmentation model is employed to identify a user-specified subject within an
exemplar image which is subsequently inserted into a background image to serve
as an initialization capturing both scene context and subject identity in one.
To guarantee the visual coherence of the generated or edited image, we
introduce an inpainting and harmonizing module to guide the pre-trained
diffusion model to blend the inserted subject seamlessly into the scene.
As we keep the pre-trained diffusion model frozen, we preserve its
strong image synthesis ability and text-driven ability, thus achieving
high-quality results and flexible editing with diverse texts. In our
experiments, we apply PhD to subject-driven image editing tasks and
explore text-driven scene generation given a reference subject. Both
quantitative and qualitative comparisons with baseline methods demonstrate that
our approach achieves state-of-the-art performance in both tasks. More
qualitative results can be found at
https://sites.google.com/view/phd-demo-page.
Comment: 10 pages, 12 figures
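La traduzione specializzata all'opera per una piccola impresa in espansione: la mia esperienza di internazionalizzazione in cinese di Bioretics© S.r.l. [Specialized translation at work for a small, growing company: my experience of internationalizing Bioretics© S.r.l. in Chinese]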
Global markets are currently immersed in two all-encompassing and unstoppable processes: internationalization and globalization. While the former pushes companies to look beyond the borders of their country of origin to forge relationships with foreign trading partners, the latter fosters standardization across countries by reducing spatiotemporal distances and breaking down geographical, political, economic and socio-cultural barriers. In recent decades, another domain has appeared to propel these unifying drives: Artificial Intelligence, together with its advanced technologies, which aim to implement human cognitive abilities in machines. The "Language Toolkit - Le lingue straniere al servizio dell'internazionalizzazione dell'impresa" project, promoted by the Department of Interpreting and Translation (Forlì Campus) in collaboration with the Romagna Chamber of Commerce (Forlì-Cesena and Rimini), seeks to help Italian SMEs make their way into the global market. It is precisely within this project that this dissertation has been conceived. Its purpose is to present the translation and localization project from English into Chinese of a series of texts produced by Bioretics© S.r.l.: an investor deck, the company website and part of the installation and use manual of the Aliquis© framework software, its flagship product. This dissertation is structured as follows: Chapter 1 presents the project and the company in detail; Chapter 2 outlines the internationalization and globalization processes and the Artificial Intelligence market both in Italy and in China; Chapter 3 provides the theoretical foundations for every aspect related to Specialized Translation, including website localization; Chapter 4 describes the resources and tools used to perform the translations; Chapter 5 proposes an analysis of the source texts; Chapter 6 is a commentary on translation strategies and choices.