123 research outputs found
Tailored deep learning techniques for information retrieval
Information retrieval aims to find documents relevant to a query. Many traditional information retrieval models have been proposed; they either encode the query and documents as vectors in term space and estimate relevance by computing the similarity of the two vectors, or estimate relevance with probabilistic models. However, for vector space models, encoding queries and documents in term space has its limitations: for example, it is difficult to identify document terms that carry meanings similar to the exact query terms. It is also difficult to represent text content at different levels of abstraction so as to match the different information needs expressed in queries.
With the rapid development of deep learning techniques, it is possible to learn useful representations through a series of neural layers, which paves the way to learning better representations in a latent dense space rather than in term space; this may help match terms that are not exact matches but carry similar meanings. Deep learning also allows us to create different layers of representation for the query and the document, enabling matching between them at different levels of abstraction, which may better serve the information needs of different queries. Finally, deep learning techniques also allow a better ranking function to be learned.
In this thesis, we explore several deep learning techniques to deal with the above problems.
First, we study the effectiveness of building multiple abstraction layers between query and document, for both representation-based and interaction-based models. Then we propose a model that allows cross-matching of query and document representations at different layers, to better serve the need for term-phrase matching. Finally, we propose an integrated learning framework for the ranking function and the neural features of the query and document.
Experiments on public datasets demonstrate that the methods proposed in this thesis are more effective than existing ones.
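As a point of reference for the cross-layer matching idea above, here is a minimal sketch of a dual encoder that scores every (query layer, document layer) pair; the layer sizes, the `CrossLayerMatcher` name, and the cosine-based scoring are illustrative assumptions, not the thesis's actual architecture.

```python
# Hypothetical sketch of representation-based matching with multiple
# abstraction layers and cross-layer matching, in the spirit of the
# abstract above; layer sizes and scoring are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StackedEncoder(nn.Module):
    """Maps a pooled text embedding to several abstraction layers."""
    def __init__(self, emb_dim=300, hidden=128, n_layers=3):
        super().__init__()
        dims = [emb_dim] + [hidden] * n_layers
        self.layers = nn.ModuleList(
            [nn.Linear(dims[i], dims[i + 1]) for i in range(n_layers)])

    def forward(self, x):                       # x: (batch, emb_dim)
        reps, h = [], x
        for layer in self.layers:
            h = torch.tanh(layer(h))
            reps.append(h)                      # one representation per layer
        return reps

class CrossLayerMatcher(nn.Module):
    """Scores a query-document pair by matching every layer pair."""
    def __init__(self, n_layers=3):
        super().__init__()
        self.q_enc = StackedEncoder(n_layers=n_layers)
        self.d_enc = StackedEncoder(n_layers=n_layers)
        # One learned weight per (query layer, document layer) pair.
        self.combine = nn.Linear(n_layers * n_layers, 1)

    def forward(self, q_emb, d_emb):
        q_reps, d_reps = self.q_enc(q_emb), self.d_enc(d_emb)
        sims = [F.cosine_similarity(q, d, dim=-1)   # cross-matching
                for q in q_reps for d in d_reps]
        return self.combine(torch.stack(sims, dim=-1)).squeeze(-1)

# Toy usage: random embeddings standing in for pooled word vectors.
scores = CrossLayerMatcher()(torch.randn(2, 300), torch.randn(2, 300))
```

In practice such a model would be trained with a pairwise ranking loss over relevant and non-relevant documents; the point here is only the cross-matching of all layer pairs.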
Quantum Transport and Band Structure Evolution under High Magnetic Field in Few-Layer Tellurene
The quantum Hall effect (QHE) is a macroscopic manifestation of quantized states which occurs only in confined two-dimensional electron gas (2DEG) systems. Experimentally, QHE is hosted in high-mobility 2DEGs under a large external magnetic field at low temperature. Two-dimensional van der Waals materials, such as graphene and black phosphorus, are considered interesting systems for studying quantum transport, because they can unveil unique host-material properties thanks to the easy accessibility of monolayer or few-layer thin films at the 2D quantum limit. Here, for the first time, we report the direct observation of QHE in a novel low-dimensional material system: tellurene. High-quality 2D tellurene thin films were obtained by a recently reported hydrothermal method and exhibit a high hole mobility of nearly 3,000 cm2/Vs at low temperatures, which allows the observation of well-developed Shubnikov-de Haas (SdH) oscillations and QHE. A four-fold degeneracy of Landau levels was revealed in the SdH oscillations and QHE. Quantum oscillations were investigated under different gate biases, tilted magnetic fields, and various temperatures, and the results reveal intrinsic information about the electronic structure of Te. Anomalies were observed in both the temperature-dependent oscillation amplitudes and the transport characteristics, which are ascribed to the interplay between the Zeeman effect and spin-orbit coupling, as supported by density functional theory (DFT) calculations.
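For context on the four-fold degeneracy mentioned above, the standard textbook relation linking the SdH oscillation period to the 2D carrier density is shown below; it is not given in the abstract itself and is included only as a reference point.

```latex
% Standard SdH relations (textbook results, not taken from the abstract):
% each Landau level holds g e B / h states per unit area, where g is the
% level degeneracy (g = 4 here, e.g. combined spin and valley degeneracy).
\begin{align}
  \Delta\!\left(\tfrac{1}{B}\right) = \frac{g e}{h\, n_{2D}},
  \qquad
  F = \frac{h\, n_{2D}}{g e}
\end{align}
```

A measured oscillation frequency F together with g = 4 thus yields the hole density directly, which is the usual cross-check against densities extracted at different gate biases.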
Regex-augmented Domain Transfer Topic Classification based on a Pre-trained Language Model: An application in Financial Domain
A common way to use large pre-trained language models for downstream tasks is
to fine tune them using additional layers. This may not work well if downstream
domain is a specialized domain whereas the large language model has been
pre-trained on a generic corpus. In this paper, we discuss the use of regular
expression patterns employed as features for domain knowledge during the
process of fine tuning, in addition to domain specific text. Our experiments on
real scenario production data show that this method of fine tuning improves the
downstream text classification tasks as compared to fine tuning only on domain
specific text. We also show that the use of attention network for fine tuning
improves results compared to simple linear layers
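A minimal sketch of what regex-feature-augmented fine-tuning could look like follows; the pattern list, the feature fusion, and the attention pooling here are assumptions for illustration, not the paper's exact method.

```python
# Hypothetical sketch: regex match indicators concatenated with a
# pre-trained encoder's attention-pooled output for topic classification.
# Patterns and fusion strategy are illustrative assumptions.
import re
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

# Example domain patterns (hypothetical financial-domain cues).
PATTERNS = [re.compile(p, re.I) for p in (
    r"\b(?:EBITDA|P/E|ROI)\b",
    r"\$\s?\d[\d,]*(?:\.\d+)?\s?(?:million|billion)?",
    r"\bQ[1-4]\s+20\d{2}\b",
)]

def regex_features(text: str) -> torch.Tensor:
    """One binary indicator per pattern: does it match anywhere?"""
    return torch.tensor([float(bool(p.search(text))) for p in PATTERNS])

class RegexAugmentedClassifier(nn.Module):
    def __init__(self, model_name="bert-base-uncased", n_classes=5):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        # Attention pooling over token states (one learned scorer),
        # in place of a plain linear head on the [CLS] token.
        self.attn = nn.Linear(hidden, 1)
        self.head = nn.Linear(hidden + len(PATTERNS), n_classes)

    def forward(self, input_ids, attention_mask, regex_feats):
        states = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        weights = self.attn(states).masked_fill(
            attention_mask.unsqueeze(-1) == 0, float("-inf")).softmax(dim=1)
        pooled = (weights * states).sum(dim=1)        # (batch, hidden)
        return self.head(torch.cat([pooled, regex_feats], dim=-1))

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
text = "Q3 2023 revenue rose to $1.2 billion, lifting EBITDA margins."
batch = tokenizer(text, return_tensors="pt")
logits = RegexAugmentedClassifier()(batch["input_ids"],
                                    batch["attention_mask"],
                                    regex_features(text).unsqueeze(0))
```

The design choice worth noting is that the regex indicators bypass the encoder entirely, so domain cues survive even when the pre-trained tokenizer fragments rare domain terms.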
Higher superconducting transition temperature by breaking the universal pressure relation
By investigating the bulk superconducting state via dc magnetization measurements, we have discovered a common resurgence of the superconducting transition temperatures (Tcs) of the monolayer Bi2Sr2CuO6+ÎŽ (Bi2201) and bilayer Bi2Sr2CaCu2O8+ÎŽ (Bi2212) beyond the maximum Tcs (Tc-maxs) predicted by the universal relation between Tc and doping (p) or pressure (P) at higher pressures. The Tc of under-doped Bi2201 initially increases from 9.6 K at ambient pressure to a peak of ~23 K at ~26 GPa and then drops, as expected from the universal Tc-P relation. However, at pressures above ~40 GPa, Tc rises rapidly without any sign of saturation, up to ~30 K at ~51 GPa. Similarly, the Tc of the slightly overdoped Bi2212 increases after passing through a broad valley between 20-36 GPa and reaches ~90 K without any sign of saturation at ~56 GPa. We therefore attribute this Tc resurgence to a possible pressure-induced electronic transition in the cuprate compounds, due to a charge transfer between the Cu 3d_{x^2-y^2} and the O 2p bands projected from a hybrid bonding state, leading to an increase of the density of states at the Fermi level, in agreement with our density functional theory calculations. Similar Tc-P behavior has also been reported in the trilayer Bi2Sr2Ca2Cu3O10+ÎŽ (Bi2223). These observations suggest that Tcs higher than those previously reported for the layered cuprate high-temperature superconductors can be achieved by breaking away from the universal Tc-P relation through the application of higher pressures.
The Role of Chain-of-Thought in Complex Vision-Language Reasoning Task
This study explores the effectiveness of the Chain-of-Thought approach, known for its proficiency in language tasks achieved by breaking them down into sub-tasks and intermediate steps, in improving vision-language tasks that demand sophisticated perception and reasoning. We present the "Description then Decision" strategy, which is inspired by how humans process signals. This strategy significantly improves probing task performance by 50%, establishing the groundwork for future research on reasoning paradigms in complex vision-language tasks.
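The abstract does not spell out how the strategy is implemented; as a hedged illustration, a two-stage prompt of this kind might be wired up as below, where the prompts and the `vlm_generate` helper are hypothetical stand-ins, not the paper's actual setup.

```python
# Hypothetical two-stage "Description then Decision" prompting sketch.
# `vlm_generate` stands in for any vision-language model API; prompts
# and function are illustrative assumptions.
def vlm_generate(image, prompt: str) -> str:
    """Placeholder for a call to a vision-language model."""
    raise NotImplementedError("wire this to your VLM of choice")

def describe_then_decide(image, question: str) -> str:
    # Stage 1: elicit the relevant visual evidence, mirroring how CoT
    # breaks a problem into intermediate steps.
    description = vlm_generate(
        image,
        "Describe, in detail, everything in the image that is relevant "
        f"to answering the question: {question}")
    # Stage 2: answer the question conditioned on that description.
    return vlm_generate(
        image,
        f"Image description: {description}\n"
        f"Question: {question}\n"
        "Using the description as intermediate reasoning, give the answer.")
```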
Single-cell analysis of the developing human ovary defines distinct insights into ovarian somatic and germline progenitors.
Formation of either an ovary or a testis during human embryonic life is one of the most important sex-specific events leading to the emergence of secondary sexual characteristics and sex assignment of babies at birth. Our study focused on the sex-specific and sex-indifferent characteristics of the prenatal ovarian stromal cells, cortical cords, and germline, with the discovery that the ovarian mesenchymal cells of the stroma are transcriptionally indistinguishable from the mesenchymal cells of the testicular interstitium. We found that first-wave pre-granulosa cells emerge at week 7 from early supporting gonadal cells with stromal identity and are spatially defined by KRT19 levels. We also identified rare transient state f0 spermatogonia cells within the ovarian cords between weeks 10 and 16. Taken together, our work illustrates a unique plasticity of the embryonic ovary during human development.
A Survey of Large Language Models
Language is essentially a complex, intricate system of human expressions governed by grammatical rules. Developing capable AI algorithms for comprehending and grasping a language poses a significant challenge. As a major approach, language modeling has been widely studied for language understanding and generation over the past two decades, evolving from statistical language models to neural language models. Recently, pre-trained language models (PLMs), built by pre-training Transformer models over large-scale corpora, have shown strong capabilities in solving various NLP tasks. Since researchers have found that model scaling can lead to performance improvement, they have further studied the scaling effect by increasing the model size even more. Interestingly, when the parameter scale exceeds a certain level, these enlarged language models not only achieve a significant performance improvement but also show some special abilities that are not present in small-scale language models. To mark the difference in parameter scale, the research community has coined the term large language models (LLMs) for PLMs of significant size. Recently, research on LLMs has been greatly advanced by both academia and industry, and a remarkable milestone is the launch of ChatGPT, which has attracted widespread attention from society. The technical evolution of LLMs has been making an important impact on the entire AI community and could revolutionize the way we develop and use AI algorithms. In this survey, we review the recent advances in LLMs by introducing the background, key findings, and mainstream techniques. In particular, we focus on four major aspects of LLMs, namely pre-training, adaptation tuning, utilization, and capacity evaluation. We also summarize the available resources for developing LLMs and discuss remaining issues for future directions.
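The scaling effect mentioned above is usually summarized by empirical power laws; as a reference point (from the scaling-law literature, not from this abstract), language-model loss is often modeled as:

```latex
% Empirical power-law scaling of LM loss with model size N and data size D
% (Kaplan et al., 2020); constants and exponents are fit per model family.
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N},
\qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}
```

Emergent abilities are then the observation that some task metrics stay near chance until the parameter scale crosses a threshold, rather than improving smoothly along such curves.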
- …