5,032 research outputs found
Alʔilbīrī’s Book of the rational conclusions. Introduction, Critical Edition of the Arabic Text and Materials for the History of the Ḫawāṣṣic Genre in Early Andalus
[eng] The Book of the rational conclusions, written perhaps sometime in the 10th century by a physician from Ilbīrah (Andalus), is a multi-section medical pandect. The author brings together, from a diversity of sources, materials dealing with drug-handling, natural philosophy, therapeutics, medical applications of the specific properties of things, a regimen, and a dispensatory. This dissertation comprises three parts. First, the transmission of the text, its contents, and its possible context are discussed. Then a critical edition of the Arabic text is offered. Last, but certainly not least, the subject of the specific properties is approached from several points of view. The analysis of Section III of the original book leads to an exploration of the early Andalusī assimilation of this epistemic tradition and to the establishment of a well-defined textual family in which our text must be inscribed. On the other hand, the very concept of ‘specific property’ is often misconstrued and is usually made synonymous with magic and superstition. Upon closer inspection, however, the alleged irrationality of the knowledge of these properties appears to be largely the result of anachronistic interpretation. As a complement to this particular research and as an illustration of the genre, a sample from an ongoing integral commentary on this section of the book is presented.
[cat] The Book of the rational conclusions, by an unknown physician from Ilbīrah (Andalus), was probably compiled during the second half of the 10th century. It is a rudimentary but remarkably complete kunnaix (an epistemic genre often defined as a ‘medical encyclopaedia’) in which the author gathers materials borrowed (often literally and without acknowledgement) from several genres. The book opens with a section on apotheconomy (a kind of apothecaries’ manual) but then focuses on the different branches of medicine. After some philosophical prolegomena the author copies, with minimal linguistic adaptation, an entire treatise on therapeutics, then another on the medical applications of the specific properties of things, a series of fragments related to dietetics (a regimen, in traditional terms) and, finally, a collection of medical recipes. Each of these sections shows evident intertextual links that point to an intense activity of synthesis of several traditions allied to medicine in caliphal Andalus. The text is, in fact, a magnificent object on which to apply the methodology of textual and source criticism. The critical edition of the text incorporates the chronological dimension into the apparatus, which thus becomes a contextualising element. As for the study of sources, while it is only secondary throughout the first part of this thesis, this discipline takes on an almost absolute prominence in the third part, especially in the chapter devoted to the individual analysis of each passage collected in the section on the specific properties of things.
Statistical analysis of grouped text documents
The topic of this thesis is statistical models for the analysis of textual data, emphasizing contexts in which text samples are grouped.
When dealing with text data, the first issue is to process it, making it computationally and methodologically compatible with the existing mathematical and statistical methods produced and continually developed by the scientific community. Therefore, the thesis first reviews existing methods for analytically representing and processing textual datasets, including Vector Space Models, distributed representations of words and documents, and contextualized embeddings. In doing so, it standardizes a notation that, even within the same representation approach, appears highly heterogeneous in the literature.
Then, two domains of application are explored: social media and cultural tourism. For the former, the thesis presents a study of self-presentation among diverse groups of individuals on the StockTwits platform, where finance and stock markets are the dominant topics. The proposed methodology integrated various types of data, both textual and categorical. This study revealed insights into how people present themselves online and found recurring patterns within groups of users.
For the latter, the thesis delves into a study conducted as part of the "Data Science for Brescia - Arts and Cultural Places" project, where a language model was trained to classify online reviews written in Italian into four distinct semantic areas related to cultural attractions in the Italian city of Brescia. The proposed model allows attractions to be identified in text documents, even when they are not explicitly mentioned in document metadata, thus opening possibilities for expanding the database related to these cultural attractions with new sources, such as social media platforms, forums, and other online spaces.
Lastly, the thesis presents a methodological study examining the group-specificity of words, analyzing various group-specificity estimators proposed in the literature. The study considered grouped text documents with both outcome and group variables. Its contribution consists of the proposal of modeling the corpus of documents as a multivariate distribution, enabling the simulation of corpora of text documents with predefined characteristics. The simulation provided valuable insights into the relationship between groups of documents and words. Furthermore, all its results can be freely explored through a web application, whose components are also described in this manuscript.
In conclusion, this thesis has been conceived as a collection of papers. It aims to contribute to the field with both applications and methodological proposals, and each study presented here suggests paths for future research to address the challenges in the analysis of grouped textual data.
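As a rough illustration of the Vector Space Models named in the abstract above, the following minimal sketch represents documents as bag-of-words count vectors and compares them with cosine similarity. It is not taken from the thesis: the helper names `tf_vector` and `cosine`, the whitespace tokenizer, and the three example sentences are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tf_vector(text):
    """Map a document to raw term counts (a sparse bag-of-words vector).

    Tokenization is a plain lowercase whitespace split (an assumption made
    for brevity; real pipelines use more careful preprocessing).
    """
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical example documents, not data from the thesis.
docs = [
    "the museum in brescia is beautiful",
    "a beautiful museum",
    "stock markets fell today",
]
vectors = [tf_vector(d) for d in docs]

sim_related = cosine(vectors[0], vectors[1])    # documents sharing two terms
sim_unrelated = cosine(vectors[0], vectors[2])  # documents sharing no terms
```

Documents that share vocabulary receive a positive similarity, while documents with disjoint vocabularies score zero. Distributed representations and contextualized embeddings, also mentioned in the abstract, replace these sparse counts with dense learned vectors.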
Displacement and the Humanities: Manifestos from the Ancient to the Present
This is the final version. Available on open access from MDPI via the DOI in this record. This is a reprint of articles from the Special Issue published online in the open access journal Humanities (ISSN 2076-0787) (available at: https://www.mdpi.com/journal/humanities/special_issues/Manifestos Ancient Present). This volume brings together the work of practitioners, communities, artists and other researchers from multiple disciplines. Seeking to provoke a discourse around displacement within and beyond the field of Humanities, it positions historical cases and debates, some reaching into the ancient past, within diverse geo-chronological contexts and current world urgencies. In adopting an innovative dialogic structure, between practitioners on the ground - from architects and urban planners to artists - and academics working across subject areas, the volume is a proposition to: remap priorities for current research agendas; open up disciplines, critically analysing their approaches; address the socio-political responsibilities that we have as scholars and practitioners; and provide an alternative site of discourse for contemporary concerns about displacement. Ultimately, this volume aims to provoke future work and collaborations - hence, manifestos - not only in the historical and literary fields, but also in wider research concerned with human mobility and the challenges confronting people who are out of place of rights, protection and belonging.
Multidisciplinary perspectives on Artificial Intelligence and the law
This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics – and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.
Essays on Corporate Disclosure of Value Creation
Information on a firm’s business model helps investors understand an entity’s resource requirements, priorities for action, and prospects (FASB, 2001, pp. 14-15; IASB, 2010, p. 12). Disclosures of strategy and business model (SBM) are therefore considered a central element of effective annual report commentary (Guillaume, 2018; IIRC, 2011). By applying natural language processing techniques, I explore what SBM disclosures look like when management are pressed to say something, analyse determinants of cross-sectional variation in SBM reporting properties, and assess whether and how managers respond to regulatory interventions seeking to promote SBM annual report commentary. This dissertation contains three main chapters. Chapter 2 presents a systematic review of the academic literature on non-financial reporting and the emerging literature on SBM reporting. Here, I also introduce my institutional setting. Chapter 3 and Chapter 4 form the empirical sections of this thesis. In Chapter 3, I construct the first large sample corpus of SBM annual report commentary and provide the first systematic analysis of the properties of such disclosures. My topic modelling analysis rejects the hypothesis that such disclosure is merely padding; instead, I find that themes align with popular strategy frameworks and that management tailor the mix of SBM topics to reflect their unique approach to value creation. However, SBM commentary is less specific, less precise about time horizon (short- and long-term), and less balanced (more positive) in tone relative to general management commentary. My findings suggest that symbolic compliance and legitimisation characterize the typical annual report discussion of SBM. Further analysis identifies proprietary cost considerations and obfuscation incentives as key determinants of symbolic reporting.
In Chapter 4, I seek evidence on how managers respond to regulatory mandates by adapting the properties of disclosure and investigate whether the form of the mandate matters. Using a differences-in-differences research design, my results suggest a modest incremental response by treatment firms to the introduction of a comply or explain provision to provide disclosure on strategy and business model. In contrast, I find a substantial response to enacting the same requirements in law. My analysis provides clear and consistent evidence that treatment firms incrementally increase the volume of SBM disclosure, improve coverage across a broad range of topics, and provide commentary with greater focus on the long term. My results point to substantial changes in SBM reporting properties following regulatory mandates, but the form of the mandate does matter. Overall, this dissertation contributes to the accounting literature by examining how firms discuss a topic central to economic decision making in annual reports and how firms respond to different forms of disclosure mandate. Furthermore, the results of my analysis are likely to be of value for regulators and policymakers currently reviewing or considering mandating disclosure requirements. By examining how companies adapt their reporting to different types of regulations, this study provides an empirical basis for recalibrating SBM disclosure mandates, thereby enhancing the information set of capital market participants and promoting stakeholder engagement in a landscape increasingly shaped by non-financial information.
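For readers unfamiliar with the differences-in-differences design mentioned in the abstract above, a generic textbook form is sketched below. The exact specification used in the dissertation is not given in the abstract, so the symbols here ($Y_{it}$ for a disclosure property of firm $i$ in period $t$, $\mathrm{Treat}_i$, $\mathrm{Post}_t$) are assumptions for illustration only.

```latex
% Generic two-group, two-period differences-in-differences regression
% (illustrative; not the dissertation's own specification).
\[
  Y_{it} = \alpha + \beta\,\mathrm{Treat}_i + \gamma\,\mathrm{Post}_t
         + \delta\,(\mathrm{Treat}_i \times \mathrm{Post}_t) + \varepsilon_{it}
\]
% With no further controls, the coefficient on the interaction equals the
% difference between the pre/post changes of the two groups:
\[
  \hat{\delta} =
  \bigl(\bar{Y}_{\mathrm{treat},\,\mathrm{post}} - \bar{Y}_{\mathrm{treat},\,\mathrm{pre}}\bigr)
  -
  \bigl(\bar{Y}_{\mathrm{control},\,\mathrm{post}} - \bar{Y}_{\mathrm{control},\,\mathrm{pre}}\bigr)
\]
```

Under this reading, a "modest" versus "substantial" response corresponds to a small versus large estimate of $\delta$ for the respective regulatory event.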
Low- and high-resource opinion summarization
Customer reviews play a vital role in the online purchasing decisions we make. The reviews
express user opinions that are useful for setting realistic expectations and uncovering important
details about products. However, some products receive hundreds or even thousands of
reviews, making them time-consuming to read. Moreover, many reviews contain uninformative
content, such as irrelevant personal experiences. Automatic summarization offers an
alternative – short text summaries capturing the essential information expressed in reviews.
Automatically produced summaries can reflect overall or particular opinions and be tailored to
user preferences. Besides being presented on major e-commerce platforms, they can also be
vocalized by home assistants. This approach can improve user satisfaction by helping users make
faster and better decisions.
Modern summarization approaches are based on neural networks, often requiring thousands of
annotated samples for training. However, human-written summaries for products are expensive
to produce because annotators need to read many reviews. This has led to annotated data
scarcity, where only a few datasets are available. Data scarcity is the central theme of our
work, and we propose a number of approaches to alleviate the problem. The thesis consists
of two parts where we discuss low- and high-resource data settings.
In the first part, we propose self-supervised learning methods applied to customer reviews
and few-shot methods for learning from small annotated datasets. Customer reviews without
summaries are available in large quantities, contain a breadth of in-domain specifics, and
provide a powerful training signal. We show that reviews can be used for learning summarizers
via a self-supervised objective. Further, we address two main challenges associated with
learning from small annotated datasets. First, large models rapidly overfit on small datasets
leading to poor generalization. Second, it is not possible to learn a wide range of in-domain
specifics (e.g., product aspects and usage) from a handful of gold samples. This leads to
subtle semantic mistakes in generated summaries, such as ‘great dead on arrival battery.’ We
address the first challenge by explicitly modeling summary properties (e.g., content coverage
and sentiment alignment). Furthermore, we leverage small modules – adapters – that are
more robust to overfitting. As we show, despite their size, these modules can be used to
store in-domain knowledge to reduce semantic mistakes. Lastly, we propose a simple method
for learning personalized summarizers based on aspects, such as ‘price,’ ‘battery life,’ and
‘resolution.’ This task is harder to learn, and we present a few-shot method for training a
query-based summarizer on small annotated datasets.
In the second part, we focus on the high-resource setting and present a large dataset with
summaries collected from various online resources. The dataset has more than 33,000 human-written
summaries, each linked to up to thousands of reviews. This, however, makes it
challenging to apply an ‘expensive’ deep encoder due to memory and computational costs. To
address this problem, we propose selecting small subsets of informative reviews. Only these
subsets are encoded by the deep encoder and subsequently summarized. We show that the
selector and summarizer can be trained end-to-end via amortized inference and policy gradient
methods.
Scalable Exploration of Complex Objects and Environments Beyond Plain Visual Replication
Digital multimedia content and presentation means are rapidly increasing their sophistication and are now capable of describing detailed representations of the physical world. 3D exploration experiences allow people to appreciate, understand and interact with intrinsically virtual objects.
Communicating information on objects requires the ability to explore them from different angles, as well as to mix highly photorealistic or illustrative presentations of the objects themselves with additional data that provides further insights into these objects, typically represented in the form of annotations. Effectively providing these capabilities requires solving important problems in visualization and user interaction.
In this thesis, I studied these problems in the cultural heritage computing domain, focusing on the very common and important special case of mostly planar, but visually, geometrically, and semantically rich objects. These could be roughly flat objects with a standard frontal viewing direction (e.g., paintings, inscriptions, bas-reliefs), as well as visualizations of fully 3D objects from particular points of view (e.g., canonical views of buildings or statues). Selecting a precise application domain and a specific presentation mode allowed me to concentrate on the well-defined use case of exploring annotated relightable stratigraphic models (in particular, for local and remote museum presentation).
My main results and contributions to the state of the art have been a novel technique for interactively controlling visualization lenses while automatically maintaining good focus-and-context parameters, a novel approach for avoiding clutter in an annotated model and for guiding users towards interesting areas, and a method for structuring audio-visual object annotations into a graph and for using that graph to improve guidance and support storytelling and automated tours.
I demonstrated the effectiveness and potential of these techniques by performing interactive exploration sessions on various screen sizes and types, ranging from desktop devices to large-screen displays for a walk-up-and-use museum installation.
KEYWORDS - Computer Graphics, Human-Computer Interaction, Interactive Lenses, Focus-and-Context, Annotated Models, Cultural Heritage Computing
Archaeological palaeoenvironmental archives: challenges and potential
This Arts and Humanities Research Council (AHRC) sponsored collaborative doctoral project represents one of
the most significant efforts to collate quantitative and qualitative data that can elucidate practices related to
archaeological palaeoenvironmental archiving in England. The research has revealed that archived
palaeoenvironmental remains are valuable resources for archaeological research and can clarify subjects that
include the adoption and importation of exotic species, plant and insect invasion, human health and diet, and
plant and animal husbandry practices. In addition to scientific research, archived palaeoenvironmental remains
can provide evidence-based narratives of human resilience and climate change and offer evidence of the
scientific process, making them ideal resources for public science engagement. These areas of potential have
been realised at a critical time, given that waterlogged palaeoenvironmental remains at significant sites
such as Star Carr, Must Farm, and Flag Fen, and archaeological deposits in towns and cities, are at risk of decay due
to climate change-related factors and unsustainable agricultural practices. Innovative approaches to collecting
and archiving palaeoenvironmental remains and maintaining existing archives will permit the creation of an
accessible and thorough national resource that can service archaeologists and researchers in the related fields
of biology and natural history. Furthermore, a concerted effort to recognise absences in archaeological
archives, matched by an effort to supply these deficiencies, can produce a resource that can contribute to an
enduring geographical and temporal record of England's biodiversity, which can be used in perpetuity in the
face of diminishing archaeological and contemporary natural resources.
To realise these opportunities, particular challenges must be overcome. The most prominent of these include
inconsistent collection policies resulting from pressures associated with shortages in storage capacity and
declining specialist knowledge in museums and repositories combined with variable curation practices. Many of
these challenges can be resolved by developing a dedicated storage facility that can focus on the ongoing
conservation and curation of palaeoenvironmental remains. Combined with an OASIS + module designed to
handle and disseminate data pertaining to palaeoenvironmental archives, remains would be findable,
accessible, and interoperable with biological archives and collections worldwide. Providing a national centre for
curating palaeoenvironmental remains and a dedicated digital repository will require significant funding.
Funding sources could be identified through collaboration with other disciplines. If sufficient funding cannot be
identified, options that would require less financial investment, such as high-level archive audits and the
production of guidance documents, will be able to assist all stakeholders with the improved curation,
management, and promotion of the archived resource.