Techniques for Improving Web Search by Understanding Queries
This thesis investigates the refinement of web search results with a special
focus on the use of clustering and the role of queries. It presents a
collection of new methods for evaluating clustering methods, performing
clustering effectively, and for performing query refinement.
The thesis identifies different types of query, the situations where refinement
is necessary, and the factors affecting search difficulty. It then
analyses hard searches and argues that many of them fail because users
and search engines have different query models.
The thesis identifies best practice for evaluating web search results and
search refinement methods. It finds that none of the commonly used evaluation
measures for clustering meet all of the properties of good evaluation
measures. It then presents new quality and coverage measures that
satisfy all the desired properties and that rank clusterings correctly in all
web page clustering situations.
The thesis argues that current web page clustering methods work well
when different interpretations of the query have distinct vocabulary, but
still have several limitations and often produce incomprehensible clusters.
It then presents a new clustering method that uses the query to guide
the construction of semantically meaningful clusters. The new clustering
method significantly improves performance.
Finally, the thesis explores how searches and queries are composed of
different aspects and shows how to use aspects to reduce the distance between
the query models of search engines and users. It then presents fully
automatic methods that identify query aspects, identify underrepresented
aspects, and predict query difficulty. Used in combination, these methods
have many applications; the thesis describes methods for two of
them. The first method improves the search results for hard queries with
underrepresented aspects by automatically expanding the query using semantically
orthogonal keywords related to the underrepresented aspects.
The second method helps users refine hard ambiguous queries by identifying
the different query interpretations using a clustering of a diverse set
of refinements. Both methods significantly outperform existing methods
Multi-Class Simultaneous Adaptive Segmentation and Quality Control of Point Cloud Data
3D modeling of a given site is an important activity for a wide range of applications including urban planning, as-built mapping of industrial sites, heritage documentation, military simulation, and outdoor/indoor analysis of airflow. Point clouds, which could be either derived from passive or active imaging systems, are an important source for 3D modeling. Such point clouds need to undergo a sequence of data processing steps to derive the necessary information for the 3D modeling process. Segmentation is usually the first step in the data processing chain. This paper presents a region-growing multi-class simultaneous segmentation procedure, where planar, pole-like, and rough regions are identified while considering the internal characteristics (i.e., local point density/spacing and noise level) of the point cloud in question. The segmentation starts with point cloud organization into a kd-tree data structure and characterization process to estimate the local point density/spacing. Then, proceeding from randomly-distributed seed points, a set of seed regions is derived through distance-based region growing, which is followed by modeling of such seed regions into planar and pole-like features. Starting from optimally-selected seed regions, planar and pole-like features are then segmented. The paper also introduces a list of hypothesized artifacts/problems that might take place during the region-growing process. Finally, a quality control process is devised to detect, quantify, and mitigate instances of partially/fully misclassified planar and pole-like features. Experimental results from airborne and terrestrial laser scanning as well as image-based point clouds are presented to illustrate the performance of the proposed segmentation and quality control framework
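The distance-based region growing mentioned in the abstract above can be illustrated with a minimal sketch. It is not the paper's procedure: the function name `grow_region` and the parameter `max_spacing` are hypothetical, the threshold is fixed rather than derived from the local point density, and a brute-force neighbour search stands in for the kd-tree.

```python
from collections import deque
from math import dist


def grow_region(points, seed_index, max_spacing):
    """Return sorted indices of points reachable from the seed via hops <= max_spacing.

    Hypothetical helper: a real pipeline would query a kd-tree instead of
    scanning every point, and would adapt max_spacing to local point density.
    """
    region = {seed_index}
    frontier = deque([seed_index])
    while frontier:
        current = frontier.popleft()
        for j, candidate in enumerate(points):
            # Add any unvisited point that lies close enough to the current one.
            if j not in region and dist(points[current], candidate) <= max_spacing:
                region.add(j)
                frontier.append(j)
    return sorted(region)


# Three closely spaced points and one distant outlier (made-up coordinates).
points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
region = grow_region(points, seed_index=0, max_spacing=0.6)
print(region)
```

Growing from the outlier instead returns only the outlier itself, since no other point lies within the threshold.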
QSAR models for prediction of PPARδ agonistic activity of indanylacetic acid derivatives
Peroxisome Proliferator Activated Receptor β/δ (PPAR β/δ), one of three PPAR isoforms,
is a member of the nuclear receptor superfamily and is ubiquitously expressed in several
metabolically active tissues such as liver, muscle, and fat. Tissue-specific expression and
knock-out studies suggest a role of PPARδ in obesity and metabolic syndrome. Specific
and selective PPARδ ligands may play an important role in the treatment of metabolic
disorders. Indanylacetic acid derivatives reported as potent and specific ligands against
PPARδ have been studied for Quantitative Structure-Activity Relationships
(QSAR). Molecules were represented by chemical descriptors that encode constitutional,
topological, geometrical, and electronic structure features. Four different approaches, i.e.,
random selection, hierarchical clustering, k-means clustering, and sphere exclusion
method, were used to divide the dataset into training and test subsets. A forward stepwise
Multiple Linear Regression (MLR) approach was used to select the subset of
descriptors and establish the linear relationship with PPARδ agonistic activity of the
molecules. The models were validated internally by Leave One Out (LOO) and externally
for the prediction of test sets. The best subset of descriptors was then fed to the Artificial
Neural Networks (ANN) to develop non-linear models. Statistically significant MLR
models, with R² varying from 0.80 to 0.87, were generated based on the different training
and test set selection methods. Training of ANNs with different architectures for the
different training and test selection methods resulted in models with R² values varying
from 0.83 to 0.94, which indicates the high predictive ability of the models.
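As an illustration of the Leave One Out (LOO) internal validation mentioned above, the following sketch fits a multiple linear regression by ordinary least squares and computes a cross-validated R² (often written Q²). It assumes the common PRESS-based definition of Q², which the abstract does not specify; the function names and the toy descriptor values are hypothetical and are not taken from the study.

```python
def fit_mlr(X, y):
    """Ordinary least squares via the normal equations; returns [intercept, b1, ...]."""
    rows = [[1.0] + list(r) for r in X]
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(coef, x):
    """Evaluate the fitted linear model on one descriptor vector."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


def loo_q2(X, y):
    """Leave-one-out cross-validated R^2: 1 - PRESS / total sum of squares."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(y)):
        # Refit the model without molecule i, then predict molecule i.
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, X[i])) ** 2
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - press / ss_tot


# Made-up descriptors and activities that follow y = 1 + 2*x1 + 3*x2 exactly.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [1, 3]]
y = [1 + 2 * a + 3 * b for a, b in X]
q2 = loo_q2(X, y)
print(round(q2, 6))
```

Because the toy data are exactly linear, every held-out prediction is essentially exact and Q² is close to 1; real descriptor data would give lower values.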
Metabolic fingerprint after acute and under sustained consumption of a functional beverage based on grape skin extract in healthy human subjects
Grape-derived polyphenols are considered to be one of the most promising ingredients for functional foods due to their health-promoting activities. We applied a HPLC-MS-based untargeted metabolomic approach in order to evaluate the impact of a functional food based on grape skin polyphenols on the urinary metabolome of healthy subjects. Thirty-one volunteers participated in two dietary crossover randomized intervention studies: with a single-dose intake (187 mL) and with a 15-day sustained consumption (twice per day, 187 mL per day in total) of a functional beverage (FB). Postprandial (4-hour) and 24-hour urine samples collected after acute consumption and on the last day of sustained FB consumption, respectively, were analysed using an untargeted HPLC-qTOF-MS approach. Multivariate modelling with subsequent application of an S-plot revealed differential mass features related to acute and prolonged consumption of FB. More than half of the mass features were shared between the two types of samples, among which several phase II metabolites of grape-derived polyphenols were identified at confidence level II. Prolonged consumption of FB was specifically reflected in urine metabolome by the presence of first-stage microbial metabolites of flavanols: hydroxyvaleric acid and hydroxyvalerolactone derivatives. Overall, several epicatechin and phenolic acid metabolites both of tissular and microbiota origin were the most representative markers of FB consumption. To our knowledge, this is one of the first studies where an untargeted LC-MS metabolomic approach has been applied in nutrition research on a grape-derived FB
Kodex ou comment organiser les résultats d'une recherche d'information par détection de communautés sur un graphe biparti ?
Information Retrieval Systems (IRS) generally display results as a list of documents. One may think that a deeper structure exists within results. This hypothesis is reinforced by the fact that most of the graphs produced from real data (e.g., graphs of documents) share some structural properties, and in particular a community structure. We propose to use these properties to better organize the set of documents returned for a query by a given IRS. For this purpose, the retrieved document set is modeled as a bipartite graph (Documents-Terms) on which the Kodex community detection algorithm is applied. This paper presents Kodex and its evaluation: on the F1 measure, Kodex outperforms the Okapi BM25 baseline by 22%.
Some further studies on improving QFD methodology and analysis
Quality Function Deployment (QFD) starts and ends with the customer. In other words, how it ends may depend largely on how it starts. Any QFD practitioner will start by collecting the voice of the customer, which reflects the customer's needs, so as to make sure that the product will eventually sell or the service will satisfy the customer. On the basis of those needs, a product or service creation process is initiated. It always takes a certain period of time for the product or service to be ready for the customer. The question here is whether those customer needs remain exactly the same during the product or service creation process. The answer is very likely to be "no", especially in today's rapidly changing environment due to increased competition and globalization. The focus of this thesis is on dealing with the change in the relative importance of the customer's needs during the product or service creation process. In other words, the assumption is that no new need is discovered over time and no old one becomes outdated; only the change in relative importance of the existing needs is dealt with. Considering the latest developments in QFD research, especially the increasingly extensive use of the Analytic Hierarchy Process (AHP) in QFD, this thesis aims to enhance the current QFD methodology and analysis, with respect to change during the product or service creation process, so as to continually meet or exceed the needs of the customer. The research work is divided into three main parts, namely, the further use of AHP in QFD, the incorporation of the dynamics of AHP-based priorities in QFD, and decision making analysis with respect to the dynamics. The first part focuses on the question "In what ways does AHP, considering its strength and weakness, contribute to an improved QFD analysis?" The usefulness of AHP in QFD is demonstrated through a case study on improving the quality of higher education at an educational institution.
Furthermore, a generalized model of using AHP in QFD is also proposed. The generalized model not only provides an alternative way to construct the house of quality (HoQ), but also creates the possibility of including other relevant factors in QFD analysis, such as new product development risks. The second part addresses the question "How to use the AHP in QFD in dealing with the dynamics of priorities?" A novel quantitative method to model the dynamics of AHP-based priorities in the HoQ is proposed. The method is simple and time-efficient. It is especially useful when historical data is limited, which is the case in a highly dynamic environment. To further improve QFD analysis, the modeling method is applied in two areas. The first area is to enhance the use of Kano's model in QFD by considering its dynamics. It not only extends the use of Kano's model in QFD, but also advances the academic literature on modeling the life cycle of quality attributes quantitatively. The second area is to enhance the benchmarking part of QFD by including the dynamics of competitors' performance in addition to the dynamics of the customer's needs. The third part deals with the question "How to make decision in a QFD analysis with respect to the dynamics in the house of quality?" Two decision making approaches are proposed to prioritize and/or optimize the technical attributes with respect to the modeling results. Considering the fact that almost every QFD translation process employs the relationship matrix, a guideline for QFD practitioners to decide whether the relationship matrix should be normalized is developed. Furthermore, a practical implication of the research work for the possible use of QFD in helping a company develop more innovative products is also discussed. In brief, the main contribution of this thesis is in providing some novel methods and/or approaches to enhance the use of QFD with respect to change during the product or service creation process.
For the scientific community, this means that existing QFD research has been considerably improved, especially with the use of AHP in QFD. For engineering practice, a better way of doing QFD analysis, as a customer-driven engineering design tool, has been proposed. It is hoped that the research work may provide a first step towards a better customer-driven product or service design process, and eventually increase the possibility of creating more innovative and competitive products or services over time
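For readers unfamiliar with AHP-based priorities, the sketch below shows one standard way to turn a pairwise comparison matrix into a priority vector, using the row geometric mean approximation. This is a generic illustration, not the thesis's modeling method; the function name `ahp_priorities` and the comparison values for three customer needs are hypothetical.

```python
from math import prod


def ahp_priorities(pairwise):
    """Approximate AHP priorities by normalizing the geometric mean of each row."""
    n = len(pairwise)
    geo_means = [prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical, perfectly consistent comparisons of three customer needs:
# need 1 is twice as important as need 2 and six times as important as need 3.
pairwise = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
priorities = ahp_priorities(pairwise)
print([round(p, 3) for p in priorities])
```

For a perfectly consistent matrix like this one, the geometric mean approach recovers the underlying weights (here 0.6, 0.3, and 0.1); with inconsistent judgments it only approximates the principal eigenvector solution.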
Optimization with artificial intelligence in additive manufacturing: a systematic review
In situations requiring high levels of customization and limited production volumes, additive manufacturing (AM) is a frequently utilized technique with several benefits. To properly configure all the parameters required to produce final goods of the utmost quality, AM calls for qualified designers and experienced operators. This research demonstrates how, in this scenario, artificial intelligence (AI) could significantly enable designers and operators to enhance additive manufacturing. Thus, 48 papers have been selected from the comprehensive collection of research using a systematic literature review to assess the possibilities that AI may bring to AM. This review aims to better understand the current state of AI methodologies that can be applied to optimize AM technologies and the potential future developments and applications of AI algorithms in AM. Through a detailed discussion, it emerges that AI might increase the efficiency of the procedures associated with AM, from simulation optimization to in-process monitoring
Climate Change, Gene Flow, and the Legendary Synchrony of Snowshoe Hares
In recent decades, climate change has been invoked in the apparent collapse of some of the best-known examples of cyclic and synchronous population dynamics among boreal species. Simultaneously, some studies have predicted that as species' ranges shift poleward and southern habitats fragment in response to climate change, we will lose the southern glacial refugial populations that have historically harbored species' highest genetic diversity and uniqueness. I investigated how climate change and habitat fragmentation may impact genetic and population dynamic processes for the snowshoe hare (Lepus americanus), a species historically recognized as a key driver of North American boreal community dynamics.
I collected >1000 genetic samples and >300 time series from 175 cooperators in 30 U.S. states and Canadian provinces and territories. Based on analyses of nuclear and mitochondrial DNA, I identified three highly divergent groups of snowshoe hares in the Boreal, Pacific Northwest, and Southern Rockies regions of North America. I found high genetic diversity in mid-range (Boreal) hare populations, and high genetic uniqueness but lower diversity in the species' southern range (Pacific Northwest and Rockies). If southern populations decline due to climate change, snowshoe hares may still retain high genetic diversity, but will lose many alleles currently unique to southern populations.
In a simulation study comparing five synchrony metrics, I found the Kendall metric performed best with short, noisy time series similar to those available for snowshoe hares. I used this metric in partial Mantel tests, modified correlograms, and shifting window analyses of hare synchrony patterns. Confirming long-held but previously untested assumptions, I found northern hare populations are significantly synchronized at distances up to several thousand kilometers, while southern populations are not significantly synchronized at any of the distance classes evaluated. I found that historical patterns of synchrony still persist for snowshoe hares, in contrast to reports for some other synchronous species. Hare synchrony patterns clustered into groups defined according to genetic criteria--but not ecoregions or climatic regions--highlighting the importance of dispersal and population connectivity in snowshoe hare synchrony
Layered Modeling and Simulation of Complex Biotechnological Processes - Optimizing Rhamnolipid Production by Pseudomonas aeruginosa during Cultivation in a Bioreactor
In this thesis, a model for the regulation of rhamnolipid production and data obtained from metabolic balancing were combined with a process model on a bioreactor scale. The model was used to derive an optimized process control strategy for enhanced product formation. This thesis provides a missing piece in a puzzle for knowledge-based strategies for enhanced rhamnolipid formation