SimFusion: A Unified Similarity Measurement Algorithm for Multi-Type Interrelated Web Objects
In this paper, we use a Unified Relationship Matrix (URM) to represent a set of heterogeneous web objects (e.g., web pages, queries) and their interrelationships (e.g., hyperlink, user click-through relationships). We claim that iterative computations over the URM can help overcome the data sparseness problem (a common situation on the Web) and detect latent relationships among heterogeneous web objects, and thus can improve the quality of various information applications that require the combination of information from heterogeneous sources. To support our claim, we further propose a unified similarity-calculating algorithm, the SimFusion algorithm. By iteratively computing over the URM, the SimFusion algorithm can effectively integrate relationships from heterogeneous sources when measuring the similarity of two web objects. Experiments based on a real search engine query log and a large real web page collection demonstrate that the SimFusion algorithm can significantly improve similarity measurement of web objects over both traditional content-based similarity-calculating algorithms and the cutting-edge SimRank algorithm.
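The abstract above does not state the update rule, so the following is only a minimal sketch of what an iterative similarity computation over a relationship matrix could look like. It assumes a row-normalized matrix L, an identity starting similarity, and the update S ← L·S·Lᵀ; the function names and the toy matrix are hypothetical illustrations, not the authors' code.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def row_normalize(m):
    """Scale each row to sum to 1 (rows of zeros are left as zeros)."""
    out = []
    for row in m:
        total = sum(row)
        out.append([x / total if total else 0.0 for x in row])
    return out


def iterate_similarity(urm, iterations=5):
    """Assumed update: S <- L * S * L^T, starting from the identity matrix."""
    L = row_normalize(urm)
    Lt = transpose(L)
    n = len(L)
    S = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iterations):
        S = matmul(matmul(L, S), Lt)
    return S


# Hypothetical toy matrix over four objects (two pages, two queries);
# a nonzero entry means "these two objects are related".
toy_urm = [
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [0, 1, 0, 1],
]
similarity = iterate_similarity(toy_urm, iterations=3)
```

In this sketch, objects with identical relationship rows (the first and third rows of the toy matrix) end up with identical similarity rows, which illustrates how relationships of different types can propagate into a single similarity score.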
Differential requirement for Dab2 in the development of embryonic and extra-embryonic tissues
BACKGROUND: Disabled-2 (Dab2) is an endocytic adaptor protein involved in clathrin-mediated endocytosis and cargo trafficking. Since its expression is lost in several cancer types, Dab2 has been suggested to be a tumor suppressor. In vitro studies indicate that Dab2 establishes epithelial cell polarity and organization by directing endocytic trafficking of membrane glycoproteins. Dab2 also modulates cellular signaling pathways by mediating the endocytosis and recycling of surface receptors and associated signaling components. Previously, two independent gene knockout studies have been reported, with some discrepancies in the observed embryonic phenotypes. To further clarify the in vivo roles of Dab2 in development and physiology, we designed a new floxed allele to delete the dab2 gene. RESULTS: The constitutive dab2-deleted embryos showed a spectrum in the degree of endoderm disorganization at E5.5, and no mutant embryos persisted at E9.5. However, the mice were grossly normal when dab2 deletion was restricted to the embryo proper and the gene was retained in extraembryonic tissues using Meox2-Cre and Sox2-Cre. Adult Dab2-deficient mice had a small but statistically significant increase in serum cholesterol levels. CONCLUSION: The study of the new dab2 mutant allele in embryos and embryoid bodies confirms a role for Dab2 in extraembryonic endoderm development and epithelial organization. Experimental results with embryoid bodies suggest that additional endocytic adaptors such as Arh and Numb could partially compensate for Dab2 loss. Conditional deletion indicates that Dab2 is dispensable for organ development, even when the vast majority of the embryonic cells are dab2 null. However, Dab2 has a physiological role in the endocytosis of lipoproteins and cholesterol metabolism.
Combining multiple sources of evidence for information retrieval
This study explores the use of logistic regression for combining multiple sources of evidence and different weighting schemes to improve retrieval performance. The effectiveness of logistic regression is compared with that of manually combined methods. Master of Engineering (SCE
Machine Learning Approach for Homepage Finding Task
This paper describes new machine learning approaches to predict the correct homepage in response to a user's homepage finding query. This involves two phases. In the first phase, a decision tree is generated to predict whether a URL is a homepage URL or not. The decision tree is then used to filter out non-homepages from the webpages returned by a standard vector space IR system. In the second phase, a logistic regression analysis is used to combine multiple sources of evidence on the remaining webpages to predict which homepage is most relevant to a user's query. 100 queries are used to train the logistic regression model and another 145 testing queries are used to evaluate the derived model. Our results show that about 84% of the testing queries had the correct homepage returned within the top 10 pages. This shows that our machine learning approaches are effective, since without any machine learning approaches only 59% of the testing queries had their correct answers returned within the top 10 hits.
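The two-phase pipeline described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' system: `looks_like_homepage` is a hand-written stand-in for the learned decision tree, and the feature vectors, weights, and bias passed to the logistic model are hypothetical placeholders rather than trained values.

```python
import math
from urllib.parse import urlparse


def looks_like_homepage(url):
    """Phase 1 stand-in: a hand-written rule on URL shape.

    The abstract describes a learned decision tree; this fixed rule is
    only a hypothetical substitute for illustration.
    """
    path = urlparse(url).path
    segments = [s for s in path.split("/") if s]
    if not segments:
        return True  # bare host, e.g. http://example.org/
    if len(segments) == 1:
        name = segments[0].lower()
        return path.endswith("/") or name.startswith(("index.", "home.", "default."))
    return False


def logistic_score(features, weights, bias):
    """Phase 2: combine several evidence scores into one probability."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def rank_homepages(candidates, weights, bias, top_k=10):
    """candidates: list of (url, feature_vector) from a first-pass retrieval run."""
    kept = [(url, f) for url, f in candidates if looks_like_homepage(url)]
    scored = [(logistic_score(f, weights, bias), url) for url, f in kept]
    scored.sort(reverse=True)
    return [url for _, url in scored[:top_k]]


# Hypothetical candidates, each with two made-up evidence scores.
candidates = [
    ("http://example.org/", [0.2, 0.9]),
    ("http://example.org/papers/2001/a.html", [0.9, 0.9]),
    ("http://example.org/dept/", [0.5, 0.1]),
]
ranking = rank_homepages(candidates, weights=[1.0, 2.0], bias=-1.0)
```

Filtering before scoring means the logistic model only has to discriminate among plausible homepages, so a deep page with strong content evidence (the second candidate) never reaches the ranking step.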
Simfusion: measuring similarity using unified relationship matrix
In this paper we use a Unified Relationship Matrix (URM) to represent a set of heterogeneous data objects (e.g., web pages, queries) and their interrelationships (e.g., hyperlinks, user clickthrough sequences). We claim that iterative computations over the URM can help overcome the data sparseness problem and detect latent relationships among heterogeneous data objects, and thus can improve the quality of information applications that require combination of information from heterogeneous sources. To support our claim, we present a unified similarity-calculating algorithm, SimFusion. By iteratively computing over the URM, SimFusion can effectively integrate relationships from heterogeneous sources when measuring the similarity of two data objects. Experiments based on a web search engine query log and a web page collection demonstrate that SimFusion can improve similarity measurement of web objects over both traditional content-based algorithms and the cutting-edge SimRank algorithm.
Endocytosis and Physiology: Insights from Disabled-2 Deficient Mice
Disabled-2 (Dab2) is a clathrin- and cargo-binding endocytic adaptor protein, and cell biology studies have revealed that Dab2 plays a role in cellular trafficking of a number of transmembrane receptors and signaling proteins. A PTB/PID domain located in the N-terminus of Dab2 binds the NPXY motif(s) present at the cytoplasmic tails of certain transmembrane proteins/receptors. The membrane receptors reported to bind directly to Dab2 include LDL receptor and its family members LRP1 and LRP2 (megalin), growth factor receptors EGFR and FGFR, and the cell adhesion receptor beta1 integrin. Dab2 also serves as an adaptor in signaling pathways. In particular, Dab2 facilitates the endocytosis of the Ras-activating Grb2/Sos1 signaling complex, controls its disassembly, and thereby regulates the Ras/MAPK signaling pathway. Cellular analyses have suggested several diverse functions for this widely expressed protein, and Dab2 is also considered a tumor suppressor, as loss or reduced expression is found in several cancer types. Dab2 null mutant mice were generated and investigated to determine whether the findings from cellular studies might be important and relevant in intact animals. Dab2 conditional knockout mice mediated through a Sox2-Cre transgene have no obvious developmental defects and have a normal life span even though the Dab2 protein is essentially absent in the mutant mice. The conditional knockout mice were grossly normal, though more recent investigation of the Dab2-deficient mice revealed several phenotypes, which can be accounted for by several previously suggested mechanisms. The studies of mutant mice established that Dab2 plays multiple physiological roles through its endocytic functions and modulation of signal pathways.
Hormonal induction and roles of Disabled-2 in lactation and involution.
Disabled-2 (Dab2) is a widely expressed endocytic adaptor that was first isolated as a 96 kDa phospho-protein, p96, involved in MAPK signal transduction. Dab2 expression is lost in several cancer types including breast cancer, and Dab2 is thought to have a tumor suppressor function. In mammary epithelia, Dab2 was induced upon pregnancy and further elevated during lactation. We constructed mutant mice with a mosaic Dab2 gene deletion to bypass early embryonic lethality and to investigate the roles of Dab2 in mammary physiology. Loss of Dab2 had subtle effects on lactation, but Dab2-deficient mammary glands showed a strikingly delayed cell clearance during involution. In primary cultures of mouse mammary epithelial cells, Dab2 proteins were also induced by estrogen, progesterone, and/or prolactin. Dab2 null mammary epithelial cells were refractory to growth suppression induced by TGF-beta. However, Dab2 deletion did not affect Smad2 phosphorylation; rather, TGF-beta-stimulated MAPK activation was enhanced in Dab2-deficient cells. We conclude that Dab2 expression is induced by hormones and that Dab2 plays a role in modulating TGF-beta signaling to enhance apoptotic clearance of mammary epithelial cells during involution.
Can We Get A Better Retrieval Function From Machine?
The quality of an information retrieval system depends heavily on its retrieval function, which returns a similarity measurement between the query and each document in the collection. Documents are sorted according to their similarity values with the query, and those with high rank are assumed to be relevant. Okapi BM25 and its variations are very popular retrieval functions and seem to be the default retrieval function for the IR research community; there are also many other widely used and well-studied functions, for example, Pivoted TFIDF and INQUERY. Most of the retrieval functions in use today are based on probabilistic theories and are adjusted in the real world according to different contexts and information needs. In this paper, we propose the idea that a good retrieval function can be discovered by a pure machine learning approach, without using probabilistic theories and knowledge-based techniques. Two machine learning algorithms, Support Vector Machine (SVM) and Genetic Programming (GP), are used for retrieval function discovery, and GP is found to be the more effective approach. The retrieval functions discovered by GP might be hard for humans to interpret, but their performance is superior to that of Okapi BM25, one of the most popular functions. The new retrieval function is combined with query expansion techniques and the retrieval performance is improved significantly. Based on our observations in the empirical study, the GP function is more reliable and effective than Okapi BM25 when query expansion techniques are used.
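For readers unfamiliar with the baseline named in this abstract, here is a minimal sketch of one common Okapi BM25 variant. The IDF form with the `1 +` inside the logarithm and the defaults `k1=1.2`, `b=0.75` are assumptions chosen for illustration; the abstract does not say which variant or parameter settings the study used, and the toy documents are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with a BM25 variant.

    docs is a list of token lists. k1 and b are conventional defaults,
    not values taken from the abstract.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log(1 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[t] * (k1 + 1) / (tf[t] + length_norm)
        scores.append(score)
    return scores


# Hypothetical toy collection of three tokenized documents.
docs = [
    ["web", "page", "similarity"],
    ["protein", "endocytosis"],
    ["web", "search", "query", "log"],
]
scores = bm25_scores(["web", "query"], docs)
```

Sorting documents by these scores in descending order gives the ranking that a learned retrieval function, such as the GP-discovered one described above, would be compared against.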