472 research outputs found

    Learning mixtures of separated nonspherical Gaussians

    Mixtures of Gaussian (or normal) distributions arise in a variety of application areas. Many heuristics have been proposed for the task of finding the component Gaussians given samples from the mixture, such as the EM algorithm, a local-search heuristic from Dempster, Laird and Rubin [J. Roy. Statist. Soc. Ser. B 39 (1977) 1-38]. These do not provably run in polynomial time. We present the first algorithm that provably learns the component Gaussians in time that is polynomial in the dimension. The Gaussians may have arbitrary shape, but they must satisfy a "separation condition" which places a lower bound on the distance between the centers of any two component Gaussians. The mathematical results at the heart of our proof are "distance concentration" results--proved using isoperimetric inequalities--which establish bounds on the probability distribution of the distance between a pair of points generated according to the mixture. We also formalize the more general problem of max-likelihood fit of a Gaussian mixture to unstructured data. Comment: Published at http://dx.doi.org/10.1214/105051604000000512 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org
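
    The "distance concentration" idea mentioned in this abstract can be illustrated empirically. The sketch below is only an illustration under simplifying assumptions and is not the paper's algorithm or proof: it uses a single spherical Gaussian (the paper covers nonspherical ones), and the function names `squared_distance`, `sample_spherical_gaussian` and `distance_stats` are hypothetical. For two independent points from N(0, sigma^2 I) in dimension d, the squared distance has mean 2 d sigma^2, and its relative spread shrinks as d grows.

```python
import random
import statistics


def squared_distance(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def sample_spherical_gaussian(dim, sigma, rng):
    """Draw one point from N(0, sigma^2 I) in `dim` dimensions."""
    return [rng.gauss(0.0, sigma) for _ in range(dim)]


def distance_stats(dim, sigma=1.0, pairs=200, seed=0):
    """Mean and standard deviation of the squared distance over random pairs."""
    rng = random.Random(seed)
    values = [
        squared_distance(
            sample_spherical_gaussian(dim, sigma, rng),
            sample_spherical_gaussian(dim, sigma, rng),
        )
        for _ in range(pairs)
    ]
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    for dim in (10, 100, 1000):
        mean, spread = distance_stats(dim)
        # First ratio should be near 1; second ratio shrinks as dim grows.
        print(dim, round(mean / (2 * dim), 3), round(spread / mean, 3))
```

    In high dimension, pairs drawn from the same component therefore sit at nearly the same distance from each other, which is the kind of regularity a separation condition between centers can exploit.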

    Computing a Nonnegative Matrix Factorization -- Provably

    In the Nonnegative Matrix Factorization (NMF) problem we are given an $n \times m$ nonnegative matrix $M$ and an integer $r > 0$. Our goal is to express $M$ as $AW$ where $A$ and $W$ are nonnegative matrices of size $n \times r$ and $r \times m$ respectively. In some applications, it makes sense to ask instead for the product $AW$ to approximate $M$ -- i.e. (approximately) minimize $\|M - AW\|_F$ where $\|\cdot\|_F$ denotes the Frobenius norm; we refer to this as Approximate NMF. This problem has a rich history spanning quantum mechanics, probability theory, data analysis, polyhedral combinatorics, communication complexity, demography, chemometrics, etc. In the past decade NMF has become enormously popular in machine learning, where $A$ and $W$ are computed using a variety of local search heuristics. Vavasis proved that this problem is NP-complete. We initiate a study of when this problem is solvable in polynomial time: 1. We give a polynomial-time algorithm for exact and approximate NMF for every constant $r$. Indeed NMF is most interesting in applications precisely when $r$ is small. 2. We complement this with a hardness result, that if exact NMF can be solved in time $(nm)^{o(r)}$, 3-SAT has a sub-exponential time algorithm. This rules out substantial improvements to the above algorithm. 3. We give an algorithm that runs in time polynomial in $n$, $m$ and $r$ under the separability condition identified by Donoho and Stodden in 2003. The algorithm may be practical since it is simple and noise tolerant (under benign assumptions). Separability is believed to hold in many practical settings. To the best of our knowledge, this last result is the first example of a polynomial-time algorithm that provably works under a non-trivial condition on the input and we believe that this will be an interesting and important direction for future work. Comment: 29 pages, 3 figure
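
    The Approximate NMF objective described in this abstract can be made concrete with a small sketch. This only evaluates the Frobenius error $\|M - AW\|_F$ for a given nonnegative pair $(A, W)$; it does not implement any of the paper's algorithms, and the helper names `matmul`, `is_nonnegative` and `frobenius_error` as well as the toy matrices are assumptions made for illustration.

```python
import math


def matmul(A, W):
    """Multiply an n x r matrix A by an r x m matrix W (lists of rows)."""
    r = len(W)
    if any(len(row) != r for row in A):
        raise ValueError("inner dimensions of A and W do not match")
    m = len(W[0])
    return [[sum(row[k] * W[k][j] for k in range(r)) for j in range(m)]
            for row in A]


def is_nonnegative(X):
    """True if every entry of X is >= 0."""
    return all(value >= 0 for row in X for value in row)


def frobenius_error(M, A, W):
    """Return ||M - AW||_F, rejecting factors with negative entries."""
    if not (is_nonnegative(A) and is_nonnegative(W)):
        raise ValueError("A and W must be entrywise nonnegative")
    P = matmul(A, W)
    return math.sqrt(sum((m_ij - p_ij) ** 2
                         for m_row, p_row in zip(M, P)
                         for m_ij, p_ij in zip(m_row, p_row)))


# Toy instance with n = 3, m = 3, r = 2 (illustrative values only).
A = [[1, 0], [0, 1], [1, 1]]
W = [[1, 2, 0], [0, 1, 3]]
M_exact = matmul(A, W)                  # exact NMF: error is zero
M_noisy = [row[:] for row in M_exact]
M_noisy[0][0] += 1                      # perturb one entry by 1
```

    Here `frobenius_error(M_exact, A, W)` is zero because the factorization is exact, while the single perturbed entry in `M_noisy` gives an error of 1. In this toy instance the first two rows of `M_exact` coincide with the rows of `W`, which loosely resembles the structure that a separability condition describes.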

    A Detailed Dominant Data Mining Approach for Predictive Modeling of Social Networking Data using WEKA

    Social networks have grown manifold in popularity over the last decade. Accessing social network sites such as Twitter, Facebook, LinkedIn and Google+ through the internet and Web 2.0 technologies has become more affordable. People are increasingly interested in and reliant on social networks for information, news and the opinions of other users on diverse subjects. In this paper, we present the first comprehensive review of social and computer science literature on trust in social networks. We first review the existing definitions of trust and define social trust in the context of social networks. Web-based social networks have become popular as a medium for disseminating information and connecting like-minded people. The public accessibility of such networks, with the ability to share opinions, thoughts, information and experience, offers great promise to enterprises and governments. As these networks have grown in popularity and become widely used as an important source of news, people have become more cautious about judging the trustworthiness of the information disseminated through social media for various reasons. For this reason, knowing the factors that influence trust in social media content has become very important. In this research paper, we use a survey as a mechanism to study trust in social networks. First, we prepared a questionnaire that focuses on measuring the ways in which social network users determine whether content is true or not; we then analyzed the responses of the individuals who participated in the survey and discussed the results in a focus group session. The responses from the survey and the focus group were then used as a dataset for modeling trust, which incorporates factors that alter trust determination. After preprocessing, a total of 56 records were used for building the models.
This paper applies the Decision Tree, Bayesian Classifier and Neural Network predictive data mining techniques to significant social media factors for predicting trust. To accomplish this goal, the WEKA data mining tool is used to evaluate the J48, Naïve Bayes and Multilayer Perceptron algorithms; different experiments were made by adjusting the attributes and using various numbers of attributes in order to come up with a purposeful output

    Distributive Distillation Enabled by Microchannel Process Technology

    The application of microchannel technology for distributive distillation was studied to achieve the Grand Challenge goals of 25% energy savings and 10% return on investment. In Task 1, a detailed study was conducted and two distillation systems were identified that would meet the Grand Challenge goals if the microchannel distillation technology was used. Material and heat balance calculations were performed to develop process flow sheet designs for the two distillation systems in Task 2. The process designs were focused on two methods of integrating the microchannel technology: 1) integrating microchannel distillation into an existing conventional column, and 2) microchannel distillation for new plants. A design concept for a modular microchannel distillation unit was developed in Task 3. In Task 4, Ultrasonic Additive Machining (UAM) was evaluated as a manufacturing method for microchannel distillation units. However, it was found that significant development work would be required to develop process parameters to use UAM for commercial distillation manufacturing. Two alternate manufacturing methods were explored. Both manufacturing approaches were experimentally tested to confirm their validity. The conceptual design of the microchannel distillation unit (Task 3) was combined with the manufacturing methods developed in Task 4 and the flowsheet designs in Task 2 to estimate the cost of the microchannel distillation unit, and this was compared to a conventional distillation column. The best results were for a methanol-water separation unit for use in a biodiesel facility. For this application microchannel distillation was found to be more cost effective than a conventional system and capable of meeting the DOE Grand Challenge performance requirements

    Antimicrobial activity of seed, pomace and leaf extracts of sea buckthorn (Hippophae rhamnoides L.) against foodborne and food spoilage pathogens

    The present study was conducted to evaluate the total phenolic content (TPC) and antibacterial properties of crude extracts of sea buckthorn (Hippophae rhamnoides L.) pomace, seeds and leaves against 17 foodborne pathogens. The methanolic extract of leaves exhibited a high total phenolic content (278.80 mg GAE/g extract) and had a low minimum inhibitory concentration (MIC) value of 125 μg/ml against Listeria monocytogenes. The Salmonella typhimurium strain was found to be resistant to all tested extracts. The antilisterial activity of the methanolic extract of leaves was tested on carrots. Bacterial enumeration was significantly reduced by 0.15 to 0.31, 0.26 to 1.72 and 0.59 to 4.10 log cfu/g after 0 to 60 min exposure when treated with 125, 2500 and 5000 μg/ml extract, respectively. Thus, in addition to its use as a functional food ingredient, leaf extract from sea buckthorn (SBT) can possibly be used as a biosanitizer in food industries. Key words: Antimicrobial activity, Hippophae, Listeria monocytogenes, natural sanitizer, seabuckthorn

    The Next Generation Non-competitive Active Polyester Nanosystems for Transferrin Receptor-mediated Peroral Transport Utilizing Gambogic Acid as a Ligand

    The current methods for targeted drug delivery utilize ligands that must out-compete endogenous ligands in order to bind to the active site facilitating the transport. To address this limitation, we present a non-competitive active transport strategy to overcome intestinal barriers in the form of tunable nanosystems (NS) for transferrin receptor (TfR) utilizing gambogic acid (GA), a xanthanoid, as its ligand. The NS made using GA-conjugated poly(lactide-co-glycolide) (PLGA) showed non-competitive affinity to TfR when evaluated in cell and cell-free systems. The fluorescent PLGA-GA NS exhibited significant intestinal transport and an altered distribution profile compared to PLGA NS in vivo. The PLGA-GA NS loaded with cyclosporine A (CsA), a model peptide, upon peroral dosing to rodents led to a maximum plasma concentration of CsA at 6 h, as opposed to 24 h with PLGA NS, with at least 2-fold higher levels in brain at 72 h. The proposed approach offers new prospects for peroral drug delivery and beyond