25 research outputs found

    A statistical test vs. a validation experiment in Gene Expression Study

    Report published in the Proceedings of the National Conference on "Education and Research in the Information Society", Plovdiv, May 2014. The comparative CT method compares the Ct value of one target gene to another using the formula 2^(-ΔΔCT). For this method to be valid, the efficiency of the target amplification (the gene of interest) and the efficiency of the reference amplification (the endogenous control) must be equal. In this article we propose testing statistical hypotheses instead of performing biological validation experiments when we want to show that the efficiencies of the target and endogenous control amplifications are approximately equal. Association for the Development of the Information Society; Institute of Mathematics and Informatics, Bulgarian Academy of Sciences; Plovdiv University "Paisii Hilendarski".
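    As a rough illustration of the 2^(-ΔΔCT) formula mentioned in this abstract, the following minimal sketch computes a relative expression level from four Ct values. The function name and argument names are hypothetical, and the base 2 assumes that both amplifications roughly double the product in each cycle, which is the equal-efficiency condition the abstract refers to.

```python
def fold_change(ct_target_sample, ct_ref_sample,
                ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the comparative Ct method, 2^(-ddCt).

    Hypothetical helper for illustration; not taken from the report.
    """
    # dCt: target Ct normalised to the endogenous control, per condition
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    # ddCt: treated sample relative to the calibrator (untreated) sample
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** (-dd_ct)
```

    With a sample whose target gene crosses the threshold two cycles earlier than in the calibrator, at identical reference Ct values, the sketch returns a fourfold relative expression.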

    Canonical correlation analysis and DEA for azorean agriculture efficiency

    In this paper we document the application of canonical correlation analysis to variable aggregation using the correlations of the original variables with the canonical variates. A case study about farms on Terceira Island, with a small data set, is presented. In this data set of 30 farms we intend to use 17 input variables and 2 output variables to measure DEA efficiency. Without any data reduction procedure, several problems known as the “curse of dimensionality” are expected. With the suggested data reduction procedures it was possible to reach quite acceptable and domain-consistent conclusions.

    Azorean agriculture efficiency by PAR

    Producers always aspire to increase the efficiency of their production process. However, they do not always succeed in optimizing their production. In recent years, interest in Data Envelopment Analysis (DEA) as a powerful tool for measuring efficiency has increased. This is due to the large number of data sets collected to better understand the phenomena under study and, at the same time, to the need for timely and inexpensive information. The “Productivity Analysis with R” (PAR) framework establishes a user-friendly data envelopment analysis environment with special emphasis on variable selection and aggregation, and on summarization and interpretation of the results. The starting point is the following R packages: DEA (Diaz-Martinez and Fernandez-Menendez, 2008) and FEAR (Wilson, 2007). The DEA package implements some of the Data Envelopment Analysis models presented in (Cooper et al., 2007). FEAR is a software package for computing nonparametric efficiency estimates and testing hypotheses in frontier models. FEAR implements the bootstrap methods described in (Simar and Wilson, 2000). PAR is a software framework that uses a portfolio of models for efficiency estimation and also provides functionality for explaining results. The PAR framework has been developed to distinguish between efficient and inefficient observations and to explicitly advise producers about possibilities for production optimization. The PAR framework offers several R functions for a reasonable interpretation of the data analysis results and a text presentation of the obtained information. The output of an efficiency study with the PAR software is self-explanatory. We are applying the PAR framework to estimate the efficiency of the agricultural system in the Azores (Mendes et al., 2009). All Azorean farms will be clustered into homogeneous groups according to their efficiency measurements to define clusters of “good” practices and clusters of “less good” practices. This makes PAR appropriate for supporting public policies in the agriculture sector in the Azores.

    Knowledge Presentation and Reasoning with Loglinear Models

    Our approach to knowledge presentation is based on the idea of an expert system shell. First we build a graph shell of both possible dependencies and possible actions. Then, reasoning by means of loglinear models, we activate some nodes and some directed links. In this way a Bayesian network and networks presenting loglinear models are generated.

    Extraction of Fraud Schemes from Trade Series

    2000 Mathematics Subject Classification: 62H30, 62M10, 62M20, 62P20, 94A13. Patterns of fraudulent activity in trade are very often hidden within existing trade data time series. Furthermore, with the advent of powerful and affordable computing hardware, relatively large amounts of available trade data can be analyzed quickly with a view to assisting anti-fraud investigations in this field. In this paper, based on the availability of such import/export data series, we present a statistical method for identifying potential fraud schemes by extracting and highlighting those cases which lend themselves to further investigation by anti-fraud domain experts. The proposed method consists of applying time series analysis for prediction purposes, calculating the resulting significant deviations, and finally clustering time series with similar patterns together, thus identifying suspect or abnormal cases.
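    The first two steps of the method described in this abstract (prediction, then significant deviations) can be sketched roughly as follows. This is not the authors' procedure: a plain moving-average forecast stands in for their time series model, the threshold of 2.0 is an arbitrary assumption, the clustering step is omitted, and all function and parameter names are hypothetical.

```python
from statistics import mean, stdev


def deviation_scores(series, window=3):
    """Scaled one-step forecast errors from a moving-average forecast.

    The moving average is a stand-in for a real forecasting model.
    """
    errors = [series[t] - mean(series[t - window:t])
              for t in range(window, len(series))]
    scale = stdev(errors) if len(errors) > 1 else 0.0
    if scale == 0.0:
        # Perfectly predictable (or too short) series: nothing to flag
        return [0.0 for _ in errors]
    return [e / scale for e in errors]


def flag_series(trade_series, window=3, threshold=2.0):
    """Map each series name to the time indices with large deviations."""
    flagged = {}
    for name, series in trade_series.items():
        scores = deviation_scores(series, window)
        hits = [i + window for i, z in enumerate(scores) if abs(z) > threshold]
        if hits:
            flagged[name] = hits
    return flagged
```

    Series that end up flagged would then be candidates for the clustering and expert review that the abstract describes.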

    A software framework for measuring efficiency.

    Paper presented at the UseR Conference, Rennes, France, July 8 to 10, 2009. "[...]. PAR is a software framework using a variety of models estimating efficiency and providing results explanation functionality. [...]. We are applying PAR framework to estimate the efficiency of the agricultural system in Azores [Mendes et al., 2009]. All Azorean farms will be clustered into homogeneous groups according to their efficiency measurements to define clusters of "good" practices and cluster of "less good" practices. This makes PAR appropriate to support public policies in agriculture sector in Azores. [...]" This work has been partially supported by the Regional Directorate for Science and Technology of the Azores Government through the project M.2.1.2/l/009/2008, "Productivity Analysis of Azorean Cattle-Breeding Farms with R Statistical Software".

    An Approach to Variable Aggregation in Efficiency Analysis

    In the nonparametric framework of Data Envelopment Analysis (DEA), the statistical properties of the estimators have been investigated, but only asymptotic results are available. For DEA estimators, results of practical use have been proved only for the case of one input and one output. However, in real-world problems the production process is usually described well only by many variables. In this paper a machine learning approach to variable aggregation based on Canonical Correlation Analysis is presented. This approach is applied to efficiency estimation for all the farms on Terceira Island in the Azorean archipelago.
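    To make the one-input, one-output case mentioned in this abstract concrete, here is a minimal sketch under an added assumption of constant returns to scale. In that setting the DEA score of a unit reduces to its output/input ratio divided by the best ratio in the sample. The function name is hypothetical, and the sketch covers neither the multi-variable case nor the aggregation method of the paper.

```python
def crs_efficiency_single(inputs, outputs):
    """Constant-returns-to-scale efficiency for one input and one output.

    Each unit's productivity (output / input) is compared with the best
    productivity observed in the sample, so scores lie in (0, 1].
    Assumes strictly positive inputs and outputs.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]
```

    Units scoring 1.0 lie on the estimated frontier; a score of 0.5 means the unit produces half the output per unit of input that the best unit does.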

    Canonical Correlation Analysis in Variable Aggregation in DEA.

    14th APDIO Congress, September 7 to 9, 2009, Faculdade de Ciências e Tecnologia, Caparica. ABSTRACT: In this paper we document the application of canonical correlation analysis to variable aggregation in DEA, using the correlations of the original variables with the extracted canonical variates. A case study about farms on Terceira Island, with a small data set, is presented. In this data set of 30 farms we intend to use 17 input variables and 2 output variables to measure DEA efficiency. Without any data reduction procedure, several problems known as the "curse of dimensionality" would be expected. With the suggested data reduction procedures it was possible to obtain reasonable results consistent with current domain knowledge.

    An Approach to Variable Aggregation in Efficiency Analysis

    Conference: The paper was selected from the International Conference "Classification, Forecasting, Data Mining" (CFDM 2009), Varna, Bulgaria, June-July 2009. In the nonparametric framework of Data Envelopment Analysis (DEA), the statistical properties of the estimators have been investigated, but only asymptotic results are available. For DEA estimators, results of practical use have been proved only for the case of one input and one output. However, in real-world problems the production process is usually described well only by many variables. In this paper a machine learning approach to variable aggregation based on Canonical Correlation Analysis is presented. This approach is applied to efficiency estimation for all the farms on Terceira Island in the Azorean archipelago.

    Decision support for enhanced productivity with R software: an Azorean farms case study.

    38th Annual Meeting of the Western Decision Sciences Institute (WDSI 2009), Kauai, Hawaii, United States, 11 April 2009. The Azores is a Portuguese insular territory where the main economic activity is dairy and meat farming. Dairy policy depends on the Common Agricultural Policy of the European Union and is limited by quotas. On top of that, the processing sector has implemented a program for penalising the worst-quality agricultural raw materials. The current historical context is particularly complex, as some major changes are likely to occur. These include the increase in prices of some food products in international markets and, locally, the end of the milk quota system. The multiplying effect of agriculture on both a small economy and Azorean society makes this kind of work of major interest, not only to protect the income of farmers but also to keep society in equilibrium on employment matters and to reduce immigration cycles. In this context, decision makers need information and knowledge for deciding the best policies for promoting quality and best practices. So, in this project we apply benchmarking methodologies to estimate the efficiency of the agricultural system in the Azores. We also propose to identify the inefficient units and to delineate action plans for correcting the production or organizational problems identified. The data analysis will be possible using nonparametric methods such as data envelopment analysis (DEA). We develop a new data-driven methodology, called PAR (Productivity Analysis with R), which combines DEA with a statistical technique needed for analysing a reduced number of farms. All farms on Terceira (the second biggest island) are analysed according to their efficiency measurements to define groups of “good” practices and groups of “less good” practices. This makes the system appropriate for supporting public policies in the agriculture sector in the Azores.
The decision makers we intend to support are at two different levels: farmers or services responsible for agricultural improvement, and political decision makers. These two types of decision makers need information that is very specific and concrete in the first case and much more aggregated and general in the second. The data analysis methods we are using can support the needs of both types of decision makers, but the software interface must be specifically designed. The PAR project is designed to provide a bridge from mathematical models to productivity studies using the R statistical software. Several DEA models are described in the literature. Some of them are implemented as functions in the R statistical software, and these are being used for the PAR system. The authors have already carried out some work on restricted data sets for the dairy sector in the Azores using different approaches. We use these data and results to validate and correct the software system we are developing. The R statistical software is not very user-friendly. Much programming is needed to make the output of the PAR computer program self-explanatory and easily understandable. This work has been partially supported by Direccao Regional da Ciencia e Tecnologia of Azores Government through the project M.2.1.2/l/009/2008.