40 research outputs found
Children exposed to environmentally friendly farming
Effective energy consumption prediction is important for determining the demand and supply of energy. The challenge lies in how to predict energy consumption. This study presents an energy consumption analytical regression model and process based on a project conducted in an Australian company, involving the analysis of household and energy consumption datasets in the residential sector. The analytical model generation process is organised into four major stages: preparing and cleansing the household and energy consumption data; clustering household energy consumption (segmentation into groups) using the k-means algorithm to measure similarity in household characteristics; selecting variables via stepwise multiple regression to determine the final model's predictors; and filtering the final regression model by identifying influential observations using Cook's distance and a Q-Q (quantile-quantile) normal plot to improve the model. In the final filtered regression model, the independent variables explain 64 percent of the variation in the dependent variable, with a correlation of 0.8 between observed and predicted energy consumption values. The process and resulting regression model appear useful for developing household energy consumption models for managing the demand and supply of energy
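As a rough illustration of the four-stage process, the sketch below strings together k-means segmentation, a simplified stepwise selection, and Cook's-distance filtering; the column names (`kwh`, `occupants`, `floor_area`, `income`) and thresholds are illustrative assumptions, not the study's actual variables.

```python
# Minimal sketch of the four-stage pipeline described above.
# Column names and thresholds are illustrative, not the study's dataset.
import pandas as pd
import statsmodels.api as sm
from sklearn.cluster import KMeans

def build_model(households: pd.DataFrame):
    # Stage 1: data cleansing -- drop incomplete records.
    data = households.dropna()

    # Stage 2: segment households by their characteristics with k-means
    # (cluster label kept as a simple numeric covariate for brevity).
    features = data[["occupants", "floor_area", "income"]]
    data["cluster"] = KMeans(n_clusters=4, n_init=10).fit_predict(features)

    # Stage 3: simplified backward elimination on p-values, standing in
    # for the study's full stepwise multiple regression.
    predictors = ["occupants", "floor_area", "income", "cluster"]
    while predictors:
        X = sm.add_constant(data[predictors])
        model = sm.OLS(data["kwh"], X).fit()
        worst = model.pvalues.drop("const").idxmax()
        if model.pvalues[worst] <= 0.05:
            break
        predictors.remove(worst)

    # Stage 4: drop influential observations via the common 4/n
    # Cook's distance rule of thumb, then refit.
    cooks_d = model.get_influence().cooks_distance[0]
    kept = data[cooks_d < 4 / len(data)]
    X = sm.add_constant(kept[predictors])
    return sm.OLS(kept["kwh"], X).fit()
```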
New best proximity point results in partial metric spaces endowed with a graph
For a given mapping f in the framework of different spaces, fixed-point equations of the form fx = x can model several problems in different areas, such as differential equations, optimization, and computer science. In this work, the aim is to find the best proximity point and prove its uniqueness in partial metric spaces, where the symmetry condition is preserved, for several types of contractive non-self mappings endowed with a graph. Our theorems generalize various results in the literature. In addition, we illustrate the usability of our outcomes with some examples. The proposed model can be considered a theoretical foundation for applications to real cases
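For orientation, the standard notion involved can be stated as follows; this is a textbook-style definition, not a quotation of the paper's own (more general, graph-endowed) setting.

```latex
% Textbook-style definition (not quoted from the paper); assumes an
% amsthm "definition" environment.
\begin{definition}
Let $A, B$ be nonempty subsets of a partial metric space $(X, p)$ and
let $f \colon A \to B$ be a non-self mapping. With
\[
  p(A, B) = \inf\{\, p(a, b) : a \in A,\ b \in B \,\},
\]
a point $x^{*} \in A$ is a \emph{best proximity point} of $f$ if
\[
  p(x^{*}, f x^{*}) = p(A, B).
\]
In the metric case, when $A \cap B \neq \emptyset$ this recovers
solutions of the fixed-point equation $f x = x$.
\end{definition}
```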
Document terms with the same statistical properties
Statistical analysis prior to processing queries in text mining is important and can help text searching methods perform better. In this paper, descriptive analysis and non-parametric ANOVA are presented for the term-document frequency matrix, showing the significance of different terms and their paired comparisons. This filters and reduces the number of terms, and can contribute significantly to faster and more efficient search results
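As a small illustration of the kind of analysis described, the sketch below builds a term-document frequency matrix and applies Kruskal-Wallis as the non-parametric ANOVA, followed by pairwise comparisons; the specific tests and the toy corpus are assumptions, since the abstract does not fix them.

```python
# Sketch: test whether terms share the same frequency distribution
# across documents. Kruskal-Wallis stands in for the non-parametric
# ANOVA, Mann-Whitney U for the paired comparisons; both are common
# choices, not necessarily the paper's exact tests.
from itertools import combinations
from scipy.stats import kruskal, mannwhitneyu
from sklearn.feature_extraction.text import CountVectorizer

docs = ["the cat sat on the mat", "the dog sat", "a cat and a dog"]
vec = CountVectorizer()
tdm = vec.fit_transform(docs).toarray()          # documents x terms
terms = vec.get_feature_names_out()

# One sample per term: its frequency across all documents.
samples = {t: tdm[:, i] for i, t in enumerate(terms)}

# Global test: do the terms' frequency distributions differ?
h_stat, p_value = kruskal(*samples.values())
print(f"Kruskal-Wallis H={h_stat:.2f}, p={p_value:.3f}")

# Paired comparisons between a few terms (uncorrected, for illustration).
for a, b in combinations(list(samples)[:3], 2):
    _, p = mannwhitneyu(samples[a], samples[b])
    print(f"{a} vs {b}: p={p:.3f}")
```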
Refined information retrieval and frequency distribution
This paper presents the process of refining documents and their terms in Information Retrieval. It also shows the significance of this process prior to applying any information retrieval application, including probability models, to the actual term distribution. This is an important issue in the language-models approach; it also helps to show effectiveness and efficiency in terms of minimizing the amount of time and space required to process the data. This is also very important for probabilistic approaches such as the single Poisson, double Poisson, binomial, and multinomial distributions, which are used to define the weights in the document matching process. The approach is applied to specific data sources rather than Web pages
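To make the weighting idea concrete, here is a generic single-Poisson weighting sketch in Python; the scoring rule and numbers are illustrative, not the paper's exact formulation.

```python
# Generic single-Poisson term weighting sketch (not the paper's exact
# formulas): model a term's frequency in a document as Poisson with the
# term's mean corpus frequency, and score by how surprising the observed
# count is under that model.
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(K = k) for K ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def poisson_weight(tf: int, mean_tf: float) -> float:
    # Higher weight when the observed count is unlikely under the
    # corpus-wide Poisson model, i.e. the term is informative here.
    return -math.log(max(poisson_pmf(tf, mean_tf), 1e-12))

# Term appears 5 times in this document but 0.8 times on average.
print(poisson_weight(5, 0.8))
```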
Using probability models to classify software patterns
We propose an approach for creating a software design pattern classification scheme based on probability models and statistical methods used in the information retrieval domain. The approach looks for a set of words, phrases, and topics, i.e., concepts embedded in or represented by words and phrases, that describe the pattern. We also present a process that generates lists of terms, associates each list with a pattern category, and searches the resulting lists with user queries to select a particular pattern
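A toy sketch of the query-to-category matching step might look as follows; the pattern categories, term lists, and Laplace-smoothed scoring rule are illustrative assumptions rather than the paper's actual scheme.

```python
# Toy sketch: match a user query against per-category term lists with a
# simple Laplace-smoothed bag-of-words score. Categories and terms are
# invented for illustration, not taken from the paper.
import math
from collections import Counter

category_terms = {
    "Observer":  ["notify", "subscribe", "event", "listener", "update"],
    "Factory":   ["create", "instantiate", "product", "constructor"],
    "Singleton": ["single", "instance", "global", "access"],
}

def score(query: str, terms: list[str]) -> float:
    counts = Counter(terms)
    total, vocab = len(terms), len(set(terms))
    # Log-probability of the query words under a smoothed
    # bag-of-words model of the category's term list.
    return sum(
        math.log((counts[w] + 1) / (total + vocab))
        for w in query.lower().split()
    )

query = "notify each listener of an update event"
best = max(category_terms, key=lambda c: score(query, category_terms[c]))
print(best)  # -> Observer
```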
Information retrieval models: performance, evaluation and comparisons for healthcare big data analytics
We propose the analysis, performance evaluation, and comparison of different information retrieval models, with a foundational implementation system, for healthcare data analytics. In this type of system, patients post questions to patient/caregiver support forums. To reduce repetitiveness due to questions previously asked by other patients with similar conditions, albeit worded differently, the proposed system will retrieve other patients' questions that are semantically similar to theirs. The problem is re-formulated as an Information Retrieval (IR) problem, and several modern implementations of IR models, particularly the probabilistic models, are available to tackle it. Specifically, we utilized Lucene, which offers a full-text search library, adding search functionality to our foundational model and system implementation
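Lucene itself is a Java library; as a language-neutral stand-in for the retrieval step, the sketch below ranks previously asked questions by TF-IDF cosine similarity to a new question (the example questions are invented).

```python
# Stand-in sketch for the forum use case: rank previously asked
# questions by similarity to a new patient question. The study uses
# Lucene (Java); TF-IDF cosine similarity here illustrates the same
# retrieval idea in Python.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

asked = [
    "What are common side effects of this medication?",
    "How long does recovery from knee surgery take?",
    "Can I exercise while on this treatment?",
]
new_question = "Is it safe to work out during my treatment?"

vec = TfidfVectorizer(stop_words="english")
matrix = vec.fit_transform(asked + [new_question])
scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()

# Most similar previously asked questions first.
for question, s in sorted(zip(asked, scores), key=lambda p: -p[1]):
    print(f"{s:.2f}  {question}")
```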
Cost efficiency modeling in health care foodservice operations
Cost efficiency and cost data in the health care foodservice sector have many interesting features. Traditional productivity approaches and empirical studies have yet to address many of these features, as they are limited to partial ratios and restrictive parametric techniques. In this paper, we introduce and demonstrate the Stochastic Frontier Approach in this sector, and analyze the level of cost efficiency and its determinants. The approach is tested on a cross-sectional data set from a sample of 101 health care foodservice operations in Australia and the USA. Results indicate that the average level of cost efficiency is around 76.5%, which suggests that health care foodservice operations could reduce their input costs by as much as 23.5% without decreasing their total output. Further, the coefficients of the inefficiency component, estimated simultaneously with the stochastic frontier model, indicate that both the manager's level of experience and the manager's level of education are significant determinants of cost efficiency
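In outline, a stochastic cost frontier of the kind applied here takes the standard form below; this is the textbook formulation, and the paper's exact functional specification may differ.

```latex
% Standard stochastic cost frontier with a one-sided inefficiency term
% (a textbook formulation; the paper's exact specification may differ).
\[
  \ln C_i = f(y_i, w_i;\, \beta) + v_i + u_i, \qquad
  v_i \sim N(0, \sigma_v^2), \quad u_i \ge 0,
\]
% where $C_i$ is observed cost, $y_i$ outputs, $w_i$ input prices,
% $v_i$ random noise, and $u_i$ the inefficiency component, so that
% cost efficiency is $CE_i = \exp(-u_i) \in (0, 1]$.
```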
Improving the accuracy of DEA efficiency analysis: a bootstrap application to the health care foodservice industry
This article analyses the efficiency of health care foodservice operations and its determinants using a Data Envelopment Analysis (DEA) bootstrapping approach. The purpose of using the bootstrapping approach is two-fold: first, to obtain bias-corrected estimates and confidence intervals for the DEA efficiency scores; and second, to overcome the correlation problem of DEA efficiency scores and to provide consistent inferences in explaining the determinants of health care foodservice efficiency. The approach was implemented on a sample of 89 health care foodservice operations. The results showed the presence of inefficiency in the sample, with an average efficiency level of 72.6%. Further, the results from analysing the determinants of health care foodservice operations provided policy implications regarding the factors that might improve the efficiency of health care foodservice operations
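The bias-correction arithmetic behind the approach can be sketched generically as follows; a real DEA application uses the smoothed Simar-Wilson bootstrap, so the placeholder statistic and data here are purely illustrative.

```python
# Generic bootstrap sketch for bias-corrected estimates and percentile
# confidence intervals. Real DEA applications use the smoothed
# Simar-Wilson procedure; this only illustrates the bias-correction
# arithmetic on a placeholder efficiency statistic.
import numpy as np

rng = np.random.default_rng(0)

def efficiency(sample: np.ndarray) -> float:
    # Placeholder statistic standing in for a DEA efficiency score.
    return sample.mean()

data = rng.uniform(0.5, 1.0, size=89)        # 89 operations, as in the study
theta_hat = efficiency(data)

boot = np.array([
    efficiency(rng.choice(data, size=data.size, replace=True))
    for _ in range(2000)
])

bias = boot.mean() - theta_hat
theta_bc = theta_hat - bias                  # bias-corrected estimate
lo, hi = np.percentile(boot, [2.5, 97.5])    # 95% percentile interval
print(f"estimate={theta_hat:.3f}, corrected={theta_bc:.3f}, "
      f"CI=({lo:.3f}, {hi:.3f})")
```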
Evaluating statistical information retrieval models with different indexing enhancement strategies
In this paper we introduce indexing enhancements for statistical Information Retrieval (IR) models. To accomplish this task, we propose including an extra level of synonym and semantic classification, together with the distribution entropy of these synonym classes, drawn from professional and technical resources determined by the query domain's subject matter experts. We evaluate this enhancement using the Rank-biased Precision (RBP) measure after applying the BM25 IR model. This approach requires re-formulating the implementations of IR models, particularly the probabilistic models, in an integrated system. The evaluated CRAN data are analysed utilizing Lucene, which offers a full-text search library, adding search functionality to our foundational model and system implementation
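RBP itself has a compact closed form, shown below in the standard Moffat-Zobel formulation (the persistence parameter p = 0.8 is a conventional choice, not necessarily the paper's).

```python
# Rank-biased Precision (Moffat & Zobel):
#   RBP = (1 - p) * sum_i r_i * p^(i-1),
# where r_i is 1 if the document at rank i is relevant, and p models the
# user's persistence: larger p gives more weight to deeper ranks.
def rbp(relevance: list[int], p: float = 0.8) -> float:
    return (1 - p) * sum(r * p**i for i, r in enumerate(relevance))

# A run where ranks 1, 3 and 4 are relevant:
print(rbp([1, 0, 1, 1, 0], p=0.8))  # 0.2 * (1 + 0.64 + 0.512) = 0.4304
```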
Mixture distribution model for resources availability in volunteer computing systems
Characterizing, analysing, and modelling resource availability in volunteer computing systems is becoming extremely important and essential for efficient application scheduling and system utilization. In this paper we describe, analyse, and model the availability characteristics using mixture probability density functions. We apply a mixture-gamma model as a predictive method to estimate the availability tail probability, using real availability traces from the SETI@home project with more than 230,000 hosts
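Once a mixture of gamma components has been fitted, the tail probability is a weighted sum of component survival functions, as in the sketch below; the weights and shape/scale parameters are illustrative placeholders, not values fitted to the SETI@home traces.

```python
# Sketch: tail probability of availability under a two-component gamma
# mixture. The weights and shape/scale parameters are illustrative
# placeholders, not the values fitted to the SETI@home traces.
from scipy.stats import gamma

components = [
    # (weight, shape, scale) -- e.g. short-lived vs long-lived hosts
    (0.7, 1.2, 2.0),
    (0.3, 4.0, 6.0),
]

def tail_probability(t: float) -> float:
    """P(availability duration > t hours) under the mixture."""
    return sum(w * gamma.sf(t, a, scale=s) for w, a, s in components)

for t in (1, 6, 24):
    print(f"P(A > {t:>2} h) = {tail_probability(t):.3f}")
```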