    A scalable application server on Beowulf clusters : a thesis presented in partial fulfilment of the requirement for the degree of Master of Information Science at Albany, Auckland, Massey University, New Zealand

    Performance and scalability are core requirements for most of today's critical business applications, which are often large, distributed and multi-tiered. I have investigated the scalability of a J2EE application server using the standard ECperf benchmark application on the Massey Beowulf clusters, namely the Sisters and the Helix. My testing environment consists of Open Source software: the integrated JBoss-Tomcat as the application server and the web server, along with PostgreSQL as the database. My testing programs were run on the clustered application server, which provides replication of the Enterprise Java Bean (EJB) objects. I have completed various centralized and distributed tests using the JBoss Cluster. I concluded that clustering the application server and web server effectively increases the performance of the applications running on them, given sufficient system resources. Application performance scales up to the point where a bottleneck occurs in the testing system. The bottleneck could be any resource in the testing environment: the hardware, the software, the network or the application that is running. Performance tuning for a large-scale J2EE application is a complicated issue that depends on the resources available. However, by carefully identifying the performance bottleneck among the hardware, software, network, operating system and application configuration, I can improve the performance of J2EE applications running in a Beowulf cluster. Software bottlenecks can be resolved by changing the default settings. Hardware bottlenecks, on the other hand, are harder to resolve unless more investment is made in faster, higher-capacity hardware.

    Caveats for information bottleneck in deterministic scenarios

    Information bottleneck (IB) is a method for extracting information from one random variable $X$ that is relevant for predicting another random variable $Y$. To do so, IB identifies an intermediate "bottleneck" variable $T$ that has low mutual information $I(X;T)$ and high mutual information $I(Y;T)$. The "IB curve" characterizes the set of bottleneck variables that achieve maximal $I(Y;T)$ for a given $I(X;T)$, and is typically explored by maximizing the "IB Lagrangian", $I(Y;T) - \beta I(X;T)$. In some cases, $Y$ is a deterministic function of $X$, including many classification problems in supervised learning where the output class $Y$ is a deterministic function of the input $X$. We demonstrate three caveats when using IB in any situation where $Y$ is a deterministic function of $X$: (1) the IB curve cannot be recovered by maximizing the IB Lagrangian for different values of $\beta$; (2) there are "uninteresting" trivial solutions at all points of the IB curve; and (3) for multi-layer classifiers that achieve low prediction error, different layers cannot exhibit a strict trade-off between compression and prediction, contrary to a recent proposal. We also show that when $Y$ is a small perturbation away from being a deterministic function of $X$, these three caveats arise in an approximate way. To address problem (1), we propose a functional that, unlike the IB Lagrangian, can recover the IB curve in all cases. We demonstrate the three caveats on the MNIST dataset.
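
The quantities named in this abstract can be made concrete with a small discrete example. The sketch below is an illustration only, not code from the paper: it assumes a toy setting (four equally likely inputs, two classes, `Y = f(X)`), and the helper names `mutual_information` and `ib_terms` and the four example encoders are hypothetical. It computes $I(X;T)$, $I(Y;T)$ and the IB Lagrangian $I(Y;T) - \beta I(X;T)$ for each encoder $p(t \mid x)$.

```python
import math

def mutual_information(joint):
    """Mutual information in bits for a joint distribution joint[a][b]."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (pa[a] * pb[b]))
    return mi

def ib_terms(p_x, f, p_t_given_x, n_y, beta):
    """Return (I(X;T), I(Y;T), IB Lagrangian) when Y = f[x] deterministically."""
    n_t = len(p_t_given_x[0])
    # Joint p(x, t) = p(x) * p(t | x).
    joint_xt = [[p_x[x] * p_t_given_x[x][t] for t in range(n_t)]
                for x in range(len(p_x))]
    # Joint p(y, t): sum p(x, t) over all x that map to the same y.
    joint_yt = [[0.0] * n_t for _ in range(n_y)]
    for x, row in enumerate(joint_xt):
        for t, p in enumerate(row):
            joint_yt[f[x]][t] += p
    i_xt = mutual_information(joint_xt)
    i_yt = mutual_information(joint_yt)
    return i_xt, i_yt, i_yt - beta * i_xt

# Toy setting (assumed for illustration): X uniform on {0,1,2,3}, Y = X // 2.
P_X = [0.25, 0.25, 0.25, 0.25]
F = [0, 0, 1, 1]

# Hypothetical encoders p(t | x), one row per x.
T_EQUALS_X = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
T_EQUALS_Y = [[1, 0], [1, 0], [0, 1], [0, 1]]
T_CONSTANT = [[1], [1], [1], [1]]
# With probability 0.5 copy Y into T, otherwise emit a fixed "blank" symbol 2.
T_MIXED = [[0.5, 0, 0.5], [0.5, 0, 0.5], [0, 0.5, 0.5], [0, 0.5, 0.5]]

if __name__ == "__main__":
    encoders = [("T = X", T_EQUALS_X), ("T = Y", T_EQUALS_Y),
                ("T constant", T_CONSTANT), ("T mixed", T_MIXED)]
    for name, enc in encoders:
        i_xt, i_yt, lagr = ib_terms(P_X, F, enc, n_y=2, beta=0.5)
        print(f"{name}: I(X;T)={i_xt:.3f} I(Y;T)={i_yt:.3f} L={lagr:.3f}")
```

In this toy case the mixed encoder keeps $I(X;T)$ and $I(Y;T)$ equal even though it only copies the label or discards it. This is one simple construction in the spirit of the "uninteresting" trivial solutions of caveat (2), and is not necessarily the construction used by the authors.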

    A Rapidly Deployable Classification System using Visual Data for the Application of Precision Weed Management

    In this work we demonstrate a rapidly deployable weed classification system that uses visual data to enable autonomous precision weeding without making prior assumptions about which weed species are present in a given field. Previous work in this area relies on prior knowledge of the weed species present in the field. This assumption does not hold for every field, and thus limits the use of weed classification systems that depend on it. We remove this assumption and introduce a rapidly deployable approach that can operate on any field without any weed species assumptions prior to deployment. We present a three-stage pipeline for the implementation of our weed classification system, consisting of initial field surveillance, offline processing and selective labelling, and automated precision weeding. The key characteristic of our approach is the combination of plant clustering and selective labelling, which is what enables our system to operate without prior weed species knowledge. In tests using field data, we are able to label 12.3 times fewer images than traditional full labelling whilst reducing classification accuracy by only 14%. Comment: 36 pages, 14 figures, published Computers and Electronics in Agriculture Vol. 14

    An information theoretic approach to the functional classification of neurons

    A population of neurons typically exhibits a broad diversity of responses to sensory inputs. The intuitive notion of functional classification is that cells can be clustered so that most of the diversity is captured in the identity of the clusters rather than by individuals within clusters. We show how this intuition can be made precise using information theory, without any need to introduce a metric on the space of stimuli or responses. Applied to the retinal ganglion cells of the salamander, this approach recovers classical results, but also provides clear evidence for subclasses beyond those identified previously. Further, we find that each of the ganglion cells is functionally unique, and that even within the same subclass only a few spikes are needed to reliably distinguish between cells. Comment: 13 pages, 4 figures. To appear in Advances in Neural Information Processing Systems (NIPS) 1