A scalable application server on Beowulf clusters : a thesis presented in partial fulfilment of the requirement for the degree of Master of Information Science at Albany, Auckland, Massey University, New Zealand
The performance and scalability of large distributed multi-tiered applications are core requirements for most of today's critical business applications. I have investigated the scalability of a J2EE application server using the standard ECperf benchmark application on the Massey Beowulf clusters, namely the Sisters and the Helix. My testing environment consists of open source software: the integrated JBoss-Tomcat as the application server and the web server, along with PostgreSQL as the database. My testing programs were run on the clustered application server, which provides replication of the Enterprise Java Bean (EJB) objects. I have completed various centralized and distributed tests using the JBoss cluster. I concluded that clustering the application server and web server effectively increases the performance of the applications running on them, given sufficient system resources. Application performance scales up to the point where a bottleneck occurs in the testing system; the bottleneck could be any resource in the testing environment: the hardware, software, network, or the application that is running. Performance tuning for a large-scale J2EE application is a complicated issue that depends on the resources available. However, by carefully identifying the performance bottleneck in the hardware, software, network, operating system and application configuration, I can improve the performance of J2EE applications running on a Beowulf cluster. Software bottlenecks can be resolved by changing the default settings; hardware bottlenecks, on the other hand, are harder to resolve unless more investment is made in higher-speed, higher-capacity hardware.
Caveats for information bottleneck in deterministic scenarios
Information bottleneck (IB) is a method for extracting information from one
random variable $X$ that is relevant for predicting another random variable
$Y$. To do so, IB identifies an intermediate "bottleneck" variable $T$ that has
low mutual information $I(X;T)$ and high mutual information $I(Y;T)$. The "IB
curve" characterizes the set of bottleneck variables that achieve maximal
$I(Y;T)$ for a given $I(X;T)$, and is typically explored by maximizing the "IB
Lagrangian", $I(Y;T) - \beta I(X;T)$. In some cases, $Y$ is a deterministic
function of $X$, including many classification problems in supervised learning
where the output class $Y$ is a deterministic function of the input $X$. We
demonstrate three caveats when using IB in any situation where $Y$ is a
deterministic function of $X$: (1) the IB curve cannot be recovered by
maximizing the IB Lagrangian for different values of $\beta$; (2) there are
"uninteresting" trivial solutions at all points of the IB curve; and (3) for
multi-layer classifiers that achieve low prediction error, different layers
cannot exhibit a strict trade-off between compression and prediction, contrary
to a recent proposal. We also show that when $Y$ is a small perturbation away
from being a deterministic function of $X$, these three caveats arise in an
approximate way. To address problem (1), we propose a functional that, unlike
the IB Lagrangian, can recover the IB curve in all cases. We demonstrate the
three caveats on the MNIST dataset.
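
To make the quantities in this abstract concrete, the following is a minimal sketch (not the authors' code) that evaluates the two mutual-information terms and the IB Lagrangian for small discrete distributions. It assumes the usual IB notation, with X the input, Y the target, T the bottleneck variable and beta the trade-off weight. The function names `mutual_information` and `ib_lagrangian`, the toy joint distribution and the encoders are all hypothetical choices made for illustration.

```python
import numpy as np


def mutual_information(p_ab):
    """Mutual information I(A;B) in bits for a joint pmf given as a 2-D array."""
    p_ab = np.asarray(p_ab, dtype=float)
    p_a = p_ab.sum(axis=1, keepdims=True)   # marginal of A, shape (n_a, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)   # marginal of B, shape (1, n_b)
    mask = p_ab > 0                          # skip zero cells (0 * log 0 = 0)
    ratio = p_ab[mask] / (p_a @ p_b)[mask]
    return float(np.sum(p_ab[mask] * np.log2(ratio)))


def ib_lagrangian(p_xy, p_t_given_x, beta):
    """Return (I(X;T), I(Y;T), I(Y;T) - beta * I(X;T)) for an encoder p(t|x).

    Assumes the Markov chain T - X - Y, i.e. T depends on Y only through X.
    """
    p_xy = np.asarray(p_xy, dtype=float)         # shape (n_x, n_y)
    enc = np.asarray(p_t_given_x, dtype=float)   # shape (n_x, n_t), rows sum to 1
    p_x = p_xy.sum(axis=1)
    p_xt = p_x[:, None] * enc                    # joint p(x, t)
    p_yt = p_xy.T @ enc                          # joint p(y, t) = sum_x p(x, y) p(t|x)
    i_xt = mutual_information(p_xt)
    i_yt = mutual_information(p_yt)
    return i_xt, i_yt, i_yt - beta * i_xt


# Toy deterministic scenario (an assumption for illustration):
# X is uniform on {0, 1, 2, 3} and Y = X // 2.
p_xy = np.zeros((4, 2))
for x in range(4):
    p_xy[x, x // 2] = 0.25

copy_y = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])  # encoder T = Y
copy_x = np.eye(4)                                    # encoder T = X
constant = np.ones((4, 1))                            # encoder T = constant

for name, enc in [("T = Y", copy_y), ("T = X", copy_x), ("T = const", constant)]:
    i_xt, i_yt, value = ib_lagrangian(p_xy, enc, beta=0.5)
    print(f"{name}: I(X;T)={i_xt:.3f}  I(Y;T)={i_yt:.3f}  Lagrangian={value:.3f}")
```

In this toy case the encoder T = Y keeps one bit about X and one bit about Y, whereas T = X keeps two bits about X for the same one bit about Y, so it scores lower on the Lagrangian for any positive beta. The sketch only evaluates given encoders; it does not optimize over them or reproduce the paper's results.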
A Rapidly Deployable Classification System using Visual Data for the Application of Precision Weed Management
In this work we demonstrate a rapidly deployable weed classification system
that uses visual data to enable autonomous precision weeding without making
prior assumptions about which weed species are present in a given field.
Previous work in this area relies on having prior knowledge of the weed species
present in the field. This assumption cannot always hold true for every field,
and thus limits the use of weed classification systems based on this
assumption. In this work, we obviate this assumption and introduce a rapidly
deployable approach able to operate on any field without any weed species
assumptions prior to deployment. We present a three stage pipeline for the
implementation of our weed classification system consisting of initial field
surveillance, offline processing and selective labelling, and automated
precision weeding. The key characteristic of our approach is the combination of
plant clustering and selective labelling which is what enables our system to
operate without prior weed species knowledge. Testing using field data we are
able to label 12.3 times fewer images than traditional full labelling whilst
reducing classification accuracy by only 14%.
Comment: 36 pages, 14 figures, published in Computers and Electronics in Agriculture Vol. 14
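
The abstract credits the labelling savings to combining plant clustering with selective labelling but does not give the algorithm. The sketch below is one simple way such a scheme could work, under loudly stated assumptions: cluster assignments are already available, a human labels only the first `per_cluster` images of each cluster, and every other image inherits its cluster's majority label. The function `propagate_labels`, the oracle `ask_label` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def propagate_labels(cluster_ids, ask_label, per_cluster=1):
    """Query labels for a few images per cluster and copy the majority label to the rest.

    cluster_ids: cluster index for each image (assumed to be computed beforehand).
    ask_label:   callable mapping an image index to a label (stands in for a human).
    Returns (labels for all images, number of label queries made).
    """
    members = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        members[cid].append(idx)

    labels = [None] * len(cluster_ids)
    n_queries = 0
    for idxs in members.values():
        queried = [ask_label(i) for i in idxs[:per_cluster]]
        n_queries += len(queried)
        majority = Counter(queried).most_common(1)[0][0]
        for i in idxs:
            labels[i] = majority
    return labels, n_queries


# Toy data (illustrative only): 12 images in 3 clusters, one image mis-clustered.
cluster_ids = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 2]
true_labels = ["crop"] * 3 + ["weed_a"] * 3 + ["weed_b"] * 5 + ["weed_a"]

labels, n_queries = propagate_labels(cluster_ids, true_labels.__getitem__)
reduction = len(true_labels) / n_queries
accuracy = sum(a == b for a, b in zip(labels, true_labels)) / len(true_labels)
print(f"queries={n_queries}  reduction={reduction:.1f}x  accuracy={accuracy:.3f}")
```

The toy run shows the trade-off the abstract describes: far fewer labels are requested, at the cost of errors on images that sit in the wrong cluster. The numbers here come from the made-up data and say nothing about the paper's reported figures.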
An information theoretic approach to the functional classification of neurons
A population of neurons typically exhibits a broad diversity of responses to
sensory inputs. The intuitive notion of functional classification is that cells
can be clustered so that most of the diversity is captured in the identity of
the clusters rather than by individuals within clusters. We show how this
intuition can be made precise using information theory, without any need to
introduce a metric on the space of stimuli or responses. Applied to the retinal
ganglion cells of the salamander, this approach recovers classical results, but
also provides clear evidence for subclasses beyond those identified previously.
Further, we find that each of the ganglion cells is functionally unique, and
that even within the same subclass only a few spikes are needed to reliably
distinguish between cells.
Comment: 13 pages, 4 figures. To appear in Advances in Neural Information
Processing Systems (NIPS) 1