Shedding light on social learning
Culture involves the origination and transmission of ideas, but the
conditions in which culture can emerge and evolve are unclear. We constructed
and studied a highly simplified neural-network model of these processes. In
this model ideas originate by individual learning from the environment and are
transmitted by communication between individuals. Individuals (or "agents")
comprise a single neuron which receives structured data from the environment
via plastic synaptic connections. The data are generated in the simplest
possible way: linear mixing of independently fluctuating sources and the goal
of learning is to unmix the data. To make this problem tractable we assume that
at least one of the sources fluctuates in a non-Gaussian manner. Linear mixing
creates structure in the data, and agents attempt to learn (from the data and
possibly from other individuals) synaptic weights that will unmix, i.e., to
"understand" the agent's world. For a variety of reasons even this goal can be
difficult for a single agent to achieve; we studied one particular type of
difficulty (created by imperfection in synaptic plasticity), though our
conclusions should carry over to many other types of difficulty. We previously
studied whether a small population of communicating agents, learning from each
other, could more easily learn unmixing coefficients than isolated individuals,
learning only from their environment. We found, unsurprisingly, that if agents
learn indiscriminately from any other agent (whether or not they have learned
good solutions), communication does not enhance understanding. Here we extend
the model slightly, by allowing successful learners to be more effective
teachers, and find that now a population of agents can learn more effectively
than isolated individuals. We suggest that a key factor in the onset of culture
might be the development of selective learning.
Comment: 11 pages, 8 figures
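The learning problem the abstract describes — recovering unmixing weights from linearly mixed data by exploiting the non-Gaussianity of at least one source — can be sketched for a single isolated agent with a one-unit FastICA-style fixed-point update. This is an illustrative stand-in, not necessarily the authors' synaptic learning rule; the mixing matrix and source distributions below are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
# two independent sources: one Laplacian (non-Gaussian), one Gaussian
s = np.vstack([rng.laplace(size=n), rng.normal(size=n)])
A = np.array([[2.0, 1.0], [1.0, 1.5]])  # illustrative mixing matrix
x = A @ s                               # structured "environment" data

# whiten the mixed observations so that cov(z) = I
x = x - x.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(x @ x.T / n)
z = (E / np.sqrt(d)) @ E.T @ x

# one-unit FastICA-style fixed-point update, kurtosis nonlinearity g(u) = u^3
w = rng.normal(size=2)
w /= np.linalg.norm(w)
for _ in range(200):
    u = w @ z
    w = (z * u**3).mean(axis=1) - 3.0 * w
    w /= np.linalg.norm(w)

y = w @ z  # recovered non-Gaussian source, up to sign and scale
```

Because only the Laplacian source has non-zero excess kurtosis, the fixed point aligns the weight vector with that source's direction — the agent "understands" its world.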
Crowdsourcing in Computer Vision
Computer vision systems require large amounts of manually annotated data to
properly learn challenging visual concepts. Crowdsourcing platforms offer an
inexpensive method to capture human knowledge and understanding, for a vast
number of visual perception tasks. In this survey, we describe the types of
annotations computer vision researchers have collected using crowdsourcing, and
how they have ensured that this data is of high quality while annotation effort
is minimized. We begin by discussing data collection on both classic (e.g.,
object recognition) and recent (e.g., visual story-telling) vision tasks. We
then summarize key design decisions for creating effective data collection
interfaces and workflows, and present strategies for intelligently selecting
the most important data instances to annotate. Finally, we conclude with some
thoughts on the future of crowdsourcing in computer vision.
Comment: A 69-page meta review of the field, Foundations and Trends in Computer Graphics and Vision, 201
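One common realisation of the survey's theme of "intelligently selecting the most important data instances to annotate" is uncertainty sampling: send the crowd the examples the current model is least sure about. A minimal sketch — the function name and toy probabilities are illustrative, not taken from the survey:

```python
import numpy as np

def select_for_annotation(probs, k):
    """Pick the k instances whose predicted class distributions have the
    highest entropy, i.e. where the current model is least certain."""
    probs = np.asarray(probs, dtype=float)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return np.argsort(entropy)[::-1][:k]

# toy predictions from a hypothetical classifier over 4 unlabeled images
probs = [[0.98, 0.01, 0.01],   # confident -> little annotation value
         [0.34, 0.33, 0.33],   # near-uniform -> most informative
         [0.70, 0.20, 0.10],
         [0.50, 0.49, 0.01]]
picked = select_for_annotation(probs, 2)
```

Routing only the high-entropy instances to annotators is one simple way to keep quality high while minimising total annotation effort.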
Learning from Multiple Sources for Video Summarisation
Many visual surveillance tasks, e.g. video summarisation, are conventionally
accomplished by analysing imagery-based features. Relying solely on visual
cues for public surveillance video understanding is unreliable, since visual
observations obtained from public space CCTV video data are often not
sufficiently trustworthy and events of interest can be subtle. On the other
hand, non-visual data sources such as weather reports and traffic sensory
signals are readily accessible but are not explored jointly to complement
visual data for video content analysis and summarisation. In this paper, we
present a novel unsupervised framework to learn jointly from both visual and
independently-drawn non-visual data sources for discovering meaningful latent
structure of surveillance video data. In particular, we investigate ways to
cope with discrepant dimensions and representations whilst associating these
heterogeneous data sources, and derive an effective mechanism to tolerate
missing and incomplete data from different sources. We show that the proposed
multi-source learning framework not only achieves better video content
clustering than state-of-the-art methods, but also is capable of accurately
inferring missing non-visual semantics from previously unseen videos. In
addition, a comprehensive user study is conducted to validate the quality of
video summarisation generated using the proposed multi-source model.
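The paper's model is more elaborate, but the inference step it claims — estimating missing non-visual semantics for an unseen video — can be illustrated with nearest-neighbour regression in a standardised visual feature space. All feature values below are invented for the sketch, and per-source z-scoring stands in for the paper's handling of discrepant dimensions:

```python
import numpy as np

def zscore(X):
    """Standardise each feature so sources of different scales are comparable."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)

def infer_nonvisual(visual_train, nonvisual_train, visual_query, k=3):
    """Estimate a missing non-visual value (e.g. a traffic-sensor reading) for
    an unseen video as the mean over its k nearest neighbours in
    standardised visual feature space."""
    Z = zscore(np.vstack([visual_train, visual_query]))
    Zt, zq = Z[:-1], Z[-1]
    d = np.linalg.norm(Zt - zq, axis=1)
    nn = np.argsort(d)[:k]
    return float(np.asarray(nonvisual_train, dtype=float)[nn].mean())

# toy data: two visual clusters with distinct associated sensor readings
visual_train = [[0.1, 0.0], [0.0, 0.2], [-0.1, 0.1],
                [5.0, 5.1], [4.9, 5.0], [5.1, 4.9]]
nonvisual_train = [10, 11, 9, 50, 52, 48]
est = infer_nonvisual(visual_train, nonvisual_train, [5.0, 5.0], k=3)
```

Videos that look alike borrow their neighbours' non-visual semantics; the same idea extends to tolerating partially missing sources at training time.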
Symbiotic deep learning for medical image analysis with applications in real-time diagnosis for fetal ultrasound screening
The last hundred years have seen a monumental rise in the power and capability of machines to
perform intelligent tasks in the stead of previously human operators. This rise is not expected
to slow down any time soon and what this means for society and humanity as a whole remains
to be seen. The overwhelming notion is that with the right goals in mind, the growing influence
of machines on our everyday tasks will enable humanity to give more attention to the truly
groundbreaking challenges that we all face together. This will usher in a new age of human
machine collaboration in which humans and machines may work side by side to achieve greater
heights for all of humanity. Intelligent systems are useful in isolation, but the true benefits of
intelligent systems come to the fore in complex systems where the interaction between humans
and machines can be made seamless, and it is this goal of symbiosis between human and machine
that may democratise complex knowledge, which motivates this thesis. In the recent past, data-driven
methods have come to the fore and now represent the state-of-the-art in many different
fields. Alongside the shift from rule-based towards data-driven methods we have also seen a
shift in how humans interact with these technologies. Human computer interaction is changing
in response to data-driven methods and new techniques must be developed to enable the same
symbiosis between man and machine for data-driven methods as for previous formula-driven
technology.
We address five key challenges which need to be overcome for data-driven human-in-the-loop
computing to reach maturity. These are (1) the ’Categorisation Challenge’ where we examine
existing work and form a taxonomy of the different methods being utilised for data-driven
human-in-the-loop computing; (2) the ’Confidence Challenge’, where data-driven methods must
communicate interpretable beliefs in how confident their predictions are; (3) the ’Complexity
Challenge’, where reasoned communication becomes increasingly important as the
complexity of both the tasks and the methods used to solve them increases; (4) the ’Classification Challenge’ in
which we look at how complex methods can be separated in order to provide greater reasoning
in complex classification tasks; and finally (5) the ’Curation Challenge’ where we challenge the
assumptions around bottleneck creation for the development of supervised learning methods.
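For the ’Confidence Challenge’ — communicating interpretable beliefs about how confident a prediction is — one standard, simple device is to report the predicted probability alongside normalised predictive entropy. This is a generic sketch, not the thesis's method; the labels and logits are invented for illustration:

```python
import numpy as np

def softmax(logits):
    z = np.asarray(logits, dtype=float)
    z = z - z.max()            # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def confidence_report(logits, labels):
    """Turn raw network outputs into an interpretable confidence statement:
    predicted label, its probability, and normalised predictive entropy
    (0 = certain, 1 = maximally uncertain)."""
    p = softmax(logits)
    ent = -(p * np.log(p + 1e-12)).sum() / np.log(len(p))
    i = int(p.argmax())
    return labels[i], float(p[i]), float(ent)

# hypothetical screening classes and raw outputs
label, prob, ent = confidence_report([4.0, 1.0, 0.5],
                                     ["normal", "anomaly-a", "anomaly-b"])
```

A human-in-the-loop system can then surface the entropy value and route low-confidence cases to the human operator rather than acting autonomously.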
Weakly-Supervised Temporal Activity Localization and Classification With Web Videos
In this thesis, weakly-supervised temporal activity localization and classification is considered with the use of web videos. Most activity localization methods depend on the availability of frame-wise annotation, which is a burdensome task to collect. To reduce the effort of manual labeling, learning from weak labels may be used as a potential solution. Recently there has been a substantial influx of tagged videos on the Internet. These can potentially be used as a rich source of data for weakly-supervised training. The following problem is considered. Given only the keyword of an action, can videos be retrieved online and be used to train the Weakly-supervised Temporal Activity Localization and Classification (W-TALC) network? Then, can a re-ranking method be implemented to filter out noisy video data? Action categories of the Thumos14 dataset are used to search for videos online with the YouTube Data API. These videos are used as a training set for the W-TALC network. Given only the video labels, the W-TALC network learns to both localize and classify actions in videos. Using a re-ranking strategy, noisy video data is removed, yielding an increase in detection performance versus using the original web video dataset. Analysis of the web video dataset and the detection results show promise for the reliable use of web videos for training.
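The filtering idea can be sketched as score-based re-ranking: score each keyword-retrieved video with the current classifier and keep only the top fraction. This is an illustrative simplification — the thesis's re-ranking strategy may differ in detail, and the video names and scores are invented:

```python
def rerank_web_videos(videos, scores, keep_frac=0.5):
    """Rank keyword-retrieved web videos by the classifier's score for the
    query action and keep the top fraction, discarding likely-noisy
    retrievals."""
    ranked = sorted(zip(videos, scores), key=lambda vs: vs[1], reverse=True)
    n_keep = max(1, int(len(ranked) * keep_frac))
    return [v for v, _ in ranked[:n_keep]]

# hypothetical retrievals for one Thumos14 action keyword
kept = rerank_web_videos(["vid_a", "vid_b", "vid_c", "vid_d"],
                         [0.91, 0.12, 0.78, 0.55], keep_frac=0.5)
```

The cleaned subset is then used to retrain the weakly-supervised network, which is where the reported detection improvement over the raw web-video set comes from.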
Moving Object Tracking Based on an Adaptive Threshold Approach for Alpha Matting Using the K-Means Method
Object tracking is an important activity in computer vision, with many applications in human-computer interaction, surveillance, smart spaces, and medical imaging. In its simplest form, tracking can be defined as the problem of estimating an object's trajectory in the image plane as it moves around a scene. Object tracking has been widely studied by previous researchers, using object representations and feature selection. We therefore propose a new approach: finding the threshold value using the k-means method, followed by a matting process. In experiments on 15 indoor and 15 outdoor sequences, the threshold obtained with the k-means method for matting proved better than the Otsu, FCM, and manual methods. On the indoor data, Otsu's method gave an MSE of 3.13E+02 pixels, FCM gave 5.22E+01 pixels, and k-means gave 4.00E+01 pixels over the frames used for training with the matting function. On the outdoor dataset, the average MSE was 1.38E+02 pixels for Otsu's method, 1.89E+02 pixels for FCM, and 1.27E+02 pixels for k-means.
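The core of the proposed method — choosing a grey-level threshold via k-means — can be sketched as two-cluster 1-D k-means over pixel intensities, with the threshold taken as the midpoint between the two centroids. This is a minimal sketch of the thresholding step only (the paper's full pipeline continues with alpha matting), and the toy frame values are invented:

```python
import numpy as np

def kmeans_threshold(pixels, iters=50):
    """Two-cluster 1-D k-means on grey-level values; the threshold is the
    midpoint between the background and foreground centroids."""
    x = np.asarray(pixels, dtype=float).ravel()
    c = np.array([x.min(), x.max()])  # initial centroids at the extremes
    for _ in range(iters):
        # assign each pixel to its nearest centroid, then recompute centroids
        assign = np.abs(x[:, None] - c[None, :]).argmin(axis=1)
        for k in range(2):
            if np.any(assign == k):
                c[k] = x[assign == k].mean()
    return c.mean()

# toy frame: dark background around grey level 30, bright object around 200
pixels = [28, 30, 32, 29, 31, 198, 200, 202, 199, 201]
t = kmeans_threshold(pixels)
```

Unlike a fixed manual threshold, the centroids (and hence the threshold) adapt per frame, which is what makes the approach suitable for tracking under changing scenes.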