The Immune System: the ultimate fractionated cyber-physical system
In this little vision paper we analyze the human immune system from a
computer science point of view with the aim of understanding the architecture
and features that allow robust, effective behavior to emerge from local sensing
and actions. We then recall the notion of fractionated cyber-physical systems,
and compare and contrast this to the immune system. We conclude with some
challenges.
Comment: In Proceedings Festschrift for Dave Schmidt, arXiv:1309.455
Classification of Computers and Computing Architectures
Different terms or buzzwords have existed for the several classes of computers. Should our view of the classes of computers be so complicated and potentially confusing? Based on a literature survey, empirical research, and the authors' combined accumulated experience in teaching and consulting, this paper recommends that for most situations a simple dichotomy of computers as CLIENT and SERVER is adequate. The CLIENT computer is primarily for the use of, and under the control of, an individual, while the SERVER computer is meant for the use of more than one individual: a group, department, corporation, or government agency. This paper contends that this simple dichotomy facilitates initial learning for all computer users. Based on empirical research, the results were statistically significant in substantiating that (a) computer classification confusion exists, (b) the dichotomy works, and (c) the dichotomy is preferred. This paper also proposes a hierarchical classification of computers based on different levels of perspective. Just as the general view of the classes of computers was technical in the beginning, the view of computing architecture has also been technical. The technical classifications were based on criteria like network topologies, type of protocol, etc. This paper contends that, again, a user-oriented view of the classification of computing architecture should prevail. We suggest a simple dichotomy of computing architectures: Server/Client and Client/Server. The proposed dichotomy is based on the end users' view of who is at the center of information processing: Server or Client. In the Server/Client architecture, the server is at the center and the clients revolve around it, in the sense that they depend on the capacity and capabilities of the server. With the fusion of computer and telecommunication technologies, a new paradigm of Client/Server architecture has evolved.
In this architecture, the client is at the center, and there are several local or remote servers catering to the needs of this KING called the client.
Disparity map generation based on trapezoidal camera architecture for multiview video
Visual content acquisition is a strategic functional block of any visual system. Despite its wide possibilities,
the arrangement of cameras for the acquisition of good quality visual content for use in multi-view video
remains a huge challenge. This paper presents the mathematical description of trapezoidal camera
architecture and relationships which facilitate the determination of camera position for visual content
acquisition in multi-view video, and depth map generation. The strong point of Trapezoidal Camera
Architecture is that it allows for adaptive camera topology by which points within the scene, especially the
occluded ones can be optically and geometrically viewed from several different viewpoints either on the
edge of the trapezoid or inside it. The concept of maximum independent set, trapezoid characteristics, and
the fact that the positions of cameras (with the exception of a few) differ in their vertical coordinate
description could very well be used to address the issue of occlusion, which continues to be a major
problem in computer vision with regard to the generation of depth maps.
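The abstract does not reproduce the trapezoidal relationships themselves, but the depth-map generation it refers to ultimately rests on the standard disparity-to-depth relation for a rectified camera pair. The sketch below illustrates that relation only; the function and parameter names are our assumptions, not the paper's formulation.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Standard pinhole-stereo relation Z = f * B / d:
    depth (metres) from focal length (pixels), camera baseline
    (metres), and measured disparity (pixels)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

Points occluded in one pair can be triangulated from another camera pair in the trapezoid, which is the adaptivity advantage the abstract describes.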
Whatever happened to normative drawing?
One hallmark of humanities inquiry is its motivation by normative concerns: a critical view of how things are and a corresponding vision of how things could be. Architects used to adeptly deploy the rhetorical approaches of humanities inquiry in the promotion of architectural visions. Today architecture seems to be dominated by technical and instrumental concerns, and where a vision is articulated it is usually one of transformed individual living rather than of a different society. Imaginative text- and image-based signifying strategies have largely given way to the ubiquity of computer-generated plans, elevations, walk-throughs, and unconvincing images of contrived and unlikely sociality.
Multi-view Convolutional Neural Networks for 3D Shape Recognition
A longstanding question in computer vision concerns the representation of 3D
shapes for recognition: should 3D shapes be represented with descriptors
operating on their native 3D formats, such as voxel grid or polygon mesh, or
can they be effectively represented with view-based descriptors? We address
this question in the context of learning to recognize 3D shapes from a
collection of their rendered views on 2D images. We first present a standard
CNN architecture trained to recognize the shapes' rendered views independently
of each other, and show that a 3D shape can be recognized even from a single
view at an accuracy far higher than using state-of-the-art 3D shape
descriptors. Recognition rates further increase when multiple views of the
shapes are provided. In addition, we present a novel CNN architecture that
combines information from multiple views of a 3D shape into a single and
compact shape descriptor offering even better recognition performance. The same
architecture can be applied to accurately recognize human hand-drawn sketches
of shapes. We conclude that a collection of 2D views can be highly informative
for 3D shape recognition and is amenable to emerging CNN architectures and
their derivatives.
Comment: v1: Initial version. v2: An updated ModelNet40 training/test split is used; results with low-rank Mahalanobis metric learning are added. v3 (ICCV 2015): A second camera setup without the upright orientation assumption is added; some accuracy and mAP numbers are changed slightly because a small issue in mesh rendering related to specularities is fixed.
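The compact descriptor this abstract describes is obtained in MVCNN by pooling per-view CNN features element-wise. The NumPy sketch below mimics only that view-pooling step, with precomputed feature vectors standing in for the CNN outputs; it is an illustrative assumption, not the authors' code.

```python
import numpy as np

def view_pool(view_features):
    """MVCNN-style view pooling: element-wise max over per-view
    feature vectors, yielding one fixed-size shape descriptor
    regardless of how many rendered views are supplied."""
    return np.max(np.stack(view_features, axis=0), axis=0)
```

Because the max is taken across the view axis, the descriptor length is independent of the number of views, which is what lets the same architecture handle single views, multi-view renderings, and sketches.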
Cross-View Image Synthesis using Conditional GANs
Learning to generate natural scenes has always been a challenging task in
computer vision. It is even more painstaking when the generation is conditioned
on images with drastically different views. This is mainly because
understanding, corresponding, and transforming appearance and semantic
information across the views is not trivial. In this paper, we attempt to solve
the novel problem of cross-view image synthesis, aerial to street-view and vice
versa, using conditional generative adversarial networks (cGAN). Two new
architectures called Crossview Fork (X-Fork) and Crossview Sequential (X-Seq)
are proposed to generate scenes with resolutions of 64x64 and 256x256 pixels.
X-Fork architecture has a single discriminator and a single generator. The
generator hallucinates both the image and its semantic segmentation in the
target view. X-Seq architecture utilizes two cGANs. The first one generates the
target image which is subsequently fed to the second cGAN for generating its
corresponding semantic segmentation map. The feedback from the second cGAN
helps the first cGAN generate sharper images. Both of our proposed
architectures learn to generate natural images as well as their semantic
segmentation maps. The proposed methods show that they are able to capture and
maintain the true semantics of objects in source and target views better than
the traditional image-to-image translation method which considers only the
visual appearance of the scene. Extensive qualitative and quantitative
evaluations support the effectiveness of our frameworks, compared to two state
of the art methods, for natural scene generation across drastically different
views.
Comment: Accepted at CVPR 201
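The X-Seq cascade described above amounts to composing the two conditional generators. The sketch below shows only that data flow; `g1` and `g2` are hypothetical stand-ins for the trained cGAN generators, with names chosen by us.

```python
def x_seq(source_img, g1, g2):
    # X-Seq: the first cGAN maps the source view to the target-view
    # image; the second cGAN maps that generated image to its
    # semantic segmentation map.
    target_img = g1(source_img)
    seg_map = g2(target_img)
    return target_img, seg_map
```

X-Fork differs in that a single generator emits both outputs from shared features, rather than chaining two networks.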
Building with Drones: Accurate 3D Facade Reconstruction using MAVs
Automatic reconstruction of 3D models from images using multi-view
Structure-from-Motion methods has been one of the most fruitful outcomes of
computer vision. These advances combined with the growing popularity of Micro
Aerial Vehicles as an autonomous imaging platform, have made 3D vision tools
ubiquitous for large number of Architecture, Engineering and Construction
applications among audiences, mostly unskilled in computer vision. However, to
obtain high-resolution and accurate reconstructions from a large-scale object
using SfM, there are many critical constraints on the quality of image data,
which often become sources of inaccuracy as the current 3D reconstruction
pipelines do not facilitate the users to determine the fidelity of input data
during the image acquisition. In this paper, we present and advocate a
closed-loop interactive approach that performs incremental reconstruction in
real-time and gives users an online feedback about the quality parameters like
Ground Sampling Distance (GSD), image redundancy, etc on a surface mesh. We
also propose a novel multi-scale camera network design to prevent scene drift
caused by incremental map building, and release the first multi-scale image
sequence dataset as a benchmark. Further, we evaluate our system on real
outdoor scenes, and show that our interactive pipeline combined with a
multi-scale camera network approach provides compelling accuracy in multi-view
reconstruction tasks when compared against the state-of-the-art methods.
Comment: 8 Pages, 2015 IEEE International Conference on Robotics and Automation (ICRA '15), Seattle, WA, US
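One of the feedback quantities named above, Ground Sampling Distance, follows from similar triangles in the pinhole camera model. The sketch below is a minimal illustration of that relation; the parameter names and units are our assumptions, not the paper's interface.

```python
def ground_sampling_distance(altitude_m, pixel_pitch_m, focal_length_m):
    """GSD: ground distance covered by one image pixel, from the
    similar-triangles relation GSD = H * p / f (pinhole model)."""
    return altitude_m * pixel_pitch_m / focal_length_m
```

For example, a camera with a 4 µm pixel pitch and 8 mm focal length flown at 100 m yields a 5 cm GSD, which is the kind of per-surface figure the interactive pipeline can report during acquisition.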
Summarizing First-Person Videos from Third Persons' Points of Views
Video highlight or summarization is among interesting topics in computer
vision, which benefits a variety of applications like viewing, searching, or
storage. However, most existing studies rely on training data of third-person
videos, which cannot easily generalize to highlight the first-person ones. With
the goal of deriving an effective model to summarize first-person videos, we
propose a novel deep neural network architecture for describing and
discriminating vital spatiotemporal information across videos with different
points of view. Our proposed model is realized in a semi-supervised setting, in
which fully annotated third-person videos, unlabeled first-person videos, and a
small number of annotated first-person ones are presented during training. In
our experiments, qualitative and quantitative evaluations on both benchmarks
and our collected first-person video datasets are presented.
Comment: 16+10 pages, ECCV 201