Information extraction from multi-sensor remote sensing imagery is an important and challenging task for many applications such as urban area mapping and change detection. For optical and radar data fusion in particular, a special (orthogonal) acquisition geometry is of great importance in order to minimize displacements caused by inaccuracies of the Digital Elevation Model (DEM) used for data ortho-rectification and by the presence of unknown 3D structures in the scene.
Final spatial alignment of the data is performed manually using ground control points (GCPs) or by a recently proposed automatic co-registration method based on a mutual information measure. These preprocessing steps are crucial for the success of the subsequent data fusion.
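The abstract does not spell out the co-registration algorithm itself; the sketch below only illustrates how a mutual information measure can drive automatic alignment, reduced here to an exhaustive search over integer pixel shifts (the function names and the search strategy are assumptions for illustration, not the published method):

    import numpy as np

    def mutual_information(a: np.ndarray, b: np.ndarray, bins: int = 32) -> float:
        """Mutual information between two equally sized images,
        estimated from their joint gray-level histogram."""
        joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
        pxy = joint / joint.sum()
        px = pxy.sum(axis=1, keepdims=True)
        py = pxy.sum(axis=0, keepdims=True)
        nz = pxy > 0  # skip empty histogram cells to avoid log(0)
        return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

    def coregister(reference: np.ndarray, moving: np.ndarray, max_shift: int = 10):
        """Return the integer (row, col) shift of `moving` that maximizes
        mutual information with `reference` (brute-force search)."""
        best, best_shift = -np.inf, (0, 0)
        for dr in range(-max_shift, max_shift + 1):
            for dc in range(-max_shift, max_shift + 1):
                shifted = np.roll(np.roll(moving, dr, axis=0), dc, axis=1)
                mi = mutual_information(reference, shifted)
                if mi > best:
                    best, best_shift = mi, (dr, dc)
        return best_shift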
To combine features originating from different sources, which are often non-commensurable, we propose an information fusion framework called INFOFUSE, consisting of three main processing steps: feature fission (feature extraction for a complete description of the scene), unsupervised clustering (complexity reduction and conversion of the features to a common domain), and supervised classification realized by Bayesian, neural, or graphical networks.
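As a rough illustration of the three steps, the following sketch extracts per-pixel features from co-registered optical and SAR layers (fission), converts them to a common discrete domain with k-means clustering, and trains a supervised classifier on the cluster codes; the chosen features and scikit-learn estimators are assumptions for illustration, not the Bayesian/neural/graphical networks of the framework:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.naive_bayes import GaussianNB

    def infofuse_sketch(optical, sar, train_mask, train_labels, n_clusters=16):
        """Toy version of the fission -> clustering -> classification chain.
        `optical` is (H, W, bands), `sar` is (H, W); both co-registered."""
        h, w = sar.shape
        # 1) Feature fission: stack per-pixel features from both sensors.
        features = np.concatenate(
            [optical.reshape(h * w, -1), sar.reshape(h * w, 1)], axis=1)
        # 2) Unsupervised clustering: map the non-commensurable features
        #    into a common discrete label domain.
        codes = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(features)
        onehot = np.eye(n_clusters)[codes]  # one-hot cluster membership
        # 3) Supervised classification on the common representation
        #    (a simple Bayesian classifier stands in for the
        #    Bayesian/neural/graphical networks of the framework).
        clf = GaussianNB().fit(onehot[train_mask.ravel()], train_labels)
        return clf.predict(onehot).reshape(h, w)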
Finally, a general data processing chain for multi-sensor data fusion is presented. Examples of buildings in an urban area are given for very high resolution spaceborne optical WorldView-2 and radar TerraSAR-X imagery over the city of Munich, Germany, acquired in different geometries, including the orthogonal one. Additionally, a theoretical analysis of radar signatures of buildings in urban areas and their impact on joint classification and data fusion is discussed.
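Such signature analysis rests on standard SAR imaging geometry: a building of height h imaged at incidence angle theta casts a layover area of roughly h/tan(theta) in front of it and a radar shadow of roughly h*tan(theta) behind it in ground range, which gives a first-order explanation of why building classification depends on the acquisition geometry. A minimal sketch with illustrative numbers:

    import math

    def layover_extent(height_m: float, incidence_deg: float) -> float:
        """Ground-range extent of the layover area in front of a wall
        facing the sensor (building top displaced toward the radar)."""
        return height_m / math.tan(math.radians(incidence_deg))

    def shadow_extent(height_m: float, incidence_deg: float) -> float:
        """Ground-range extent of the radar shadow behind the building."""
        return height_m * math.tan(math.radians(incidence_deg))

    # Example: a 30 m building at 35 deg incidence (illustrative values):
    # layover ~ 42.8 m, shadow ~ 21.0 m, so both effects can easily
    # overlap neighboring objects in dense urban areas.
    print(layover_extent(30.0, 35.0), shadow_extent(30.0, 35.0))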