The German turnover tax statistics panel
Based on the yearly turnover tax statistics, the German turnover tax statistics panel allows, for the first time, detailed longitudinal analyses of nearly all economic sectors. In addition to turnover-tax-related variables, the dataset provides information about exports and imports and, through linkage with the German business register (Unternehmensregister), about employees liable to pay social insurance. The panel contains more than 4.3 million enterprises, 1.9 million of which are covered over the whole period from 2001 to 2005. No other German statistical source covers nearly all economic sectors this completely. In the following we give an overview of the turnover tax statistics and the matching process (sections 2 and 3). Section 4 describes the variables included in the dataset, and section 5 presents examples of the research potential. The paper closes with information on how to access the data (section 6).
Report drawn up on behalf of the Committee on Social Affairs and Public Health on the proposals from the Commission of the European Communities to the Council (Doc. 96/70) concerning: I. a directive on the approximation of the laws of the Member States relating to beer; II. a regulation amending Regulation No 120/67/EEC and Regulation No 359/67/EEC as regards the production refund granted for certain products used in brewing. Working Documents 1971-1972, Document 44/71, 7 June 1971.
Main Memory Adaptive Indexing for Multi-core Systems
Adaptive indexing is a concept that considers index creation in databases as
a by-product of query processing, as opposed to traditional full index creation
where the indexing effort is performed up front before answering any queries.
Adaptive indexing has received a considerable amount of attention, and several
algorithms have been proposed over the past few years, including a recent
experimental study comparing a large number of existing methods. Until now,
however, most adaptive indexing algorithms have been designed single-threaded,
yet with multi-core systems already well established, the idea of designing
parallel algorithms for adaptive indexing is very natural. In this regard only
one parallel algorithm for adaptive indexing has recently appeared in the
literature: The parallel version of standard cracking. In this paper we
describe three alternative parallel algorithms for adaptive indexing, including
a second variant of a parallel standard cracking algorithm. Additionally, we
describe a hybrid parallel sorting algorithm, and a NUMA-aware method based on
sorting. We then thoroughly compare all these algorithms experimentally, along with
a variant of a recently published parallel version of radix sort. Parallel
sorting algorithms serve as a realistic baseline for multi-threaded adaptive
indexing techniques. In total we experimentally compare seven parallel
algorithms. Additionally, we extensively profile all considered algorithms. The
initial set of experiments considered in this paper indicates that our parallel
algorithms significantly improve over previously known ones. Our results
suggest that, although adaptive indexing algorithms are a good design choice in
single-threaded environments, the rules change considerably in the parallel
case. That is, in future highly-parallel environments, sorting algorithms could
be serious alternatives to adaptive indexing. Comment: 26 pages, 7 figures
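For readers unfamiliar with the technique, the following is a minimal single-threaded sketch of standard cracking, the adaptive indexing baseline the abstract refers to. It is not the paper's parallel algorithm. The class name `CrackerColumn` and its methods are hypothetical, chosen only for illustration. Each range query partitions the piece of the column that contains a query bound, so the column becomes gradually more ordered as a by-product of query processing.

```python
import bisect


class CrackerColumn:
    """Illustrative sketch of standard cracking on one integer column."""

    def __init__(self, values):
        self.data = list(values)  # cracker column: a copy that queries reorganize
        self.pivots = []          # sorted pivot values seen so far
        self.positions = []       # positions[i]: first index holding a value >= pivots[i]

    def _crack(self, pivot):
        """Partition the piece containing `pivot` and return the split position."""
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.positions[i]  # already cracked on this value
        # Boundaries of the piece that must contain the pivot.
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.pivots) else len(self.data)
        # In-place two-way partition of data[lo:hi] around the pivot.
        left, right = lo, hi - 1
        while left <= right:
            if self.data[left] < pivot:
                left += 1
            else:
                self.data[left], self.data[right] = self.data[right], self.data[left]
                right -= 1
        self.pivots.insert(i, pivot)
        self.positions.insert(i, left)
        return left

    def range_query(self, low, high):
        """Return all values v with low <= v < high (assumes low <= high)."""
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]


column = CrackerColumn([13, 16, 4, 9, 2, 12, 7, 1, 19, 3])
first = column.range_query(5, 13)   # cracks the column on 5 and 13
second = column.range_query(2, 8)   # only touches the pieces containing 2 and 8
```

The second query partitions only the two pieces that contain its bounds, which is why the per-query cost tends to fall as more queries arrive.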
Only Aggressive Elephants are Fast Elephants
Yellow elephants are slow. A major reason is that they consume their inputs
entirely before responding to an elephant rider's orders. Some clever riders
have trained their yellow elephants to only consume parts of the inputs before
responding. However, the teaching time to make an elephant do that is high. So
high that the teaching lessons often do not pay off. We take a different
approach. We make elephants aggressive; only this will make them very fast. We
propose HAIL (Hadoop Aggressive Indexing Library), an enhancement of HDFS and
Hadoop MapReduce that dramatically improves runtimes of several classes of
MapReduce jobs. HAIL changes the upload pipeline of HDFS in order to create
different clustered indexes on each data block replica. An interesting feature
of HAIL is that we typically create a win-win situation: we improve both data
upload to HDFS and the runtime of the actual Hadoop MapReduce job. In terms of
data upload, HAIL improves over HDFS by up to 60% with the default replication
factor of three. In terms of query execution, we demonstrate that HAIL runs up
to 68x faster than Hadoop. In our experiments, we use six clusters including
physical and EC2 clusters of up to 100 nodes. A series of scalability
experiments also demonstrates the superiority of HAIL. Comment: VLDB201
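The core idea of a different clustered index on each block replica can be illustrated with a toy, in-memory sketch. This is not HAIL's or Hadoop's API: the functions `upload_block` and `select_equal`, the dictionary-based records, and the attribute names are all assumptions made for illustration. Each replica stores the same records sorted on a different attribute, and a selection is routed to the replica whose sort order matches the filter attribute.

```python
import bisect
from operator import itemgetter


def upload_block(records, index_attrs):
    """Create one replica per attribute, each sorted (clustered) on that attribute."""
    replicas = {}
    for attr in index_attrs:
        rows = sorted(records, key=itemgetter(attr))
        keys = [row[attr] for row in rows]  # sorted key list used for binary search
        replicas[attr] = (keys, rows)
    return replicas


def select_equal(replicas, attr, value):
    """Answer attr == value from a matching replica, else fall back to a full scan."""
    if attr in replicas:
        keys, rows = replicas[attr]
        lo = bisect.bisect_left(keys, value)
        hi = bisect.bisect_right(keys, value)
        return rows[lo:hi]
    # No replica is clustered on this attribute: scan any replica.
    _, rows = next(iter(replicas.values()))
    return [row for row in rows if row[attr] == value]


records = [
    {"id": 3, "country": "DE", "year": 2011},
    {"id": 1, "country": "FR", "year": 2012},
    {"id": 2, "country": "DE", "year": 2012},
]
# Three replicas, mirroring the default replication factor of three.
replicas = upload_block(records, ["id", "country", "year"])
german_rows = select_equal(replicas, "country", "DE")
```

With three replicas, up to three different filter attributes can be served by binary search instead of a scan, while the data is still stored only three times.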
Automatic, Descriptor-Based Support of Document Analysis for Focusing and Classification of Business Letters
This work was carried out within the ALV project (Automatisches Lesen und Verstehen, automatic reading and understanding) at the German Research Center for Artificial Intelligence (DFKI). The goal of the ALV project is to develop an intelligent interface between paper and computer (paper-computer interface), taking a step toward the paperless office by imitating human reading behavior. ALV studies business letters as an example domain. Subareas of the project are layout extraction, logical labeling, text recognition, and text analysis; this work belongs to text analysis. The task was to determine, from the words occurring in the body of a letter, the type of the letter and first indications of the author's intention. Other experts can use such information for further processing, distribution, and archiving of the letters. The INFOCLAS system, developed and implemented as part of a diploma thesis, therefore attempts to provide the following functionality on the basis of statistical procedures and methods from information retrieval:
i) extraction and weighting of meaning-bearing words;
ii) determination of the core message (focus) of a business letter;
iii) classification of a business letter into predefined message types.
The modules developed for this purpose (indexer, focusser, and classifier) use, in addition to concepts from information retrieval, a database containing a collection of business letters as well as specific word lists that represent the modeled letter classes. A morphological tool for the grammatical analysis of words serves as a further aid. With these knowledge sources, hypotheses about the letter class and the core message of the letter's content are generated.
In this documentation, existing techniques of information retrieval (IR) are compared and evaluated for their application in document analysis and understanding. Moreover, we have developed a system called INFOCLAS, which uses appropriate statistical methods of IR, primarily for the classification of German business letters into corresponding message types such as order, offer, confirmation, inquiry, and advertisement. INFOCLAS is a first step towards the understanding of business letters. It currently comprises three modules: the central indexer (extraction and weighting of indexing terms), the classifier (classification of business letters into given types), and the focusser (highlighting relevant parts of the letter). INFOCLAS integrates several knowledge sources, including a database of about 120 letters, word frequency statistics for German, message-type-specific words, morphological knowledge, and the underlying document model (layout and logical structure). As output, the system computes a set of weighted hypotheses about the type of the letter at hand. A classification of documents allows the automatic distribution or archiving of letters and is also an excellent starting point for higher-level document analysis.
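The abstract does not state the exact weighting or scoring formulas, so the following is only a sketch of how an indexer and a classifier of this kind could interact. It assumes a TF-IDF-style weight and a simple sum over class word lists. The function names, the word lists, and the document frequencies are hypothetical and not taken from INFOCLAS.

```python
import math
import re
from collections import Counter


def index_terms(text, doc_freq, n_docs):
    """Weight each word by its frequency in the letter times an inverse document frequency."""
    words = re.findall(r"[a-zäöüß]+", text.lower())
    term_freq = Counter(words)
    return {
        word: count * math.log((1 + n_docs) / (1 + doc_freq.get(word, 0)))
        for word, count in term_freq.items()
    }


def classify(weights, class_words):
    """Return (message type, score) hypotheses, best first."""
    scores = {
        message_type: sum(weights.get(word, 0.0) for word in words)
        for message_type, words in class_words.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical background statistics: number of letters containing each word.
doc_freq = {"we": 100, "the": 110, "please": 80}
class_words = {
    "order": {"order", "items", "deliver"},
    "offer": {"offer", "price", "discount"},
}
letter = "We order the following items: please confirm the order"
weights = index_terms(letter, doc_freq, n_docs=120)
hypotheses = classify(weights, class_words)
```

Frequent function words receive weights close to zero, so the ranking is driven by the rarer, message-type-specific words, which matches the role the abstract assigns to the word lists.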
Energy Efficiency in Machining of Aircraft Components
High production costs and material removal rates characterize the manufacturing of aircraft components made of titanium. Due to competitive pressure, the manufacturing processes are highly optimized from an economic perspective, whereas environmental aspects are usually not considered. One example is the recycling of titanium chips. Because of process-induced contamination, the chips do not meet the quality required for recycling into high-grade titanium alloys. Thus the components need to be manufactured from primary material, which leads to a poor energy balance. This paper describes a methodology for increasing the recycling rate and energy efficiency of the manufacturing process. It investigates the parameters of the machining process that influence chip quality, with the aim of raising chip quality to a recyclable level while taking costs into account. The analysis shows that the recycling rate can be significantly increased through dry cutting, which also brings economic benefits. Funding: German Federal Ministry for Economic Affairs and Energy (BMWi)/03ET1174
Automatic Feature-Based Point Cloud Registration for a Moving Sensor Platform
The automatic and accurate alignment of multiple point clouds is a basic requirement for an adequate digitization, reconstruction and interpretation of large 3D environments. Due to recent technological advancements, modern devices are available which allow for simultaneously capturing intensity and range images with high update rates. Hence, such devices can even be used for dynamic scene analysis and for rapid mapping, which is particularly required for environmental applications and disaster management, but unfortunately, they also reveal severe restrictions. Facing challenges with respect to noisy range measurements, a limited non-ambiguous range, a limited field of view and the occurrence of scene dynamics, the adequate alignment of captured point clouds has to satisfy additional constraints compared to the classical registration of terrestrial laser scanning (TLS) point clouds for describing static scenes. In this paper, we propose a new methodology for point cloud registration which considers such constraints while maintaining the fundamental properties of high accuracy and low computational effort without relying on a good initial alignment or human interaction. Exploiting 2D image features and 2D/2D correspondences, sparse point clouds of physically almost identical 3D points are derived. Subsequently, these point clouds are aligned with a fast procedure directly taking into account the reliability of the detected correspondences with respect to geometric and radiometric information. The proposed methodology is evaluated and its performance is demonstrated for data captured with a moving sensor platform which has been designed for monitoring from low altitudes. Due to the provided reliability and a fast processing scheme, the proposed methodology offers a high potential for dynamic scene capture and analysis.
Against Bureaucracy. Why Flexibility and Decentralisation Cannot Solve Organisational Problems
Kühl S, Dittrich EJ. Against Bureaucracy. Why Flexibility and Decentralisation Cannot Solve Organisational Problems. In: Makó C, Warhurst C, eds. The Management and Organisation of Firms in the Global Context. Budapest: University of Gödöllő; 1999: 119-125
Investigations on a standardized process chain and support structure related rework procedures of SLM manufactured components
For the successful production of high-quality parts by selective laser melting (SLM), various process steps are required. Besides the SLM process itself, different pre- and rework steps are needed to produce a final component. Therefore, the first part of this paper presents a concept of a standardized process chain for carrying out the necessary planning and production procedures. For this purpose, the CAD model is enriched with information regarding support structures, the desired surface quality and the position of tooling points. Since major steps in the reworking procedure are the removal of residual powder, the removal of support structures and the finishing operations for functional component surfaces, selected experimental results concerning these steps are presented in the second part of the paper. Based on the results, recommendations for the design of support structures are given.
On the Dynamics of Export and Import Participation of German Manufacturing Enterprises: Empirical Findings from the Turnover Tax Panel 2001-2006
In 2008 the cross-sectional datasets of the turnover tax statistics were linked into a panel dataset for the first time, initially for the period 2001 to 2005; since mid-2009 the current version of the turnover tax panel, covering 2001 to 2006, has been available for analysis. This dataset offers the unique opportunity to follow over time all enterprises that were liable to turnover tax in this period. Because the data also contain information on enterprises' export and import activities, the turnover tax panel can be used, among other things, to report on the prevalence of export and import activities and on the dynamics of export and import participation at the enterprise level. In 2006, a good 20 percent of West German manufacturing enterprises and just under 14 percent of East German manufacturing enterprises both exported and imported. The share of manufacturing enterprises that neither exported nor imported in 2006 is 59 percent in West Germany and 67 percent in East Germany. An examination of the patterns of export and import participation over the years 2001 to 2006, together with transition matrices from 2001 to 2006, shows that the majority of enterprises do not change their status (neither exporter nor importer, exporter only, importer only, both exporter and importer) over time. Nevertheless, one third of the enterprises present in the dataset in all years considered changed their status at least once between 2001 and 2006.
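The kind of transition matrix and status-change share described above can be sketched as follows. The panel layout (a dictionary of enterprises mapping years to export and import values), the function names, and the sample figures are hypothetical and not taken from the turnover tax panel. `share_changed` assumes that its input already contains only enterprises observed in all years of interest.

```python
from collections import Counter


def status(exports, imports):
    """Classify an enterprise-year into one of the four trade statuses."""
    if exports and imports:
        return "both"
    if exports:
        return "exporter only"
    if imports:
        return "importer only"
    return "neither"


def transition_counts(panel, start, end):
    """Count enterprises by (status in start year, status in end year)."""
    counts = Counter()
    for years in panel.values():
        if start in years and end in years:
            counts[(status(*years[start]), status(*years[end]))] += 1
    return counts


def share_changed(panel):
    """Share of enterprises whose status differs in at least one observed year."""
    changed = sum(
        1 for years in panel.values()
        if len({status(*values) for values in years.values()}) > 1
    )
    return changed / len(panel)


# Hypothetical mini panel: enterprise -> year -> (exports, imports).
panel = {
    "a": {2001: (0, 0), 2006: (0, 0)},
    "b": {2001: (0, 0), 2006: (100, 0)},
    "c": {2001: (5, 5), 2006: (5, 5)},
    "d": {2001: (5, 0), 2006: (5, 0)},
}
matrix = transition_counts(panel, 2001, 2006)
changed = share_changed(panel)
```

Dividing each count by the total of its start-year status would turn the counts into the row percentages usually reported in a transition matrix.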