Government spending, migration, and human capital: impact on economic welfare and growth - theory and evidence
The purpose of this dissertation is to analyze the effects of public policies on rural-urban migration and human capital expansion, and to examine the role of human capital (among other domestic and external factors) in the long-term economic growth of developing countries. Human capital expansion and labor migration from villages to cities are two aspects of the structure of labor markets in poor countries that are continuously influenced by public policies; these policies are often ineffective or have unintended adverse consequences. For example, while much of human resource policy in developing countries is directed toward increasing the supply of educated labor, intersectoral in-country migration and unemployment have become more pronounced, requiring new thinking on policy responses. This dissertation analyzes the outcomes of such policies and offers insights into how they might be improved.
Chapter 2 extends a two-sector, general-equilibrium model of rural-urban migration to include government spending. Provision of public goods acts as a productivity-enhancing input in private production that results in external economies of scale. This approach is generalized by introducing an unbalanced allocation of public expenditure in rural and urban sectors due to political economy considerations, differential sector output elasticities with respect to government input, and distortionary taxation. The chapter studies the effects of an increase in public spending and taxation on sectoral outputs, factor prices, urban unemployment, and welfare. Of particular concern here is to study the effect of an unbalanced allocation of government spending between rural and urban areas.
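The kind of structure the chapter builds on can be sketched as a Harris-Todaro economy augmented with a public input; the notation below is illustrative, not the chapter's own:

```latex
% Sector outputs with government spending G as a productivity-enhancing
% input generating external economies of scale (elasticities e_r, e_u):
X_r = G^{e_r} F_r(L_r), \qquad X_u = G^{e_u} F_u(L_u)
% Harris-Todaro migration equilibrium: the rural wage equals the expected
% urban wage, with L_u/(L_u + U) the probability of urban employment:
w_r = \frac{L_u}{L_u + U}\,\bar{w}_u
```

An unbalanced allocation of G across the two sectors, as discussed above, shifts the sectoral elasticities' relative impact and hence the migration equilibrium.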
Chapter 3 studies the effects of selected education policies on the size of the educated labor pool and on economic welfare using the “job ladder” model of education, which is relevant to liberal arts education in developing countries. The policies considered are (1) increasing the teacher-student ratio, (2) raising the relative wage of teachers, and (3) increasing the direct subsidy per student. In addition, the chapter analyzes the impact of wage rigidities in the skilled or modern sector on the size of the educated labor force. The analysis consists of five major sections. First, it reformulates the Bhagwati-Srinivasan job ladder model to make it amenable to analyzing the comparative static results of the effects of selected policies. Second, since higher education is mostly publicly financed, the analysis extends the job ladder model to incorporate public financing of the education sector. It then examines this model and the effects of changes in policy parameters. Third, the analysis develops another extension of the job ladder model to include private tuition practices by teachers that are prevalent in many developing countries. Fourth, to analyze the impact of wage rigidities in a less restrictive framework where individuals can choose education based on ability and cost, the chapter develops an overlapping generations model of education with job ladder assumptions of wage rigidities in the skilled or modern sector. The chapter examines the flexible market and fixed market (with wage rigidities) equilibrium scenarios, and compares the impact on the threshold level of abilities and the size of the educated labor force. Finally, using specific functional forms of human capital production, cost, and ability density functions, the chapter analyzes the equilibrium outcomes.
The analysis shows that in an economy with wage rigidities in the skilled sectors (modern and education sectors), the result of quality-enhancing policies under the simple job ladder model is an increase in the total size of the educated labor force. However, under an extended version of the job ladder model, the result depends on the relative size of the effects of an increase in the cost of education and the effects of an increase in the expected wage. The overlapping generations/job ladder model formulation used in the chapter finds that an increase in the present value of the expected wage and/or an increase in the marginal product of education will increase the demand for education. The minimum threshold level of ability falls, and more people are encouraged to acquire educational skills.
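The threshold mechanism described above can be written compactly; the symbols below are illustrative, not the chapter's own notation:

```latex
% An individual of ability a acquires education iff the present value of
% the expected skilled wage net of schooling cost beats the unskilled
% alternative; the cost C(a) is decreasing in ability:
V_s(a) - C(a) \ge V_u, \qquad C'(a) < 0
% The threshold a^* solves V_s(a^*) - C(a^*) = V_u. A rise in the expected
% skilled wage raises V_s, lowers a^*, and enlarges the educated labor force.
```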
Chapter 4 estimates the effects of openness, trade orientation, human capital, and other factors on total factor productivity (TFP) and output for a pooled cross-section, time-series sample of countries from Africa and Asia, as well as for the two regions separately. The models are estimated for the level and growth of both TFP and output by using panel fixed effects. The generalized method of moments is also applied to address endogeneity issues. Several variables related to political, financial, and economic risks are used as instruments, together with the lagged values of the dependent and endogenous explanatory variables. The data for this study span 40 years (1972–2011) and are grouped into five-year averages. Several sources were used to obtain the most up-to-date data, including the newly released Penn World Table (Version 8.0). The chapter finds that inducing a greater outward orientation generally boosts TFP, per capita output, and growth. Greater accumulation of human capital has a consistently positive effect on output and TFP growth in both Africa and Asia. Its positive influence arises independently of the trade variables rather than through interaction terms with openness. Furthermore, inflation does not negatively affect growth, although inflation variability is found to adversely affect TFP and output in Africa.
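The within (fixed-effects) estimator underlying the panel regressions can be sketched by demeaning within each country; this is a generic textbook version on synthetic data, not the chapter's actual specification, instruments, or dataset:

```python
import numpy as np

def within_estimator(y, X, groups):
    """Panel fixed-effects (within) estimator: demean y and X inside each
    group (country), then run pooled OLS on the demeaned data, which wipes
    out time-invariant country effects."""
    y = np.asarray(y, float).copy()
    X = np.asarray(X, float).copy()
    for g in np.unique(groups):
        m = groups == g
        y[m] -= y[m].mean()
        X[m] -= X[m].mean(axis=0)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Synthetic panel: 3 countries x 10 periods, true slope 2.0 plus country effects
rng = np.random.default_rng(0)
groups = np.repeat(np.arange(3), 10)
X = rng.normal(size=(30, 1))
y = 2.0 * X[:, 0] + np.array([5.0, -2.0, 1.0])[groups]
beta = within_estimator(y, X, groups)
print(beta)  # ≈ [2.]
```

GMM with lagged instruments, as the chapter applies, would replace the plain OLS step with an instrumented moment condition.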
Chapter 5 concludes the dissertation with a summary of major results and possible directions for future research.
The Robust Reading Competition Annotation and Evaluation Platform
The ICDAR Robust Reading Competition (RRC), initiated in 2003 and
re-established in 2011, has become a de facto evaluation standard for robust
reading systems and algorithms. Concurrent with its second incarnation in 2011,
a continuous effort started to develop an on-line framework to facilitate the
hosting and management of competitions. This paper outlines the Robust Reading
Competition Annotation and Evaluation Platform, the backbone of the
competitions. The RRC Annotation and Evaluation Platform is a modular
framework, fully accessible through on-line interfaces. It comprises a
collection of tools and services for managing all processes involved with
defining and evaluating a research task, from dataset definition to annotation
management, evaluation specification and results analysis. Although the
framework has been designed with robust reading research in mind, many of the
provided tools are generic by design. All aspects of the RRC Annotation and
Evaluation Framework are available for research use.
Comment: 6 pages, accepted to DAS 201
READ-BAD: A New Dataset and Evaluation Scheme for Baseline Detection in Archival Documents
Text line detection is crucial for any application associated with Automatic
Text Recognition or Keyword Spotting. Modern algorithms perform well on
well-established datasets, since these comprise either clean data or
simple/homogeneous page layouts. We have collected and annotated 2036 archival
document images from different locations and time periods. The dataset contains
varying page layouts and degradations that challenge text line segmentation
methods. Well-established text line segmentation evaluation schemes, such as
the Detection Rate or Recognition Accuracy, require binarized data that is
annotated at the pixel level. Producing ground truth by these means is laborious
and not needed to determine a method's quality. In this paper we propose a new
evaluation scheme that is based on baselines. The proposed scheme has no need
for binarization and it can handle skewed as well as rotated text lines. The
ICDAR 2017 Competition on Baseline Detection and the ICDAR 2017 Competition on
Layout Analysis for Challenging Medieval Manuscripts used this evaluation
scheme. Finally, we present results achieved by a recently published text line
detection algorithm.
Comment: Submitted to DAS201
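A baseline-based evaluation of the kind proposed can be sketched as greedy matching of predicted to ground-truth polylines under a distance tolerance; the real scheme uses tolerance areas around ground-truth baselines, so the mean nearest-point distance below is only a crude stand-in:

```python
import numpy as np

def mean_nearest_dist(a, b):
    """Symmetric mean nearest-point distance between two polylines, a simple
    proxy for the tolerance areas of the actual evaluation scheme."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

def baseline_prf(preds, gts, tol=10.0):
    """Greedily match predicted baselines to ground-truth baselines within a
    distance tolerance, then report precision, recall, and F1."""
    matched = set()
    tp = 0
    for p in preds:
        best, best_d = None, tol
        for i, g in enumerate(gts):
            if i in matched:
                continue
            d = mean_nearest_dist(p, g)
            if d <= best_d:
                best, best_d = i, d
        if best is not None:
            matched.add(best)
            tp += 1
    prec = tp / len(preds) if preds else 0.0
    rec = tp / len(gts) if gts else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1
```

Because matching is on polyline geometry rather than pixel masks, no binarization is needed, and skewed or rotated lines are handled for free.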
Learning Surrogate Models of Document Image Quality Metrics for Automated Document Image Processing
Computation of document image quality metrics often depends upon the
availability of a ground truth image corresponding to the document. This limits
the applicability of quality metrics in applications such as hyperparameter
optimization of image processing algorithms that operate on-the-fly on unseen
documents. This work proposes the use of surrogate models to learn the behavior
of a given document quality metric on existing datasets where ground truth
images are available. The trained surrogate model can later be used to predict
the metric value on previously unseen document images without requiring access
to ground truth images. The surrogate model is empirically evaluated on the
Document Image Binarization Competition (DIBCO) and the Handwritten Document
Image Binarization Competition (H-DIBCO) datasets.
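The surrogate idea can be sketched as fitting a regressor from per-image features to metric values on a reference set where ground truth exists; the ridge regressor and synthetic features below are illustrative choices, not the paper's actual model:

```python
import numpy as np

def fit_surrogate(F, q, lam=1e-3):
    """Ridge-regression surrogate: learn weights w with F @ w ~ q, where F
    holds per-image feature vectors and q the quality-metric values computed
    against ground truth on a reference set (e.g. DIBCO images)."""
    F = np.asarray(F, float)
    q = np.asarray(q, float)
    A = F.T @ F + lam * np.eye(F.shape[1])
    return np.linalg.solve(A, F.T @ q)

def predict_metric(F_new, w):
    """Estimate the metric on unseen images; no ground truth needed."""
    return np.asarray(F_new, float) @ w

# Synthetic check: a metric that happens to be linear in the features
rng = np.random.default_rng(1)
F = rng.normal(size=(50, 4))
w_true = np.array([0.5, -1.0, 2.0, 0.1])
w = fit_surrogate(F, F @ w_true)
```

Once trained, `predict_metric` can score unseen documents inside a hyperparameter-optimization loop, which is exactly the use case the abstract motivates.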
Curriculum Learning for Handwritten Text Line Recognition
Recurrent Neural Networks (RNN) have recently achieved the best performance
in off-line Handwritten Text Recognition. At the same time, training RNNs by
gradient descent leads to slow convergence, and training times are particularly
long when the training database consists of full lines of text. In this paper,
we propose an easy way to accelerate stochastic gradient descent in this
set-up, and in the general context of learning to recognize sequences. The
principle is called Curriculum Learning, or shaping. The idea is to first learn
to recognize short sequences before training on all available training
sequences. Experiments on three different handwritten text databases (Rimes,
IAM, OpenHaRT) show that a simple implementation of this strategy can
significantly speed up the training of RNN for Text Recognition, and even
significantly improve performance in some cases.
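The curriculum schedule described above can be sketched in a few lines, assuming (image, transcript) training pairs; the staging fractions are illustrative, not the paper's:

```python
def curriculum_batches(samples, stages=3):
    """Order samples from short to long transcripts and release them in
    stages: stage k trains on the shortest k/stages fraction of the data,
    so early SGD updates see short, easy sequences first."""
    ranked = sorted(samples, key=lambda s: len(s[1]))  # (image, text) pairs
    n = len(ranked)
    for k in range(1, stages + 1):
        yield ranked[: max(1, (k * n) // stages)]

# Toy usage: six samples with transcript lengths 1..6
samples = [("img", t) for t in ["abcdef", "ab", "abcd", "a", "abc", "abcde"]]
stages = list(curriculum_batches(samples, stages=3))
```

The final stage covers the full training set, so the schedule only changes the order in which data becomes available, not what the model ultimately sees.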
CNN training with graph-based sample preselection: application to handwritten character recognition
In this paper, we present a study on sample preselection in large training
data set for CNN-based classification. To do so, we structure the input data
set in a network representation, namely the Relative Neighbourhood Graph, and
then extract some vectors of interest. The proposed preselection method is
evaluated in the context of handwritten character recognition, by using two
data sets, up to several hundred thousand images. It is shown that the
graph-based preselection can reduce the training data set without degrading the
recognition accuracy of a shallow, non-pretrained CNN model.
Comment: Paper of 10 pages. Minor spelling corrections relative to v2. Accepted
as an oral paper at the 13th IAPR International Workshop on Document Analysis
Systems (DAS 2018).
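The Relative Neighbourhood Graph at the heart of the preselection can be computed naively in O(n^3); the border criterion below (keep endpoints of edges that cross class labels) is a simplified stand-in for the paper's extraction of vectors of interest:

```python
import numpy as np

def rng_edges(X):
    """Relative Neighbourhood Graph: points p and q are linked iff no third
    point r is simultaneously closer to both than they are to each other."""
    X = np.asarray(X, float)
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    n = len(X)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if not any(max(D[i, k], D[j, k]) < D[i, j]
                       for k in range(n) if k not in (i, j)):
                edges.append((i, j))
    return edges

def preselect(X, y):
    """Keep only samples incident to an RNG edge whose endpoints carry
    different class labels, i.e. samples near a decision border."""
    keep = set()
    for i, j in rng_edges(X):
        if y[i] != y[j]:
            keep.update((i, j))
    return sorted(keep)
```

On real data the quadratic distance matrix dominates memory, so a practical version would compute it blockwise; the toy version keeps the logic readable.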
Symbol detection in online handwritten graphics using Faster R-CNN
Symbol detection techniques in online handwritten graphics (e.g. diagrams and
mathematical expressions) consist of methods specifically designed for a single
graphic type. In this work, we evaluate the Faster R-CNN object detection
algorithm as a general method for detection of symbols in handwritten graphics.
We evaluate different configurations of the Faster R-CNN method, and point out
issues relative to the handwritten nature of the data. Considering the online
recognition context, we evaluate efficiency and accuracy trade-offs of using
Deep Neural Networks of different complexities as feature extractors. We
evaluate the method on publicly available flowchart and mathematical expression
(CROHME-2016) datasets. Results show that Faster R-CNN can be effectively used
on both datasets, enabling the possibility of developing general methods for
symbol detection, and furthermore, general graphic understanding methods that
could be built on top of the algorithm.
Comment: Submitted to DAS-201
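Anchor labeling, the step at the core of Faster R-CNN's region proposal network, rests on pairwise intersection-over-union between candidate boxes and ground-truth symbol boxes; a minimal sketch (generic, not the authors' code):

```python
import numpy as np

def iou(boxes_a, boxes_b):
    """Pairwise intersection-over-union for [x1, y1, x2, y2] boxes, the
    overlap measure Faster R-CNN uses to label anchors as positive or
    negative training examples."""
    a = np.asarray(boxes_a, float)[:, None]
    b = np.asarray(boxes_b, float)[None, :]
    lt = np.maximum(a[..., :2], b[..., :2])   # intersection top-left
    rb = np.minimum(a[..., 2:], b[..., 2:])   # intersection bottom-right
    wh = np.clip(rb - lt, 0, None)
    inter = wh[..., 0] * wh[..., 1]
    area = lambda x: (x[..., 2] - x[..., 0]) * (x[..., 3] - x[..., 1])
    return inter / (area(a) + area(b) - inter)
```

In the standard setup, anchors with IoU above a high threshold against any symbol box become positives, and those below a low threshold become negatives; the rest are ignored.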
Handwriting Recognition of Historical Documents with few labeled data
Historical documents present many challenges for offline handwriting
recognition systems, among them, the segmentation and labeling steps. Carefully
annotated textlines are needed to train an HTR system. In some scenarios,
transcripts are only available at the paragraph level with no text-line
information. In this work, we demonstrate how to train an HTR system with a
small amount of labeled data. Specifically, we train a deep convolutional recurrent neural
network (CRNN) system on only 10% of manually labeled text-line data from a
dataset and propose an incremental training procedure that covers the rest of
the data. Performance is further increased by augmenting the training set with
specially crafted multiscale data. We also propose a model-based normalization
scheme which considers the variability in the writing scale at the recognition
phase. We apply this approach to the publicly available READ dataset. Our
system achieved the second-best result in the ICDAR2017 competition.
Baseline Detection in Historical Documents using Convolutional U-Nets
Baseline detection is still a challenging task for heterogeneous collections
of historical documents. We present a novel approach to baseline extraction in
such settings, which turned out to be the winning entry to the ICDAR 2017
Competition on Baseline Detection (cBAD). It utilizes deep convolutional nets
(CNNs) both for the actual extraction of baselines and for a simple form of
layout analysis in a pre-processing step. To the best of our knowledge, it is the first
CNN-based system for baseline extraction applying a U-net architecture and
sliding window detection, profiting from a high local accuracy of the candidate
lines extracted. Final baseline post-processing complements our approach,
compensating for inaccuracies mainly due to missing context information during
sliding window detection. We experimentally evaluate the components of our
system individually on the cBAD dataset. Moreover, we investigate how it
generalizes to different data by means of the dataset used for the baseline
extraction task of the ICDAR 2017 Competition on Layout Analysis for
Challenging Medieval Manuscripts (HisDoc). A comparison with the results
reported for HisDoc shows that it also outperforms the contestants of the
latter.
Comment: 6 pages, accepted to DAS 201
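Sliding-window detection of the kind described needs a tiling of the page into overlapping patches; a minimal sketch, with window size and stride as illustrative parameters:

```python
def windows(h, w, size=256, stride=128):
    """Top-left corners of overlapping sliding windows covering an h x w
    page; the last window in each direction is clamped to the image edge so
    no border pixels are missed."""
    ys = list(range(0, max(h - size, 0) + 1, stride))
    xs = list(range(0, max(w - size, 0) + 1, stride))
    if ys[-1] != max(h - size, 0):
        ys.append(max(h - size, 0))
    if xs[-1] != max(w - size, 0):
        xs.append(max(w - size, 0))
    return [(y, x) for y in ys for x in xs]
```

Per-window U-net predictions are then typically averaged in the overlap regions before the baseline post-processing runs on the stitched map.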
A fine-grained approach to scene text script identification
This paper focuses on the problem of script identification in unconstrained
scenarios. Script identification is an important prerequisite to recognition,
and an indispensable condition for automatic text understanding systems
designed for multi-language environments. Although widely studied for document
images and handwritten documents, it remains an almost unexplored territory for
scene text images.
We detail a novel method for script identification in natural images that
combines convolutional features and the Naive-Bayes Nearest Neighbor
classifier. The proposed framework efficiently exploits the discriminative
power of small stroke-parts, in a fine-grained classification framework.
In addition, we propose a new public benchmark dataset for the evaluation of
joint text detection and script identification in natural scenes. Experiments
on this new dataset demonstrate that the proposed method yields
state-of-the-art results, while generalizing well to different datasets and a
variable number of scripts. The evidence provided shows that multi-lingual scene text
recognition in the wild is a viable proposition. Source code of the proposed
method is made available online.
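The Naive-Bayes Nearest Neighbor decision rule the method pairs with convolutional features can be sketched as follows; the toy 2-D descriptors stand in for the paper's stroke-part features:

```python
import numpy as np

def nbnn_classify(query_descs, class_descs):
    """Naive-Bayes Nearest Neighbor: for each class, sum over the query's
    local descriptors the squared distance to that descriptor's nearest
    neighbor among the class's descriptors; return the class with the
    smallest total, i.e. image-to-class rather than image-to-image matching."""
    query = np.asarray(query_descs, float)
    best_label, best_cost = None, np.inf
    for label, descs in class_descs.items():
        D = ((query[:, None] - np.asarray(descs, float)[None, :]) ** 2).sum(-1)
        cost = D.min(axis=1).sum()
        if cost < best_cost:
            best_label, best_cost = label, cost
    return best_label
```

Because descriptors are matched against class-level pools rather than individual training images, the rule copes naturally with the small, discriminative stroke-parts the abstract emphasizes.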