Using Generic Summarization to Improve Music Information Retrieval Tasks
In order to satisfy processing time constraints, many MIR tasks process only
a segment of the whole music signal. This practice may lead to decreasing
performance, since the most important information for the tasks may not be in
those processed segments. In this paper, we leverage generic summarization
algorithms, previously applied to text and speech summarization, to summarize
items in music datasets. These algorithms build summaries that are both
concise and diverse by selecting appropriate segments from the input signal,
which makes them good candidates to summarize music as well. We evaluate the
summarization process on binary and multiclass music genre classification
tasks, by comparing the performance obtained using summarized datasets against
the performances obtained using continuous segments (which is the traditional
method used for addressing the previously mentioned time constraints) and full
songs of the same original dataset. We show that GRASSHOPPER, LexRank, LSA,
MMR, and a Support Sets-based Centrality model improve classification
performance when compared to selected 30-second baselines. We also show that
summarized datasets lead to a classification performance whose difference is
not statistically significant from using full songs. Furthermore, we make an
argument stating the advantages of sharing summarized datasets for future MIR
research.
Comment: 24 pages, 10 tables; Submitted to IEEE/ACM Transactions on Audio,
Speech and Language Processing
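The abstract names MMR among the summarizers but gives no implementation details. As a minimal sketch of how MMR-style selection could pick segments, assume each segment of a song is already represented by a feature vector. The function name `mmr_select`, the trade-off weight `lam`, and the use of the song centroid as the relevance query are illustrative assumptions, not the authors' setup.

```python
import numpy as np

def mmr_select(features, k, lam=0.7):
    """Greedily pick k segment indices that are relevant yet mutually diverse."""
    # Normalize rows so that dot products are cosine similarities.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    # Assumption: relevance is similarity to the (normalized) centroid of the song.
    centroid = unit.mean(axis=0)
    centroid = centroid / max(np.linalg.norm(centroid), 1e-12)
    relevance = unit @ centroid
    pairwise = unit @ unit.T

    selected = []
    candidates = set(range(len(features)))
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max((pairwise[i, j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(sorted(candidates), key=score)  # ties go to the lowest index
        selected.append(best)
        candidates.remove(best)
    return selected
```

With two identical segments and one distinct segment, a balanced `lam` skips the duplicate and picks the distinct segment second, which is the diversity behavior the abstract describes.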
On the Application of Generic Summarization Algorithms to Music
Several generic summarization algorithms were developed in the past and
successfully applied in fields such as text and speech summarization. In this
paper, we review and apply these algorithms to music. To evaluate this
summarization's performance, we adopt an extrinsic approach: we compare a Fado
Genre Classifier's performance using truncated contiguous clips against the
summaries extracted with those algorithms on 2 different datasets. We show that
Maximal Marginal Relevance (MMR), LexRank and Latent Semantic Analysis (LSA)
all improve classification performance in both datasets used for testing.
Comment: 12 pages, 1 table; Submitted to IEEE Signal Processing Letters
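For readers unfamiliar with LexRank, the following is a rough sketch of its thresholded variant applied to segment feature vectors: segments become graph nodes, edges connect sufficiently similar segments, and a PageRank-style iteration scores centrality. The function name `lexrank_scores` and the values of `threshold`, `damping`, and `iters` are assumptions for illustration; the abstract does not state the configuration used.

```python
import numpy as np

def lexrank_scores(features, threshold=0.1, damping=0.85, iters=100):
    """Score segments by centrality in a thresholded cosine-similarity graph."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    sim = unit @ unit.T
    # Keep only edges between sufficiently similar segments.
    adj = (sim > threshold).astype(float)
    np.fill_diagonal(adj, 1.0)  # self-loops guarantee no row sums to zero
    transition = adj / adj.sum(axis=1, keepdims=True)

    n = len(features)
    scores = np.full(n, 1.0 / n)
    for _ in range(iters):
        # Damped power iteration, as in PageRank.
        scores = (1 - damping) / n + damping * (transition.T @ scores)
    return scores
```

A summary would then be formed from the highest-scoring segments; how many to keep is left open here.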
The slippery slope : explaining the increase in extreme poverty in urban Brazil, 1976-96
Despite tremendous macroeconomic instability in Brazil, the country's distributions of urban income in 1976 and 1996 appear, at first glance, deceptively similar. Mean household income per capita was stagnant, with minute accumulated growth (4.3 percent) over the two decades. The Gini coefficient hovered just above 0.59 in both years, and the incidence of poverty (relative to a poverty line of R$60 a month in 1996 prices) remained effectively unchanged over the period, at 22 percent. Behind this apparent stability, however, a powerful combination of labor market, demographic, and educational dynamics was at work, one effect of which was to generate a substantial increase in extreme urban poverty. Using a decomposition methodology based on micro-simulation, which endogenizes labor incomes, individual occupational choices, and decisions about education, the authors show that the distribution of income was being affected by: 1) Three factors that tended to increase poverty: a decline in average returns to education and experience, a negative "growth" effect, and unfortunate changes in the structure of occupations and participation in the labor force. 2) Two factors that tended to reduce poverty: improved educational endowments across the board, and a progressive reduction in dependency ratios. The net effect was small and negative for measured inequality overall, and negligible for the incidence of poverty (relative to "high" poverty lines). But the net effect was to substantially increase extreme poverty, suggesting the creation of a group of urban households excluded from any labor market and trapped in indigence. Above the 15th percentile, urban Brazilians have "stayed put" only by climbing hard up a slippery slope. Counteracting falling returns in both self-employment and the labor market required substantially reduced fertility rates and an average of two extra years of schooling (which still left them undereducated for that income level).
Keywords: Economic Theory & Research, Health Economics & Finance, Environmental Economics & Policies, Public Health Promotion, Health Monitoring & Evaluation, Inequality, Governance Indicators, Poverty Assessment
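The two summary statistics quoted in this abstract, the Gini coefficient and the incidence of poverty relative to a line, can be computed as sketched below. The function names and the sample incomes in the usage note are invented for illustration; only the R$60 poverty line comes from the abstract, and the paper's micro-simulation itself is not reproduced here.

```python
import numpy as np

def gini(incomes):
    """Gini coefficient of non-negative incomes, using the sorted-rank formula."""
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return float(2 * np.sum(ranks * x) / (n * x.sum()) - (n + 1) / n)

def headcount_ratio(incomes, poverty_line):
    """Share of the population with income strictly below the poverty line."""
    x = np.asarray(incomes, dtype=float)
    return float(np.mean(x < poverty_line))
```

For example, with made-up monthly incomes `[30, 50, 70, 200]` and the R$60 line, `headcount_ratio` gives 0.5. A perfectly equal distribution gives a Gini of 0, and one person holding all income among four gives 0.75.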
Brazilian Land Tenure and Conflicts: The Landless Peasants' Movement
This paper analyzes conflicts and violence in Brazil involving landless peasants occupying privately owned land over the period 2000-2008. It is the first study to be undertaken at a national level, with a contemporary data span, using a count data model that allows for heterogeneity, endogeneity and dynamics. Results from the estimated model show that violent land occupation increases with left-wing political support for land occupation, rural population density, and agricultural credit, and decreases with poverty and agricultural productivity. The study discusses the interconnection of land reform, poverty and conflict.
Keywords: Land occupation, land reform, Brazil, poverty, conflict.
Summarization of Films and Documentaries Based on Subtitles and Scripts
We assess the performance of generic text summarization algorithms applied to
films and documentaries, using the well-known behavior of summarization of news
articles as reference. We use three datasets: (i) news articles, (ii) film
scripts and subtitles, and (iii) documentary subtitles. Standard ROUGE metrics
are used for comparing generated summaries against news abstracts, plot
summaries, and synopses. We show that the best performing algorithms are LSA,
for news articles and documentaries, and LexRank and Support Sets, for films.
Despite the different nature of films and documentaries, their relative
behavior is in accordance with that obtained for news articles.
Comment: 7 pages, 9 tables, 4 figures, submitted to Pattern Recognition
Letters (Elsevier)
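The abstract refers to standard ROUGE metrics without saying which variants were used. As a simplified illustration of the idea, ROUGE-N recall counts how many reference n-grams also appear in the generated summary. The function name `rouge_n_recall` and the whitespace tokenization are assumptions; real ROUGE toolkits apply further preprocessing.

```python
from collections import Counter

def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also occur in the candidate (clipped counts)."""
    def ngrams(text):
        tokens = text.lower().split()  # assumption: plain whitespace tokenization
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total
```

For the candidate "the cat sat" against the reference "the cat sat on the mat", unigram recall is 3/6 and bigram recall is 2/5.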
Workload-aware table splitting for NoSQL
Massive-scale data stores, which exhibit highly desirable scalability and availability properties, are becoming pivotal systems in today's infrastructures. The scalability achieved by these data stores is anchored on data independence: there is no clear relationship between data, and atomic inter-node operations are not a concern. This assumption about the data allows aggressive data partitioning. In particular, data tables are horizontally partitioned and spread across nodes for load balancing. However, in current versions of these data stores, partitioning is either a manual process or automated but based simply on table size. We argue that size-based partitioning does not lead to acceptable load balancing, as it ignores data access patterns, namely data hotspots. Moreover, manual data partitioning is cumbersome and typically infeasible in large-scale scenarios. In this paper we propose an automated table splitting mechanism that takes into account the system workload. We evaluate this mechanism, showing that it is simple, non-intrusive and effective.
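The contrast between size-based and workload-aware splitting can be illustrated with a toy sketch. It is not the mechanism proposed in the paper, which the abstract does not detail. The helper `split_index` and the per-row size and access counts are hypothetical: given rows in key order, it returns the split position that best balances a chosen weight.

```python
from itertools import accumulate

def split_index(weights):
    """Return i such that weights[:i] and weights[i:] have the most balanced totals.

    Assumes at least two rows, listed in key order.
    """
    total = sum(weights)
    prefix = list(accumulate(weights))
    # |left - right| equals |2 * left - total|; minimize it over valid split points.
    return min(range(1, len(weights)), key=lambda i: abs(2 * prefix[i - 1] - total))

# Hypothetical table of six equally sized rows with skewed access counts.
sizes = [1, 1, 1, 1, 1, 1]
accesses = [40, 30, 10, 10, 5, 5]

size_split = split_index(sizes)         # balances stored bytes
workload_split = split_index(accesses)  # balances request load
```

Here the size-based split falls in the middle of the table and leaves 80 of 100 accesses on one side, while the access-weighted split isolates the hottest row and yields a 40/60 division of load.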