Topical Bias in Generalist Mathematics Journals
Generalist mathematics journals exhibit bias toward the branches of
mathematics by publishing articles about some subjects in quantities far
disproportionate to the production of papers in those areas within all of
mathematics.
Comment: 8 pages, 3 figures
Spectral Condition Numbers of Orthogonal Projections and Full Rank Linear Least Squares Residuals
A simple formula is proved to be a tight estimate for the condition number of
the full rank linear least squares residual with respect to the matrix of least
squares coefficients and scaled 2-norms. The tight estimate reveals that the
condition number depends on three quantities, two of which can cause
ill-conditioning. The numerical linear algebra literature presents several
estimates of various instances of these condition numbers. All the prior values
exceed the formula introduced here, sometimes by large factors.
Comment: 15 pages, 1 figure, 2 tables
Multilingual Twitter Sentiment Classification: The Role of Human Annotators
What are the limits of automated Twitter sentiment classification? We analyze
a large set of manually labeled tweets in different languages, use them as
training data, and construct automated classification models. It turns out that
the quality of classification models depends much more on the quality and size
of training data than on the type of the model trained. Experimental results
indicate that there is no statistically significant difference between the
performance of the top classification models. We quantify the quality of
training data by applying various annotator agreement measures, and identify
the weakest points of different datasets. We show that the model performance
approaches the inter-annotator agreement when the size of the training set is
sufficiently large. However, it is crucial to regularly monitor the self- and
inter-annotator agreements since this improves the training datasets and
consequently the model performance. Finally, we show that there is strong
evidence that humans perceive the sentiment classes (negative, neutral, and
positive) as ordered.
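
The abstract above refers to annotator agreement measures without naming them. As a rough illustration only, the sketch below computes Cohen's kappa, one common chance-corrected agreement measure for two annotators; it is an assumption that a measure of this kind is relevant here, and it is not necessarily one the authors applied. The function name `cohen_kappa` and the sample labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Illustrative sketch: the function name and inputs are hypothetical,
    and this may not be a measure used in the paper.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both pick the same class if each
    # annotator labelled independently according to their own class frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical class; agreement is trivial.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up labels for six tweets, for illustration only.
annotator_1 = ["negative", "neutral", "positive", "positive", "neutral", "negative"]
annotator_2 = ["negative", "neutral", "positive", "neutral", "neutral", "positive"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Cohen's kappa treats the classes as unordered, so confusing negative with positive costs the same as confusing negative with neutral. If the sentiment classes are ordered, as the abstract concludes, a weighted or ordinal agreement measure would arguably be a better fit.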