412 research outputs found

    Utilizing Machine Learning Techniques for Classifying Translated and Non-Translated Corporate Annual Reports

    No full text
    Globalization has led to the widespread adoption of translated corporate annual reports in international markets. Nonetheless, it remains largely unexplored whether these translated documents fulfill the same function and communicate as effectively to international investors as their non-translated counterparts. Considering their significance to stakeholders, differentiating between these two types of reports is essential, yet research in this area is insufficient. This study seeks to bridge this gap by leveraging machine learning algorithms to classify corporate annual reports based on their translation status. By constructing corpora of comparable texts and employing thirteen syntactic complexity indices as features, we analyzed the reports using eight different algorithms: Naïve Bayes, Logistic Regression, Support Vector Machine, k-Nearest Neighbors, Neural Network, Random Forest, Gradient Boosting, and Deep Learning. Additionally, ensemble models were created by combining the three most effective algorithms. The best-performing model in our study achieved an Area Under the Curve (AUC) of 99.3%. This innovative approach demonstrates the effectiveness of syntactic complexity indices in machine learning for classifying translational language in corporate reporting, contributing valuable insights to text classification and translational language research. Our findings offer critical implications for stakeholders in multilingual contexts, highlighting the need for further research in this field.
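The pipeline the abstract describes (syntactic-complexity features, several classifiers, an ensemble of the strongest three, evaluation by AUC) can be sketched as below. This is a minimal illustration, not the paper's code: the feature values are synthetic stand-ins for the thirteen syntactic complexity indices, and the three ensembled algorithms are an assumed selection from the eight the abstract names.

```python
# Hedged sketch of the classification pipeline from the abstract.
# Synthetic data stands in for the 13 syntactic complexity indices;
# the "toy signal" line is invented so the example has learnable structure.
import numpy as np
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_reports, n_indices = 400, 13          # 13 syntactic complexity indices
X = rng.normal(size=(n_reports, n_indices))
y = rng.integers(0, 2, size=n_reports)  # 1 = translated, 0 = non-translated
X[y == 1] += 0.8                        # toy signal separating the classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

# Soft-voting ensemble of three of the algorithms named in the abstract.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    voting="soft",
)
ensemble.fit(X_tr, y_tr)
auc = roc_auc_score(y_te, ensemble.predict_proba(X_te)[:, 1])
print(f"AUC: {auc:.3f}")
```

Soft voting averages the three models' class probabilities, which is one common way to combine "the three most effective algorithms"; the paper may have used a different combination rule.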

    Examples of some retrieved concept relevance records (in part; some phrase records have been omitted here).

    No full text

    The impact of feature selection on clustering results.

    No full text
    Panels list the evaluation method, clustering method, and dataset: (A) accuracy, k-means, dataset 1; (B) accuracy, SOM, dataset 1; (C) BF, k-means, dataset 1; (D) BF, SOM, dataset 1; (E) accuracy, k-means, dataset 2; (F) accuracy, SOM, dataset 2; (G) BF, k-means, dataset 2; (H) BF, SOM, dataset 2.

    Clustering performance for different granularities and blend factors.

    No full text
    Panels list the evaluation method, clustering method, and dataset: (A) accuracy, k-means, dataset 1; (B) accuracy, SOM, dataset 1; (C) BF, k-means, dataset 1; (D) BF, SOM, dataset 1; (E) accuracy, k-means, dataset 2; (F) accuracy, SOM, dataset 2; (G) BF, k-means, dataset 2; (H) BF, SOM, dataset 2.
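One cell of the comparison grid above (accuracy of k-means clustering on one dataset) can be sketched as follows. This is an illustrative assumption, not the authors' code: the data are synthetic blobs, the SOM variant is omitted, and the figure's BF metric is not reproduced; accuracy here means cluster/label agreement under the best cluster-to-class assignment.

```python
# Sketch of scoring k-means output as "accuracy": match each cluster to a
# class with the Hungarian method, then count agreements. Synthetic data.
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, y = make_blobs(n_samples=300, centers=3, random_state=0)
pred = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Confusion counts: rows are cluster ids, columns are true class labels.
cm = np.zeros((3, 3), dtype=int)
for c, t in zip(pred, y):
    cm[c, t] += 1
rows, cols = linear_sum_assignment(-cm)  # maximize matched counts
accuracy = cm[rows, cols].sum() / len(y)
print(f"clustering accuracy: {accuracy:.3f}")
```

The Hungarian matching step matters because cluster ids are arbitrary: without it, a perfect clustering with permuted labels would score near zero.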

    The extension at the word level.

    No full text

    The two datasets.

    No full text

    Comparison of the time consumed with and without semantic extension on dataset 1 (: S; *-A: without extension; *-B: with extension).

    No full text

    The LDA model.

    No full text

    An example demonstrating the effect of semantic extension at the document level.

    No full text

    Examples of the first 6 topics discovered by gibbsLDA.

    No full text