8 research outputs found

    To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis

    Recent research has highlighted the importance of dataset size in scaling language models. However, large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs. To further enhance LLMs, a straightforward approach is to repeat the pre-training data for additional epochs. In this study, we empirically investigate three key aspects under this approach. First, we explore the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting, leading to multi-epoch degradation. Second, we examine the key factors contributing to multi-epoch degradation, finding that significant factors include dataset size, model parameters, and training objectives, while less influential factors consist of dataset quality and model FLOPs. Finally, we explore whether widely used regularization can alleviate multi-epoch degradation. Most regularization techniques do not yield significant improvements, except for dropout, which demonstrates remarkable effectiveness but requires careful tuning when scaling up the model size. Additionally, we discover that leveraging mixture-of-experts (MoE) enables cost-effective and efficient hyper-parameter tuning for computationally intensive dense LLMs with comparable trainable parameters, potentially impacting efficient LLM development on a broader scale. Comment: Accepted at NeurIPS 202

    The Deterrence Effect Of Whistleblowing On Peer Firms' Financial Reporting Decisions

    I examine the impact of whistleblowing allegations on the financial reporting behavior of peer firms. This study provides several important findings. First, industry peers exhibit decreases in accrual-based earnings management, real earnings management, and the likelihood of financial misreporting in the two years following a whistleblowing allegation. Second, following a whistleblowing allegation, peer firms exhibit greater conditional conservatism. Third, peer firms increase earnings transparency to respond to whistleblowing allegations with the intention of improving financial reporting quality. I find results are strongest when the whistleblowing event relates specifically to accounting fraud. Whistleblowing allegations at large firms have greater impact on peer firms' post-whistleblowing financial reporting. Prior literature shows that whistleblowing allegations are associated with changes in the financial reporting of whistleblowing target firms; this paper provides evidence that whistleblowing allegations also impact peer firms' financial reporting.

    Hedge fund activism and internal control weaknesses

    The aim of the paper is to investigate the associations between hedge fund activism and corporate internal control weaknesses. In this paper, the authors identify hedge fund activism events using 13D filings and news searches. After matching with internal control related information from Audit Analytics, the authors use ordinary least squares (OLS) and propensity score matching (PSM) to analyze the data. The authors find that after hedge fund activism, target firms report additional internal control weaknesses, and these identified internal control weaknesses are remediated in subsequent years, leading to better financial-reporting quality. The findings indicate that both managers and activists have incentives to develop a stronger internal control environment after targeting.

    Fuzzy Chaos Control of Fractional Order D-PMSG for Wind Turbine with Uncertain Parameters by State Feedback Design

    To study the chaotic motion of the direct-drive permanent magnet synchronous generator (D-PMSG) for a wind turbine with uncertain parameters and fractional order characteristics, a control strategy based on fuzzy state feedback is proposed. Firstly, according to the working mechanism of the D-PMSG, the Lorenz nonlinear mathematical model is established by affine transformation and time transformation. Secondly, fractional order nonlinear systems (FONSs) are transformed into linear sub-models by the Takagi–Sugeno (T-S) fuzzy model. Then, the fuzzy state feedback controller is designed through the Parallel Distributed Compensation (PDC) control principle to suppress the chaotic motion. By applying the fractional Lyapunov stability theory (FLST), the sufficient conditions for Mittag–Leffler stability are formulated as linear matrix inequalities (LMIs). Finally, the control performance and effectiveness of the proposed controller are demonstrated through numerical simulations, and the chaotic motions in the D-PMSG can be eliminated quickly.

    An Application of Judgement Modeling to Examine Inter-Cultural Differences Regarding Perceptions of Business Skill Importance

    With increased global interaction, cultural awareness among stakeholders is crucial, especially for companies seeking growth in the international environment. This study focuses on comparing the perceptions of business skill importance between student subjects from China/Hong Kong (CHK) and the United States (US). The results show that the six cues representing the business skills/attributes strongly influenced student perceptions of job offer likelihood and that the relative importance of these cues was not equal, with Interpersonal Effectiveness (INPER), Internship Experience (INT), and Ethical Awareness (ETH) having a higher impact than Communication (COMM), Cultural Intelligence (CULT), and Critical Thinking (CRIT). The results indicated that INPER, INT, and ETH were associated with similar, substantial effects, while COMM, CULT, and CRIT exhibited smaller and comparable effects. The analysis revealed significant interactions between the country and two cues, Interpersonal Effectiveness (INPER) and Ethical Awareness (ETH). Chinese students perceived INPER to be somewhat more important than U.S. students did, possibly influenced by cultural dimensions such as the emphasis on interpersonal relationships in Chinese culture. Conversely, U.S. students regarded ETH as more crucial than their Chinese counterparts did, aligning with findings that suggest cultural variations, particularly in power distance and collectivism, may influence ethical values. The findings enhance our understanding of the relative importance of business skills in different cultural contexts and provide insights for educational institutions and employers in preparing students for the global business environment. The study contributes to the existing literature by providing direct comparisons of student perceptions across cultures and employing a rigorous judgment modeling methodology.
