
    A memory-based method to select the number of relevant components in Principal Component Analysis

    We propose a new data-driven method to select the optimal number of relevant components in Principal Component Analysis (PCA). The method applies to correlation matrices whose time autocorrelation function decays more slowly than an exponential, giving rise to long-memory effects. In comparison with other methods available in the literature, our procedure does not rely on subjective evaluations and is computationally inexpensive. The underlying idea is to use a suitable factor model to analyse the residual memory after sequentially removing more and more components, stopping the process when the maximum amount of memory has been accounted for by the retained components. We validate our methodology on both synthetic and real financial data, and in all cases find a clear and computationally superior answer, entirely compatible with available heuristic criteria such as cumulative variance and cross-validation. Comment: 29 pages, published
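    A minimal sketch of the idea, not the authors' implementation: here the "memory" of the residuals is proxied by the sum of absolute residual autocorrelations over a fixed lag window, and components are removed one at a time until the residual memory stops decreasing. Function names, the lag window, and the stopping tolerance are illustrative assumptions.

```python
# Toy illustration (not the paper's code): choose the number of PCA components
# by tracking how much autocorrelation ("memory") remains in the residuals
# after removing the leading components one at a time.
import numpy as np

def residual_memory(X, n_components, max_lag=50):
    """Crude memory proxy: mean over series of summed |autocorrelation| of the
    residuals up to max_lag. X is a T x N data matrix with T > max_lag."""
    X = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X, full_matrices=False)          # PCA via SVD
    approx = U[:, :n_components] * S[:n_components] @ Vt[:n_components]
    R = X - approx                                             # residuals
    mem = 0.0
    for col in R.T:
        col = (col - col.mean()) / (col.std() + 1e-12)
        acf = [np.mean(col[:-k] * col[k:]) for k in range(1, max_lag + 1)]
        mem += np.sum(np.abs(acf))
    return mem / R.shape[1]

def select_n_components(X, k_max=20, tol=1e-3):
    """Stop when removing one more component no longer reduces residual memory."""
    mems = [residual_memory(X, k) for k in range(k_max + 1)]
    for k in range(1, k_max + 1):
        if mems[k - 1] - mems[k] < tol:
            return k - 1
    return k_max
```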

    A cluster driven log-volatility factor model: a deepening on the source of the volatility clustering

    We introduce a new factor model for log volatilities that performs dimensionality reduction and considers contributions globally, through the market, and locally, through the cluster structure and its interactions. We do not assume the number of clusters in the data a priori; instead, we use the Directed Bubble Hierarchical Tree (DBHT) algorithm to fix the number of factors. We use the factor model and a new integrated non-parametric proxy to study how volatilities contribute to volatility clustering. Globally, only the market contributes to volatility clustering. Locally, for some clusters, the cluster itself contributes statistically to volatility clustering. This is a significant advantage over other factor models, since the factors can be chosen statistically while also keeping economically relevant factors. Finally, we show that the log-volatility factor model explains a similar amount of memory to a Principal Component Analysis (PCA) factor model and an exploratory factor model.
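    A rough sketch of a market-plus-cluster factor structure for log volatilities, assuming ordinary hierarchical clustering as a stand-in for DBHT (which is what the authors actually use) and a fixed number of clusters purely for illustration.

```python
# Toy sketch (not the authors' model): a global market factor plus local
# cluster factors for log volatilities. DBHT is replaced by average-linkage
# hierarchical clustering on a correlation distance, for illustration only.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_log_vol_factors(log_vol, n_clusters=5):
    """log_vol: T x N array of log volatilities.
    Returns the market factor, cluster labels, and per-cluster local factors."""
    market = log_vol.mean(axis=1)                              # global (market) factor
    corr = np.corrcoef(log_vol, rowvar=False)                  # N x N correlation
    dist = np.sqrt(np.clip(2.0 * (1.0 - corr), 0.0, None))     # correlation distance
    condensed = dist[np.triu_indices_from(dist, k=1)]
    labels = fcluster(linkage(condensed, method="average"),
                      t=n_clusters, criterion="maxclust")
    cluster_factors = {}
    for c in np.unique(labels):
        members = np.where(labels == c)[0]
        # local factor: mean log volatility of the cluster, market removed
        cluster_factors[c] = log_vol[:, members].mean(axis=1) - market
    return market, labels, cluster_factors
```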

    Environmental Disclosures and Size of Selected Indian Firms

    Business responsibility is an easily professed but hard-to-assume construct in the sustainability literature. Out of the nine principles of Business Responsibility Reporting (BRR), the sixth principle addresses the environmental concerns of businesses. The objective of this study is to explain the response of corporate entities towards Environmental Concerns (EC). The environmental concern of an organization has been gauged through the environmental disclosures made by firms under the sixth principle of BRR. The general lack of emphasis on environmental disclosures remains a key challenge in encouraging Indian corporate houses to develop and adopt clean technologies, energy efficiency, and renewable energy initiatives. The role of clean technologies/environmental technologies is pivotal in ensuring adequate environmental disclosures. But the moot point is whether firms of a certain size disclose more on EC. There is plenty of literature supporting the relationship between size and environmental disclosure, but an organization cannot be green merely by appearing green through disclosures; it becomes green through its clean technology and energy initiatives. There has been a major shift in the sustainability literature towards prevention rather than damage followed by cure, and clean energy initiatives are the first steps towards preventing or minimizing environmental damage. Therefore, the next important question is what explains the variation in clean energy initiatives in an organization: is it the size of the firm or regulation which leads to disclosing environmental concern (EC)? The relationship between firm size and environmental disclosures related to EC has been found to be significant by applying a t-test to the selected sample of 40 companies, while the variation in clean technology initiatives in the same sample has been measured using binary logistic regression. Out of the two independent variables, i.e. size and environmental concern, it is established that it is regulation, rather than size, which significantly pushes companies towards clean technologies and energy initiatives.
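    An illustrative sketch of the two statistical tools the abstract mentions, a two-sample t-test and a binary logistic regression, on synthetic placeholder data (column names and values are assumptions made only so the sketch runs; this is not the study's dataset or code).

```python
# Sketch: t-test of disclosure scores by firm size, and a logistic regression
# of clean-technology initiatives on size and a regulation dummy.
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(0)
# Hypothetical dataframe, one row per firm (40 firms as in the study's sample).
df = pd.DataFrame({
    "disclosure_score": rng.random(40) * 10,        # EC disclosure score
    "log_assets": rng.normal(8, 1, 40),             # size proxy
    "regulated": rng.integers(0, 2, 40),            # regulation dummy (0/1)
    "clean_tech": rng.integers(0, 2, 40),           # clean-tech initiative (0/1)
})

# t-test: do larger (above-median size) firms disclose more on EC?
large = df.loc[df.log_assets >= df.log_assets.median(), "disclosure_score"]
small = df.loc[df.log_assets < df.log_assets.median(), "disclosure_score"]
t_stat, p_val = stats.ttest_ind(large, small, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_val:.3f}")

# Binary logistic regression: clean_tech ~ size + regulation
X = sm.add_constant(df[["log_assets", "regulated"]])
logit_res = sm.Logit(df["clean_tech"], X).fit(disp=0)
print(logit_res.summary())
```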

    A new set of cluster driven composite development indicators

    Composite development indicators used in policy making often subjectively aggregate a restricted set of indicators. We show, using dimensionality reduction techniques, including Principal Component Analysis (PCA) and, for the first time, information filtering and hierarchical clustering, that these composite indicators miss key information on the relationships between different indicators. In particular, the grouping of indicators by topic is not reflected in the data at either a global or a local level. We overcome these issues by using the clustering of indicators to build a new set of cluster-driven composite development indicators that are objective, data-driven, comparable between countries, and retain interpretability. We discuss their consequences for informing policy makers about country development, comparing them with the top PageRank indicators as a benchmark. Finally, we demonstrate that our new set of composite development indicators outperforms the benchmark on a dataset reconstruction task. Comment: Accepted in EPJ Data Science
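    A minimal sketch of the general idea, not the authors' pipeline: group indicators by correlation-based hierarchical clustering, then build one composite per cluster by averaging the standardized member indicators. The clustering method, number of clusters, and aggregation rule here are assumptions for illustration.

```python
# Sketch: cluster-driven composites from a countries x indicators matrix.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def cluster_composites(X, n_clusters=6):
    """X: countries x indicators matrix (no missing values).
    Returns composites (countries x clusters) and the indicator cluster labels."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)                   # standardize indicators
    corr = np.corrcoef(Z, rowvar=False)
    dist = squareform(np.sqrt(np.clip(2.0 * (1.0 - corr), 0.0, None)), checks=False)
    labels = fcluster(linkage(dist, method="average"),
                      t=n_clusters, criterion="maxclust")
    composites = np.column_stack([
        Z[:, labels == c].mean(axis=1) for c in np.unique(labels)
    ])
    return composites, labels
```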

    Estimating Time to Clear Pendency of Cases in High Courts in India using Linear Regression

    The Indian judiciary suffers from a burden of millions of cases lying pending in its courts at all levels. The High Court National Judicial Data Grid (HC-NJDG) indexes all the cases pending in the high courts and publishes the data publicly. In this paper, we analyze data collected from the HC-NJDG portal on 229 randomly chosen days between August 31, 2017 and March 22, 2020, inclusive of these dates. Thus, the data analyzed in the paper spans a period of more than two and a half years. We show that: 1) the number of pending cases in most of the high courts is increasing linearly with time; 2) the case load on judges in various high courts is very unevenly distributed, making judges of some high courts a hundred times more loaded than others; 3) for some high courts it may take even a hundred years to clear the pendency if proper measures are not taken. We also suggest some policy changes that may help clear the pendency within a fixed time of either five or fifteen years. Finally, we find that the rate of institution of cases in the high courts can easily be handled by the current sanctioned strength; extra judges are needed only to clear the earlier backlogs. Comment: 12 pages, 9 figures, JURISIN 2022. arXiv admin note: text overlap with arXiv:2307.1061
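    A rough sketch of the kind of linear extrapolation described, not the paper's code: fit a line to pending-case counts over time for one court and estimate the time to clear the backlog at a hypothetical net disposal rate. All numbers below are made up solely to make the sketch runnable.

```python
# Sketch: linear fit of pendency vs. time, then a back-of-the-envelope
# clearance-time estimate under an assumed net disposal rate.
import numpy as np

days = np.array([0, 100, 250, 400, 600, 800, 930])          # days since start of data
pending = np.array([4.10e6, 4.14e6, 4.19e6, 4.25e6,
                    4.32e6, 4.40e6, 4.45e6])                 # made-up pendency counts

slope, intercept = np.polyfit(days, pending, 1)              # net cases added per day
print(f"net growth: {slope:.0f} cases/day")

# Hypothetical scenario: reforms achieve a net disposal rate of 1,500 cases/day.
net_disposal_per_day = 1500.0
years_to_clear = pending[-1] / (net_disposal_per_day * 365.0)
print(f"years to clear current backlog: {years_to_clear:.1f}")
```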

    Measurement of the top quark forward-backward production asymmetry and the anomalous chromoelectric and chromomagnetic moments in pp collisions at √s = 13 TeV

    The parton-level top quark (t) forward-backward asymmetry and the anomalous chromoelectric (d̂_t) and chromomagnetic (μ̂_t) moments have been measured using LHC pp collisions at a center-of-mass energy of 13 TeV, collected in the CMS detector in a data sample corresponding to an integrated luminosity of 35.9 fb−1. The linearized variable AFB(1) is used to approximate the asymmetry. Candidate tt̄ events decaying to a muon or electron and jets in final states with low and high Lorentz boosts are selected and reconstructed using a fit of the kinematic distributions of the decay products to those expected for tt̄ final states. The values found for the parameters are AFB(1) = 0.048 +0.095/−0.087 (stat) +0.020/−0.029 (syst) and μ̂_t = −0.024 +0.013/−0.009 (stat) +0.016/−0.011 (syst), and a limit is placed on the magnitude of the chromoelectric moment, |d̂_t| < 0.03 at 95% confidence level.
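    For context, a forward-backward asymmetry of this kind is conventionally defined as a normalized difference of forward and backward event counts; the generic definition is sketched below (the paper's specific linearized variable AFB(1) and its construction are not reproduced here).

```latex
% Generic forward-backward asymmetry (not the paper's linearized variable),
% in terms of forward (N_F) and backward (N_B) event counts:
\[
  A_{\mathrm{FB}} \;=\; \frac{N_F - N_B}{N_F + N_B},
\]
% where, for $t\bar{t}$ production, "forward" and "backward" refer to the sign
% of the top-quark production angle with respect to a chosen reference axis.
```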

    Measurement of tt̄ normalised multi-differential cross sections in pp collisions at √s = 13 TeV, and simultaneous determination of the strong coupling strength, top quark pole mass, and parton distribution functions

    Peer reviewed

    An embedding technique to determine ττ backgrounds in proton-proton collision data

    An embedding technique is presented to estimate standard model ττ backgrounds from data with minimal simulation input. In the data, the muons are removed from reconstructed μμ events and replaced with simulated tau leptons with the same kinematic properties. In this way, a set of hybrid events is obtained that does not rely on simulation except for the decay of the tau leptons. The challenges in describing the underlying event or the production of associated jets in the simulation are thereby avoided. The technique described in this paper was developed for CMS. Its validation and the inherent uncertainties are also discussed. The demonstration of the performance of the technique is based on a sample of proton-proton collisions collected by CMS in 2017 at √s = 13 TeV, corresponding to an integrated luminosity of 41.5 fb−1. Peer reviewed

    MUSiC: a model-unspecific search for new physics in proton-proton collisions at √s = 13 TeV

    Results of the Model Unspecific Search in CMS (MUSiC), using proton-proton collision data recorded at the LHC at a centre-of-mass energy of 13 TeV and corresponding to an integrated luminosity of 35.9 fb−1, are presented. The MUSiC analysis searches for anomalies that could be signatures of physics beyond the standard model. The analysis is based on the comparison of observed data with the standard model prediction, as determined from simulation, in several hundred final states and multiple kinematic distributions. Events containing at least one electron or muon are classified based on their final state topology, and an automated search algorithm surveys the observed data for deviations from the prediction. The sensitivity of the search is validated using multiple methods. No significant deviations from the predictions have been observed. For a wide range of final state topologies, agreement is found between the data and the standard model simulation. This analysis complements dedicated search analyses by significantly expanding the range of final states covered, using a model-independent approach with the largest data set to date to probe phase space regions beyond the reach of previous general searches. Peer reviewed
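    A toy illustration of the basic idea only, not the MUSiC algorithm itself (which scans kinematic distributions and corrects for the look-elsewhere effect): compare the observed count in each event class with the standard model expectation and compute a local Poisson p-value. The event classes and yields below are invented placeholders.

```python
# Sketch: flag event classes whose observed yield deviates from the SM
# expectation, using a simple local Poisson p-value per class.
from scipy import stats

# Hypothetical event classes: (expected SM yield, observed yield)
regions = {
    "1e + 2jets": (1250.0, 1297),
    "1mu + MET":  (842.0, 790),
    "2e + 1jet":  (96.0, 131),
}

def poisson_p_value(expected, observed):
    """Local p-value for an excess (P(N >= observed)) or deficit (P(N <= observed))."""
    if observed >= expected:
        return stats.poisson.sf(observed - 1, expected)
    return stats.poisson.cdf(observed, expected)

for name, (exp, obs) in regions.items():
    p = poisson_p_value(exp, obs)
    print(f"{name:12s} expected={exp:7.1f} observed={obs:5d} p={p:.3g}")
```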