
    Simple system to measure the Earth's magnetic field

    Our aim in this proposal is to use Faraday's law of induction, in the form of a simple lecture demonstration, to measure the Earth's magnetic field (B). The demonstration also lets students see how electric power is generated from rotational motion. The idea is obviously not original, yet it may be attractive in the sense that no sophisticated devices are used.
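    The abstract does not describe the apparatus, but the physics behind such demonstrations is standard: a coil of N turns and area A rotating at angular frequency ω in a uniform field B produces a peak EMF of N·A·B·ω, so B can be recovered from the measured peak voltage. A minimal sketch, with all parameter values invented for illustration:

```python
import math

# Hypothetical coil parameters -- not from the paper.
N = 200                  # number of turns
A = 0.01                 # coil area in m^2 (10 cm x 10 cm)
omega = 2 * math.pi * 5  # angular frequency for 5 rotations per second

# For a coil rotating in a uniform field B, Faraday's law gives
# EMF(t) = N * A * B * omega * sin(omega * t), so the peak EMF is N*A*B*omega.
peak_emf = 3.1e-3        # measured peak voltage in volts (invented value)

B = peak_emf / (N * A * omega)  # solve the peak-EMF relation for B
print(f"Estimated Earth's magnetic field: {B:.2e} T")  # ~5e-5 T is typical
```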

    Efficiently Clustering Very Large Attributed Graphs

    Attributed graphs model real networks by enriching their nodes with attributes that account for properties. Several techniques have been proposed for partitioning these graphs into clusters that are homogeneous with respect to both the semantic attributes and the structure of the graph. However, the time and space complexities of state-of-the-art algorithms limit their scalability to medium-sized graphs. We propose SToC (for Semantic-Topological Clustering), a fast and scalable algorithm for partitioning large attributed graphs. The approach is robust, being compatible with both categorical and quantitative attributes, and it is tailorable, allowing the user to weight the semantic and topological components. Further, the approach does not require the user to guess the number of clusters in advance. SToC relies on well-known approximation techniques such as bottom-k sketches, traditional graph-theoretic concepts, and a new perspective on the composition of heterogeneous distance measures. Experimental results demonstrate its ability to efficiently compute high-quality partitions of large-scale attributed graphs.
    Comment: This work has been published in ASONAM 2017. This version includes an appendix with validation of our attribute model and distance function, omitted in the conference version for lack of space. Please refer to the published version.
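    To make the bottom-k ingredient concrete: a bottom-k sketch of a set keeps only the k smallest hash values of its elements, and the overlap of two sketches estimates the Jaccard similarity of the underlying sets. This is a generic sketch of the technique, not SToC's actual implementation:

```python
import hashlib

def sketch(items, k):
    """Bottom-k sketch: the k smallest hash values of a set."""
    hashes = {int(hashlib.sha1(str(x).encode()).hexdigest(), 16) for x in items}
    return sorted(hashes)[:k]

def jaccard_estimate(sa, sb, k):
    """Estimate Jaccard similarity of two sets from their bottom-k sketches."""
    union_bottom = sorted(set(sa) | set(sb))[:k]   # k smallest hashes of the union
    both = set(sa) & set(sb)
    return sum(1 for h in union_bottom if h in both) / len(union_bottom)

a = sketch(range(0, 1000), k=64)
b = sketch(range(500, 1500), k=64)
print(jaccard_estimate(a, b, k=64))  # true Jaccard is 500/1500 ~ 0.33
```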

    On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study on Deception Detection

    Humans are the final decision makers in critical tasks that involve ethical and legal concerns, ranging from recidivism prediction, to medical diagnosis, to fighting against fake news. Although machine learning models can sometimes achieve impressive performance in these tasks, these tasks are not amenable to full automation. To realize the potential of machine learning for improving human decisions, it is important to understand how assistance from machine learning models affects human performance and human agency. In this paper, we use deception detection as a testbed and investigate how we can harness explanations and predictions of machine learning models to improve human performance while retaining human agency. We propose a spectrum between full human agency and full automation, and develop varying levels of machine assistance along the spectrum that gradually increase the influence of machine predictions. We find that without showing predicted labels, explanations alone slightly improve human performance in the end task. In comparison, human performance is greatly improved by showing predicted labels (>20% relative improvement) and can be further improved by explicitly suggesting strong machine performance. Interestingly, when predicted labels are shown, explanations of machine predictions induce a similar level of accuracy as an explicit statement of strong machine performance. Our results demonstrate a tradeoff between human performance and human agency and show that explanations of machine predictions can moderate this tradeoff.
    Comment: 17 pages, 19 figures, in Proceedings of ACM FAT* 2019, dataset & demo available at https://deception.machineintheloop.co

    Outlier Edge Detection Using Random Graph Generation Models and Applications

    Outliers are samples generated by mechanisms different from those of normal data samples. Graphs, in particular social network graphs, may contain nodes and edges created by scammers, by malicious programs, or mistakenly by normal users. Detecting outlier nodes and edges is important for data mining and graph analytics. However, previous research in the field has focused mostly on detecting outlier nodes. In this article, we study the properties of edges and propose outlier edge detection algorithms using two random graph generation models. We found that the edge-ego-network, defined as the induced graph that contains the two end nodes of an edge, their neighboring nodes, and the edges that link these nodes, contains critical information for detecting outlier edges. We evaluated the proposed algorithms by injecting outlier edges into real-world graph data. Experimental results show that the proposed algorithms can effectively detect outlier edges. In particular, the algorithm based on the Preferential Attachment random graph generation model consistently gives good performance regardless of the test graph data. Furthermore, the proposed algorithms are not limited to outlier edge detection. We demonstrate three different applications that benefit from them: 1) a preprocessing tool that improves the performance of graph clustering algorithms; 2) an outlier node detection algorithm; and 3) a novel noisy data clustering algorithm. These applications show the great potential of the proposed outlier edge detection techniques.
    Comment: 14 pages, 5 figures, journal paper
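    The edge-ego-network, as defined in the abstract, is straightforward to extract; a minimal sketch with networkx (the scoring of edges against a random-graph model, which the abstract does not spell out, is omitted):

```python
import networkx as nx

def edge_ego_network(G, u, v):
    """Induced subgraph on an edge's two end nodes, their neighbors,
    and all edges among those nodes (the 'edge-ego-network')."""
    nodes = {u, v} | set(G.neighbors(u)) | set(G.neighbors(v))
    return G.subgraph(nodes)

G = nx.karate_club_graph()
ego = edge_ego_network(G, 0, 1)  # (0, 1) is an existing edge
# Intuitively, an outlier edge tends to have a sparse edge-ego-network
# relative to what a random graph model predicts; density is one crude proxy.
print(ego.number_of_nodes(), ego.number_of_edges(), nx.density(ego))
```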

    Reducing Controversy by Connecting Opposing Views


    Seismic risk in the city of Al Hoceima (north of Morocco) using the vulnerability index method, applied in Risk-UE project

    The final publication is available at Springer via http://dx.doi.org/10.1007/s11069-016-2566-8
    Al Hoceima is one of the most seismically active regions in the north of Morocco, as demonstrated by the large seismic episodes reported in seismic catalogs and research studies. Seismic risk is relatively high because many buildings are vulnerable, being either old or not built to seismic standards. Our aim is to present a study of seismic risk and seismic scenarios for the city of Al Hoceima. The seismic vulnerability of the existing residential buildings was evaluated using the vulnerability index method of the Risk-UE project, chosen for its practicality and simple methodology and adapted to Moroccan construction. A visual inspection of 1102 buildings was carried out to assess the vulnerability factors. Seismic hazard was evaluated in terms of macroseismic intensity for two scenarios (one deterministic and one probabilistic). The seismic risk maps represent direct damage to buildings, damage to the population, and economic cost. According to the results, the mean vulnerability index of the city is 0.49 and the seismic risk is estimated as slight (mean damage grade of 0.9 for the deterministic scenario and 0.7 for the probabilistic scenario). However, moderate to heavy damage is expected in areas located in the newer extensions, in both the east and the west of the city. Important economic losses and damage to the population are expected in these areas as well. The maps produced can serve as a guide for decision making in seismic risk prevention and mitigation strategies in Al Hoceima.
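    For context, the Risk-UE macroseismic method relates a vulnerability index V to an expected mean damage grade μD on a 0-5 scale for a given macroseismic intensity I, commonly via μD = 2.5·[1 + tanh((I + 6.25·V − 13.1)/2.3)]. A small sketch using the city-wide index reported above; the scenario intensities are invented for illustration and are not the paper's values:

```python
import math

def mean_damage_grade(intensity, vulnerability_index):
    """Risk-UE macroseismic relation: mean damage grade on a 0-5 scale."""
    return 2.5 * (1 + math.tanh((intensity + 6.25 * vulnerability_index - 13.1) / 2.3))

V = 0.49                   # city-wide mean vulnerability index from the study
for I in (7.0, 8.0, 9.0):  # hypothetical macroseismic intensities (EMS-98)
    print(I, round(mean_damage_grade(I, V), 2))  # 0.33, 0.73, 1.44
```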

    On defining rules for cancer data fabrication

    Funding: This research is partially funded by the Data Lab and the EU H2020 project Serums: Securing Medical Data in Smart Patient-Centric Healthcare Systems (grant 826278).
    Data is essential for machine learning projects, and data accuracy is crucial for being able to trust the results obtained from the associated machine learning models. Previously, we developed machine learning models for predicting the treatment outcome for breast cancer patients who have undergone chemotherapy, and a monitoring system for their treatment timeline that interactively shows the options and associated predictions. Available cancer datasets, such as the one used earlier, are often too small to obtain significant results, making it difficult to explore ways to improve the predictive capability of the models further. In this paper, we explore an alternative: enhancing our datasets through synthetic data generation. From our original dataset, we extract rules to generate fabricated data that capture the different characteristics inherent in the dataset. Additional rules can be used to capture general medical knowledge. We show how to formulate rules for our cancer treatment data, and use the IBM solver to obtain a corresponding synthetic dataset. We discuss challenges for future work.
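    The paper's rule formulation and solver setup are not given in the abstract; as a generic illustration of the idea, fabricated records can be drawn at random and kept only when they satisfy every rule. All field names, ranges, and rules below are invented:

```python
import random

# Hypothetical rules: each constrains fields of a fabricated patient record.
RULES = [
    lambda r: 18 <= r["age"] <= 90,
    lambda r: r["tumour_grade"] in (1, 2, 3),
    lambda r: r["tumour_grade"] < 3 or r["age"] >= 30,  # invented cross-field rule
]

def fabricate(n, seed=0):
    """Rejection-sample random records until n satisfy all rules."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        r = {"age": rng.randint(10, 99), "tumour_grade": rng.randint(1, 3)}
        if all(rule(r) for rule in RULES):
            out.append(r)
    return out

print(fabricate(3))
```

    A constraint solver, as used in the paper, replaces this generate-and-test loop by solving for records that satisfy the rules directly, which scales better when the rules are restrictive.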

    An exposure-effect approach for evaluating ecosystem-wide risks from human activities

    Ecosystem-based management (EBM) is promoted as the solution for sustainable use; an ecosystem-wide assessment methodology is therefore required. In this paper, we present an approach to assess the risk to ecosystem components from human activities common to marine and coastal ecosystems. We build on: (i) a linkage framework that describes how human activities can impact the ecosystem through pressures, and (ii) a qualitative expert-judgement assessment of impact chains describing the exposure and sensitivity of ecological components to those activities. Using case-study examples applied at the European regional sea scale, we evaluate the risk of an adverse ecological impact from current human activities to a suite of ecological components and, once impacted, the time required for recovery to pre-impact conditions should those activities subside. Grouping impact chains by sector, pressure type, or ecological component enabled impact risks and recovery times to be identified, supporting resource managers in their efforts to prioritize threats for management, identify the most at-risk components, and generate time frames for ecosystem recovery.
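    A toy illustration of how such impact chains might be tabulated and grouped (the paper's actual scoring scheme is not described in the abstract; the categories, scores, and exposure-times-sensitivity convention here are invented):

```python
from collections import defaultdict

# Each impact chain links a human activity (sector), via a pressure, to an
# ecological component, with expert-judged exposure and sensitivity scores.
chains = [
    {"sector": "fishing",  "pressure": "abrasion",   "component": "seabed habitat", "exposure": 3, "sensitivity": 2},
    {"sector": "fishing",  "pressure": "extraction", "component": "fish stocks",    "exposure": 3, "sensitivity": 3},
    {"sector": "shipping", "pressure": "noise",      "component": "marine mammals", "exposure": 2, "sensitivity": 2},
]

# One simple convention: risk score = exposure x sensitivity, summed per group.
risk_by_sector = defaultdict(int)
for c in chains:
    risk_by_sector[c["sector"]] += c["exposure"] * c["sensitivity"]

print(dict(risk_by_sector))  # {'fishing': 15, 'shipping': 4}
```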

    A combined ULBP2 and SEMA5A expression signature as a prognostic and predictive biomarker for colon cancer

    Background: Prognostic biomarkers for cancer have the power to change the course of disease if they add value beyond known prognostic factors, if they can help shape treatment protocols, and if they are reliable. The aim of this study was to identify such biomarkers for colon cancer and to understand the molecular mechanisms leading to prognostic stratifications based on these biomarkers. Methods and Findings: We used an in-house R-based script (SSAT) for the in silico discovery of stage-independent prognostic biomarkers using two cohorts, GSE17536 and GSE17537, which include 177 and 55 colon cancer patients, respectively. This identified two genes, ULBP2 and SEMA5A, which when used jointly could distinguish patients with distinct prognoses. We validated our findings ex vivo using a third cohort of 48 patients. We find that in all cohorts, a combined ULBP2/SEMA5A classification (SU-GIB) can stratify distinct prognostic sub-groups, with hazard ratios that range from 2.4 to 4.5 (p=0.01) when overall or cancer-specific survival is used as an end measure, independent of confounding prognostic parameters. In addition, our preliminary analyses suggest SU-GIB is comparable to Oncotype DX colon® in predicting recurrence in two different cohorts (HR: 1.5-2; p=0.02). SU-GIB has potential as a companion diagnostic for several drugs, including the PI3K/mTOR inhibitor BEZ235, that are suitable for the treatment of patients within the poor-prognosis group. We show that tumors from patients with worse prognosis have low EGFR autophosphorylation rates but high caspase-7 activity, and show upregulation of pro-inflammatory cytokines consistent with a relatively mesenchymal phenotype. Conclusions: We describe two novel genes that can be used to prognosticate colon cancer and suggest approaches by which such tumors can be treated. We also describe molecular characteristics of tumors stratified by the SU-GIB signature. © Ivyspring International Publisher
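    The abstract does not specify how the two genes are combined into the SU-GIB classification; a common pattern for this kind of two-gene stratification is to split patients by expression cutoffs and estimate the hazard ratio between the resulting groups. A hypothetical sketch with lifelines, using invented data and an invented high-ULBP2/low-SEMA5A rule:

```python
import pandas as pd
from lifelines import CoxPHFitter

# Invented toy data: expression of the two genes plus survival outcome.
df = pd.DataFrame({
    "ULBP2":  [2.1, 5.3, 1.8, 6.0, 2.5, 5.9, 1.9, 6.2],
    "SEMA5A": [4.0, 1.2, 3.8, 1.0, 4.2, 1.1, 3.9, 0.9],
    "months": [60, 14, 55, 10, 20, 70, 62, 9],
    "event":  [0, 1, 0, 1, 1, 0, 0, 1],   # 1 = death observed
})

# Hypothetical rule: high ULBP2 with low SEMA5A marks the poor-prognosis group.
df["high_risk"] = ((df["ULBP2"] > df["ULBP2"].median()) &
                   (df["SEMA5A"] < df["SEMA5A"].median())).astype(int)

cph = CoxPHFitter()
cph.fit(df[["months", "event", "high_risk"]], duration_col="months", event_col="event")
print(cph.hazard_ratios_)  # hazard ratio for the high-risk group
```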