73 research outputs found

    i-Razor: A Differentiable Neural Input Razor for Feature Selection and Dimension Search in DNN-Based Recommender Systems

    Input features play a crucial role in DNN-based recommender systems with thousands of categorical and continuous fields from users, items, contexts, and interactions. Noisy features and inappropriate embedding dimension assignments can deteriorate the performance of recommender systems and introduce unnecessary complexity in model training and online serving. Optimizing the input configuration of DNN models, including feature selection and embedding dimension assignment, has become one of the essential topics in feature engineering. However, in existing industrial practices, feature selection and dimension search are optimized sequentially, i.e., feature selection is performed first, followed by dimension search to determine the optimal dimension size for each selected feature. Such a sequential optimization mechanism increases training costs and risks generating suboptimal input configurations. To address this problem, we propose a differentiable neural input razor (i-Razor) that enables joint optimization of feature selection and dimension search. Concretely, we introduce an end-to-end differentiable model to learn the relative importance of different embedding regions of each feature. Furthermore, a flexible pruning algorithm is proposed to achieve feature filtering and dimension derivation simultaneously. Extensive experiments on two large-scale public datasets in the Click-Through-Rate (CTR) prediction task demonstrate the efficacy and superiority of i-Razor in balancing model complexity and performance. Comment: Accepted by IEEE Transactions on Knowledge and Data Engineering (TKDE).

    FLOD: Oblivious Defender for Private Byzantine-Robust Federated Learning with Dishonest-Majority

    Privacy and Byzantine-robustness are two major concerns of federated learning (FL), but mitigating both threats simultaneously is highly challenging: privacy-preserving strategies prohibit access to individual model updates to avoid leakage, while Byzantine-robust methods require access for comprehensive mathematical analysis. Besides, most Byzantine-robust methods only work in the honest-majority setting. We present FLOD, a novel oblivious defender for private Byzantine-robust FL in the dishonest-majority setting. Basically, we propose a novel Hamming distance-based aggregation method to resist >1/2 Byzantine attacks using a small root-dataset and server-model for bootstrapping trust. Furthermore, we employ two non-colluding servers and use additive homomorphic encryption (AHE) and secure two-party computation (2PC) primitives to construct efficient privacy-preserving building blocks for secure aggregation, in which we propose two novel in-depth variants of Beaver Multiplication triples (MT) to reduce the overhead of Bit to Arithmetic (Bit2A) conversion and vector weighted sum aggregation (VSWA) significantly. Experiments on real-world and synthetic datasets demonstrate our effectiveness and efficiency: (i) FLOD defeats known Byzantine attacks with a negligible effect on accuracy and convergence, (ii) achieves a reduction of ≈2× for offline (resp. online) overhead of Bit2A and VSWA compared to ABY-AHE (resp. ABY-MT) based methods (NDSS'15), and (iii) reduces total online communication and run-time by 167-1416× and 3.1-7.4× compared to FLGUARD (Crypto Eprint 2021/025).
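The abstract above builds on Beaver multiplication triples between two non-colluding servers. As a rough illustration only, the following sketch shows the textbook Beaver-triple multiplication over additive secret shares, not the two variants that FLOD proposes. The ring size `MOD`, the trusted-dealer `make_triple`, and all function names are assumptions made for this example.

```python
import secrets

MOD = 2 ** 32  # assumed ring Z_{2^32}; the abstract does not state a ring size


def share(x):
    """Split x into two additive shares that sum to x modulo MOD."""
    r = secrets.randbelow(MOD)
    return r, (x - r) % MOD


def reconstruct(s0, s1):
    """Recombine two additive shares."""
    return (s0 + s1) % MOD


def make_triple():
    """Hypothetical trusted dealer standing in for the offline phase.

    Returns shares of random a, b and of c = a * b.
    """
    a = secrets.randbelow(MOD)
    b = secrets.randbelow(MOD)
    return share(a), share(b), share(a * b % MOD)


def beaver_multiply(x_sh, y_sh, triple):
    """Return shares of x * y given shares of x, y and one triple."""
    a_sh, b_sh, c_sh = triple
    # Each server masks its shares locally; the masked values d = x - a
    # and e = y - b are then opened. They reveal nothing about x or y
    # because a and b are uniformly random.
    d = reconstruct((x_sh[0] - a_sh[0]) % MOD, (x_sh[1] - a_sh[1]) % MOD)
    e = reconstruct((y_sh[0] - b_sh[0]) % MOD, (y_sh[1] - b_sh[1]) % MOD)
    # x*y = c + d*b + e*a + d*e; only server 0 adds the public d*e term.
    z0 = (c_sh[0] + d * b_sh[0] + e * a_sh[0] + d * e) % MOD
    z1 = (c_sh[1] + d * b_sh[1] + e * a_sh[1]) % MOD
    return z0, z1


product_shares = beaver_multiply(share(6), share(7), make_triple())
print(reconstruct(*product_shares))
```

Each triple can be used for one multiplication only; reusing it would leak information about the inputs, which is why generating triples dominates the offline cost that the abstract reports reducing.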

    Prediction and elucidation of the population dynamics of Microcystis spp. in Lake Dianchi (China) by means of artificial neural networks

    Lake Dianchi is a shallow and turbid lake located in Southwest China. Since 1985, Lake Dianchi has experienced severe cyanobacterial blooms (dominated by Microcystis spp.). In extreme cases, the algal cell densities have exceeded three billion cells per liter. To predict and elucidate the population dynamics of Microcystis spp. in Lake Dianchi, a neural network based model was developed. The correlation coefficient (R²) between the algal concentrations predicted by the model and the observed values was 0.911. Sensitivity analysis was performed to clarify the response of the algal dynamics to changes in environmental factors. The results of the sensitivity analysis suggested that small increases in pH could cause significantly reduced algal abundance. Further investigation of the raw data showed that the response of Microcystis spp. concentration to a pH increase depended on algal biomass and pH level. When the Microcystis spp. population and pH were moderate or low, the response of the population was more likely to be positive; conversely, the population was more likely to respond negatively to a pH increase when the population and pH were high. The paper concluded that the extremely high concentration of the algal population and high pH could explain the distinctive response of the Microcystis spp. population to a +1 SD (standard deviation) pH increase in Lake Dianchi. The paper also elucidated the response of the algal dynamics to changes in other environmental factors. A one-SD increase in water temperature (WT) had the strongest positive relationship with Microcystis spp. biomass. Chemical oxygen demand (COD) and total phosphorus (TP) had strong positive effects on Microcystis spp. abundance, while total nitrogen (TN), biological oxygen demand in five days (BOD5), and dissolved oxygen had only weak relationships with Microcystis spp. concentration. Transparency (Tr) had a moderate positive relationship with Microcystis spp. concentration.

    Evaluating the Guiding Role of Elevated Pretreatment Serum Carcinoembryonic Antigen Levels for Adjuvant Chemotherapy in Stage IIA Colon Cancer: A Large Population-Based and Propensity Score-Matched Study

    Objective: This study investigated the guiding role of elevated pretreatment serum carcinoembryonic antigen (CEA) levels for adjuvant chemotherapy (ACT) receipt in stage IIA colon cancer. Methods: Eligible patients diagnosed with stage IIA colon cancer (N = 21848) were identified from the Surveillance, Epidemiology, and End Results (SEER) database between January 2004 and December 2010. Pearson's chi-squared tests, Cox proportional hazards regression models, and Kaplan-Meier methods were used. Propensity score matching (PSM) was used to decrease the risk of biased estimates of treatment effect. Results: Multivariate Cox analysis indicated that, in the CEA-elevated group, receiving or not receiving ACT did not produce a statistically significant CSS difference [hazard ratio (HR) = 0.940, 95% confidence interval (CI) = 0.804–1.097, P = 0.431]; in the CEA-normal group, receiving or not receiving ACT also did not produce a statistically significant CSS difference (HR = 0.911, 95% CI = 0.779–1.064, P = 0.239). After PSM, Kaplan-Meier analyses showed no statistically significant CSS difference between receiving and not receiving ACT (P = 0.64). Conclusion: ACT did not show substantial survival benefit in stage IIA colon cancer with elevated pretreatment serum CEA levels. Stage IIA disease with elevated pretreatment serum CEA should not be treated with ACT.

    MiR-592 Promotes Gastric Cancer Proliferation, Migration, and Invasion Through the PI3K/AKT and MAPK/ERK Signaling Pathways by Targeting Spry2

    Background/Aims: Gastric cancer (GC) is one of the most prevalent digestive malignancies. MicroRNAs (miRNAs) are involved in multiple cellular processes, including oncogenesis, and miR-592 itself participates in many malignancies; however, its role in GC remains unknown. In this study, we investigated the expression and molecular mechanisms of miR-592 in GC. Methods: Quantitative real-time PCR and immunohistochemistry were performed to determine the expression of miR-592 and its putative targets in human tissues and cell lines. Proliferation, migration, and invasion were evaluated by Cell Counting Kit-8, population doubling time, colony formation, Transwell, and wound-healing assays in transfected GC cells in vitro. A dual-luciferase reporter assay was used to determine whether miR-592 could directly bind its target. A tumorigenesis assay was used to study whether miR-592 affected GC growth in vivo. Proteins involved in signaling pathways and the epithelial–mesenchymal transition (EMT) were detected with western blot. Results: The ectopic expression of miR-592 promoted GC proliferation, migration, and invasion in vitro and facilitated tumorigenesis in vivo. Spry2 was a direct target of miR-592 and Spry2 overexpression partially counteracted the effects of miR-592. miR-592 induced the EMT and promoted its progression in GC via the PI3K/AKT and MAPK/ERK signaling pathways by inhibiting Spry2. Conclusions: Overexpression of miR-592 promotes GC proliferation, migration, and invasion and induces the EMT via the PI3K/AKT and MAPK/ERK signaling pathways by inhibiting Spry2, suggesting a potential therapeutic target for GC

    Design of High-Reliability Micro Safety and Arming Devices for a Small Caliber Projectile

    With the development of micro technology, fuses for small-caliber projectiles are becoming miniaturized and intelligent, and the traditional fuse no longer meets the requirements. In this paper, we demonstrate a micro safety and arming (S & A) device with small volume and high reliability for small-caliber projectile platforms. The working principle of the S & A device is that a centrifugal insurance mechanism deforms under a centrifugal load and thereby arms the fuse. The centrifugal insurance mechanism is designed theoretically and verified by simulation and experiment. The experimental results show that the fuse is armed safely when the rotary speed exceeds 36,000 rpm. In addition, the experimental, simulation, and theoretical results are basically consistent and indicate that the centrifugal insurance mechanism meets the expected criteria.

    Parallel Machine Scheduling with Batch Delivery to Two Customers

    In some make-to-order supply chains, the manufacturer needs to process and deliver products for customers at different locations. To coordinate production and distribution operations at the detailed scheduling level, we study a parallel machine scheduling model with batch delivery to two customers using a vehicle routing method. In this model, the supply chain consists of a processing facility with m parallel machines and two customers. A set of jobs containing n1 jobs from customer 1 and n2 jobs from customer 2 is first processed in the processing facility and then delivered to the customers directly without intermediate inventory. The problem is to find a joint schedule of production and distribution that optimizes the tradeoff between the maximum arrival time of the jobs and the total distribution cost. The distribution cost of a delivery shipment consists of a fixed charge and a variable cost proportional to the total distance of the route taken by the shipment. We provide polynomial-time heuristics with worst-case performance analysis for the problem. If m=2 and (n1-b)(n2-b)<0, we propose a heuristic with a worst-case ratio bound of 3/2, where b is the capacity of the delivery shipment. Otherwise, the worst-case ratio bound of the heuristic we propose is 2-2/(m+1).
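The two quantitative statements in the abstract above, the shipment cost structure and the case split for the worst-case ratio bound, can be written out as a small sketch. The function and parameter names are hypothetical and chosen for this example; the code only evaluates the stated formulas and does not implement the heuristics themselves.

```python
from fractions import Fraction


def shipment_cost(fixed_charge, cost_per_distance, route_distance):
    """Cost of one delivery shipment: a fixed charge plus a variable
    cost proportional to the total distance of the route taken."""
    return fixed_charge + cost_per_distance * route_distance


def worst_case_ratio_bound(m, n1, n2, b):
    """Worst-case ratio bound quoted in the abstract.

    m: number of parallel machines; n1, n2: jobs from customers 1 and 2;
    b: capacity of a delivery shipment.
    """
    if m < 1 or b < 1:
        raise ValueError("m and b must be positive")
    if m == 2 and (n1 - b) * (n2 - b) < 0:
        # Exactly one customer's job count exceeds the shipment capacity.
        return Fraction(3, 2)
    return 2 - Fraction(2, m + 1)


print(shipment_cost(10, 2, 7))
print(worst_case_ratio_bound(2, 5, 3, 4))
print(worst_case_ratio_bound(4, 5, 3, 4))
```

Using `Fraction` keeps the bounds exact, so values such as 2-2/(m+1) can be compared without floating-point rounding.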

    Integrated Scheduling of Production and Distribution with Release Dates and Capacitated Deliveries

    This paper investigates an integrated production and distribution scheduling model in a supply chain consisting of a single machine, a customer, and a sufficient number of homogeneous capacitated vehicles. In this model, the customer places a set of orders, each of which has a given release date. All orders are first processed nonpreemptively on the machine and then batch delivered to the customer. Two variations of the model with different objective functions are studied: one minimizes the arrival time of the last order plus the total distribution cost, and the other minimizes the total arrival time of the orders plus the total distribution cost. For the former, we provide a polynomial-time exact algorithm. For the latter, because the problem is NP-hard, we provide a heuristic with a worst-case ratio bound of 2.