43 research outputs found

    SELFIES and the future of molecular string representations

    Artificial intelligence (AI) and machine learning (ML) are expanding in popularity for broad applications to challenging tasks in chemistry and materials science. Examples include the prediction of properties, the discovery of new reaction pathways, or the design of new molecules. The machine needs to read and write fluently in a chemical language for each of these tasks. Strings are a common tool to represent molecular graphs, and the most popular molecular string representation, SMILES, has powered cheminformatics since the late 1980s. However, in the context of AI and ML in chemistry, SMILES has several shortcomings; most pertinently, most combinations of symbols lead to invalid results with no valid chemical interpretation. To overcome this issue, a new language for molecules was introduced in 2020 that guarantees 100% robustness: SELFIES (SELF-referencIng Embedded Strings). SELFIES has since simplified and enabled numerous new applications in chemistry. In this manuscript, we look to the future and discuss molecular string representations, along with their respective opportunities and challenges. We propose 16 concrete Future Projects for robust molecular representations. These involve the extension toward new chemical domains, exciting questions at the interface of AI and robust languages, and interpretability for both humans and machines. We hope that these proposals will inspire several follow-up works exploiting the full potential of molecular string representations for the future of AI in chemistry and materials science. Comment: 34 pages, 15 figures, comments and suggestions for additional references are welcome.
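The fragility of raw SMILES strings mentioned in the abstract can be illustrated with a deliberately simplified sketch. It checks only two syntactic rules: branch parentheses must balance, and single-digit ring-closure labels must occur in pairs. The function name `passes_toy_smiles_checks` is hypothetical, and this is not a SMILES parser; real validity also depends on valence, aromaticity, bracket atoms, and more.

```python
def passes_toy_smiles_checks(s: str) -> bool:
    """Toy check (assumption: only two rules, not a full SMILES grammar).

    Rule 1: every '(' must be closed by a later ')'.
    Rule 2: every single-digit ring-closure label must appear in pairs.
    """
    depth = 0            # current branch nesting depth
    open_rings = set()   # ring-closure digits opened but not yet closed
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:        # ')' without a matching '('
                return False
        elif ch.isdigit():
            # First occurrence opens the ring bond, second closes it.
            if ch in open_rings:
                open_rings.remove(ch)
            else:
                open_rings.add(ch)
    return depth == 0 and not open_rings


# A few hand-picked strings: two well-formed, two broken by a small edit.
for candidate in ["c1ccccc1", "CC(C)O", "C1CC(", "CC)C"]:
    print(candidate, passes_toy_smiles_checks(candidate))
```

Even under these two rules alone, a one-character change such as an unclosed ring label or a stray parenthesis makes the string unusable. A representation in which every symbol sequence decodes to a molecule avoids this class of failure by construction.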

    SELFIES and the future of molecular string representations

    Artificial intelligence (AI) and machine learning (ML) are expanding in popularity for broad applications to challenging tasks in chemistry and materials science. Examples include the prediction of properties, the discovery of new reaction pathways, or the design of new molecules. The machine needs to read and write fluently in a chemical language for each of these tasks. Strings are a common tool to represent molecular graphs, and the most popular molecular string representation, SMILES, has powered cheminformatics since the late 1980s. However, in the context of AI and ML in chemistry, SMILES has several shortcomings; most pertinently, most combinations of symbols lead to invalid results with no valid chemical interpretation. To overcome this issue, a new language for molecules was introduced in 2020 that guarantees 100% robustness: SELF-referencIng Embedded Strings (SELFIES). SELFIES has since simplified and enabled numerous new applications in chemistry. In this perspective, we look to the future and discuss molecular string representations, along with their respective opportunities and challenges. We propose 16 concrete future projects for robust molecular representations. These involve the extension toward new chemical domains, exciting questions at the interface of AI and robust languages, and interpretability for both humans and machines. We hope that these proposals will inspire several follow-up works exploiting the full potential of molecular string representations for the future of AI in chemistry and materials science.

    Safety and efficacy of the ChAdOx1 nCoV-19 vaccine (AZD1222) against SARS-CoV-2: an interim analysis of four randomised controlled trials in Brazil, South Africa, and the UK

    Background: A safe and efficacious vaccine against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), if deployed with high coverage, could contribute to the control of the COVID-19 pandemic. We evaluated the safety and efficacy of the ChAdOx1 nCoV-19 vaccine in a pooled interim analysis of four trials. Methods: This analysis includes data from four ongoing blinded, randomised, controlled trials done across the UK, Brazil, and South Africa. Participants aged 18 years and older were randomly assigned (1:1) to ChAdOx1 nCoV-19 vaccine or control (meningococcal group A, C, W, and Y conjugate vaccine or saline). Participants in the ChAdOx1 nCoV-19 group received two doses containing 5 × 10^10 viral particles (standard dose; SD/SD cohort); a subset in the UK trial received a half dose as their first dose (low dose) and a standard dose as their second dose (LD/SD cohort). The primary efficacy analysis included symptomatic COVID-19 in seronegative participants with a nucleic acid amplification test-positive swab more than 14 days after a second dose of vaccine. Participants were analysed according to treatment received, with data cutoff on Nov 4, 2020. Vaccine efficacy was calculated as 1 - relative risk derived from a robust Poisson regression model adjusted for age. Studies are registered at ISRCTN89951424 and ClinicalTrials.gov, NCT04324606, NCT04400838, and NCT04444674. Findings: Between April 23 and Nov 4, 2020, 23 848 participants were enrolled and 11 636 participants (7548 in the UK, 4088 in Brazil) were included in the interim primary efficacy analysis. In participants who received two standard doses, vaccine efficacy was 62·1% (95% CI 41·0–75·7; 27 [0·6%] of 4440 in the ChAdOx1 nCoV-19 group vs 71 [1·6%] of 4455 in the control group) and in participants who received a low dose followed by a standard dose, efficacy was 90·0% (67·4–97·0; three [0·2%] of 1367 vs 30 [2·2%] of 1374; p_interaction=0·010). Overall vaccine efficacy across both groups was 70·4% (95·8% CI 54·8–80·6; 30 [0·5%] of 5807 vs 101 [1·7%] of 5829). From 21 days after the first dose, there were ten cases hospitalised for COVID-19, all in the control arm; two were classified as severe COVID-19, including one death. There were 74 341 person-months of safety follow-up (median 3·4 months, IQR 1·3–4·8): 175 severe adverse events occurred in 168 participants, 84 events in the ChAdOx1 nCoV-19 group and 91 in the control group. Three events were classified as possibly related to a vaccine: one in the ChAdOx1 nCoV-19 group, one in the control group, and one in a participant who remains masked to group allocation. Interpretation: ChAdOx1 nCoV-19 has an acceptable safety profile and has been found to be efficacious against symptomatic COVID-19 in this interim analysis of ongoing clinical trials.

    Safety and efficacy of the ChAdOx1 nCoV-19 vaccine (AZD1222) against SARS-CoV-2: an interim analysis of four randomised controlled trials in Brazil, South Africa, and the UK.

    BACKGROUND: A safe and efficacious vaccine against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), if deployed with high coverage, could contribute to the control of the COVID-19 pandemic. We evaluated the safety and efficacy of the ChAdOx1 nCoV-19 vaccine in a pooled interim analysis of four trials. METHODS: This analysis includes data from four ongoing blinded, randomised, controlled trials done across the UK, Brazil, and South Africa. Participants aged 18 years and older were randomly assigned (1:1) to ChAdOx1 nCoV-19 vaccine or control (meningococcal group A, C, W, and Y conjugate vaccine or saline). Participants in the ChAdOx1 nCoV-19 group received two doses containing 5 × 10^10 viral particles (standard dose; SD/SD cohort); a subset in the UK trial received a half dose as their first dose (low dose) and a standard dose as their second dose (LD/SD cohort). The primary efficacy analysis included symptomatic COVID-19 in seronegative participants with a nucleic acid amplification test-positive swab more than 14 days after a second dose of vaccine. Participants were analysed according to treatment received, with data cutoff on Nov 4, 2020. Vaccine efficacy was calculated as 1 - relative risk derived from a robust Poisson regression model adjusted for age. Studies are registered at ISRCTN89951424 and ClinicalTrials.gov, NCT04324606, NCT04400838, and NCT04444674. FINDINGS: Between April 23 and Nov 4, 2020, 23 848 participants were enrolled and 11 636 participants (7548 in the UK, 4088 in Brazil) were included in the interim primary efficacy analysis. In participants who received two standard doses, vaccine efficacy was 62·1% (95% CI 41·0-75·7; 27 [0·6%] of 4440 in the ChAdOx1 nCoV-19 group vs 71 [1·6%] of 4455 in the control group) and in participants who received a low dose followed by a standard dose, efficacy was 90·0% (67·4-97·0; three [0·2%] of 1367 vs 30 [2·2%] of 1374; p_interaction=0·010). Overall vaccine efficacy across both groups was 70·4% (95·8% CI 54·8-80·6; 30 [0·5%] of 5807 vs 101 [1·7%] of 5829). From 21 days after the first dose, there were ten cases hospitalised for COVID-19, all in the control arm; two were classified as severe COVID-19, including one death. There were 74 341 person-months of safety follow-up (median 3·4 months, IQR 1·3-4·8): 175 severe adverse events occurred in 168 participants, 84 events in the ChAdOx1 nCoV-19 group and 91 in the control group. Three events were classified as possibly related to a vaccine: one in the ChAdOx1 nCoV-19 group, one in the control group, and one in a participant who remains masked to group allocation. INTERPRETATION: ChAdOx1 nCoV-19 has an acceptable safety profile and has been found to be efficacious against symptomatic COVID-19 in this interim analysis of ongoing clinical trials. FUNDING: UK Research and Innovation, National Institute for Health Research (NIHR), Coalition for Epidemic Preparedness Innovations, Bill & Melinda Gates Foundation, Lemann Foundation, Rede D'Or, Brava and Telles Foundation, NIHR Oxford Biomedical Research Centre, Thames Valley and South Midlands NIHR Clinical Research Network, and AstraZeneca.
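The "1 - relative risk" definition in the abstract can be checked against the raw case counts it reports. The sketch below computes a crude, unadjusted efficacy from attack rates only. It is not the trial's method, which used a robust Poisson regression adjusted for age, so the results should only approximate the published figures. The function name `crude_vaccine_efficacy` is hypothetical.

```python
def crude_vaccine_efficacy(cases_vax, n_vax, cases_ctrl, n_ctrl):
    """Return 1 - relative risk from raw attack rates.

    Assumption: no covariate adjustment and no person-time weighting,
    unlike the age-adjusted Poisson model described in the abstract.
    """
    risk_vax = cases_vax / n_vax      # attack rate in the vaccine group
    risk_ctrl = cases_ctrl / n_ctrl   # attack rate in the control group
    return 1.0 - risk_vax / risk_ctrl


# Case counts and group sizes as reported in the abstract.
cohorts = {
    "SD/SD": (27, 4440, 71, 4455),
    "LD/SD": (3, 1367, 30, 1374),
    "Overall": (30, 5807, 101, 5829),
}

for name, counts in cohorts.items():
    print(f"{name}: crude VE = {crude_vaccine_efficacy(*counts):.1%}")
```

The crude values come out at roughly 61.8%, 89.9%, and 70.2%, close to the adjusted estimates of 62·1%, 90·0%, and 70·4% reported in the abstract.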

    Service and Document Based Interoperability for European eCustoms Solutions

    Overall view against the facade of the Ace Museum; previously exhibited at the Vancouver Sculpture Biennale (January 2010). With a vivid chrome finish, the metal sculpture is constructed from horizontal sections that were purposefully staggered, representing his broken legacy. Atop is a very playful inclusion of a baby (with Mao's head) holding a balance stick on top of Lenin's protruding head, as if walking a tightrope of his risky Marxist ideals. The style has been called "cynical realism." Gao Shen (Zhen) and Gao Qiang, both born in Jinan, Shandong Province, are Beijing-based artists who work together, known as the Gao Brothers (Gao shi xiong di). They are also authors, photographers, and movie producer-directors. [Biography from LCNAF]
