122 research outputs found
Approximate TF-IDF based on topic extraction from massive message stream using the GPU
The Web is a constantly expanding global information space that includes disparate types of data and resources. Recent trends demonstrate the urgent need to manage large data streams, especially in specific application domains such as critical infrastructure systems, sensor networks, log file analysis, search engines and, more recently, social networks. All of these applications involve large-scale data-intensive tasks, often subject to time and space constraints. Algorithms, data management and data retrieval techniques must be able to process data streams, i.e., process data as it becomes available and provide an accurate response based solely on the portion of the stream that has already been received. Data retrieval techniques, however, often require a traditional data storage and processing approach, i.e., all data must be available in storage before it can be processed. For instance, a widely used relevance measure is Term Frequency–Inverse Document Frequency (TF–IDF), which evaluates how important a word is in a collection of documents and requires the whole dataset to be known a priori.
To address this problem, we propose an approximate version of the TF–IDF measure suitable for continuous data streams (such as exchanges of messages, tweets and sensor-based log files). The algorithm for calculating this measure makes two assumptions: a fast response is required, and memory is limited and far smaller than the size of the data stream. In addition, to provide the computational power required to process massive data streams, we also present a parallel implementation of the approximate TF–IDF calculation using Graphics Processing Units (GPUs).
This implementation of the algorithm was tested on generated and real data streams and was able to capture the most frequent terms. Our results demonstrate that the approximate version of the TF–IDF measure performs at a level comparable to that of the exact TF–IDF measure.
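The abstract does not spell out the approximation itself, so the following is only a minimal sketch of the general idea (scoring each message as it arrives while keeping a bounded term table), not the authors' algorithm and not their GPU implementation. The class name `StreamingTfIdf`, the `max_terms` memory budget, the whitespace tokenizer, the smoothed IDF formula and the "keep the most frequent terms" eviction rule are all assumptions made for illustration.

```python
import math
from collections import Counter


class StreamingTfIdf:
    """Illustrative approximate TF-IDF over a message stream (hypothetical design)."""

    def __init__(self, max_terms=10_000):
        self.max_terms = max_terms  # assumed memory budget: number of tracked terms
        self.doc_freq = Counter()   # approximate document frequency per tracked term
        self.n_docs = 0             # number of non-empty messages seen so far

    def update(self, message):
        """Consume one message and return approximate TF-IDF scores for its terms."""
        tokens = message.lower().split()  # naive whitespace tokenizer (assumption)
        if not tokens:
            return {}
        self.n_docs += 1
        term_counts = Counter(tokens)
        for term in term_counts:
            self.doc_freq[term] += 1

        scores = {}
        for term, count in term_counts.items():
            tf = count / len(tokens)
            # Smoothed IDF computed only from the part of the stream seen so far.
            idf = math.log(1 + self.n_docs / self.doc_freq[term])
            scores[term] = tf * idf

        # Bounded memory: when the table overflows, keep only the most frequent
        # terms. Evicted terms restart from zero if they reappear, which is the
        # source of the approximation error in this sketch.
        if len(self.doc_freq) > self.max_terms:
            self.doc_freq = Counter(dict(self.doc_freq.most_common(self.max_terms)))
        return scores


if __name__ == "__main__":
    model = StreamingTfIdf(max_terms=3)
    for msg in ["storm warning issued", "storm damage reported", "storm passes"]:
        print(model.update(msg))
```

Because each message is scored against the statistics accumulated so far, early scores are less reliable than later ones; the eviction rule trades accuracy on rare terms for a fixed memory footprint.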
Towards an Ontology Design Pattern for UAV Video Content Analysis
Video scene understanding is attracting increasing research investment in artificial intelligence, pattern recognition, and computer vision, especially given advances in sensor technologies. Developing autonomous unmanned vehicles able to recognize not just the targets appearing in a scene but the complete scene in which the targets are involved (describing events, actions, situations, etc.) is becoming crucial in recent advanced intelligent surveillance systems. Alongside these consolidated technologies, Semantic Web technologies are also emerging, offering seamless support for the high-level understanding of scenes. To this end, the paper proposes a systematic ontology model to support and improve video content analysis by generating a comprehensive high-level scene description, achieved through semantic reasoning and querying. The ontology schema results from the integration of new and existing ontologies and provides design pattern guidelines for obtaining a high-level description of a whole scenario. It starts from the description of basic targets in the video scenario, supported by video tracking algorithms and target classification; it then provides a higher-level interpretation, combining event-driven target interactions (for local activity comprehension), to gradually reach a high level of abstraction that enables a concise and complete scenario description.
Laboratory-based surveillance of invasive listeriosis in Northern Italy over a fourteen-year period: epidemiological and clinical results
Introduction
Invasive listeriosis is a rare foodborne disease with a large public health impact because of the severity of its clinical manifestations and its high fatality rate. In this study, we provide a snapshot of the epidemiology of listeriosis in the Lombardy Region, Northern Italy, reviewing enhanced surveillance data collected over fourteen years, following the implementation in 2005 of a voluntary laboratory-based surveillance system for the referral of clinical isolates of Listeria monocytogenes to a regional reference laboratory.
Methods
Data on invasive listeriosis cases from 2005 to 2018 were extracted from the regional laboratory-based surveillance system database and compared with data from the regional mandatory disease notification system.
Results
Over the fourteen-year period under study, 533 Listeria monocytogenes isolates were detected by the laboratory surveillance system, 55 of which were from pregnancy-related cases. The median age of non-pregnancy-associated patients was 71 years, with 64.6% of cases observed in the elderly. Cases with underlying medical risk conditions accounted for 92.1%, and the fatality rate was 26.2%. By integrating data from the two sources, a total of 935 cases were recorded. Collecting data through the laboratory surveillance system increased surveillance sensitivity by 18%.
Conclusions
Our results documented the growing epidemiological relevance of listeriosis through the analysis of two information sources. The data we obtained were consistent with the literature, except for pregnancy-related cases, which are often underdiagnosed. This study highlighted the importance of the laboratory-based surveillance system, which led to a significant increase in the sensitivity of the mandatory notification system.
Obstetric near-miss cases among women admitted to intensive care units in Italy
Objective. Maternal near-miss defines a narrow category of morbidity encompassing potentially life-threatening episodes. The purpose of this study was to detect near-miss instances among women admitted to intensive care units or coronary care units, analyze associated causes, and compute absolute and specific maternal morbidity rates in six Italian regions. Design. Observational retrospective study. Setting. Six Italian regions representing 49% of all resident Italian women aged 15-49 years. Population. The study population included all pregnant women aged 15-49 years admitted to intensive care units or coronary care units in the participating regions. Cases were defined as women aged 15-49 years resident in the participating regions, with one or more hospitalizations in intensive care for pregnancy or any pregnancy outcome between 2004 and 2005. Methods. Cases were identified through the Hospital Discharge Database. Enrolled cases were diagnosed according to the 9th International Classification of Diseases. Main outcome measure. Maternal near-miss rate (number of women experiencing an admission to intensive care units/all women with live or stillborn babies). Results. A total of 1259 near-miss cases were identified and the total maternal near-miss rate was 2.0/1000 deliveries. Seventy percent of the women were admitted to intensive care units or coronary care units after a cesarean section. The leading associated risk factors were obstetric hemorrhage/disseminated intravascular coagulation (40%) and hypertensive disorders of pregnancy (29%). Conclusions. Monitoring of near-miss morbidity in conjunction with mortality surveillance could help to identify effective preventive measures for potentially life-threatening episodes.
Genome analysis of Legionella pneumophila ST23 from various countries reveals highly similar strains
© 2022 Ricci et al. This article is available under a Creative Commons License (Attribution 4.0 International, as described at https://creativecommons.org/licenses/by/4.0/).
Legionella pneumophila serogroup 1 (Lp1) sequence type (ST) 23 is one of the most commonly detected STs in Italy, where it currently causes all investigated outbreaks. ST23 has caused both epidemic and sporadic cases between 1995 and 2018 and was analysed at the genomic level and compared with ST23 isolated in other countries to determine possible similarities and differences. Core genome multi-locus sequence typing (cgMLST), based on a previously described set of 1,521 core genes, and single-nucleotide polymorphism (SNP) approaches were applied to an ST23 collection including genomes from Italy, France, Denmark and Scotland. DNA was extracted automatically, libraries were prepared using the NextEra library kit, and MiSeq sequencing was performed. Overall, 63 clinical and environmental Italian Lp1 isolates and a further seven and 11 ST23 isolates from Denmark and Scotland, respectively, were sequenced and subjected to pangenome analysis. Both cgMLST and SNP analyses showed very few locus and SNP variations in ST23 genomes. All the ST23 causing outbreaks and sporadic cases in Italy and elsewhere were phylogenetically related, independent of year, town or country of isolation. Distances among the ST23s were further shortened when SNPs due to horizontal gene transfer were removed. The Lp1 ST23 isolated in Italy have kept their monophyletic origin, but they are also phylogenetically close to ST23 from other countries. ST23 is quite widespread in Italy, and a thorough epidemiological investigation is required to determine sources of infection when this ST is identified in both sporadic LD cases and outbreaks.
[Epidemiology and surveillance of hepatitis E in Italy: data from the SEIEVA surveillance system 2007-2019]
Hepatitis E is a disease found all over the world, with endemic levels varying according to ecological and socioeconomic factors. In developing countries, large epidemics spread mainly through contaminated water; in developed countries, hepatitis E has always been considered a sporadic disease, closely associated with travel to endemic areas, especially in Southeastern Asia. In recent years, this perception has been changing significantly because of an increasing number of autochthonous cases reported in many European countries.
ITALIAN CANCER FIGURES - REPORT 2015: The burden of rare cancers in Italy = I TUMORI IN ITALIA - RAPPORTO 2015: I tumori rari in Italia
OBJECTIVES:
This collaborative study, based on data collected by the network of Italian Cancer Registries (AIRTUM), describes the burden of rare cancers in Italy. Estimated number of new rare cancer cases yearly diagnosed (incidence), proportion of patients alive after diagnosis (survival), and estimated number of people still alive after a new cancer diagnosis (prevalence) are provided for about 200 different cancer entities.
MATERIALS AND METHODS:
Data herein presented were provided by AIRTUM population-based cancer registries (CRs), currently covering 52% of the Italian population. This monograph uses the AIRTUM database (January 2015), which includes all malignant cancer cases diagnosed between 1976 and 2010. All cases are coded according to the International Classification of Diseases for Oncology (ICD-O-3). Data underwent standard quality checks (described in the AIRTUM data management protocol) and were checked against rare-cancer specific quality indicators proposed and published by RARECARE and HAEMACARE (www.rarecarenet.eu; www.haemacare.eu). The definition and list of rare cancers proposed by the RARECAREnet "Information Network on Rare Cancers" project were adopted: rare cancers are entities (defined as a combination of topographical and morphological codes of the ICD-O-3) having an incidence rate of less than 6 per 100,000 per year in the European population. This monograph presents 198 rare cancers grouped in 14 major groups. Crude incidence rates were estimated as the number of all new cancers occurring in 2000-2010 divided by the overall population at risk, for males and females (also for gender-specific tumours). The proportion of rare cancers out of the total cancers (rare and common) by site was also calculated. Incidence rates by sex and age are reported. The expected number of new cases in 2015 in Italy was estimated assuming the incidence in Italy to be the same as in the AIRTUM area. One- and 5-year relative survival estimates of cases aged 0-99 years diagnosed between 2000 and 2008 in the AIRTUM database, and followed up to 31 December 2009, were calculated using complete cohort survival analysis. To estimate the observed prevalence in Italy, incidence and follow-up data from 11 CRs for the period 1992-2006 were used, with a prevalence index date of 1 January 2007.
Observed prevalence in the general population was disentangled by time prior to the reference date (≤2 years, 2-5 years, ≤15 years). To calculate the complete prevalence proportion at 1 January 2007 in Italy, the 15-year observed prevalence was corrected by the completeness index, in order to account for those cancer survivors diagnosed before the cancer registry activity started. The completeness index by cancer and age was obtained by means of statistical regression models, using incidence and survival data available in the European RARECAREnet data.
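The arithmetic behind these indicators can be illustrated with a minimal sketch. It is not the AIRTUM procedure: the function names are hypothetical, the population figure in the usage example is a round illustrative number rather than a value from the monograph, and dividing observed prevalence by the completeness index assumes the index is the fraction of complete prevalence captured by the observation window.

```python
def crude_incidence_rate(new_cases, person_years, per=100_000):
    """Crude incidence rate: new cases divided by the population at risk
    (expressed in person-years), scaled to a rate per `per` people per year."""
    return new_cases / person_years * per


def expected_cases(rate_per_100k, population, per=100_000):
    """Expected yearly cases in a target population, assuming it shares the
    incidence rate observed in the registry area."""
    return rate_per_100k / per * population


def complete_prevalence(observed_prevalence, completeness_index):
    """Correct limited-duration observed prevalence for survivors diagnosed
    before registration started. Assumes the completeness index is the
    fraction (0 < index <= 1) of complete prevalence that was observed."""
    if not 0 < completeness_index <= 1:
        raise ValueError("completeness_index must be in (0, 1]")
    return observed_prevalence / completeness_index


if __name__ == "__main__":
    # Illustrative only: 60,000,000 is a round hypothetical population size.
    print(expected_cases(147, 60_000_000))
```

Applying a rate of 147 per 100,000 to a population of roughly 60 million gives a figure of the same order as the "about 89,000 new diagnoses" reported in the results.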
RESULTS:
In total, 339,403 tumours were included in the incidence analysis. The annual incidence rate (IR) of all 198 rare cancers in the period 2000-2010 was 147 per 100,000 per year, corresponding to about 89,000 new diagnoses in Italy each year, accounting for 25% of all cancers. Five cancers, rare at European level, were not rare in Italy because their IR was higher than 6 per 100,000; these tumours were: diffuse large B-cell lymphoma and squamous cell carcinoma of larynx (whose IRs in Italy were 7 per 100,000), multiple myeloma (IR: 8 per 100,000), hepatocellular carcinoma (IR: 9 per 100,000) and carcinoma of thyroid gland (IR: 14 per 100,000). Among the remaining 193 rare cancers, more than two thirds (No. 139) had an annual IR <0.5 per 100,000, accounting for about 7,100 new cancer cases; for 25 cancer types, the IR ranged between 0.5 and 1 per 100,000, accounting for about 10,000 new diagnoses; while for 29 cancer types the IR was between 1 and 6 per 100,000, accounting for about 41,000 new cancer cases. Among all rare cancers diagnosed in Italy, 7% were rare haematological diseases (IR: 41 per 100,000), 18% were solid rare cancers. Among the latter, the rare epithelial tumours of the digestive system were the most common (23%, IR: 26 per 100,000), followed by epithelial tumours of head and neck (17%, IR: 19) and rare cancers of the female genital system (17%, IR: 17), endocrine tumours (13% including thyroid carcinomas and less than 1% with an IR of 0.4 excluding thyroid carcinomas), sarcomas (8%, IR: 9 per 100,000), central nervous system tumours and rare epithelial tumours of the thoracic cavity (5% with an IR equal to 6 and 5 per 100,000, respectively). The remaining (rare male genital tumours, IR: 4 per 100,000; tumours of eye, IR: 0.7 per 100,000; neuroendocrine tumours, IR: 4 per 100,000; embryonal tumours, IR: 0.4 per 100,000; rare skin tumours and malignant melanoma of mucosae, IR: 0.8 per 100,000) each constituted <4% of all solid rare cancers.
Patients with rare cancers were on average younger than those with common cancers. Essentially, all childhood cancers were rare, while after age 40 years, the common cancers (breast, prostate, colon, rectum, and lung) became increasingly more frequent. For 254,821 rare cancers diagnosed in 2000-2008, 5-year relative survival (RS) was on average 55%, lower than the corresponding figure for patients with common cancers (68%). RS was lower for rare cancers than for common cancers at 1 year and continued to diverge up to 3 years, while the gap remained constant from 3 to 5 years after diagnosis. For rare and common cancers, survival decreased with increasing age. Five-year RS was similar and high for both rare and common cancers up to 54 years; it decreased with age, especially after 54 years, with the elderly (75+ years) having a 37% and 20% lower survival than those aged 55-64 years for rare and common cancers, respectively. We estimated that about 900,000 people were alive in Italy with a previous diagnosis of a rare cancer in 2010 (prevalence). The highest prevalence was observed for rare haematological diseases (278 per 100,000) and rare tumours of the female genital system (265 per 100,000). Very low prevalence (<10 per 100,000) was observed for rare epithelial skin cancers, for rare epithelial tumours of the digestive system and rare epithelial tumours of the thoracic cavity.
COMMENTS:
One in four cancer cases diagnosed in Italy is a rare cancer, in agreement with estimates of 24% calculated in Europe overall. In Italy, the group of all rare cancers combined includes 5 cancer types with an IR>6 per 100,000, in particular thyroid cancer (IR: 14 per 100,000). The exclusion of thyroid carcinoma from rare cancers reduces their proportion in Italy in 2010 to 22%. Differences in incidence across populations can be due to the different distribution of risk factors (whether environmental, lifestyle, occupational, or genetic), heterogeneous diagnostic intensity, as well as different diagnostic capacity; moreover, heterogeneity in accuracy of registration may determine some minor differences in the count of rare cancers. Rare cancers had worse prognosis than common cancers at 1, 3, and 5 years from diagnosis. Differences between rare and common cancers were small 1 year after diagnosis, but survival for rare cancers declined more markedly thereafter, consistent with the idea that treatments for rare cancers are less effective than those for common cancers. However, differences in stage at diagnosis could not be excluded, as 1- and 3-year RS for rare cancers was lower than the corresponding figures for common cancers. Moreover, rare cancers include many cancer entities with a bad prognosis (5-year RS <50%): cancer of head and neck, oesophagus, small intestine, ovary, brain, biliary tract, liver, pleura, multiple myeloma, acute myeloid and lymphatic leukaemia; in contrast, most common cancer cases are breast, prostate, and colorectal cancers, which have a good prognosis. The high prevalence observed for rare haematological diseases and rare tumours of the female genital system is due to their high incidence (the majority of haematological diseases are rare and gynaecological cancers added up to fairly high incidence rates) and relatively good prognosis.
The low prevalence of rare epithelial tumours of the digestive system was due to the low survival rates of the majority of tumours included in this group (oesophagus, stomach, small intestine, pancreas, and liver), despite the high incidence rate of rare epithelial cancers of these sites. This AIRTUM study confirms that rare cancers are a major public health problem in Italy and provides, for the first time in Italy, quantitative estimates of a problem long known to exist. This monograph provides detailed epidemiologic indicators for almost 200 rare cancers, the majority of which (72%) are very rare (IR<0.5 per 100,000). These data are of major interest for different stakeholders. Health care planners can find useful information herein to plan and reorganise health care services. Researchers now have numbers to design clinical trials considering alternative study designs and statistical approaches. Population-based cancer registries with good quality data are the best source of information to describe the rare cancer burden in a population.
Acute Delta Hepatitis in Italy spanning three decades (1991–2019): Evidence for the effectiveness of the hepatitis B vaccination campaign
Updated incidence data on acute Delta virus hepatitis (HDV) are lacking worldwide. Our aim was to evaluate the incidence of and risk factors for acute HDV in Italy after the introduction of compulsory vaccination against hepatitis B virus (HBV) in 1991. Data were obtained from the National Surveillance System of acute viral hepatitis (SEIEVA). Independent predictors of HDV were assessed by logistic-regression analysis. The incidence of acute HDV per 1 million population declined from 3.2 cases in 1987 to 0.04 in 2019, parallel to that of acute HBV per 100,000, which declined from 10.0 to 0.39 cases during the same period. The median age of cases increased from 27 years in the decade 1991-1999 to 44 years in the decade 2010-2019 (p < .001). Over the same period, the male/female ratio decreased from 3.8 to 2.1, the proportion of coinfections increased from 55% to 75% (p = .003) and the proportion of HBsAg-positive acute hepatitis cases tested for IgM anti-HDV decreased linearly from 50.1% to 34.1% (p < .001). People born abroad accounted for 24.6% of cases in 2004-2010 and 32.1% in 2011-2019. In the period 2010-2019, risky sexual behaviour (O.R. 4.2; 95%CI: 1.4-12.8) was the sole independent predictor of acute HDV; conversely, intravenous drug use was no longer associated with it (O.R. 1.25; 95%CI: 0.15-10.22). In conclusion, HBV vaccination was an effective measure to control acute HDV. Intravenous drug use is no longer an efficient mode of HDV spread. Testing for IgM anti-HDV is a grey area requiring attention. Acute HDV in foreigners should be monitored in the years to come.