
    Early Detection of Ovarian Cancer Using Gabor Wavelet Phase Quantization and Binary Coding

    Ovarian cancer is the fifth most common cancer in women, but it is the most difficult to detect in its early stages. Early detection and treatment of ovarian cancer have been shown to increase the five-year survival rate from 12% when the disease is caught at stage four to 92% when it is caught at stage one. Using signal processing, pattern classification, and a learning algorithm, it is possible to identify patterns in high-dimensional mass spectrometry data that distinguish between cancer and non-cancer ovarian samples. For our research, proteomic spectra were generated using SELDI-TOF mass spectrometry; the data set comprised 162 ovarian cancer and 91 non-ovarian cancer samples. We apply a Gabor filter to the mass spectrometry data and design a binary coding scheme for phase quantization that is used for pattern classification. The resulting pattern exposes crucial features in the data that can be used to correctly classify unmasked samples for the presence or absence of ovarian cancer. Our proposed algorithm successfully discriminated between ovarian cancer and non-ovarian cancer samples, with sensitivities, specificities, and accuracies in the 90% to 100% range.
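The filtering and coding steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no filter parameters or coding details, so the kernel width `sigma`, the frequency `freq`, the two-bits-per-sample quadrant code, and the Hamming-distance comparison are all assumptions chosen for illustration, as are the function names.

```python
import cmath
import math


def gabor_kernel(sigma, freq, half_width):
    """Complex 1-D Gabor kernel: Gaussian envelope times a complex sinusoid.

    sigma, freq and half_width are illustrative parameters, not values
    taken from the paper.
    """
    return [
        math.exp(-(t * t) / (2.0 * sigma * sigma)) * cmath.exp(2j * math.pi * freq * t)
        for t in range(-half_width, half_width + 1)
    ]


def gabor_response(signal, kernel):
    """Slide the kernel over the signal (same length out, zero padding at edges)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0j
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += signal[j] * w
        out.append(acc)
    return out


def phase_bits(response):
    """Quantize each complex response into a phase quadrant, encoded as 2 bits.

    Bit 1 is the sign of the real part, bit 2 the sign of the imaginary part
    (an assumed coding scheme).
    """
    bits = []
    for z in response:
        bits.append(1 if z.real >= 0 else 0)
        bits.append(1 if z.imag >= 0 else 0)
    return bits


def hamming_fraction(a, b):
    """Fraction of differing bits between two equal-length binary codes."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Hypothetical toy "spectra"; real SELDI-TOF spectra have thousands of points.
spectrum_a = [0.0, 0.2, 1.0, 0.3, 0.0, 0.1, 0.8, 0.2]
spectrum_b = [0.0, 0.1, 0.2, 0.9, 0.4, 0.0, 0.1, 0.7]

kernel = gabor_kernel(sigma=1.5, freq=0.25, half_width=3)
code_a = phase_bits(gabor_response(spectrum_a, kernel))
code_b = phase_bits(gabor_response(spectrum_b, kernel))
distance = hamming_fraction(code_a, code_b)
```

Under these assumptions, a new sample could be assigned the label of the reference code it is closest to in Hamming distance. The abstract does not say which classifier the authors used.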

    An Improved Utility Driven Approach Towards K-Anonymity Using Data Constraint Rules

    Indiana University-Purdue University Indianapolis (IUPUI)
    As medical data continues to transition to electronic formats, opportunities arise for researchers to use this microdata to discover patterns and increase knowledge that can improve patient care. Now more than ever, it is critical to protect the identities of the patients contained in these databases. Even after removing obvious “identifier” attributes that clearly identify a specific person, such as social security numbers or first and last names, it is possible to join “quasi-identifier” attributes from two or more publicly available databases to identify individuals. K-anonymity is an approach that has been used to ensure that no one individual can be distinguished within a group of at least k individuals. However, the majority of the proposed approaches have focused on improving the efficiency of the algorithms that implement k-anonymity; less emphasis has been placed on ensuring the “utility” of the anonymized data from a researcher’s perspective. We propose a new data utility measurement, called the research value (RV), which extends existing utility measurements by employing data constraint rules that are designed to improve the effectiveness of queries against the anonymized data. To anonymize a given raw dataset, two algorithms are proposed that use predefined generalizations provided by the data content expert, together with their corresponding research values, to assess an attribute’s data utility as they generalize the data to ensure k-anonymity. In addition, an automated algorithm is presented that uses clustering and the RV to anonymize the dataset. All of the proposed algorithms scale efficiently when the number of attributes in a dataset is large.
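The k-anonymity property described in the abstract can be made concrete with a small sketch. It shows only the basic check, plus one generalization step on a hypothetical `age` attribute; it does not implement the research value (RV) measure or the proposed algorithms, which the abstract does not define. The records, field names, and bucket width are invented for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


def generalize_age(record, width):
    """Replace an exact age with a range of the given width (e.g. 34 -> '30-39')."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Hypothetical records; "age" and "zip" play the role of quasi-identifiers.
records = [
    {"age": 34, "zip": "46202", "diagnosis": "A"},
    {"age": 36, "zip": "46202", "diagnosis": "B"},
    {"age": 52, "zip": "46205", "diagnosis": "A"},
    {"age": 57, "zip": "46205", "diagnosis": "C"},
]

raw_ok = is_k_anonymous(records, ["age", "zip"], k=2)
generalized = [generalize_age(r, width=10) for r in records]
generalized_ok = is_k_anonymous(generalized, ["age", "zip"], k=2)
```

In this toy data every raw record is unique on (`age`, `zip`), so the raw table is not 2-anonymous. After ages are generalized to ten-year ranges, each combination occurs twice. Coarser ranges make the table safer but less useful for queries, which is the trade-off a utility measure such as the RV is meant to quantify.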