Early Detection of Ovarian Cancer Using Gabor Wavelet Phase Quantization and Binary Coding
Ovarian cancer is the fifth most common cancer in women, but it is the most difficult to detect in its early stages. Early detection and treatment have been shown to raise the five-year survival rate from 12% when the disease is caught in stage four to 92% when it is caught in stage one. Using signal processing, pattern classification and a learning algorithm, it is possible to identify patterns in high-dimensional mass spectrometry data that distinguish between cancer and non-cancer ovarian samples. For our research, proteomic spectra were generated using SELDI-TOF mass spectrometry; the data set comprised 162 ovarian cancer and 91 non-ovarian cancer samples. We apply a Gabor filter to the mass spectrometry data and design a binary coding scheme for phase quantization encoding that is used for the pattern classification. The resulting pattern exposes crucial features in the data that can be used to correctly classify unmasked samples for the presence or absence of ovarian cancer. Our proposed algorithm successfully discriminated between ovarian cancer and non-ovarian cancer samples, with sensitivities, specificities and accuracies in the 90% to 100% range.
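As a rough illustration of the idea of Gabor filtering followed by phase quantization, the sketch below convolves a one-dimensional spectrum with a complex Gabor kernel and keeps two bits per point, given by the signs of the real and imaginary parts of the response. The abstract does not specify the actual kernel parameters or coding scheme, so the quadrant coding, the function names (`gabor_kernel`, `phase_quantize`, `hamming_distance`), and the values of `sigma` and `freq` are all assumptions made for illustration only.

```python
import numpy as np


def gabor_kernel(sigma, freq):
    """Complex 1-D Gabor kernel: Gaussian envelope times a complex sinusoid.

    sigma (envelope width, in samples) and freq (cycles per sample) are
    illustrative parameters, not values taken from the abstract.
    """
    half_width = int(np.ceil(3 * sigma))
    t = np.arange(-half_width, half_width + 1)
    envelope = np.exp(-(t ** 2) / (2.0 * sigma ** 2))
    return envelope * np.exp(2j * np.pi * freq * t)


def phase_quantize(spectrum, sigma=4.0, freq=0.1):
    """Encode each point of the filtered spectrum as two bits.

    Assumed scheme: bit 0 is the sign of the real part of the Gabor
    response, bit 1 is the sign of the imaginary part, i.e. the quadrant
    of the response phase.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    centered = spectrum - spectrum.mean()
    response = np.convolve(centered, gabor_kernel(sigma, freq), mode="same")
    bits = np.empty((spectrum.size, 2), dtype=np.uint8)
    bits[:, 0] = response.real >= 0
    bits[:, 1] = response.imag >= 0
    return bits.ravel()


def hamming_distance(code_a, code_b):
    """Fraction of bit positions at which two equal-length codes differ."""
    return float(np.mean(code_a != code_b))


# Synthetic stand-in for a spectrum; real SELDI-TOF data would be loaded here.
rng = np.random.default_rng(0)
example_spectrum = rng.random(200)
example_code = phase_quantize(example_spectrum)
```

Under these assumptions, two spectra could be compared by the Hamming distance between their binary codes, and a classifier could assign a new sample to the class whose reference codes are closest.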
An Improved Utility Driven Approach Towards K-Anonymity Using Data Constraint Rules
Indiana University-Purdue University Indianapolis (IUPUI)
As medical data continues to transition to electronic formats, opportunities arise for researchers to use this microdata to discover patterns and increase knowledge that can improve patient care. Now more than ever, it is critical to protect the identities of the patients contained in these databases. Even after removing obvious “identifier” attributes, such as social security numbers or first and last names, that clearly identify a specific person, it is possible to join “quasi-identifier” attributes from two or more publicly available databases to identify individuals.
K-anonymity is an approach that has been used to ensure that no one individual can be distinguished within a group of at least k individuals. However, most proposed approaches to k-anonymity have focused on improving algorithmic efficiency; less emphasis has been placed on ensuring the “utility” of anonymized data from a researcher’s perspective. We propose a new data utility measurement, called the research value (RV), which extends existing utility measurements by employing data constraint rules that are designed to improve the effectiveness of queries against the anonymized data.
To anonymize a given raw dataset, two algorithms are proposed that use predefined generalizations provided by the data content expert, together with their corresponding research values, to assess an attribute’s data utility as they generalize the data to ensure k-anonymity. In addition, an automated algorithm is presented that uses clustering and the RV to anonymize the dataset. All of the proposed algorithms scale efficiently when the number of attributes in a dataset is large.
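To make the k-anonymity condition concrete, the following minimal sketch checks whether every combination of quasi-identifier values occurs at least k times, and shows how generalizing one attribute can satisfy the condition. It does not implement the research value (RV) or any of the proposed algorithms, which the abstract does not describe in detail; the helper names (`is_k_anonymous`, `generalize_age`), the field names, and the sample records are hypothetical.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if each quasi-identifier combination appears >= k times."""
    groups = Counter(
        tuple(record[attr] for attr in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


def generalize_age(record, width=10):
    """Replace an exact age with a range such as '30-39' (illustrative only)."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Hypothetical microdata; 'age' and 'zip' act as quasi-identifiers.
records = [
    {"age": 34, "zip": "46202", "diagnosis": "A"},
    {"age": 37, "zip": "46202", "diagnosis": "B"},
    {"age": 41, "zip": "46202", "diagnosis": "A"},
    {"age": 45, "zip": "46202", "diagnosis": "C"},
]

raw_ok = is_k_anonymous(records, ["age", "zip"], k=2)
generalized = [generalize_age(r) for r in records]
generalized_ok = is_k_anonymous(generalized, ["age", "zip"], k=2)
```

Here the raw records fail the check for k = 2 because every age is unique, while the records with ages generalized to ten-year ranges pass. A utility measure such as the RV would then be needed to judge how much analytical value each candidate generalization preserves.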