Interpretable methods in cancer diagnostics
Cancer is a hard problem. It is hard for the patients, for the doctors and nurses,
and for the researchers working on understanding the disease and finding better
treatments for it. The challenges faced by a pathologist diagnosing a patient's
disease are not necessarily the same as those faced by cell biologists
working on experimental treatments and on understanding the fundamentals of
cancer. In this thesis we address challenges faced by both of these
groups.
This thesis first presents methods to improve the analysis of the flow cytometry
data frequently used in the diagnostic process, specifically for the two
subtypes of non-Hodgkin lymphoma that are our focus: follicular lymphoma
and diffuse large B-cell lymphoma. Combining concepts from graph
theory, dynamic programming, and machine learning, we present methods to
improve the diagnostic process and the analysis of these data.
The interpretability of the method helps a pathologist to better understand a
patient's disease, which in turn improves the choice of treatment.
In the second part, we focus on the analysis of DNA methylation and gene
expression data, both of which present the challenge of being very
high-dimensional while offering comparatively few samples. We present an ensemble
model that adapts to the different patterns seen in each dataset, in order to
cope with noise and batch effects. At the same time, the interpretability of our
model helps a pathologist to better find and tune the treatment for the patient:
a step further towards personalized medicine.
Fairlearn: Assessing and Improving Fairness of AI Systems
Fairlearn is an open source project to help practitioners assess and improve
fairness of artificial intelligence (AI) systems. The associated Python
library, also named fairlearn, supports evaluation of a model's output across
affected populations and includes several algorithms for mitigating fairness
issues. Grounded in the understanding that fairness is a sociotechnical
challenge, the project integrates learning resources that aid practitioners in
considering a system's broader societal context.
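Evaluating a model's output across affected populations can be pictured as computing the same metric separately for each group and then comparing the results. The sketch below uses plain Python only and is not the fairlearn API; the function name accuracy_by_group and the toy labels are assumptions made for illustration.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy}, computed separately for each group label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


# Toy data (hypothetical): eight predictions split across two groups.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

per_group = accuracy_by_group(y_true, y_pred, groups)
# The gap between the best- and worst-served group is one simple disparity summary.
gap = max(per_group.values()) - min(per_group.values())
print(per_group, gap)
```

A single aggregate accuracy would hide the difference between the two groups, which is why the comparison is made per group.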
RchyOptimyx: Gating Hierarchy Optimization for Flow Cytometry
3.1 Processing using flowType
Additional file 2 of Interpretable per case weighted ensemble method for cancer associations
Detailed performance measures. (PDF 578 kb)
Cancerouspdomains: comprehensive analysis of cancer type-specific recurrent somatic mutations in proteins and domains
Background: Discriminating driver mutations from those that play no role in cancer is a severe bottleneck in elucidating the molecular mechanisms underlying cancer development. Since protein domains represent functional regions within proteins, mutations in them may disrupt protein function. Therefore, studying mutations at the domain level may lead researchers to a more accurate assessment of the functional impact of the mutations.
Results: This article presents a comprehensive study that maps mutations from 29 cancer types to both sequence- and structure-based domains. Statistical analysis was performed to identify candidate domains in which mutations occur with high statistical significance. For each cancer type, the corresponding type-specific domains were distinguished among all candidate domains. Subsequently, the cancer type-specific domains facilitated the identification of specific proteins for each cancer type. In addition, interactome analysis of the specific proteins of each cancer type showed high levels of interconnectivity among them, which implies a functional relationship. To evaluate the role of mitochondrial genes, stem cell-specific genes and DNA repair genes in cancer development, their mutation frequency was determined via further analysis.
Conclusions: This study provides researchers with a publicly available data repository for studying both CATH and Pfam domain regions on protein-coding genes. Moreover, the associations between different groups of genes/domains and various cancer types have been clarified. The work is available at http://www.cancerouspdomains.ir
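The abstract does not name the statistical test used to flag candidate domains. As a purely illustrative sketch, the code below assumes a one-sided binomial test: if mutations fell uniformly along a protein, the share landing in a domain would match the domain's share of the protein length. The function name and all numbers are hypothetical and are not taken from the study.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical numbers: a 100-residue domain inside a 500-residue protein
# that carries 8 of the protein's 10 observed mutations.
domain_fraction = 100 / 500
p_value = binomial_upper_tail(8, 10, domain_fraction)
print(f"p-value under the uniform-mutation assumption: {p_value:.2e}")
```

A small p-value here only says the domain carries more mutations than its length alone would predict. A real analysis across many domains would also need a multiple-testing correction.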
Additional file 1: of Cancerouspdomains: comprehensive analysis of cancer type-specific recurrent somatic mutations in proteins and domains
It contains 27 supplementary tables, Tables A1–27, with the data described in the text. (XLSX 1032 kb)
Sage-Bionetworks/synapsePythonClient: v4.0.0-rc
<h2>Highlights</h2>
<ul>
<li><strong>Only authentication through Personal Access Token</strong>
<strong>(aka: Authentication bearer token) is supported</strong>. Review the <a href="https://python-docs.synapse.org/tutorials/authentication/">Authentication document</a> for information on setting up your usage of a Personal Access Token to authenticate with Synapse.</li>
<li><strong>Date type Annotations on Synapse entities are now timezone aware</strong>. Review our <a href="https://python-docs.synapse.org/reference/annotations/">reference documentation for Annotations</a>. The <a href="https://pypi.org/project/pytz/"><code>pytz</code> package</a> is recommended if you regularly work with data across time zones.<ul>
<li>If you do not set the <code>tzinfo</code> field on a date or datetime instance we will use the timezone of the machine where the code is executing.</li>
<li>If you are using the <a href="https://python-docs.synapse.org/explanations/manifest_tsv/#annotations">Manifest TSV</a> for bulk actions on your projects you'll now see that <code>synapseutils.sync.syncFromSynapse</code> will store dates as <code>YYYY-MM-DDTHH:MM:SSZ</code>. Review our documentation for an <a href="https://python-docs.synapse.org/explanations/manifest_tsv/#example-manifest-file">example manifest file</a>. Additionally, if you'd like to upload an annotation in a specific timezone please make sure that it is in <a href="https://en.wikipedia.org/wiki/ISO_8601">ISO 8601 format</a>. If you do not specify a timezone it is assumed to use the timezone of the machine where the code is executing.</li>
</ul>
</li>
<li><strong>Support for annotations with multiple values through the</strong> <a href="https://python-docs.synapse.org/explanations/manifest_tsv/#multiple-values-of-annotations-per-key"><strong>Manifest TSV</strong></a> with the usage of a comma delimited bracket wrapped list. Any manifest files wishing to take advantage of multi-value annotations need to match this format. Examples:<ul>
<li><code>["Annotation, with a comma", another annotation]</code></li>
<li><code>[1,2,3]</code></li>
<li><code>[2023-12-04T07:00:00Z,2000-01-01T07:00:00Z]</code></li>
</ul>
</li>
<li>Migration and expansion of the docs site! You'll see that the look, feel, and flow of all of the information on this site has been touched. As we move forward we hope that you'll <a href="https://sagebionetworks.jira.com/servicedesk/customer/portal/5/group/7">provide the Data Processing and Engineering team feedback on areas we can improve</a>.</li>
<li>Expansion of the available Python Tutorials can be found <a href="https://python-docs.synapse.org/tutorials/python_client/">starting here</a>.</li>
</ul>
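<p>The date and multi-value conventions described in the highlights can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the client's implementation: <code>to_manifest_timestamp</code> and <code>parse_multi_value_cell</code> are hypothetical helper names, the trailing <code>Z</code> is assumed to denote UTC, and the parsing rules are inferred only from the three example cells listed above.</p>

```python
import csv
from datetime import datetime, timedelta, timezone


def to_manifest_timestamp(value: datetime) -> str:
    """Format a datetime as YYYY-MM-DDTHH:MM:SSZ (assumed to mean UTC)."""
    if value.tzinfo is None:
        # A naive value is interpreted in the timezone of the executing machine.
        value = value.astimezone()
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def parse_multi_value_cell(cell: str) -> list:
    """Split a bracket-wrapped, comma-delimited cell into its string values."""
    text = cell.strip()
    if not (text.startswith("[") and text.endswith("]")):
        return [text]  # not a multi-value cell, keep it as a single value
    inner = text[1:-1]
    if not inner.strip():
        return []
    # csv handles quoted values that themselves contain commas.
    return next(csv.reader([inner], skipinitialspace=True))


aware = datetime(2023, 12, 4, 0, 0, 0, tzinfo=timezone(timedelta(hours=-7)))
print(to_manifest_timestamp(aware))
print(parse_multi_value_cell('["Annotation, with a comma", another annotation]'))
print(parse_multi_value_cell("[1,2,3]"))
```

<p>The parser returns every value as a string. Converting values to numbers or dates is left out because the release notes do not describe how the client infers types.</p>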
<h2>What's Changed</h2>
<ul>
<li>Adding a label to the dockerfile to automatically label it for this repo by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1018</li>
<li>Updates Dockerfile to Correctly Install Dependencies by @BWMac in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1019</li>
<li>[SYNPY-1358] Correction of timestamp in annotations from manifest file by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1020</li>
<li>[SYNPY-1336] Benchmarking upload with annotations by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1021</li>
<li>[SYNPY-1321] Download benchmark results by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1024</li>
<li>[SYNPY-1360] Migrating to mkdocstrings by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1025</li>
<li>[SYNPY-1366] Add code coverage by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1029</li>
<li>[SYNPY-1362] High level best practices for project structure by @thomasyu888 in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1028</li>
<li>[SYNPY-1371] Migrate to Google Style by @BWMac in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1033</li>
<li>[SYNPY-1302] Replace getPermission with get_acl and add new get_permissions by @danlu1 in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1037</li>
<li>[SYNPY-1334] Revamp getting started docs by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1036</li>
<li>[SYNPY-1332] Pypi deployment strategy by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1038</li>
<li>[SYNPY-1370] Documentation Upgrade by @jaymedina in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1032</li>
<li>[SYNPY-1370] Minor formatting fixes by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1039</li>
<li>[SYNPY-1371] Doc fixes by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1040</li>
<li>[SYNPY-1225] Support authToken only by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1041</li>
<li>[SYNPY-1392] Remove some deprecated pieces by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1043</li>
<li>[Synpy 1369] Migrate to Google style by @danlu1 in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1042</li>
<li>[SYNPY-1387] Update Structure Project doc by @danlu1 in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1044</li>
<li>[SYNPY-1357] Allow multiple values in manifest TSV by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1030</li>
</ul>
<h2>New Contributors</h2>
<ul>
<li>@jaymedina made their first contribution in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/1032</li>
</ul>
<p><strong>Full Changelog</strong>: https://github.com/Sage-Bionetworks/synapsePythonClient/compare/v3.2.0...v4.0.0-rc</p>
Sage-Bionetworks/synapsePythonClient: v3.1.0-rc
<h2>What's Changed</h2>
<ul>
<li>[SYNPY-49] Aggregate acl based on groups by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/979</li>
<li>[SYNPY-967] deprecated memoize and added @lru_cache by @BryanFauble, @linglp in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/983</li>
<li>SYNPY-1285: Create pipfile by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/984</li>
<li>[SYNPY-1282] Adds Type Hinting to <code>client.py</code> by @BWMac in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/987</li>
<li>[SYNPY-1293] Update urllib3 version dependency by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/988</li>
<li>[SYNPY-1283] Replace Broken Link URL by @BWMac in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/989</li>
<li>[SYNPY-1296] Config client error with api key or PAT by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/990</li>
<li>[SYNPY-1283] Adds Missing Trailing Space (Broken Link Fix) by @BWMac in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/991</li>
<li>[SYNPY-1295] Adding to the credentials.rst doc by @BryanFauble in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/992</li>
</ul>
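<p>The SYNPY-967 entry above replaces a custom <code>memoize</code> helper with the standard library's <code>lru_cache</code>. A minimal sketch of what that decorator does follows. The function <code>lookup_entity_name</code> and the <code>calls</code> list are hypothetical and do not come from the client's code.</p>

```python
from functools import lru_cache

calls = []  # records how often the function body actually runs


@lru_cache(maxsize=128)
def lookup_entity_name(entity_id: str) -> str:
    """Hypothetical stand-in for an expensive lookup; results are cached per argument."""
    calls.append(entity_id)
    return f"name-for-{entity_id}"


first = lookup_entity_name("syn123")
second = lookup_entity_name("syn123")  # served from the cache, body not re-run
print(first, second, len(calls), lookup_entity_name.cache_info())
```

<p>Because <code>lru_cache</code> keys on the call arguments, they must be hashable, and <code>cache_clear()</code> is available on the decorated function when cached results need to be discarded.</p>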
<h2>New Contributors</h2>
<ul>
<li>@BryanFauble made their first contribution in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/979</li>
<li>@BWMac made their first contribution in https://github.com/Sage-Bionetworks/synapsePythonClient/pull/987</li>
</ul>
<p><strong>Full Changelog</strong>: https://github.com/Sage-Bionetworks/synapsePythonClient/compare/v3.0.0...v3.1.0-rc</p>