    Analysis of user-generated content from online social communities to characterize and predict depression degree

    The identification of a mental disorder at its early stages is a challenging task because it requires clinical interventions that may not be feasible in many cases. Social media such as online communities and blog posts have shown promising features that can help detect and characterise mental disorders at an early stage. In this work, we use user-generated content to identify depression and further characterise its degree of severity. We used the content of user-generated posts and their associated mood tags to understand and differentiate the linguistic style and sentiment of the content. We applied machine learning and statistical analysis methods to discriminate depressive posts and communities from non-depressive ones. The depression degree of a depressive post is identified using variations in valence values based on the mood tag. The proposed methodology achieved 90%, 95% and 92% accuracy for the classification of depressive posts, depressive communities and depression degree, respectively.

    Improving classification of epileptic and non-epileptic EEG events by feature selection

    This is the Accepted Manuscript version of the following article: E. Pippa, et al, “Improving classification of epileptic and non-epileptic EEG events by feature selection”, Neurocomputing, Vol. 171: 576-585, July 2015. The final published version is available at: http://www.sciencedirect.com/science/article/pii/S0925231215009509?via%3Dihub Copyright © 2015 Elsevier B.V.

    Correctly distinguishing generalized epileptic episodes from non-epileptic ones, such as psychogenic non-epileptic seizures (PNES) and vasovagal or vasodepressor syncope (VVS), is rarely tackled in the literature, despite its importance for administering appropriate treatment, improving the patient's life, and reducing costs for the patient and the healthcare system. Clinicians usually differentiate between generalized epileptic seizures and PNES based on clinical features and video-EEG. In this work, we investigate the use of machine learning techniques for automatic classification of generalized epileptic and non-epileptic events based only on multi-channel EEG data. For this purpose, we extract signal patterns in the time domain and in the frequency domain and then combine all features across channels to characterize the spatio-temporal manifestation of seizures. Several classification algorithms are explored and evaluated on EEG epochs from 11 subjects in an inter-subject cross-validation setting. Because of the large number of features, feature ranking and selection is performed prior to classification using the ReliefF ranking algorithm within two different voting strategies. The classification models using feature subsets achieved higher accuracy than the models using all features, reaching 95% (Bayesian Network), 89% (Random Committee) and 87% (Random Forest) for binary classification (epileptic versus non-epileptic). The results demonstrate the competitiveness of this approach compared with previous methods.
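
    The idea of ranking features before classification can be illustrated with a minimal sketch of basic Relief, a simplified relative of ReliefF (ReliefF additionally averages over k nearest neighbours and handles multi-class data). This is not the authors' implementation: the function name `relief_scores` and the toy data are hypothetical, and the voting strategies mentioned in the abstract are not reproduced because the abstract does not describe them.

```python
def relief_scores(X, y):
    """Basic Relief weights for a binary problem (illustrative, not ReliefF)."""
    n, d = len(X), len(X[0])
    # Per-feature value range, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        same = [k for k in range(n) if k != i and y[k] == y[i]]
        other = [k for k in range(n) if y[k] != y[i]]
        hit = min(same, key=lambda k: dist(X[i], X[k]))    # nearest same-class sample
        miss = min(other, key=lambda k: dist(X[i], X[k]))  # nearest other-class sample
        for j in range(d):
            # Reward features that differ across classes, penalize those
            # that differ within a class.
            weights[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return weights


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.8, 0.8], [0.9, 0.2], [1.0, 0.5]]
y = [0, 0, 0, 1, 1, 1]
scores = relief_scores(X, y)
ranking = sorted(range(len(scores)), key=lambda j: -scores[j])
print(ranking)
```

    In an inter-subject setting, such a ranking would normally be computed on the training subjects only, so that the held-out subject does not influence which features are selected.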

    Evaluation of techniques for relevance analysis of radiological images using filters

    An important and fundamental stage in pattern recognition on images is determining the set of features that best describes the image. This paper presents an additional stage, known as relevance analysis, between the characterization of the image and its subsequent classification or the retrieval of images similar to a given image. Relevance analysis reduces the dimensionality of the initial feature set to a new, smaller set that preserves the hit rate of the retrieval. The analyzed images were lung nodules in chest radiographs from an open-access database available through the Japanese Society of Radiological Technology.
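
    A filter-based relevance analysis can be sketched as scoring each feature independently of any classifier and keeping the highest-scoring ones. The abstract does not name the filters that were evaluated, so the Fisher score used below is an assumption chosen for illustration; the function names and the toy feature matrix are hypothetical.

```python
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Fisher score per feature: between-class over within-class variance."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        overall = mean(col)
        between = within = 0.0
        for c in classes:
            vals = [row[j] for row, label in zip(X, y) if label == c]
            between += len(vals) * (mean(vals) - overall) ** 2
            within += len(vals) * pvariance(vals)
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def select_top_k(X, y, k):
    """Keep the k highest-scoring features; return their indices and the reduced data."""
    scores = fisher_scores(X, y)
    keep = sorted(sorted(range(len(scores)), key=lambda j: -scores[j])[:k])
    return keep, [[row[j] for j in keep] for row in X]


# Hypothetical toy features: only feature 1 differs between the two classes.
X = [[0.5, 1.0, 3.0], [0.4, 1.2, 1.0], [0.6, 0.9, 2.0],
     [0.5, 3.0, 2.5], [0.6, 3.1, 1.5], [0.4, 2.9, 2.0]]
y = [0, 0, 0, 1, 1, 1]
keep, X_reduced = select_top_k(X, y, k=1)
print(keep, X_reduced[0])
```

    Whether the reduced set preserves the retrieval hit rate still has to be checked empirically, by comparing retrieval results before and after the reduction.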

    Stochastic modelling of the order book

    This bachelor's thesis studies stochastic modelling of the order book, a topic that draws on several areas of mathematics. It examines one central model in depth, which builds an understanding of order book modelling and supports a critique of the model based on empirical studies and alternative models. The thesis is a literature review: the models are compared qualitatively, and the proposed improvements are based on other studies. The key quantities in order book modelling are the bid and ask prices, the number of orders at each price, and the probabilities with which the order book changes. The order book is first presented in mathematical notation, followed by formulas for these quantities. Improvement ideas are introduced along the way; for example, alternative methods could be used to find the best parameters when estimating the model. The thesis presents formulas for three probabilities: that the mid-price increases or decreases, that a limit order is executed before the price moves, and that a market maker earns the spread. These probabilities make it possible to develop trading strategies, of which two simple examples are presented. In the first, a market participant buys a stock and sells it later when the probability of a mid-price increase is high. In the second, a market participant places two limit orders when the probability of making the spread is high. Alternative models are briefly introduced and compared with the main model. They were chosen to be as varied as possible, so that different weaknesses of the main model could be identified and the most likely main cause of inaccurate results could be found. The alternative models differ in the use of unit-size orders, the correlations between events, the importance of differential equations, the assumed probability distribution, whether the solution is analytical, the effect of volatility, the economic explanation, and how a market participant's own actions affect the order book. Based on these observations, possible improvements to the model are proposed as topics for further research.
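
    The probability that the mid-price increases can be illustrated with a small Monte Carlo sketch. It is not the model studied in the thesis: it assumes unit-size orders, the same constant arrival rate `lam` and removal rate `mu` at both best quotes, and that the mid-price moves up when the best ask queue empties before the best bid queue. All names and parameter values are hypothetical.

```python
import random


def prob_mid_price_up(bid_size, ask_size, lam, mu, n_paths=20000, seed=0):
    """Estimate P(best ask queue empties before best bid queue).

    Assumes lam < mu so that queues drain and every path terminates quickly.
    """
    rng = random.Random(seed)
    p_arrival = lam / (lam + mu)  # chance that the next event adds an order
    ups = 0
    for _ in range(n_paths):
        bid, ask = bid_size, ask_size
        while bid > 0 and ask > 0:
            step = 1 if rng.random() < p_arrival else -1
            # Both queues have equal total event rates, so the next event
            # hits either queue with probability 1/2.
            if rng.random() < 0.5:
                bid += step
            else:
                ask += step
        ups += ask == 0
    return ups / n_paths


p_balanced = prob_mid_price_up(3, 3, lam=1.0, mu=1.5, n_paths=4000)
p_thin_ask = prob_mid_price_up(5, 1, lam=1.0, mu=1.5, n_paths=4000)
print(p_balanced, p_thin_ask)
```

    With equal queue sizes the estimate should be close to one half by symmetry, and a thin ask queue should push it well above one half. A strategy of the first kind described above would buy only when such a probability exceeds a chosen threshold.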