ENTERPRISE CREDIT RISK ASSESSMENT ANALYZING THE DATA OF SHORT TERM ACTIVITY PERIOD
This research investigates whether companies can be classified into default and non-default groups using one year of financial data. The resulting statistical model enables banks to predict the default of new companies that lack sufficient financial history for credit risk assessment with other models. A classification and regression tree predicts company default with 96% accuracy. A complementary probit-model analysis of two years of financial data raises the classification accuracy to 99%.
DOI: https://doi.org/10.15544/ssaf.2012.2
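The two-stage workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration on synthetic data: the features, dataset, and accuracies are invented for the example, and a logistic model stands in for the paper's probit model (the link function differs, but the workflow is the same).

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic one-year financial ratios (e.g. liquidity, leverage,
# profitability); names and data are illustrative, not the study's dataset.
n = 1000
X1 = rng.normal(size=(n, 3))
# Default when leverage is high and profitability is low (plus noise).
y = ((X1[:, 1] - X1[:, 2] + rng.normal(scale=0.3, size=n)) > 1.0).astype(int)

Xtr, Xte, ytr, yte = train_test_split(X1, y, random_state=0)

# Stage 1: classification tree on one year of data.
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(Xtr, ytr)
acc_tree = tree.score(Xte, yte)

# Stage 2: with a second year of data (here simulated as noisy
# re-measurements of the same ratios), a probit-style linear model can
# refine the classification.
X2tr = np.hstack([Xtr, Xtr + rng.normal(scale=0.1, size=Xtr.shape)])
X2te = np.hstack([Xte, Xte + rng.normal(scale=0.1, size=Xte.shape)])
probit_like = LogisticRegression().fit(X2tr, ytr)
acc_two_year = probit_like.score(X2te, yte)
print(round(acc_tree, 2), round(acc_two_year, 2))
```

On this toy data both stages classify well; the paper's reported 96% and 99% figures refer to its own dataset and are not reproduced here.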
Statistical deformation reconstruction using multi-organ shape features for pancreatic cancer localization
Respiratory motion and the associated deformations of abdominal organs and tumors are essential information in clinical applications. However, inter- and intra-patient multi-organ deformations are complex and have not been statistically formulated, whereas single-organ deformations have been widely studied. In this paper, we introduce a multi-organ deformation library and its application to deformation reconstruction based on the shape features of multiple abdominal organs. Statistical multi-organ motion/deformation models of the stomach, liver, left and right kidneys, and duodenum were generated by shape matching their region labels defined on four-dimensional computed tomography images. A total of 250 volumes were measured from 25 pancreatic cancer patients. This paper also proposes per-region-based deformation learning using a non-linear kernel model to predict the displacement of pancreatic cancer for adaptive radiotherapy. The experimental results show that the proposed concept estimates deformations better than general per-patient-based learning models and achieves a clinically acceptable estimation error, with a mean distance of 1.2 ± 0.7 mm and a Hausdorff distance of 4.2 ± 2.3 mm throughout the respiratory motion.
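The core regression step (shape features in, displacement out, via a non-linear kernel model) can be sketched with kernel ridge regression on synthetic data. The feature dimensionality, the RBF kernel, and the data below are assumptions for illustration; the paper's per-region model and 4D-CT pipeline are not reproduced.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(1)

# Synthetic stand-in: predict a 3-D tumor displacement vector (in mm) from
# multi-organ shape features observed at each respiratory phase.
n_phases, n_features = 300, 6
shape_feats = rng.normal(size=(n_phases, n_features))
true_w = rng.normal(size=(n_features, 3))
displacement = np.tanh(shape_feats @ true_w) \
    + rng.normal(scale=0.05, size=(n_phases, 3))

# Fit a non-linear kernel regressor on the first 250 phases, test on the rest.
model = KernelRidge(kernel="rbf", alpha=1e-2, gamma=0.05)
model.fit(shape_feats[:250], displacement[:250])
pred = model.predict(shape_feats[250:])

# Mean Euclidean error of the predicted displacement vectors.
mean_err = np.linalg.norm(pred - displacement[250:], axis=1).mean()
print(round(float(mean_err), 3))
```

The kernel model captures the non-linear feature-to-displacement mapping better than a constant baseline, which is the property the paper's per-region learning relies on.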
ROMO: Retrieval-enhanced Offline Model-based Optimization
Data-driven black-box model-based optimization (MBO) problems arise in a
great number of practical application scenarios, where the goal is to find a
design over the whole space maximizing a black-box target function based on a
static offline dataset. In this work, we consider a more general but
challenging MBO setting, named constrained MBO (CoMBO), where only part of the
design space can be optimized while the rest is constrained by the environment.
A new challenge arising from CoMBO is that most observed designs that satisfy
the constraints are mediocre in evaluation. Therefore, we focus on optimizing
these mediocre designs in the offline dataset while maintaining the given
constraints rather than further boosting the best observed design in the
traditional MBO setting. We propose retrieval-enhanced offline model-based
optimization (ROMO), a new derivable forward approach that retrieves the
offline dataset and aggregates relevant samples to provide a trusted
prediction, and uses it for gradient-based optimization. ROMO is simple to
implement and outperforms state-of-the-art approaches in the CoMBO setting.
Empirically, we conduct experiments on a synthetic Hartmann (3D) function
dataset, an industrial CIO dataset, and a suite of modified tasks in the
Design-Bench benchmark. Results show that ROMO performs well in a wide range of
constrained optimization tasks.

Comment: 15 pages, 9 figures
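The mechanics of the abstract (retrieve nearby offline samples, aggregate their scores into a trusted prediction, then run gradient ascent on only the unconstrained dimensions) can be sketched as follows. This is a toy illustration, not the paper's architecture: the true objective is used in place of a learned surrogate, and the retrieval rule, weights, and blend factor are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

def target(x):
    """Hidden black-box objective (used only to score designs)."""
    return -np.sum((x - 0.5) ** 2, axis=-1)

# Static offline dataset of designs and their scores.
X = rng.uniform(0, 1, size=(500, 4))
y = target(X)

def retrieval_enhanced_score(x, k=10, lam=0.5):
    """Blend a surrogate's score with retrieved neighbour scores.

    In ROMO proper the first term is a learned forward model; here the true
    objective stands in for it to keep the sketch self-contained.
    """
    d = np.linalg.norm(X - x, axis=1)
    idx = np.argsort(d)[:k]               # retrieve k nearest offline designs
    w = np.exp(-d[idx]); w /= w.sum()     # distance-based aggregation weights
    return (1 - lam) * target(x) + lam * float(w @ y[idx])

# CoMBO setting: dimensions 2 and 3 are fixed by the environment;
# only dimensions 0 and 1 may be optimized.
x = X[y.argmax()].copy()
free = [0, 1]
eps, lr = 1e-3, 0.1
for _ in range(200):
    g = np.zeros_like(x)
    for i in free:                        # finite-difference gradient, free dims only
        xp, xm = x.copy(), x.copy()
        xp[i] += eps; xm[i] -= eps
        g[i] = (retrieval_enhanced_score(xp)
                - retrieval_enhanced_score(xm)) / (2 * eps)
    x[free] = np.clip(x[free] + lr * g[free], 0, 1)

print(round(float(target(x)), 3))
```

The retrieval term anchors the prediction to observed data, which is what keeps gradient-based search from wandering into regions where a learned surrogate would be untrustworthy.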
Advances in unsupervised learning and application areas: subspace clustering with background knowledge, semantic password guessing, and learned index structures
Over the past few years, advances in data science, machine learning and, in particular, unsupervised learning have enabled significant progress in many scientific fields and even in everyday life. Unsupervised learning methods are usually successful whenever they can be tailored to specific applications using appropriate requirements based on domain expertise. This dissertation shows how purely theoretical research can lead to circumstances that favor overly optimistic results, and the advantages of application-oriented research based on specific background knowledge. These observations apply to traditional unsupervised learning problems such as clustering, anomaly detection and dimensionality reduction. Therefore, this thesis presents extensions of these classical problems, such as subspace clustering and principal component analysis, as well as several specific applications with relevant interfaces to machine learning. Examples include password guessing using semantic word embeddings and learning spatial index structures using statistical models. In essence, this thesis shows that application-oriented research has many advantages for current and future research.
An academic review: applications of data mining techniques in finance industry
With the development of Internet techniques, data volumes are doubling every two years, faster than predicted by Moore's Law. Big Data analytics has become particularly important for enterprise business. Modern computational technologies provide effective tools to help understand the hugely accumulated data and leverage this information to gain insights into the finance industry. Because the finance industry manufactures no physical products, data has become the most valuable asset of financial organisations seeking actionable business insights. This is where data mining techniques come to the rescue by allowing access to the right information at the right time. These techniques are used by the finance industry in areas such as fraud detection, intelligent forecasting, credit rating, loan management, customer profiling, money laundering detection, marketing and prediction of price movements, to name a few. This work surveys the research on data mining techniques applied to the finance industry from 2010 to 2015. The review finds that stock prediction and credit rating have received the most attention from researchers, compared to loan prediction, money laundering and time series prediction. Due to the dynamics, uncertainty and variety of the data, nonlinear mapping techniques have been more deeply studied than linear techniques. It has also been shown that hybrid methods are more accurate in prediction, closely followed by neural network techniques. This survey provides an overview of applications of data mining techniques in the finance industry and a summary of methodologies for researchers in this area. In particular, it offers a good introduction to data mining techniques in computational finance for beginners who want to work in the field.
Spectral Graph Embedding for Dimension Reduction in Financial Risk Assessment
The economic downturn of recent years has had a significant negative impact on corporate performance. In the last two years, as in the late 2010s, many companies have been affected by the economic conditions and some have gone bankrupt, leading to an increase in companies' financial risk. One significant branch of financial risk is the company's credit risk. Lenders and investors attach great importance to determining a company's credit risk when granting a credit facility. Credit risk is the possibility of default on repayment of the facilities a company has received. There are various models for assessing credit risk using statistical models or machine learning. In this paper, we investigate the machine learning task of binary classification of firms into bankrupt and healthy based on spectral graph theory. We first construct an adjacency graph from a list of firms with their corresponding feature vectors. Next, we embed this graph into a one-dimensional and then a two-dimensional Euclidean space to obtain two lower-dimensional representations of the original data points. Finally, we apply support vector machine and multi-layer perceptron neural network techniques to perform binary node classification. The results of the proposed method on the given dataset (selected firms of the Tehran stock exchange market) show a comparative advantage over the PCA method of dimension reduction. We conclude the paper with some discussions of further research directions.
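The pipeline described (firm feature vectors, an adjacency graph over firms, a low-dimensional spectral embedding, then node classification) can be sketched with Laplacian eigenmaps and an SVM. The data below are synthetic; the Tehran stock exchange dataset, the graph construction details, and the classifier settings of the paper are not reproduced.

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding
from sklearn.svm import SVC

rng = np.random.default_rng(3)

# Synthetic firms: 8 financial features each, with bankrupt firms shifted
# away from healthy ones (illustrative separation, not real data).
n = 300
healthy = rng.normal(loc=0.0, scale=1.0, size=(n // 2, 8))
bankrupt = rng.normal(loc=2.0, scale=1.0, size=(n // 2, 8))
X = np.vstack([healthy, bankrupt])
labels = np.array([0] * (n // 2) + [1] * (n // 2))

# Embed the k-nearest-neighbour adjacency graph of firms into a
# two-dimensional Euclidean space (Laplacian eigenmaps).
emb = SpectralEmbedding(n_components=2, affinity="nearest_neighbors",
                        n_neighbors=10, random_state=0)
Z = emb.fit_transform(X)

# Binary node classification with an SVM in the embedded space.
train = rng.permutation(n)[: n // 2]
test = np.setdiff1d(np.arange(n), train)
clf = SVC(kernel="rbf").fit(Z[train], labels[train])
acc = clf.score(Z[test], labels[test])
print(round(acc, 2))
```

A one-dimensional embedding is obtained the same way with `n_components=1`, and a multi-layer perceptron can replace the SVM without changing the rest of the pipeline.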
Natural image processing and synthesis using deep learning
In the present thesis, we study how deep neural networks can be applied to various tasks in computer vision. Computer vision is an interdisciplinary field that deals with the understanding of digital images and video. Traditionally, the problems arising in this domain were tackled using heavily hand-engineered ad-hoc methods. A typical computer vision system until recently consisted of a sequence of independent modules that barely talked to each other. Such an approach is quite reasonable in the case of limited data, as it takes major advantage of the researcher's domain expertise. This strength turns into a weakness if some of the input scenarios are overlooked in the algorithm design process.
With the rapidly increasing volumes and varieties of data and the advent of cheaper and faster computational resources, end-to-end deep neural networks have become an appealing alternative to traditional computer vision pipelines. We demonstrate this in a series of research articles, each of which considers a particular task of either image analysis or synthesis and presents a solution based on a "deep" backbone.
In the first article, we deal with a classic low-level vision problem of edge detection. Inspired by a top-performing non-neural approach, we take a step towards building an end-to-end system by combining feature extraction and description in a single convolutional network. The resulting fully data-driven method matches or surpasses the detection quality of existing conventional approaches in the settings for which they were designed, while being significantly more usable in out-of-domain situations.
In our second article, we introduce a custom architecture for image manipulation based on the idea that most of the pixels in the output image can be directly copied from the input. This technique bears several significant advantages over the naive black-box neural approach. It retains the level of detail of the original images, does not introduce artifacts due to insufficient capacity of the underlying neural network, and simplifies the training process, to name a few. We demonstrate the efficiency of the proposed architecture on the challenging gaze correction task, where our system achieves excellent results.
In the third article, we slightly diverge from pure computer vision and study the more general problem of domain adaptation. There, we introduce a novel training-time algorithm (i.e., adaptation is attained by using an auxiliary objective in addition to the main one). We seek to extract features that maximally confuse a dedicated network called the domain classifier while remaining useful for the task at hand. The domain classifier is learned simultaneously with the features and attempts to tell whether those features come from the source or the target domain. The proposed technique is easy to implement, yet results in superior performance on all the standard benchmarks.
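The adversarial feature-vs-domain-classifier training just described is commonly realized with a gradient reversal layer: identity on the forward pass, negated (and scaled) gradient on the backward pass, so the domain classifier descends its loss while the feature extractor ascends it. Below is a minimal hand-derived sketch with hypothetical scalar parameters, not the thesis code.

```python
import numpy as np

class GradientReversal:
    """Identity forward; multiplies the incoming gradient by -lam backward."""
    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        return x                          # features pass through unchanged

    def backward(self, grad_output):
        return -self.lam * grad_output    # flip (and scale) the gradient

# Tiny illustration: a scalar "feature extractor" f = w * x feeding a scalar
# domain classifier sigmoid(d * f).  The classifier's own gradient is left
# intact; the gradient reaching the feature weight is reversed.
w, d, lam = 0.5, 1.5, 1.0
x, domain_label = 2.0, 1.0
grl = GradientReversal(lam)

f = w * x                                  # feature
f_rev = grl.forward(f)                     # identity on the forward pass
p = 1.0 / (1.0 + np.exp(-d * f_rev))       # domain probability

dloss_dz = p - domain_label                # sigmoid cross-entropy gradient
grad_d = dloss_dz * f_rev                  # update direction for classifier
grad_f = grl.backward(dloss_dz * d)        # reversed gradient for features
grad_w = grad_f * x

print(round(float(grad_d), 4), round(float(grad_w), 4))
```

Because of the reversal, a single optimizer step improves the domain classifier and simultaneously pushes the features toward domain confusion, which is the mechanism behind the benchmark results mentioned above.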
Finally, the fourth article presents a new kind of generative model for image data. Unlike conventional neural-network-based approaches, our system, dubbed SPIRAL, describes images in terms of concise low-level programs executed by off-the-shelf rendering software used by humans to create visual content. Among other things, this allows SPIRAL not to waste its capacity on the minutiae of datasets and to focus more on the global structure. The latent space of our model is easily interpretable by design and provides means for predictable image manipulation. We test our approach on several popular datasets and demonstrate its power and flexibility.
- …