TEXT CATEGORIZATION USING ONLY FRAGMENTS OF DOCUMENTS
In this paper we present a series of experiments examining how particular parts of documents contribute to the performance of a classifier. We evaluated text classifiers on two very different text corpora. We conclude that some parts of the text matter more for text classification performance than others, and that giving higher weights to the more important parts can improve the classifier. Which parts are more or less important depends on the nature of the documents in the corpus. Some tasks remain to be done:
− More text corpora should be investigated.
− In Section 6.4 we optimized the number of features to keep independently of the section; it could instead be optimized for each section.
− Splitting the documents into parts of 50 words, to examine what happens if the parts are of equal size not only within a document but across documents as well.
− When splitting documents into k equal parts, the classifiers resulting from different k values could be combined.
Keywords: machine learning, text categorization, classifier ensembles, Research and Development/Tech Change/Emerging Technologies
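The section-weighting idea above can be sketched as follows. The section names and weight values are illustrative assumptions, not the paper's tuned settings:

```python
from collections import Counter

# Hypothetical section weights (the paper's tuned values are not given here).
SECTION_WEIGHTS = {"title": 3.0, "abstract": 2.0, "body": 1.0}

def weighted_features(doc_sections):
    """Build a weighted term-frequency vector from a dict mapping
    section name -> text, scaling each section's token counts by
    that section's importance weight."""
    features = Counter()
    for section, text in doc_sections.items():
        weight = SECTION_WEIGHTS.get(section, 1.0)
        for token in text.lower().split():
            features[token] += weight
    return features

doc = {"title": "text categorization",
       "body": "experiments on two corpora show text parts differ"}
vec = weighted_features(doc)
# "text" occurs once in the title (weight 3) and once in the body (weight 1),
# so it contributes 4.0 to the feature vector.
```

Any term-based classifier can then be trained on these weighted vectors instead of raw counts.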
CONSTRAINTS: A PROGRAMMING PARADIGM AND A MODELLING METHODOLOGY
Constraints are often used as a formal approach to problems because they capture the very essence of a problem. Many problems can be viewed as a set of variables and a set of relations on them. From this point of view, the problem maps naturally to a constraint network (the nodes of the network represent the variables, and the constraints in the network represent the relations between the variables of the problem); this gives great significance to research on constraints. An additional advantage is that global consistency can be achieved through local computations.
Constraints and the Constraint Satisfaction Problem (CSP) can be classified by various criteria. The most significant classification is based on the type of values assigned to the nodes.
Another possible classification of CSPs is based on the kind of solution required.
Significant effort has been invested in developing general constraint programming languages (CPLs) to provide an environment where the only thing a user has to do is declare what she/he wants, without bothering with how it is done. Though these languages aimed at generality, they could not fully achieve this goal due to their limited support for data abstraction and higher-order constraints. Where the main stress is on efficiency, dedicated solutions claim their place with their unique data structures and specialised constraint satisfaction algorithms.
The main goal of this paper is to give an overview of constraints as a flexible knowledge representation tool, to draw attention to the problems of representation, and to survey methods of finding solutions of the different types of constraint networks.
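A minimal sketch of how such a constraint network can be searched, assuming plain chronological backtracking (the paper surveys more sophisticated satisfaction algorithms); the toy instance below is a tiny graph-colouring network:

```python
def solve_csp(variables, domains, constraints, assignment=None):
    """Minimal backtracking solver. `constraints` is a list of
    (scope, predicate) pairs; a predicate is checked as soon as
    all variables in its scope have been assigned."""
    if assignment is None:
        assignment = {}
    if len(assignment) == len(variables):
        return dict(assignment)
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        if all(pred(*(assignment[v] for v in scope))
               for scope, pred in constraints
               if all(v in assignment for v in scope)):
            result = solve_csp(variables, domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]
    return None  # no consistent assignment exists

# Toy network: three variables with two values each and
# pairwise "not equal" constraints along the edges x-y and y-z.
vars_ = ["x", "y", "z"]
doms = {v: [1, 2] for v in vars_}
cons = [(("x", "y"), lambda a, b: a != b),
        (("y", "z"), lambda a, b: a != b)]
solution = solve_csp(vars_, doms, cons)
```

Local propagation techniques (arc consistency, etc.) would prune the domains before this search; here only the raw backtracking step is shown.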
DEDICATED MULTIPROCESSOR SYSTEM FOR AUTOMATIC RAIL FLAW DETECTION
An automated ultrasonic rail flaw detection system has been developed for real-time rail flaw detection and evaluation. The whole system, installed on a testing vehicle and working under rough environmental conditions, must detect the internal irregularities of the rail, document them on the basis of a table containing the danger information, and immediately mark the rail so that maintenance staff can easily identify the faulty segments.
During the measurement, three pairs of ultrasonic transmitters and receivers with different orientations scan the rail, providing indirect information about its vertical section. The rail flaw detection procedure itself is a two-dimensional pattern recognition problem consisting of image reconstruction, spatial filtering with thresholding, and classification phases.
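The spatial-filtering-with-thresholding phase can be illustrated on a toy echo-amplitude grid; the filter size, threshold level, and data below are hypothetical, not the system's actual parameters:

```python
def mean_filter(image):
    """3x3 mean (spatial) filter over a 2D grid of echo amplitudes;
    border cells are left unfiltered for simplicity."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            out[i][j] = sum(image[i + di][j + dj]
                            for di in (-1, 0, 1)
                            for dj in (-1, 0, 1)) / 9.0
    return out

def threshold(image, level):
    """Binary flaw map: 1 where the filtered echo exceeds `level`."""
    return [[1 if v > level else 0 for v in row] for row in image]

# Synthetic 5x5 reconstructed image with one strong reflector
# in the middle, standing in for an internal rail irregularity.
scan = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 9, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0]]
flaws = threshold(mean_filter(scan), 1.5)
```

The classification phase would then decide, from the shape and position of the marked region, which danger category the flaw belongs to.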
Adapting IT Algorithms and Protocols to an Intelligent Urban Traffic Control
Autonomous vehicles, communicating with each other and with the urban infrastructure as well, open the opportunity to introduce new, complex and effective behaviours into intelligent traffic systems. Such systems can be perceived quite naturally as hierarchically built intelligent multi-agent systems, with decision making based upon well-defined and thoroughly tested mathematical algorithms, borrowed e.g. from the field of information technology. In this article, two examples of how to adapt such algorithms to intelligent urban traffic are presented. Since the optimal and fair timing of the traffic lights is crucial in traffic control, we show how a simple Round-Robin scheduler and Minimal Destination Distance First scheduling (an adaptation of the theoretically optimal Shortest Job First scheduler) were implemented and tested for traffic light control. Another example is the mitigation of congested traffic using the analogy of the Explicit Congestion Notification (ECN) protocol of computer networks. We show that the optimal-scheduling-based traffic light control can handle roughly the same traffic complexity as traditional light programs in the nominal case. However, in extraordinary and especially rapidly evolving situations, the intelligent solutions can clearly outperform the traditional ones. The ECN-based method can successfully limit the traffic flowing through bounded areas. That way the number of passing-through vehicles in e.g. residential areas may be reduced, making them more comfortable, congestion-free zones in a city.
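The Round-Robin timing mentioned above can be sketched as follows; the approach names and the fixed-slot model are illustrative assumptions, not the article's implementation:

```python
from collections import deque

def round_robin_green(approaches, slots):
    """Cycle the green light over the intersection's approaches,
    one fixed time slot each, regardless of queue length -- the
    simplest fair schedule."""
    order = deque(approaches)
    schedule = []
    for _ in range(slots):
        schedule.append(order[0])  # current approach gets the green slot
        order.rotate(-1)           # move it to the back of the queue
    return schedule

plan = round_robin_green(["north", "east", "south", "west"], 6)
# -> ["north", "east", "south", "west", "north", "east"]
```

A Minimal Destination Distance First variant would instead pick, at each slot, the approach whose lead vehicle is closest to its destination, analogously to Shortest Job First.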
Rendszermodellezés mérési adatokból, hibrid-neurális megközelítés = System modelling from measurement data: hybrid-neural approach
The goal of the research was to develop and analyse system modelling procedures based on measurement data, especially for modelling non-linear systems. Two different approaches were applied. One is to start from procedures developed for linear system modelling while taking nonlinear effects into consideration; the other is black-box modelling, where model construction is based primarily on input-output data.
The first approach proved successful especially for the modelling of weakly non-linear systems, where these systems are treated as linear ones in the presence of nonlinear distortion. To understand nonlinear distortions, a complete theory has been developed. For black-box modelling the starting point was a set of general model structures, where the parameters of these structures are determined by training on the available measurement data. The most relevant questions in this case concern the construction of the database and the quality of the available data (noisy data, missing data, outliers, inconsistent data, redundant data, etc.), as well as how to keep the complexity of the black-box model structure under control and to efficiently exploit knowledge available beyond the data. For black-box modelling, special neural network architectures and support vector machines were considered, aiming at the smallest possible model complexity.
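The "linear system with nonlinear distortion" view can be illustrated with a simple least-squares fit of a cubic distortion term; the model form, parameter values, and data below are hypothetical, chosen only to show the idea:

```python
def fit_cubic_distortion(u, y):
    """Least-squares fit of y = a*u + b*u^3, i.e. a linear gain
    plus an odd cubic distortion term, by solving the 2x2 normal
    equations directly."""
    s11 = sum(x * x for x in u)       # sum of u^2
    s13 = sum(x ** 4 for x in u)      # sum of u^4
    s33 = sum(x ** 6 for x in u)      # sum of u^6
    r1 = sum(x * yy for x, yy in zip(u, y))
    r3 = sum(x ** 3 * yy for x, yy in zip(u, y))
    det = s11 * s33 - s13 * s13
    a = (r1 * s33 - r3 * s13) / det
    b = (s11 * r3 - s13 * r1) / det
    return a, b

# Synthetic input-output measurements from y = 2u + 0.5u^3 (noise-free),
# standing in for data recorded from a weakly non-linear system.
u = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
y = [2.0 * x + 0.5 * x ** 3 for x in u]
a, b = fit_cubic_distortion(u, y)
```

A black-box model (neural network or support vector machine) would replace the fixed cubic structure with a trainable one, at the cost of higher model complexity.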