    New IR & Ranking Algorithm for Top-K Keyword Search on Relational Databases ‘Smart Search’

    Database management systems are as old as computers, and database research and development remains highly active among vendors and researchers. Much of this work targets new modules and frameworks for more efficient and effective information retrieval based on free-form search by users who have no knowledge of the database structure. Our work extends previous work by introducing new algorithms and components to existing databases that let the user search by keywords with high performance and effective top-k results. It introduces a new table structure for indexing keywords, which helps the algorithms understand the semantics of the keywords and generate only the correct Candidate Networks (CNs) for fast retrieval, with results ranked according to the user's history, the semantics of the keywords, the distance between keywords, and the keyword match. Three modules were developed for this purpose. We implemented the three proposed modules, created the necessary tables, and developed a web search interface called 'Smart Search' to test our work with different users. The interface records all user interaction with 'Smart Search' for analysis, and the analysis of the results shows improvements in performance and in the effectiveness of the results returned to the user. We ran hundreds of randomly generated search terms of different sizes with multiple users; all results were recorded and analyzed by the system according to different factors and parameters. We also compared our results with previous work by other researchers on the DBLP database, which we used in our research.
Our final analysis shows the importance of adding a new keyword-indexing structure to relational databases, and of building algorithms on that index, for top-k keyword search: the proposed system returns more accurate results and improves search time.
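
As a rough illustration of the keyword-index idea described in this abstract, the following Python sketch builds a small keyword index table in SQLite and ranks rows by how many query keywords they match. The table and column names (`paper`, `keyword_index`, `source_table`, `row_id`), the sample titles, and the match-count score are all assumptions made for illustration. The thesis's actual table structure, Candidate Network generation, and ranking weights (user history, keyword distance, semantics) are not reproduced here.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical keyword index table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE paper (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE keyword_index (
            keyword TEXT,        -- lower-cased token
            source_table TEXT,   -- table the token came from
            source_column TEXT,  -- column the token came from
            row_id INTEGER       -- primary key of the source row
        );
        CREATE INDEX idx_keyword ON keyword_index(keyword);
    """)
    titles = [
        (1, "Keyword search on relational databases"),
        (2, "Ranking top-k query results"),
        (3, "Keyword ranking for relational search"),
    ]
    for pid, title in titles:
        conn.execute("INSERT INTO paper VALUES (?, ?)", (pid, title))
        # Index each distinct token once per row.
        for word in set(title.lower().split()):
            conn.execute(
                "INSERT INTO keyword_index VALUES (?, 'paper', 'title', ?)",
                (word, pid),
            )
    return conn


def top_k(conn, query, k=2):
    """Return up to k (id, title, matched) rows, best match count first."""
    words = sorted(set(query.lower().split()))
    if not words:
        return []
    placeholders = ",".join("?" * len(words))
    sql = f"""
        SELECT p.id, p.title, COUNT(DISTINCT ki.keyword) AS matched
        FROM keyword_index AS ki
        JOIN paper AS p ON p.id = ki.row_id
        WHERE ki.source_table = 'paper' AND ki.keyword IN ({placeholders})
        GROUP BY p.id, p.title
        ORDER BY matched DESC, p.id ASC
        LIMIT ?
    """
    return conn.execute(sql, (*words, k)).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    for row in top_k(db, "keyword relational ranking"):
        print(row)
```

The index lookup touches only rows that contain at least one query keyword, which is the general reason such an index can speed up free-form search. A fuller ranking function would combine the match count with the other factors the abstract lists.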

    Big Data data model with SQL support for performance management in telecommunications networks

    Master's in Information Systems. Over time, the information kept by applications has been growing, and it is expected to grow exponentially in the mobile broadband area with the emergence of LTE. This growth in generated data, together with the need to keep the data for longer periods, has revealed that RDBMS no longer respond fast enough. This has moved companies away from RDBMS and toward other alternatives. The new approaches to large-scale data processing are built around performance, scalability, and robustness. The focus is always on processing very large data sets, keeping in mind that these data sets will grow and that a fast response from the system will always be needed.
Since an RDBMS is usually already part of a system implemented before this trend of growing data, the new approaches have to offer solutions that ease the conversion of the system. One such consideration is how the new system can understand SQL semantics.

    The 7th Conference of PhD Students in Computer Science

    Investigation of charge migration/transfer in radical cations using Ehrenfest method with fully quantum nuclear motion

    The main focus of this thesis is to investigate the effect of charge migration on molecular dynamics. Upon the creation of a superposition of cationic states by a short ionizing pulse in an attosecond pump-probe experiment, the electronic wavefunction is in a non-stationary state and the initial dynamics are purely electronic, driven by Charge Migration (CM) before the onset of any nuclear motion. CM can be simulated using a frozen nuclear framework, but its importance for long-term dynamics and its competition with vibrationally mediated charge motion (i.e. Charge Transfer (CT)) remain unknown. Unravelling the mechanism behind CM and its importance for electron and nuclear coherence can help in designing an initial superposition of electronic states to steer nuclear motion toward a specific product. Further control of the photo-reactivity could be achieved with the use of probe/control laser pulses, opening the door to more direct comparison with experimental results. In order to investigate the dynamics upon photoionization with an attosecond pump pulse, the coupled electron-nuclear dynamics of the system is simulated using nonadiabatic quantum dynamics techniques within the sudden approximation. A single-set approach is adopted for the expansion of the nuclear wavefunction as a linear combination of Gaussian Wavepackets (GWP). The calculation is done using the Quantum-Ehrenfest method (QuEh), and the time-dependent Potential Energy Surfaces (PES) are evaluated with the Complete Active Space Configuration Interaction (CAS-CI) method. The resulting dynamics are analyzed with adiabatic/diabatic state populations, Normal Mode (NM) displacements, and bond lengths averaged over the nuclear wavepacket using Gross Gaussian populations (GGP). To reduce the cost of computation, the algorithm implemented in QUANTICS is parallelized with a Message Passing Interface (MPI).
Further, the section of code which interacts with the database that contains previously calculated points on the PES is rewritten using the Structured Query Language (SQL) and the SQLite engine. For the purpose of unravelling the mechanism behind CM, the nonadiabatic dynamics of a model retinal Protonated Schiff Base (rPSB) and of benzene are investigated by defining the initial electronic wavefunction in a systematic way. As demonstrated by the results on rPSB, relaxation mechanisms such as single and double bond length alternation and isomerization can be controlled by varying the initial composition of electronic states. With the rich symmetry of benzene, the initial nuclear dynamics, which are controlled by an initial gradient and by the electron dynamics, can be analyzed using symmetry rules. The initial gradient is a combination of totally symmetric motion and non-symmetric components, which correspond to the intra-state (eigenstate) and inter-state (coupling) gradients, respectively. The electron dynamics and its associated nuclear motions can be examined by grouping together the localized holes where the CM occurs. With the initial gradient and CM, one can predict the initial nuclear relaxation and possibly control the photo-products formed by designing a specific superposition of electronic eigenstates. To explore the effect of laser pulses on the dynamics, an implementation within the dipole approximation, using the dot product of the dipole and the electric field, is done in the GAUSSIAN program. The dynamics in the presence of an infrared probe pulse is simulated on model systems such as allene and the ethylene cation. The pulse is able to induce change in the electron and nuclear dynamics of the system, and some of its effects can be explained using irreducible representations and the alignment of the electric field. The work presented in this thesis offers an insight into the photocontrol of molecules and opens the door to further investigation of charge-directed dynamics.
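
The abstract mentions that the code handling previously calculated PES points was rewritten on top of SQL and SQLite. A minimal sketch of what such a store could look like is given below in Python. The schema (`pes_point`, `geom_key`, `energy`, `gradient`), the rounding-based key, and the JSON encoding are assumptions made for illustration, not the layout used in QUANTICS.

```python
import json
import sqlite3


def open_pes_db(path=":memory:"):
    """Open (or create) a hypothetical store of computed PES points."""
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS pes_point (
            geom_key TEXT PRIMARY KEY,  -- rounded coordinates, JSON-encoded
            energy   REAL NOT NULL,
            gradient TEXT NOT NULL      -- JSON-encoded list of floats
        )
    """)
    return conn


def geom_key(coords, ndigits=6):
    """Build a lookup key by rounding coordinates (assumed tolerance)."""
    # Adding 0.0 turns -0.0 into 0.0 so both produce the same key.
    return json.dumps([round(float(x), ndigits) + 0.0 for x in coords])


def store_point(conn, coords, energy, gradient):
    """Insert or overwrite the point computed at the given geometry."""
    conn.execute(
        "INSERT OR REPLACE INTO pes_point VALUES (?, ?, ?)",
        (geom_key(coords), float(energy),
         json.dumps([float(g) for g in gradient])),
    )
    conn.commit()


def lookup_point(conn, coords):
    """Return (energy, gradient) for a stored geometry, or None if absent."""
    row = conn.execute(
        "SELECT energy, gradient FROM pes_point WHERE geom_key = ?",
        (geom_key(coords),),
    ).fetchone()
    if row is None:
        return None
    return row[0], json.loads(row[1])
```

A caller would check `lookup_point` before launching an electronic structure calculation and call `store_point` afterwards. This sketch only reuses points whose rounded coordinates match exactly; a production code would probably need a search over nearby geometries instead.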

    Domain-Extendable 3D City Models: Management, Visualization, and Interaction

    Domain-extendable semantic 3D city models are complex mappings and inventories of the urban environment which can be utilized as an integrative information backbone to facilitate a range of application fields like urban planning, environmental simulations, disaster management, and energy assessment. Today, more and more countries and cities worldwide are creating their own 3D city models based on the CityGML specification, an international standard issued by the Open Geospatial Consortium (OGC) that provides an open data model and XML-based format for describing the relevant urban objects with regard to their 3D geometry, topology, semantics, and appearance. In particular, it provides a flexible and systematic extension mechanism called "Application Domain Extension (ADE)" which allows third parties to dynamically extend the existing CityGML definitions with additional information models from different application domains, so that extended or newly introduced geographic object types can be represented within a common framework. However, due to the resulting large size and high model complexity, the practical utilization of country-wide CityGML datasets poses a tremendous challenge for setting up an extensive application system that supports efficient data storage, analysis, management, interaction, and visualization. These requirements have been partly met by the existing free 3D geo-database solution called '3D City Database (3DCityDB)', which offers a rich set of functionalities for dealing with standard CityGML data models but lacks support for CityGML ADEs. The key motivation of this thesis is to develop a reliable approach for extending the existing database solution to support the efficient management, visualization, and interaction of large geospatial data elements of arbitrary CityGML ADEs.
Emphasis is first placed on answering the question of how to dynamically extend the relational database schema by parsing and interpreting the XML schema files of the ADE and dynamically creating new database tables accordingly. Based on a comprehensive survey of the related work, a new graph-based framework is proposed which uses typed and attributed graphs to semantically represent the object-oriented data models of CityGML ADEs and utilizes graph transformation systems to automatically generate compact table structures extending the 3DCityDB. The transformation process is performed by applying a series of fine-grained graph transformation rules which allow users to declaratively describe the complex mapping rules, including the optimization concepts employed in the development of the 3DCityDB database schema. The second major contribution of this thesis is the development of a new multi-level system which can serve as a complete and integrative platform for facilitating the various analysis, simulation, and modification operations on complex-structured 3D city models based on CityGML and 3DCityDB. It introduces an additional application level based on a so-called 'app-concept' that allows for constructing a light-weight web application which balances the high data model complexity against the specific application requirements of the end users. Each application can be easily built on top of a developed 3D web client whose functionalities go beyond efficient 3D geo-visualization and interactive exploration, and which also allows for performing collaborative modifications and analysis of 3D city models by taking advantage of Cloud Computing technology.
This multi-level system, along with the extended 3DCityDB, has been successfully utilized and evaluated in many practical projects.
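
To make the schema-extension question concrete, the Python sketch below parses a tiny XML schema and emits one `CREATE TABLE` statement per complex type. It is a deliberately simplified direct mapping and not the graph-transformation framework the thesis proposes. The type mapping, the `ade_` table prefix, and the `EnergyBuilding` example type are hypothetical and are not taken from CityGML, any real ADE, or the 3DCityDB schema.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Assumed mapping from a few XSD simple types to SQL column types.
TYPE_MAP = {
    "xs:string": "TEXT",
    "xs:integer": "INTEGER",
    "xs:double": "REAL",
    "xs:boolean": "INTEGER",
}

# Hypothetical ADE-like schema fragment used only for this demonstration.
DEMO_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="EnergyBuilding">
    <xs:sequence>
      <xs:element name="heatDemand" type="xs:double"/>
      <xs:element name="constructionYear" type="xs:integer"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""


def xsd_to_ddl(xsd_text, prefix="ade_"):
    """Map each top-level complexType to one table, each element to a column."""
    root = ET.fromstring(xsd_text)
    statements = []
    for ctype in root.findall(f"{XS}complexType"):
        table = prefix + ctype.get("name").lower()
        columns = ["id INTEGER PRIMARY KEY"]
        for elem in ctype.iter(f"{XS}element"):
            # Unknown or complex element types fall back to TEXT in this sketch.
            sql_type = TYPE_MAP.get(elem.get("type"), "TEXT")
            columns.append(f"{elem.get('name').lower()} {sql_type}")
        statements.append(f"CREATE TABLE {table} ({', '.join(columns)});")
    return statements


if __name__ == "__main__":
    for stmt in xsd_to_ddl(DEMO_XSD):
        print(stmt)
```

A one-table-per-type mapping like this ignores inheritance, associations between types, and multiplicities. Those are the kinds of cases for which the thesis describes declarative mapping rules that produce compact table structures.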

    Preliminary Specification of Services and Protocols

    This document describes the preliminary specification of services and protocols for the Crutial Architecture. The Crutial Architecture definition, first addressed in Crutial Project Technical Report D4 (January 2007), aims to address a grand challenge of computer science and control engineering: how to achieve resilience of critical information infrastructures, in particular in the electrical sector. The definitions herein elaborate on the major architectural options and components established in the Preliminary Architecture Specification (D4), with special relevance to the Crutial middleware building blocks, and are based on the fault, synchrony and topological models defined in the same document. In broad terms, the document describes the Runtime Support Services and APIs, and the Middleware Services and APIs. It then delves into the protocols, describing the Runtime Support Protocols and the Middleware Services Protocols. The Runtime Support Services and APIs chapter features, as its main component, the Proactive-Reactive Recovery Service, whose aim is to guarantee perpetual execution of any components it protects. The Middleware Services and APIs chapter describes our approach to intrusion-tolerant middleware. The middleware comprises several layers. The Multipoint Network layer is the lowest layer of CRUTIAL's middleware, and features an abstraction of basic communication services, such as those provided by standard protocols like IP, IPsec, UDP, TCP and SSL/TLS. The Communication Support Services feature two important building blocks: the Randomized Intrusion-Tolerant Services (RITAS), and the Overlay Protection Layer (OPL) against DoS attacks. The Activity Support Services currently defined comprise the CIS Protection service, and the Access Control and Authorization service. Protection as described in this report is implemented by mechanisms and protocols residing on a device called the Crutial Information Switch (CIS).
The Access Control and Authorization service is implemented through PolyOrBAC, which defines the rules for information exchange and collaboration between sub-modules of the architecture, corresponding in fact to different facilities of the CII's organizations. The Monitoring and Failure Detection layer contains a preliminary definition of the middleware services devoted to monitoring and failure detection activities. The remaining chapters describe the protocols implementing the above-mentioned services: Runtime Support Protocols, and Middleware Services Protocols.

    Autumn 2021 Full Issue

    Technologies for a FAIRer use of Ocean Best Practices

    The publication and dissemination of best practices in ocean observing is pivotal for multiple aspects of modern marine science, including cross-disciplinary interoperability, improved reproducibility of observations and analyses, and training of new practitioners. Often, best practices are not published in a scientific journal and may not even be formally documented, residing solely within the minds of individuals who pass the information along through direct instruction. Naturally, documenting best practices is essential to accelerate high-quality marine science; however, documentation in a drawer has little impact. To enhance the application and development of best practices, we must leverage contemporary document handling technologies to make best practices discoverable, accessible, and interlinked, echoing the logic of the FAIR data principles [1].