7 research outputs found
Local search engine with global content based on domain specific knowledge
As the need for information grows, we have come to rely on search engines. The use of large-scale search engines, such as Google, is as common as surfing the World Wide Web. We are impressed with the capabilities of these search engines, but there is still a need for improvement. A common problem with searching is the ambiguity of words. Their meaning often depends on the context in which they are used or varies across specific domains. To resolve this, we propose a domain-specific search engine that is globally oriented. We intend to provide content classification according to the target domain concepts, access to privileged information, personalization, and custom ranking functions. Domain-specific concepts have been formalized in the form of an ontology. The paper describes our approach to a centralized search service for domain-specific content. The approach uses automated indexing of various content sources, which can take the form of a relational database, web service, web portal or page, various document formats, and other structured or unstructured data. The gathered data is tagged using various approaches and classified against the domain classification. The indexed data is accessible through a highly optimized and personalized search service.
Evaluation of machine learning methods with natural language processing
Validation of machine learning methods has traditionally been performed by evaluation on hand-annotated test sets. This procedure is an approximation and is used for lack of a better approach. We should bear in mind that the machine learning method has learned from very similar data; consequently, all predictions of performance on real data that are based on such a test set are optimistic. The real value of a machine learning method lies in its ability to form good hypotheses.
Evaluating machine learning methods with natural language processing introduces a new source of validation: research results from scientific studies and papers published in respected conferences and journals. This new source offers a method of objective evaluation of machine learning results. It can greatly diminish the manual effort of domain experts who perform machine learning evaluation. At the same time, this approach is capable of forming an encyclopedic database of formalized knowledge that is generally applicable.
Searching for information on horizontally distributed sources
In modern society, the use of online search engines is a daily routine. More often than not, it is the only way of getting the desired information instantly. For a large web portal that aims to provide its users with advanced search services, the required features far exceed the capabilities provided by the large search engines (Google, Yahoo, Live Search). The main motivations for a custom search service are the need for security and authorized access to certain content, the desired level of personalization, and a customized ranking of search results. The paper reviews the architecture of modern search engines and presents in greater detail the search service that was developed from the perspective of a single web site.
The service automatically uses the user's credentials to limit the search results to the content that the user has been granted access to. It provides centralized search across multiple web sites with similar content, as well as across structured and unstructured data sources. The search service also personalizes search results according to the user's field of interest.