153,518 research outputs found

    ΠšΠ»Π°ΡΡΠΈΡ„ΠΈΠΊΠ°Ρ†ΠΈΡ Π²Π΅Π±-страниц Π½Π° основС Π°Π»Π³ΠΎΡ€ΠΈΡ‚ΠΌΠΎΠ² машинного обучСния

    Novel web-page classification algorithms have long been overshadowed by the widely accepted keyword approach, which has proved sufficiently effective for English-language web pages. As a result, recently published classification algorithms have not been studied at an appropriate scale in web-page classification research [2,5,3]. For instance, the String Subsequence Kernel (SSK) has received far more attention in bioinformatics, for gene and protein classification, than in web programming for web-page categorization. Such methods have also been unpopular among Internet system providers because of their high computational requirements. With suitable optimization, however, such algorithms can open new possibilities for building easy-to-develop categorizers that remain effective even for languages with complex morphology and grammar. This work presents an example of such optimizations and proposes two classifiers that implement them. The results obtained in practical tests and the scaling properties of these algorithms encourage further research in this area.

    Data mining technology for the evaluation of web-based teaching and learning systems

    Instructional design for Web-based teaching and learning environments is problematic for two reasons. First, virtual forms of teaching and learning result in little or no direct contact between instructor and learner, which makes course effectiveness difficult to evaluate. Second, the Web is a relatively new teaching and learning medium, and learning processes with this technology still require more research. We propose data mining – techniques to discover and extract knowledge from a database – as a tool to support the analysis of student learning processes and the evaluation of the effectiveness and usability of Web-based courses. We present and illustrate different data mining techniques for the evaluation of Web-based teaching and learning systems.

    Browsing a digital library: A new approach for the New Zealand digital library

    Browsing is part of the information seeking process, used when information needs are ill-defined or unspecific. Browsing and searching are often interleaved during information seeking to accommodate changing awareness of information needs. Digital libraries often support full-text search, but are not so helpful in supporting browsing. Described here is a novel browsing system created for the Greenstone software used by the New Zealand Digital Library that supports users in a more natural approach to the information seeking process. © Springer-Verlag Berlin Heidelberg 2003

    I Know Why You Went to the Clinic: Risks and Realization of HTTPS Traffic Analysis

    Revelations of large-scale electronic surveillance and data mining by governments and corporations have fueled increased adoption of HTTPS. We present a traffic analysis attack against over 6000 webpages spanning the HTTPS deployments of 10 widely used, industry-leading websites in areas such as healthcare, finance, legal services and streaming video. Our attack identifies individual pages in the same website with 89% accuracy, exposing personal details including medical conditions, financial and legal affairs and sexual orientation. We examine evaluation methodology and reveal accuracy variations as large as 18% caused by assumptions affecting caching and cookies. We present a novel defense reducing attack accuracy to 27% with a 9% traffic increase, and demonstrate significantly increased effectiveness of prior defenses in our evaluation context, inclusive of enabled caching, user-specific cookies and pages within the same website.