
    Survey of data mining approaches to user modeling for adaptive hypermedia

    The ability of an adaptive hypermedia system to create tailored environments depends mainly on the amount and accuracy of information stored in each user model. Some of the difficulties that user modeling faces are the amount of data available to create user models, the adequacy of the data, the noise within that data, and the necessity of capturing the imprecise nature of human behavior. Data mining and machine learning techniques have the ability to handle large amounts of data and to process uncertainty. These characteristics make these techniques suitable for automatic generation of user models that simulate human decision making. This paper surveys different data mining techniques that can be used to efficiently and accurately capture user behavior. The paper also presents guidelines that show which techniques may be used more efficiently according to the task implemented by the application.

    Bikesharing and Bicycle Safety

    The growth of bikesharing in the United States has had a transformative impact on urban transportation. Major cities have established large bikesharing systems, including Boston, Chicago, Denver, Minneapolis-Saint Paul, New York City, Salt Lake City, the San Francisco Bay Area, Seattle, Washington DC, and others. These systems began operating as early as 2010, and no fatalities have occurred within the US as of this writing. However, three have occurred elsewhere in North America: two in Canada and one in Mexico. Bikesharing has some qualities that appear inherently unsafe for bicyclists. Most prominently, helmet usage is documented to be quite low in most regions. Bikesharing is also used by irregular bicyclists who are less familiar with the local terrain. In this study, researchers take a closer look at bikesharing safety from qualitative and quantitative perspectives. Through a series of four focus groups, they discussed bikesharing usage and safety with bikesharing members and nonmembers in the Bay Area. They further engaged experts nationwide from a variety of fields to evaluate their opinions and perspectives on bikesharing and safety. Finally, researchers conducted an analysis of bicycle and bikesharing activity data, as well as bicycle and bikesharing collisions, to evaluate injury rates associated with bikesharing when compared with benchmarks of personal bicycling. The data analysis found that collision and injury rates for bikesharing are lower than previously computed rates for personal bicycling. Experts and focus group participants independently pointed to bikesharing rider behavior and bikesharing bicycle design as possible factors. In particular, bikesharing bicycles are generally designed in ways that promote stability and limited speeds, which mitigate the conditions that contribute to collisions. Data analysis also explored whether there was evidence of a “safety in numbers” benefit that resulted from bikesharing activity. However, no significant impact from bikesharing activity on broader bicycle collisions could be found within the regions in which the systems operate. Discussion and recommendations are presented in the conclusion.

    The Political Economy of Cable - "Open Access."

    Advocates of "open access" claim that Internet Service Providers (ISPs) should be able to use a cable TV system's bandwidth on the same terms offered to ISPs owned by the cable system. On that view, "open access" mitigates a monopoly bottleneck and encourages the growth of broadband. This paper shows that cable operators do enjoy market power, and do seek to leverage a dominant position in video into the broadband access market by allocating too little bandwidth for Internet access. Yet, rather than protect cable operators from cannibalizing their cable TV revenue, this strategy defends against imposition of common carrier regulation, which would allow system capacity to be appropriated by regulators and rival broadband networks. Ironically, the push for "open access" limits Internet access by encouraging this under-allocation of broadband spectrum, and by introducing coordination problems slowing technology deployment. These effects are empirically evident in the competitive superiority of cable's "closed" platform vis-a-vis "open" DSL networks, and in financial market reactions to key regulatory events and mergers in broadband.

    Mining High Utility Sequential Patterns from Uncertain Web Access Sequences using the PL-WAP

    In general, web access patterns are retrieved from web access sequence databases using various sequential pattern algorithms such as GSP, WAP, and PLWAP tree. However, these algorithms do not consider sequential data with quantity (internal utility, e.g., the amount of time spent by the user on a web page) and quality (external utility, e.g., the rating of a web page in a website) information. These algorithms also do not work on uncertain sequential items (e.g., purchased products) having a probability in (0, 1). Factoring in the utility and uncertainty of each sequence item provides more product information that can be beneficial in mining profitable patterns from a company's websites. For example, a customer can purchase a bottle of ink more frequently than a printer, but the purchase of a single printer can yield more profit to the business owner than the purchase of multiple bottles of ink. Most existing traditional uncertain sequential pattern algorithms, such as U-Apriori, UF-Growth, and U-PLWAP, do not include utility measures. In U-PLWAP, the web sequences are derived from web log data without including the time spent by the user, and the web pages are not associated with any rating. When these two utilities are considered, items with lower existential probability can sometimes be more profitable to the website owner. Among traditional utility-based algorithms, the only one that addresses both uncertainty and high utility is PHUI-UP, which treats probability and utility as separate entities, so the retrieved patterns do not depend on both jointly because two different thresholds are used; it also does not mine uncertain web access sequence databases. This thesis proposes the HUU-PLWAP miner algorithm for mining uncertain sequential patterns with internal and external utility information using the PLWAP tree approach, which cuts down on the several database scans of level-wise approaches. HUU-PLWAP uses uncertain internal utility values (derived from the sequence uncertainty model) and constant external utility values (predefined) to retrieve high utility sequential patterns from uncertain web access sequence databases with the help of the U-PLWAP methodology. Experiments show that HUU-PLWAP is at least 95% faster than U-PLWAP, and 75% faster than the PHUI-UP algorithm.
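
    The ink and printer example in this abstract can be sketched in a few lines. This is only an illustration, not the HUU-PLWAP method: the abstract does not say how utility and probability are combined, so the product used here is an assumption, and the function name and all quantities, profits, and probabilities are hypothetical.

```python
# Hypothetical sketch: the abstract does not define this formula.
# Assumption: expected utility = quantity (internal utility)
#             * unit profit (external utility) * existential probability.
def expected_utility(quantity, unit_profit, probability):
    """Return a probability-weighted utility for one item."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return quantity * unit_profit * probability

# Invented numbers: ink is bought often and is very likely present,
# a printer is bought once and is less likely present.
ink = expected_utility(quantity=5, unit_profit=2.0, probability=0.9)
printer = expected_utility(quantity=1, unit_profit=40.0, probability=0.3)

# Under these made-up values the less probable item still ranks higher.
print("printer ranks higher:", printer > ink)
```

    With these invented values the printer outranks the ink despite its lower probability, which is the situation the abstract describes for items with lower existential probability.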

    Datacenter Traffic Control: Understanding Techniques and Trade-offs

    Datacenters provide cost-effective and flexible access to scalable compute and storage resources necessary for today's cloud computing needs. A typical datacenter is made up of thousands of servers connected with a large network and usually managed by one operator. To provide quality access to the variety of applications and services hosted on datacenters and maximize performance, it is necessary to use datacenter networks effectively and efficiently. Datacenter traffic is often a mix of several classes with different priorities and requirements. This includes user-generated interactive traffic, traffic with deadlines, and long-running traffic. To this end, custom transport protocols and traffic management techniques have been developed to improve datacenter network performance. In this tutorial paper, we review the general architecture of datacenter networks, various topologies proposed for them, their traffic properties, general traffic control challenges in datacenters and general traffic control objectives. The purpose of this paper is to bring out the important characteristics of traffic control in datacenters and not to survey all existing solutions (as it is virtually impossible due to the massive body of existing research). We hope to provide readers with a wide range of options and factors while considering a variety of traffic control mechanisms. We discuss various characteristics of datacenter traffic control including management schemes, transmission control, traffic shaping, prioritization, load balancing, multipathing, and traffic scheduling. Next, we point to several open challenges as well as new and interesting networking paradigms. At the end of this paper, we briefly review inter-datacenter networks that connect geographically dispersed datacenters, which have been receiving increasing attention recently and pose interesting and novel research problems. Comment: Accepted for Publication in IEEE Communications Surveys and Tutorial