1,802 research outputs found
NEXT LEVEL: A COURSE RECOMMENDER SYSTEM BASED ON CAREER INTERESTS
Skills-based hiring is a talent management approach that empowers employers to align recruitment around business results, rather than around credentials and titles. It starts with employers identifying the particular skills required for a role, and then screening and evaluating candidates’ competencies against those requirements. With the recent rise in employers adopting skills-based hiring practices, it has become essential for students to take courses that improve their marketability and support their long-term career success. A 2017 survey of over 32,000 students at 43 randomly selected institutions found that only 34% of students believe they will graduate with the skills and knowledge required to be successful in the job market. Furthermore, the study found that while 96% of chief academic officers believe that their institutions are very or somewhat effective at preparing students for the workforce, only 11% of business leaders strongly agree [11]. An implication of this misalignment is that college graduates lack the skills that companies need and value. Fortunately, the rise of skills-based hiring provides an opportunity for universities and students to establish and follow clearer classroom-to-career pathways. To this end, this paper presents a course recommender system that aims to improve students’ career readiness by suggesting relevant skills and courses based on their unique career interests.
DancingLines: An Analytical Scheme to Depict Cross-Platform Event Popularity
Nowadays, events usually burst and are propagated online through multiple
modern media like social networks and search engines. Various studies
discuss event dissemination trends on individual media, while
few studies focus on event popularity analysis from a cross-platform
perspective. Challenges come from the vast diversity of events and media,
limited access to aligned datasets across different media and a great deal of
noise in the datasets. In this paper, we design DancingLines, an innovative
scheme that captures and quantitatively analyzes event popularity between
pairwise text media. It contains two models: TF-SW, a semantic-aware popularity
quantification model, based on an integrated weight coefficient leveraging
Word2Vec and TextRank; and wDTW-CD, a pairwise event popularity time series
alignment model matching different event phases adapted from Dynamic Time
Warping. We also propose three metrics to interpret event popularity trends
between pairwise social platforms. Experimental results on eighteen real-world
event datasets from an influential social network and a popular search engine
validate the effectiveness and applicability of our scheme. DancingLines is
demonstrated to possess broad application potential for discovering
knowledge of various aspects related to events and different media.
Utilizing implicit feedback data to build a hybrid recommender system
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics. In e-commerce applications, buyers are overwhelmed by the number of products due to
the high depth of assortments. They may be interested in receiving recommendations
to assist with their purchasing decisions. However, many recommendation engines
perform poorly in the absence of community data and contextual data. This thesis
examines a hybrid matrix factorisation model, LightFM, representing users and items
as linear combinations of their content features’ latent factors. The model embedding
item features displays superior user and item cold-start performance. The results
demonstrate the importance of selectively embedding contextual data in the presence
of cold-start
Using webcrawling of publicly available websites to assess E-commerce relationships
We investigate e-commerce success factors concerning their impact on the success of commerce transactions between businesses. In the scientific literature, many e-commerce success factors have been introduced. Most of them focus on the quality of companies' websites. They are evaluated concerning companies' success in the business-to-consumer (B2C) environment, where consumers choose their preferred e-commerce websites based on these success factors, e.g. website content quality, website interaction, and website customization. In contrast to previous work, this research focuses on the use of existing e-commerce success factors for predicting the success of business-to-business (B2B) e-commerce. The introduced methodology is based on the identification of semantic textual patterns representing success factors from the websites of B2B companies. The effect of the identified success factors in B2B e-commerce is evaluated by regression modeling. As a result, it is shown that some B2C e-commerce success factors also enable the prediction of B2B e-commerce success while others do not. This contributes to the existing literature concerning e-commerce success factors. Further, these findings are valuable for the creation of B2B e-commerce websites.
Combining social-based data mining techniques to extract collective trends from twitter
Social Networks have become an important environment for Collective Trends extraction. The interactions
amongst users provide information of their preferences and relationships. This information can be used to
measure the influence of ideas, or opinions, and how they are spread within the Network. Currently, one of the
most relevant and popular Social Networks is Twitter. This Social Network was created to share comments and
opinions. The information provided by users is especially useful in different fields and research areas such as
marketing. This data is presented as short text strings containing different ideas expressed by real people. With
this representation, different Data Mining techniques (such as classification or clustering) will be used for
knowledge extraction to distinguish the meaning of the opinions. Complex Network techniques are also helpful
to discover influential actors and study the information propagation inside the Social Network. This work is
focused on how clustering and classification techniques can be combined to extract collective knowledge from
Twitter. In an initial phase, clustering techniques are applied to extract the main topics from the user opinions.
Later, the collective knowledge extracted is used to relabel the dataset according to the clusters obtained to
improve the classification results. Finally, these results are compared against a dataset which has been
manually labelled by human experts to analyse the accuracy of the proposed method. The preparation of this manuscript has been supported by the Spanish Ministry of Science and Innovation under the
following projects: TIN2010-19872 and ECO2011-30105 (National Plan for Research, Development and
Innovation), as well as the Multidisciplinary Project of Universidad Autónoma de Madrid (CEMU2012-034). The
authors thank Ana M. Díaz-Martín and Mercedes Rozano for the manual classification of the Tweets.
The Design of Pre-Processing Multidimensional Data Based on Component Analysis
The increased implementation of new databases for multidimensional data, involving techniques that support
efficient query processing, creates opportunities for more extensive research. Pre-processing is required because of
missing attribute values, noisy data, errors, inconsistencies or outliers, and differences in coding. Several types of pre-processing based on component analysis are carried out for data cleaning, integration and transformation,
as well as for dimensionality reduction. Component analysis can be performed with statistical methods, with the aim of separating the various data sources into statistically independent patterns. This paper aims to improve the quality of pre-processed data based on component analysis. RapidMiner is used for data pre-processing with the FastICA algorithm. Kernel K-means is used to cluster the pre-processed data and Expectation Maximization (EM) is used to build the model. The model was tested using the Wisconsin breast cancer, lung cancer and prostate cancer datasets. The results show that the performance of the cluster vector value is higher and the processing time is shorter.