14 research outputs found

    Revisión de la productividad científica sobre Big Data Marketing durante el periodo 2012 – 2019 [Review of scientific productivity on Big Data Marketing during the period 2012 – 2019]

    This study identifies the most significant trends in the production of high-impact scientific articles on the variable Big Data Marketing between 2012 and 2019. The review was carried out in the Scopus database, from which 113 indexed articles were identified as relevant. To this end, descriptive bibliometric indicators are applied, such as volume of production, document type, number of citations, and country of origin (De Fillippo & Fernandez, 2002). Over the period studied, the volume of articles on this variable grew from year to year, although production dropped significantly in 2017. The areas of knowledge that investigate Big Data Marketing the most are, in order, computer science, mathematics, decision making, and engineering.

    A Probabilistic Embedding Clustering Method for Urban Structure Detection

    Urban structure detection is a basic task in urban geography. Clustering is a core technique for detecting patterns of urban spatial structure, urban functional regions, and so on. In the big data era, diverse urban sensing datasets that record information such as human behaviour and social activity suffer from high dimensionality and high noise. Unfortunately, state-of-the-art clustering methods do not handle high dimensionality and high noise concurrently. In this paper, a probabilistic embedding clustering method is proposed. First, we propose a Probabilistic Embedding Model (PEM) that finds latent features in high-dimensional urban sensing data by learning a probabilistic model. The latent features capture the essential patterns hidden in high-dimensional data, and the probabilistic model reduces the uncertainty caused by high noise. Second, by tuning its parameters, our model can discover two kinds of urban structure, homophily and structural equivalence, that is, communities with intensive interaction or communities that play the same role in the urban structure. We evaluated the model on real-world data, and experiments with real data from Shanghai (China) showed that our method can discover both kinds of urban structure, clustering communities with intensive interaction or with the same role in urban space. Comment: 6 pages, 7 figures, ICSDM201

    Word2Vec model for sentiment analysis of product reviews in Indonesian language

    Online product reviews have become a highly valuable source of information for consumers making purchase decisions and for producers seeking to improve their products and marketing strategies. However, as the number of available reviews grows, it becomes increasingly difficult for people to understand and evaluate the general opinion about a particular product manually. Hence, an automatic approach is preferred. One of the most popular techniques is a machine learning approach such as the Support Vector Machine (SVM). In this study, we explore the use of the Word2Vec model as features in SVM-based sentiment analysis of product reviews in the Indonesian language. The experimental results show that SVM performs well on the sentiment classification task with every model tested. However, the Word2Vec model has the lowest accuracy (only 0.70) compared to the baseline methods, which include the Bag of Words model using Binary TF, Raw TF, and TF-IDF. This is because only a small dataset was used to train the Word2Vec model. Word2Vec needs a large number of examples to learn word representations and place similar words close to each other
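    The three Bag of Words weightings named in this abstract can be illustrated with a minimal sketch. The function name `term_weights`, the toy reviews, and the unsmoothed `log(N / df)` inverse document frequency are assumptions made for illustration; the abstract does not say which variant or toolkit the authors used.

```python
import math
from collections import Counter


def term_weights(docs):
    """Return binary TF, raw TF, and TF-IDF weights for tokenized documents."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    binary, raw, tfidf = [], [], []
    for doc in docs:
        counts = Counter(doc)
        # Binary TF: 1 if the term occurs in the document at all.
        binary.append({term: 1 for term in counts})
        # Raw TF: number of occurrences of the term in the document.
        raw.append(dict(counts))
        # TF-IDF with a plain (unsmoothed) idf = log(N / df); an assumption.
        tfidf.append(
            {term: c * math.log(n_docs / df[term]) for term, c in counts.items()}
        )
    return binary, raw, tfidf


# Hypothetical toy reviews, whitespace-tokenized.
docs = [
    "barang bagus bagus".split(),
    "barang jelek".split(),
]
binary, raw, tfidf = term_weights(docs)
```

    Each returned list holds one dictionary per document, which could then be turned into fixed-length feature vectors for a classifier such as an SVM. A term that occurs in every document gets a TF-IDF weight of zero under this unsmoothed definition.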

    Indonesian Language Term Extraction using Multi-Task Neural Network

    The rapidly expanding size of data makes it difficult to extract information and store it as computerized knowledge. Relation extraction and term extraction play a crucial role in resolving this issue. Automatically finding a concealed relationship between terms that appear in the text can help people build computer-based knowledge more quickly. Term extraction is required as one of the components because identifying terms that play a significant role in the text is the essential step before determining their relationship. To address this issue for the Indonesian language, we propose an end-to-end system capable of extracting terms from text. Our method combines two multilayer perceptron neural networks to perform Part-of-Speech (PoS) labeling and Noun Phrase Chunking. Our models were trained as a joint model to solve this problem. Our proposed method, with an f-score of 86.80%, can be considered a state-of-the-art algorithm for performing term extraction in the Indonesian language using noun phrase chunking


    With the rapid development of the Internet, and of social media technologies in particular, the public has increasingly published its perception of social events online through social media. In the Web 2.0 era, with extensive public participation in sharing information about social events, effective content analysis and better presentation of results for this online content are of significant importance for public opinion analysis and monitoring. In view of this, this paper proposes a novel method for public opinion analysis on social media websites. First, the probabilistic topic model Latent Dirichlet Allocation (LDA) is adopted to extract the public's ideas about the distinct topics of a given event, and then the deep learning model word2vec is used to calculate the emotional intensity of each text. Next, the underlying themes as a whole and the emotional intensity of the events are investigated, and the variation trend of the public's emotional intensity is tracked using time series analysis. Finally, the rationality and effectiveness of the method are verified through the analysis of a real case

    Exploring fine-grained sentiment values in online product reviews

    We hypothesise that it is possible to determine a fine-grained set of sentiment values over and above the simple three-way positive/neutral/negative or binary Like/Dislike distinctions by examining textual formatting features. We show that this is possible for online comments about ten different categories of products. In the context of online shopping and reviews, one of the ways to analyse consumers' feedback is by analysing comments. The rating of the "like" button on a product or a comment is not sufficient to understand the level of expression. The expression of opinion is not only identified by the meaning of the words used in the comments, nor by simply counting the number of "thumbs up", but it also includes the usage of capital letters, the use of repeated words, and the usage of emoticons. In this paper, we investigate whether it is possible to expand up to seven levels of sentiment by extracting such features. Five hundred questionnaires were collected and analysed to verify the level of "like" and "dislike" value. Our results show significant values on each of the hypotheses. For consumers, reading reviews helps them make better purchase decisions but we show there is also value to be gained in a finer-grained sentiment analysis for future commercial website platforms

    An empirical study of semantic similarity in WordNet and Word2Vec

    This thesis performs an empirical analysis of Word2Vec by comparing its output to WordNet, a well-known, human-curated lexical database. It finds that Word2Vec tends to uncover more of certain types of semantic relations than others -- with Word2Vec returning more hypernyms, synonyms and hyponyms than hyponyms or holonyms. It also shows the probability that neighbors separated by a given cosine distance in Word2Vec are semantically related in WordNet. This result both adds to our understanding of the still-unknown Word2Vec and helps to benchmark new semantic tools built from word vectors
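    The cosine distance mentioned in this abstract can be sketched as follows. This assumes the common definition of one minus cosine similarity; the thesis may use a different convention, and the function name `cosine_distance` is hypothetical.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two equal-length vectors: 1 - cos(angle)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_u * norm_v)
```

    Under this definition, vectors pointing in the same direction have a distance near 0, orthogonal vectors have a distance of 1, and opposite vectors have a distance of 2, regardless of vector length.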

    Using Word2Vec recommendation for improved purchase prediction
