Revisión de la productividad científica sobre Big Data Marketing durante el periodo 2012 – 2019
The present study identifies the most significant trends in the production of high-impact scientific articles on the Big Data Marketing variable between 2012 and 2019, based on a review of the Scopus database, which highlighted the relevance of 113 articles indexed there. For this purpose, descriptive bibliometric indicators are applied, such as volume of production, type of document, number of citations, and country of origin (De Fillippo & Fernandez, 2002). Over the period studied, the volume of articles on the variable grows annually; however, in 2017 production drops significantly. The areas of knowledge that most investigate Big Data Marketing are, in order, computer science, mathematics, decision making, and engineering.
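The descriptive indicators named in the abstract above (volume of production, document type, citations, country) reduce to simple tallies. A minimal Python sketch, assuming a hypothetical record layout: the `records` list and its field names are invented for illustration and are not taken from the study or from any Scopus export format.

```python
from collections import Counter

# Toy records, invented for illustration only (not data from the study).
records = [
    {"year": 2016, "doc_type": "Article", "citations": 12, "country": "Spain"},
    {"year": 2016, "doc_type": "Conference Paper", "citations": 3, "country": "China"},
    {"year": 2017, "doc_type": "Article", "citations": 7, "country": "Spain"},
]


def bibliometric_summary(records):
    """Tally the descriptive indicators over a list of publication records."""
    by_year = Counter(r["year"] for r in records)
    return {
        # Volume of production, ordered by year so a drop is easy to spot.
        "volume_by_year": dict(sorted(by_year.items())),
        "by_doc_type": dict(Counter(r["doc_type"] for r in records)),
        "by_country": dict(Counter(r["country"] for r in records)),
        "total_citations": sum(r["citations"] for r in records),
    }


summary = bibliometric_summary(records)
```

Comparing consecutive entries of `summary["volume_by_year"]` is one way to spot a year-on-year drop like the one the abstract reports for 2017.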
A Probabilistic Embedding Clustering Method for Urban Structure Detection
Urban structure detection is a basic task in urban geography. Clustering is a
core technology for detecting patterns of urban spatial structure, urban
functional regions, and so on. In the big data era, diverse urban sensing
datasets that record information such as human behaviour and social activity
suffer from high dimensionality and high noise. Unfortunately,
state-of-the-art clustering methods do not handle high dimensionality and
high noise concurrently. In this paper, a probabilistic embedding clustering
method is proposed. First, we introduce a Probabilistic Embedding Model (PEM)
that learns latent features from high-dimensional urban sensing data with a
probabilistic model. The latent features capture the essential patterns
hidden in high-dimensional data, and the probabilistic model reduces the
uncertainty caused by high noise. Second, by tuning its parameters, our model
can discover two kinds of urban structure, homophily and structural
equivalence, that is, communities with intensive interaction or with the same
roles in the urban structure. Experiments on real-world data from Shanghai
(China) showed that our method could discover both kinds of structure.
Comment: 6 pages, 7 figures, ICSDM201
Word2Vec model for sentiment analysis of product reviews in Indonesian language
Online product reviews have become a highly valuable source of information for consumers making purchase decisions and for producers improving their products and marketing strategies. However, as the number of available reviews increases, it becomes more and more difficult for people to assess the general opinion about a particular product manually. Hence, an automatic approach is preferred. One of the most popular techniques is a machine learning approach such as the Support Vector Machine (SVM). In this study, we explore the use of the Word2Vec model as features in SVM-based sentiment analysis of product reviews in the Indonesian language. The experimental results show that SVM performs well on the sentiment classification task with every feature model tested. However, the Word2Vec model has the lowest accuracy (only 0.70) compared to the baseline methods, which are Bag of Words models using Binary TF, Raw TF, and TF.IDF. This is because only a small dataset was used to train the Word2Vec model. Word2Vec needs a large number of examples to learn word representations and place similar words close together.
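The three Bag of Words baselines named in the abstract above differ only in how a term is weighted within a review. A minimal Python sketch of the three weightings, under stated assumptions: the whitespace tokenizer, the two toy reviews, and the `log(N / df)` form of IDF are illustrative choices, since the abstract does not specify the paper's exact preprocessing or TF.IDF variant.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; the paper's tokenizer is not given.
    return text.lower().split()


def binary_tf(tokens, vocab):
    """1.0 if the term occurs in the review, else 0.0."""
    present = set(tokens)
    return [1.0 if term in present else 0.0 for term in vocab]


def raw_tf(tokens, vocab):
    """Number of times each term occurs in the review."""
    counts = Counter(tokens)
    return [float(counts[term]) for term in vocab]


def tf_idf(docs_tokens, vocab):
    """Raw TF scaled by log(N / df); one common variant among several."""
    n_docs = len(docs_tokens)
    doc_freq = Counter(term for tokens in docs_tokens for term in set(tokens))
    # vocab is built from the documents themselves, so doc_freq[term] >= 1.
    idf = {term: math.log(n_docs / doc_freq[term]) for term in vocab}
    return [
        [tf * idf[term] for tf, term in zip(raw_tf(tokens, vocab), vocab)]
        for tokens in docs_tokens
    ]


# Toy reviews, invented for illustration ("bagus" = good, "jelek" = bad).
docs = [tokenize("barang bagus bagus"), tokenize("barang jelek")]
vocab = sorted({term for tokens in docs for term in tokens})

binary_vec = binary_tf(docs[0], vocab)
raw_vec = raw_tf(docs[0], vocab)
tfidf_vecs = tf_idf(docs, vocab)
```

Each resulting vector could then be passed to an SVM classifier as the feature representation of a review. Note that a term appearing in every review ("barang" here) gets a TF.IDF weight of zero under this variant.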
Indonesian Language Term Extraction using Multi-Task Neural Network
The rapidly expanding size of data makes it difficult to extract information and store it as computerized knowledge. Relation extraction and term extraction play a crucial role in resolving this issue. Automatically finding hidden relationships between terms that appear in a text can help people build computer-based knowledge more quickly. Term extraction is a required component because identifying the terms that play a significant role in the text is the essential step before determining their relationships. We propose an end-to-end system capable of extracting terms from text to address this issue for the Indonesian language. Our method combines two multilayer perceptron neural networks to perform Part-of-Speech (PoS) labeling and Noun Phrase Chunking. The two models were trained jointly. Our proposed method, with an f-score of 86.80%, can be considered a state-of-the-art algorithm for term extraction in the Indonesian language using noun phrase chunking.
PUBLIC OPINION ANALYSIS BASED ON PROBABILISTIC TOPIC MODELING AND DEEP LEARNING
With the rapid development of the Internet, and of social media technologies in particular, the public increasingly publishes its perception of social events online. In the Web 2.0 era, with extensive public participation in sharing information about social events, effective content analysis and clear presentation of results for this online content are of significant importance for public opinion analysis and monitoring. In view of this, this paper proposes a novel method for public opinion analysis on social media websites. First, the Latent Dirichlet Allocation (LDA) probabilistic topic model is adopted to extract the public’s ideas about the distinct topics of a given event, and then the deep learning model named word2vec is used to calculate the emotional intensity of each text. Next, the underlying themes as a whole, as well as the emotional intensity of the events, are investigated, and the trend in the public’s emotional intensity is tracked using time series analysis. Finally, the rationality and effectiveness of the method are verified through the analysis of a real case.
Exploring fine-grained sentiment values in online product reviews
We hypothesise that it is possible to determine a fine-grained set of sentiment values, over and above the simple three-way positive/neutral/negative or binary Like/Dislike distinctions, by examining textual formatting features. We show that this is possible for online comments about ten different categories of products. In the context of online shopping and reviews, one way to analyse consumers' feedback is by analysing comments. The rating of the "like" button on a product or a comment is not sufficient to understand the level of expression. The expression of opinion is identified not only by the meaning of the words used in the comments, or by simply counting the number of "thumbs up", but also by the use of capital letters, repeated words, and emoticons. In this paper, we investigate whether it is possible to expand to up to seven levels of sentiment by extracting such features. Five hundred questionnaires were collected and analysed to verify the level of "like" and "dislike" value. Our results show significant values on each of the hypotheses. For consumers, reading reviews helps them make better purchase decisions, but we show there is also value to be gained from a finer-grained sentiment analysis for future commercial website platforms.
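The formatting cues listed in the abstract above (capital letters, repeated words, emoticons) can be counted with a few regular expressions. A minimal Python sketch, assuming a crude emoticon pattern and simple definitions of "capitalised" and "repeated". The function name, the pattern, and the thresholds are illustrative and not the authors' feature definitions, and no mapping onto the seven sentiment levels is attempted here.

```python
import re

# Assumption: a deliberately small pattern for western-style emoticons
# such as :) ;-) :D =( :P. Real comments would need a broader inventory.
EMOTICON_RE = re.compile(r"[:;=8][\-']?[)(DPpOo/\\|]")
WORD_RE = re.compile(r"[A-Za-z']+")


def formatting_features(comment):
    """Count simple formatting cues in one comment."""
    words = WORD_RE.findall(comment)
    # Words written entirely in capitals; single letters such as "I" are skipped.
    all_caps = [w for w in words if len(w) > 1 and w.isupper()]
    # Immediate repetitions such as "very very", compared case-insensitively.
    lowered = [w.lower() for w in words]
    repeated = sum(1 for prev, cur in zip(lowered, lowered[1:]) if prev == cur)
    return {
        "all_caps_words": len(all_caps),
        "repeated_words": repeated,
        "emoticons": len(EMOTICON_RE.findall(comment)),
    }


features = formatting_features("I LOVE this phone, very very good :)")
```

Counts like these could serve as extra inputs alongside word-level sentiment when grading how strongly an opinion is expressed.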
An empirical study of semantic similarity in WordNet and Word2Vec
This thesis performs an empirical analysis of Word2Vec by comparing its output to WordNet, a well-known, human-curated lexical database. It finds that Word2Vec tends to uncover more of certain types of semantic relations than others, with Word2Vec returning more hypernyms, synonyms and hyponyms than hyponyms or holonyms. It also shows the probability that neighbors separated by a given cosine distance in Word2Vec are semantically related in WordNet. This result both adds to our understanding of the still-unknown Word2Vec and helps to benchmark new semantic tools built from word vectors.
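The cosine distance mentioned in the abstract above is one minus the cosine of the angle between two word vectors. A minimal Python sketch, assuming plain lists of floats as vectors: the toy two-dimensional vectors and the `neighbors_within` helper are hypothetical and stand in for real Word2Vec embeddings, which the thesis would have obtained from a trained model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """0 for vectors pointing the same way, 1 for orthogonal vectors."""
    return 1.0 - cosine_similarity(u, v)


def neighbors_within(query, vectors, max_distance):
    """Return (word, distance) pairs within max_distance of query, nearest first."""
    hits = []
    for word, vec in vectors.items():
        if word == query:
            continue
        dist = cosine_distance(vectors[query], vec)
        if dist <= max_distance:
            hits.append((word, dist))
    return sorted(hits, key=lambda pair: pair[1])


# Toy vectors, invented for illustration (real embeddings have hundreds of dimensions).
toy_vectors = {
    "cat": [1.0, 0.0],
    "kitten": [0.9, 0.1],
    "car": [0.0, 1.0],
}
near_cat = neighbors_within("cat", toy_vectors, max_distance=0.5)
```

Each neighbor returned this way could then be looked up in WordNet to check whether the pair stands in a recognised semantic relation, which is the kind of comparison the thesis describes.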