Modeling consumer's perception of orange juice
A multiple indicator and multiple cause model with dichotomous indicators was used to study consumers' perception of orange juice. Results indicate that a respondent's recall of orange juice advertising messages had a positive impact on his/her perception of orange juice. Results also suggest that selected socioeconomic variables were important determinants of consumer perception.
Keywords: orange juice, advertising recall, multiple indicator and multiple cause model, Demand and Price Analysis
Scalable Architecture for Integrated Batch and Streaming Analysis of Big Data
Thesis (Ph.D.) - Indiana University, Computer Sciences, 2015
As Big Data processing problems evolve, many modern applications demonstrate special characteristics. Data exists in the form of both large historical datasets and high-speed real-time streams, and many analysis pipelines require integrated parallel batch processing and stream processing. Despite the large size of the whole dataset, most analyses focus on specific subsets according to certain criteria. Correspondingly, integrated support for efficient queries and post-query analysis is required.
To address the system-level requirements brought by such characteristics, this dissertation proposes a scalable architecture for integrated queries, batch analysis, and streaming analysis of Big Data in the cloud. We verify its effectiveness using a representative application domain - social media data analysis - and tackle related research challenges emerging from each module of the architecture by integrating and extending multiple state-of-the-art Big Data storage and processing systems.
In the storage layer, we reveal that existing text indexing techniques do not work well for the unique queries of social data, which put constraints on both textual content and social context. To address this issue, we propose a flexible indexing framework over NoSQL databases to support fully customizable index structures, which can embed necessary social context information for efficient queries.
The batch analysis module demonstrates that analysis workflows consist of multiple algorithms with different computation and communication patterns, which are suitable for different processing frameworks. To achieve efficient workflows, we build an integrated analysis stack based on YARN, and make novel use of customized indices in developing sophisticated analysis algorithms.
In the streaming analysis module, the high-dimensional data representation of social media streams poses special challenges to the problem of parallel stream clustering. Due to the sparsity of the high-dimensional data, the traditional synchronization method becomes expensive and severely impacts the scalability of the algorithm. Therefore, we design a novel strategy that broadcasts the incremental changes rather than the whole centroids of the clusters to achieve scalable parallel stream clustering algorithms.
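The idea of broadcasting incremental changes instead of whole centroids can be illustrated with a minimal sketch. This is not the dissertation's implementation: the class `SparseCluster`, its method names, and the dictionary-based sparse representation are assumptions chosen for illustration, and the transport layer that would carry the deltas between workers is left out.

```python
from collections import defaultdict


class SparseCluster:
    """Hypothetical replica of one cluster held by one parallel worker.

    The centroid is kept as sparse per-dimension sums plus a point count,
    so that changes from different workers can simply be added together.
    """

    def __init__(self):
        self.sums = defaultdict(float)        # dimension -> accumulated value
        self.count = 0                        # points assigned so far
        self.delta_sums = defaultdict(float)  # local changes since last sync
        self.delta_count = 0

    def add_point(self, point):
        """Assign a sparse point (dict: dimension -> value) to this cluster."""
        for dim, value in point.items():
            self.sums[dim] += value
            self.delta_sums[dim] += value
        self.count += 1
        self.delta_count += 1

    def take_delta(self):
        """Return the local changes since the last sync and reset them."""
        delta = (dict(self.delta_sums), self.delta_count)
        self.delta_sums = defaultdict(float)
        self.delta_count = 0
        return delta

    def apply_remote_delta(self, delta):
        """Merge changes received from another worker."""
        sums, count = delta
        for dim, value in sums.items():
            self.sums[dim] += value
        self.count += count

    def centroid(self):
        """Mean vector over all points seen locally or via deltas."""
        if self.count == 0:
            return {}
        return {dim: total / self.count for dim, total in self.sums.items()}


def demo():
    # Two workers hold replicas of the same cluster and see different points.
    a, b = SparseCluster(), SparseCluster()
    a.add_point({"x": 1.0, "y": 2.0})
    b.add_point({"y": 4.0})
    # Each worker sends only what changed, not its full centroid.
    delta_a, delta_b = a.take_delta(), b.take_delta()
    a.apply_remote_delta(delta_b)
    b.apply_remote_delta(delta_a)
    return a.centroid(), b.centroid()
```

In this sketch the message size depends on the number of dimensions touched since the last synchronization rather than on the total number of non-zero dimensions in the centroid, which is the property the paragraph above relies on.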
Performance tests using real applications show that our solutions for parallel data loading/indexing, queries, analysis tasks, and stream clustering all significantly outperform implementations using current state-of-the-art technologies.
FACTORS INFLUENCING CHANGES IN POTATO AND POTATO SUBSTITUTE DEMAND
Despite the rapid rise in complex carbohydrate consumption over the last twenty-five years, fresh potato consumption has fallen by over 50%. Fresh potato growers and retailers alike need to know whether these changes reflect consumer responses to changing relative prices or incomes, or whether they are due to changes in consumer tastes. This paper uses a linear approximation almost ideal demand system (LA/AIDS) to investigate the effect of relative prices, expenditures, and a set of socioeconomic variables on complex carbohydrate demand. Estimation results show that the socioeconomic variables explain some of the changes in demand, but a significant amount remains as evidence of a change in consumer tastes.
Keywords: Demand and Price Analysis
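For readers unfamiliar with LA/AIDS, the standard textbook share equation is w_i = alpha_i + sum_j gamma_ij ln p_j + beta_i ln(X / P*), where Stone's price index ln P* = sum_k w_k ln p_k replaces the nonlinear price index of the full AIDS model. The sketch below evaluates that equation for one good. The function names and all parameter values are hypothetical, and the socioeconomic shifters used in the paper are omitted.

```python
import math


def stone_log_price_index(shares, prices):
    """Stone's index: ln P* = sum_k w_k * ln p_k."""
    return sum(w * math.log(p) for w, p in zip(shares, prices))


def la_aids_share(alpha_i, gamma_i, beta_i, prices, expenditure, shares):
    """Predicted budget share of good i under the LA/AIDS specification.

    alpha_i      intercept for good i (hypothetical value)
    gamma_i      price coefficients gamma_ij, one per good j
    beta_i       expenditure coefficient for good i
    prices       prices p_j of all goods in the system
    expenditure  total expenditure X on the system
    shares       observed budget shares w_k used in Stone's index
    """
    ln_p_star = stone_log_price_index(shares, prices)
    price_term = sum(g * math.log(p) for g, p in zip(gamma_i, prices))
    real_expenditure = math.log(expenditure) - ln_p_star
    return alpha_i + price_term + beta_i * real_expenditure
```

With all prices normalized to 1 and expenditure equal to 1, every logarithm vanishes and the predicted share reduces to `alpha_i`, which is a convenient sanity check on any implementation.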
A BIO-ECONOMIC DYNAMIC PROGRAMMING ANALYSIS OF THE SEASONAL SUPPLY RESPONSE BY FLORIDA DAIRY PRODUCERS
Seasonal price premiums have been proposed as a means of dampening the highly seasonal patterns of milk production in Florida. A Markov decision bio-economic model of the breeding and replacement decisions was solved via stochastic dynamic programming and used to analyze the potential supply response to seasonal price premiums. The results of the analysis suggest that the seasonal milk supply in Florida is highly price inelastic.
Keywords: Demand and Price Analysis
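The general technique named here, solving a Markov decision model by stochastic dynamic programming, can be sketched as finite-horizon backward induction. The toy herd model below is purely illustrative and is not the paper's model: the two states, the yields, the replacement cost, the transition probabilities, and the premium schedule are all made-up assumptions.

```python
def backward_induction(states, actions, transition, reward, horizon):
    """Solve a finite-horizon Markov decision problem by backward induction.

    transition(s, a) returns {next_state: probability};
    reward(t, s, a) returns the immediate payoff in period t.
    Returns the period-0 value function and one decision rule per period.
    """
    value = {s: 0.0 for s in states}  # terminal value assumed to be zero
    policy = []
    for t in reversed(range(horizon)):
        new_value, decision = {}, {}
        for s in states:
            best_action, best_q = None, float("-inf")
            for a in actions:
                expected = sum(p * value[s2] for s2, p in transition(s, a).items())
                q = reward(t, s, a) + expected
                if q > best_q:
                    best_action, best_q = a, q
            new_value[s], decision[s] = best_q, best_action
        value = new_value
        policy.insert(0, decision)
    return value, policy


def solve_toy_herd(premium, horizon=1):
    """Hypothetical two-state replacement problem with a seasonal premium."""
    states = ["young", "old"]
    actions = ["keep", "replace"]
    milk_yield = {"young": 10.0, "old": 6.0}  # made-up yields per period
    replacement_cost = 5.0                    # made-up cost of a new animal

    def price(t):
        # Assumed schedule: the premium is paid in the first month of each year.
        return 1.0 + (premium if t % 12 == 0 else 0.0)

    def transition(s, a):
        if a == "replace":
            return {"young": 1.0}
        return {"young": 0.7, "old": 0.3} if s == "young" else {"old": 1.0}

    def reward(t, s, a):
        if a == "replace":
            return price(t) * milk_yield["young"] - replacement_cost
        return price(t) * milk_yield[s]

    return backward_induction(states, actions, transition, reward, horizon)
```

In this toy setting, comparing `solve_toy_herd(0.0)` with `solve_toy_herd(0.5)` shows how a premium can flip the optimal decision for an old animal from keeping to replacing. How strongly real producers would respond is exactly the empirical question the paper addresses; the sketch says nothing about that.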
Self-protected nanoscale thermometry based on spin defects in silicon carbide
Quantum sensors with solid-state electron spins have attracted considerable
interest due to their nanoscale spatial resolution. A critical requirement is to
suppress the environmental noise of the solid-state spin sensor. Here we
demonstrate a nanoscale thermometer based on silicon carbide (SiC) electron
spins. We experimentally demonstrate that the performance of the spin sensor is
robust against dephasing due to a self-protected mechanism. SiC thermometry
may provide a promising platform for sensing in a noisy environment, e.g.,
biological system sensing.
Parallel clustering of high-dimensional social media data streams
We introduce Cloud DIKW as an analysis environment supporting scientific
discovery through integrated parallel batch and streaming processing, and apply
it to one representative domain application: social media data stream
clustering. Recent work demonstrated that high-quality clusters can be
generated by representing the data points using high-dimensional vectors that
reflect textual content and social network information. Due to the high cost of
similarity computation, sequential implementations of even single-pass
algorithms cannot keep up with the speed of real-world streams. This paper
presents our efforts to meet the constraints of real-time social stream
clustering through parallelization. We focus on two system-level issues. Most
stream processing engines like Apache Storm organize distributed workers in the
form of a directed acyclic graph, making it difficult to dynamically
synchronize the state of parallel workers. We tackle this challenge by creating
a separate synchronization channel using a pub-sub messaging system. Due to the
sparsity of the high-dimensional vectors, the size of centroids grows quickly
as new data points are assigned to the clusters. Traditional synchronization
that directly broadcasts cluster centroids becomes too expensive and limits the
scalability of the parallel algorithm. We address this problem by communicating
only dynamic changes of the clusters rather than the whole centroid vectors.
Our algorithm under Cloud DIKW can process the Twitter 10% data stream in
real-time with 96-way parallelism. By natural improvements to Cloud DIKW,
including advanced collective communication techniques developed in our Harp
project, we will be able to process the full Twitter stream in real-time with
1000-way parallelism. Our use of powerful general software subsystems will
enable many other applications that need integration of streaming and batch
data analytics.
Comment: IEEE/ACM CCGrid 2015: 15th IEEE/ACM International Symposium on
Cluster, Cloud and Grid Computing, 201
Garlic Consumption and All-Cause Mortality among Chinese Oldest-Old Individuals: A Population-Based Cohort Study.
In vitro and in vivo experimental studies have shown that garlic has protective effects on the aging process; however, there is no evidence that garlic consumption is associated with all-cause mortality among oldest-old individuals (≥80 years). From 1998 to 2011, 27,437 oldest-old participants (mean age: 92.9 years) were recruited from 23 provinces in China. The frequencies of garlic consumption at baseline and at age 60 were collected. Cox proportional hazards models adjusted for potential covariates were constructed to estimate hazard ratios (HRs) relating garlic consumption to all-cause mortality. Among 92,505 person-years of follow-up from baseline to September 1, 2014, 22,321 participants died. Participants who often (≥5 times/week) or occasionally (1-4 times/week) consumed garlic survived longer than those who rarely (less than once/week) consumed it (p < 0.001). Participants who consumed garlic occasionally or often had a lower risk of mortality than those who rarely consumed garlic at baseline; the adjusted HRs for mortality were 0.92 (0.89-0.94) and 0.89 (0.85-0.92), respectively. The inverse associations between garlic consumption and all-cause mortality were robust in sensitivity analyses and subgroup analyses. In this study, habitual consumption of garlic was associated with a lower all-cause mortality risk; this advocates further investigation into garlic consumption for promoting longevity.
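As a reading aid for the hazard ratios reported above: in a Cox proportional hazards model the hazard ratio for a group is exp(beta), where beta is that group's regression coefficient, so an HR below 1 corresponds to a proportionally lower hazard than the reference group. The short sketch below only converts the reported point estimates. The function names are illustrative and no model is fitted here.

```python
import math


def percent_hazard_reduction(hazard_ratio):
    """Relative reduction in hazard, in percent, versus the reference group."""
    return (1.0 - hazard_ratio) * 100.0


def log_hazard_coefficient(hazard_ratio):
    """Cox regression coefficient implied by a hazard ratio: beta = ln(HR)."""
    return math.log(hazard_ratio)


# Point estimates quoted in the abstract (occasional and frequent consumers).
for label, hr in [("occasionally", 0.92), ("often", 0.89)]:
    reduction = round(percent_hazard_reduction(hr), 1)
    print(f"{label}: HR {hr} -> about {reduction}% lower hazard")
```

These are associations from an observational cohort, so the percentages describe differences in hazard between groups, not a guaranteed effect of changing one's diet.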