15 research outputs found
Highly efficient low-level feature extraction for video representation and retrieval.
PhD
Witnessing the omnipresence of digital video media, the research community has
raised the question of its meaningful use and management. Stored in immense
multimedia databases, digital videos need to be retrieved and structured in an
intelligent way, relying on the content and the rich semantics involved. Current
Content Based Video Indexing and Retrieval systems face the problem of the semantic
gap between the simplicity of the available visual features and the richness of user
semantics.
This work focuses on the issues of efficiency and scalability in video indexing and
retrieval to facilitate a video representation model capable of semantic annotation. A
highly efficient algorithm for temporal analysis and key-frame extraction is developed.
It is based on the prediction information extracted directly from the compressed domain
features and the robust scalable analysis in the temporal domain. Furthermore,
a hierarchical quantisation of the colour features in the descriptor space is presented.
Derived from the extracted set of low-level features, a video representation model that
enables semantic annotation and contextual genre classification is designed.
Results demonstrate the efficiency and robustness of the temporal analysis algorithm,
which runs in real time while maintaining high precision and recall in the detection task.
Adaptive key-frame extraction and summarisation achieve a good overview of the
visual content, while the colour quantisation algorithm efficiently creates a hierarchical
set of descriptors. Finally, the video representation model, supported by the genre
classification algorithm, achieves excellent results in an automatic annotation system by
linking the video clips with a limited lexicon of related keywords.
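Gazo bunseki to kanren joho o riyoshita gazo imi rikai ni kansuru kenkyu [A study on image semantic understanding using image analysis and related information]
Degree system: new; Report number: Kou No. 3514; Degree type: Doctorate (Global Information and Telecommunication Studies); Date conferred: 2012/2/8; Waseda University degree certificate number: Shin 585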
The Third NASA Goddard Conference on Mass Storage Systems and Technologies
This report contains copies of nearly all of the technical papers and viewgraphs presented at the Goddard Conference on Mass Storage Systems and Technologies held in October 1993. The conference served as an informational exchange forum for topics primarily relating to the ingestion and management of massive amounts of data and the attendant problems involved. Discussion topics include the necessary use of computers in the solution of today's infinitely complex problems; the need for greatly increased storage densities in both optical and magnetic recording media; currently popular storage media and magnetic media storage risk factors; and data archiving standards, including a talk on the current status of the IEEE Storage Systems Reference Model (RM). Additional topics addressed system performance, data storage system concepts, communications technologies, data distribution systems, data compression, and error detection and correction.
The use of spectral information in the development of novel techniques for speech-based cognitive load classification
The cognitive load of a user refers to the amount of mental demand imposed on the user when performing a particular task. Estimating the cognitive load (CL) level of users is necessary in order to adjust the workload imposed on them and so improve task performance. Current speech-based CL classification systems are not adequate for commercial use due to their low performance, particularly in noisy environments. This thesis proposes several techniques to improve the performance of speech-based cognitive load classification systems in both clean and noisy conditions.
This thesis analyses and presents the effectiveness of speech features such as spectral centroid frequency (SCF) and spectral centroid amplitude (SCA) for CL classification. Sub-systems based on SCF and SCA features were developed and fused with the traditional Mel frequency cepstral coefficients (MFCC) based system, producing relative error rate reductions of 8.9% and 31.5%, respectively, when compared to the MFCC-based system alone. The Stroop test corpus was used in these experiments.
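As a rough illustration of the two features named above, the following sketch computes a spectral centroid frequency and a spectral centroid amplitude per subband. The definitions used here (magnitude-weighted mean frequency for SCF, frequency-weighted mean magnitude for SCA, rectangular subbands) follow one formulation found in the speech literature and are assumptions; the thesis may define the features differently. The function names and band edges are hypothetical.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short example frames)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def subband_centroids(frame, sample_rate, band_edges_hz):
    """Return one (SCF, SCA) pair per subband [lo, hi).

    Assumed definitions (not taken from the thesis):
      SCF = sum(f * |S(f)|) / sum(|S(f)|)   magnitude-weighted mean frequency
      SCA = sum(f * |S(f)|) / sum(f)        frequency-weighted mean magnitude
    """
    n = len(frame)
    mags = magnitude_spectrum(frame)
    freqs = [k * sample_rate / n for k in range(len(mags))]
    results = []
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        band = [(f, m) for f, m in zip(freqs, mags) if lo <= f < hi]
        mag_sum = sum(m for _, m in band)
        freq_sum = sum(f for f, _ in band)
        if mag_sum == 0 or freq_sum == 0:
            # Empty or silent band: fall back to the band centre and zero amplitude.
            results.append((0.5 * (lo + hi), 0.0))
            continue
        weighted = sum(f * m for f, m in band)
        results.append((weighted / mag_sum, weighted / freq_sum))
    return results

if __name__ == "__main__":
    # A 1 kHz tone sampled at 8 kHz; the low band's SCF should sit near 1 kHz.
    tone = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(64)]
    for scf, sca in subband_centroids(tone, 8000, [0, 2000, 4000]):
        print(round(scf, 1), round(sca, 3))
```

In a real front end the frame would be windowed and the subbands would usually come from a filterbank rather than hard band edges; both are left out here to keep the sketch short.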
The investigation into cognitive load information in the form of spectral distribution in different subbands shows that the amount of information in the low frequency subband is significantly higher than that in the high frequency subband. Two different methods are proposed to utilize this finding. The first method, called the multi-band approach, uses a weighting scheme to emphasize the speech features in low frequency subbands. The cognitive load classification accuracy of this approach is shown to be higher than that of a system without weighting. The second method is to design an effective filterbank based on the spectral distribution of cognitive load information using the Kullback-Leibler distance measure. It is shown that the designed filterbank consistently provides higher classification accuracies than other existing filterbanks such as mel, Bark, and equivalent rectangular bandwidth.
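One way to picture how a Kullback-Leibler distance could rank subbands by the cognitive load information they carry is sketched below. It compares, per subband, the histogram of a feature under two load levels and scores the subband by the symmetric KL divergence. This is an illustrative assumption only: the abstract does not say how the distance is applied, and the histogram inputs and function names are hypothetical.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """D(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def symmetric_kl(p, q):
    """Symmetrised KL divergence, D(p || q) + D(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)

def normalise(hist):
    """Turn raw histogram counts into a probability distribution."""
    total = sum(hist)
    return [h / total for h in hist]

def subband_discrimination(hists_low_load, hists_high_load):
    """Score each subband by how far apart its two load-level histograms are."""
    return [
        symmetric_kl(normalise(a), normalise(b))
        for a, b in zip(hists_low_load, hists_high_load)
    ]

if __name__ == "__main__":
    # Hypothetical counts: subband 0 separates the load levels, subband 1 does not.
    low_load = [[9, 1], [5, 5]]
    high_load = [[1, 9], [5, 5]]
    print(subband_discrimination(low_load, high_load))
```

Subbands with larger scores would then receive more weight or finer filter spacing; how the thesis turns the distance into a filterbank design is not stated in the abstract.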
A discrete cosine transform based speech enhancement technique is proposed in order to increase the robustness of the CL classification system, and it is found to be more suitable than the other methods investigated. The proposed method provides a 3.0% average relative error rate reduction over the seven types of noise and five SNR levels used. In particular, it provides a maximum relative error rate reduction of 7.5% for the F16 noise (in the NOISEX-92 database) at 20 dB SNR.
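For readers unfamiliar with the metric, a relative error rate reduction expresses the drop in error as a fraction of the baseline error rather than as a difference in accuracy. The short sketch below shows the arithmetic; the accuracies in the usage example are hypothetical and are not results from the thesis.

```python
def relative_error_reduction(baseline_accuracy, improved_accuracy):
    """Fraction by which the error rate falls relative to the baseline error rate."""
    baseline_error = 1.0 - baseline_accuracy
    improved_error = 1.0 - improved_accuracy
    if baseline_error == 0:
        raise ValueError("baseline has no errors, so the reduction is undefined")
    return (baseline_error - improved_error) / baseline_error

if __name__ == "__main__":
    # Hypothetical accuracies, chosen only to illustrate the calculation.
    reduction = relative_error_reduction(0.80, 0.815)
    print(f"{100 * reduction:.1f}% relative error rate reduction")
```

A small absolute gain in accuracy can therefore correspond to a noticeably larger relative reduction when the baseline error is already low.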
Florida Undergraduate Research Conference
FURC serves as a multi-disciplinary conference through which undergraduate students from the state of Florida can present their research. February 16-17, 2024. https://digitalcommons.unf.edu/university_events/1006/thumbnail.jp
EVA London 2022: Electronic Visualisation and the Arts
The Electronic Visualisation and the Arts London 2022 Conference (EVA London 2022) is co-sponsored by the Computer Arts Society (CAS) and BCS, the Chartered Institute for IT, of which the CAS is a Specialist Group. Of course, this has been a difficult time for all conferences, with the Covid-19 pandemic. For the first time since 2019, the EVA London 2022 Conference is a physical conference. It is also an online conference, as it was in the previous two years. We continue with publishing the proceedings, both online, with open access via ScienceOpen, and also in our traditional printed form, for the second year in full colour. Over recent decades, the EVA London Conference on Electronic Visualisation and the Arts has established itself as one of the United Kingdom’s most innovative and interdisciplinary conferences. It brings together a wide range of research domains to celebrate a diverse set of interests, with a specialised focus on visualisation. The long and short papers in this volume cover varied topics concerning the arts, visualisations, and IT, including 3D graphics, animation, artificial intelligence, creativity, culture, design, digital art, ethics, heritage, literature, museums, music, philosophy, politics, publishing, social media, and virtual reality, as well as other related interdisciplinary areas.
The EVA London 2022 proceedings presents a wide spectrum of papers, demonstrations, Research Workshop contributions, other workshops, and for the seventh year, the EVA London Symposium, in the form of an opening morning session, with three invited contributors. The conference includes a number of other associated evening events including ones organised by the Computer Arts Society, Art in Flux, and EVA International. As in previous years, there are Research Workshop contributions in this volume, aimed at encouraging participation by postgraduate students and early-career artists, accepted either through the peer-review process or directly by the Research Workshop chair. The Research Workshop contributors are offered bursaries to aid participation. In particular, EVA London liaises with Art in Flux, a London-based group of digital artists. The EVA London 2022 proceedings includes long papers and short “poster” papers from international researchers inside and outside academia, from graduate artists, PhD students, industry professionals, established scholars, and senior researchers, who value EVA London for its interdisciplinary community. The conference also features keynote talks. A special feature this year is support for Ukrainian culture after its invasion earlier in the year. This publication has resulted from a selective peer review process, fitting as many excellent submissions as possible into the proceedings.
This year, submission numbers were lower than in previous years, most likely due to the pandemic and a new requirement to submit drafts of long papers for review as well as abstracts. It is still pleasing to have so many good proposals from which to select the papers that have been included. EVA London is part of a larger network of EVA international conferences. EVA events have been held in Athens, Beijing, Berlin, Brussels, California, Cambridge (both UK and USA), Canberra, Copenhagen, Dallas, Delhi, Edinburgh, Florence, Gifu (Japan), Glasgow, Harvard, Jerusalem, Kiev, Laval, London, Madrid, Montreal, Moscow, New York, Paris, Prague, St Petersburg, Thessaloniki, and Warsaw. Further venues for EVA conferences are very much encouraged by the EVA community. As noted earlier, this volume is a record of accepted submissions to EVA London 2022. Associated online presentations are in general recorded and made available online after the conference.
An Investigation into the Performance of Ethnicity Verification Between Humans and Machine Learning Algorithms
There has been a significant increase in interest in the task of classifying
demographic profiles, i.e. race and ethnicity. Ethnicity is a significant human
characteristic and applying facial image data for the discrimination of ethnicity is
integral to face-related biometric systems. Given the diversity in the application
of ethnicity-specific information such as face recognition and iris recognition, and
the availability of image datasets for more commonly represented human
populations, i.e. Caucasian, African-American, Asian, and South-Asian Indian,
a gap has been identified for the development of a system which analyses the
full-face and its individual feature-components (eyes, nose and mouth), for the
Pakistani ethnic group. An efficient system is proposed for the verification of the
Pakistani ethnicity, which incorporates a two-tier (computer vs human) approach.
Firstly, hand-crafted features were used to ascertain the descriptive nature of a
frontal-image and facial profile, for the Pakistani ethnicity. A total of 26 facial
landmarks were selected (16 frontal and 10 for the profile), and 2 models for
redundant information removal were incorporated, together with a linear classifier
for the binary task. The experimental results concluded that the facial profile image of a
Pakistani face is distinct amongst other ethnicities. However, the methodology
had limitations, for example low performance accuracy, the laborious
nature of manual data annotation (i.e. of facial landmarks), and the small facial image
dataset. To make the system more accurate and robust, Deep Learning models
are employed for ethnicity classification. Various state-of-the-art Deep models
are trained on a range of facial image conditions, i.e. full face and partial-face
images, plus standalone feature components such as the nose and mouth. Since
ethnicity is pertinent to the research, a novel facial image database entitled
Pakistani Face Database (PFDB), was created using a criterion-specific selection
process, to ensure confidence in each of the assigned class-memberships, i.e.
Pakistani and Non-Pakistani. Comparative analysis between 6 Deep Learning
models was carried out on augmented image datasets, and the analysis
demonstrates that Deep Learning yields better performance accuracy compared
to low-level features. The human phase of the ethnicity classification framework
tested the discrimination ability of novice Pakistani and Non-Pakistani
participants, using a computerised ethnicity task. The results suggest that
humans are better at discriminating between Pakistani and Non-Pakistani full
face images, relative to individual face-feature components (eyes, nose, mouth),
struggling the most with the nose, when making judgements of ethnicity. To
understand the effects of display conditions on ethnicity discrimination accuracy,
two conditions were tested: (i) Two-Alternative Forced Choice (2-AFC) and (ii)
Single image procedure. The results concluded that participants perform
significantly better in trials where the target (Pakistani) image is shown alongside
a distractor (Non-Pakistani) image. To conclude the proposed framework,
directions for future study are suggested to advance the current understanding of
image-based ethnicity verification. Acumé Forensi
Data Mining
The availability of big data due to computerization and automation has generated an urgent need for new techniques to analyze and convert big data into useful information and knowledge. Data mining is a promising and leading-edge technology for mining large volumes of data, looking for hidden information, and aiding knowledge discovery. It can be used for characterization, classification, discrimination, anomaly detection, association, clustering, trend or evolution prediction, and much more in fields such as science, medicine, economics, engineering, computers, and even business analytics. This book presents basic concepts, ideas, and research in data mining.
AN ENHANCEMENT ON TARGETED PHISHING ATTACKS IN THE STATE OF QATAR
The latest report by Kaspersky on Spam and Phishing listed Qatar as one of the top 10 countries by percentage of email phishing and targeted phishing attacks. Since the Qatari economy has grown exponentially and become increasingly global in nature, email phishing and targeted phishing attacks have the capacity to be devastating to the Qatari economy, yet no adequate measures, such as awareness training programmes, have been put in place to minimise these threats to the state of Qatar. Therefore, this research aims to explore targeted attacks in specific organisations in the state of Qatar by presenting a new technique to prevent targeted attacks. This novel enterprise-wide email phishing detection system has been used by organisations and individuals not only in the state of Qatar but also in organisations in the UK. The detection system is based on domain names, since attackers carefully register domain names that victims trust. The results show that this detection system has proven its ability to reduce email phishing attacks. Moreover, the research aims to develop email phishing awareness training techniques specifically designed for the state of Qatar to complement the presented technique, in order to increase email phishing awareness, focused on targeted attacks and their content, and to reduce the impact of phishing email attacks. This part of the research was carried out by developing an interactive email phishing awareness training website that has been tested by organisations in the state of Qatar. The results show that the training programme was effective in training users to spot email phishing and targeted attacks.
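The abstract does not describe how the domain-name check works, so the following is only a sketch of one plausible approach: flagging sender domains that are close to, but not identical to, a trusted domain, using Levenshtein edit distance. The technique, the threshold, the function names and the example domains are all assumptions for illustration and are not taken from the thesis.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def sender_domain(address):
    """Extract the lower-cased domain part of an email address."""
    return address.rsplit("@", 1)[-1].strip().lower()

def classify_sender(address, trusted_domains, max_distance=2):
    """Label a sender as trusted, a suspicious lookalike, or unknown.

    max_distance is a hypothetical threshold; a deployed system would tune it.
    """
    domain = sender_domain(address)
    if domain in trusted_domains:
        return "trusted"
    if any(edit_distance(domain, t) <= max_distance for t in trusted_domains):
        return "suspicious-lookalike"
    return "unknown"

if __name__ == "__main__":
    trusted = {"example.com"}  # hypothetical organisation domain
    for sender in ("ceo@example.com", "ceo@examp1e.com", "news@unrelated.org"):
        print(sender, classify_sender(sender, trusted))
```

Edit distance catches single-character swaps such as `l` replaced by `1`, but not homoglyphs from other scripts or lookalikes built on a different top-level domain, which would need additional rules.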
Analysing formal visual elements of corporate logotypes using computational aesthetics
The marketing mix contains a significant proportion of elements that derive their appeal and effectiveness from visuals. This thesis proposes the application of quantitative measures from the literature on computational aesthetics to evaluate and study the formal characteristics of corporate visuals in the form of logotypes (logos). It is argued that the proposed approach has a number of advantages in terms of efficiency, consistency and accuracy over existing approaches in marketing that rely on subjective assessments. The proposed approach is grounded in a critical review of a diverse literature that encompasses Marketing, Art History and Philosophy, and Visual Science and Psychology. The computational aesthetic measures are framed within the constructs of Henderson and Cote (1998) and van der Lans et al. (2009), in order to analyse brand logo design elements along with their effect on consumers. The thesis is underpinned by three empirical studies.
The first study uses an extensive set of 107 computational aesthetic measures to quantify the design elements in a sample of 215 professionally designed logotypes drawn from the World Intellectual Property Organization Global Brand Database. The study uses for the first time an array of different measures for evaluating design elements related to colour that include hue, saturation, and colourfulness. The metrics capture both global design features of logos along with features related to visual segments. The metrics are linked to logo elaborateness, naturalness and harmony, using the theoretical framework of Henderson and Cote (1998). The results show that measures have a very diverse behaviour across metrics and typically follow highly non-normal distributions. Factor analysis indicates that the categorisation of the measurements in three factors is a reasonable representation of the data with some correspondence to the dimensions of elaborateness, naturalness and harmony.
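To give a sense of what colour-related measures of this kind can look like, the sketch below computes a mean hue, a mean saturation and a colourfulness score for a list of RGB pixels. The abstract does not list the formulas behind its 107 measures, so the choices here are assumptions: hue and saturation come from the HSV model, and colourfulness uses the opponent-colour metric of Hasler and Süsstrunk. The function name and pixel format are hypothetical.

```python
import colorsys
import math
from statistics import mean, pstdev

def colour_measures(pixels):
    """Global colour measures for an iterable of (r, g, b) tuples in 0..255."""
    pixels = list(pixels)
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]

    mean_saturation = mean([s for _, s, _ in hsv])

    # Hue is an angle, so it is averaged on the unit circle rather than linearly.
    x = mean([math.cos(2 * math.pi * h) for h, _, _ in hsv])
    y = mean([math.sin(2 * math.pi * h) for h, _, _ in hsv])
    mean_hue_deg = math.degrees(math.atan2(y, x)) % 360

    # Colourfulness: spread plus 0.3 times the offset of the opponent channels.
    rg = [r - g for r, g, _ in pixels]
    yb = [0.5 * (r + g) - b for r, g, b in pixels]
    spread = math.hypot(pstdev(rg), pstdev(yb))
    offset = math.hypot(mean(rg), mean(yb))
    colourfulness = spread + 0.3 * offset

    return {
        "mean_hue_deg": mean_hue_deg,
        "mean_saturation": mean_saturation,
        "colourfulness": colourfulness,
    }

if __name__ == "__main__":
    # Two tiny hypothetical "logos": flat grey versus a red and blue mix.
    print(colour_measures([(128, 128, 128)] * 4))
    print(colour_measures([(255, 0, 0), (0, 0, 255)] * 2))
```

The same function could be applied to a segment of a logo instead of the whole image to obtain segment-level counterparts of these global features.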
The second study demonstrates that the proposed computational aesthetic measures can be used to approximate the subjective evaluation of logo designs provided by experts. Specifically, eight design elements for the sample of 215 logos, corresponding to harmony, elaborateness and naturalness, are evaluated by three experts. The results show for the first time that computational aesthetic measures related to colour along with other measures are useful in approximating subjective expert reviews. Unlike previous literature, this research combines both standard statistical methods for modelling and inference, along with more recent techniques from machine learning. Linear regression analysis suggests that the objective computational measures contain useful information for predicting proxy subjective expert reviews for logos. Model accuracy is substantially improved using neural network regression analysis based on Radial Basis Functions.
The last study examines the role of consumer personality traits as moderators of the effect of perceived logo dynamism on consumer attitude towards the logo. One hundred and twenty-two participants were asked to evaluate elements of logo design (visual appearance, complexity, informativeness, familiarity, novelty, dynamism and engagement), their attitude towards the brand and their personality traits (sensation seeking, risk taking propensity, nostalgia and need for cognition). The estimates extracted were shown to vary significantly in terms of central tendency and dispersion and mostly follow non-normal distributions. Following Cian et al. (2014) the moderated mediator model by Preacher and Hayes (2008) is applied to test the suitability of personality traits as moderators of the effect of logo dynamism on attitudes towards the logo. The personality traits used as moderators are Need for Cognition and Risk-Taking Propensity, whereas Engagement was used as a Mediator. This is the first study to employ personality traits as moderators in such a study using this methodology. The results offer limited support of the role of personality traits as moderators in this relationship. Therefore, the study strengthens the case for the development of objective measures of visual characteristics.
The working hypothesis in the thesis is that, with the help of computational aesthetic measures, marketing visuals such as corporate logos lend themselves to a consistent quantitative approach, which can prove to be important for researchers and practitioners alike. By being able to group and measure the aesthetic differences, similarities and emerging patterns, access is gained to a new family of metrics, which can be applied to any type of logo across time, product, industry or culture.