Nguyen Le Thanh, James W. Beauchamp, Hyunhwan “Aiden” Lee, Gang Ren, Joseph Johnson, and Mitsunori Ogihara, “Multi-Scale Auralization for Multimedia Analytical Feature Interaction”
Presented at the Audio Engineering Society (AES) 147th Pro Audio Convention, New York, NY, 2019.
Modern human-computer interaction systems use multiple perceptual dimensions to enhance the user's intuition and efficiency by improving situational awareness. We propose a signal processing and interaction framework that auralizes signal patterns to augment the visualization-focused tasks of social media content analysis and annotation, with the goal of assisting the user in analyzing, retrieving, and organizing information relevant to marketing research. In this auralization framework, audio signals are generated from video/audio signal patterns, for example, by frequency-modulating an audio tone to follow the magnitude contour of video color saturation. Integrating visual and aural presentations benefits user interaction by reducing fatigue and sharpening the user's sensitivity, thereby improving work efficiency, confidence, and satisfaction.
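The frequency-modulation mapping described above can be sketched in a few lines: a normalized feature contour (e.g., video color saturation over time) drives the instantaneous frequency of a synthesized tone. This is a minimal illustration, not the paper's implementation; the sample rate, base frequency, and frequency range are assumed values chosen for the example.

```python
import numpy as np

def auralize_contour(contour, sr=8000, dur_per_step=0.05,
                     f_base=440.0, f_range=440.0):
    """Map a normalized feature contour (values in [0, 1]) to an audio
    tone whose instantaneous frequency follows the contour magnitude."""
    contour = np.asarray(contour, dtype=float)
    # Upsample the per-frame contour to audio rate by linear interpolation.
    n_samples = int(len(contour) * dur_per_step * sr)
    t = np.linspace(0, len(contour) - 1, n_samples)
    env = np.interp(t, np.arange(len(contour)), contour)
    # Instantaneous frequency tracks the contour magnitude.
    freq = f_base + f_range * env
    # Integrate frequency to phase so frequency changes stay click-free.
    phase = 2 * np.pi * np.cumsum(freq) / sr
    return np.sin(phase)

# Example: a rising color-saturation contour yields a rising tone.
saturation = [0.0, 0.2, 0.5, 0.8, 1.0]
tone = auralize_contour(saturation)
```

Integrating frequency to obtain phase, rather than evaluating `sin(2π·f·t)` directly with a time-varying `f`, keeps the waveform continuous when the contour changes quickly.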
Rapid advances in e-commerce and Web 2.0 technologies have greatly increased the impact of commercial advertisements on the general public. As a key enabling technology, a multitude of recommender systems exist that analyze user features and browsing patterns to recommend appealing advertisements to users. In this work, we study the attributes that characterize an effective advertisement and recommend a useful set of features to aid the design and production of commercial advertisements. We analyze the temporal patterns in the multimedia content of advertisement videos, including auditory, visual, and textual components, and study their individual roles and synergies in an advertisement's success. The objective is then to measure the effectiveness of an advertisement and to recommend a useful set of features that help advertisement designers make it more successful and approachable to users. Our proposed framework employs cross-modality feature learning, in which data streams from the different components are used to train separate neural network models that are then fused to learn a shared representation. A neural network classifier trained on this joint feature embedding is subsequently used to predict advertisement effectiveness. We validate our approach using subjective ratings from a dedicated user study, the sentiment strength of online viewer comments, and a viewer opinion metric: the ratio of Likes to Views each advertisement received on an online platform.
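The fusion step above can be sketched as follows: each modality passes through its own encoder, the per-modality embeddings are concatenated into a joint representation, and a classifier head predicts effectiveness from that fused vector. This is a simplified NumPy sketch under assumed dimensions, with random weights standing in for trained networks; it is not the authors' model.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, w):
    """Per-modality encoder: a linear projection with a tanh nonlinearity
    (a stand-in for the separately trained per-modality networks)."""
    return np.tanh(x @ w)

# Hypothetical feature dimensions for each modality stream.
d_audio, d_video, d_text, d_embed = 20, 30, 10, 8
w_audio = rng.normal(size=(d_audio, d_embed))
w_video = rng.normal(size=(d_video, d_embed))
w_text = rng.normal(size=(d_text, d_embed))

def joint_embedding(audio, video, text):
    """Fuse per-modality embeddings into one shared representation by
    concatenation; a classifier is then trained on this joint vector."""
    return np.concatenate([encode(audio, w_audio),
                           encode(video, w_video),
                           encode(text, w_text)])

# Random stand-in features for one advertisement's three streams.
z = joint_embedding(rng.normal(size=d_audio),
                    rng.normal(size=d_video),
                    rng.normal(size=d_text))
# Logistic classifier head on the fused embedding.
w_clf = rng.normal(size=3 * d_embed)
p_effective = 1.0 / (1.0 + np.exp(-(z @ w_clf)))
```

Concatenation is the simplest fusion operator; in practice the joint embedding would itself be learned so that the shared representation captures cross-modal interactions rather than merely stacking the streams.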
Temporal contour shapes are closely linked to the narrative structure of multimedia content and provide important reference points in content-based multimedia timeline analysis. In this paper, a multimedia timeline is extracted from content as time-varying video and audio signal features. A temporal contour representation based on a sequential pattern discovery algorithm is implemented for modeling the variation contours of multimedia features. The proposed contour representation extracts repetitive temporal patterns across a hierarchy of time resolutions and across synchronized video/audio feature dimensions. The statistically significant contour components, which depict the dominant timeline shapes, serve as a structural and analytical representation of the timeline. The modeling performance of the proposed temporal modeling framework is demonstrated through empirical validation and subjective evaluations.
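One simple form of sequential pattern discovery over a feature contour can be sketched as follows: discretize the time series into up/down/flat steps, then count repeated fixed-length symbol n-grams, keeping the frequent ones as dominant contour shapes. This is an illustrative sketch of the general technique, not the paper's algorithm; the pattern length and count threshold are assumed parameters.

```python
from collections import Counter

def contour_symbols(values, eps=1e-9):
    """Discretize a feature time series into up (U) / down (D) / flat (F)
    step symbols between consecutive samples."""
    syms = []
    for a, b in zip(values, values[1:]):
        if b - a > eps:
            syms.append("U")
        elif a - b > eps:
            syms.append("D")
        else:
            syms.append("F")
    return syms

def frequent_patterns(values, length=3, min_count=2):
    """Count repeated fixed-length symbol n-grams; the most frequent ones
    serve as the dominant contour shapes of the timeline."""
    syms = contour_symbols(values)
    grams = ["".join(syms[i:i + length])
             for i in range(len(syms) - length + 1)]
    return [(p, c) for p, c in Counter(grams).most_common()
            if c >= min_count]

# A feature contour with a repeating rise-rise-fall motif.
feature = [0, 1, 2, 1, 2, 3, 2, 3, 4, 3]
patterns = frequent_patterns(feature)  # "UUD" is the dominant shape
```

Running the discretization at several downsampling factors of the same feature sequence gives the multi-resolution hierarchy the abstract describes: short motifs at fine resolution, broader narrative arcs at coarse resolution.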