We developed a video summarization technique for unedited rush video that employs high-level feature fusion to estimate viewer attention and identify segments for inclusion. It aims to capture distinct video events using a variety of features: k-means-based weighting, speech, camera motion, significant differences in HSV color space, and a dynamic time warping (DTW)-based feature that suppresses repeated scenes. The feature functions drive a clustering step that identifies the visually distinct, high-attention segments that constitute the final summary. The optimal weighting of the individual features is obtained with a gradient descent algorithm that maximizes the recall of ground truth events from representative training videos. The system has a lengthy computation time, but the manually judged inclusion of distinct shots indicates high-quality summaries. Reviewers judged the summaries to be relatively easy to view, with an average amount of redundancy.
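As a rough illustration of two ingredients named above, the following Python sketch shows a textbook DTW distance and a weighted fusion of per-segment feature scores. It is not the system's actual implementation: the function names `dtw_distance` and `fuse_features`, the use of one-dimensional segment signatures, and the absolute-difference cost are all assumptions made for the example. The learned weights would come from the gradient descent step, which is not shown.

```python
from typing import List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping distance between two 1-D sequences.

    Assumption: each segment is summarized by a 1-D signature and the
    local cost is the absolute difference between samples.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def fuse_features(
    features: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Weighted sum of feature scores, giving one attention score per segment.

    Assumption: features[k][t] is the score of feature k for segment t,
    and weights[k] is the (hypothetically learned) weight of feature k.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature")
    n_segments = len(features[0])
    return [
        sum(w * f[t] for w, f in zip(weights, features))
        for t in range(n_segments)
    ]
```

Under these assumptions, a small DTW distance between two segments' signatures would mark the later one as a likely retake, which a repeated-scene feature could then down-weight before fusion.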
We extended the video summarization project into research on video annotation through search and mining. This individual project automatically extracts keywords that can be used to tag a video with text, exploiting the redundancy that exists in video data.
CurveSynth is an interactive sound and visual synthesizer developed for a unique environment at the institution. The Allosphere is a three-story spherical space that creates a completely immersive experience, both visually and aurally. CurveSynth allows multiple users to draw lines, shapes, and curves that drive sound and visual interactions. It is a research tool designed to explore human interactivity and art in the Allosphere.