HVR High-performance Video Retrieval

High-performance Video Retrieval (HVR) is a dedicated 1U rack-mounted hardware device, available in two models (HVR-384 and HVR-192) and built on state-of-the-art technology. HVR uses proprietary hardware to achieve high accuracy and performance, which gives it a leading position in the industry.

This product integrates a dedicated SoC (System on Chip) to decode video at multiple resolutions. Image features are computed with a convolutional neural network (CNN), which expresses semantics better than traditional local features such as SIFT.

A graph-based search algorithm for large-scale, high-dimensional data delivers a library capacity of 100 million seconds of video and millisecond-level search speed. The product is a high-performance device that combines software functions with dedicated hardware, and it represents the latest technology direction for high-performance video and image retrieval. The collection and final retrieval of video data are software functions that run on the HVR server. The HVR server and the HVR engine are connected through a 10 GE network link, and they can be clustered and managed together.

Difficulties in High Performance Retrieval
Video is transmitted and stored in encoded form, as compressed data or a coded stream. H.264 and H.265 are the most widely used video compression standards today; they offer good compression ratios and fast decoding.

When a video taken on a mobile phone is transmitted via social media, the sender and the receiver share the same content but may hold it in different data formats. Such videos are called same-source videos. When a raw video stream is encoded and then decoded, the data changes greatly over the cycle, because encoding is lossy compression and introduces signal distortion. When a video is edited, rendered, subtitled, or rescaled, the processed video has a very different code stream while its semantics stay the same. Therefore, video content cannot be judged from the code stream.

Any change to a video, such as changing the resolution, adding subtitles, or adding a logo, causes re-encoding, and re-encoding changes the data. A video can, however, be broken down into a sequence of image frames, so the problem of video retrieval translates into image matching. The general principle of video retrieval is shown in Figure 1.
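The point that the code stream changes while the content stays the same can be illustrated with a small sketch. Everything in it is an assumption made for illustration: pixel quantization stands in for lossy re-encoding, and an intensity histogram stands in for the CNN feature; none of the function names come from HVR.

```python
import hashlib
import math

WIDTH, HEIGHT = 32, 32

def make_frame(pixel_fn):
    """Build a flat list of 8-bit grayscale pixels, row by row."""
    return [pixel_fn(x, y) for y in range(HEIGHT) for x in range(WIDTH)]

def reencode(frame, step=4):
    """Crude stand-in for lossy re-encoding: quantize every pixel."""
    return [(p // step) * step for p in frame]

def feature(frame, bins=16):
    """Toy content feature: intensity histogram (an assumption, not a CNN)."""
    hist = [0] * bins
    for p in frame:
        hist[p * bins // 256] += 1
    return hist

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def digest(frame):
    """Hash of the raw bytes, i.e. a comparison at the data level."""
    return hashlib.sha256(bytes(frame)).hexdigest()

original = make_frame(lambda x, y: 8 * x + 3)         # horizontal gradient
recoded = reencode(original)                           # same picture, new bytes
unrelated = make_frame(lambda x, y: 8 * (x % 4) + 3)   # different, darker picture

same_bytes = digest(original) == digest(recoded)
sim_recoded = cosine(feature(original), feature(recoded))
sim_unrelated = cosine(feature(original), feature(unrelated))
print(same_bytes, round(sim_recoded, 3), round(sim_unrelated, 3))
```

In this toy setup the byte-level digests of the original and the quantized frame differ, while their feature similarity stays close to 1 and the unrelated frame scores much lower. That is the behavior a content feature needs for same-source matching.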

A high-definition (1080p) video stream encoded with H.264 has an online data rate of about 6-8 Mbps, while the decoded data volume can reach 720 Mbps. A high-end server running decoding software can decode 4 to 6 channels of HD video simultaneously. Common algorithms for extracting image frame features are SIFT and SURF, which generally keep up with only 2 to 4 channels of 1080p video. As artificial intelligence has advanced rapidly in recent years, convolutional neural networks (CNNs) have replaced SIFT and SURF. The performance of feature retrieval depends on the capacity of the sample library. The traditional k-d tree compares large-scale high-dimensional data very slowly and may sustain fewer than 2 video channels. Performing video decoding, feature extraction, and feature retrieval against a large-scale sample database for more than 8 channels of high-definition video is therefore a great challenge. It is difficult to process 1 Gbps of concurrent video traffic with an Intel server and software alone.
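The gap between encoded and decoded data rates can be checked with a short calculation. The frame rate (30 fps) and pixel format (8-bit 4:2:0, i.e. 12 bits per pixel) are assumptions, since the text does not state them; under those assumptions the result is of the same order as the decoded figure quoted above.

```python
def raw_bitrate_mbps(width, height, fps, bits_per_pixel=12):
    """Uncompressed video bit rate in Mbps (10**6 bits per second)."""
    return width * height * bits_per_pixel * fps / 1e6

# Assumed: 1920x1080, 30 fps, 8-bit 4:2:0 sampling (12 bits per pixel).
decoded = raw_bitrate_mbps(1920, 1080, fps=30)

encoded = 8.0                    # upper end of the 6-8 Mbps range in the text
expansion = decoded / encoded    # how much the data grows after decoding

print(f"decoded: {decoded:.1f} Mbps, expansion: about {expansion:.0f}x")
```

Each decoded channel therefore carries roughly two orders of magnitude more data than its encoded stream, which is why decoding and feature extraction dominate the processing cost.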

• Searching Video by Clip
Given a video clip (i.e., a query video), HVR locates it in the pre-collected video library and returns the five video locations with the highest matching degree.
• Searching Video by Frame
Given a video frame image, HVR locates the query frame in the pre-collected video library and returns the five positions with the highest matching degree.
• Intelligent Framing
A video consists of several shots, and each shot consists of a successive set of frames. The key frame is the first frame of a shot and is valuable for video recognition. HVR can automatically identify and extract key frames.
• Discovering Duplicate Videos
Video files with the same content are found across large video libraries, which supports the management of those libraries.
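
The search and framing features above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not HVR's implementation: it uses brute-force cosine comparison instead of the graph-based search, tiny hand-made 3-dimensional vectors instead of CNN features, and a hypothetical similarity threshold for detecting shot changes. All names (`library`, `search_by_frame`, `search_by_clip`, `key_frames`) are invented for the example.

```python
import heapq
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search_by_frame(library, query, k=5):
    """Return the k best (score, video_id, position) matches for one frame."""
    scored = (
        (cosine(query, feat), video_id, pos)
        for video_id, feats in library.items()
        for pos, feat in enumerate(feats)
    )
    return heapq.nlargest(k, scored)

def search_by_clip(library, clip, k=5):
    """Slide the clip over every video; the score is the mean frame similarity."""
    scored = []
    for video_id, feats in library.items():
        for start in range(len(feats) - len(clip) + 1):
            window = feats[start:start + len(clip)]
            score = sum(cosine(q, f) for q, f in zip(clip, window)) / len(clip)
            scored.append((score, video_id, start))
    return heapq.nlargest(k, scored)

def key_frames(feats, threshold=0.8):
    """First frame of each shot; a new shot starts where similarity drops."""
    keys = [0] if feats else []
    for i in range(1, len(feats)):
        if cosine(feats[i - 1], feats[i]) < threshold:
            keys.append(i)
    return keys

# Hypothetical library: one feature vector per sampled frame of each video.
library = {
    "news": [[1, 0, 0], [0.9, 0.1, 0], [0, 1, 0], [0, 0.9, 0.1]],
    "sports": [[0, 0, 1], [0.1, 0, 0.9], [0.5, 0.5, 0], [1, 0, 0.1]],
}

frame_hits = search_by_frame(library, [0.02, 0.98, 0.0])
clip_hits = search_by_clip(library, [[0.02, 0.98, 0.0], [0.0, 0.88, 0.12]])
news_keys = key_frames(library["news"])

print(frame_hits[0][1:], clip_hits[0][1:], news_keys)
```

Both queries are slightly perturbed copies of frames in the "news" video, so the top hit in each case is that video at position 2. The exhaustive scan costs time proportional to the library size, which is the part a graph-based index is meant to avoid at large scale.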