High-performance video retrieval (HVR) is a dedicated 1U rack-mounted hardware device
available in two models (HVR-384 and HVR-192) and built with state-of-the-art technologies.
HVR uses proprietary hardware to achieve accuracy and performance, and it holds a
leading position in the industry.
This product integrates a dedicated SoC (System on Chip) to decode video at
multiple resolutions. Image features are computed with the latest convolutional
neural network (CNN) technology, which offers better semantic expression than
traditional local features such as SIFT.
A search algorithm based on graph theory, designed for large-scale high-dimensional
data, delivers a library capacity of 100 million seconds of video and millisecond-level
search speed. This product is a high-performance device that combines software functions
with dedicated hardware, leading the latest technology direction for high-performance
video and image retrieval. The collection and final retrieval of video data are software
functions that run on the server. The HVR server and the HVR engine are connected
through a 10 GE network link, and they can be clustered and managed.
Difficulties in High Performance Retrieval
Video is transmitted and stored as compressed data, also called a coded stream.
H.264 and H.265 are widely used today as video compression standards; they offer
good compression ratios and fast decoding.
When a video taken on a mobile phone is transmitted via social media, the sender's and
receiver's copies share the same content but may be in different data formats; such
copies are called same-source videos. When a raw video stream is encoded and then
decoded, the data changes greatly after the cycle, because encoding is lossy
compression and introduces signal distortion. When a video is edited, rendered,
subtitled, or rescaled to another resolution, the processed video has a very different
code stream while its semantics stay the same. Therefore, video content cannot be
judged based on the code stream.
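The point above can be illustrated with a minimal sketch. It assumes a hypothetical 4x4 grayscale frame, simulates a lossy encode/decode cycle by coarse quantization, and uses a simple average hash as a stand-in for a content feature. None of these choices come from HVR itself, which computes CNN features; the sketch only shows that a byte-level fingerprint changes while a content-level one can stay stable.

```python
import hashlib


def average_hash(pixels):
    """Tiny perceptual hash: one bit per pixel, set when the pixel is above the mean."""
    mean = sum(pixels) / len(pixels)
    return tuple(1 if p > mean else 0 for p in pixels)


def hamming(a, b):
    """Number of positions at which two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 4x4 grayscale frame, flattened row by row (illustrative values only).
original = [12, 200, 15, 210, 198, 10, 205, 18,
            14, 202, 11, 207, 199, 16, 203, 13]

# Assumption: coarse quantization (step of 8) stands in for lossy re-encoding.
reencoded = [(p // 8) * 8 for p in original]

# Byte-level fingerprints of the two versions.
byte_hash_original = hashlib.sha256(bytes(original)).hexdigest()
byte_hash_reencoded = hashlib.sha256(bytes(reencoded)).hexdigest()

# Content-level comparison of the two versions.
content_distance = hamming(average_hash(original), average_hash(reencoded))

print("byte hashes equal:", byte_hash_original == byte_hash_reencoded)
print("content distance:", content_distance)
```

The byte hashes differ because every quantized pixel value changes the input to SHA-256, while the bright/dark pattern that the average hash captures is unchanged.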
Any change to the video, such as changing the resolution, adding subtitles, or
adding a logo, causes re-encoding, and re-encoding changes the data.
A video can be broken down into a sequence of image frames. The problem of video
retrieval translates into image matching. The general principle of video retrieval is
shown in Figure 1.
A high-definition (1080p) video stream encoded with H.264 has an online data
rate of about 6 to 8 Mbps, while the decoded data volume can reach 720 Mbps. A
high-end server running a software decoder can decode 4 to 6 channels of HD video
simultaneously. Common algorithms for extracting image frame features are SIFT and
SURF, which generally sustain only 2 to 4 channels of 1080p video. Artificial
intelligence technology has advanced rapidly, and in recent years convolutional
neural networks (CNNs) have replaced SIFT and SURF. The performance of feature
retrieval depends on the capacity of the sample library. The traditional k-d tree
compares large-scale high-dimensional data very slowly, and its throughput may fall
below 2 video channels. Performing video decoding, feature extraction, and feature
retrieval against a large-scale sample database for more than 8 channels of
high-definition video is therefore a great challenge. It is difficult to process
1 Gbps of concurrent video traffic with software on an Intel server.
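The data volumes mentioned above can be checked with rough arithmetic. The sketch below assumes YUV 4:2:0 pixels (12 bits per pixel) and 30 frames per second; the text states neither, so the result is only expected to land in the same range as the quoted 720 Mbps figure, not to match it exactly.

```python
def raw_bitrate_mbps(width, height, fps, bits_per_pixel=12):
    """Uncompressed bitrate in Mbps; 12 bits per pixel corresponds to YUV 4:2:0."""
    return width * height * bits_per_pixel * fps / 1_000_000


# Assumption: 1080p at 30 fps in YUV 4:2:0.
per_channel_mbps = raw_bitrate_mbps(1920, 1080, fps=30)

# Decoded volume for 8 concurrent HD channels, in Gbps.
eight_channels_gbps = 8 * per_channel_mbps / 1000

# Number of compressed streams that fit in 1 Gbps at 8 Mbps and at 6 Mbps each.
streams_at_8_mbps = 1000 // 8
streams_at_6_mbps = 1000 // 6

print(f"decoded 1080p channel: {per_channel_mbps:.1f} Mbps")
print(f"8 decoded channels:    {eight_channels_gbps:.2f} Gbps")
print(f"streams in 1 Gbps:     {streams_at_8_mbps} to {streams_at_6_mbps}")
```

Under these assumptions a single decoded channel is roughly a hundred times larger than its compressed stream, and 1 Gbps of compressed traffic corresponds to well over a hundred HD streams, which is far beyond the 4 to 6 channels a software decoder is said to handle.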
• Searching Video by Clip
Given a video clip (i.e. query video), the location of the query video is determined
in the pre-collected video library, and the five video locations with the highest
matching degree are returned.
• Searching Video by Frame
Given a video frame image, the location of the query frame is determined in the
pre-collected video library, and the five positions with the highest matching degree
are returned.
• Intelligent Framing
A video consists of several shots. Each shot consists of a successive set of frames.
The key frame is the first frame of that shot, which is valuable for video recognition.
HVR can automatically identify and extract key frames.
• Discovering Duplicated Videos
Across large video libraries, video files with the same content are identified to
support the management of those libraries.
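The frame search described in the list above can be sketched as a top-five nearest-neighbour lookup over feature vectors. This is a minimal illustration under stated assumptions: the three-dimensional vectors, the file names, and the `search_by_frame` helper are hypothetical, and the exhaustive cosine-similarity scan is a stand-in for the graph-based search that HVR uses, which is not described in enough detail here to reproduce.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_by_frame(query, library, top_k=5):
    """Return the top_k (score, video_id, second) tuples, best match first."""
    scored = ((cosine(query, feat), vid, sec) for vid, sec, feat in library)
    return heapq.nlargest(top_k, scored)


# Hypothetical library entries: (video_id, position in seconds, feature vector).
library = [
    ("a.mp4", 0, [1.0, 0.0, 0.0]),
    ("a.mp4", 1, [0.9, 0.1, 0.0]),
    ("b.mp4", 0, [0.0, 1.0, 0.0]),
    ("b.mp4", 7, [0.7, 0.7, 0.0]),
    ("c.mp4", 3, [0.0, 0.0, 1.0]),
    ("c.mp4", 4, [0.6, 0.0, 0.8]),
]

# Query feature vector; in practice it would be computed from the query frame.
results = search_by_frame([1.0, 0.0, 0.0], library)

for score, video_id, second in results:
    print(f"{video_id} @ {second}s  similarity={score:.3f}")
```

An exhaustive scan like this grows linearly with library size, which is why a large library needs an indexed search structure instead. Searching by clip could reuse the same lookup per frame and combine the per-frame results, though how HVR combines them is not stated.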