On September 11, Shengshu Technology held a media open day event to unveil the “Subject Consistency” feature, which enables consistent generation of any subject, making video generation more stable and controllable. This feature is currently available to users for free.
Introduction of the Vidu Video Model
Earlier, at the end of April, Shengshu Technology, in collaboration with Tsinghua University, launched the video model Vidu worldwide; it went live and became fully accessible at the end of July.
During the open day, Shengshu Technology CEO Tang Jiayu told reporters, including those from the Daily Economic News, that the “Subject Consistency” feature aims to address the “uncontrollable” nature of current video models, which suffer from weak continuity and random output. Weak continuity means a model cannot guarantee consistent subjects, scenes, and styles across generations, a problem that is especially apparent in complex interactions. Random output means results are unpredictable, often requiring multiple generation attempts, and precise control over details such as camera movement and lighting effects is not yet achievable.
Overcoming Previous Limitations
Previously, the industry relied on an “AI image first, then image-to-video” workflow: storyboard images were created with AI drawing tools to keep the subject consistent at the image level, then converted and edited into video clips.
With the “Subject Consistency” feature, users can upload a single image of any subject to lock in the subject’s appearance, then use descriptive words to switch scenes freely while generating videos with consistent subjects. This feature is not limited to a single object and applies to “any subject,” including people, animals, products, as well as animated characters and fictional subjects.
Source: Eastmoney