Creating marketing videos is far easier than it used to be, with a plethora of tools to help, but it is still time-consuming. Wouldn't it be great if videos could be created automatically to show off key parts of your on-page content?
Google has been experimenting with just that, and a post on their AI blog last month details how.
Google describes URL2Video as “a research prototype pipeline to automatically convert a web page into a short video, given temporal and visual constraints provided by the content owner.”
Using machine learning and computational methods, the prototype leverages the rich assets most business sites commonly have on a webpage. It pulls assets (text, images, or videos) and design styles (including fonts, colors, graphical layouts, and hierarchy), and organises these into a sequence of shots, keeping a look and feel similar to the source page.
It then renders the repurposed materials into a video with a user-specified aspect ratio and duration.
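To make that flow concrete, here is a minimal Python sketch of the extract-compose-render pipeline described above. URL2Video's implementation is not public, so every class, function, and heuristic here (`Asset`, `Style`, `compose_shots`, and so on) is a hypothetical stand-in for illustration, not Google's actual code.

```python
from dataclasses import dataclass

@dataclass
class Asset:
    kind: str        # "text", "image", or "video"
    content: str     # text body or media URL
    priority: int    # position in the page hierarchy (0 = most dominant)

@dataclass
class Style:
    font: str
    colors: list[str]  # dominant colors pulled from the page

@dataclass
class Shot:
    assets: list[Asset]
    style: Style
    duration_s: float

def extract_page(url: str) -> tuple[list[Asset], Style]:
    """Stand-in for the extraction step: pull assets and design styles.

    A real implementation would parse the DOM and computed CSS;
    these hard-coded values just illustrate the shape of the data.
    """
    assets = [
        Asset("text", "Spring Sale - 20% off", priority=0),
        Asset("image", "https://example.com/hero.jpg", priority=1),
        Asset("text", "Free shipping on all orders", priority=2),
    ]
    return assets, Style(font="Roboto", colors=["#1a73e8", "#ffffff"])

def compose_shots(assets: list[Asset], style: Style,
                  total_duration_s: float) -> list[Shot]:
    """Keep only the most dominant assets and split time across shots."""
    dominant = sorted(assets, key=lambda a: a.priority)[:3]
    per_shot = total_duration_s / len(dominant)
    return [Shot([a], style, per_shot) for a in dominant]

def url_to_video(url: str, aspect_ratio: str,
                 total_duration_s: float) -> list[Shot]:
    """High-level flow: extract, compose, then (not shown) render."""
    assets, style = extract_page(url)
    shots = compose_shots(assets, style, total_duration_s)
    # Rendering the shots to an actual video file (e.g. via ffmpeg)
    # is out of scope for this sketch.
    return shots

for shot in url_to_video("https://example.com", "9:16", 12.0):
    print(shot)
```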
Google says two goals are considered when composing a video:
- each video shot should provide concise information, and
- the visual design should be consistent with the source page.
To make the content concise, it presents only dominant elements from a page, such as a headline and a few multimedia assets.
URL2Video makes decisions about both the temporal and spatial arrangement to present the assets in individual shots, and adjusts the presentation timing of assets to make the video more dynamic and engaging.
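As an illustration of those timing adjustments, a rule like the one below could stagger when each asset appears within a shot, so elements enter one after another rather than all at once. The staggering window and function name are assumptions made for this sketch, not the heuristics URL2Video actually uses.

```python
def stagger_timings(shot_duration_s: float, n_assets: int,
                    entry_fraction: float = 0.4) -> list[float]:
    """Spread asset entry times across the opening part of a shot.

    Each asset appears a little after the previous one, which reads
    as more dynamic than showing everything at t=0. The 40% entry
    window is an arbitrary choice for this sketch.
    """
    window = shot_duration_s * entry_fraction
    step = window / max(n_assets - 1, 1)
    return [round(i * step, 2) for i in range(n_assets)]

# e.g. a 4-second shot with three assets: entries at 0.0s, 0.8s, 1.6s
print(stagger_timings(4.0, 3))
```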
The finished video is output in the MPEG-4 container format.
The prototype makes automatic editing decisions, such as deciding on font and color, timing, and content ordering, based on principles and best practice derived from interviews conducted with web and video design experts.
The authoring interface lets the user review the design attributes in each video shot and make changes, such as reordering shots, amending design elements, and adjusting the constraints to generate a new video.
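Building on the earlier hypothetical sketch (and reusing its `url_to_video` definition), a review-and-regenerate round trip might look like this; the interactive review UI itself is out of scope, so only the constraint change and regeneration step are shown.

```python
# After reviewing the first result, the user tightens the duration,
# switches to a landscape aspect ratio, and regenerates the shots.
shots = url_to_video("https://example.com", "16:9", total_duration_s=8.0)

# A manual reorder of the kind the review interface would allow.
shots[0], shots[1] = shots[1], shots[0]
```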
Google's video shows in greater detail how URL2Video works.
What's next?
Google says the current research focused on visual presentation, and they are now working on new techniques for audio tracks and voiceovers.
Their vision of the future is one where creators focus on making high-level decisions, with an ML model interactively suggesting detailed temporal and graphical edits.
For more details about Google’s experimentation and methodology, see: https://ai.googleblog.com/2020/10/experimenting-with-automatic-video.html