Media Optimization Archives - Go Fish Digital

Query Categorization Based On Image Results

Google was recently granted a patent on Query Categorization Based On Image Results.

The patent tells us that: “internet search engines provide information about Internet-accessible resources (such as Web pages, images, text documents, multimedia content)  responsive to a user’s search query by returning, when image searching, a set of image search results in response to the query.”

A search result includes, for example, a Uniform Resource Locator (URL) of an image or a document containing the image and a snippet of information.


Ranking SERPs Using a Scoring Function

The search results can be ranked (that is, ordered) according to scores assigned by a scoring function.

The scoring function ranks the search results according to various signals:

  • Where (and how often) query text appears in document text surrounding an image
  • An image caption or alternative text for the image
  • How common the query terms are in the search results indexed by the search engine.
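
To make the idea concrete, here is a minimal sketch of how text-based signals like these might be combined into a single image-result score. The signal names and weights are illustrative assumptions, not the patent's actual scoring function.

```python
# A hedged sketch: combine simple text-match signals into one relevance score.
# Weights and the IDF-style adjustment are assumptions for illustration.

def score_image_result(query_terms, surrounding_text, caption, alt_text, term_frequency_in_index):
    def match_ratio(terms, text):
        text = text.lower()
        return sum(term.lower() in text for term in terms) / max(len(terms), 1)

    score = 0.0
    score += 0.5 * match_ratio(query_terms, surrounding_text)   # query text near the image
    score += 0.3 * match_ratio(query_terms, caption)            # image caption
    score += 0.2 * match_ratio(query_terms, alt_text)           # alternative text
    # Down-weight very common query terms (a crude frequency adjustment).
    score *= 1.0 / (1.0 + term_frequency_in_index)
    return score


print(score_image_result(
    ["red", "tomato"],
    "A ripe red tomato on the vine.",
    "Red tomato",
    "photo of a red tomato",
    term_frequency_in_index=0.2,
))
```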

In general, the subject matter described in this patent is a method that includes:

  • Obtaining images from the first image results for a first query, where a number of the obtained images are associated with scores and user behavior data that describe user interaction with the obtained images when they appear as search results for the query
  • Selecting a number of the obtained images, each having respective behavior data that satisfies a threshold
  • Associating the selected first images with a number of annotations based on analysis of the selected images’ content

These can optionally include the following features.

The first query can be associated with categories based on the annotations. The query categorization and annotation associations can be stored for future use. Second image results responsive to a second query that is the same as or similar to the first query can then be received.

Each of the second images is associated with a score, and those scores can be modified based on the categories associated with the first query.

One of the query categorizations can indicate that the first query is a single-person query, increasing the scores of second images whose annotations indicate that they contain a single face.

Another query categorization can indicate that the first query is diverse, increasing the scores of second images whose annotations indicate that the set of second images is diverse.

Another categorization can indicate that the first query is a text query, increasing the scores of second images whose annotations indicate that they contain text.

The first query can get provided to a trained classifier to determine a query categorization in the categories.

Analysis of the selected first images’ content can include clustering the first image results to determine an annotation in the annotations. User behavior data can be the number of times users select the image in search results for the first query.

The subject matter described in this patent can be implemented so as to realize the following advantages:

The image result set is analyzed to derive image annotations and a query categorization, and user interaction with image search results can be used to derive categories for queries.

Query Categorization

Query categories can, in turn, improve the relevance, quality, and diversity of image search results.

Query categorization can also get used as part of query processing or in an off-line process.

Query categories can get used to provide automated query suggestions such as “show only images with faces” or “show only clip art.”


Query categorization based on image results
Inventors: Anna Majkowska and Cristian Tapus
Assignee: GOOGLE LLC
US Patent: 11,308,149
Granted: April 19, 2022
Filed: November 3, 2017

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for query categorization based on image results.

In one aspect, a method includes receiving first images from image results responsive to a first query, wherein each of the first images is associated with an order in the image results and respective user behavior data for the image as a search result for the first query, and associating a number of the first images with a plurality of annotations based on analysis of the selected first images’ content.

A System That Uses Query Categorization To Improve The Set Of Results Returned For A Query

A client, such as a web browser or other process executing on a computing device, submits an input query to a search engine, and the search engine returns image search results to the client. In some implementations, a query comprises text such as characters in a character set (e.g., “red tomato”).

In other implementations, a query comprises images, sounds, videos, or combinations of these. Other query types are possible. The search engine can also search for results based on alternate query versions that are equal to, broader than, or more specific than the input query.

The image search results are an ordered or ranked list of documents, or links to them, determined to be responsive to the input query, with the documents determined to be most relevant having the highest rank. A document can be a web page, an image, or another electronic file.

In the case of image search, the search engine determines an image’s relevance based, at least in part, on the following:

  • Image’s content
  • The text surrounding the image
  • Image caption
  • Alternative text for the image

Categories Associated With A Query

In producing the image search results, the search engine in some implementations submits a request for categories associated with the query. The search engine can use the associated categories to re-order the image search results by increasing the rank of image results determined to belong to the related categories.

In some cases, it may also decrease the rank of image results that do not belong to the associated categories, or do both.

The search engine can also use the categories of the results to determine how they should be ranked in the finalized set of results, in combination with or instead of the query category.

A categorizer engine or other process uses the image results retrieved for the query and a user behavior data repository to derive categories for the query. The repository contains user behavior data that indicates the number of times populations of users selected an image result for a given query.

Image selection can be accomplished in various ways, including using the keyboard, a computer mouse or a finger gesture, a voice command, or other methods. User behavior data includes “click data.”

Click Data Indicates How Long A User Views Or “Dwells” On An Image Result

Click data indicates how long a user views or “dwells” on an image result after selecting it in a results list for the query. For example, a long time dwelling on an image (such as greater than one minute), termed a “long click,” can indicate that a user found the image relevant to the user’s query.

A brief period of viewing an image (e.g., less than 30 seconds), termed a “short click,” can be interpreted as a lack of image relevance. Other types of user behavior data are possible.
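
A minimal sketch of that distinction follows. The one-minute and 30-second cutoffs come straight from the examples above; the middle "neutral" band is an assumption, since the patent does not name it.

```python
# Classify a user selection as a "long click" or a "short click" by dwell time.

def classify_click(dwell_seconds: float) -> str:
    if dwell_seconds > 60:
        return "long click"    # suggests the image was relevant
    if dwell_seconds < 30:
        return "short click"   # suggests a lack of relevance
    return "neutral"           # assumed middle band, not named in the patent

for dwell in (5, 45, 120):
    print(dwell, "->", classify_click(dwell))
```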

By way of illustration, user behavior data can be generated by a process that creates a record for result documents selected by users in response to a specific query. Each record can be represented as a tuple <document, query, data> that includes:

  • A query reference indicating the query submitted by users
  • A document reference indicating the document selected by users in response to the query
  • Aggregation of click data (such as a count of each click type) for all users or a subset of all users that selected the document reference in response to the query.

Extensions of this tuple-based approach to user behavior data are possible. For instance, the user behavior data can get extended to include location-specific (such as country or state) or language-specific identifiers.

With such identifiers included, a country-specific tuple would consist of the country from where the user query originated, and a language-specific tuple would consist of the language of the user query.
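
Here is a small sketch of that <document, query, data> tuple, extended with the optional country and language identifiers described above. The field names are illustrative assumptions, not the patent's schema.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ClickRecord:
    document: str                               # reference to the selected document/image
    query: str                                  # the query submitted by users
    data: dict = field(default_factory=dict)    # aggregated click data by click type
    country: Optional[str] = None               # optional country-specific identifier
    language: Optional[str] = None              # optional language-specific identifier

record = ClickRecord(
    document="https://example.com/red-tomato.jpg",
    query="red tomato",
    data={"long_clicks": 140, "short_clicks": 22},
    country="US",
    language="en",
)
print(record)
```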

For simplicity of presentation, the user behavior data associated with documents A-CCC for the query is depicted in a table in the patent as being either a “high,” “med,” or “low” amount of favorable user behavior data (such as user behavior data indicating relevance between the document and the query).

User Behavior Data For A Document

Favorable user behavior data for a document can indicate that the document is selected by users when it is viewed in the results for the query, or that when users view the document after choosing it from the results for the query, they view it for an extended period (suggesting the user finds the document relevant to the query).

The categorizer engine works in conjunction with the search engine using returned results and user behavior data to determine query categories and then re-rank the results before they get returned to the user.

In general, for the query (or an alternate form of the query) specified in the query category request, the categorizer engine analyzes image results for the query to determine if the query belongs to categories. In some implementations, the image results analyzed are those that have been selected by users as a search result for the query a total number of times above a threshold (such as at least ten times).

In other implementations, the categorizer engine analyzes all image results retrieved by the search engine for a given query.

The categorizer engine can also analyze image results for the query where a metric (e.g., the total number of selections or another measure) for the click data is above a threshold.

The image results can be analyzed using computer vision techniques in various ways, either offline or online during the scoring process. Images are annotated with information extracted from their visual content.

Image Annotations

For example, image annotations can be stored in an annotation store. Each analyzed image (e.g., image 1, image 2, etc.) is associated with annotations (e.g., A1, A2, and so on) in an image-to-annotation association.

The annotations can include:

  • The number of faces in the image
  • The size of each face
  • The dominant colors of the image
  • Whether an image contains text or a graph
  • Whether an image is a screenshot

Additionally, each image can be annotated with a fingerprint, which can then be used to determine whether two images are identical or near-identical.

Next, the categorizer engine analyzes image results for a given query and their annotations to determine query categories. Associations of query categories (e.g., C1, C2, and so on) for a given query (such as query 1, query 2, etc.) can be determined in many ways, such as using a simple heuristic or using an automated classifier.

A Simple Query Categorizer Based On A Heuristic

As an example, a simple query categorizer based on a heuristic can get used to determine the desired dominant color for the query (and whether there is one).

The heuristic can be, for example, that if out of the top 20 most often clicked images for the query, at least 70% have a dominant color red, then the query can get categorized as “red query.” For such queries, the search engine can re-order the retrieved results to increase the rank of all images annotated with red as a dominant color.

The same categorization can be used with all other standard colors. An advantage of this approach over analyzing the text of the query is that it works for all languages without the need for translation (for example, it will promote images with a dominant red color for the query “red apple” in any language). It is also more robust (for example, it will not increase the rank of red images for the query “red sea”).
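
The heuristic above translates almost directly into code. A minimal sketch, using the numbers from the example (top 20 most-clicked images, 70% threshold); the data structures are illustrative assumptions.

```python
# If at least 70% of the top 20 most-clicked images for a query share a
# dominant color, label the query with that color category.

def categorize_query_color(images, color="red", top_n=20, threshold=0.7):
    """images: list of dicts like {"clicks": int, "dominant_color": str}."""
    top = sorted(images, key=lambda img: img["clicks"], reverse=True)[:top_n]
    if not top:
        return None
    share = sum(img["dominant_color"] == color for img in top) / len(top)
    return f"{color} query" if share >= threshold else None

images = [{"clicks": 100 - i, "dominant_color": "red" if i % 10 else "green"}
          for i in range(40)]
print(categorize_query_color(images))  # "red query" when >= 70% of the top 20 are red
```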

An Example Categorizer Engine

The categorizer engine can work in an online mode or offline mode in which query category associations get stored ahead of time (e.g., in the table) for use by the search engine during query processing.

The engine receives image results for a given query and provides the image results to image annotators. Each image annotator analyzes the image results and extracts information about the visual content of the image, which is stored as an image annotation for that image.

A Face Image Annotator

By way of illustration, a face image annotator determines how many faces are in an image and the size of each face. Other annotators include:

  • A fingerprint image annotator, which extracts visual image features in a condensed form (a fingerprint) that can then be compared with the fingerprint of another image to determine if the two images are similar
  • A screenshot image annotator, which determines if an image is a screenshot
  • A text image annotator, which determines if an image contains text
  • A graph/chart image annotator, which determines if an image includes graphs or charts (e.g., bar graphs)
  • A dominant color annotator, which determines if an image contains a dominant color

Other image annotators can also get used. For example, several image annotators get described in a paper entitled “Rapid Object Detection Using a Boosted Cascade of Simple Features,” by Viola, P.; Jones, M., Mitsubishi Electric Research Laboratories, TR2004-043 (May 2004).
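
As a rough illustration of what a face annotator could look like, here is a sketch using OpenCV's Haar-cascade detector, which is based on the Viola-Jones approach cited above. The returned annotation format is an assumption for illustration, not the patent's.

```python
import cv2

def annotate_faces(image_path: str) -> dict:
    """Return a simple face annotation: face count and relative face sizes."""
    image = cv2.imread(image_path)
    if image is None:
        raise ValueError(f"Could not read image: {image_path}")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    height, width = gray.shape
    return {
        "face_count": len(faces),
        # face size as a fraction of the image area, one entry per detected face
        "face_sizes": [(w * h) / float(width * height) for (x, y, w, h) in faces],
    }

# Example usage (the path is hypothetical):
# print(annotate_faces("party_photo.jpg"))
```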

 


Next, the categorizer engine analyzes image results for a given query and their annotations to determine query categories. Query categories can be determined using a classifier, and a query classifier can be realized using a machine learning system.

Use of Adaptive Boosting

By way of illustration, AdaBoost, short for Adaptive Boosting, is a machine learning method that can be used with other learning algorithms to improve their performance. AdaBoost can be used to generate a query categorization. (Other learning algorithms are possible.)

AdaBoost invokes a “weak” classifier in a series of rounds. By way of illustration, the single-person query classifier can be based on a machine learning algorithm trained to determine whether a query calls for images of a single person.

Such a query classifier can be trained with data sets comprising a query, a set of feature vectors representing result images for the query with zero or more faces, and the correct categorization for the query (i.e., faces or not). On each round, the query classifier updates a distribution of weights that indicates the importance of examples in the training data set for the classification.

On each round, the weights of incorrectly classified training examples are increased (or the weights of correctly classified training examples are decreased), so the new classifier focuses more on those examples. The resulting trained query classifier can take as input a query and output a probability that the query calls for images containing single persons.
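
A hedged sketch of training such a single-person query classifier with boosting follows. The feature construction (simple aggregates of per-image face counts) and the toy training data are illustrative assumptions, not the patent's actual feature set.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def query_features(face_counts):
    """face_counts: face count per result image for one query."""
    counts = np.array(face_counts, dtype=float)
    return [counts.mean(), (counts == 1).mean(), (counts == 0).mean()]

# Toy training data: face counts per result image, and whether the query
# really was a single-person query (1) or not (0).
X = np.array([
    query_features([1, 1, 1, 2, 1]),   # e.g. a person's name
    query_features([0, 0, 1, 0, 0]),   # e.g. "red tomato"
    query_features([1, 1, 1, 1, 1]),
    query_features([3, 5, 0, 2, 4]),   # e.g. a group or team query
])
y = np.array([1, 0, 1, 0])

model = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)
# Probability that a new query (represented by its result images) is single-person.
print(model.predict_proba([query_features([1, 1, 2, 1, 1])])[0][1])
```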

A diverse/homogeneous query classifier takes as input a query and outputs a probability that the query is for various images. The classifier uses a clustering algorithm to cluster image results according to their fingerprints based on a measure of distance from each other. Each image gets associated with a cluster identifier.

The image cluster identifiers are used to determine the number of clusters, the size of the clusters, and the similarity between clusters formed by images in the result set. For example, this information is used to associate a probability that the query is specific (or duplicate-inviting) or not.
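
A minimal sketch of that idea: cluster image fingerprints and use the size of the largest cluster as a rough diversity signal. The fingerprint vectors, the distance threshold, and the choice of clustering algorithm are illustrative assumptions; the patent does not name a specific algorithm.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def diversity_score(fingerprints, distance_threshold=0.5):
    """Fraction of results outside the largest cluster (near 0 = mostly duplicates)."""
    fingerprints = np.asarray(fingerprints)
    labels = AgglomerativeClustering(
        n_clusters=None, distance_threshold=distance_threshold
    ).fit_predict(fingerprints)
    largest = np.bincount(labels).max()
    return 1.0 - largest / len(fingerprints)

near_duplicates = np.random.RandomState(0).normal(0.0, 0.01, size=(20, 8))
varied = np.random.RandomState(1).normal(0.0, 1.0, size=(20, 8))
print(diversity_score(near_duplicates))  # close to 0: duplicate-heavy result set
print(diversity_score(varied))           # close to 1: diverse result set
```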

Associating Queries With Canonical Meanings And Representations

The query categorization can also be used to associate queries with canonical meanings and representations. For example, if there is a single large cluster or several large clusters, the probability that the query is associated with duplicate image results is high. If there are many smaller clusters, then the likelihood that the query is associated with duplicate image results is low.

Duplicates of images are usually not very useful as they provide no additional information, so they should be demoted as query results. But there are exceptions. For example, if there are many duplicates in the initial results (a few large clusters), the query is specific, and duplicates should not be demoted.

A screenshot/non-screenshot query classifier takes as input a query and outputs a probability that the query calls for images that are screenshots. A text/non-text query classifier takes as input a query and outputs a probability that the query calls for images that contain text.

A graph/non-graph query classifier takes as input a query and outputs a probability that the query calls for images that contain a graph or a chart. A color query classifier takes as input a query and outputs a probability that the query calls for images dominated by a single color. Other query classifiers are possible.

Improving The Relevance Of Image Results Based On Query Categorization

A searcher can interact with the system through a client or other device. For example, the client device can be a computer terminal within a local area network (LAN) or a wide area network (WAN). The client device can be a mobile device (e.g., a mobile phone, a mobile computer, a personal digital assistant, etc.) capable of communicating over a LAN, a WAN, or some other network (e.g., a cellular phone network).

The client device can include a random access memory (RAM) (or other memory and a storage device) and a processor.

The processor is structured to process instructions and data within the system. The processor can be a single-threaded or multi-threaded microprocessor having one or more processing cores. The processor is structured to execute instructions stored in the RAM (or other memory and a storage device included with the client device) to render graphical information for a user interface.

A searcher can connect to the search engine within a server system to submit an input query. The search engine is an image search engine or a generic search engine that can retrieve images and other types of content such as documents (e.g., HTML pages).

When the user submits the input query through an input device attached to a client device, a client-side query is sent over a network and forwarded to the server system as a server-side query. The server system can be one or more server devices in one or more locations. A server device includes a memory device with the search engine loaded therein.

A processor gets structured to process instructions within the device. These instructions can install components of the search engine. The processor can be single-threaded or multi-threaded and include many processing cores. The processor can process instructions stored in the memory related to the search engine and send information to the client device through the network to create a graphical presentation in the user interface of the client device (e.g., search results on a web page displayed in a web browser).

The server-side query gets received by the search engine. The search engine uses the information within the input query (such as query terms) to find relevant documents. The search engine can include an indexing engine that searches a corpus (e.g., web pages on the Internet) to index the documents found in that corpus. The index information for the corpus documents can be stored in an index database.

This index database can be accessed to identify documents related to the user query. Note that an electronic document (which will simply be referred to as a document) does not necessarily correspond to a file. A document can be stored in a part of a file that holds other documents, in a single file dedicated to the document in question, or in many coordinated files. Moreover, a document can be stored in memory without being stored in a file.

The search engine can include a ranking engine to rank the documents related to the input query. The documents’ ranking can be performed using traditional techniques to determine an Information Retrieval (IR) score for indexed documents given a particular query.

Any appropriate method may determine the relevance of a particular document to a specific search term or to other provided information. For example, the general level of back-links to a document containing matches for a search term may be used to infer a document’s relevance.

In particular, if a document is linked to (e.g., is the target of a hyperlink) by many other relevant documents (such as documents containing matches for the search terms), it can be inferred that the target document is particularly relevant. This inference can be made because the authors of the pointing documents presumably point, for the most part, to other documents that are relevant to their audience.

If the pointing documents are themselves the targets of links from other relevant documents, they can be considered more relevant, and the first document can be considered particularly relevant because it is the target of relevant (or even highly relevant) documents.

Such a technique may be the sole determinant of a document’s relevance or one of many determinants. Appropriate methods can also be taken to identify and discount attempts to cast fraudulent votes to drive up the relevance of a page.

To further improve such traditional document ranking techniques, the ranking engine can receive more signals from a rank modifier engine to assist in determining an appropriate ranking for the documents.

In conjunction with the image annotators and query categorization described above, the rank modifier engine provides relevance measures for the documents, which the ranking engine can use to improve the ranking of the search results provided to the user.

The rank modifier engine can perform operations to generate the measures of relevance.

Each image category is considered, and whether an image result’s score increases or decreases depends on whether the image’s visual content (as represented in its image annotations) matches the query categorization.

For example, if the query’s categorization is “single person,” then an image result that gets classified both as a “screenshot” and “single face” would first have its score decreased because of the “screenshot” category. It can then increase its score because of the “single face” category.
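This re-scoring step can be sketched as a small lookup of boosts and penalties. The specific multipliers below are assumptions for illustration; only the direction of the adjustments comes from the example above.

```python
# Nudge an image's score up when its annotations match the query's categories
# and down when they conflict. Boost/penalty values are illustrative assumptions.

ADJUSTMENTS = {
    # (query category, image annotation): multiplier
    ("single person", "single face"): 1.3,
    ("single person", "screenshot"): 0.7,
    ("text query", "contains text"): 1.3,
}

def adjust_score(base_score, query_categories, image_annotations):
    score = base_score
    for category in query_categories:
        for annotation in image_annotations:
            score *= ADJUSTMENTS.get((category, annotation), 1.0)
    return score

# The example from the text: a "single person" query and an image annotated as
# both "screenshot" and "single face" is first demoted, then promoted.
print(adjust_score(1.0, ["single person"], ["screenshot", "single face"]))
```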

The search engine can forward the final, ranked result list within server-side search results through the network. Exiting the network, client-side search results can get received by the client device, where the results can get stored within the RAM and used by the processor to display the results on an output device for the user.

An Information Retrieval System

These components include:

  • Indexing engine
  • Scoring engine
  • Ranking engine
  • Rank modifier engine

The indexing engine functions as described above. The scoring engine generates scores for document results based on many features, including content-based features that link a query to document results and query-independent features that generally indicate the quality of document results.

Content-based features for images include aspects of the document that contains the picture, such as query matches to the document’s title or the image’s caption.


The query-independent features include, for example, aspects of document cross-referencing of the document or the domain, or image dimensions.

Moreover, the particular functions used by the scoring engine can get tuned to adjust the various feature contributions to the final IR score, using automatic or semi-automatic processes.

The ranking engine ranks document results for display to a user based on IR scores received from the scoring engine and signals from the rank modifier engine.

The rank modifier engine provides relevance measures for the documents, which the ranking engine can use to improve the search results’ ranking provided to the user. A tracking component records user behavior information, such as individual user selections of the results presented in the order.

The tracking component can be embedded JavaScript code, included in the web page that presents the ranking, that identifies user selections of individual document results and identifies when the user returns to the results page, thus indicating the amount of time the user spent viewing the selected document result.

The tracking component is a proxy system through which user selections of the document results get routed. The tracking component can also include pre-installed software for the client (such as a toolbar plug-in to the client’s operating system).

Other implementations are also possible, for example, one that uses a feature of a web browser that allows a tag/directive to get included in a page, which requests the browser to connect back to the server with messages about links clicked by the user.

The recorded information gets stored in result selection logs. The recorded information includes log entries that state user interaction with each result document presented for each query submitted.

For each user selection of a result document presented for a query, the log entries state the query (Q), the document (D), the user’s dwell time (T) on the document, the language (L) employed by the user, the country (C) where the user is likely located (e.g., based on the server used to access the IR system), and a region code (R) identifying the metropolitan area of the user.

The log entries also record negative information, such as that a document result gets presented to a user but was not selected.

Other information such as:

  • Positions of clicks (i.e., user selections) in the user interface
  • Information about the session (such as the existence and type of previous clicks, and post-click session activity)
  • IR scores of clicked results
  • IR scores of all results shown before the click
  • Titles and snippets shown to the user before the click
  • User’s cookie
  • Cookie age
  • IP (Internet Protocol) address
  • User-agent of the browser
  • And so on

The time (T) between the initial click-through to the document result and the user returning to the main results page and clicking on another document result (or submitting a new search query) also gets recorded.

An assessment is made about whether this time (T) indicates a longer view of the document or a shorter one, since longer views generally indicate quality or relevance for the clicked-through result. This assessment of time (T) can be made in conjunction with various weighting techniques.

The components shown can be combined in various manners and multiple system configurations. The scoring and ranking engines can be merged into a single ranking engine. The rank modifier engine and the ranking engine can also be merged. In general, a ranking engine includes any software component that generates a ranking of document results after a query. Moreover, a ranking engine can reside in a client system in addition to (or rather than) in a server system.

Another example is the information retrieval system. The server system includes an indexing engine and a scoring/ranking engine.

In this system, a client system includes:

  • A user interface for presenting a ranking
  • A tracking component
  • Result selection logs
  • A ranking/rank modifier engine.

For example, the client system can include a company’s enterprise network and personal computers, in which a browser plug-in incorporates the ranking/rank modifier engine.

When an employee in the company initiates a search on the server system, the scoring/ranking engine can return the search results along with either an initial ranking or the actual IR scores for the results. The browser plug-in then re-ranks the results based on tracked page selections for the company-specific user base.

 

A Technique For Query Categorization

This technique can be performed online (as part of query processing) or in an offline manner.

First image results responsive to the first query are received. Each of the first images is associated with an order (such as an IR score) and respective user behavior data (such as click data).

A number of the first images get selected where a metric for the respective behavior data for each selected image satisfies a threshold.

The selected first images are associated with a number of annotations based on analysis of the selected first images’ content. The image annotations can be persisted in an annotation store.

Categories are then associated with the first query based on the annotations.

The query category associations can be persisted in a query categories store.

Second image results responsive to a second query that is the same as or similar to the first query are then received.

(If the second query is not found in the query categorization, the second query can get transformed or “rewritten” to determine if an alternate form matches a query in the query categorization.)

In this example, the second query is the same as or can be rewritten as the first query.

The second image results are re-ordered based on the query categories previously associated with the first query.
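
Putting the steps above together, here is a compact, hedged sketch of the offline/online split: derive categories for a query from its well-clicked results, store them, then look them up (falling back to a rewritten form of the query) to re-order a later result set. The store, the rewrite rule, and the boost value are illustrative assumptions.

```python
query_category_store = {}          # query -> set of categories (the "query categories store")

def categorize_offline(query, results, click_threshold=10):
    """results: list of dicts with 'clicks' and 'annotations' (a set of strings)."""
    selected = [r for r in results if r["clicks"] >= click_threshold]
    categories = set()
    if selected and all("single face" in r["annotations"] for r in selected):
        categories.add("single person")
    query_category_store[query] = categories

def rerank_online(query, results, rewrite=lambda q: q.lower().strip()):
    categories = query_category_store.get(query) or query_category_store.get(rewrite(query), set())
    boost = lambda r: 1.3 if "single person" in categories and "single face" in r["annotations"] else 1.0
    return sorted(results, key=lambda r: r["score"] * boost(r), reverse=True)

categorize_offline("grace hopper", [
    {"clicks": 40, "annotations": {"single face"}},
    {"clicks": 12, "annotations": {"single face"}},
])
print(rerank_online("Grace Hopper ", [
    {"score": 0.8, "annotations": {"screenshot"}},
    {"score": 0.7, "annotations": {"single face"}},
]))
```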

Query Categorization Based On Image Results is an original blog post first published on Go Fish Digital.

Video Collages With Interesting Moments

Photo Collages and Video Collages

We may see video collages in hardware associated with Google that generates videos. Google Photos has had a collage feature, and I can visit it and see collages of pictures from the same locations all joined together. There is a way of tagging “key moments” from videos using schema markup so that search results in Google can point to key moments from videos (highly recommended). A recent Google patent describes making video collages and refers to “interesting moments” in those videos. It doesn’t tell us the difference between a key moment in one video and interesting moments in video collages of multiple videos.


But it does describe why it might make video collages:

There are currently one billion smartphones in use, with potential for seven times that amount of growth in the future. Smartphones are used for capturing and consuming content, like photos and videos. Videos convey more than photos because they capture temporal variation. But people may be less likely to view videos because not all parts of a video are interesting.

The background section of the patent presents this context.

Generating Video Collages

This patent refers to interesting moments in videos as opposed to key moments in videos. There are many help pages about marking up key moments in videos, but none saying that they point to interesting moments. But they do point to moments that are designated as interesting by the people who post those videos. The Video Collages patent does lay out a framework describing how video collages might get built, filled with interesting moments.

Using Schema To Tag Key Moments in Videos in Search Results

When I came across this patent, I was reminded of the Google Developers post on implementing SeekToAction markup: A new way to enable video key moments in search. In brief, it works like this:

Today, we’re launching a new way for you to enable key moments for videos on your site without the effort of manually labeling each segment. All you have to do is tell Google the URL pattern for skipping to a specific timestamp within your video. Google will then use AI to identify key moments in the video and display links directly to those moments in Search results.

I was also reminded of people asking me questions about “key moments” found on YouTube videos. There is a Google blog post on this topic: Search helps you find key moments in videos. It quickly tells us that:

Starting today, you can find key moments within videos and get to the information you’re looking for faster, with help from content creators.

When you search for things like how-to videos with multiple steps, or long videos like speeches or a documentary, the search will provide links to key moments within the video, based on timestamps provided by content creators.

You’ll easily scan to see whether a video has what you’re looking for and find the relevant section of the content.

For people who use screen readers, this change also makes video content more accessible.

This Google Developers page tells us about those timestamps: Get videos on Google with schema markup
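
For reference, here is a sketch of what SeekToAction markup can look like, built as a Python dict and serialized to JSON-LD. The video URL and the timestamp parameter name are placeholders, and property names should be checked against Google's current documentation before use.

```python
import json

video_markup = {
    "@context": "https://schema.org",
    "@type": "VideoObject",
    "name": "Example how-to video",
    "potentialAction": {
        "@type": "SeekToAction",
        # {seek_to_second_number} marks where Google substitutes a timestamp.
        "target": "https://www.example.com/watch/video123?t={seek_to_second_number}",
        "startOffset-input": "required name=seek_to_second_number",
    },
}

print(json.dumps(video_markup, indent=2))
```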

Implementations of the patent relate to a computer-implemented method to generate a collage. The method includes determining interesting moments in a video. The method further comprises generating video segments based on the interesting moments, where each of the video segments has at least one of the interesting moments from the video. The method further includes generating a collage from the video segments, where the collage comprises at least two windows, and each window contains one of the video segments.

I also came across a Search Engine Land Article on Key Moments in Videos, which tells us that: Google officially launches SeekToAction for key moments for videos in search

I also found this support page on Youtube about audience retention: Measure key moments for audience retention

Key Moments in Videos May be similar to Interesting Moments in Video Collages

The patent provides a lot of information about interesting moments.

Operations of the video collages patent further include receiving a selection of one of the video segments in the collage and causing the portion of the video that corresponds to the selection to be displayed.

Determining the interesting moments in a video includes:

  • Identifying audio in the video
  • Identifying a type of action associated with the audio in the video
  • Generating an interest score for each type of audio in the video
  • Determining the interesting moments based on the interest score for each type of audio in the video

Determining the interesting moments in the video can also include:

  • Identifying continual motion in the video
  • Identifying a type of action associated with the continual motion in the video
  • Generating an interest score for each type of action in the video
  • Determining the interesting moments based on the interest score for each type of action in the video

The video segments in the collage get configured to play automatically. At least a first segment of the video segments in the collage gets configured to play at a different frame rate than other video segments in the collage.

Piecing together the video collages from the video segments includes generating graphical data that renders the collage with video segments in windows of different sizes. The window sizes may be based on the interest scores for the video segments, the length of each video segment, and an artistic effect.
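
One simple way window sizes could follow interest scores is to give each segment a share of the collage area proportional to its score. This layout rule is an illustrative assumption, not the patent's method.

```python
# Allocate collage area to segments in proportion to their interest scores.

def window_areas(interest_scores, collage_area=1.0):
    total = sum(interest_scores)
    return [collage_area * score / total for score in interest_scores]

# Three segments: child laughing, dog running, blowing out the candles.
print(window_areas([0.6, 0.3, 0.9]))
```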

Making Video Collages of Interesting Moments

A computer-implemented method to generate a hierarchical collage includes:

  • Determining interesting moments in a video
  • Generating video segments based on the interesting moments
  • Grouping the video segments into groups
  • Generating first collages, each corresponding to a respective one of the groups and each including at least two video segments
  • Selecting a representative segment for each of the groups from the at least two video segments of each of the first collages
  • Generating a second collage that includes the representative segment for each of the groups, where the representative segment in the second collage links to a corresponding first collage that includes the at least two video segments in the corresponding group
  • Receiving a selection of one of the representative segments in the second collage and causing the corresponding first collage to be displayed

Grouping the video segments into groups can be based on the timing of each of the video segments or on a type of interesting moment associated with each of the video segments. An interest score can be generated for the interesting moments, and selecting the representative segment for each of the groups may be based on that interest score.

A method comprises means for:

  • Determining interesting moments in a video
  • Generating video segments based on the interesting moments, wherein each of the video segments includes at least one of the interesting moments from the video
  • Creating a collage from the video segments, wherein the collage includes at least two windows and wherein each window includes one of the video segments

The system and methods described below solve the problem of identifying interesting moments in a video by generating a collage that includes video segments of the interesting moments.
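
A hedged sketch of the hierarchical collage idea: group segments (here, simply by which third of the video they fall in), pick the highest-scoring segment in each group as its representative, and link each representative back to its group's collage. The data structures are illustrative assumptions.

```python
def build_hierarchical_collage(segments, video_duration):
    """segments: list of dicts with 'start' (seconds) and 'interest' (score)."""
    groups = {0: [], 1: [], 2: []}
    for seg in segments:
        third = min(int(3 * seg["start"] / video_duration), 2)
        groups[third].append(seg)

    first_collages = {g: segs for g, segs in groups.items() if segs}
    second_collage = [
        {"group": g, "representative": max(segs, key=lambda s: s["interest"])}
        for g, segs in first_collages.items()
    ]
    return first_collages, second_collage

segments = [
    {"start": 10, "interest": 0.4},   # child laughing
    {"start": 35, "interest": 0.6},   # dog running after the child
    {"start": 170, "interest": 0.9},  # blowing out the birthday cake
]
first, second = build_hierarchical_collage(segments, video_duration=180)
print(second)  # selecting a representative would open the matching first collage
```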

The Video Collages of Interesting Moments Patent

The Video Collages patent is found at:

Collage of interesting moments in a video
Inventors: Sharadh Ramaswamy, Matthias Grundmann, and Kenneth Conley
Assignee: Google LLC
US Patent: 11,120,835
Granted: September 14, 2021
Filed: December 17, 2018

Abstract

A computer-implemented method includes determining interesting moments in a video. The method further includes generating video segments based on the interesting moments, wherein each of the segments includes at least one of the interesting moments from the video. The method further includes generating a collage from the video segments, where the collage includes at least two windows and wherein each window includes one of the video segments.

The patent tells us that searchers are more likely to view a video if they can preview interesting moments in the video and navigate directly to those interesting moments.

A video application is described here:

  • Finds interesting moments in a video
  • Builds video segments based on the interesting moments
  • Makes a collage from the video segments that include the video segments in a single pane

For example, a video may have a first video segment of a child laughing, a second video segment of a dog running after the child, and a third video segment of the child blowing out a birthday cake.

How Video Collages are Generated

The video application may generate video collages that display short, e.g., two to three seconds long, loops of the first, second, and third video segments. The frame rates of each of the video segments may differ. For example, the first video segment may include a slow-motion video, the second video segment may consist of a fast-motion video, and the third video segment may include a regular-speed video segment.

When a user selects one of the video segments in the collage, the application may cause the video to get displayed that corresponds to the selected part. For example, if the first video segment occurs at 2:03 minutes, user selection causes the video to play at 2:03 minutes.

The video application may generate a hierarchical collage. The video application may determine interesting moments in a video. It might then create video segments based on the interesting moments. It could group the video segments into groups and generate first collages based on the groups. It could then select a representative segment for each group and generate a second collage that includes the representative segment for each group.

The groups may be based on timing or a type of interesting moment associated with each video segment. Continuing with the example above, a first group could include a first video segment of a child laughing, a second video segment of a dog running after the child, and a third video segment of the child blowing out a birthday cake that all occur in the first third of the video.

This video application may also generate an interest score for each video segment and select the representative segment based on the interest score. For example, the third video segment of the child blowing out the birthday cake may have an interest score indicative of the most interesting video segment. As a result, the video application may select the third segment as the representative segment for the first group in the first collage.

When a user selects one of the representative segments in the second collage, the video application may cause the corresponding first collage to be displayed.

An Example Application That Generates Video Collages

This patent is about an application that includes a video server, user devices, a second server, and a network. It looks like it could generate video collages with a variety of hardware devices, and may have been purposefully left wide open for undeveloped hardware.

Users may become associated with respective user devices. The method may include other servers or devices.

The entities of the system get coupled via a network. The network may be conventional: wired or wireless, and may have many different configurations, including a star configuration, token ring configuration, or other configurations. Furthermore, the network may include a local area network (LAN), a wide area network (WAN) (e.g., the Internet), and other interconnected data paths across which many devices may communicate.

The database may store videos created or uploaded by users associated with user devices and collages generated from the videos.

The database may store videos developed independently of the user’s devices.

The database may also store social network data associated with users.

The user device may be a computing device with a memory and a hardware processor, such as a camera, a laptop computer, a desktop computer, a tablet computer, a mobile telephone, a wearable device, or a head-mounted display. The user device could also be a mobile e-mail device, a portable game player, a portable music player, a reader device, a television with processors embedded therein or coupled to it, or another electronic device capable of accessing a network.

The user device gets coupled to the network via a signal line. A signal line may be a wired connection, such as Ethernet, coaxial cable, fiber-optic cable, etc., or a wireless connection, such as Wi-Fi.RTM., Bluetooth.RTM., or other wireless technology. User devices get accessed by Users, respectively.

Examples of User Devices Used to Create Video Collages

The user device can be a mobile device that gets included in a wearable device worn by the user. For example, the user device gets included as part of a clip (e.g., a wristband), part of jewelry, or part of a pair of glasses. In another example, the user device can be a smartwatch. The user may view images from the video application on a display of the device worn by the user. For example, the user may view the pictures on a smartwatch or a smart wristband display.

The video application may be a standalone application that gets stored on the user’s device. The video application may also be stored in part on the user device and in part on the video server. For example, the video application may include a thin-client video application stored on the user device and a video application stored on the video server.

The video application stored on the user device may record video that is transmitted to the video application stored on the video server, where a collage gets generated from the video. The server-side video application may then send the collage back for display on the user device. In another example, the video application stored on the user device may generate the collage and send the collage to the video application stored on the video server. The video application stored on the video server may include the same components or different components as the video application stored on the user device.

The video application may be a standalone application stored on the video server. A user may access the video application via a web page using a browser or other software on the user’s device. For example, the users may upload a video stored on the device or from the second server to the video application to generate a collage.

The second server may include a processor, a memory, and network communication capabilities. The second server is a hardware server. The second server sends and receives data to and from the video server and the user devices via the network.

The second server may provide data to the video application. For example, the second server may be a separate server that generates videos used by the video application to create collages. In another example, the second server may be a social network server that maintains a social network where the collages may get shared by a user with other social network users. In yet another example, the second server may include video processing software that analyzes videos to identify objects, faces, events, a type of action, text, etc. The second server may get associated with the same company that maintains the video server or a different company.

Video Collages with Entity Information Attached

As long as a user consents to use such data, the second server may provide the video application with profile information or images that the video application may use to identify a person in a photo with a corresponding social network profile. In another example, the second server may provide the video application with information related to entities identified in the images used by the video application.

For example, the second server may include an electronic encyclopedia that provides information about landmarks identified in the photos, an electronic shopping website that provides information for purchasing entities identified in the images, an electronic calendar application that provides, subject to user consent, an event name associated with a video, a map application that provides information about a location associated with a video, etc.

The systems and methods discussed herein collect, store, and use user personal information only upon receiving explicit authorization from the relevant users. For example, a user controls whether programs or features collect user information about that particular user or other users relevant to the program or feature. Each user controls what information pertinent to that user gets collected and how that information gets used.

For example, users can be provided with control options. Specific data may be treated in one or more ways before it gets stored or used, so that personally identifiable information is removed. For example, a user’s identity may be treated so that no personally identifiable information can be determined. As another example, a user’s geographic location may be generalized to a larger region so that the user’s particular location cannot be determined.

An Example Computer That Generates Video Collages

The computer may be a video server or a user device.

The computer may include a processor, a memory, a communication unit, a display, and a storage device.

A video application may get stored in the memory.

The video application includes a video processing module, a segmentation module, a collage module, and a user interface module. Other modules and configurations are possible.

The video processing module may be operable to determine interesting moments in a video. The video processing module may be a set of instructions executable by the processor to determine interesting moments in the video. The video processing module may be stored in the computer’s memory and be accessible and executable by the processor.

The video processing module may get stored on a device that is the video server. The video processing module may receive the video from the video application stored on the user device. The video processing module may receive the video from a second server, which stores movies or television shows.

The video processing module determines interesting moments in the video associated with a user. The video processing module may receive labels for interesting moments from the user and identify the interesting moments based on those labels. For example, the user interface module may generate a user interface that includes an option for the user to select frames, for example, by clicking on frames in the video to identify interesting moments. The video processing module may associate metadata with the video that includes time locations for the interesting moments identified by the user. The video processing module may also receive an indication of what constitutes an interesting moment from a user. For example, the user may specify that interesting moments include people in the video saying a particular phrase or speaking on a specific topic.

Video Processing Finding Interesting Moments

The video processing module determines interesting moments by identifying audio in the video. The video processing module may determine the type of audio in the video. For example, the video processing module may classify the audio associated with music, applause, laughter, booing, etc. The video processing module may determine the level of volume of the audio. For example, in a basketball game video, an increase in the audio from cheering and booing may get associated with an interesting moment, such as a basketball player missing a shot.

The video processing module may generate an interest score for each type of audio in the video. For example, the video processing module may generate an interest score that indicates that the moment is interesting based on the start of music or laughter. The video processing module may generate an interest score that indicates the moment is not interesting based on a cough or general background noise. The video processing module may determine the interesting moments based on the interest score for each type of audio in the video.
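
A sketch of that audio-based scoring: certain audio types (music, laughter, applause, cheering) raise a moment's interest score, coughs and background noise lower it, and louder events count for more. The score values and the threshold are illustrative assumptions.

```python
AUDIO_INTEREST = {
    "music": 0.8, "laughter": 0.9, "applause": 0.7, "cheering": 0.7,
    "cough": 0.1, "background_noise": 0.05,
}

def audio_interest(audio_type, volume=0.5):
    base = AUDIO_INTEREST.get(audio_type, 0.3)
    return base * (0.5 + volume)        # louder events are weighted up

def interesting_audio_moments(events, threshold=0.5):
    """events: list of (timestamp_seconds, audio_type, volume)."""
    return [t for t, kind, vol in events if audio_interest(kind, vol) >= threshold]

events = [(12.0, "laughter", 0.8), (40.0, "cough", 0.9), (75.0, "cheering", 1.0)]
print(interesting_audio_moments(events))  # [12.0, 75.0]
```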

The video processing module determines interesting moments by identifying continual motion in the video and identifying a type of action associated with the continual motion in the video. The video processing module may determine motion by classifying pixels in an image frame as background or foreground.

The video processing module may classify all image frames or a subset of image frames of the video.

The video processing module identifies the background and the foreground in a subset of the image frames based on the timing of the image frames. The subset may include a few or all of the intra-coded frames (I-frames) of the video. For example, the video processing module may perform classification on every third frame in the video. In another example, the video processing module may perform classification on a subset of the frames in the video, e.g., only I-frames, or I-frames and a few or all predicted picture frames (P-frames), etc.

Comparing Foreground Motion in Video Segments

The video processing module may compare the foreground in many video image frames to identify foreground motion. For example, the video processing module may use different techniques to identify motion in the foreground, such as frame differencing, adaptive median filtering, and background subtraction. This process advantageously identifies the motion of objects in the foreground. For example, in a video of a person doing a cartwheel outside, the video processing module may ignore movement in the background, such as trees swaying in the wind. Still, the video processing module identifies the person performing the cartwheel because the person is in the foreground.
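
A hedged sketch of foreground-motion detection using background subtraction, one of the techniques named above. OpenCV's MOG2 subtractor stands in for the patent's unspecified implementation; the "interesting" motion threshold and the every-third-frame sampling are assumptions drawn from the surrounding examples.

```python
import cv2

def foreground_motion_frames(video_path, every_nth=3, motion_ratio=0.05):
    """Yield frame indices where a meaningful share of pixels is moving foreground."""
    capture = cv2.VideoCapture(video_path)
    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % every_nth == 0:                 # classify only a subset of frames
            mask = subtractor.apply(frame)         # foreground pixels are non-zero
            ratio = (mask > 0).mean()
            if ratio > motion_ratio:
                yield index
        index += 1
    capture.release()

# Example usage (the path is hypothetical):
# print(list(foreground_motion_frames("cartwheel.mp4")))
```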

The video processing module may also analyze the video to determine the action associated with the continual motion. For example, the video processing module may use a vector based on the continual motion to compare that motion with continual motion in other available videos. The video processing module may use the vector to identify a person walking a dog, punching another person, catching a fish, etc. In another example, the video processing module may perform image recognition to identify objects and types of motion associated with those objects in other past videos to identify the action.

For example, the video processing module identifies a trampoline. It determines that a person is jumping on the trampoline based on trampolines being associated with jumping, a cake being associated with cutting or blowing out a birthday cake, skis being associated with skiing, etc. The video processing module may associate metadata with the video that includes timestamps of each action type. For example, the video processing module may generate metadata that identifies a timestamp of each instance of a person riding a scooter in the video.

Interesting Moments Based on Continual Motion in Videos

Also, the video processing module may determine an interesting moment based on the action associated with the continual motion. For example, the video processing module may determine that a video includes a user riding a skateboard. The video processing module generates an interest score based on the type of action. The video processing module may develop an interest score that corresponds to the act of skateboarding. The video processing module may assign the interest score based on the quality of the action. For example, the video processing module may give an interest score that indicates a more interesting moment when the frames with the movement show:

  • A person with a visible face
  • Edges where the quality of the images is high

These would be based on the visibility of the action, lighting, blur, and the stability of the video.

On user consent, the video processing module may generate the interest score based on user preferences. For example, if a user has expressed an interest in skateboarding, the video processing module generates an interest score that indicates that the user finds skateboarding to be enjoyable. The user provides explicit interests that the video processing module adds to a user profile associated with the user. When the user provides consent to the analysis of implicit behavior, the video processing module determines types of actions to add to the user profile based on implicit behavior, such as providing indications of approval for media associated with types of activities.

Object Recognition on Objects in Video Collages

The video processing module may perform object recognition to identify objects in the video. Upon user consent, the video processing module may perform object recognition that includes identifying a face in the video and determining an identity of the face. The video processing module may compare an image frame of the face to images of people, match the image frame to other members that use the video application, etc. Upon user consent, the video processing module may request identifying information from the second server.

For example, the second server may maintain a social network. The video processing module may request profile images of other social network users connected to the user associated with the video. Upon user consent, the video processing module may apply facial recognition techniques to people in image frames of the video to identify the people associated with the faces.

The video processing module may generate metadata that includes identifiers for the objects and timestamps of when the objects appear in the video. For example, the metadata may consist of labels that identify a type of object or person. If the user has provided consent, the video processing module may generate metadata that identifies people and timestamps of when the people appear in the video. For example, for a video of the user’s daughter, the video processing module may generate metadata that identifies each time the daughter appears in the video, with timestamps, and identifies objects that the daughter interacts with within the video.

The video processing module generates an interest score for identifying a type of object or a person in the video. It may compare the identified objects to a list of positive objects and a list of negative objects, which contain objects that are commonly recognized as positive and negative, respectively.
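
As a rough illustration of that positive/negative comparison, the sketch below scores a set of recognized labels against two hypothetical lists; the lists and the score adjustments are assumptions, not values from the patent.

```python
# A minimal sketch of scoring recognized objects against assumed lists of
# positive and negative objects.

POSITIVE_OBJECTS = {"cake", "trampoline", "skateboard", "dog"}
NEGATIVE_OBJECTS = {"trash_can", "blank_wall"}

def object_interest_score(labels: set) -> float:
    """Start from a neutral baseline and nudge the score per matched object."""
    score = 0.5
    score += 0.1 * len(labels & POSITIVE_OBJECTS)
    score -= 0.1 * len(labels & NEGATIVE_OBJECTS)
    return max(0.0, min(1.0, score))


print(object_interest_score({"cake", "dog", "blank_wall"}))  # about 0.6
```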

When the user consents to the use of user data, the video processing module assigns the interest score based on personalization information for a user associated with the video. For example, upon user consent, the video processing module maintains a social graph and generates the interest score based on a relationship between the user and a person in the video, as identified using the social graph.

Personalization and User’s Reactions to Videos

The video processing module may determine personalization information, subject to user consent, based on explicit data provided by the user and on implicit information from the user’s reactions to videos, such as comments provided on video websites, activity in social network applications, etc. The video processing module determines user preferences based on the types of videos associated with the user. For example, the video processing module may determine that the user prefers videos about sports based on the user creating or watching videos that include different types of sports, such as baseball, basketball, etc.

The video processing module may determine an event associated with the video. The video processing module may determine the event based on metadata associated with the video. For example, the metadata may include a date and a location associated with the video. The video processing module may use the date and the location to retrieve information, for example, from a second server, about what event occurred at that date and location. When the user provides consent to the use of metadata, the video processing module may use metadata that identifies objects and people in the video to determine the event.

For example, the video processing module may determine that the event was a concert based on identifying crowds of people in the video. Particular objects may get associated with specific events: cakes get associated with birthdays and weddings, a basketball gets associated with a game on a court, etc. People may also get associated with events, such as people wearing uniforms during school hours with a school event, people sitting in pews with a church gathering, people around a table with plates with a dinner, etc. The video processing module may generate an interest score based on the type of event identified in the video.

The video processing module may use additional sources of data to identify the event. For example, the video processing module may determine the date, the time, and the location where the video got taken based on metadata associated with the video and, upon user consent, request event information associated with that date and time from a calendar application associated with the user. The video processing module may request the event information from a second server that manages the calendar application.
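
The calendar lookup described above could be sketched as follows, assuming user consent and a hypothetical calendar data source; the field names and matching logic are illustrative assumptions rather than the patent's method.

```python
# A hedged sketch of matching a video's capture metadata against calendar
# events. The calendar entries here are hypothetical placeholders.

from datetime import datetime
from typing import Optional

def determine_event(video_meta: dict, calendar_events: list) -> Optional[str]:
    """Return the title of a calendar event that overlaps the video's capture
    time and matches its location, if one exists."""
    captured_at = datetime.fromisoformat(video_meta["captured_at"])
    location = video_meta.get("location")
    for event in calendar_events:
        start = datetime.fromisoformat(event["start"])
        end = datetime.fromisoformat(event["end"])
        if start <= captured_at <= end and event.get("location") == location:
            return event["title"]
    return None


video_meta = {"captured_at": "2016-10-08T19:30:00", "location": "City Stadium"}
calendar = [{"title": "Football game", "start": "2016-10-08T19:00:00",
             "end": "2016-10-08T22:00:00", "location": "City Stadium"}]
print(determine_event(video_meta, calendar))  # Football game
```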

Events From Videos Determined From Publicly Available Information

The video processing module may determine the event from publicly available information. For example, the video processing module may use the date, the time, and the location associated with the video to determine that the video is from a football game. The video processing module may associate metadata with the video that includes identifying information for the event.

The video processing module may transcribe the audio to text and identify an interesting moment based on the text. The video processing module may generate metadata that identifies a timestamp for each instance where a speaker said a specific word. For example, where the video is from speeches given at a conference on cloud computing, the video processing module may identify a timestamp for each place where a speaker said “the future.” The video processing module may also use the audio itself as a signal of an interesting moment. For example, for sports events or other competitions, the video processing module may identify when a crowd starts cheering and determine that the continual motion that occurred right before the cheering includes an interesting moment.
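
As a rough sketch of the transcript-based approach, the snippet below scans a word-level transcript for a target phrase and records its start times; the transcript format of (word, start_time) pairs is an assumption.

```python
# A minimal sketch of finding the timestamps of a phrase in a word-level
# transcript, in the spirit of the "the future" example above.

def phrase_timestamps(transcript: list, phrase: str) -> list:
    """Return the start times of each occurrence of `phrase` in a transcript
    given as a list of (word, start_time) tuples."""
    words = phrase.lower().split()
    lowered = [(w.lower(), t) for w, t in transcript]
    timestamps = []
    for i in range(len(lowered) - len(words) + 1):
        window = [w for w, _ in lowered[i:i + len(words)]]
        if window == words:
            timestamps.append(lowered[i][1])
    return timestamps


transcript = [("cloud", 10.0), ("computing", 10.4), ("is", 10.8),
              ("the", 11.0), ("future", 11.2)]
print(phrase_timestamps(transcript, "the future"))  # [11.0]
```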

The video processing module may determine whether the interest score meets or exceeds a threshold segmentation value. If a part of the video has an interest score that meets or exceeds the threshold segmentation value, the video processing module may instruct the segmentation module to generate a video segment that includes the interesting moment. Portions of the video that fail to meet or exceed the threshold segmentation value may not get identified as including an interesting moment.
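
The threshold check itself is straightforward. A minimal sketch, assuming a normalized interest score and an arbitrary threshold value, might look like this:

```python
# A hedged sketch of the segmentation threshold described above.
# The threshold value and data shapes are assumptions.

THRESHOLD_SEGMENTATION_VALUE = 0.7

def candidate_moments(scored_portions: list) -> list:
    """Keep only portions whose interest score meets or exceeds the threshold."""
    return [p for p in scored_portions
            if p["interest_score"] >= THRESHOLD_SEGMENTATION_VALUE]


portions = [{"start": 5.0, "end": 8.0, "interest_score": 0.9},
            {"start": 30.0, "end": 33.0, "interest_score": 0.4}]
print(candidate_moments(portions))  # only the first portion survives
```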

More on Interest Scores From Potential Video Segments

The video processing module may apply interest scores on a scale, such as from 1 to 10. The interest score may get based on a combination of factors identified in the part of the video. For example, the video processing module may generate an interest score based on the part of the video including an event, an object, and a person.
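
A toy version of combining event, object, and person factors into a 1-to-10 score might look like the following; the weights are assumptions chosen only for illustration.

```python
# A minimal sketch of mapping several normalized factors onto a 1-10 scale.
# The weighting is an illustrative assumption.

def combined_interest_score(event_score: float, object_score: float,
                            person_score: float) -> int:
    """Each input is assumed to be in [0, 1]; the result is on a 1-10 scale."""
    combined = 0.4 * event_score + 0.3 * object_score + 0.3 * person_score
    return max(1, min(10, round(1 + 9 * combined)))


print(combined_interest_score(0.8, 0.6, 0.9))  # 8
```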

The video processing module may receive feedback from a user and change the user profile to modify the interest score accordingly. For example, if a user provides a sign of approval (e.g., a thumbs up, a +1, a like, saving a collage to the user’s media library, etc.) of a collage that includes a video on new types of wearables, the video processing module may add wearables to the list of positive objects.

In another example, the user may explicitly state that the user enjoys collages where the event type is a rock show. The video processing module may update personalization information associated with the user, such as a user profile, to include the rock show as a preferred event type. The feedback may also consist of an indication of disapproval (a thumbs down, a -1, a dislike, etc.). The indications of approval and disapproval may get determined based on comments provided by a user. The feedback may also identify a person, an object, or a type of event that the user wants included in the collage.
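
A minimal sketch of folding such feedback back into a user profile follows; the profile structure and feedback fields are assumptions, not the patent's data model.

```python
# A hedged sketch of updating personalization information from collage
# feedback, in the spirit of the wearables and rock-show examples above.

def apply_feedback(profile: dict, feedback: dict) -> None:
    """Mutate the user profile based on approval or disapproval feedback."""
    if feedback["signal"] in {"thumbs_up", "+1", "like", "saved"}:
        profile.setdefault("positive_objects", set()).update(feedback.get("objects", []))
        if "event_type" in feedback:
            profile.setdefault("preferred_events", set()).add(feedback["event_type"])
    elif feedback["signal"] in {"thumbs_down", "-1", "dislike"}:
        profile.setdefault("negative_objects", set()).update(feedback.get("objects", []))


profile = {}
apply_feedback(profile, {"signal": "like", "objects": ["wearables"]})
apply_feedback(profile, {"signal": "+1", "event_type": "rock show"})
print(profile)
```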

The segmentation module may be operable to segment the video into video segments based on interesting moments. This segmentation module may be a set of instructions executable by the processor to segment the video. It may get stored in the computer’s memory and be accessible and executable by the processor.

Segmentation to Find Interesting Moments For Video Collages

The segmentation module generates video segments that include interesting moments. Where the interesting moment is associated with continual motion, the segmentation module may create a video segment with a beginning and an end. The segmentation module may identify a start point and an intermediate endpoint of continual motion within the segment and pick a sub-segment that includes both of these points. For example, if the video is of a girl doing many cartwheels, the start point may be the start of a first cartwheel, and the intermediate endpoint may be the end of the first cartwheel. In another example, the segmentation module may identify a segment based on different types of motion.

For example, a first sub-segment may be a cartwheel, and a second sub-segment may be a jumping celebration. Next, the segmentation module may determine how to generate the segment so that it includes at least a particular number of interesting moments. For example, the segmentation module may create a video segment with a first interesting moment containing a specific object in a first group of frames, a second interesting moment with continual motion in a second group of frames, and a third interesting moment that includes a person in a third group of frames. The segmentation module may also generate a video segment that is one to three seconds long.

The segmentation module may generate a video segment that includes frames from different periods in the video. For example, the segmentation module may create a video segment that includes several instances where people at a conference say “cloud computing” at different points in the video.
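
One simple way to sketch assembling a segment from moments scattered through the video is to compute a short clip window around each timestamp, as below. The window length, and the assumption that a separate tool (e.g., ffmpeg) performs the actual cutting, are both illustrative.

```python
# A minimal sketch that computes (start, end) clip windows around a list of
# timestamps, e.g., every time "cloud computing" is spoken. The actual video
# cutting is assumed to happen elsewhere in the pipeline.

def clip_windows(timestamps: list, clip_length: float = 2.0) -> list:
    """Return (start, end) windows centered on each timestamp, clamped at 0."""
    half = clip_length / 2.0
    return [(max(0.0, t - half), t + half) for t in timestamps]


print(clip_windows([11.0, 95.5, 240.0]))
```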

The segmentation module generates video segments based on a theme. When a user specifies that interesting moments include a type of action, the segmentation module generates a video segment that includes the interesting moments identified by the video processing module. For example, the segmentation module may generate a video segment showing a person riding a scooter in the video. The segmentation module may select many action instances to include in the video segment based on the interest scores.

Ranking Interesting Moments To Choose For Video Collages

The segmentation module may rank the interesting moments based on their corresponding interest scores and select a number of the interesting moments based on the length of the video segment, such as three seconds, five seconds, twenty seconds, etc. For example, the segmentation module may select the top five most interesting moments based on the ranking because the total length of the five most interesting moments is under 20 seconds.
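
A minimal sketch of this rank-then-fit selection, assuming scored moments with start and end times and a 20-second budget, might look like this:

```python
# A hedged sketch of ranking moments by interest score and keeping the top
# moments whose combined length stays under a target duration.

def select_moments(moments: list, max_total_seconds: float = 20.0) -> list:
    ranked = sorted(moments, key=lambda m: m["interest_score"], reverse=True)
    selected, total = [], 0.0
    for moment in ranked:
        length = moment["end"] - moment["start"]
        if total + length <= max_total_seconds:
            selected.append(moment)
            total += length
    return selected


moments = [{"start": 0, "end": 5, "interest_score": 0.9},
           {"start": 40, "end": 52, "interest_score": 0.8},
           {"start": 90, "end": 96, "interest_score": 0.7}]
print(select_moments(moments))  # the first two fit under 20 seconds
```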

The segmentation module may determine markers that indicate different sections within the video and generate segments that include interesting moments within those sections.

The sections may include:

  • Different acts or scenes in a movie
  • Different news segments in a news reporting show
  • Different videos in a show about people filming dangerous stunts on video
  • Etc.

For example, the segmentation module may generate three video segments for a movie. The three segments represent the three acts in the film, and each segment includes interesting moments cut from the corresponding act. The markers may consist of metadata stating each section’s start and end, black frames, white frames, a title card, a chapter card, etc.

The segmentation module verifies that the video segments are different from each other. For example, the segmentation module may determine that each video segment includes different objects, so the collage does not include video segments that look too similar.
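
One way to sketch that similarity check is to compare the sets of detected objects between a candidate segment and the segments already selected; the overlap measure and threshold below are assumptions.

```python
# A minimal sketch of a diversity check: reject a candidate segment if its
# detected objects overlap too heavily with an already selected segment.

def is_diverse(candidate_objects: set, selected: list, max_overlap: float = 0.5) -> bool:
    """Return True if the candidate's object set is sufficiently different
    from every already-selected segment's object set."""
    for objects in selected:
        union = candidate_objects | objects
        if union and len(candidate_objects & objects) / len(union) > max_overlap:
            return False
    return True


selected = [{"skater", "rink"}]
print(is_diverse({"skater", "rink", "crowd"}, selected))  # False, too similar
print(is_diverse({"cake", "candles"}, selected))          # True
```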

The collage module may be operable to generate a collage from the video segments. The collage module can be a set of instructions executable by the processor to provide the functionality described below for generating the collage. The collage module can get stored in the computer’s memory and be accessible and executable by the processor.

The collage module receives video segments from the segmentation module. The collage module may retrieve the selected video segments from the storage device.

Generating Video Collages From Video Segments

The collage module may generate a collage from the video segments where the video segments get displayed in a single pane. The video collages may take many forms. For example, the collage module may generate video collages when at least two video segments are available. In another example, the collage module may create video collages when at least four video segments are available. The video segments may be displayed in square windows, in portrait windows (e.g., if the video segment gets shot in portrait mode), in a landscape window (e.g., if the video gets shot in landscape mode), and with different aspect ratios (e.g., 16:9, 4:3, etc.).

The collage module may configure the aspect ratios and orientations based on the user device used to view the collage. For example, the collage module may use a 16:9 aspect ratio for high-definition televisions, a 1:1 aspect ratio for square displays or viewing areas, a portrait collage for a user device in a portrait orientation, and a very wide collage (e.g., 100:9) for wearables such as augmented reality and virtual reality displays.
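
A toy mapping from device hints to aspect ratios, using the examples above plus assumed defaults, might look like this; the device categories and fallback values are assumptions.

```python
# A hedged sketch of choosing a collage aspect ratio from device hints.

def collage_aspect_ratio(device: str, orientation: str = "landscape") -> str:
    if device == "hd_tv":
        return "16:9"
    if device == "square_display":
        return "1:1"
    if device in {"ar_headset", "vr_headset"}:
        return "100:9"          # very wide collage for immersive displays
    if orientation == "portrait":
        return "9:16"
    return "4:3"                # assumed default


print(collage_aspect_ratio("phone", orientation="portrait"))  # 9:16
```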

The collage module may combine a predetermined number of video segments to form the collage. For example, the collage module may rank the video segments from most interesting to least interesting based on the interest scores and generate a collage from the predetermined number of video segments that are the most interesting. The collage module may select video segments with interest scores that meet or exceed a predetermined collage value.

The collage module processes the video segments. For example, the collage module may convert the video segments to high dynamic range (HDR), black and white, sepia, etc.

The Layout and Ordering of Video Segments Based On Chronology

The collage module may lay out and order the video segments based on chronology, interest scores, visual similarity, color similarity, and the length of each segment. Ordering the collage based on chronology may mean the first video segment corresponds to the earliest time, the second video segment corresponds to the next earliest time, etc. The collage module may order the video segments based on the interest scores by ranking the video segments from most interesting to least interesting and ordering the collage based on the ranking. The collage module may arrange the video segments in a clockwise direction, a counterclockwise direction, or an arbitrary direction. Other configurations are possible.
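
Both orderings reduce to a sort key. A minimal sketch, with assumed segment fields, is below.

```python
# A hedged sketch of the two orderings described above: chronological order
# and interest-score order. The segment field names are assumptions.

def order_segments(segments: list, by: str = "chronology") -> list:
    if by == "chronology":
        return sorted(segments, key=lambda s: s["start"])
    if by == "interest":
        return sorted(segments, key=lambda s: s["interest_score"], reverse=True)
    raise ValueError(f"unknown ordering: {by}")


segments = [{"start": 90, "interest_score": 0.9},
            {"start": 10, "interest_score": 0.4}]
print(order_segments(segments, by="chronology"))
print(order_segments(segments, by="interest"))
```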

The collage module generates instructions for the user interface module to create graphical data that renders the collage with video segments in windows of different sizes. The size of the windows may get based on interest scores for each of the video segments. For example, the video segment with an interest score that indicates that it is most interesting may have the largest window size.

Additionally, the size of the windows may get based on the length of the video segments. For example, the shortest video segment may correspond to the smallest window size. The collage module may determine window size based on an artistic effect. For example, the collage module may generate windows that resemble creative works from the De Stijl art movement. In particular, the collage module may create a collage with shapes that resemble a Piet Mondrian painting with different sized boxes and different line thicknesses that distinguish the separation between different video segments.

The collage module generates a collage that is a video file (e.g., an animated GIF, an MPG, etc.) with associated code (e.g., JavaScript) that recognizes user selection (e.g., to move to the second collage in a hierarchy, to play back a specific segment, etc.). The collage module may link the video segments to a location in the video. Upon selecting one of the video segments, the video gets displayed at the location in the video that corresponds to that segment. For example, each video segment in the collage may include a hyperlink to the corresponding location in the video.

Generating Video Collages by Meeting a Threshold Score

The collage module generates and displays a collage by determining video segments that meet a threshold score. It may evaluate display characteristics for the collage and identify window layouts that meet the display characteristics. It can also select a particular window layout, generate the collage, and cause the collage to get displayed.

A graphic representation gets illustrated in the patent. It includes an example timeline of a video and a corresponding collage generated from four interesting moments. The timeline represents an eight-minute video. The eight-minute video may be an ice skating competition where four different ice skating couples each have a two-minute demonstration. The video processing module identified four interesting moments, labeled A, B, C, and D in this example.

The segmentation module generates four video segments where each video segment includes a corresponding interesting moment.

Interesting moment A may include a first couple executing a sustained edge step.

The interesting moment B may consist of a second couple where one of the skaters performs a triple axel jump.

The interesting moment C may include a third couple executing the sustained edge step.

And the interesting moment D may consist of a fourth couple executing a serpentine step sequence.

The video processing module may determine the interesting moments based on a user identifying the interesting moments, identifying continual motion, for example, a motion that occurs before the crowd starts cheering, or another technique.

The collage module generates a collage from the video segments. In this example, the collage module generates a collage that orders the video segments chronologically in a clockwise direction.

If a user selects one of the video segments, the user interface module may cause the video to get displayed at the location in the video that corresponds to the time of that video segment.

For example, in the example depicted, if a user selects video segment D, a new window may appear that displays the video at the D location illustrated on the timeline near the end of the video.

A Graphic Representation of Another Example Video Collage

In this example, the collage includes eight video segments, labeled A through H. The collage module may generate different-sized windows for the collage based on the interest score and length of each video segment. For example, a figure may represent a collage generated from a video of a news program. Video segment A may represent the featured news story for the news program, which is both the most interesting and the longest; as a result, video segment A gets displayed in the largest window. Video segments B, C, and H represent other, less interesting and shorter news segments. Lastly, video segments D, E, F, and G represent short snippets in the news program.

The collage module generates a hierarchical collage. Hierarchical collages may be helpful, for example, to present a limited number of video segments in a single window. In addition, a hierarchical collage may create an entertaining effect that helps users stay engaged when displaying too many video segments at once would appear crowded. The collage module may group the video segments based on the timing of the video segments or a type of interesting moment associated with the video segments.

The collage module may generate first collages based on the groups. For example, the collage module may divide a video into three parts and generate a first collage for the video segments in each of the first, second, and last parts. In another example, a video may include tryouts and competitions, and the collage module may group the segments based on the type of interesting moment by distinguishing between tryouts and competitions.

The collage module may generate two first collages: one for the video segments in the tryouts and one for the video segments in the competitions. The representative segment may be the longest video segment in a group, or a segment that includes a high amount of continual motion compared with the other segments in the group. A combination of interest score, segment length, amount of continual motion, etc., may get used to select the representative segment.

The collage module may select a representative segment from the video segments associated with each first collage. The representative segment may get chosen based on the interest scores of the video segments in the group. For example, continuing with the example of a group of tryouts and a group of competitions, the collage module may select the most interesting tryout video segment as the tryout group’s representative segment.

The collage module may generate a second collage that includes the representative segment for each of the groups. The representative segments link to their corresponding first collages such that selecting one of the representative segments causes the related first collage to become visible. The collage module may instruct the user interface module to generate graphical data that causes the second collage to open and display the corresponding first collage, to replace the second collage with the first collage, or to display all the first collages.
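
A hedged sketch of that hierarchy as a data structure follows: each group's highest-scoring segment becomes the representative entry in the second collage and points back to the group's first collage. The classes are assumptions, not the patent's data model.

```python
# A minimal sketch of a two-level (hierarchical) collage structure.

from dataclasses import dataclass, field

@dataclass
class Segment:
    segment_id: str
    interest_score: float
    length_seconds: float

@dataclass
class FirstCollage:
    group_name: str
    segments: list

@dataclass
class SecondCollage:
    # Maps each representative segment's id to the first collage it opens.
    entries: dict = field(default_factory=dict)

def build_hierarchy(groups: dict) -> SecondCollage:
    """groups maps a group name to its list of Segments."""
    second = SecondCollage()
    for name, segments in groups.items():
        representative = max(segments, key=lambda s: s.interest_score)
        second.entries[representative.segment_id] = FirstCollage(name, segments)
    return second


groups = {"tryouts": [Segment("T1", 0.6, 3.0), Segment("T2", 0.9, 2.5)],
          "competitions": [Segment("C1", 0.8, 4.0)]}
print(build_hierarchy(groups).entries.keys())  # representatives: T2 and C1
```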

The collage module may configure the video segments in the collage to play automatically. Alternatively, the video segments may have to get selected to play. The video segments may play all at once or sequentially, such that a first video segment plays, then a second video segment plays, etc. The video segments may play once or get configured to play on a continuous loop. A user may be able to configure automatic playback or other options as system settings.

The collage module configures the video segments to play at different frame rates. For example, video segment A may play at the standard speed of 24 FPS (frames per second), video segment B may play at a slower speed of 16 FPS, video segment C may play at a faster speed of 50 FPS, and video segment D may play at 24 FPS. The collage module selects the frame rate based on the content of the video segment. For example, the collage module may choose a slow frame rate for video segments when the rate of continual motion in the video segment is high, such as a video segment of a pitcher throwing a baseball. The collage module may select a faster frame rate when the rate of continual motion in the segment is low, such as a video segment of a person blowing out a candle or cutting a cake.
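
A minimal sketch of that frame-rate rule, assuming a motion rate normalized to [0, 1] and the FPS values mentioned above, could be:

```python
# A hedged sketch: high motion plays back slowly, low motion plays back quickly.
# The thresholds are illustrative assumptions.

def playback_fps(motion_rate: float) -> int:
    """motion_rate is assumed to be normalized to [0, 1]."""
    if motion_rate > 0.7:       # e.g., a pitcher throwing a baseball
        return 16               # slow it down
    if motion_rate < 0.3:       # e.g., blowing out a candle
        return 50               # speed it up
    return 24                   # standard speed


print(playback_fps(0.9), playback_fps(0.1), playback_fps(0.5))  # 16 50 24
```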

An Example Timeline And Hierarchical Video Collages

For example, the timeline represents a video of a meeting that includes presenters giving talks, attendees forming discussion groups, and closing remarks being presented. The collage module groups the video segments into three groups: group A represents a section where presenters give talks, group B represents a section where people form discussion groups, and group C represents the closing remarks.

The collage module generates two first collages: one for group A, which includes four video segments, and one for group B, which includes three video segments. The collage module generates a second collage that includes representative segments for the two first collages and the video segment for group C. The second collage may thus consist of one representative segment from each of groups A, B, and C.

Suppose a user selects the representative segment for group A. In that case, the user interface module causes a user interface to display the first collage for group A, which includes video segments A1, A2, A3, and A4. If the user selects video segment A3, it causes the user interface to display the video at the location corresponding to A3 in the timeline.

The user interface module may be operable to provide information to a user. That user interface module can be a set of instructions executable by the processor to provide the functionality described below for providing information to a user. The user interface module can get stored in the computer’s memory and be accessible and executable by the processor.

The user interface module may receive instructions from the other modules in the video application to generate graphical data operable to display a user interface. For example, the user interface module may create a user interface that displays a collage created by the collage module.

The user interface module may generate graphical data to display collages that link to the full video. In response to a user clicking on the collage, the user interface may display the original video or cause a new webpage to open that includes the full video. The user interface module provides an option to download the collage to a user device or stream the collage from the video server.

The user interface module may generate an option for a user to provide feedback on the collages. For example, the user interface module may create a user interface that includes a feedback button the user can select to view a drop-down menu of objects that the user wants to add as explicit interests. The user interface module may populate the menu based on labels associated with the video segments, which are used to create the list of objects that the user may select as explicit interests.

A Graphic Representation of A User Interface That Includes A Videos Section

In the videos section, the user interface module may receive a designation of an interesting moment from a user. In this example, the user interface includes instructions informing users that they can identify interesting moments by clicking on the video. As a result of the user’s selection, the segmentation module generates a segment that includes the interesting moment, and the collage module generates a collage that includes the video segments.

The figure also includes a collages section that contains a collage. In this example, the user can select one of the playback buttons to view the corresponding video segment. The user interface also includes a +1 button for indicating approval of the video and a share button that allows the user to share the collage. For example, the user interface module may generate an option for sharing the collage via a social network, using e-mail, via a chat application, etc.

An Example Method To Generate A Video Collage

Interesting moments get determined in a video; for example, a user identifies the interesting moments, or they get selected based on continual motion, objects in the video, etc. Video segments get generated based on the interesting moments, where each of the video segments includes at least one of the interesting moments from the video. A collage gets generated from the video segments, where the collage consists of at least two windows, and each window includes one of the video segments.

Generating Hierarchical Video Collage

The steps may get performed by the video application.

Video collages get created based on interesting moments.

Interesting moments get determined in a video.

Video segments get generated based on the interesting moments, and the video segments get grouped into two or more groups.

Two or more first video collages get generated, each corresponding to one of the two or more groups. Each of the first video collages includes at least two video segments. A representative segment gets selected for each group from the at least two video segments of each first collage. A second collage gets generated that includes the representative segment for each group. Each representative segment in the second collage links to the corresponding first collage that includes the at least two video segments in the related group.

Video Collages With Interesting Moments is an original blog post first published on Go Fish Digital.
