How AT&T Could Differentiate Video Product With Content Narratives 

AT&T is looking at ways to augment subscription video programming by generating content narratives for TV shows and movies that it offers subscribers, according to a patent application published Thursday.

AT&T Labs Senior Scientist Raghuraman Gopalan is named as lead inventor on the patent application, titled “Method and apparatus for augmenting media content.”

Abstract: Aspects of the subject disclosure may include, for example, generating narrative descriptions corresponding to visual features, visual events, and interactions there between for media content, where the narrative descriptions are associated with time stamps of the media content, and presenting the media content and an audio reproduction of the narrative descriptions, wherein the audio reproduction is synchronized to video of the media content according to the time stamps. Other embodiments are disclosed.

Patent Application

Claims:

1. A device, comprising: a processor; and a memory that stores executable instructions that, when executed by the processor, facilitate performance of operations, comprising: scanning a plurality of images from a media content item to detect a plurality of visual features present within the plurality of images; analyzing interactions between the plurality of visual features to determine a plurality of visual events that are depicted in the media content item; generating a plurality of narrative descriptions corresponding to the plurality of visual features and the plurality of visual events, wherein the plurality of narrative descriptions are associated with a plurality of time stamps of the media content item; filtering the plurality of narrative descriptions according to requirements of an application to generate a plurality of filtered descriptions; and presenting the media content item and the plurality of filtered descriptions via the application.
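
Taken together, claim 1 describes a feature-detection and captioning pipeline: scan frames, infer events from feature interactions, emit timestamped descriptions, and filter them for a given application. Below is a minimal Python sketch of that flow. The detector, event-inference, and description functions are illustrative stubs (the application names no particular models), and the `Narrative` type and `filter_for_app` helper are hypothetical names introduced here.

```python
from dataclasses import dataclass

@dataclass
class Narrative:
    timestamp: float  # seconds into the media content item
    text: str         # natural-language description (see claim 5)

def detect_features(frame):
    """Stand-in for a real object/person detector; returns feature labels."""
    return []

def infer_events(features_by_time):
    """Stand-in for event inference over feature interactions across frames;
    returns (event_label, timestamp) pairs."""
    return []

def generate_narratives(frames):
    """frames: iterable of (timestamp, image) pairs sampled from the item."""
    features_by_time = {t: detect_features(img) for t, img in frames}
    narratives = [Narrative(t, f"{feat} appears")        # feature descriptions
                  for t, feats in features_by_time.items() for feat in feats]
    narratives += [Narrative(t, f"{event} occurs")       # event descriptions
                   for event, t in infer_events(features_by_time)]
    return sorted(narratives, key=lambda n: n.timestamp)

def filter_for_app(narratives, required_terms):
    """Claim 1: keep only descriptions meeting the application's requirements."""
    return [n for n in narratives
            if any(term in n.text.lower() for term in required_terms)]
```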

2. The device of claim 1, wherein the operations further comprise: generating a plurality of speech renderings of the plurality of filtered descriptions; and presenting the plurality of speech renderings.
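
For the speech renderings of claim 2, the application names no particular text-to-speech engine; this is a minimal sketch using the open-source pyttsx3 library as one possible offline backend, operating on plain description strings.

```python
import pyttsx3

def speak_descriptions(texts):
    """Render each filtered description as speech (claim 2)."""
    engine = pyttsx3.init()    # offline TTS engine; pyttsx3 is one option
    for text in texts:
        engine.say(text)       # queue one speech rendering per description
    engine.runAndWait()        # block until every rendering has played
```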

3. The device of claim 1, wherein the operations further comprise: searching the plurality of filtered descriptions based on a search term; selecting a video clip from the media content item, wherein a filtered description of the plurality of filtered descriptions substantially matches the search term; generating a speech rendering corresponding to the filtered description; and presenting the video clip and the speech rendering via the application.
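
Claim 3's search could be sketched as below, with "substantially matches" reduced to a case-insensitive substring test and the selected clip reduced to a pair of time bounds around the matching time stamp; the ten-second window is an illustrative assumption, not from the filing.

```python
def find_clip(narratives, search_term, window=10.0):
    """Return (start, end) bounds of a clip whose description matches the term."""
    term = search_term.lower()
    for n in narratives:
        if term in n.text.lower():                   # "substantially matches"
            start = max(0.0, n.timestamp - window / 2)
            return (start, start + window), n.text   # clip bounds + description
    return None, None
```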

4. The device of claim 1, wherein the operations further comprise filtering the plurality of visual features according to second requirements of the application.

5. The device of claim 1, wherein the plurality of narrative descriptions comprise natural language descriptions of the plurality of visual features and the plurality of visual events.

6. The device of claim 1, wherein the operations further comprise analyzing audio content of the media content item to extract audio information associated with the plurality of visual features, wherein the step of analyzing interactions further includes the audio information.

7. The device of claim 1, wherein the operations further comprise analyzing lighting characteristics of the plurality of visual features to extract mood information, wherein the step of analyzing interactions further includes the mood information.

8. The device of claim 1, wherein the filtering of the narrative descriptions is performed according to second requirements of a profile of a user associated with the application.

9. The device of claim 1, wherein the operations further comprise: comparing the plurality of narrative descriptions to a plurality of objectionable event descriptions; and censoring a portion of the media content item corresponding to a narrative description of the plurality of narrative descriptions that substantially matches an objectionable event description of the plurality of objectionable event descriptions.
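
A minimal sketch of the matching step behind claims 9 through 11, approximating "substantially matches" with difflib's similarity ratio from the Python standard library; the 0.6 threshold is an assumption for illustration, not from the filing.

```python
from difflib import SequenceMatcher

def flag_objectionable(narratives, objectionable, threshold=0.6):
    """Return narratives whose text substantially matches an objectionable
    event description; callers may then remove the corresponding portion
    (claim 10) or generate a viewer advisory (claim 11)."""
    flagged = []
    for n in narratives:
        for bad in objectionable:
            if SequenceMatcher(None, n.text.lower(), bad.lower()).ratio() >= threshold:
                flagged.append(n)
                break
    return flagged
```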

10. The device of claim 9, wherein the censoring comprises removing the portion of the media content item.

11. The device of claim 9, wherein the censoring comprises generating an advisory of the objectionable event.

12. The device of claim 1, wherein the operations further comprise: comparing the plurality of narrative descriptions to a plurality of target event descriptions; extracting a portion of the media content item corresponding to a narrative description of the plurality of narrative descriptions that substantially matches a target event description of the plurality of target event descriptions; and presenting the portion of the media content item.

13. The device of claim 1, wherein the operations further comprise generating a graphical representation of a portion of the plurality of narrative descriptions.

14. The device of claim 1, wherein the operations further comprise receiving the media content item via a video capturing device, wherein the plurality of narrative descriptions are generated in nearly real-time with the receiving of the media content item.

15. The device of claim 14, wherein the video capturing device comprises a wearable camera device.

16. The device of claim 1, wherein the operations further comprise: receiving the media content item via a first channel of a media source network; receiving a second media content item via a second channel of the media source network; and presenting the second media content item, wherein the plurality of narrative descriptions associated with the media content item is generated during the presenting of the second media content item.
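
Claim 16's offline generation could be as simple as running the captioning work on a background thread while the second channel plays. A sketch, reusing generate_narratives from the claim 1 sketch, with a stand-in player:

```python
import threading

def play_video(item):
    """Stand-in for a real player; presentation is outside this sketch."""
    pass

def narrate_in_background(first_channel_frames, second_channel_item):
    result = {}
    worker = threading.Thread(
        target=lambda: result.setdefault(
            "narratives", generate_narratives(first_channel_frames)))
    worker.start()                      # generate for the first channel...
    play_video(second_channel_item)     # ...while the second channel plays
    worker.join()
    return result["narratives"]
```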

17. A machine-readable storage medium, comprising executable instructions that, when executed by a processor, facilitate performance of operations, comprising: determining a plurality of visual events in first media content according to interactions between a plurality of visual features in the first media content; generating a first group of narrative descriptions corresponding to the plurality of visual features and the plurality of visual events, wherein the first group of narrative descriptions are associated with a plurality of time stamps of the first media content; comparing the first group of narrative descriptions to a second group of narrative descriptions associated with second media content to determine a semantic correlation between the first media content and the second media content; and generating a recommendation for the first media content according to the semantic correlation and user affinity information associated with the second media content.
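
One way to read claim 17's semantic correlation is as text similarity between the two groups of descriptions. The sketch below approximates it with TF-IDF cosine similarity from scikit-learn; the filing commits to no particular measure, and the recommendation threshold and the [0, 1] affinity score are assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def semantic_correlation(first_descriptions, second_descriptions):
    """TF-IDF cosine similarity between two description groups."""
    docs = [" ".join(first_descriptions), " ".join(second_descriptions)]
    tfidf = TfidfVectorizer().fit_transform(docs)
    return float(cosine_similarity(tfidf[0], tfidf[1])[0, 0])

def recommend_first_item(correlation, user_affinity, threshold=0.5):
    """Claim 17: recommend the first item when it correlates with content
    the user already likes (user_affinity in [0, 1] is an assumption)."""
    return correlation * user_affinity >= threshold
```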

18. The machine-readable storage medium of claim 17, wherein the operations further comprise: comparing the first group of narrative descriptions to a target descriptor; and determining a genre category for the first media content according to a portion of the first group of narrative descriptions substantially matching the target descriptor, wherein the recommendation comprises the genre category.

19. A method, comprising: generating, by a system comprising a processor, narrative descriptions corresponding to visual features, visual events, and interactions there between for media content, wherein the narrative descriptions are associated with time stamps of the media content; and presenting, by the system, the media content and an audio reproduction of the narrative descriptions, wherein the audio reproduction is synchronized to video of the media content according to the time stamps.
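
The synchronization in claim 19 (and in the abstract) amounts to scheduling each spoken description at its time stamp on the playback clock. A minimal sketch, with speak() as a hypothetical stand-in for a TTS call such as the pyttsx3 example under claim 2:

```python
import time

def speak(text):
    """Stand-in for a TTS call; prints instead of speaking."""
    print(text)

def present_with_narration(narratives):
    """Play narration against the media timeline using the time stamps."""
    start = time.monotonic()
    for n in sorted(narratives, key=lambda x: x.timestamp):
        delay = n.timestamp - (time.monotonic() - start)
        if delay > 0:
            time.sleep(delay)   # hold until the description's time stamp
        speak(n.text)           # audio reproduction synchronized to video
```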

20. The method of claim 19, comprising: comparing the narrative descriptions to target event descriptions; extracting a portion of the media content corresponding to a narrative description of media content that substantially matches a target event description; and marking the portion of the media content for selective presentation.