There are many possible ways of doing this, but what comes to mind is something like this:
- You have a corpus of videos, and you use a Cloud API to label those videos. For example, you could use something like VideoAI to label all of the objects, places, and actions in each video.
- You have a basic knowledge graph of the objects, places, and actions which constitute "violence" in your situation. For example, this might include things like guns, knives, "civil unrest", and so on.
- Each time you see a new video, you extract metadata about it with the Cloud API and put that into the knowledge graph, linked to all of the different topics and concepts that appear in the video.
- You then trace whether or not paths exist between the things depicted in the video and the nodes labeled "violence" in your knowledge graph. You might, for example, assign points based on how many such instances there are and how long they appear in the video. More instances for longer times = higher score.
Videos that score above some threshold you choose are then treated as violent.
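
To make that a bit more concrete, here is a rough Python sketch of the path-tracing and scoring steps. It is only an illustration under some assumptions: the graph contents, the `(label, start, end)` shape of the detections, the function names, and the threshold value are all made up for the example, and a real labelling API would return its own format that you would need to parse first.

```python
from collections import deque

# Hypothetical knowledge graph: concept -> related concepts (directed edges).
GRAPH = {
    "gun": ["weapon"],
    "knife": ["weapon"],
    "weapon": ["violence"],
    "civil unrest": ["violence"],
    "dog": ["animal"],
}

# Assumed cut-off in seconds; in practice you would tune this on your own data.
THRESHOLD = 5.0


def reaches(graph, start, target="violence"):
    """Breadth-first search: is there a path from start to target?"""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


def violence_score(graph, detections):
    """Sum the on-screen seconds of every detection linked to 'violence'.

    detections is a list of (label, start_seconds, end_seconds) tuples,
    an assumed shape for the labelling output after parsing.
    """
    return sum(
        end - start
        for label, start, end in detections
        if reaches(graph, label)
    )


# Made-up detections for one video.
detections = [
    ("gun", 0.0, 4.0),
    ("dog", 1.0, 9.0),
    ("knife", 10.0, 12.5),
    ("gun", 20.0, 21.0),
]

score = violence_score(GRAPH, detections)
is_violent = score >= THRESHOLD
```

Because each detection contributes its own duration, both more instances and longer instances push the score up. You could just as easily weight by path length or by the API's confidence values if it gives you any.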