Celebrity recognition using AWS Rekognition
June 30, 2020
OTT platforms all over the world are adding heaps of content, and a lot of old TV and cinema content is being digitized as well. Because these videos are not properly tagged with metadata about the artists, it is really difficult for viewers to discover content. That is a problem for OTT players too, because they end up hosting content that is hardly watched.
In this post I’ll attempt to use AWS Rekognition to identify artists in the video by timestamps.
Pre-requisites #
- You need an AWS account
- You should have python installed
- You should have aws-cli installed
Configure AWS cli #
Create an access key and configure the AWS CLI with it. Follow this link for instructions
Install dependencies #
From your Linux shell or Windows command line, install-
pip install boto3 youtube-dl
Download the video #
I recently watched Panchayat on Amazon Prime. It is now one of my favorite web series, so we will use the ‘Panchayat’ trailer for this exercise-
Download the video using the youtube-dl command line utility-
- List all formats available
$ youtube-dl -F https://www.youtube.com/watch?v=mojZJ7oeD_g
Output:
[youtube] mojZJ7oeD_g: Downloading webpage
[info] Available formats for mojZJ7oeD_g:
format code extension resolution note
249 webm audio only tiny 55k , opus @ 50k (48000Hz), 906.13KiB
250 webm audio only tiny 71k , opus @ 70k (48000Hz), 1.13MiB
140 m4a audio only tiny 130k , m4a_dash container, mp4a.40.2@128k (44100Hz), 2.21MiB
251 webm audio only tiny 138k , opus @160k (48000Hz), 2.17MiB
..... Lot many formats ......
18 mp4 640x360 360p 470k , avc1.42001E, 25fps, mp4a.40.2@ 96k (44100Hz), 8.03MiB
22 mp4 1280x720 720p 1010k , avc1.64001F, 25fps, mp4a.40.2@192k (44100Hz) (best)
- MP4 is one of the supported formats. We will use this format to avoid any surprises later-
$ youtube-dl -f 22 https://www.youtube.com/watch?v=mojZJ7oeD_g
Output:
[youtube] mojZJ7oeD_g: Downloading webpage
[download] Destination: Panchayat - Official Trailer _ New Series 2020 _ TVF _ Amazon Prime Video-mojZJ7oeD_g.mp4
[download] 100% of 17.27MiB in 00:00
Create a S3 bucket on AWS #
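For Rekognition to work, the video should reside in an S3 bucket in a region where Rekognition is available, and the Rekognition client has to use that same region. We’ll go ahead and create a bucket in Singapore (ap-southeast-1).
Choose a unique name for the bucket-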
$ aws s3api create-bucket --bucket video-bucket-unique-name --region ap-southeast-1 --create-bucket-configuration LocationConstraint=ap-southeast-1
Output:
{
"Location": "http://video-bucket-unique-name.s3.amazonaws.com/"
}
Upload video to the bucket #
aws s3 cp "Panchayat - Official Trailer _ New Series 2020 _ TVF _ Amazon Prime Video-mojZJ7oeD_g.mp4" s3://video-bucket-unique-name/video.mp4
Output:
upload: ./Panchayat - Official Trailer _ New Series 2020 _ TVF _ Amazon Prime Video-mojZJ7oeD_g.mp4 to s3://video-bucket-unique-name/video.mp4
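If you would rather script the bucket creation and the upload than run the two CLI commands above, something like the sketch below could work with boto3. The function name `create_bucket_and_upload` and its parameters are my own, not part of any library. It assumes `s3_client` behaves like `boto3.client("s3", region_name=region)`, and it has not been run against a real account.

```python
def create_bucket_and_upload(s3_client, bucket, region, local_path, key):
    """Create `bucket` in `region` and upload `local_path` to it as `key`.

    `s3_client` is assumed to behave like boto3.client("s3", region_name=region).
    """
    params = {"Bucket": bucket}
    # us-east-1 is the default location and does not accept an explicit
    # LocationConstraint, so only add it for other regions
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    s3_client.create_bucket(**params)

    # upload_file(Filename, Bucket, Key) copies the local file into the bucket
    s3_client.upload_file(local_path, bucket, key)
    return "s3://{}/{}".format(bucket, key)
```

For this post the call would be `create_bucket_and_upload(boto3.client("s3", region_name="ap-southeast-1"), "video-bucket-unique-name", "ap-southeast-1", <downloaded file>, "video.mp4")`. Passing the client in keeps the function easy to try out with a stub before touching AWS.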
Start video recognition #
Save the script below as rekognition.py and run it as python rekognition.py.
The script -
- Starts the celebrity recognition process in the background; we have to track its progress using the JobId
- Once the status is SUCCEEDED, it reads the Rekognition output
- The Rekognition output, in JSON format, is then parsed to print the celebrities along with their timestamps
import time

import boto3

video_name = "video.mp4"
bucket = "video-bucket-unique-name"
youtube_id = "mojZJ7oeD_g"
client = boto3.client('rekognition', region_name='ap-southeast-1')

# Start the asynchronous celebrity recognition job on the uploaded video
response = client.start_celebrity_recognition(
    Video={
        'S3Object': {
            'Bucket': bucket,
            'Name': video_name,
        }
    },
)
job_id = response["JobId"]
print("Rekognition started with job-id: {}".format(job_id))
print("Please wait for 20-30 seconds")

# Poll every 10 seconds until the job has either succeeded or failed
while True:
    r = client.get_celebrity_recognition(JobId=job_id)
    celeb_status = r['JobStatus']
    if celeb_status in ['SUCCEEDED', 'FAILED']:
        break
    time.sleep(10)

if celeb_status == 'FAILED':
    raise SystemExit("Celebrity recognition failed: {}".format(
        r.get('StatusMessage', 'unknown error')))

print("Celebrity recognition completed")
for celebrity in r['Celebrities']:
    # Timestamp is in milliseconds from the start of the video
    seconds = celebrity['Timestamp'] // 1000
    name = celebrity['Celebrity']['Name']
    print("- {} : {}s : [Link](https://youtube.com/watch?v={}&t={})".format(
        name, seconds, youtube_id, seconds))
Output:
- Jitendra Kumar : 5s : Link
- Jitendra Kumar : 6s : Link
. .
Detailed output at the end
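The script above reads a single response from get_celebrity_recognition. As far as I know, a response holds at most 1000 detections and longer videos come back in pages linked by NextToken, so a full-length episode would probably need a loop. Below is a minimal sketch of polling plus pagination. The helper name `fetch_all_celebrities` and the `poll_seconds` / `max_polls` parameters are my own assumptions, and `client` is assumed to behave like `boto3.client('rekognition')`.

```python
import time


def fetch_all_celebrities(client, job_id, poll_seconds=10, max_polls=60):
    """Wait for a celebrity recognition job and return every detection.

    `client` is assumed to behave like boto3.client("rekognition").
    """
    # Poll until the job leaves IN_PROGRESS, giving up after max_polls attempts
    for _ in range(max_polls):
        page = client.get_celebrity_recognition(JobId=job_id)
        if page["JobStatus"] != "IN_PROGRESS":
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError("job {} is still in progress".format(job_id))

    if page["JobStatus"] != "SUCCEEDED":
        raise RuntimeError(page.get("StatusMessage", "celebrity recognition failed"))

    # Follow NextToken until the last page has been read
    celebrities = list(page["Celebrities"])
    while page.get("NextToken"):
        page = client.get_celebrity_recognition(
            JobId=job_id, NextToken=page["NextToken"])
        celebrities.extend(page["Celebrities"])
    return celebrities
```

The returned list has the same shape as `r['Celebrities']` in the script, so the printing loop can be reused unchanged.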
Analysing the output #
Considering that Jeetu, Neena, Raghubir and Biswapati are not very popular artists (I admire them a lot, but that’s a fact), AWS Rekognition has done a great job of identifying the artists correctly. Let us look at some of the false positives.
All three instances below have strikingly similar facial features. Even we as humans might have failed in some of these cases. So all in all, this is a great cloud API for OTT players to generate metadata for their videos.
Detected Artist | Artist’s Photo | From Video |
---|---|---|
Alireza Ghorbani | ||
Matt Garrison | ||
Will Skelton | ||
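One thing that may help with false positives like these: each detection in the video response should also carry a Confidence value (a percentage) under the Celebrity key. I have not checked how the three names above score, so the sketch below is only a starting point. The function name `split_by_confidence` and the 90.0 default threshold are my own assumptions, not anything recommended by AWS.

```python
def split_by_confidence(celebrities, min_confidence=90.0):
    """Split detections into (kept, dropped) by Celebrity.Confidence.

    `celebrities` is assumed to have the shape of r['Celebrities'] from
    get_celebrity_recognition.
    """
    kept, dropped = [], []
    for detection in celebrities:
        # A detection without a Confidence value is treated as 0 and dropped
        confidence = detection["Celebrity"].get("Confidence", 0.0)
        if confidence >= min_confidence:
            kept.append(detection)
        else:
            dropped.append(detection)
    return kept, dropped
```

Keeping the dropped detections around, instead of discarding them, makes it possible to review them by hand and tune the threshold for a given catalogue.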
Detailed output #
- Jitendra Kumar : 5s : Link
- Jitendra Kumar : 5s : Link
- Jitendra Kumar : 6s : Link
- Biswapati Sarkar : 9s : Link
- Biswapati Sarkar : 9s : Link
- Biswapati Sarkar : 10s : Link
- Jitendra Kumar : 10s : Link
- Jitendra Kumar : 11s : Link
- Jitendra Kumar : 14s : Link
- Jitendra Kumar : 15s : Link
- Jitendra Kumar : 16s : Link
- Neena Gupta : 18s : Link
- Alireza Ghorbani : 18s : Link
- Neena Gupta : 18s : Link
- Alireza Ghorbani : 19s : Link
- Neena Gupta : 19s : Link
- Alireza Ghorbani : 19s : Link
- Neena Gupta : 19s : Link
- Raghubir Yadav : 19s : Link
- Raghubir Yadav : 20s : Link
- Raghubir Yadav : 20s : Link
- Raghubir Yadav : 21s : Link
- Biswapati Sarkar : 23s : Link
- Biswapati Sarkar : 24s : Link
- Nagma : 24s : Link
- Biswapati Sarkar : 24s : Link
- Jitendra Kumar : 29s : Link
- Jitendra Kumar : 35s : Link
- Jitendra Kumar : 35s : Link
- Jitendra Kumar : 36s : Link
- Raghubir Yadav : 39s : Link
- Raghubir Yadav : 39s : Link
- Raghubir Yadav : 40s : Link
- Raghubir Yadav : 40s : Link
- Neena Gupta : 41s : Link
- Neena Gupta : 41s : Link
- Neena Gupta : 42s : Link
- Neena Gupta : 42s : Link
- Raghubir Yadav : 43s : Link
- Raghubir Yadav : 43s : Link
- Raghubir Yadav : 44s : Link
- Raghubir Yadav : 44s : Link
- Neena Gupta : 45s : Link
- Neena Gupta : 45s : Link
- Neena Gupta : 46s : Link
- Neena Gupta : 46s : Link
- Jitendra Kumar : 49s : Link
- Jitendra Kumar : 49s : Link
- Akhilendra Mishra : 50s : Link
- Akhilendra Mishra : 50s : Link
- Jitendra Kumar : 51s : Link
- Jitendra Kumar : 51s : Link
- Anubhav Mohanty : 54s : Link
- Jitendra Kumar : 56s : Link
- Jitendra Kumar : 56s : Link
- Jitendra Kumar : 56s : Link
- Jitendra Kumar : 57s : Link
- Jitendra Kumar : 59s : Link
- Jitendra Kumar : 60s : Link
- Jitendra Kumar : 60s : Link
- Jitendra Kumar : 61s : Link
- Jitendra Kumar : 62s : Link
- Jitendra Kumar : 63s : Link
- Jitendra Kumar : 63s : Link
- Jitendra Kumar : 64s : Link
- Jitendra Kumar : 64s : Link
- Jitendra Kumar : 65s : Link
- Biswapati Sarkar : 66s : Link
- Biswapati Sarkar : 66s : Link
- Biswapati Sarkar : 67s : Link
- Jitendra Kumar : 79s : Link
- Jitendra Kumar : 80s : Link
- Jitendra Kumar : 81s : Link
- Jitendra Kumar : 85s : Link
- Jitendra Kumar : 88s : Link
- Raghubir Yadav : 88s : Link
- Bappaditya Bandopadhyay : 88s : Link
- Jitendra Kumar : 89s : Link
- Raghubir Yadav : 89s : Link
- Bappaditya Bandopadhyay : 89s : Link
- Neena Gupta : 89s : Link
- Jitendra Kumar : 92s : Link
- Jitendra Kumar : 96s : Link
- Jitendra Kumar : 97s : Link
- Matt Garrison : 97s : Link
- Matt Garrison : 97s : Link
- Matt Garrison : 97s : Link
- Matt Garrison : 98s : Link
- Jitendra Kumar : 100s : Link
- Jitendra Kumar : 100s : Link
- Raghubir Yadav : 107s : Link
- Will Skelton : 107s : Link
- Raghubir Yadav : 107s : Link
- Will Skelton : 107s : Link
- Jitendra Kumar : 109s : Link
- Neena Gupta : 109s : Link
- Serdar Bilgili : 110s : Link
- Jitendra Kumar : 110s : Link
- Raghubir Yadav : 110s : Link
- Jitendra Kumar : 111s : Link
- Raghubir Yadav : 111s : Link
- Will Skelton : 116s : Link
- Will Skelton : 116s : Link
- Will Skelton : 117s : Link
- Will Skelton : 117s : Link
- Will Skelton : 117s : Link
- Will Skelton : 118s : Link
- Will Skelton : 118s : Link
- Jitendra Kumar : 118s : Link
- Jitendra Kumar : 119s : Link
- Jitendra Kumar : 119s : Link
- Jitendra Kumar : 119s : Link
- Will Skelton : 120s : Link
- Will Skelton : 121s : Link
- Will Skelton : 123s : Link
- Will Skelton : 123s : Link
- Jitendra Kumar : 125s : Link
- Jitendra Kumar : 126s : Link
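A list like the one above has one entry per detection, which is more detail than a metadata record usually needs. One way to condense it is to collapse nearby detections of the same artist into time ranges. This is a rough sketch: the function name `merge_into_ranges` and the 2 second `max_gap_s` default are my own choices, and the input is assumed to be (name, second) pairs like the ones printed by the script.

```python
from collections import defaultdict


def merge_into_ranges(detections, max_gap_s=2):
    """Collapse (name, second) detections into per-artist (start, end) ranges.

    Two detections of the same artist join one range when they are at most
    max_gap_s seconds apart.
    """
    seconds_by_name = defaultdict(set)
    for name, second in detections:
        seconds_by_name[name].add(second)

    ranges = {}
    for name, seconds in seconds_by_name.items():
        merged = []
        for second in sorted(seconds):
            if merged and second - merged[-1][1] <= max_gap_s:
                merged[-1][1] = second  # extend the current range
            else:
                merged.append([second, second])  # start a new range
        ranges[name] = [tuple(r) for r in merged]
    return ranges


# A few of the Biswapati Sarkar detections from the output above
detections = [
    ("Biswapati Sarkar", 9),
    ("Biswapati Sarkar", 10),
    ("Biswapati Sarkar", 23),
    ("Biswapati Sarkar", 24),
]
print(merge_into_ranges(detections))
# → {'Biswapati Sarkar': [(9, 10), (23, 24)]}
```

A larger `max_gap_s` gives fewer, longer ranges, which is probably what a "scenes featuring this artist" index wants. A smaller one stays closer to the raw detections.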