Sometimes it feels like I am a god. Hi everyone, today we’ll be talking about how Netflix onboards new content onto its platform. If you have a TV series or a movie and you want to get it onto Netflix, then apart from the legal challenges there are also engineering challenges that Netflix solves. I’ll keep this video as simple as possible so that the maximum number of viewers can follow what’s going on, but there will be some technical details when it comes to video encoding and other processing.
Firstly, what kind of challenges do we face when we upload new content? Well, we need to store it in different formats. You might know about MP4, AVI and other formats. The reason for having several is that different people have different internet connection speeds. If you have a really good connection, you can handle a heavy, detailed format where data loss is minimal and you see maximum video quality. Then you’ll have something like medium quality and low quality too. Behind all of these are codecs. (Strictly speaking, MP4 and AVI are container formats; the compression itself is done by a codec.) A codec is a way of compressing video. This video right now, for example, is captured with a lot of detail, but when I edit it I’ll make sure that the file is not huge; I’ll try to keep it within 1 GB. That is one codec setting. If I reduce the quality further, the file gets smaller because this is lossy compression: I’m losing some data to keep the file size down.
The second thing that Netflix does is play with different resolutions. If you are watching on a cell phone, the resolution you need is much lower than the resolution you need on your TV or even on your laptop.
In this way, a single video has multiple formats and multiple resolutions, and each format combines with each resolution to create a pair, for example (high quality, 720p). If the number of formats is F and the number of resolutions is R, then F × R is the number of videos that you end up processing.
Now suppose the engineers at Netflix come up with a much better technique for storing data. Let’s say high quality used to require 6 GB and now requires just 1 GB. Then you take the older movies that you had encoded at 6 GB, run them through the new process, and they become 1 GB. But this process takes time, so you don’t want to give all this responsibility to a single computer: it will be slow, and it has a chance of failing (what if the computer shuts down?).
What Netflix does is really interesting and very smart. It takes the original video and breaks it into chunks. Each of these chunks can then be run through the different resolutions and different formats. At the end, you will have, let’s say, chunk A as .mp4 (that’s the format) in 1080p, then chunk A as .avi in maybe 480p, and so on. Effectively, you have taken a really big video and broken it into small parts so that each processor can deal with one part efficiently. One resolution, one format, one chunk: that’s one task.
The story of how these chunks are cut is pretty interesting. Initially, you would take the video file and break it into chunks of 3 minutes each. Equal sizes look good because every processor does an equal amount of work, and you can actually quantify it. But imagine an action movie where, at the 3rd minute, the villain’s car is just about to overtake the hero’s, and right there the chunk ends and a new chunk begins.
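The fan-out described above (one task per chunk, format and resolution) can be sketched in a few lines of Python. This is only an illustration, not Netflix's actual pipeline: the names `EncodeTask`, `split_into_chunks` and `plan_tasks`, the format and resolution labels, and the 10-minute example video are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class EncodeTask:
    chunk_id: int      # which piece of the source video
    video_format: str  # illustrative label, e.g. "mp4"
    resolution: str    # illustrative label, e.g. "1080p"


def split_into_chunks(duration_s: int, chunk_s: int) -> list[tuple[int, int]]:
    """Cut [0, duration_s) into fixed-length (start, end) ranges in seconds."""
    return [(start, min(start + chunk_s, duration_s))
            for start in range(0, duration_s, chunk_s)]


def plan_tasks(num_chunks: int, formats: list[str],
               resolutions: list[str]) -> list[EncodeTask]:
    """One independent task per (chunk, format, resolution) combination."""
    return [EncodeTask(c, f, r)
            for c, f, r in product(range(num_chunks), formats, resolutions)]


# Hypothetical 10-minute video cut into 3-minute chunks (the last one is shorter).
chunks = split_into_chunks(duration_s=600, chunk_s=180)
tasks = plan_tasks(len(chunks), ["mp4", "avi"], ["480p", "720p", "1080p"])
print(len(chunks), len(tasks))  # 4 chunks, 4 * 2 * 3 = 24 tasks
```

Because every task is independent, a failed task can be retried on another machine without redoing the rest of the video.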
If a chunk boundary falls mid-action like that and the player has to make an API call for the next chunk, it takes time. You are watching the video, you reach this point, an API call goes out, and there is a lag. The user experience is bad because you wanted to see that moment seamlessly.
What they ended up doing is breaking the video into chunks based not on timestamps but on scenes. Instead of 3-minute pieces, you make the chunks much more fine-grained: about 4 seconds each. Each of these is called a shot, and you can collate shots together to create a scene, like the car scene. Instead of stopping arbitrarily at 3 minutes, you collate the 4-second chunks into scenes, and each scene contains many of them.
Now, if a person is watching a video and clicks on some point, the video-serving algorithm treats the surrounding scene as one unit, and the user experience is much better because the entire block is fetched together.
In fact, the algorithm is more complicated than that. Netflix treats the entire movie as a set of chunks. If viewers jump to arbitrary points, Netflix assumes that this is a sparse movie, in the sense that you go to one point and watch a scene, then head to another point and watch a scene, and so on. Its prediction algorithm then says: this is a sparsely watched movie, so we should not try to be too smart or send a lot of data; just send the data the user asked for, because they will probably click somewhere else rather than use the buffer. On the other hand, take a very engaging movie (I don’t know what counts as engaging, but call it a dense movie), meaning that people watch it continuously for long periods. There you can easily predict, linearly, which part is going to be requested next.
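The sparse-versus-dense idea can be sketched as a small prefetch policy. Everything here is an assumption for illustration: the transcript does not say how Netflix measures density, so this sketch simply looks at how often a request asks for the chunk right after the previous one, and the names `sequential_ratio` and `chunks_to_send`, the 0.8 threshold and the lookahead of 3 are made up.

```python
def sequential_ratio(requested_chunks: list[int]) -> float:
    """Fraction of requests that ask for the chunk right after the previous one."""
    if len(requested_chunks) < 2:
        return 0.0
    steps = zip(requested_chunks, requested_chunks[1:])
    sequential = sum(1 for prev, cur in steps if cur == prev + 1)
    return sequential / (len(requested_chunks) - 1)


def chunks_to_send(requested_chunks: list[int], total_chunks: int,
                   dense_threshold: float = 0.8, lookahead: int = 3) -> list[int]:
    """Return the chunk just requested, plus upcoming chunks if viewing looks dense."""
    current = requested_chunks[-1]
    if sequential_ratio(requested_chunks) < dense_threshold:
        return [current]  # sparse viewing: send only what was asked for
    last = min(current + lookahead, total_chunks - 1)
    return list(range(current, last + 1))  # dense viewing: also fetch what comes next


print(chunks_to_send([0, 1, 2, 3, 4], total_chunks=100))      # continuous watching
print(chunks_to_send([5, 40, 41, 90, 12], total_chunks=100))  # jumping around
```

The request history could come from one viewer's session or be aggregated over many viewers of the same title; the sketch does not depend on which.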
A movie watched like that is called a dense movie. For it, instead of sending just the part that you asked for, Netflix predictively and proactively fetches the upcoming parts, gets them onto your computer, and shows them to you.
If you are wondering where Netflix stores all this data: it is in something like Google Drive, called Amazon S3, which nearly all engineers know. This is where people store static data, meaning data that you write once and don’t change. It’s extremely cheap compared to a database, because a database supports updates and gives you other guarantees as well. So Amazon S3 is what Netflix uses to store the video content.
The most interesting thing about Netflix is that it came up with an innovative solution to a problem that has been around in the internet space for ages. You know about internet service providers (ISPs). If you open your browser right now and type facebook.com, you first talk to your internet service provider. It has a table that maps names to IP addresses (this lookup is the Domain Name System, DNS, and the resolver is often run by your ISP). So facebook.com is mapped to an IP address. You can think of that IP address as a physical place: it’s actually a computer somewhere on the internet that serves you Facebook. So you are literally talking to Facebook when you type facebook.com. Very similarly, Netflix resolves to an IP address that takes you to a computer which serves Netflix, or basically is Netflix. You could even end up chatting with it, maybe. So Netflix exists somewhere, and every time you ask your internet service provider to reach Netflix, the request goes to that computer and the response comes back to you. These servers are usually in the U.S., which means they are geographically concentrated.
For a place like India, which is really far away, it takes a long time to send a signal and receive the response, especially for video, because there is a lot of data coming in, so it is slow. To improve the user experience, one of the principal things you do as an engineer is cache information, which means you pre-compute it and store it somewhere close by. Let’s say Sacred Games comes out in India and people want to watch it: you put it in a cache.
Netflix extended this concept and applied it to ISPs. When the ISP gets a request from India, let’s say, for a Bollywood movie, it won’t go and hit the Netflix U.S. server just like that. It first asks a cache that Netflix has placed with the ISP. This is called an Open Connect box. The box holds a ton of movies; you can think of it as something like a hard drive. If the movie is found there, well and good: it is returned quickly. That saves a lot of bandwidth that would have gone to the Netflix server, saves a lot of time, and gives a much better user experience. It is also localized: for India you can keep one set of movies, for Britain another, and for the U.S. another.
This is a brilliant concept because it reduces the load not just on Netflix but also on the ISPs, so they really want to have these boxes. Every time you hit Netflix and get a really quick response, you end up assuming that your ISP is really good. It has gone to such an extent that around 90% of Netflix traffic is handled by these boxes that Netflix provides to ISPs. They are called Open Connect. The technology is revolutionary, though perhaps not entirely, because YouTube is also doing this. I think YouTube’s red boxes are placed with ISPs as well, again saving a lot of bandwidth and really improving the user experience in a lot of places.
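The lookup path described above (ask the box at the ISP first, go to the far-away server only when the title is missing) can be sketched as follows. This is a toy model, not Netflix's Open Connect software: the class name `OpenConnectBoxSketch`, the dictionary standing in for the origin server, and the second title are assumptions for illustration.

```python
class OpenConnectBoxSketch:
    """Toy stand-in for an ISP-side cache; not Netflix's actual software."""

    def __init__(self, origin: dict[str, bytes]):
        self.origin = origin                # pretend far-away origin server
        self.store: dict[str, bytes] = {}   # content held locally at the ISP
        self.hits = 0
        self.misses = 0

    def preload(self, titles: list[str]) -> None:
        """Copy regionally popular titles into the box ahead of time."""
        for title in titles:
            self.store[title] = self.origin[title]

    def fetch(self, title: str) -> bytes:
        """Serve from the local store if present, otherwise go to the origin."""
        if title in self.store:
            self.hits += 1
            return self.store[title]
        self.misses += 1
        return self.origin[title]


origin = {"Sacred Games": b"sacred-games-bytes", "Some Other Show": b"other-bytes"}
box = OpenConnectBoxSketch(origin)
box.preload(["Sacred Games"])      # popular in this region, so keep it local
box.fetch("Sacred Games")          # served from the box
box.fetch("Some Other Show")       # not in the box, served from the origin
print(box.hits, box.misses)
```

In this sketch a miss does not add the title to the box; the box is only filled by `preload`, on the assumption that content is pushed to it ahead of time rather than pulled on demand.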
And of course you can keep all your locally popular movies in this box. That way, users here hit this box far more often than they hit the Netflix server.
Sometimes the content does need to change because something new has come out, a new series or a new movie. In that case, around 4 a.m. is a good time, because the load on the boxes is at its minimum. You can then have a lot of write operations sent in from the U.S. server, which tells the box what to copy. The steps are: 1) you register your movie on Netflix; 2) Netflix processes it the way we talked about; 3) once it has been broken down into chunks, 4) Netflix sends them to your ISP, or maybe directly to the box, and populates the box with the new movie chunks. That way the box has the latest content and the users are happy.
So it is the innovative methods on the video processing side and the video serving side that keep Netflix running at scale. Think about it: 90% of requests are handled by this box. That is a superb gain, and it is a really innovative solution.
We will have a lot more videos like this on system design in the real world. This is the interesting bit, and of course, if you have any doubts or suggestions, you can leave them in the comments below. If you liked the video, make sure to hit the like button, and if you want notifications for further videos like this, hit the subscribe button. I’ll see you next time 🙂

NETFLIX System Design: How does Netflix onboard new content?

100 thoughts on “NETFLIX System Design: How does Netflix onboard new content?”

  • August 30, 2019 at 4:51 pm
    Permalink

    OMG!!!! the opening😂😂

    Reply
  • August 30, 2019 at 4:53 pm
    Permalink

    Awesome Video

    Reply
  • August 30, 2019 at 4:57 pm
    Permalink

    I was waiting for this video, awesome

    Reply
  • August 30, 2019 at 4:59 pm
    Permalink

    Brother…plz…do not use the teleportation hack when u r writing…it gives us…some time to write and understand too….and also concentration remains same….🙏🙏🙏….plz brother…it's my request…bcoz I am suffering…🙏🙏🙏

    Reply
  • August 30, 2019 at 5:03 pm
    Permalink

    Hi, some good content there. Isn't the cache box that your refer to as open connect is a CDN or part of CDN to be precise ? If not how is it different from a CDN ?

    Reply
  • August 30, 2019 at 5:04 pm
    Permalink

    Great video gaurav. Keep up. I guess open connect is a CDN, RIGHT?

    Reply
  • August 30, 2019 at 5:13 pm
    Permalink

    Great content.. keep up the good work..

    Reply
  • August 30, 2019 at 5:26 pm
    Permalink

    Wow, excellent video. Love your unique style of breaking down the topics in chunks and explaining them neatly. Keep them coming.

    Reply
  • August 30, 2019 at 5:26 pm
    Permalink

    Great, Nicely explained !!!

    Reply
  • August 30, 2019 at 5:33 pm
    Permalink

    Please shed some light on Docker, kubernetes and rancher!

    Reply
  • August 30, 2019 at 5:35 pm
    Permalink

    Great video ,Does local storage made at ISP's or Netflix replicated storage Data centers ? because ,if Netflix stores data at ISP's what guarantees them the security ?

    Reply
  • August 30, 2019 at 5:43 pm
    Permalink

    Nice video!! Video request about Design Online food ordering service like Uber eats and explain how to integrate it with existing Uber ride-sharing service.

    Reply
  • August 30, 2019 at 5:49 pm
    Permalink

    Directi -> Uber -> Netflix . I see where you are going :p

    Reply
  • August 30, 2019 at 5:57 pm
    Permalink

    So i was right about the new topic. Transcoding it is!

    Reply
  • August 30, 2019 at 5:57 pm
    Permalink

    Have a look at http://highscalability.squarespace.com/blog/2017/12/11/netflix-what-happens-when-you-press-play.html
    and https://medium.com/netflix-techblog for more detailed, jargon..ed and time consuming version. 😉
    Thanks Gaurav. Keep going (y) <3

    Reply
  • August 30, 2019 at 6:00 pm
    Permalink

    Awesome Video. Very good explanation. Video content caching is not an easy job. The video is really good.

    Reply
  • August 30, 2019 at 6:03 pm
    Permalink

    Gourav, thank you.
    Can you please make a video swiggy (example) payments and it handles different response like success, failed, or pending.

    Reply
  • August 30, 2019 at 6:07 pm
    Permalink

    As i have some experience in this area, i would like to say that you delivered crisp content on the same. Good job!

    Reply
  • August 30, 2019 at 6:53 pm
    Permalink

    Great

    Reply
  • August 30, 2019 at 7:17 pm
    Permalink

    I saw this in linusTechtips 2years ago .

    Reply
  • August 30, 2019 at 7:45 pm
    Permalink

    I had a doubt if some one gets the access for open connect as it is a local cache then there is big potential problem right there. And why we should require an open connect it means you have your data in netflix as well as in open connect. it means wastage of data same data is present at 2 places

    Reply
  • August 30, 2019 at 7:55 pm
    Permalink

    One question.
    There are several isp in india so does Netflix type of cache it in each and every isp's and is it safe because any since it's the cache of the isp's the contents can easily be confiscated or does Netflix takes care it the security by itself?

    Reply
  • August 30, 2019 at 9:08 pm
    Permalink

    Hi Gaurav!
    Neatly presented ideation of Netflix! I have a question about open connect as talked in your video.
    1. How movie will be inserted in open connect if it's out of the box of interest of user? E.g. I want to see Korean series.
    A. What algo they use to prevent load for this kind of interest and how they figure out?
    B. Do they use ML for next chunk prediction or explicit programming?
    2. Which recommendation algo they use for predictions of favourites?

    Thanks 🙏🏻 and keep it up!

    Reply
  • August 30, 2019 at 10:12 pm
    Permalink

    Your videos are pure delight to watch !

    Reply
  • August 30, 2019 at 10:24 pm
    Permalink

    I just love your system design videos.

    Reply
  • August 30, 2019 at 10:51 pm
    Permalink

    1020 or 1080 ?

    Reply
  • August 31, 2019 at 2:16 am
    Permalink

    Its DNS not ISP

    Reply
  • August 31, 2019 at 2:40 am
    Permalink

    I was always more amused about Hotstar instead of Netflix.

    Reply
  • August 31, 2019 at 4:03 am
    Permalink

    keep up the awesome content!

    Reply
  • August 31, 2019 at 4:24 am
    Permalink

    Does this chunks is also used in music apps like Gaana or Spotify

    Reply
  • August 31, 2019 at 4:24 am
    Permalink

    Quality content!

    Reply
  • August 31, 2019 at 5:09 am
    Permalink

    Have been working in this arena for a while now, you got everything correct man, other stuff you post usually goes over my head because I haven't dabbled in a lot of those things but for once, felt nice to already know what you were gonna say. Haha.

    Reply
  • August 31, 2019 at 5:38 am
    Permalink

    To be honest learned something useful today, and willing to learn from you more like this kinda of topic!

    Reply
  • August 31, 2019 at 5:59 am
    Permalink

    Is this similar or related to what Netflix came out with, "Zuul architecture"

    Reply
  • August 31, 2019 at 6:52 am
    Permalink

    You have the most unique content on Youtube, man. No bs, pure knowledge. Keep going.

    Reply
  • August 31, 2019 at 7:03 am
    Permalink

    Hey gaurav nice video. Just one doubt. Isn't the client and open connect/netflix servers talk directly once ISP found the location of site ? I see you are speaking abt ISP taking load. Can you explain more on this

    Reply
  • August 31, 2019 at 7:51 am
    Permalink

    amazing stuff….you have got a new subscriber….

    Reply
  • August 31, 2019 at 7:57 am
    Permalink

    Thanks Gaurav

    Reply
  • August 31, 2019 at 10:37 am
    Permalink

    Cache or Content Delivery Network?

    Reply
  • August 31, 2019 at 10:52 am
    Permalink

    Awesome explanation gaurav , even though i m not learning system design i learnt a lot about the way u taught the concepts and correlated to some of the concepts currently i have been studying , 🙂 thnkx bro , loved ur 'follow ur passion' blog on wordpress 🙂

    Reply
  • August 31, 2019 at 11:49 am
    Permalink

    Why did you tone down the technical aspects by this much for this video?

    Reply
  • August 31, 2019 at 12:10 pm
    Permalink

    nice video thanks for sharing. how do you get to know this?

    Reply
  • August 31, 2019 at 1:08 pm
    Permalink

    Great video Gaurav. Keep up the work. Can you please do a system design video on amazon subscription

    Reply
  • August 31, 2019 at 2:08 pm
    Permalink

    what is happening to you bro…you don't look good. You seems tired and working a lot..please do something about it.

    Reply
  • August 31, 2019 at 2:31 pm
    Permalink

    Netflix subscribers : We are the coolest people living on this planet ! We watch netflix and chill
    Netflix Engineers : Hold my(our) Beer !

    Reply
  • August 31, 2019 at 2:39 pm
    Permalink

    Hello, how you get this kind of information??

    Reply
  • August 31, 2019 at 3:06 pm
    Permalink

    What about user authentication and authorisation when dealing with open connect??

    Reply
  • August 31, 2019 at 3:40 pm
    Permalink

    Awesome content man…. Keep going..

    Reply
  • August 31, 2019 at 4:38 pm
    Permalink

    Man, I learned a lot. Colleges should teach students like this. Thank you very much for this. 🙂

    Reply
  • August 31, 2019 at 4:47 pm
    Permalink

    Thats a really great explanation. Thanks

    Reply
  • August 31, 2019 at 6:17 pm
    Permalink

    I guess Netflix also use AWS Edge location as well to cache the S3 content and serve local content as well.

    Reply
  • August 31, 2019 at 6:18 pm
    Permalink

    steam

    Reply
  • August 31, 2019 at 6:58 pm
    Permalink

    Can u suggest the best book for understanding and also learning System Design questions, which might also help in interviews

    Pls am in a great need for it….pls

    Reply
  • August 31, 2019 at 9:46 pm
    Permalink

    Thanks for the video Gaurav. What I knew was that Netflix has servers for different regions and not just in the US. Also will not the Open connect or the cache will ever get full? What about when many users are accessing at the same time, like on weekends?

    Reply
  • August 31, 2019 at 9:50 pm
    Permalink

    Superb explanation bro

    Reply
  • September 1, 2019 at 3:57 am
    Permalink

    8:14 Can we do this with CDN ? Are CDN and Cache similar?

    Reply
  • September 1, 2019 at 7:39 am
    Permalink

    hey ur awesome!

    Reply
  • September 1, 2019 at 10:44 am
    Permalink

    Brief and precise.. Nice.

    Reply
  • September 1, 2019 at 12:09 pm
    Permalink

    Thanks for this wonderful insight on the engineering side of Netflix, looking forward to more system design videos.

    Reply
  • September 1, 2019 at 1:29 pm
    Permalink

    Great content always!

    Reply
  • September 1, 2019 at 1:45 pm
    Permalink

    How much money is required to set up netflix system??

    Reply
  • September 1, 2019 at 2:48 pm
    Permalink

    must say u made it really Interesting with editing 😛 and great content!

    Reply
  • September 1, 2019 at 2:59 pm
    Permalink

    Wait. Do we even have 1020?

    Reply
  • September 1, 2019 at 3:17 pm
    Permalink

    Let's talk about tik tok. Why It's so fuckingg fast even in low connection!! What engineering they use and how they manage all.

    Reply
  • September 1, 2019 at 4:48 pm
    Permalink

    Video content is awesome.
    However it's not 1020p the correct standard is 1080p

    Reply
  • September 1, 2019 at 8:18 pm
    Permalink

    Hello Gaurav, can you tell me how caching at the ISP level different from CDN / Amazon Cloudfront ?

    Reply
  • September 1, 2019 at 11:18 pm
    Permalink

    What about the security of such cache boxes? Given that netflix thrives on content… Keeping these movies safe is paramount and the more you distribute the files the more risk you're at. I'm sure they have a way… Just wondering how different would it be to the usual security?

    Reply
  • September 2, 2019 at 3:43 am
    Permalink

    That was great explanation..
    But one question..
    Why going back in movie is time consuming for more than 10 sec?

    Reply
  • September 2, 2019 at 4:16 am
    Permalink

    Another informative video from you. Nice explanation. Thanks. Just wanted to add something on the caching part. Many websites take help of CDN providers like Akamai to do the caching on behalf of them. The CDN providers have the required infrastructure across the globe wherein they have placed their caching servers in most of the countries.

    Reply
  • September 2, 2019 at 4:41 am
    Permalink

    One request.. can't you speak slowly while ?

    Reply
  • September 2, 2019 at 6:49 am
    Permalink

    Don't be surprised if you hit 500K subscribers by the end of 2019. Your content quality is skyrocketing.

    Reply
  • September 2, 2019 at 7:20 am
    Permalink

    Very detailed. Great work!

    Reply
  • September 2, 2019 at 11:17 am
    Permalink

    Domain [1:40, 1:48] 😂😂😂😂

    Reply
  • September 2, 2019 at 12:16 pm
    Permalink

    Great brother more videos please

    Reply
  • September 2, 2019 at 12:48 pm
    Permalink

    Is the caching done also using AWS CloudFront CDN endpoints by Netflix, or do they use Open-connect only ?

    Reply
  • September 2, 2019 at 1:33 pm
    Permalink

    nice your videos are getting interactive like that of sivaraj .
    keep it up

    Reply
  • September 2, 2019 at 2:01 pm
    Permalink

    So well explained 🤟

    Reply
  • September 2, 2019 at 2:54 pm
    Permalink

    Aren't those boxes which serve 90% of content are CDNs'? And how will Netflix decide which videos' should be stored in the boxes near to the region?

    Reply
  • September 2, 2019 at 6:09 pm
    Permalink

    How does that make Open Connect different from a CDN? What you end up achieving is more or less the same. (Maybe CDN won't lead to 90% localization of traffic due to some limitation!?)
    https://openconnect.netflix.com/en/

    Reply
  • September 2, 2019 at 6:38 pm
    Permalink

    Open connect is similar to something like Redis ?

    Reply
  • September 2, 2019 at 7:47 pm
    Permalink

    Nice, bother talking about zero downtimes ?

    Reply
  • September 2, 2019 at 8:58 pm
    Permalink

    Awesome man! Simple and to the point. The best kind of online content.

    Reply
  • September 3, 2019 at 4:07 am
    Permalink

    I recommend reading this article:

    “How Netflix works: the (hugely simplified) complex stuff that happens every time you hit Play” by Mayukh Nair https://link.medium.com/hOGSFphJFZ

    Reply
  • September 3, 2019 at 5:34 am
    Permalink

    Hey Gaurav. What was that about the 4AM time? I couldnt understand what will i understand if I netflixed at 4AM

    Reply
  • September 3, 2019 at 6:12 am
    Permalink

    Trivago explain bro

    Reply
  • September 3, 2019 at 9:27 am
    Permalink

    7:15 That table is called a DNS

    Reply
  • September 3, 2019 at 10:28 am
    Permalink

    It's amazing to see how Guarav improves the quality of his content, and I can tell you, guys, as a newbie tech Youtuber, it's a big deal. Keep it up, bro! 😉

    Reply
  • September 3, 2019 at 12:00 pm
    Permalink

    Netflix is hosted on Amazon cloud , we have highly available and scalable infrastructure .

    Reply
  • September 3, 2019 at 4:34 pm
    Permalink

    please please put subtitles in your videoes

    Reply
  • September 3, 2019 at 5:24 pm
    Permalink

    How ISP know what I am requesting to any website? Isn't it a encrypted request? ISP do know which website you want to visit but not what you want from a site?

    Reply
  • September 4, 2019 at 3:21 am
    Permalink

    Where is this Open Connect placed and maintained like for say Indian users?…Like Uber does for updating it's db about uber rides…

    Reply
  • September 4, 2019 at 3:53 am
    Permalink

    Hi gaurav, please make video on stepsetgo app..how the whole model work.

    Reply
  • September 4, 2019 at 8:02 am
    Permalink

    Great content, bro.
    I would like to see a video about AWS architecture

    Reply
  • September 4, 2019 at 10:37 am
    Permalink

    Hey Gaurav, loved your video. I had one suggestion, there is this technology called pixel shuffling, which is used to reconstruct the video playback. Can you delve deeper into how is that being done?
    Thanks a lot

    Reply
  • September 4, 2019 at 4:36 pm
    Permalink

    Since you said the Open Connect box just has data, how are the requests being authenticated?

    Reply
  • September 4, 2019 at 4:49 pm
    Permalink

    But their are a lot of ISPs in any country.all they have separate memory boxes?

    Reply
  • September 4, 2019 at 7:24 pm
    Permalink

    What is so revolutionary about Open Connect boxes? Its nothing but placing your localised servers close to the clients isn't it ? I will be shocked to know it had not been prevalent before Netflix did

    Reply
  • September 4, 2019 at 8:31 pm
    Permalink

    Sometimes it feels like I myself am God!

    Reply
  • September 5, 2019 at 4:03 am
    Permalink

    Isn't this ISP thing similar to CDNs anyway? What is the difference?

    Reply
  • September 5, 2019 at 6:17 am
    Permalink

    Happy Teachers Day Sir! U taught me a lot! I really appreciate that!

    Reply
