AWS re:Invent 2021 - Large-scale distributed training of media ML models with Amazon FSx

In this session, learn about the challenges of scalable distributed training of media machine learning models on multi-GPU nodes used by Netflix and how the Amazon FSx solution is used to resolve the data loader performance bottlenecks of the training system. See the impressive results in terms of performance and throughput improvements on multi-node GPUs and the scalability of Amazon FSx.