Apache Spark: Deep Dive into Internal Architecture

Join me as we dive into Apache Spark, a data analytics system introduced in a 2010 paper from Berkeley.

Spark is a favorite among software engineers, data scientists, and machine learning engineers. It provides a single general-purpose platform in place of multiple specialized systems, with libraries like GraphX, MLlib, and Spark Streaming (Discretized Streams) built on top of the same engine.

In this video, we see how Spark's in-memory processing outpaces traditional MapReduce, with speedups commonly cited as up to 100x (the 2012 paper in the references reports up to 20x over Hadoop for iterative applications), and explore its scalability, performance, and integration with technologies like Kubernetes.
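
To make the in-memory point concrete, here is a minimal sketch in plain Python. It does not use Spark's API: `ToyDataset`, `run_iterations`, and the `loads` counter are hypothetical names invented for illustration. It assumes the main cost of an iterative job is re-reading its input on every pass, and shows how keeping the data in memory removes those repeated reads.

```python
class ToyDataset:
    """Toy stand-in for an expensive-to-load dataset (hypothetical, not Spark's API)."""

    def __init__(self, loader):
        self._loader = loader   # function simulating a read from slow storage
        self._cached = None     # in-memory copy, filled by cache()
        self.loads = 0          # how many times the loader actually ran

    def _read_from_source(self):
        self.loads += 1
        return list(self._loader())

    def cache(self):
        # Read once and keep the result in memory for later passes.
        self._cached = self._read_from_source()
        return self

    def collect(self):
        if self._cached is not None:
            return self._cached
        return self._read_from_source()


def run_iterations(dataset, iterations):
    """Simulate an iterative job that scans the whole dataset on every pass."""
    total = 0
    for _ in range(iterations):
        total += sum(dataset.collect())
    return total


uncached = ToyDataset(lambda: range(5))
cached = ToyDataset(lambda: range(5)).cache()

uncached_total = run_iterations(uncached, 10)
cached_total = run_iterations(cached, 10)

print("loads without cache:", uncached.loads)
print("loads with cache:", cached.loads)
```

Both runs compute the same result; only the number of reads from the simulated source differs. This mirrors the MapReduce-style pattern of reloading input on each iteration versus reusing data held in memory.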

The core principles behind Spark's design are general-purpose data analytics, fault tolerance, and pluggability with various cluster managers and data sources.
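
The fault-tolerance principle can be illustrated with a small plain-Python sketch. It is a toy model, not Spark's implementation: `ToyRDD`, `partition`, and `lose` are hypothetical names. It assumes recovery works by remembering how each dataset was derived from its parent (its lineage) and recomputing a lost partition from that record, which is the approach the linked 2012 paper describes for resilient distributed datasets.

```python
class ToyRDD:
    """Toy dataset that records its lineage (hypothetical, not Spark's API)."""

    def __init__(self, source=None, parent=None, fn=None):
        self.source = source      # list of partitions, only set on the root dataset
        self.parent = parent      # dataset this one was derived from
        self.fn = fn              # per-element transformation applied to the parent
        self.materialized = {}    # partition index -> computed data held in memory

    def map(self, fn):
        # Record the transformation instead of running it right away.
        return ToyRDD(parent=self, fn=fn)

    def partition(self, index):
        # Compute the partition on demand, walking back through the lineage.
        if index not in self.materialized:
            if self.parent is None:
                data = list(self.source[index])
            else:
                data = [self.fn(x) for x in self.parent.partition(index)]
            self.materialized[index] = data
        return self.materialized[index]

    def lose(self, index):
        # Simulate a node failure that drops one in-memory partition.
        self.materialized.pop(index, None)


base = ToyRDD(source=[[1, 2], [3, 4]])
squared = base.map(lambda x: x * x)

before = squared.partition(1)
squared.lose(1)                  # the partition is gone from memory
after = squared.partition(1)     # rebuilt from the parent via the recorded function

print(before, after)
```

Only the lost partition is recomputed; partition 0 is untouched. The design trade-off in this toy is that nothing is replicated up front, at the cost of recomputation when a failure happens.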

This video is for all engineers who want to understand Spark's architecture and use cases.

00:00 What is Apache Spark?
01:56 Key Features
02:50 High-Level Design
06:30 Data Flow
09:44 Conclusion
10:13 Interesting News!

References:
Spark Paper: https://people.csail.mit.edu/matei/papers/2012/nsdi_spark.pdf
Older Paper: https://people.csail.mit.edu/matei/papers/2010/hotcloud_spark.pdf
Mesos Paper: https://www.usenix.org/legacy/event/nsdi11/tech/full_papers/Hindman.pdf
InterviewReady: http://interviewready.io/resources/

Courses:
System Design Simplified Course: https://interviewready.io/course-page/system-design-course
Low-Level Design Course: https://interviewready.io/course-page/low-level-design-course

#ApacheSpark #SystemDesign #ResearchPaper