Section: Micro-certificate 1: Foundations of Scalable Big Data Systems and Distributed Intelligence | CCS5202 Big Data Technology

This micro-credential provides a comprehensive, intermediate-level foundation in big data. It progresses from core conceptual definitions and applications to hands-on infrastructure management (Hadoop & NoSQL). Students then advance to batch processing techniques (via MapReduce programming) and wrap up with live data streams and real-time technology using Apache Storm.

- Digital Badge 1: Big Data Concepts
  
  Big Data Concepts (slides) File
  
  Big data is characterized by massive volume, high velocity, and diverse variety: more information than traditional data-processing software can handle. To make sense of it, big data computing uses distributed frameworks such as Hadoop and Apache Spark, which split tasks into smaller chunks and process them in parallel across clusters of servers. This computational power drives applications across modern life, from healthcare systems that predict disease outbreaks to retail algorithms that personalize shopping and finance tools that detect fraud in real time. Big data thus turns raw digital noise into the actionable insights that power smart cities and global economies.
  
  Foundations of Big Data infographic Interactive Content
  
  The infographic describes in detail the foundation layers of big data. It also shows the key enablers and tools associated with big data.
  
  Activities for Big Data foundations Interactive Content
  
  Assignment on Big Data Foundations
  
  Opened: Saturday, 16 May 2026, 12:00 AM
  
  Due: Saturday, 23 May 2026, 12:00 AM
  
  Multiple Choice Questions Assessment Quiz
- Digital Badge 2: Big Data Modelling and Management
  
  Big Data Modelling (slides) File
  
  The slides describe the core components and processes involved in big data management and modelling. They outline the Hadoop architecture, highlighting how HDFS handles distributed storage and YARN handles resource scheduling. They then break down data processing services built on programming frameworks such as MapReduce, Apache Spark, and Flink for high-speed analysis. Finally, they compare query-based processing tools such as Hive and Presto, which support efficient data retrieval through SQL-like languages.
  
  Assignment on Big Data Modelling and Management
  
  Opened: Saturday, 16 May 2026, 12:00 AM
  
  Due: Saturday, 23 May 2026, 12:00 AM
- Digital Badge 3: Big Data Processing Technology
  
  Big Data Processing (slide) File
  
  The slides outline the five essential stages of efficient big data processing, beginning with the extraction and validation of data from various sources. They detail how data is transformed into usable formats and loaded into centralized systems before being analyzed with visualization and business intelligence tools. The final stage emphasizes the role of machine learning, explaining how supervised, unsupervised, and reinforcement learning help automate pattern recognition and predictive modeling.
  
  Map Reduce Explained (video) Interactive Content
  
  The video explores the structure and logic of the MapReduce architectural model. It outlines the principles that govern how the system processes large-scale data across distributed networks, examines the individual components of the framework to show how hardware and software work together, and addresses the programming methods needed to implement and manage MapReduce jobs. Overall, it serves as a guide to the internal mechanics of MapReduce.
  
  Map Reduce pt 1 (slides) File
  
  These slides outline the core structural and operational elements of the MapReduce architecture: the concepts that underpin the processing model, the physical components required to build and maintain the system's infrastructure, and the programming techniques used within the framework.
  
  Map Reduce pt 2 (slides) File
  
  These slides continue from part 1. Together, the two parts serve as a foundational guide to how MapReduce organizes and executes large-scale tasks. Through three focus areas (concepts, components, and programming techniques), the material gives a comprehensive look at the mechanics of distributed computing.
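
The MapReduce model covered in Digital Badge 3 can be illustrated with a minimal single-process sketch. This is not taken from the course slides: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are hypothetical, and a real Hadoop job would run the map and reduce steps on separate cluster nodes rather than in one Python process.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document stands in for an input split handled by one mapper.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input: two small "splits".
splits = ["big data big clusters", "data streams"]
counts = word_count(splits)
print(counts)
```

Because each call to `map_phase` depends only on its own split, and each call to `reduce_phase` depends only on one key, both steps can in principle be distributed across machines; only the shuffle needs to move data between them.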

Digital Badge 4: Real-Time Data Processing Technology

Real-time data processing in big data refers to the continuous ingestion, analysis, and streaming of data as it is generated. Unlike traditional batch processing, which collects large volumes of data and processes them at scheduled intervals, real-time processing operates on data almost instantly, often within milliseconds or seconds.
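
The contrast between batch and real-time processing can be sketched in a few lines of Python. This is an illustrative toy, not course material: `batch_average`, `streaming_average`, and the sensor readings are hypothetical names and values, and a production system would read from a message queue or stream processor rather than a list.

```python
def batch_average(readings):
    """Batch: wait for the full collection, then compute the result once."""
    return sum(readings) / len(readings)


def streaming_average(readings):
    """Streaming: update the result incrementally as each reading arrives."""
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        # A fresh result is available after every single event.
        yield total / count


# Hypothetical sensor values arriving one at a time.
sensor_readings = [10.0, 20.0, 30.0, 40.0]

running = list(streaming_average(iter(sensor_readings)))
final_batch = batch_average(sensor_readings)

print(running)
print(final_batch)
```

Both approaches agree once all the data has arrived; the difference is that the streaming version exposes an up-to-date answer after each event instead of only at the end of the interval.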

Real Time Data (slides) File

The slides explore the fundamental principles and practical applications of real-time big data processing, an approach that prioritizes immediate analysis over delayed batch processing. By adopting architectural patterns such as Lambda or Kappa, organizations can ingest and evaluate continuous data streams to achieve operational agility.
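
As a rough illustration of the Lambda pattern, the sketch below keeps a batch view and a speed view and merges them at query time. It is an assumption-laden toy rather than the slides' own example: the function names (`batch_view`, `speed_view`, `serve`) and the page-view events are hypothetical, and in-memory counters stand in for real batch and stream processing systems.

```python
from collections import Counter


def batch_view(master_dataset):
    """Batch layer: recompute counts from the full historical dataset."""
    return Counter(event["page"] for event in master_dataset)


def speed_view(recent_events):
    """Speed layer: count only events not yet absorbed by the batch layer."""
    return Counter(event["page"] for event in recent_events)


def serve(page, batch, speed):
    """Serving layer: answer a query by merging both views."""
    return batch[page] + speed[page]


# Hypothetical page-view events.
historical = [{"page": "home"}, {"page": "home"}, {"page": "cart"}]
recent = [{"page": "home"}, {"page": "checkout"}]

batch = batch_view(historical)
speed = speed_view(recent)

home_views = serve("home", batch, speed)
checkout_views = serve("checkout", batch, speed)
print(home_views, checkout_views)
```

A Kappa-style design would drop the separate batch layer and instead rebuild any view by replaying the full event log through the same streaming code.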

Real-time architecture (slides) File

These slides explore the fundamental principles and technical frameworks behind real-time data processing, focusing on systems that analyze data as it is generated. They highlight the transition from traditional batch processing to continuous data streaming, emphasizing the need for low latency and scalability in modern digital environments.
