C2090-101: IBM Big Data Engineer

Full Name: IBM Big Data Engineer

Exam Code: C2090-101


IBM Big Data Engineer Exam Summary:


Exam Name: IBM Certified Data Engineer - Big Data
Exam Code: C2090-101
Exam Price: $200 (USD)
Duration: 75 minutes
Number of Questions: 53
Passing Score: 65%

IBM C2090-101 Exam Syllabus Topics:


Data Loading (34%)
- Load unstructured data into InfoSphere BigInsights
- Import streaming data into Hadoop using InfoSphere Streams
- Create a BigSheets workbook
- Import data into Hadoop and create Big SQL table definitions
- Import data to HBase
- Import data to Hive
- Use Data Click to load from relational sources into InfoSphere BigInsights with a self-service process
- Extract data from a relational source using Sqoop
- Load log data into Hadoop using Flume
- Insert data via the IBM General Parallel File System (GPFS) POSIX file system API
- Load data with the Hadoop command-line utility (see the loading sketch after this list)
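As a rough illustration of two of these loading paths, here is a minimal Python sketch that shells out to the Hadoop command-line utility and to Sqoop. All paths, the JDBC URL, table names, and the username are hypothetical placeholders; a real Sqoop job also needs password handling, shown here with the -P prompt option.

    import subprocess

    def hdfs_put(local_path, hdfs_dir):
        # "Load data with the Hadoop command-line utility"
        subprocess.run(["hadoop", "fs", "-put", local_path, hdfs_dir], check=True)

    def sqoop_import(jdbc_url, user, table, target_dir):
        # "Extract data from a relational source using Sqoop"
        subprocess.run(
            ["sqoop", "import",
             "--connect", jdbc_url,       # e.g. a DB2 or MySQL JDBC URL
             "--username", user, "-P",    # -P prompts for the password
             "--table", table,
             "--target-dir", target_dir,  # HDFS directory for the imported files
             "-m", "1"],                  # one mapper; raise for a parallel import
            check=True)

    if __name__ == "__main__":
        hdfs_put("access.log", "/user/bigdata/raw")
        sqoop_import("jdbc:db2://dbhost:50000/SAMPLE", "db2inst1",
                     "SALES", "/user/bigdata/sales")
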
Data Security (8%)
- Keep data secure within PCI standards
- Use masking (e.g., Optim, Big SQL) and redaction to protect sensitive data (a generic sketch follows this list)
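Masking and redaction are tool-specific in practice (Optim and Big SQL each have their own facilities). Purely as a generic illustration of the difference, this Python sketch masks versus redacts a simplified 16-digit card number, the kind of value PCI rules govern; the regex and display rule are illustrative only.

    import re

    # Simplified 16-digit PAN: first 4 / middle 8 / last 4 digits.
    CARD = re.compile(r"\b(\d{4})(\d{8})(\d{4})\b")

    def mask_pan(text):
        # Masking preserves the format but hides the sensitive middle digits.
        return CARD.sub(lambda m: m.group(1) + "*" * 8 + m.group(3), text)

    def redact_pan(text):
        # Redaction removes the value outright.
        return CARD.sub("[REDACTED]", text)

    print(mask_pan("card=4111111111111111"))    # card=4111********1111
    print(redact_pan("card=4111111111111111"))  # card=[REDACTED]
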
Architecture and Integration (17%)
- Implement MapReduce (a streaming-style sketch follows this list)
- Evaluate use cases for selecting Hive, Big SQL, or HBase
- Create and/or query a Solr index
- Evaluate use cases for selecting potential file formats (e.g., JSON, CSV, Parquet, Sequence)
- Utilize Apache Hue for search visualization
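For the MapReduce item, a common way to sketch the model without Java boilerplate is the Hadoop Streaming style, where the mapper and reducer are plain scripts reading stdin. The word count below relies on the framework sorting mapper output by key before the reduce phase; the invocation in the comment is illustrative, not an exact command line.

    #!/usr/bin/env python3
    # Word count in the Hadoop Streaming style: the mapper emits "word<TAB>1"
    # lines, the framework sorts them by key, and the reducer sums each run.
    # Illustrative invocation:
    #   hadoop jar hadoop-streaming.jar -mapper "wc.py map" -reducer "wc.py reduce" \
    #       -input /user/bigdata/raw -output /user/bigdata/counts
    import sys

    def mapper(lines):
        for line in lines:
            for word in line.split():
                print(word + "\t1")

    def reducer(lines):
        current, total = None, 0
        for line in lines:
            word, count = line.rstrip("\n").split("\t")
            if word != current:                 # key change: flush the previous run
                if current is not None:
                    print(current + "\t" + str(total))
                current, total = word, 0
            total += int(count)
        if current is not None:
            print(current + "\t" + str(total))

    if __name__ == "__main__":
        mapper(sys.stdin) if sys.argv[1] == "map" else reducer(sys.stdin)
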
Performance and Scalability (15%)
- Use Resilient Distributed Datasets (RDDs) to improve on MapReduce performance (a PySpark sketch follows this list)
- Choose file formats to optimize the performance of Big SQL, Jaql, etc.
- Make specific performance-tuning decisions for Hive and HBase
- Analyze performance considerations when using Apache Spark
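On the RDD item: the usual argument is that an RDD cached in memory lets several actions reuse one parsed dataset, where chained MapReduce jobs would re-read HDFS each time. A minimal PySpark sketch, with hypothetical paths and field layout:

    from pyspark import SparkContext

    sc = SparkContext(appName="rdd-reuse-sketch")

    events = (sc.textFile("hdfs:///user/bigdata/raw/access.log")
                .map(lambda line: line.split(","))
                .filter(lambda fields: len(fields) >= 2)
                .cache())         # materialize once, reuse across actions

    total = events.count()                                    # first pass
    by_key = events.map(lambda f: (f[0], 1)).reduceByKey(lambda a, b: a + b)
    top = by_key.takeOrdered(10, key=lambda kv: -kv[1])       # second pass, served from cache

    print(total, top)
    sc.stop()
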
Data Preparation, Transformation, and Export (26%)
- Use Jaql query methods to transform data in InfoSphere BigInsights
- Capture and prep social data for analytics
- Integrate SPSS model scoring in InfoSphere Streams
- Implement entity resolution within a Big Data platform (e.g., Big Match)
- Utilize Pig for data transformation and data manipulation (a Pig-style sketch follows this list)
- Use Big SQL to transform data in InfoSphere BigInsights
- Export processing results out of Hadoop (e.g., Data Click, DataStage)
- Utilize consistent regions in InfoSphere Streams to ensure at-least-once processing
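For the Pig item, the canonical transform pattern is LOAD, FILTER, GROUP, then an aggregate. The sketch below expresses that dataflow in plain Python for illustration, with the equivalent Pig Latin statement for each step in the comments; the file names and the two-column (store, amount) layout are hypothetical.

    import csv
    from collections import defaultdict

    totals = defaultdict(float)

    # sales = LOAD 'sales.csv' USING PigStorage(',')
    #             AS (store:chararray, amount:double);
    with open("sales.csv", newline="") as f:
        for store, amount in csv.reader(f):    # assumes exactly two columns per row
            amount = float(amount)
            if amount > 100.0:                 # big = FILTER sales BY amount > 100.0;
                totals[store] += amount        # grp = GROUP big BY store;
                                               # out = FOREACH grp GENERATE group, SUM(big.amount);

    # STORE out INTO 'sales_by_store';
    with open("sales_by_store.csv", "w", newline="") as f:
        csv.writer(f).writerows(sorted(totals.items()))
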
