Los Angeles
9:00-12:00, 13:00-16:30
October 26th-29th, 2016
The CDA (Certified Data Analyst) Institute aims to build a world-standard team of professional data analysts to promote the development of business and technology.
Decision science is the brains behind Big Data Analytics. It's not a new field; people have been doing decision science for decades. In the "old days" data was limited, hence complex algorithms were needed to extract useful insights from the data. Given the shift in the data paradigm, we no longer need very complex algorithms. Instead, we need to run simple stuff at scale.
At the end of the day, data science is all about counting smart. In this course, we will learn the essential skills for modern development, and implement industry standard algorithms on large datasets. And, all the while, we will keep it simple.
Decision Science is a mix of Computer Science, Statistics, and Management Skills. In this bootcamp, we will focus on Computer Science (C) and Statistics (S) based on business understanding.
The Big Data Bootcamp of CDA is an intensive 4-day bootcamp on the Decision Science of Big Data. It is a fast-paced, vendor-agnostic, technical overview of the Big Data landscape. No prior knowledge of databases or programming is assumed. The bootcamp is aimed at both technical and non-technical people who want to understand the emerging world of Big Data, with a specific focus on (Big) Data Analysis with R (Apache Spark), Data Visualization, Machine Learning, and Use Cases. Attendees will work on real-world problems with data science!
I. Corporate:
Why should you send your employees to the Big Data Bootcamp of CDA?
Unlike other Big Data training sessions, our bootcamp is unique in the following aspects:
Training is provided by Big Data experts from the US with hands-on experience
Our training is vendor agnostic and provides hands-on exercises in installing and running jobs on R
We can customize our training to your corporate needs
4 days of intense training (8 hrs/day) with multiple use cases to practice (equivalent to one full month's training from other sources)
A Big Data Certification of CDA will be provided upon attending all 4 days and completing the training
II. Individual:
Unlike other Big Data training sessions, our bootcamp is unique in the following aspects:
Training is provided by Big Data experts from the US with hands-on experience
4 days of intense training (6 hrs/day) with multiple use cases to practice (equivalent to one full month's training from other sources)
Trainees will get a good overview of Big Data and be able to gain employment in the highly lucrative Big Data industry
Top performers in the class will get placement assistance from the trainers
Our training is vendor agnostic and provides hands-on exercises in installing and running jobs on R
A Big Data Certification of CDA will be provided upon attending all 4 days and completing the training
III. Course Details:
Day 1 Morning: Introduction to Data Science
What is Data Science? Learn about the importance of data, machine learning, and big data. Find out about CDA Institute’s free online resource. And get a feel for popular open data science tools through IBM's Data Scientist Workbench, including Jupyter (IPython) Notebooks, RStudio IDE, and Apache Spark.
Day 1 Afternoon: Introduction to R Programming for Data Science
R is a popular programming language used in data science. Don't know anything about R or need a refresher? This session is just right for you. We cover the basics of R programming, which will provide a strong foundation for the data analysis, data visualization, machine learning, and big data sessions later in this bootcamp.
Topics covered:
Getting started with the R environment and libraries
Numbers, variables, logical statements in R
Arrays, Matrices, Lists and Dataframes
Reading data from files
Loops and conditional statements
Custom functions
Day 2 Morning: Data Analysis with R
Learn how to analyze data using R. This section will take you from the basics of R to exploring many different types of data. You will learn how to prepare data for analysis, perform simple statistical analyses, create meaningful data visualizations, predict future trends from data, and more!
Topics covered:
Importing Datasets
Cleaning the Data
Dataframe manipulation
Summarizing the Data
Day 2 Afternoon: Data Visualization with R
A picture is worth a thousand words - or should we say data points? In this section, we will go through how to plot the major graphs in R. Learn how to plot bar graphs, line graphs, histograms, and more. Then jump into visualizing text data with word clouds. Create visualizations of geographic maps. Finally, learn how to create an interactive visualization of earthquakes.
Topics covered:
Intro to data visualization with R
Basic Plots (Bar graphs, histograms, pie graphs)
Scatterplots and line graphs
Word Cloud
Leaflet Maps
Shiny Dashboard Tutorial - Earthquakes
Day 3 Morning: Machine Learning with R
How can we get machines to learn from the data on their own? In this part you will get an overview of machine learning algorithms. To get hands-on practice with machine learning, you will work with real datasets and practice data mining techniques to predict house prices, classify food recipes, cluster weather stations, and create a recommender system for books.
Topics covered:
Overview of Machine Learning
Regression
Classification (Decision trees)
Clustering (k-means)
Recommender System (collaborative filtering)
Day 3 Afternoon: Big Data Analysis with R
You will learn how to work with Big Data using Apache Spark, an engine for distributed processing of big data. SparkR is an R package that provides a lightweight front end for using Spark from R.
You will read data from a big dataset and apply preprocessing operations to it.
Topics covered:
Intro to Apache SparkR
Reading data from a big dataset
Selecting data, filtering, and aggregating big data
Day 4 All Day: Data Science Capstone Project
Let's solve real-world problems with data science! In this section, you will be given mentor-guided time to apply your newly acquired skills to a real problem. Your project will include finding a problem, searching for an open dataset, preprocessing the data, summarizing and visualizing the data, and applying machine learning to find insight in your data. You will publish your findings online and present your results to your classmates.
IV. Who Should Attend:
Engineers, Developers, Architects, Analytics professionals, Networking specialists, Managers, Executives, Students, Professional Services, DBA, Data Analyst, Sales, Pre Sales, Consultants, Technical Marketing, PM, Teaching Staff...
V. Bootcamp Location
Los Angeles (specific address will be confirmed 2 weeks before the bootcamp starts)
VI. Lab Requirements
Each student should bring their own laptop (Windows 7/8/10 or Mac, virtualization enabled, minimum 8 GB RAM and 25-50 GB of free hard disk space) with administrative privileges and wireless connectivity. If you have an AMD laptop, it should be AMD-V enabled. If you have an Intel laptop or a Mac, it should support Intel VT-x. An extra USB drive of at least 16 GB will be handy if you want to keep all files and images on your personal drive.
VII. Tuition
$800 USD
VIII. Payment
Although ordering online with a credit card is preferred, we accept several other types of payment: wire transfers, checks, and invoices / purchase orders.
IX. Refund Policy
No refunds will be given for cancellations.
If you have any questions concerning the Big Data Bootcamp of CDA, please do not hesitate to contact shenwenting@pinggu.org or Skype: davidfnck