University of Oxford, Software Engineering
CLO

Cloud Computing and Big Data

Cloud computing and big data techniques are changing the way we collect, analyse, store and use data. This course looks at the theory and the practical technologies behind big data and cloud computing.

Course dates

16th July 2018, Oxford University Department of Computer Science. 0 places remaining.
8th July 2019, Oxford University Department of Computer Science. 0 places remaining.

Objectives

This course aims to show how cloud computing and big data techniques can be used to solve massive-scale problems. It introduces students to both the theoretical background of cloud computing and its practical applications. A large focus is the processing of large datasets using map-reduce and other Big Data techniques. In addition, the course covers approaches to building applications and managing them on the cloud.

Contents

The course will cover the following topics:

Origins and background of Cloud Computing: Grids, Parallel computing, Functional programming, Infrastructure as a Service

Using Cloud services: Amazon EC2 fundamentals, Concepts of IaaS, OpenStack and Private Cloud

Map-reduce and Big Data analytics: Map-reduce theory, Hadoop, Hive and Pig, Functional decomposition

Theory of Cloud Computing: CAP Theorem, Eventual Consistency, Shared Nothing architectures, Dynamo algorithm; Amdahl’s Law, Gustafson’s Law, Karp-Flatt Metric; Lambda Architecture, Multi-tenancy, PaaS and SaaS models

NoSQL databases and scalable data storage alternatives, graph databases, Mongo and Cassandra

Case studies and examples

Futures and alternatives to Map Reduce: Real time stream analytics, Generalized functional decomposition, Apache Spark and Storm, Futures

Requirements

The course introduces the fundamentals of creating big data processing systems in the cloud.

Practicals will require programming in Python, as well as the use of the UNIX command line / bash shell. Students do not need significant experience in Python itself, but substantial programming experience is required because the course exercises involve writing big data analytics code. This course is not suitable for students who have no practical experience in writing code.

Other aspects that will be helpful for the course are:

  • Functional programming experience: we make extensive use of lambda expressions and functional patterns.
  • Simple SQL expressions
  • Distributed computing and basics of IP networking

Students who have not programmed in Python are expected to use the resources on Python to gain experience before the class. There are also pointers to resources on the command line and SQL. All students are expected to complete the pre-study exercise, which looks at lambda expressions in Python.
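
As a rough indication of the kind of lambda expressions and functional patterns mentioned above, the following sketch counts words in a map-reduce style using only the Python standard library. It is not the pre-study exercise itself; the sample data and the names `lines` and `add_pair` are made up for illustration.

```python
from functools import reduce

# Hypothetical sample input, invented for this illustration.
lines = ["big data in the cloud", "cloud data"]

# Split every line into words and flatten into one list.
words = [word for line in lines for word in line.split()]

# filter with a lambda: keep only words longer than three characters.
long_words = list(filter(lambda w: len(w) > 3, words))

# map with a lambda: turn each word into a (word, 1) pair.
pairs = list(map(lambda w: (w, 1), words))


def add_pair(counts, pair):
    """Fold one (word, n) pair into a new dictionary of counts."""
    word, n = pair
    return {**counts, word: counts.get(word, 0) + n}


# reduce: combine all pairs into a single dictionary of word counts.
counts = reduce(add_pair, pairs, {})

print(long_words)
print(counts)
```

Students who can read this comfortably, including why `add_pair` returns a new dictionary rather than modifying its argument, are likely to be at ease with the functional style used in the practicals.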