Geek Night NCR

Flashback

Tools, Techniques, Performance and Optimizations - all about Big Data.
Join us as we explore some of the most recent developments in Big Data.

12th edition

Oct 7 2017

Geek Night is a regular event to promote the sharing of technical knowledge and increase collaboration among the geeks in the National Capital Region (Delhi). It is organized by a passionate group of programmers and sponsored by Thoughtworks.

We love feedback! If you have any suggestions or cribs, feel free to fill out our feedback form. Don't worry, your feedback will remain completely anonymous.
Geek Night Volunteers

Agenda

9:30 - 10:00 am

Registration

10:00 - 10:30 am

Keynote & Introduction

By: Shipra Shandilya

10:30 - 11:15 am

From App Dev to Data Engineer - How problem solving techniques differ

Many application developers have moved into big data engineering roles. The problem is that they keep applying their earlier programming approaches to big data problems, forgetting that those approaches mostly suit silo computing (a single machine, with data in MBs). In this session I will explain how to solve problems that are compute-intensive as well as data-intensive.

By: Chandu Kavar

11:15 - 12:15 pm

Number Partitioning Using Optimisation

The Partition Problem is a famous class of problems in Number Theory that deals with dividing a set of positive integers into several subsets so that the sums of the numbers in every subset are equal. There are different variants of the problem, which may include optimisation under various constraints. We took some interesting approaches to solving such problems while working with one of our customers. The talk will cover a few of the approaches identified, detailing their conceptual and mathematical models.
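
For readers new to the problem, the simplest variant (splitting a list into two subsets with equal sums) can be sketched as below. This is only an illustration of the problem statement using a standard subset-sum search; it is not one of the approaches presented in the talk, and the function name `equal_partition` is made up for this example.

```python
def equal_partition(numbers):
    """Split positive integers into two lists with equal sums.

    Returns (left, right), or None when no such split exists.
    """
    total = sum(numbers)
    if total % 2:
        return None  # an odd total can never be split evenly
    target = total // 2

    # reachable[s] holds the indices of one subset whose sum is s.
    reachable = {0: []}
    for i, n in enumerate(numbers):
        # Iterate over a snapshot so each number is used at most once.
        for s, idxs in list(reachable.items()):
            t = s + n
            if t <= target and t not in reachable:
                reachable[t] = idxs + [i]

    if target not in reachable:
        return None
    chosen = set(reachable[target])
    left = [n for i, n in enumerate(numbers) if i in chosen]
    right = [n for i, n in enumerate(numbers) if i not in chosen]
    return left, right


if __name__ == "__main__":
    print(equal_partition([3, 1, 1, 2, 2, 1]))
    print(equal_partition([1, 2, 5]))
```

The search tracks every reachable sum up to half the total, so it is practical only for small inputs; the constrained variants mentioned in the abstract generally need an optimisation formulation instead.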

By: Swapnil Kumar

12:15 - 12:45 pm

Execution engine evolution: MR to Tez

In this session we will talk about the traditional MapReduce execution flow and the Tez execution flow. We will then discuss the advantages of using Tez as an execution engine in Hive and Pig, and finally the differences between MapReduce and Tez.

By: Kuldeep Mishra, Deepika Kamboj

12:45 - 1:45 pm

Lunch

Break Session

1:45 - 2:45 pm

Spark performance tuning

This session is about the Spark execution engine and its performance optimization techniques. It will start with a basic introduction to Spark along with YARN, and then discuss how performance can be optimized by focussing on different job parameters such as resource allocation (executors and memory), dynamic allocation, serialization and shuffle.
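
As a rough illustration of the kinds of job parameters listed above, the sketch below assembles a `spark-submit` command line from a dictionary of Spark properties. The property names are standard Spark configuration keys, but every value, the script name `my_job.py`, and the helper `spark_submit_command` are assumptions chosen for this example, not recommendations from the talk.

```python
import shlex

# Illustrative values only; suitable numbers depend on the cluster and the job.
spark_conf = {
    # Resource allocation: cores and memory per executor.
    "spark.executor.cores": "4",
    "spark.executor.memory": "8g",
    # Dynamic allocation: let Spark grow and shrink the executor count.
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "2",
    "spark.dynamicAllocation.maxExecutors": "20",
    "spark.shuffle.service.enabled": "true",
    # Serialization: Kryo instead of the default Java serializer.
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    # Shuffle: number of partitions used by Spark SQL shuffles.
    "spark.sql.shuffle.partitions": "200",
}


def spark_submit_command(app, conf):
    """Build a spark-submit command string for a YARN cluster."""
    args = ["spark-submit", "--master", "yarn"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    print(spark_submit_command("my_job.py", spark_conf))
```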

By: Shakti Garg, Nisha Kumari

2:45 - 3:30 pm

Streaming your data: Options in the wild

In this beginner session we take a look at the different types of streaming/queueing tools out there and give a very basic overview of the different constructs in each of them, with examples.

By: Atif Syed, Palash Chatterjee

3:30 - 4:00 pm

Tea and networking
