Big Data Analytics with Spark

Gil Vernik

Abstract

I will explain the challenges of integrating current Big Data engines (in particular Apache Spark) with object stores: what the issues are and where they originate. I will discuss what can be done to make this integration more efficient and to remove the barriers that some algorithms face. I will present Stocator, an open source (Apache License 2.0) object store connector for Hadoop and Apache Spark, specifically designed to optimize their performance with object stores.

Speaker


Gil Vernik is a researcher at IBM Haifa, where he works with Apache Spark, Hadoop, object stores, and NoSQL databases. Gil has more than 25 years of experience as a software developer on both the server side and the client side, and is fluent in Java, Python, Scala, C/C++, and Erlang. He holds a PhD in mathematics from the University of Haifa and held a postdoctoral position in Germany.

Requested 2 times

Lecture languages

English, German, Hebrew

Topics

Storage

Duration options

1 hour

Travel/delivery options

In-country; Outside of country: Open for discussion; Remote via video conference

Country

Israel
