October 25-27, 2017 - Prague, Czech Republic
Machine Learning + SMACK
Friday, October 27
 

09:45 CEST

Apache Flink Meets Apache Mesos and DC/OS - Jörg Schad, Mesosphere, Inc.
Apache Mesos allows operators to run distributed applications across an entire datacenter and is attracting ever-increasing interest. Just as Mesos enables wider use of distributed applications, Mesos itself sees increasing use thanks to a growing ecosystem of well-integrated applications. One of the latest additions to the Mesos family is Apache Flink.

Flink is one of the most popular open source systems for real-time, high-scale data processing and allows users to run low-latency streaming analytical workloads on Mesos.

In this talk we explain the challenges solved while integrating Flink with Mesos, including how Flink’s distributed architecture can be modeled as a Mesos framework, and how Flink was integrated with Fenzo. Next, we describe how Flink was packaged to easily run on DC/OS.

Speakers

Jörg Schad

CTO, ArangoDB
Jörg Schad is the CTO at ArangoDB. In a previous life, he worked on or built machine learning pipelines in healthcare, distributed systems, including early Kubernetes code at Mesosphere, and in-memory databases. He received his Ph.D. for research about distributed databases and...



Friday October 27, 2017 09:45 - 10:35 CEST
Congress Hall 1

11:00 CEST

Running Distributed TensorFlow on DC/OS - Kevin Klues, Mesosphere, Inc.
Running distributed TensorFlow is challenging, especially if you want to train large models on your own infrastructure. In this talk, Kevin Klues and Sam Pringle will present an open source TensorFlow framework for distributed training on DC/OS. This framework addresses several challenges associated with distributed TensorFlow, and they hope it will make life much easier for anyone doing machine learning with large models/datasets. Kevin will introduce TensorFlow on Mesos and DC/OS, and Sam will give a live demo of the framework.

Speakers

Kevin Klues

Distinguished Engineer, NVIDIA
Kevin Klues is a distinguished engineer on the NVIDIA Cloud Native team. Kevin has been involved in the design and implementation of a number of Kubernetes technologies, including the Topology Manager, the Kubernetes stack for Multi-Instance GPUs, and Dynamic Resource Allocation (DRA...



Friday October 27, 2017 11:00 - 11:50 CEST
Congress Hall 1

12:00 CEST

How We Built a Highly Scalable Machine Learning Platform Using Apache Mesos - Daniel Sârbe, SDL
Is there a way to combine new architectural patterns such as micro-services with Big Data technologies and run everything in Mesos?

In this talk I will present a novel, highly scalable Machine Learning platform for our Machine Translation use-cases.

I will explain how, in order to reach this goal, we have combined a wide variety of Big Data technologies (like Kafka, HBase, Hadoop), and I will discuss the challenges that we have faced along the way. I will also present how we adopted a containerized micro-services architecture (based on Mesos, Docker, Zookeeper) in order to deploy our highly scalable Machine Learning platform.

Speakers

Daniel Sârbe

Development Manager, SDL
Daniel leads the Big Data and Cloud Machine Translation group at SDL, and for the last two years he has been involved in building a highly scalable Machine Learning platform using technologies from the Big Data ecosystem like Kafka, HBase, Hadoop HDFS, ELK, Mesos in combination...


Friday October 27, 2017 12:00 - 12:50 CEST
Congress Hall 1

14:00 CEST

Building FAST Data Solutions with DC/OS on Azure - Rob Bagby, Microsoft
In this session, we will illustrate how to develop, deploy and manage FAST data solutions at scale on DC/OS and Azure. We will discuss the challenges of stateful containers in the cloud and provide guidance on how to implement both Cassandra and Kafka in DC/OS. We will further discuss how to manage your deployed solution with a partner solution. This session will be demo-heavy, illustrating how to develop Cassandra and Kafka applications locally, run them at scale in DC/OS and manage them with a third-party workflow solution.


Friday October 27, 2017 14:00 - 14:50 CEST
Congress Hall 1

15:00 CEST

Accelerating Spark Workloads in a Mesos Environment with Alluxio - Gene Pang, Alluxio, Inc.
Organizations use Mesos and Apache Spark together to gain insight from large amounts of data. It is common for Spark to process data stored in disparate public cloud storage, such as Amazon S3, Microsoft Azure Blob Storage, or Google Cloud Storage, as well as on-premises data on HDFS, Ceph or ECS. This architecture results in sub-optimal performance because data and compute are not co-located.

Alluxio, a memory-speed virtual distributed storage system, can be deployed on Mesos to connect any compute framework, such as Apache Spark, to storage systems via a unified namespace. Alluxio enables applications to interact with any data at memory speed. Alluxio can eliminate the pains of ETL and data duplication, and enable new workloads across all data. Gene will discuss how Mesos, Spark and Alluxio fit together to achieve an optimal architecture for enterprises.

Speakers

Gene Pang

Head Architect, Alluxio, Inc.
Gene Pang is the PMC Maintainer of the Alluxio open source project and a founding member of Alluxio, Inc. He graduated with a Ph.D. from the AMPLab at UC Berkeley, working on distributed database systems. Before starting at Berkeley, he worked at Google and has an M.S. from Stanford...


Friday October 27, 2017 15:00 - 15:50 CEST
Congress Hall 1

16:00 CEST

Using External Persistent Volumes to Reduce Recovery Times and Achieve High Availability on DC/OS - Dinesh Israni, Portworx Inc
Most modern distributed applications like Cassandra and HDFS provide replication of data across nodes and failure zones to be able to deal with failures. But the time taken to recover to a pre-failure level of redundancy in cases of permanent node failures can be large, since a lot of data needs to be copied over to the new node. Also, some of these applications cannot accept new writes on the nodes being bootstrapped, further increasing the recovery time.

Dinesh Israni will talk about how you can use dcos-commons frameworks for Cassandra, Elasticsearch, HDFS, Kafka and Spark along with external persistent volumes to reduce recovery times for your distributed applications and achieve high availability for applications that don’t provide replication.

Speakers

Dinesh Israni

Senior Software Engineer, Portworx Inc
Dinesh Israni is a Senior Software Engineer at Portworx with over 7 years of experience building Distributed Storage solutions. Prior to Portworx, Dinesh was at Microsoft, through their acquisition of StorSimple, working on their Hybrid Cloud Storage solution. Recently, he has been...



Friday October 27, 2017 16:00 - 16:50 CEST
Congress Hall 1

17:00 CEST

What Building Multiple Scalable DC/OS Deployments Taught Me About Running Stateful Services on DC/OS - Nathan Shimek, New Context
As a systems integrator specializing in cloud transformation projects, New Context has helped customers run mission-critical applications on DC/OS. As part of that work, we’ve overcome a host of challenges that pop up when running stateful services like databases, queues, and key-value stores on top of DC/OS.

DC/OS supports local and external volumes for stateful applications, but there are a number of documented “caveats” that must be overcome: volumes being pinned to hosts, inability to dynamically provision volumes at run time, resource requirements being fixed at task launch, and being limited to one task per volume.

This talk will provide background on the main gotchas of running stateful services on Marathon and DC/OS, and will discuss how to overcome them based on real-world projects conducted alongside some of the largest container users in the world.

Speakers

Nathan Shimek

Vice President of Client Solutions, New Context
Nathan serves as VP of Client Solutions for New Context. He has over 13 years of experience leading high performing operations and development organizations for companies like LifeLock and Saba Software.


Friday October 27, 2017 17:00 - 17:50 CEST
Congress Hall 1
 