#### EPFL

#### School of Computer and Communication Sciences

## The Summer Research Institute 2017

14.06.17 - 20.06.17

## Overview

The **Summer Research Institute (SuRI)** is an annual event that takes place at the
School of Computer and Communication Sciences of the École polytechnique fédérale de Lausanne, Switzerland.
The workshop brings together renowned researchers and experts from academia and industry.
It features a number of research talks by speakers from all around the world and is conducive to informal discussions and social activities.
This year's edition of SuRI features two tracks:

**Data Science (June 14 - 16, 2017)****Security and Privacy (June 19 - 20, 2017)**

The event is **open to everyone** and attendance is **free of charge**.

For organisational reasons, however, we would kindly ask you to **register below**.

## Registration

Closed.

## Program (Overview)

**15.06.2017**

**16.06.2017**

**19.06.17**

**20.06.17**

## Program (Detailed)

**14.06.2017**

An effective technique for solving optimization problems over massive data sets is to partition the data into smaller pieces, solve the problem on each piece and compute a representative solution from it, and finally obtain a solution inside the union of the representative solutions for all pieces. Such an algorithm can be implemented easily in 2 rounds of MapReduces or be applied in a streaming model. This technique can be captured via the concept of composable core-sets, and has been recently applied to solve various problems. We will see that diversity maximization and clustering problems, a deterministic variant of this technique applies, and for submodular maximization and column subset selection problems, a randomized variant of composable core-set problem can be applied, e.g., a 0.3 and a 0.58-approximation for non-monotone and monotone submodualr maximization problems. Finally, I show how to improve the above results to achieve a 1-1/e-\eps-approximation for coverage function using a new sketching technique. I will discuss how to design almost optimal streaming and distributed algorithms for coverage problems, and how to use these algorithms in feature selection problems. The main part of the talk summarizes results from a STOC 2015 (with M. ZadiMoghaddam) and a new paper (with H. Bateni and H. Esfandiari). The survey part of the talk is from three recent papers that appeared in PODS'14, NIPS'14, and ICML'16.

PAC-Bayes is a state-of-the-art framework for analysing the generalisation of learning algorithms that incorporates the Bayesian approach, yet provides statistical learning bounds. One feature of the approach is that though the prior must be fixed, we do not need to have an explicit expression for it, only to be able to bound the distance between prior and posterior. Furthermore, the choice of prior only impacts the quality of the bound and not the validity of the results. We will discuss the implications of these observations describing ways in which the prior may be chosen to improve the quality of the bounds obtained. The application of these ideas to the stability analysis for SVMs delivers a significant tightening of the well-known stability bounds.

A long line of research studies the space complexity of estimating a norm l(x) in the data-stream model, i.e., when x is the frequency vector of an input stream which consists of insertions and deletions of n item types.

Restricting attention to norms l (on R^n) that are symmetric, meaning that l is invariant under sign-flips and coordinate-permutations, I will show that the streaming space complexity is essentially determined by the measure-concentration characteristics of l. The same quantity is known to govern many phenomena in high-dimensional spaces, such as large-deviation bounds and the critical dimension in Dvoretzky's Theorem.

The family of symmetric norms contains several well-studied norms, such as all l_p norms, and indeed we provide a new explanation for the disparity in space complexity between p<=2 and p>2. We also obtain bounds for other norms that are useful in applications.

Joint work with Jaroslaw Blasiok, Vladimir Braverman, Stephen R. Chestnut, and Lin F. Yang.

We consider relative error low rank approximation of tensors with respect to the Frobenius norm: given an order-$q$ tensor $A$, output a rank-$k$ tensor $B$ for which $\|A-B\|_F^2 \leq (1+\epsilon)$OPT, where OPT $= \inf_{\textrm{rank-}k~A'} \|A-A'\|_F^2$. Despite the success on obtaining relative error low rank approximations for matrices, no such results were known for tensors. One structural issue is that there may be no rank-$k$ tensor $A_k$ achieving the above infinum. Another, computational issue, is that an efficient relative error low rank approximation algorithm for tensors would allow one to compute the rank of a tensor, which is NP-hard. We bypass these issues via (1) bicriteria and (2) parameterized complexity solutions. Our results give the first relative error low rank approximations for tensors for a large number of robust error measures for which nothing was known, as well as column row and tube subset selection.

**15.06.2017**

The maximum matching problem is among the most well-studied problems in combinatorial optimization with many applications. We consider the problem of approximating a maximum matching in graph streams where the input graph is revealed as a stream of edge updates that may include both edge insertions and deletions. The goal is to design a streaming algorithm that computes an approximate matching in sublinear space, that is, using space that is substantially smaller than the space needed to store all the edges in the graph.

In this talk, we will describe some progress on this problem that precisely characterizes the tradeoff between the space available to the algorithm and the quality of the matching approximation. As it turns out, the study of sublinear space algorithms for matchings is intimately connected to understanding communication complexity of solving the matching problem in a distributed model of computation, called the simultaneous communication model. We will also present some very recent developments on the matching problem in the simultaneous model which show that a simple assumption on how the graph is partitioned completely alters the tractability of the problem.

There is widespread sentiment that it is not possible to effectively utilize fast gradient methods (e.g. Nesterov's acceleration, conjugate gradient, heavy ball) for the purposes of stochastic optimization due to their instability and error accumulation, a notion made precise in dAspremont 2008 and Devolder, Glineur, and Nesterov 2014. This work considers the use of 'fast gradient' methods for the special case of stochastic approximation for the least squares regression problem. Our main result refutes the conventional wisdom by showing that acceleration can be made robust to statistical errors. In particular, this work introduces an accelerated stochastic gradient method that provably achieves the minimax optimal statistical risk faster than stochastic gradient descent. Critical to the analysis is a sharp characterization of accelerated stochastic gradient descent as a stochastic process.

We hope this characterization gives insights towards the broader question of designing simple and effective accelerated stochastic methods for more general convex and non-convex problems. We will also discuss some preliminary experimental results in the non-convex setting.

Starting from the seminal works of Tukey (1960) and Huber (1964), the field of robust statistics asks: Are there estimators that provably work in the presence of noise? The trouble is that all known provably robust estimators are also hard to compute in high-dimensions.

Here, we study a basic problem in robust statistics, posed in various forms in the above works. Given corrupted samples from a high-dimensional Gaussian, are there efficient algorithms to accurately estimate its parameters? We give the first algorithms that are able to tolerate a constant fraction of corruptions that is independent of the dimension. Additionally, we give several more applications of our techniques to product distributions and various mixture models.

This is based on joint work with Ilias Diakonikolas, Jerry Li, Gautam Kamath, Daniel Kane and Alistair Stewart.

Robust inference is an extension of probabilistic inference, where some of the observations may be adversarially corrupted. We limit the adversarial corruption to a finite set of modification rules. We model robust inference as a zero-sum game between an adversary, who selects a modification rule, and a predictor, who wants to accurately predict the state of nature.

There are two variants of the model, one where the adversary needs to pick the modification rule in advance and one where the adversary can select the modification rule after observing the realized uncorrupted input. For both settings we derive efficient near optimal policy runs in polynomial time. Our efficient algorithms are based on methodologies for developing local computation algorithms.

We also consider a learning setting where the predictor receives a set of uncorrupted inputs and their classification. The predictor needs to select a hypothesis, from a known set of hypotheses, and is tested on inputs which the adversary corrupts. We show how to utilize an ERM oracle to derive a near optimal predictor strategy, namely, picking a hypothesis that minimizes the error on the corrupted test inputs.

Based on joint works with Uriel Feige, Aviad Rubinstein, Robert Schapire, Moshe Tennenholtz, Shai Vardi.

**16.06.2017**

Given a set of points P and a (kernel) function K, the Kernel Density Estimate at a point x is defined as the average value of the kernel between the point x and every point y in P. We study the problem of designing a data structure that given a data set P and a kernel function, returns approximations to the kernel density of a query point in sublinear time. This problem and weighted versions arise as basic primitives in statistics (density estimation), machine learning (kernel methods) and scientific computing (physical simulations).

We introduce a class of unbiased estimators for kernel density using importance sampling, implemented through locality-sensitive hashing. We give general theorems bounding the variance of such estimators, that give rise to efficient data structures for estimating the kernel density in high dimensions for a variety of commonly used kernels. Our work is the first to provide data-structures with theoretical guarantees that improve upon simple random sampling in high dimensions.

Joint work with Paris Syminelakis.

Learning in presence of outliers is a critical problem that can heavily affect performance of the learning algorithms in practice. In this talk, we present a general approach for learning with outliers, where we iteratively estimate the model parameters with estimated inliers and threshold out point which seems unlikely to be generated from the model to obtain more refined set of inliers. We instantiate this general approach for the outlier efficient PCA problem and demonstrate that it leads to nearly optimal solution in O(PCA) computation time.

Datasets are often reused to perform multiple statistical analyses in an adaptive way, in which each analysis may depend on the outcomes of previous analyses on the same dataset. Standard statistical guarantees do not account for these dependencies and little is known about how to provably avoid overfitting in the adaptive setting. In this talk I'll describe a new framework to address this problem, an approach based on differential privacy, and several algorithms based on this approach. Based on joint works with Dwork, Hardt, Pitassi, Reingold and Roth (STOC 2015, NIPS 2015) and with Steinke (COLT 2017)

**19.06.17**

Transparency is one of the distinguishing and attractive features of blockchains. It is at odds, however, with another key feature desired by practitioners: Confidentiality. In this talk, I’ll introduce Solidus, a practical system for blockchain transactions that harmonizes these two seemingly contradictory features within a framework used by real-world financial institutions. Solidus uses a new primitive called PVORM (Publicly Verifiable Oblivious RAM Machine) to hide both transaction values and the blockchain transaction graph (i.e., the identities of transacting entities). At the same time, it offers public verifiability of transaction validity and transparency for auditors. A preliminary implementation shows that Solidus can achieve practical throughputs and a high degree of scalability.

When it comes to the topics of machine learning and robotics many authors use the term of «artificial intelligence». From my perspective as a privacy commissioner data processing is so far «artificial» as it is a cultural act, which can vary strongly with regard to the technical devices applied and the speed, quantity, quality or complexity of the processing. In my opinion existing legal principles do cope with any of those characteristics of data processing regardless of the technology it is based on. The rule of law satisfies the need of society to determine a responsibility of the humans who were last perceived in control of a process and it will always give way to solutions in order to reach this goal. Even if my prediction showed to be wrong: If we lost the last bit of control to “something” other than a "human" or deliberately handed power over to it one day, among us humans, the last of "us" in the chain would be held responsible, rather than the "thing" which gathered control over our data.

Routing is at the heart of the Internet and has been a continual source of security problems since its expansion in the 1980s. SCION is a new approach to the Internet, which offers dramatically better security properties than we currently have. We describe a collaborative effort, the Verified Scion Project, at ETH Zurich that aims to verify Scion, going the full distance from high-level network-wide properties down to the code running on SCION routers. We will explain the issues involved, the approach we take, the progress we have made, and perspectives for the future.

The work reported on is joint work between three groups at ETH Zurich: my Information Security Group, the Network Security Group of Adrian Perrig, and the Programming Methodology Group of Peter Mueller.

TLS 1.3 is the next version of the Transport Layer Security (TLS) protocol. Its clean-slate design is a reaction both to the increasing demand for low-latency HTTPS connections, and to a series of recent high-profile attacks on TLS. The hope is that a fresh protocol with modern cryptography will prevent legacy problems; the danger is that it will expose new kinds of attacks, or reintroduce old flaws that were fixed in previous versions of TLS. After 18 drafts, the protocol is nearing completion, and the working group has appealed to researchers to analyze the protocol before publication.

We respond by presenting a comprehensive analysis of TLS 1.3 Draft-18:

(1) We present symbolic ProVerif models for various intermediate versions of TLS 1.3 and evaluate them against a rich class of attacks to reconstruct both known and previously unpublished vulnerabilities that influenced the current design of the protocol. In particular, we test whether TLS 1.3 prevents well-known attacks on TLS 1.2, such as Logjam or the Triple Handshake, even if it is run in parallel with TLS 1.2.

(2) We present a computational CryptoVerif model for TLS 1.3 Draft-18 and mechanically prove its security under standard (strong) assumptions on its cryptographic primitives.

(3) We present RefTLS, an interoperable implementation of TLS 1.0-1.3 and automatically analyze its protocol core by extracting a ProVerif model from its typed JavaScript code.

The talk will mainly focus on (2) and briefly sketch (1) and (3).

**20.06.17**

The internet was built without strong privacy protections, making its use today susceptible to routine mass surveillance. Our research for the past 15 years has focused on building scalable protections for on-line communications that hide content, but also "anonymity systems" that hide the meta-data of who is talking with whom on-line. In this talk I will provide an overview of our latest results in attacking widely deployed anonymity systems, and present our latest designs, showing that such systems can be made robust at scale. Anonymity systems are relevant not only for communications privacy, but also as building blocks for private queries, social networking and private storage, and are likely to become a fundamental building block of information security in the years to come.

Adversarial machine learning which lies in the intersection of learning and security aims to understand the effects of adversaries on learning algorithms and safe guard against them by design of protection mechanisms.

In this talk, we discuss the effect of strategic adversaries in recommendation systems. Such systems can be modeled using a multistage sequential prediction framework where at each stage, the recommendation system combines the predictions of set of experts about an unknown outcome with the aim of accurately predicting the outcome. The outcome is often the 'rating/interest' of a user in an item. Specifically, we study an adversarial setting in which one of the experts is malicious and his goal is to impose the maximum loss on the system. We show that in some settings the greedy policy of always reporting false prediction is asymptotically optimal for the malicious expert. Our result could be viewed as a generalization of the regret bound for learning from expert advice problem in the adversarial setting with respect to the best dynamic policy, rather than the conventional regret bound for the best action (static policy) in hindsight.

As the capabilities of censors increase and their ability to perform more powerful deep-packet inspection techniques grows, more powerful systems are needed in turn to disguise user traffic and allow users under a censor's influence to access blocked content on the Internet. Decoy routing is a censorship resistance technique that hides traffic under the guise of a HTTPS connection to a benign, non-censored 'decoy' site. However, existing techniques far from perfectly disguise themselves as a typical access of content on the decoy server. Differences in latencies, packet sizes, and timings betray their use to a censor capable of performing basic packet and latency analysis.

While many of the more recent decoy routing systems focus on deployability concerns, they do so at the cost of security, adding vulnerabilities to both passive and active attacks. We propose Slitheen, a decoy routing system that hides censored data inside actual traffic from the decoy sites, perfectly preserving the traffic characteristics. Our system is secure against previously undefended passive attacks, and known active attacks. Further, we present our latest work demonstrating how to securely support route asymmetry in decoy routing systems like Slitheen, as well as a tiered deployment strategy to remove barriers to ISP participation in censorship-resistance deployments.

This is joint work with Cecylia Bocovich.

STAR-Vote is a collaboration between a number of academics and the Travis County (Austin), Texas elections office, which currently uses a DRE voting system and previously used an optical scan voting system. STAR-Vote represents a rare opportunity for a variety of sophisticated technologies, such as end-to-end cryptography and risk limiting audits, to be designed into a new voting system, from scratch, with a variety of real world constraints, such as election-day vote centers that must support thousands of ballot styles and run all day in the event of a power failure.

## Speakers

### Data Science Track

### Moses Charikar

(Stanford)

### Artur Czumaj

(University of Warwick)

### Vitaly Feldman

(IBM Research)

### Amir Globerson

(Tel Aviv University)

### Piotr Indyk

(MIT)

### Prateek Jain

(Microsoft Research)

### Sham Kakade

(University of Washington)

### Sanjeev Khanna

(University of Pennsylvania)

### Robert Krauthgamer

(Weizmann)

### Stefano Leonardi

(Sapienza University of Rome)

## Speakers

### Security and Privacy Track

## Directions

If you're flying in, Genève-Cointrin is the nearest international airport. By train, it takes ~45 minutes to reach Lausanne. Zürich Airport is about ~2.5 hours away by train. To get information on train schedules, please see the Swiss Rail website.

To reach the SuRI venue from the EPFL M1 metro station, please check the map below.