Since the first Data Transparency Lab kick-off workshop back in November last year, we’ve come a long way. Today we’re very excited to reveal the first 6 projects that will be funded by the DTL with a total grant of 300,000 euros – 50,000 euros for each.
All of the projects chosen epitomise the core aims and ambitions of the DTL: to support research in tools, data and methodologies that shed light on the use of personal data by online services, giving people more control over their data. The grants are aimed at supporting, fully or partially, the relevant work of a Principal Investigator (PI) and at least one PhD student or postdoc for a period of approximately a year.
For those who don’t already know, the DTL is a community-based effort, established by Telefonica, Mozilla, MIT and the ODI, to reveal the flow and usage of personal data online, and to explore ways towards a transparent and respectful data trade in the future. It brings together academics, public institutions, startups and large companies to create an independent research community.
Following a call for proposals in April this year, over 60 applications were received and reviewed. A specialist research committee was set up, chaired by Krishna Gummadi (Max Planck Institute for Software Systems) and Nikolaos Laoutaris (Telefonica) and consisting of data and privacy experts from numerous academic institutions such as Boston University, Princeton and Cambridge, as well as from organisations and businesses such as AT&T and INRIA. This specialist committee was tasked with assessing each of these proposals and judging their suitability based on the following research areas…
- Reverse-engineering personal data usage in online services (e.g. advertising, recommender services, pricing and availability of goods & information)
- Detecting personal data gathering by online services
- Privacy-preserving personal data analytics / management tools
- Raising user and societal awareness of data-use
A number of proposals were then put forward to the DTL board, where a final decision was made on the 6 announced today (details below). The DTL board is made up of representatives of the four DTL founding members – Professor Sandy Pentland, MIT; Alina Hua, Mozilla; Pablo Rodriguez, Telefonica and Jeni Tennison, Open Data Institute. The winning project teams will now receive their funding to start building the tools and platforms proposed and will provide an update on their progress at the DTL Conference 2015, which takes place in November.
At a time when the power and significance of data are indisputable but fears over privacy and data use are understandably high, we need to find a balance: one that recognises the value of data but puts people in control and gives them the transparency they deserve. We think that the DTL – a collaboration with businesses, organisations and academic institutions – is a step in the right direction, and through funding this first set of projects we’re confident we can start to make data transparency a reality and retain trust in our digital society. The future of the Web relies on it.
DTL 2015 Project Details
The 6 successful proposals that will be funded consist of 5 tools and one platform. Each will receive 50,000 euros and be supported over the course of a year.
Providing Data-Driven Privacy Awareness
Lorrie Faith Cranor (Carnegie Mellon University) and Blase Ur (Carnegie Mellon University)
“Build and test a data-driven privacy tool that enables users to explore precisely on which webpages different companies have tracked them, as well as what those companies may have inferred about their interests. In addition to releasing a privacy tool as a fully functional, open-source project, we will conduct a 75-participant, 2-week field trial comparing visualizations of personalised tracking data.”
Revealing and Controlling Mobile Privacy Leaks
David Choffnes (Northeastern University), Christo Wilson (Northeastern University) and Alan Mislove (Northeastern University)
“Improving privacy in an environment of ubiquitous connectivity and rich sensors requires trusted third-party systems that enable auditing and control over leaks of personally identifiable information (PII).
“First, we will investigate how to use machine learning to reliably identify PII from network flows, and identify algorithms that incorporate user feedback to adapt to the changing landscape of privacy leaks. Second, we will build tools that allow users to control how their information is (or is not) shared with second and third parties. These tools will be deployed as free, open-source applications that can run in a number of deployment scenarios, including on a device in a user’s home network, or in a shared cloud-based VM environment.”
FDVT: Personal Data Valuation Tool for Facebook Users
Angel Cuevas (Universidad Carlos III de Madrid) and Raquel Aparicio (Universidad Carlos III de Madrid)
“The goal of this project is to develop a tool that informs people (in real-time) of the economic value of their personal information associated with their browsing activity. Due to the complexity of the problem, this particular project will narrow the scope of this tool to just FB to begin with, i.e., inform FB users of the value that they are generating for FB. We’ll call this the FB Data Valuation Tool (FDVT).”
Digital Halo: Browsing History Awareness
Arkadiusz Stopczynski (Technical University of Denmark), Mieszko Piotr Manijak (Technical University of Denmark), Piotr Sapiezynski (Technical University of Denmark) and Sune Lehmann (Technical University of Denmark)
“Our online browsing history is intensely personal. Our search terms and the web pages we visit reveal our fears, interests, illnesses, and secret ambitions.
“A few years ago, the Immersion project originating at the MIT Media Lab received worldwide press coverage by visualizing the latent social information contained in our email header information. We aim to do something similar for web browsing. Using topic models, we aim to design a simple dashboard that allows individuals to visualize the content of their browsing, and observe how these topics change over time. Crucially, we will combine this visualization with information on data trackers (how many tracking parties, how much outgoing information), thus allowing users to directly observe what the data tracking means for them.”
Characterising Trade-offs Between Privacy and Function
Nick Feamster (Princeton University) and Sarthak Grover (Princeton University)
“In this project, we propose to develop mechanisms and tools to better understand the following two questions:
- How much data does a user reveal during the course of normal browsing activity?
- To what extent does the data that a service keeps about a user help meaningfully personalize the service?
“We will conduct controlled studies that explore the extent to which decisions that a user makes about protecting privacy may ultimately cripple the function or usability of an Internet service.”
Reverse-engineering online tracking: From niche research field to easy-to-use tool
Arvind Narayanan (Princeton University) and Steven Englehardt (Princeton University)
“At Princeton we have built OpenWPM, a platform for online tracking transparency. We have used it in several published studies to detect and reverse-engineer online tracking. In the proposed work, we aim to democratize web privacy measurement: transform it from a niche research field to a widely available tool.
“We will do this in two steps. First, we will use OpenWPM to publish a ‘web privacy census’ (a monthly web-scale measurement of privacy, comprising 1 million sites). The census will detect and measure most of the types of known privacy violations reported by researchers so far: circumvention of cookie blocking, leakage of PII to third parties, canvas fingerprinting, and more. Second, we will build an analysis platform to allow anyone to analyse the census data with minimal expertise. The platform will allow packaging and distributing study data, scripts, and results in a format that’s easy to replicate and extend.”