Privacy, cookies and web analytics
Dr Lionel Khalil, Director of the Office of Institutional Research and Assessment, Notre-Dame University, Louaize, Lebanon
Abstract: The generalization of monitoring data records is grounded in the need to ensure safety and the protection of rights. There are two different trends in managing cookies in Web analytics, the egocentric and the socio-centric approaches, which, when used simultaneously, are considered too intrusive and are restricted by most regulators. In fact, it is difficult to achieve true anonymity. There is a tradeoff between functionality and privacy. It is not easy to give informed consent, that is, to decide why, when and what to disclose, what is useful to disclose, and to whom (relatives, coworkers, friends). In practice, private data are vulnerable. Even if most of the applications that access the address book have good intentions, several applications are designed solely to steal phonebook contacts. Because the practice of uploading address book data is so widespread and the real use made of these data is unknown, the uncertainty is enough to make even the most trusting of people paranoid. Several actions should be taken to monitor this growing threat to privacy and to inform customers of the main dangers of those applications.
The generalization of monitoring data records is grounded in the need to ensure safety and the protection of rights. Claims of privacy violation arise from the discomfort caused by the mismatch between the level of data recording and the incidental use of these data. It is illusory to define a general level of privacy violation acceptable to all, because this level is specific to each culture.
The common definition of personal information within the meaning of the Privacy Principles covers any data that can lead to the identification of an individual, either alone or when combined with other identifying information. In the case of Web analytics, the information collected may fall within this definition. This is where the debate lies.
The use of Web analytics has exploded with the development of social media. Web analytics helps a software company or website owner to know its customers better, focus its commercial efforts and build better relationships with customers. In practice, Web analytics is based on collecting personal data from the customer, with or without informed consent. The information collected depends on the application and the device (mobile or PC) but falls into the following categories:
- Personal data such as name, device ID number, IP address, location
- Data stored on the device (PC or mobile) such as contacts and photos
- Potential competitor information that can be found in the browsing history and in competitors' cookies
- Behavioral data such as the way of clicking on the page and the keywords used in the search engine
Thanks to Web analytics, website owners can know the customer's personal data, the contacts of his friends, whether he has visited competitors and how he uses the website. That information helps to offer a better service and to place appropriately customized advertising on the screen. For example, Gmail reads the mails received in its customers’ mailboxes and proposes advertising based on the mail content. Most mobile applications use Web analytics without the express consent of the user.
There are two different trends in Web analytics[1]. The first is an egocentric approach, which reflects the traditional forensic trend of identifying the profile and behavior of an individual: the need to trail him, his friends, his connections and his daily activities in order to have a sound portrait of him and to use him as a viral vector for promoting a product or a service. The second is a socio-centric approach, a normative trend that characterizes the imprint a person leaves through his behavior and reduces this personal information to predefined, normalized and schematic categories, which can be used to target him specifically with a generic product fitting his current need.
In Part I we discuss the conflict between profiling a person and his right to anonymity. In Part II we present the need for a legal framework to monitor and limit the use of web trackers such as cookies.
Part I – Profiling a person and the right to anonymity
Users tend to behave in very contradictory ways when they face the issue of privacy. On the one hand, there is an aspiration to make each and every part of their private life public; on the other hand, there is a general sense of privacy violation.
1- The right to anonymity
Does the law, in general, enforce anonymity? Even the right to freedom of movement, expressed in Article 13 of the Universal Declaration of Human Rights, does not necessarily guarantee that this freedom can be exercised anonymously. Nevertheless, a few legal statuses in several countries allow full anonymity: in the health sector, we can recall the anonymity of organ donors, elective abortion and anonymous childbirth. In the press, the reporter's privilege has always been legally defined as the professional privilege of a reporter to maintain the anonymity of his sources, because this relationship furthers the free flow of information to the public. Some auxiliaries of justice and some witnesses can also benefit from anonymity in special cases. In literature, an anonymous work is a work whose author decides, for a variety of reasons, to remain anonymous and publishes either under a pseudonym or under no name at all. This collection of specific rules is not exhaustive, but it shows that anonymity is seen as an exception rather than a general principle.
There is also a major difference between the lawful collection of data under confidentiality and real, global anonymity. Medical bodies, employers, state bodies, lawyers, accountants, etc. have the authority to collect data on persons, and such data are indeed kept secret. This confidentiality is the consequence of the principle of finality (purpose limitation) that governs any collection of personal data. The duty of confidentiality is bound by a professional obligation. Rather than recognizing a full right to anonymity, general principles of law recognize a general right to privacy in any interaction with other bodies of society. This right to privacy seems a fair balance between the need to identify a citizen in his interactions with society and the legitimate need of each person for anonymity.
Privacy has been widely protected since the 1980s by national laws and, more generally, thanks to the guidance of model frameworks: the OECD Privacy Guidelines of 1980[2] and the APEC Privacy Framework of 2004.
In May 2014, the major change concerning privacy and the Internet came from a ruling of the European Union’s highest court, the European Court of Justice, which held that Internet search companies must respect a “right to be forgotten.” This right requires anonymity for data that were collected in the past. It does NOT imply that the collection of the data was illegal. The main issue concerns the names of convicted persons who were, at the time, rightfully reported in the press. Several years later, those persons request that their names be removed from the online archives of newspapers. A balance must be struck between the legitimate need to preserve information and the prejudice caused when the dissemination of this press information becomes harmful to former convicts trying to return to a peaceful life.
2- The use and abuse of personal information to profile individuals
Social network businesses are centered on the exploitation of personal information. More specifically, Truecaller commercializes the aggregation of personal contacts (address books). In practice, the owner of a phone never asks his contacts for authorization before putting their names and numbers in his phonebook. By aggregating the personal phonebook contacts of its users, Truecaller has legitimate access to a wide phone database that shares the contacts of all its users. Any person can request to be removed from Truecaller's shared phone database, but by default any name and number available in the phonebooks of Truecaller users are in the shared database.
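The aggregation mechanism described above can be illustrated with a short sketch. It is not Truecaller's actual implementation: the function name `build_shared_directory`, the data layout and the opt-out set are assumptions made only for this example. It shows how a person who never installed the application still ends up in the shared database through the phonebooks of others.

```python
def build_shared_directory(address_books, opted_out=frozenset()):
    """Merge the address books uploaded by users into one shared directory.

    address_books: dict mapping an uploading user to {phone_number: contact_name}.
    opted_out: phone numbers whose owners asked to be removed (hypothetical).
    """
    directory = {}
    for uploader, contacts in address_books.items():
        for number, name in contacts.items():
            if number in opted_out:
                continue  # removal only happens on explicit request
            # Keep every name seen for a number; the contact was never asked.
            directory.setdefault(number, set()).add(name)
    return directory


# Two users upload their phonebooks; "Carol" never installed the app.
books = {
    "alice": {"+100": "Carol Smith", "+200": "Bob"},
    "bob": {"+100": "Carol (work)", "+300": "Alice"},
}
shared = build_shared_directory(books)
```

In this sketch the number "+100" is listed by default and disappears only when it is passed in `opted_out`, which mirrors the opt-out logic described in the paragraph above.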
The 2012 Path scandal[3] involved the right to access the client's address book. This right is essential to most “Find my friends on this service” features. The popular Path app was caught uploading people’s entire address books and permanently storing them on Path’s servers without requesting any permission.
A recent study by Peter Gilbert et al.[4] (Duke University) has shown that the violation of privacy by apps is a generalized practice. A traffic-monitoring utility called mitmproxy was employed to observe the data flowing between apps and the Internet: out of 30 popular apps, 15 sent information such as personal contacts, microphone capture, camera capture or location to the owner of the application without the consent of the user.
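The checking step of such a traffic observation can be sketched as follows. The sketch does not use mitmproxy or reproduce the method of the study; it only assumes that outgoing request bodies have already been captured in readable form and that known marker values were planted on a test device. The function name `find_leaks`, the hosts and the marker values are hypothetical.

```python
def find_leaks(captured_requests, sensitive_values):
    """Return (host, label) pairs for each sensitive value seen in outgoing traffic.

    captured_requests: list of (host, body) tuples, as a proxy might record them.
    sensitive_values: dict mapping a label to a marker string planted on the device.
    """
    leaks = []
    for host, body in captured_requests:
        for label, value in sensitive_values.items():
            if value in body:
                leaks.append((host, label))
    return leaks


# Marker values planted on a test device (all hypothetical).
planted = {"contact": "+15550100", "location": "33.9510,35.6190"}
traffic = [
    ("api.example.com", '{"friends": ["+15550100", "+15550101"]}'),
    ("cdn.example.com", "GET /logo.png"),
]
leaks = find_leaks(traffic, planted)
```

A plain substring search of this kind only detects data sent unmodified; values that are hashed or encoded before transmission would escape it.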
App developers and owners defend themselves against accusations of privacy violation with the following lines of defense. First, they protect themselves with a data privacy policy or an end-user license agreement and an explicit but mostly uninformed consent. Second, they argue that access to the personal information is officially and technically allowed by Apple or Android. Third, they argue that Web analytics robots access and read the personal information but do not copy it, and that no human has access to it.
Here lies the debate about permission and consent. If the gate of your garden is not locked, does it imply that someone can rightfully enter your garden to take photos? And your house?
For Daniel J. Solove[5], privacy self-management consists of the rights to notice, access and consent regarding the collection, use and disclosure of personal data. Even well-informed individuals face structural problems that reduce their capacity to manage their privacy appropriately. Indeed, an individual has a limited ability to manage privacy rights separately for each and every entity (website, app, etc.) that collects and uses personal data. Moreover, an individual is not able to conceptualize the real threat to privacy, which is the aggregation of several pieces of information over a period of time by different entities. Finally, it is difficult to weigh harm on a long-term horizon against immediate benefits.
On the other side of the debate, Hoofnagle et al.[6] discuss the view that interventions to support consumer privacy interests are paternalistic judgments implying that individuals cannot make proper judgments for themselves. In this view, merely giving consumers some legal or technical mechanism to block such data collection and tracking is paternalistic, because it intervenes in the natural market ecosystem of users and websites.
There is a tradeoff between functionality and privacy. Nevertheless, the complex dilemma of consent remains: consent to the collection, use and disclosure of personal data is often not meaningful, but paternalistic measures deny people the freedom to make consensual choices about their data.
Part II – The need for a legal framework to monitor and limit the use of web trackers such as cookies
Cookies were designed in the mid-1990s, initially to keep track of browser-server interaction during a session on a website or between two sessions. In 1995, the Internet Engineering Task Force (IETF) initiated a standardization process for cookies[7]. In 2000, the IETF published RFC 2965, “HTTP State Management Mechanism”[8], which specifies a way to create a session with HTTP requests and responses.
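The session mechanism can be illustrated with a minimal sketch based on the `http.cookies` module of the Python standard library. The cookie name `session_id` and its value are hypothetical, and the sketch only builds and parses headers; it does not open any network connection.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header sent with the first response.
outgoing = SimpleCookie()
outgoing["session_id"] = "abc123"          # hypothetical identifier
outgoing["session_id"]["path"] = "/"
outgoing["session_id"]["max-age"] = 3600   # kept for one hour, so it can outlive a session
set_cookie_header = outgoing["session_id"].OutputString()

# Browser side: the same name/value pair comes back in the Cookie request header.
cookie_request_header = "session_id=abc123"

# Server side, next request: parse the header and recognise the visitor.
incoming = SimpleCookie()
incoming.load(cookie_request_header)
recognised_session = incoming["session_id"].value
```

The server recognizes the visitor only because the browser returns the identifier it was given, which is also why any party able to set or read that identifier can follow the same visitor.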
Cookies started to raise both security and privacy concerns when third parties started to read them. Indeed, there is limited support for confidentiality, integrity and authentication in the way cookies are used. Because they are used to store core information in the interaction with the website, the possibilities for misusing cookies are very real and are being exploited; they can be misused mainly for profiling.
There are alternative technologies to cookies[9] that are more difficult for users to detect and block. The first group is based on placing on the user's computer a unique identifier that advertisers can use to track individuals (such as ETags, Flash cookies, HTML5 local storage and Evercookies). The other technique is to identify the user by his unique imprint (fingerprint), which is based on the version of the browser used and other characteristics such as fonts, pictures, etc.
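The second technique can be sketched as follows. This is only an illustration of the principle, not the method of any particular tracker: the attribute names and values are hypothetical, and real fingerprinting relies on many more characteristics. The point is that nothing needs to be stored on the user's computer, since the identifier is recomputed at each visit.

```python
import hashlib


def browser_fingerprint(attributes):
    """Derive a stable identifier from characteristics the browser reveals."""
    # Sort the keys so that the same attributes always give the same digest.
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical characteristics observed during two visits of the same browser.
visit_1 = {
    "user_agent": "ExampleBrowser/1.0",
    "fonts": "Arial,Helvetica",
    "screen": "1920x1080",
    "timezone": "UTC+2",
}
visit_2 = dict(visit_1)
# A different browser configuration gives a different identifier.
other = dict(visit_1, fonts="Arial,Courier")
```

Because the user has no stored object to delete, clearing cookies does not change the identifier obtained this way.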
The European legal framework, the European Union’s e-Privacy Directive (2002/58/EC) as amended in 2009 by Directive 2009/136/EC, requires that website users opt in to tracking cookies. Article 5(3) states: “the storing of information, or the gaining of access to information already stored, in the terminal equipment of a subscriber or user is only allowed on condition that the subscriber or user concerned has given his or her consent, having been provided with clear and comprehensive information.” Unfortunately, few E.U. countries have implemented the amended e-Privacy Directive. In France, Article 32-II of the Act of 6 January 1978, amended in August 2011 to transpose Directive 2009/136/EC, adopts this principle. Under the UK Data Protection Act 1998, tracers (cookies or others) that require consent may not be placed on or read from a person's device as long as that person has not consented.
In practice, the limitations set by the French CNIL severely restrict the use and abuse of tracking. The data collected cannot be cross-checked with other processing operations (client files, etc.). The cookie placed must be used solely for the production of anonymous statistics. So even if the user consents, the data collected cannot be reused in another processing operation.
This constraint does not allow a website to mix egocentric and socio-centric uses of cookies. Let us look at several applications of this limitation. In the first scenario, a customer reads a FAQ page several times to understand a feature: the website has no right to proactively advise him or offer personalized assistance. In the second scenario, a user encounters a bug while visiting a website: the website is not allowed to use the cookies to know which browser is used and under which conditions. In the third scenario, the website owner cannot cross-check the usage of his website with the user's type of subscription in order to propose a better-adapted service.
Indeed, by preventing the collected data from being used in both egocentric and socio-centric ways, the limitation imposed on cookie usage by the CNIL limits the capacity to improve the products and services offered by the website.
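The restriction to anonymous statistics can be pictured with a short sketch. It is an interpretation for illustration only, not a compliance recipe published by the CNIL; the event layout and the keys `visitor_id` and `page` are assumptions.

```python
from collections import Counter


def anonymous_page_stats(events):
    """Count page views while discarding every visitor identifier.

    events: iterable of dicts with hypothetical keys "visitor_id" and "page".
    """
    # Only the page is kept; the identifier is neither stored nor joined
    # with another file, so the counts say nothing about a given visitor.
    return Counter(event["page"] for event in events)


events = [
    {"visitor_id": "v1", "page": "/faq"},
    {"visitor_id": "v1", "page": "/faq"},
    {"visitor_id": "v2", "page": "/pricing"},
]
stats = anonymous_page_stats(events)
```

The result shows that the FAQ page was viewed twice but not by whom, which is why the first scenario above, offering personalized assistance to the customer who keeps returning to the FAQ, is out of reach under this constraint.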
Conclusion:
It is difficult to achieve true anonymity. There is a tradeoff between functionality and privacy. It is not easy to give informed consent, that is, to decide why, when and what to disclose: people want to disclose only what is useful, and to choose to whom they disclose it (relatives, coworkers, friends). Nevertheless, the real privacy harm is the aggregation of private data over the long term. Most of the applications that access the address book have good intentions, but several applications are designed solely to steal phonebook contacts. The practice of apps uploading address book data is nowadays so common, and the real use made of these data so poorly known, that it is enough to foster a paranoid attitude. Several actions should be taken to monitor this growing threat to privacy and to inform customers of the main dangers of those applications.
[1] Facebook, Twitter et les autres... p174, by Christine Balagué, David Fayon, PEARSON VILLAGE MONDIAL (26 février 2010) ISBN-13: 978-2744064197
[2] OECD guidelines on the protection of privacy and transborder flows of personal data, by OECD (1980); APEC Privacy Framework, APEC (2004);
[3] Path CEO apologizes for address book uploading, deletes all user data, and updates app with privacy controls, by Nathan Ingraham, February 8, 2012, http://www.theverge.com/2012/2/8/2785217/path-ios-address-book-upload-ceo-apology
[4] Automating Privacy Testing of Smartphone Applications, by Peter Gilbert, Byung-Gon Chun, Landon P. Cox, Jaeyeon Jung (Duke University, Intel Labs), retrieved from http://www.cs.duke.edu/~lpcox/TR-CS-2011-02.pdf
[5] Solove, Daniel J. (John Marshall Harlan Research Professor of Law, George Washington University Law School), Privacy Self-Management and the Consent Dilemma (November 4, 2012). 126 Harvard Law Review 1880 (2013); GWU Legal Studies Research Paper No. 2012-141; GWU Law School Public Law Research Paper No. 2012-141. Available at SSRN: http://ssrn.com/abstract=2171018
[6] Behavioral Advertising: The Offer You Cannot Refuse (August 28, 2012), by Hoofnagle, Chris Jay and Soltani, Ashkan and Good, Nathan and Wambach, Dietrich James and Ayenson, Mika. Harvard Law & Policy Review 273 (2012); UC Berkeley Public Law Research Paper No. 2137601. Available at SSRN: http://ssrn.com/abstract=2137601
[7] HTTP Cookies: Standards, privacy, and politics, by David Kristol, ACM Transactions on Internet Technology, 1(2), 2001, pp. 151–198, available at: http://www.cs.stevens.edu/~nicolosi/classes/sp10-cspriv/ref5-1.pdf
[8] HTTP State Management Mechanism, IETF RFC 2965, published in 2000, available at: http://www.ietf.org/rfc/rfc2965.txt
[9] Behavioral Advertising: The Offer You Cannot Refuse (August 28, 2012), by Hoofnagle, Chris Jay and Soltani, Ashkan and Good, Nathan and Wambach, Dietrich James and Ayenson, Mika. Harvard Law & Policy Review 273 (2012); UC Berkeley Public Law Research Paper No. 2137601. Available at SSRN: http://ssrn.com/abstract=2137601