
T3CON Recap—Web Analytics: Balancing Data Collection, Privacy, and User Experience

The possibility to collect almost any data you want about users has created a Wild West situation on the internet. In response, governments have created a whole raft of privacy legislation — and many people question the purpose of collecting so much data.

Alexander Veit shared his extensive experience on the topic in his T3CON23 talk, “Analytics You Want vs. Analytics You Need.” Veit offered a lively history of web analytics, up to the present day.

He suggested that a more streamlined approach to analytics could be superior for meeting business goals, because it can improve user experience.

Read on for a full recap of this session, or catch up on what else you missed at T3CON23 and get ready for T3CON24.

About the Speaker

Alexander Veit is the founder of TWIPLA — “The Website Intelligence Platform” — which is the number-one analytics app on Wix and has been installed on more than 2.5 million websites. He is also a dedicated member of the TYPO3 community, having been a user, configurator, and programmer since 2001.

The internet evolves

Veit started his talk by tracing the evolution of the internet and web browsers, showing how they have enabled more web analytics capabilities over time.

The internet itself began as a scientific, institutional tool to ease and simplify communication between scientific establishments back in the late 1960s. Veit discussed how it evolved, leading up to the World Wide Web in the late 1980s. TCP/IP was the foundational communication protocol of the internet that allowed different types of computers to talk to each other.

More recently, we have come to characterize further evolutions of the internet under the ideas of Web 1.0 (the original, read-only approach), Web 2.0 (the addition of social media and user-generated content), and Web 3.0 (focusing on decentralization, with buzzwords like blockchain and cryptocurrency).

Analytics — from server to browser

Veit explained that a crucial turning point for analytics came in 2007, when analytics tools were integrated into the web browsers themselves.

Up to this moment, web analytics had relied on simple text files generated by web servers to track page requests and basic visitor data. Tools like AWStats were popular in the late 1990s, and helped provide a rudimentary understanding of web traffic. They parsed these text files to generate basic visualizations, like bar and line charts.
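To make the log-file approach concrete, here is a minimal Python sketch of the kind of processing such tools performed. It is not AWStats code: the sample lines, the assumption that the server writes Common Log Format, and the function names count_page_requests and text_bar_chart are all illustrative.

```python
import re
from collections import Counter

# Common Log Format: host ident user [timestamp] "METHOD path protocol" status size
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def count_page_requests(lines):
    """Count successful GET requests per path, skipping lines that do not parse."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue
        if match["method"] == "GET" and match["status"].startswith("2"):
            hits[match["path"]] += 1
    return hits


def text_bar_chart(hits, width=20):
    """Render the counts as a crude text bar chart, most requested path first."""
    if not hits:
        return ""
    top = max(hits.values())
    rows = []
    for path, count in hits.most_common():
        bar = "#" * max(1, round(width * count / top))
        rows.append(f"{path:<12} {bar} {count}")
    return "\n".join(rows)


# Invented sample data using documentation-only IP addresses.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.2 - - [10/Oct/2000:13:56:01 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [10/Oct/2000:13:57:12 -0700] "GET /about.html HTTP/1.0" 200 1104',
    '192.0.2.3 - - [10/Oct/2000:13:58:40 -0700] "GET /missing.html HTTP/1.0" 404 209',
]

if __name__ == "__main__":
    print(text_bar_chart(count_page_requests(SAMPLE_LOG)))
```

Everything here happens on the server side, from data the server already has. No cookie or script runs on the visitor’s device, which is also why this style of analytics sees far less than the browser-based techniques that followed.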

Around 2007, there was a notable transition to more advanced analytics methodologies. This coincided with the emergence of new web browsers, such as Safari and Firefox, that could exploit technologies like cookies and JavaScript. These features had originally evolved to improve users’ web experience, but now heralded the move from server-based analysis to more sophisticated techniques implemented within the browser itself.

The door to a multitude of data

With browser-based analytics, it became possible to record a wide array of events and interactions on a device: everything from writing data to the device and retrieving information from it, to acquiring detailed insights from the host system. This extensive list of events includes every keystroke and a multitude of other actions that can be recorded for analysis purposes.

As Veit observed: “The browser opens not only the door to the internet, but also the door to a lot of data.”

“The web browser is the door to the internet. And yes, you’re right, there’s quite a long list of possibilities that have been added to web browsers in terms of behavioral web browser defaults, writing to the device, getting data from the device, acquiring information from the host, a list of tonnes of events where every keystroke, everything has been recorded, or can be recorded.”

How the internet became the Wild West

The new capabilities unlocked by advanced, browser-based analytics quickly made basic metrics like “How many visits did my page get?” seem boring. Veit outlined some of the new forms of analytics and related business models that emerged:

  • Advanced analytics:
    • User journeys and funnels: Tracking the paths users take through a site to understand navigation and conversion processes.
    • Behavioral analytics: Including heatmaps to visualize where users click and interact most on a page, and session recordings to observe user activity in real time.
    • Optimization and user experience (UX): Using collected data to enhance the functionality and design of a website, improving user interaction and satisfaction.
  • Retargeting: Employing data to show ads to users on different platforms, based on their previous interactions or product views on a site. A typical example is an advertisement for a product showing up on one website after the user has viewed that product in a different online shop.
  • Personalization: Utilizing data stored on website visitors’ devices to tailor their experiences and present more relevant content or product suggestions.
  • Social media and online advertising: The growth and integration of social media platforms and online advertising have significantly accelerated the scope and capabilities of web analytics, enabling more targeted and effective marketing strategies.

Veit explained how the vast amount of data available might at first appear pretty handy, offering many convenient insights. However, it is also scary and raises privacy concerns: the extensive collection and combination of data from your online journey can feel intrusive. He compared it to “Big Brother […] watching you, in order to sell you more goods that you might not even need.”

Businesses built on data

The presentation went on to show how collecting user data can be very profitable. The revenue trends of major players like Google, Amazon, and Meta show a noticeable hockey stick pattern of exponential growth. This contrasts with the more linear growth of internet user numbers. 

While people may be increasingly willing to spend money online, it’s important to recognize that these businesses are fundamentally built on data. Veit explained that this data is predominantly provided by users themselves through their internet activities.

“The internet,” as Veit put it, “the source of the world’s knowledge, as it was planned, somehow turned into the Wild West.”

When companies are focused on generating a return on investment, they encourage users to invest more, thereby funneling more money into their platforms. The underlying mechanism of this economic model is the strategic use of user-provided data for business growth.

“One of the things I encountered myself was retargeting. I went to an online shop, looked for a certain product, and then on a totally different page I visited the day after there was an advertisement for that product. My first feeling was, OK, that feels very handy. But on the other hand, it’s also a bit scary, because there is so much data around.”

Privacy concerns and EU legislation

More recently, legal frameworks have played some role in mitigating the Wild West situation. Veit discussed how lawyer Max Schrems uncovered issues with how Facebook handled deleted data and started the Europe Versus Facebook movement. This movement was instrumental in bringing about 2016’s General Data Protection Regulation (GDPR) and has continued to shape EU privacy legislation since.

The European Union Maastricht treaty contains a clause that privacy is a fundamental right, but this has not prevented US companies built on user data from passing that data from within the EU back to the US. Veit showed how two different data-sharing agreements have been invalidated as a result of complaints brought by Schrems: Safe Harbour, from 2000, was invalidated in the 2015 Schrems I ruling, and 2016’s Privacy Shield by Schrems II in 2020. Veit speculated that the same fate awaits the EU-US Data Privacy Framework, established in 2023.

While the GDPR primarily concerns the storage of a user’s information away from their own control, another piece of EU legislation, the e-Privacy Directive, focuses on accessing data from a user’s device. Both require informed, upfront consent from users, except for cookies or data that are technically essential for providing the service. Compliance with these laws is crucial: failure to obtain proper consent can result in legal penalties and reputational damage.

Changed norms through EU privacy law violations

Veit pointed to the GDPR enforcement tracker for a growing compendium of cases where companies have been found in violation of the EU privacy law. Indeed, the need to comply with the GDPR has brought about a number of notable changes in internet norms, including:

  • No more pre-checked checkboxes: Websites were compelled to update their consent mechanisms to require active opt-in from users, moving away from pre-checked checkboxes for data collection.
  • Facebook Like button: It was argued that this feature could breach GDPR regulations concerning data sharing and user consent, so the button has largely disappeared from third-party websites.
  • Google Fonts: Google Fonts is a library of free, open-source fonts. However, when a website loads Google Fonts (or any other external resource) directly, visitors’ browsers make requests to external servers, in this case Google’s. A German website’s use of Google Fonts led to a €100 fine for GDPR violation, prompting a shift toward hosting fonts on websites’ own servers.
  • Tele2’s Google Analytics penalty: Swedish telecom company Tele2 was fined €1 million for non-compliant use of Google Analytics, leading to a reevaluation of third-party analytics tools under GDPR standards.

“Everything I’ve discussed is equally applicable to apps, games, and other stuff where you interact with user data. We always talk about personal data, which per definition in Germany, is also an IP address. So, sending data to a Google server, with your personal IP address, is sending personal data to the United States.”

Analytics you want vs. analytics you need

Having taken us through the history of analytics, Veit reached the core question of his presentation: What is the best way to implement analytics? He laid out two main options:

  • Option A: Collect all visitor data
  • Option B: Collect and process only the website visitor data that you really need

A common way to implement Option A is a consent management platform (CMP), which displays a banner asking users to accept the website’s collection of data.

CMPs often nudge users into accepting tracking across a range of services via a simple Accept All button. Veit provided an example of a site that allowed users to manually opt in to or out of a range of services — 747 in total!

Effective optimization and the privacy paradox

Such consent banners can be overwhelming for users, who are already skeptical of being tracked across so many parameters and may leave the site as a result. By contrast, companies choosing Option B will see fewer users leaving, but will also collect less data per visitor.

Veit referred to this as the privacy paradox: “privacy measures can actually open the door to more data and better insights.” Options A and B are in fact two ways to get the insights needed for effective optimization, representing:

  • A: Higher data density from fewer website visitors.
  • B: Lower data density, but from (many) more website visitors.

Each approach, Veit argued, has its advantages and disadvantages, and must be selected according to your business case.
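The trade-off can be illustrated with a little arithmetic. The sketch below uses invented numbers for the visitor count, the share of visitors who end up being measured, and the number of data points recorded per visitor. None of these figures come from Veit’s talk, and the Scenario class is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    visitors: int                 # visitors arriving at the site
    share_measured: float         # fraction whose activity ends up in analytics
    data_points_per_visitor: int  # "data density" per measured visitor

    def measured_visitors(self) -> int:
        return round(self.visitors * self.share_measured)

    def total_data_points(self) -> int:
        return self.measured_visitors() * self.data_points_per_visitor


# Assumed figures: with a banner, many visitors leave or decline tracking,
# but each consenting visitor yields a rich record.
OPTION_A = Scenario("A: collect everything, with consent banner", 10_000, 0.3, 40)

# Assumed figures: without a banner almost everyone is measured,
# but only a handful of essential metrics are recorded.
OPTION_B = Scenario("B: collect only what is needed", 10_000, 0.9, 8)

if __name__ == "__main__":
    for option in (OPTION_A, OPTION_B):
        print(
            f"{option.name}: {option.measured_visitors()} visitors measured, "
            f"{option.total_data_points()} data points in total"
        )
```

With these made-up figures, Option A records more data points overall but only sees 3,000 of 10,000 visitors, while Option B sees 9,000 of them with far fewer data points each. Which picture is more useful depends on the business case.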

“The privacy paradox means that privacy measures can actually open the door to more data and better insights. There are two ways to get the insights you need for effective optimization around website goals. Either we have a higher data density from fewer website visitors, meaning cookie banners and all the rejections, or you have a lower data density from many more website visitors. In the end, there is no right or wrong answer. It really depends on what your website is about and what your goals are.”

Reasons to implement a consent management banner

A business might have good reasons to go with Option A and implement a CMP:

  • Data-dependent business model: If a business’s operations and goals are entirely reliant on the data acquired from users, implementing a CMP is crucial. This scenario is typical for companies whose services or products are fine-tuned or personalized based on user data.
  • Unique market position or monopoly: In cases where a business holds a monopoly or operates in a niche market with minimal competition, a CMP can be valuable. Such businesses might rely on detailed user data to maintain their market position, understand their unique user base, or identify potential areas for expansion or improvement, especially when their market presence isn’t easily challenged.
  • Content exchange for data (“Pay with Data”): Some businesses offer content or services in exchange for user data, aligning with Article 7 of the GDPR. In this model, users consent to provide their data as a form of payment to access certain content. Implementing a CMP in such scenarios ensures that the data exchange process is compliant with legal requirements.

No consent banner — 500% more traffic

By contrast, Veit offered an example where Option B gave the best outcome: the website of the Technical University of Munich (TUM), on which he had worked. 

The TUM website experienced a 500% improvement in traffic after removing global consent banners. These banners were initially implemented primarily due to a small number of YouTube video embeds. It seemed unnecessary to ask for global consent when a majority of visitors might not even view the page containing the embeds.

The increase in traffic was attributed to visitors previously rejecting cookie consent or using ad blockers, which negatively impacted user engagement and traffic. By eliminating intrusive consent requests and respecting user privacy more effectively, TUM significantly enhanced the user experience and accessibility of their website.
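One way to avoid a global banner is to ask for consent only on pages that actually contain a third-party embed. The following Python sketch shows that idea in its simplest form. It is not a description of how the TUM site was implemented: the host list, the EmbedFinder class, and the function page_needs_consent_prompt are assumptions made for illustration, and a real site would still need legal review of what requires consent.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hosts treated as third-party video embeds (illustrative, not exhaustive).
EMBED_HOSTS = {"www.youtube.com", "youtube.com"}


class EmbedFinder(HTMLParser):
    """Collect the src of every iframe that points at a known embed host."""

    def __init__(self):
        super().__init__()
        self.embed_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "iframe":
            return
        src = dict(attrs).get("src") or ""
        if urlparse(src).hostname in EMBED_HOSTS:
            self.embed_urls.append(src)


def page_needs_consent_prompt(html: str) -> bool:
    """Return True only if the page contains at least one third-party embed."""
    finder = EmbedFinder()
    finder.feed(html)
    return bool(finder.embed_urls)


# Hypothetical pages: only the second one embeds a video.
PLAIN_PAGE = "<h1>Study programs</h1><p>No embeds here.</p>"
VIDEO_PAGE = '<h1>Campus tour</h1><iframe src="https://www.youtube.com/embed/VIDEO_ID"></iframe>'

if __name__ == "__main__":
    for name, html in (("plain", PLAIN_PAGE), ("video", VIDEO_PAGE)):
        print(name, page_needs_consent_prompt(html))
```

Under this approach, a visitor who never opens a page with a video never sees a consent request at all, and the prompt can be shown next to the embed itself rather than across the whole site.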

Benefits of improved user experience

Veit concluded that, while there are valid cases for both Option A and Option B, Option B might be favorable.

“The majority, 80 or 90 percent of websites, have main goals like presenting your company, presenting your products, presenting your service to the audience. And every time you force them to do something else, which they have not expected, you might lose a certain amount of your audience. It’s also just an obstacle on their way to inform themselves about what you’re about to offer.”

For a majority of businesses, then, adopting a more streamlined approach — just analytics you need — could be a preferable option, simply because it can greatly improve users’ experience of a site.

Be sure to sign up for T3CON24 to see more amazing talks like this one.
