The state of web analytics 📊

If you have ever set up a website from scratch, or at least managed one for some time, then there's a high chance that you know exactly what web analytics tools are and what benefits they provide. In today's article, we're going to explore the world of web analytics a bit. What does this particular market look like and what direction is it heading in? What are some of the best tools for the job and what exact data can be retrieved about the user? And finally, my personal opinion and vision of what an ideal web analytics tool should look like. Let's get started! 😁

Why web analytics?

I think that in the modern world, almost everybody who has ever accessed the WWW in some way knows what analytics means. This brilliant word, loved among data researchers and hated among users, refers to collecting, selecting and, naturally, analyzing the data from users of a given product. It's a practice so popular that it can easily be observed in everyday life. But what does it mean for the vast community of web developers, and why should we care?

First, let's take a look at all this stuff from the user's point of view. You most likely wouldn't be pleasantly surprised to learn that someone is constantly monitoring your activity and saving data about you. Of course, you can easily opt out by not accepting cookies or just leaving the given website. Sadly, some sites make cookies a requirement, thus forcing you to accept them. From that moment on, unless you're truly concerned about your privacy, you most likely forget about everything and continue experiencing the web. As for the collected data - we'll talk about that later.

As a web developer, chances are high that you'll end up on the other side of the barricade. Managing a company website, portfolio, blog or even a dedicated site for your OSS project, you'd most likely quickly turn to web analytics. Setting this whole process up couldn't be easier - with Google Analytics, the most popular option on the market, you just create a free account for your website and copy-and-paste a snippet of tracking code. Then you just look at your data and celebrate any kind of growth in the number of visitors. Beyond that, you can see how well different pages do, what the general sources of your visitors are and a lot more... All this data allows you to improve your website, from a small to a really big extent. But in reality (I can't personally prove this statement), when it comes to personal websites and the numbers go into millions, many people stop looking at them so often. I don't know how it is with data analytics companies. 🕵

How does it work?

Before we go any further, I would like to do a quick round-up of how web analytics tools work under the hood. I won't be doing a detailed tutorial on this here (BTW, let me know in the comments below if you would like to see one), just a small overview. We'll mainly take a look at the client side of things, as that's what interests us the most - what data can be obtained and how easily. 📊

General background

Web analytics tools, like many others, are composed of several specific parts. Here we can decouple our software into the tracking code (client), the server code (backend) and the dashboard. Only when all these parts are combined do they deliver a flawless experience.

The tracking code (also known as the snippet that many just copy-and-paste and don't really care about) is probably one of the most important parts of web analytics software. Its job is to collect, store and send data about the user to the server. While collecting the data is done with different JS methods, the storage part is where the infamous cookies come in. 🍪 To know which data is connected with which user, a unique ID is assigned to each new visitor. It's then stored with the help of cookies and attached whenever data is sent to the server, so that different users' sessions can be told apart and compared more easily.
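To make the cookie part a bit more concrete, here's a minimal sketch of how a tracking script might assign and reuse a visitor ID. The cookie name `visitor_id`, the one-year lifetime and all function names are my own assumptions for illustration, not something taken from any particular analytics tool.

```javascript
// Hypothetical cookie name; a real tool would pick its own.
const VISITOR_COOKIE = 'visitor_id';

// Turn a "a=1; b=2" cookie string into a plain object.
function parseCookies(cookieString) {
  const cookies = {};
  for (const pair of cookieString.split(';')) {
    const index = pair.indexOf('=');
    if (index === -1) continue; // skip empty or malformed parts
    const name = pair.slice(0, index).trim();
    cookies[name] = decodeURIComponent(pair.slice(index + 1).trim());
  }
  return cookies;
}

// Random enough for a sketch; not cryptographically secure.
function createVisitorId() {
  return Date.now().toString(36) + Math.random().toString(36).slice(2, 10);
}

// `doc` is anything with a `cookie` property (the browser's `document`).
function getOrCreateVisitorId(doc) {
  const existing = parseCookies(doc.cookie)[VISITOR_COOKIE];
  if (existing) return existing; // returning visitor
  const id = createVisitorId();
  const oneYear = 60 * 60 * 24 * 365; // max-age is given in seconds
  doc.cookie = `${VISITOR_COOKIE}=${id}; max-age=${oneYear}; path=/; SameSite=Lax`;
  return id;
}
```

In a browser you'd call `getOrCreateVisitorId(document)` once per page load and attach the result to everything you send to the server.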

When data arrives at the server, it needs to be processed. Invoking different methods, comparing with older sessions, calculating derived data etc. - it all happens on the server. For example, we can retrieve data about the user's screen size. Then, on the server, we can estimate whether they're using a TV, PC, tablet or phone to access your website. Of course, we cannot forget about some kind of storage for our data - a database. Also, it's important to keep this data well-organized and secure. 😉
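As a rough illustration of this kind of server-side processing, the sketch below guesses a device category from the reported screen width. The breakpoints are purely my assumptions, and screen size alone is only a hint, so treat the result as an estimate.

```javascript
// Guess a device category from the screen width in CSS pixels.
// The thresholds are illustrative assumptions, not an established standard.
function classifyDevice(screenWidth) {
  if (screenWidth < 768) return 'phone';
  if (screenWidth < 1024) return 'tablet';
  if (screenWidth <= 2560) return 'desktop';
  return 'tv-or-large-display';
}
```

For example, `classifyDevice(1920)` falls into the `'desktop'` bucket, while `classifyDevice(375)` is treated as a phone.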

Finally, together with the server, you'd most likely want to have a nice-looking dashboard. The better your data is presented, the more information you can get from it, and the more pleasing to the eye, the better. Obviously, this part isn't strictly mandatory, but in practice it's often expected. It's not really that hard to do and the benefits can be staggering! 😃

Data

With this quick overview in mind, I think it's a good idea to just check out what and how easily certain information can be collected from the user using JS built-in methods.

User sessions

You can count the number of times a given user has visited your site with nothing more than client-server interaction and the user IDs I talked about earlier.
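On the server, that could be as simple as the sketch below. It uses an in-memory `Map` as a stand-in for a real database, and the function name `recordVisit` is made up for this example.

```javascript
// `store` maps visitor ID -> number of visits; a Map stands in for a database table.
function recordVisit(store, visitorId) {
  const count = (store.get(visitorId) || 0) + 1;
  store.set(visitorId, count);
  return count;
}

// Usage: two visits from one visitor, one from another.
const visits = new Map();
recordVisit(visits, 'abc123');
recordVisit(visits, 'abc123');
recordVisit(visits, 'xyz789');
```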

Timing

You'd most likely like to know when your page gets the most visits. That's why you may need to keep track of the user's timing. Knowing details like how long users stay on your page and at what time they visit it gives you a great advantage. For this purpose, you can utilize the JS Date object. This way you can retrieve the time the user enters and exits your website. To keep track of time across different pages of your site, you can save it with localStorage paired with the onload and onunload events. Also, remember that the value returned by .getTime() is the number of milliseconds since the Unix epoch (January 1, 1970, UTC), so it doesn't depend on the user's timezone and you'll have to convert it to your time-format-of-choice if needed. For example, if you'd like to have this time in the user's local timezone for whatever reason, you'd have to use the .getTimezoneOffset() method, which returns the difference between UTC and local time in minutes, and then apply the retrieved value.

const date = new Date();
localStorage.setItem('startTime', date.getTime());
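Building on the `startTime` value saved above, here's a small sketch of the two calculations: time spent on the page and the conversion to the user's local time. The helper names and the `lastVisitDuration` key are my own, and I've used the `pagehide` event instead of onunload, since it tends to fire more reliably in current browsers.

```javascript
// Milliseconds between two timestamps obtained from Date.prototype.getTime().
function timeOnPage(startMs, endMs) {
  return Math.max(0, endMs - startMs);
}

// getTimezoneOffset() returns (UTC - local) in minutes, so we subtract it.
// The returned Date's UTC fields then read as the user's local wall-clock time.
function toLocalWallClock(utcMs, offsetMinutes) {
  return new Date(utcMs - offsetMinutes * 60 * 1000);
}

// Browser-only wiring; skipped when run outside a browser.
if (typeof window !== 'undefined') {
  window.addEventListener('pagehide', () => {
    const start = Number(localStorage.getItem('startTime'));
    const spent = timeOnPage(start, Date.now());
    localStorage.setItem('lastVisitDuration', String(spent));
  });
}
```

For a user in UTC+2, `getTimezoneOffset()` returns `-120`, so a noon UTC timestamp ends up as 14:00 on the converted date.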

Location

If you'd like to know where most of your users come from, you'll have to collect their location. Depending on your needs, a simple IP-to-location service might be just enough. It lets you know the user's country of origin. If you'd like to get more accurate info, you'll need to utilize the Geolocation API. This will give you the user's precise latitude and longitude coordinates. Keep in mind that such sensitive data requires explicit permission from the user.

navigator.geolocation.getCurrentPosition(({ coords }) => {
	// Runs only after the user grants the location permission.
	console.log(coords.latitude, coords.longitude);
});
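Since the user can refuse the permission prompt, it's worth handling that case too. Below is a hedged sketch that wraps the same call with an error callback; `requestLocation` and the shape of the result object are hypothetical, while `getCurrentPosition` and its options come from the standard Geolocation API.

```javascript
// `geolocation` is expected to look like `navigator.geolocation`.
function requestLocation(geolocation, onResult) {
  if (!geolocation) {
    onResult({ ok: false, reason: 'unsupported' });
    return;
  }
  geolocation.getCurrentPosition(
    ({ coords }) => {
      onResult({ ok: true, latitude: coords.latitude, longitude: coords.longitude });
    },
    (error) => {
      // error.code 1 is PERMISSION_DENIED in the Geolocation API.
      onResult({ ok: false, reason: error.code === 1 ? 'denied' : 'unavailable' });
    },
    // Coarse, cached positions are enough for analytics purposes.
    { enableHighAccuracy: false, timeout: 10000, maximumAge: 600000 }
  );
}
```

In a browser you'd call it as `requestLocation(navigator.geolocation, (result) => console.log(result))`.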

Referrer

Whether you share and popularize your website with social media or not, you might want to know what other websites link to your page and from what source you get the most views. It's really easy to access such a value. You only need to read the value of document.referrer.
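The raw value is a full URL (or an empty string when there's no referrer), so you'd usually reduce it to a hostname before counting sources. Here's a small sketch of that; the function name and the `'direct'`, `'internal'` and `'unknown'` labels are my own choices.

```javascript
// Reduce a referrer URL to a traffic source label.
function referrerSource(referrer, ownHost) {
  if (!referrer) return 'direct'; // empty string: typed URL, bookmark, stripped referrer...
  let host;
  try {
    host = new URL(referrer).hostname;
  } catch {
    return 'unknown'; // not a parsable URL
  }
  return host === ownHost ? 'internal' : host;
}
```

In the tracking code this would be called as `referrerSource(document.referrer, location.hostname)`.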

User agent

If you have been into JS programming for a while, you most likely know the infamous user agent string. With proper parsing, it can provide you with information about the user's browser, OS, architecture, and sometimes even the device's name. Of course, in its raw form navigator.userAgent is just a long string that's meaningless without the required knowledge, and the data still needs to be extracted from it. There are quite a few JS libraries made solely for this purpose.
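Just to show the idea, here's a deliberately naive sketch that picks out the browser and OS with a few regular expressions. The rule lists are my own simplification and will misclassify plenty of real-world strings (Chrome on iOS, for instance), which is exactly why dedicated libraries exist.

```javascript
// Naive user agent parsing; order matters because the tokens overlap
// (e.g. Chrome's string also contains "Safari", Android's contains "Linux").
function parseUserAgent(ua) {
  const browsers = [
    ['Edge', /Edg\//],
    ['Opera', /OPR\//],
    ['Chrome', /Chrome\//],
    ['Firefox', /Firefox\//],
    ['Safari', /Safari\//],
  ];
  const systems = [
    ['Android', /Android/],
    ['iOS', /iPhone|iPad|iPod/],
    ['Windows', /Windows NT/],
    ['macOS', /Mac OS X/],
    ['Linux', /Linux/],
  ];
  // Return the name of the first rule whose pattern matches.
  const find = (rules) => {
    const match = rules.find(([, pattern]) => pattern.test(ua));
    return match ? match[0] : 'unknown';
  };
  return { browser: find(browsers), os: find(systems) };
}
```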

Screen size

Getting the user's device screen size is quite easy and obvious with JS. It can be used to determine the user's device type. Naturally, such information can also be inferred from the OS name, but this method might be a little more accurate.

Language

The user's language can obviously be guessed based on their location. But there's a much simpler method - just access the navigator.language property. Just keep in mind that this will return the user's preferred language code, not necessarily the location-based one.

More

The global window.navigator object provides a lot of information about the user, their browser and system. You can read the device platform with navigator.platform, the logical core count with navigator.hardwareConcurrency, get data about the browser and more. JS provides many options and possible sources for your data. With that said, everything I mentioned above will most probably be sufficient for a great number of use-cases.
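Putting several of these properties together, a tracking script could gather them into one plain object before sending. The sketch below takes the `window` object as a parameter so it's easy to try with a fake; the function and field names are assumptions of mine.

```javascript
// Collect a handful of client properties into a plain, serializable object.
// `win` is expected to look like the browser's `window`.
function collectClientInfo(win) {
  const { navigator: nav, screen, document: doc } = win;
  return {
    language: nav.language,
    languages: nav.languages ? Array.from(nav.languages) : [],
    platform: nav.platform, // deprecated in the spec, but still widely available
    cores: nav.hardwareConcurrency,
    screenWidth: screen.width,
    screenHeight: screen.height,
    referrer: doc.referrer,
  };
}
```

In a browser, `collectClientInfo(window)` gives you an object ready for `JSON.stringify`.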

The rest

Again, when your tracking code has collected enough data, you can proceed to send it to the server. The easiest way is to use AJAX with XMLHttpRequest. Your server can be written in PHP, Node.js or any other language or runtime. There, you'd listen for incoming data, analyze it, apply your own algorithms and save the output to the database. You should remember to keep your data in a structure that will help you easily maintain and access it whenever needed, e.g. for dashboard graphs.
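Here's a minimal sketch of the sending part with XMLHttpRequest. The `/collect` endpoint and the `sendEvent` name are hypothetical, and the constructor is passed in as a parameter only so that the function can be tried outside a browser.

```javascript
// POST a JSON payload to the analytics server.
// `XHR` defaults to the browser's XMLHttpRequest constructor.
function sendEvent(url, payload, XHR = XMLHttpRequest) {
  const request = new XHR();
  request.open('POST', url, true); // true = asynchronous
  request.setRequestHeader('Content-Type', 'application/json');
  request.send(JSON.stringify(payload));
  return request;
}
```

In the tracking code it could be used like `sendEvent('/collect', { visitorId, referrer: document.referrer })`. For data sent while the page is closing, `navigator.sendBeacon()` is another option worth looking at.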

Hey Google!

Knowing what kind of data can be read by an analytics tool, let's get back to the topic from the title of this post, namely the current state of the web analytics tools market. Here, without much discussion, the clear winner is Google Analytics (GA). It's a really popular analytics tool made by Google, powering the statistics of millions of websites every day. And it does it for "free". Why is it so widespread and what's the real price of "free"?

The pros

GA setup, as I mentioned earlier, is fairly simple. You just create a free account for your domain, copy-and-paste the tracking code to every page of your website and you're done. You get a fine, modern-looking dashboard with plenty of data to analyze and explore.

One of the most interesting sections of GA is the real-time one. Here you can see data about users visiting your page right now, with a delay of only a few seconds. Just like in the general section, you get information about the number of current users, their referral links, the address of the page they're viewing right now, their country of origin and some more.

You also have access to the acquisition section. Here, you get detailed graphs showing the sources of your users' visits. They can be divided into categories like direct visits (from search engines and when typing the URL), social (from social media), referral (from referring websites) and email (from marketing campaigns and newsletters). These numbers are just enough to know what you need to improve further.

The next important section is the one about users and their behavior. In the first one, you get general information about all your users in one place. Data like OS, device type, location, language, browser, & more. More interesting is the sub-section about the user's behavior. Here you can compare new vs returning users in numbers, the frequency of their visits and the time they spent on your website.

Beyond that, GA dashboard allows you to do much more. You can access even more data, however, this may not be really needed. You can create your own comparison charts and graphs and see the data in different time periods. Surely this tool is awesome, but it comes with a price.

The cons

You most likely know what we're going to talk about in this section. Privacy is a nightmare when it comes to analytics tools. What's worse, not only the website owner has the data, but Google also, and maybe even more!

But really, who can blame Google for the current state of internet privacy? Users and website managers decided to use its tools of their own free will. Nobody was forced to do it. Anyway, the inaccuracy of some statements and people who just didn't want to read everything in detail are only two of the many reasons behind today's state of web analytics and privacy.

Getting back to GA. There's a feature that you can enable through the GA dashboard to get even more data about users' age, interests and more. The question is how? How is this data obtained and where does it come from? It's not something that you can get through a standard website visit. I guess that it's connected with the Google Accounts of specific users. As Google Chrome has around 60% of the web browser market share, the number of Google Accounts can be very high too. That's one possible source of this data. Naturally, to get access to such information you have to agree to even more Google deals. I personally have used GA on some pages, and never activated these additional features. Doesn't seem too trustworthy to me. 🤨

Alternatives

So yeah, Google Analytics isn't perfect. But because it's so popular and free (at least when we're talking about money), it's really hard for a good competing product to break through. Personally, understanding the problem with GA and wanting to switch, I have searched the web for the best alternatives. Yet, I haven't switched. And the main reason for that is the price. All similar services are paid and, in some cases, cost a lot! A capable tool that doesn't require any money and simply collects data on its own through my website is a tempting offer. That's why many people still choose GA. Anyway, here are some of the best paid tools I stumbled upon.

Adobe Analytics

Adobe Analytics landing page

Statcounter

Statcounter landing page

Simple Analytics

Simple Analytics landing page

Above are just some of the paid web analytics tools that can be found out there. Naturally, there are plenty more! I haven't tested any of the above tools, like I normally do when putting together a list, so sorry for the lack of descriptions. I just won't write about stuff that I haven't used. They advertise interesting features on their pages, so you can visit them and decide for yourself.

Now, the fact that something isn't named Google Analytics doesn't mean that it has higher regard for privacy than GA. Of course, it can be even worse! Closed-source, managed tools can't be fully trusted IMHO. That's why I turned to open-source, self-hosted alternatives. Here are some of the best I found.

OWA

OWA landing page

Open Web Analytics provides a fairly good alternative to GA. But a quick look at its landing page and GitHub repo reveals one of its biggest disadvantages - it's not actively maintained. Also, its integrated dashboard looks a bit old. I hope it will get better soon, but for now, I recommend taking a look at other tools.

Matomo

Matomo landing page

Matomo (formerly known as Piwik) is a leading open-source web analytics tool. It is actively maintained and provides both self-hosted and managed (paid) solutions. It comes with a nice dashboard, has a pretty simple setup and collects just enough data for any purpose.

Countly

Countly landing page

Countly is a modern, ongoing web analytics project. It has a sleek dashboard and plugin-based architecture. Thus it can be easily extended to serve as e.g. a mobile analytics tool or campaign manager. Sadly, many plugins and functionalities are available only in the pro (paid) version (self-hosted or not). 😔

Fathom

Fathom landing page

Fathom is a simple, privacy-focused web analytics tool. Being GDPR-compliant, it collects as little data as possible, without violating users' privacy. It comes in both a free, self-hosted version and a paid, managed one. Again, if you want to collect and analyze highly detailed and specific data about your users, it might not be the tool for you. This one is for those who care about privacy - their own and their users'.

The ideal

By this point of the article, we've explored what data can be collected and a bit about GA and some other tools. So, I think it's a perfect time to think about what the perfect web analytics tool should look like.

IMHO, the most important thing when it comes to web analytics is balance. The balance between the users' privacy and the collected data. Users don't want to share too much and analysts don't want to get too little. So, how do you achieve this state? I think it's just impossible. You can't satisfy the needs of both sides at once. At least one will always be unhappy with your choice. Well, who should it be then?

Personally, I'd give the control to the creator/website manager. They're the one in charge. They should get a professional tool that allows them to retrieve only the data they need, thus protecting the users' privacy a fair bit. It could be done in many ways, for example through a system of plugins or a configurable tracking script. The possibilities are endless!

To pretty much summarize this whole post, here's a quick list of all the features my ideal web analytics tool should have:

  • Integrated dashboard (sleek UI) built with the latest web technologies (Vue or React).
  • Simple, extendable and pluggable architecture that allows adding functionalities to dashboard and tracking code. Plugins API should be easy to use and implement (obviously).
  • Tracking code, as well as the whole tool should be configurable, thus allowing the website administrator to choose and see only data he requires, starting from minimal defaults.
  • Optional notification for users integrated with a tracking code to let the users know what exact data is collected.
  • 100% open-source and self-hosted-only. Built with JS and NodeJS for easy install and setup (I'm totally biased here 😂)
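To give a feel for the "configurable tracking code with minimal defaults" point from the list above, here's a tiny sketch of how such a script could be structured. Everything in it (`DEFAULT_CONFIG`, `COLLECTORS`, `collect`) is hypothetical, just one possible shape for the idea and not an existing tool's API.

```javascript
// Minimal defaults: only the referrer is collected unless the admin opts in to more.
const DEFAULT_CONFIG = { referrer: true, language: false, screen: false };

// Each collector reads exactly one piece of data from a window-like object.
const COLLECTORS = {
  referrer: (win) => win.document.referrer,
  language: (win) => win.navigator.language,
  screen: (win) => `${win.screen.width}x${win.screen.height}`,
};

// Run only the collectors that the merged configuration enables.
function collect(win, userConfig = {}) {
  const config = { ...DEFAULT_CONFIG, ...userConfig };
  const data = {};
  for (const [name, enabled] of Object.entries(config)) {
    if (enabled && COLLECTORS[name]) data[name] = COLLECTORS[name](win);
  }
  return data;
}
```

With this shape, `collect(window)` sends only the referrer, while `collect(window, { screen: true })` adds the screen size. A plugin could then simply register another entry in `COLLECTORS`.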

That's just my vision. That's also very much doable and something I would like to realize in the near future. What do you think about this feature-set? Would something like this suit your needs? Would you like to see such a project come to real life? Let me know in the comments below! 😁

Fine enough?

After this quick article about the state of web analytics and web analytics in general, what's your opinion? What do you think about the direction this market is heading in? Write it down in the comments. Also, what's your opinion on this article? Let me know with a reaction below. Anyway, if you like this post, consider sharing it, following me on Twitter and on my Facebook page, and signing up for the newsletter below (coming very soon 😉) to keep up-to-date with the latest content from this blog. 🚀

Resources