The state of web analytics
If you have ever set up a website from scratch, or have managed one for some time, there's a good chance you know exactly what web analytics tools are and what benefits they provide. In today's article, we're going to explore the world of web analytics a bit: what this market looks like and what direction it's heading in, which tools are among the best for the job, and exactly what data can be retrieved about the user. Finally, I'll share my personal opinion and vision of what an ideal web analytics tool should look like. Let's get started!
Why web analytics?
I think that in the modern world almost everybody who has ever accessed the WWW knows what analytics means. This brilliant word, loved by data researchers and hated by users, refers to collecting, selecting and, naturally, analyzing data from the users of a given product. The practice is so popular that it can easily be observed in everyday life. But what does it mean for the vast community of web developers, and why should we care?
First, let's take a look at all this from the user's point of view. You most likely wouldn't be pleasantly surprised to learn that someone is constantly monitoring your activity and saving data about you. Of course, you can opt out by not accepting cookies or by simply leaving the website. Sadly, some sites make cookies a requirement, thus forcing you to accept them. From that moment on, unless you're truly concerned about your privacy, you most likely forget about the whole thing and continue browsing the web. As for the collected data - we'll talk about that later.
As a web developer, chances are high that you'll end up on the other side of the barrier. Whether you manage a company website, a portfolio, a blog or even a dedicated site for your OSS project, you'll most likely turn to web analytics quickly. Setting the whole thing up could hardly be easier - with Google Analytics, the most popular option on the market, you just create a free account for your website and copy-and-paste a snippet of tracking code. Then you look at your data and celebrate any growth in the number of visitors. Beyond that, you can see how well different pages do, where your visitors generally come from and a lot more… All this data allows you to improve your website, to a small or a really big extent. But in reality (I can't personally prove this statement), when it comes to personal websites and the numbers go into the millions, many people stop looking at them so often. I don't know how it is with data analytics companies.
How does it work?
Before we go any further, I would like to do a quick round-up of how web analytics tools work under the hood. I won't be doing a detailed tutorial here (BTW, let me know in the comments below if you would like to see one), just a small overview. We'll mainly look at the client side of things, as that's what interests us the most - what data can be obtained and how easily.
General background
Web analytics tools, like many others, are composed of several specific parts. Here we can decouple our software into tracking code (client), server code (backend) and dashboard. Only when all these parts are combined do they deliver a complete experience.
Tracking code (also known as the snippet that many just copy-and-paste and don't really care about) is probably one of the most important parts of web analytics software. Its job is to collect, store and send data about the user to the server. While collecting the data is done with different JS methods, the storage part is where the infamous cookies come in. 🍪 To know which data is connected with which user, a unique ID is assigned to each new visitor. It's then stored with the help of cookies and attached whenever data is sent to the server, which makes it easier to compare different users' sessions.
When data arrives at the server, it needs to be processed. Invoking different methods, comparing with older sessions, calculating derived data etc. - it all happens on the server. For example, we can retrieve data about the user's screen size. Then, on the server, we can estimate whether they're using a TV, PC, tablet or phone to access your website. Of course, we cannot forget about some kind of storage for our data - a database. It's also important to keep this data well-organized and secure.
Finally, together with the server, you'd most likely want to have a nice-looking dashboard. The better your data is presented, the more information you can get from it, and the more pleasing to the eye the better. This part isn't strictly mandatory, but in practice it's often expected. It's not really that hard to do and the benefits can be staggering!
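To make the ID-plus-cookie idea a bit more concrete, here's a minimal sketch of how a tracking snippet might do it. The cookie name `visitor_id`, the one-year lifetime and all function names are my own assumptions for illustration, not something taken from any real tool.

```javascript
// Hypothetical cookie name; any unique name would do.
const VISITOR_COOKIE = 'visitor_id';

// Reads a single cookie value out of a "name=value; name2=value2" string.
function readCookie(cookieString, name) {
  for (const part of cookieString.split(';')) {
    const [key, ...rest] = part.trim().split('=');
    if (key === name) return decodeURIComponent(rest.join('='));
  }
  return null;
}

// Builds a reasonably unique ID; a real tool might prefer crypto.randomUUID().
function generateVisitorId() {
  return Date.now().toString(36) + '-' + Math.random().toString(36).slice(2, 10);
}

// Returns the existing ID, or a new one plus the cookie string that stores it.
function getOrCreateVisitorId(cookieString) {
  const existing = readCookie(cookieString, VISITOR_COOKIE);
  if (existing) return { id: existing, setCookie: null };
  const id = generateVisitorId();
  const oneYear = 60 * 60 * 24 * 365; // assumed lifetime, in seconds
  const setCookie =
    `${VISITOR_COOKIE}=${encodeURIComponent(id)}; max-age=${oneYear}; path=/; SameSite=Lax`;
  return { id, setCookie };
}

// In the browser this would be used roughly like so:
// const { id, setCookie } = getOrCreateVisitorId(document.cookie);
// if (setCookie) document.cookie = setCookie;
```

Passing the cookie string in as an argument, instead of reading `document.cookie` inside the function, keeps the logic easy to test outside of a browser.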
Data
With this quick overview in mind, I think it's a good idea to check out what information can be collected from the user with built-in JS methods, and how easily.
User sessions
You can count the number of times a given user has visited your site with nothing more than client-server interaction and the user IDs I talked about earlier.
Timing
You would most likely like to know when your page gets the most visits. That's why you may need to keep track of the user's timing. Knowing details like how long users stay on your page and at what time they visit it gives you a great advantage. For this purpose, you can use the JS [Date](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date) object. This way you can retrieve the time the user enters and exits your website. To keep the time across different pages of your site, you can save it with [localStorage](https://developer.mozilla.org/en-US/docs/Web/API/Window/localStorage) paired with the [onload](https://developer.mozilla.org/en-US/docs/Web/API/GlobalEventHandlers/onload) and [onunload](https://developer.mozilla.org/en-US/docs/Web/API/WindowEventHandlers/onunload) events. Also, remember that the value returned by `.getTime()` is the number of milliseconds since the Unix epoch, counted in UTC, so you'll have to convert it to your time format of choice if needed. For example, if you'd like to have this time in the user's local time for whatever reason, you'd have to use the `.getTimezoneOffset()` method, which returns the difference from UTC in minutes, and apply the retrieved value.
const date = new Date();
localStorage.setItem('startTime', date.getTime());
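Building on the snippet above, here's a small sketch of how the saved start time could be turned into a visit duration, and how the timezone offset could be applied. Both helper names are made up for this example.

```javascript
// Visit duration in whole seconds between two epoch-millisecond timestamps.
function visitDurationSeconds(startMs, endMs) {
  return Math.max(0, Math.round((endMs - startMs) / 1000));
}

// Shifts a UTC timestamp so that its UTC fields read as the user's wall-clock time.
// offsetMinutes is what Date.prototype.getTimezoneOffset() returns (UTC minus local),
// so it is negative for time zones ahead of UTC.
function toUserWallClock(utcMs, offsetMinutes) {
  return new Date(utcMs - offsetMinutes * 60 * 1000);
}

// In the browser this would be used roughly like so:
// const start = Number(localStorage.getItem('startTime'));
// const seconds = visitDurationSeconds(start, Date.now());
// const local = toUserWallClock(start, new Date().getTimezoneOffset());
```

Note that `localStorage` stores strings, hence the `Number(...)` conversion when reading the value back.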
Location
If you'd like to know where your users mostly come from, you have to collect their location. Depending on your needs, a simple IP-to-location service might be just enough. It lets you know the user's country of origin. If you'd like to get more accurate info, you'll need to use the geolocation API. This will give you the user's precise latitude and longitude coordinates. Keep in mind that such sensitive data requires explicit permission from the user.
navigator.geolocation.getCurrentPosition(
  // Runs only after the user grants permission.
  ({ coords }) => console.log(coords.latitude, coords.longitude),
  // Runs when the user denies the request or the position is unavailable.
  (error) => console.warn(error.message)
);
Referrer
Whether you share and popularize your website on social media or not, you might want to know what other websites link to your page and from what source you get the most views. It's really easy to access such a value. You only need to read the value of `document.referrer`.
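The raw referrer is a full URL, so in practice you'd probably want to reduce it to a hostname before counting sources. Here's a minimal sketch of that; the function names are hypothetical.

```javascript
// Returns the referring hostname, or null for direct visits and unparsable values.
function referrerHost(referrer) {
  if (!referrer) return null;
  try {
    return new URL(referrer).hostname;
  } catch {
    return null;
  }
}

// Treats a referrer on the site's own host as internal navigation, not a source.
function isExternalReferrer(referrer, ownHost) {
  const host = referrerHost(referrer);
  return host !== null && host !== ownHost;
}

// In the browser this would be used roughly like so:
// isExternalReferrer(document.referrer, location.hostname);
```

An empty `document.referrer` usually means a direct visit, but it can also mean that the referring page asked the browser not to send one.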
User agent
If you have been into JS programming for a while, you most likely know the infamous user agent string. With proper parsing, it can give you information about the user's browser, OS, architecture, and sometimes even the device's name. Of course, in its raw form `navigator.userAgent` is just a long string that is meaningless without the required knowledge, and the data needs to be extracted from it. There are quite a few JS libraries solely for this purpose.
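To show why parsing is needed at all, here's a deliberately naive sketch that pulls a browser and OS name out of the string with a few regular expressions. It's only an illustration under my own assumptions (for example, it will report Chrome on iOS as Safari), so for real use one of the dedicated libraries is a much safer bet.

```javascript
// Naive user agent parsing; only covers a handful of common cases.
function parseUserAgent(ua) {
  let browser = 'Unknown';
  // Order matters: Edge and Opera also contain "Chrome", and Chrome contains "Safari".
  if (/Edg\//.test(ua)) browser = 'Edge';
  else if (/OPR\//.test(ua)) browser = 'Opera';
  else if (/Firefox\//.test(ua)) browser = 'Firefox';
  else if (/Chrome\//.test(ua)) browser = 'Chrome';
  else if (/Safari\//.test(ua)) browser = 'Safari';

  let os = 'Unknown';
  // Android strings contain "Linux" and iOS strings contain "Mac OS X", so test those first.
  if (/Android/.test(ua)) os = 'Android';
  else if (/iPhone|iPad|iPod/.test(ua)) os = 'iOS';
  else if (/Windows/.test(ua)) os = 'Windows';
  else if (/Mac OS X/.test(ua)) os = 'macOS';
  else if (/Linux/.test(ua)) os = 'Linux';

  return { browser, os };
}

// In the browser this would be used roughly like so:
// parseUserAgent(navigator.userAgent);
```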
Screen size
Getting the user's device screen size is quite easy with JS. It can be used to estimate the user's device type. Naturally, such information can also be derived from the OS name, but this method might be a little more accurate.
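Here's a rough sketch of what could be sent to the server for this. The size buckets and their breakpoints are arbitrary assumptions of mine, and I intentionally call them size classes rather than device types, because real phones, tablets and laptops overlap quite a lot.

```javascript
// Summarizes a screen as an orientation plus a coarse size class.
// Breakpoints (in CSS pixels) are illustrative assumptions, not a standard.
function describeScreen(width, height) {
  const shortSide = Math.min(width, height);
  let sizeClass = 'large';
  if (shortSide < 600) sizeClass = 'small';
  else if (shortSide < 1000) sizeClass = 'medium';
  return {
    width,
    height,
    orientation: width >= height ? 'landscape' : 'portrait',
    sizeClass,
  };
}

// In the browser this would be used roughly like so:
// describeScreen(window.screen.width, window.screen.height);
```

How these classes map to "phone", "tablet" or "TV" is a judgment call best left to the server, where it can be combined with other data such as the user agent.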
Language
The user's language can obviously be guessed based on their location. But there's a much simpler method - just access the `navigator.language` property. Just keep in mind that this returns the code of the user's preferred language, not necessarily the location-based one.
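The returned code (e.g. `en-US`) combines a language with an optional region, and you may want to store them separately. A small sketch using the built-in `Intl.Locale`; the function name is hypothetical.

```javascript
// Splits a language tag such as "en-US" into its language and region parts.
function parseLanguageTag(tag) {
  const locale = new Intl.Locale(tag);
  return { language: locale.language, region: locale.region || null };
}

// In the browser this would be used roughly like so:
// parseLanguageTag(navigator.language);
```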
More
The global `window.navigator` object provides a lot of information about the user, their browser and system. You can read the device platform with `navigator.platform`, the device core count with `navigator.hardwareConcurrency`, get data about the browser and more. JS provides many options and possible sources for your data. With that said, everything I mentioned above will most probably be sufficient for a great number of use-cases.
The rest
Again, when your tracking code has collected enough data, you can proceed to send it to the server. The easiest way is to use AJAX and XMLHttpRequest. Your server can be written in PHP, Node.js or any other server-side technology. There, you'd listen for incoming data, analyze it, apply your own algorithms and save the output to the database. You should remember to keep your data in a structure that makes it easy to maintain and access whenever needed, e.g. for dashboard graphs.
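The sending part could look more or less like the sketch below. The `/collect` endpoint, the payload shape and both function names are assumptions made up for this example, so adjust them to whatever your server expects.

```javascript
// Hypothetical endpoint; replace with wherever your server listens.
const COLLECT_URL = '/collect';

// Bundles the collected values into the JSON body sent to the server.
function buildPayload(visitorId, data) {
  return JSON.stringify({ visitorId, sentAt: new Date().toISOString(), ...data });
}

// Sends the payload with XMLHttpRequest; resolves with the HTTP status code.
function sendPayload(body, url = COLLECT_URL) {
  return new Promise((resolve, reject) => {
    const xhr = new XMLHttpRequest();
    xhr.open('POST', url);
    xhr.setRequestHeader('Content-Type', 'application/json');
    xhr.onload = () => resolve(xhr.status);
    xhr.onerror = () => reject(new Error('Network error while sending analytics'));
    xhr.send(body);
  });
}

// In the browser this would be used roughly like so:
// sendPayload(buildPayload(id, { referrer: document.referrer, language: navigator.language }));
```

On the server side, the matching handler would parse this JSON, look the `visitorId` up in the database and store the new data next to the visitor's earlier sessions.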
Hey Google!
Knowing what kind of data an analytics tool can read, let's get back to the topic in the title of this post, namely the current state of the web analytics tools market. Here, without much discussion, the clear winner is Google Analytics (GA). It's a really popular analytics tool made by Google, powering the statistics of millions of websites every day. And it does it for "free". Why is it so widespread and what's the real price of "free"?
The pros
GA setup, as I mentioned earlier, is fairly simple. You just create a free account for your domain, copy-and-paste the tracking code into every page of your website and you're done. You get a fine, modern-looking dashboard with plenty of data to analyze and explore.
One of the most interesting sections of GA is the real-time one. Here you can see data about users visiting your page right now, with only a few seconds of delay. Just like in the general section, you get information about the number of current users, their referral links, the address of the page they're viewing right now, their country of origin and some more.
You also have access to the acquisition section. Here, you get detailed graphs about the sources of your users' visits. They're divided into categories like organic search (from search engines), direct (typing the URL or using a bookmark), social (from social media), referral (from referring websites) and email (from marketing campaigns and newsletters). These numbers are just enough to know what you need to improve further.
The next important sections are the ones about users and their behavior. In the first one, you get general information about all your users in one place: data like OS, device type, location, language, browser and more. More interesting is the section about the users' behavior. Here you can compare new vs returning users in numbers, the frequency of their visits and the time they spent on your website.
Beyond that, the GA dashboard allows you to do much more. You can access even more data, although you may not really need it. You can create your own comparison charts and graphs and see the data in different time periods. Surely this tool is awesome, but it comes with a price.
The cons
You most likely know what we're going to talk about in this section. Privacy is a nightmare when it comes to analytics tools. What's worse, it's not only the website owner who has the data, but Google too, and maybe even more parties!
But really, who can blame Google for the current state of internet privacy? Users and website managers decided to use its tools of their own free will. Nobody was forced to do it. Anyway, inaccurate wording in some statements and people who just didn't want to read everything in detail are only two of the many reasons behind today's state of web analytics and privacy.
Getting back to GA. There's a feature that you can enable through the GA dashboard to get even more data about the users' age, interests and more. The question is how? Where does this data come from? It's not something that you can get through a standard website visit. I guess that it's connected with the Google Accounts of specific users. As Google Chrome has around 60% of the web browser market share, the number of Google Accounts can be very high too. That's one possible source of this data. Naturally, to get access to such information you have to agree to even more Google terms. I have personally used GA on some pages, and never activated these additional features. Doesn't seem too trustworthy to me. 🤨
Alternatives
So yeah, Google Analytics isn't perfect. But because it's so popular and free (at least when we're talking about money), it's really hard for a good competing product to break through. Personally, understanding the problem with GA and wanting to switch, I have searched the web for the best alternatives. Yet, I haven't switched. And the main reason for that is the price. All similar services are paid and, in some cases, cost a lot! A capable tool that doesn't require any money and collects data through my website on its own is a tempting offer. That's why many people still choose GA. Anyway, here are some of the best paid tools I stumbled upon.
Adobe Analytics
Statcounter
Simple Analytics
Above are just some of the paid web analytics tools that can be found out there. Naturally, there are plenty more! I haven't tested any of the above tools, like I normally do when putting together a list, so sorry for the lack of descriptions. I just won't write about stuff that I haven't used. They advertise interesting features on their pages, so you can visit them and decide for yourself.
Now, the fact that something isn't named Google Analytics doesn't mean that it has more regard for privacy than GA. Of course, it can even be worse! Closed-source, managed tools can't be fully trusted IMHO. That's why I turned to open-source, self-hosted alternatives. Here are some of the best I found.
OWA
Open Web Analytics provides a fairly good alternative to GA. But a quick look at its landing page and GitHub repo reveals one of its biggest disadvantages - it's not actively maintained. Also, its integrated dashboard looks a bit dated. I hope it will get better soon, but for now I recommend taking a look at other tools.
Matomo
Matomo (formerly known as Piwik) is a leading open-source web analytics tool. It is actively maintained and provides both self-hosted and managed (paid) solutions. It comes with a nice dashboard, has a pretty simple setup and collects just enough data for any purpose.
Countly
Countly is a modern, actively developed web analytics project. It has a sleek dashboard and a plugin-based architecture. Thus it can be easily extended to serve as e.g. a mobile analytics tool or a campaign manager. Sadly, many plugins and functionalities are available only in the pro (paid) version (self-hosted or not).
Fathom
Fathom is a simple, privacy-focused web analytics tool. Being GDPR-compliant, it collects as little data as possible, without violating users' privacy. It comes both as a free, self-hosted version and as a paid, managed one. Again, if you want to collect and analyze highly detailed and specific data about your users, it might not be the tool for you. This one is for those who care about privacy - their own and their users'.
The ideal
By this point in the article, we've explored what data can be collected and learned a bit about GA and some other tools. So, I think it's a perfect time to think about what the perfect web analytics tool should look like.
IMHO, the most important thing when it comes to web analytics is balance. The balance between the users' privacy and the collected data. Users don't want to share too much and analysts don't want to get too little. So, how do we achieve this state? I think it's just impossible. You can't satisfy the needs of both sides at once. At least one will always be unhappy with your choice. Well, who should it be then?
Personally, I'd give the control to the creator/website manager. He's the one in charge. He should get a professional tool that allows him to retrieve only the data he needs, thus protecting the users' privacy a fair bit. It could be done in many ways, e.g. through a system of plugins or a configurable tracking script. Possibilities are endless!
To pretty much summarize this whole post, here's a quick list of all the features my ideal web analytics tool should have:
- Integrated dashboard (sleek UI) built with the latest web technologies (Vue or React).
- Simple, extendable and pluggable architecture that allows adding functionalities to dashboard and tracking code. Plugins API should be easy to use and implement (obviously).
- Tracking code, as well as the whole tool should be configurable, thus allowing the website administrator to choose and see only data he requires, starting from minimal defaults.
- Optional notification for users integrated with a tracking code to let the users know what exact data is collected.
- 100% open-source and self-hosted-only. Built with JS and Node.js for easy install and setup (I'm totally biased here).
That's just my vision. It's also very much doable and something I would like to realize in the near future. What do you think about this feature set? Would something like this suit your needs? Would you like to see such a project come to life? Let me know in the comments below!
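Just to illustrate the "configurable, pluggable, minimal defaults" points from the list above, here's a rough sketch of how such a tracking script could be structured. Everything in it (the collector names, the config shape, `createTracker`) is hypothetical and only meant to show the idea, not a finished design.

```javascript
// Every collector is a small function returning one piece of data.
// `env` stands in for the browser globals (window), which keeps this testable.
const collectors = {
  referrer: (env) => env.document.referrer || null,
  language: (env) => env.navigator.language,
  screen: (env) => ({ width: env.screen.width, height: env.screen.height }),
  userAgent: (env) => env.navigator.userAgent,
};

// Minimal defaults: nothing beyond the referrer unless the site owner opts in.
const defaultConfig = { referrer: true, language: false, screen: false, userAgent: false };

// userConfig switches collectors on or off; extraCollectors act as plugins.
function createTracker(userConfig = {}, extraCollectors = {}) {
  const config = { ...defaultConfig, ...userConfig };
  const available = { ...collectors, ...extraCollectors };
  return {
    // Lists what is collected, e.g. for an optional notice shown to users.
    enabledFields: () => Object.keys(available).filter((name) => config[name]),
    // Runs only the enabled collectors and returns their results.
    collect(env) {
      const result = {};
      for (const name of Object.keys(available)) {
        if (config[name]) result[name] = available[name](env);
      }
      return result;
    },
  };
}

// In the browser this would be used roughly like so:
// const tracker = createTracker({ language: true });
// const data = tracker.collect(window);
```

The same `enabledFields()` list could feed the optional user notification, so users see exactly which data is being collected.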
Fine enough?
After this quick article about the state of web analytics and web analytics in general, what's your opinion? What do you think about the direction this market is heading in? Write it down in the comments. Also, what's your opinion on this article? Let me know with a reaction below. Anyway, if you like this post, consider sharing it, following me on Twitter and on my Facebook page, and signing up for the newsletter below (coming very soon) to keep up-to-date with the latest content from this blog.