Build a GA4 Roll-up Property Using Server-side Google Tag Manager

Markus Baersch
Aug 4, 2022


Since GA4 has no data views and an incoming data stream cannot feed multiple GA4 properties, many websites face the question of how to tag correctly. If, for example, several languages or countries, or different site sections on individual subdomains or in folders, were previously analyzed in separate Universal Analytics views under a common property: how should this be mapped in GA4? This article documents the results of an investigation for a larger, multi-part website. The question is whether and how data from different individual properties can be combined in a common property via a server-side GTM for GA4.

One or multiple properties?

One property for everything means a lot of filter work in GA4 reports, or different reporting UIs via library / menu customization for different parts and tasks. It does not solve the problem that several time zones are lumped together on international websites, but that is not a counterargument here, because the same was true in UA. If you have already operated a parallel UA property there, this would theoretically also be possible with GA4.

Multiple properties allow individual settings, access control, time zones, etc. That’s great, but without tagging another property in the browser at the same time, you won’t get a view in which all the data can be found. Using the example of a website that consists of several subdomains, which are all to be measured in their own individual property, but are also to serve a “roll-up” / collective property, the whole thing looks something like this in simplified form for Universal Analytics:

Client-only dual tagging option for Universal Analytics

This setup could just as easily serve a website with multiple folders for different purposes, audiences, or languages. No matter why there are several properties, or whether dual tagging or a customTask is used to duplicate everything: all data that accumulates in the individual properties for the parts of the website also ends up in the collective property as a “roll-up”, which is supplied in parallel directly from the browser.

Data blending in Google Data Studio, evaluations based on BigQuery data or switching to GA360 (where there are sub-properties and roll-up properties) are the only ways out if you want to avoid double tagging in the browser. With one exception:

Options when using your own endpoint

If a server-side GTM is used as the tracking endpoint (or a comparable “proxy” between browser and tracking service), the rules of the game are different. With Universal Analytics it was / is absolutely possible to forward hits for different properties to the respective recipient (the “UA single property”) as well as to send all hits of all parts of the website / domain to another collective / roll-up property. Sessions managed on the receiver side provided a useful and consistent overall view, as long as you could come to terms with the limits, sampling etc.

Stream Consolidation for Universal Analytics

As long as the recipient — in this case Universal Analytics — takes care of managing sessions for individual visitors, their status, number of visits, determination of goals achieved, etc., this works fine.

Key question: Is all that possible with GA4?

With GA4, however, the cards are reshuffled again and the above setup fails due to the fact that sessions are managed in the browser and so each property has a different status. The client ID is shared, but the session ID and numbering of the session are under the control of the browser and are different for each property. The same applies to the information as to whether a session is starting or whether a user is new or not. Can the above construct also work for GA4, although control of the session and the other above attributes for visits and visitors live in the browser — i.e. the sender — and not the recipient of the hits?

Stream consolidation for GA4 with ssGTM: Possible?

I tried different options to find out exactly that.

Restrictions for “simple” duplication of GA4 datastreams at the endpoint

Combining several GA4 data streams into a new (valid) stream at the ssGTM endpoint is not trivial for the reasons mentioned above.

The following aspects pose problems:

  • Session ID: It changes with the property when navigating from one part of the website to another. For example, if you switch between blog and shop with separate properties for measurement.
  • Session number: The number is not consistent either if you throw all GA4 hits into one pool. Switching from blog to shop as described above creates a new session from the point of view of the shop property with a (usually) different number, because visitors rarely visit all parts of a website every time.
  • Session Start and First Visit marker: Whenever a session begins, a marker is sent with the hit (_ss=1), which causes a session_start event in GA4. In addition, if the session number is 1, there is a marker for the first visit (_fv=1), which results in a first_visit event and ensures that there is something like a “First User Source” etc. in GA4.

As a result, a “collective property” that receives all unfiltered hits from different GA4 data streams without further adjustments ends up with a lot of sessions, “new” visitors and associated events that are superfluous from the point of view of an overall property.

GA4 example: returning visitor

This is what it looks like using the example of a returning visitor who has been to the www property several times and once to the shop.

  • If the renewed visit to the www property begins, a session and a start arise. The session ID is unique and the visit number is 3 because there have been previous visits to the property. Further events can now follow within the property
  • One click leads to the blog (where no visit has taken place so far). A new session is created there from the perspective of the blog property and a session_start and first_visit are reported — also to the roll-up property. From their point of view, the session ID and number have also changed
  • The same thing happens when you switch to the shop, but there is no first_visit because there was already a visit there from a previous session
  • From the shop back to www, the session ID and number now changes again from the point of view of the roll-up property. Since a www session already exists, there are no new session_start events
How a roll-up property “sees” consolidated streams

In this way, inconsistent session information arrives in the roll-up property and, in addition, unwanted events may occur when changing properties. So while the number of page views should match and transactions should end up correctly in the roll-up, the “cross-domain” attribution of conversions to visits and their sources is usually disturbed.
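The returning-visitor example can be replayed in a few lines. The following Python sketch only models the data; the session IDs, numbers and the `Hit` structure are made up for illustration and do not come from a real GA4 request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    prop: str            # individual property that produced the hit
    session_id: str      # session ID as managed in the browser for that property
    session_number: int  # session count for that property
    event: str
    ss: bool = False     # session start marker (_ss=1)
    fv: bool = False     # first visit marker (_fv=1)


# Hypothetical returning visitor: www -> blog -> shop -> back to www
hits = [
    Hit("www", "1001", 3, "page_view", ss=True),
    Hit("blog", "1002", 1, "page_view", ss=True, fv=True),
    Hit("shop", "1003", 2, "page_view", ss=True),
    Hit("www", "1001", 3, "page_view"),
]


def rollup_view(hits):
    """Summarise what an unadjusted roll-up property receives."""
    return {
        "distinct_session_ids": len({h.session_id for h in hits}),
        "session_start_events": sum(h.ss for h in hits),
        "first_visit_events": sum(h.fv for h in hits),
    }


summary = rollup_view(hits)
print(summary)
```

One visit across three site sections shows up as three session IDs and three session starts, plus a first visit for a visitor who is not new to the site as a whole.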

Solutions for GA4

I tried and tinkered with a few things to make the overall view as meaningful as possible. The above inconsistencies in the session information and unnecessary restarts are problems that can be dealt with in different ways using the standard tools of the client- and server-side Google Tag Manager.

  1. Sending all hits using a standard tag for GA4 with overridden Measurement ID to consolidate all hits in one roll-up property. Suitable for mainly separate hosts or domains
  2. Like 1), but with special handling of changes between the properties and new sessions that arise as a result. For useful results, a separate, comprehensive session should also be managed. This can optionally be controlled in a client-side GTM or from server-side
  3. Using the GA4 Measurement Protocol
  4. A separate client in ssGTM that sends all hits to the roll-up property (adjusted for further session starts and its own session control as with option 2) and then passes it on to the normal tags
  5. Duplication of hits in the client by monkey patching the send function in the browser. This option is not dealt with here and the result is comparable to option 1. It is only mentioned here for the sake of completeness.

Expect different results when implementing options 1 through 4. Not every setup needs the “preferred” option 2. The following sections describe the implementation and the resulting data in a roll-up property.

Option 1: Standard GA4 tag to consolidate all hits into one property

A simple tag that fires on all GA4 hits regardless of the Measurement ID and overwrites it with the ID of the collective property can only be used with the restrictions above. Overriding individual fields does not help with the fields that matter here: the information about the session start or first visit when changing data streams sits in “System Properties” under the key x-ga-system_properties in the event model, and overwriting it doesn’t seem to change anything in the outgoing hit. I first tried removing the parameters from the original object, stored in a variable, using a custom template. However, this does not work: the outgoing hits still contain _ss=1 and, where applicable, _fv=1. Overwriting the information with an “Undefined Value” variable has the same effect, as do the parameter editing functions that the tag brings with it. This method is therefore not very suitable, because it can lead to an inflated number of sessions.

So is this option generally unsuitable? I do not think so. In cases where e.g. different countries or otherwise separate target groups are reached on separate domains, which are to be measured in individual properties, this option is quite useful with some limitations. As long as the number of “property changers” (both within a session and in general) is kept within limits, the results match quite well what one would get from the sum of the individual properties.

Another advantage of this solution is that it can be used in a “multi-domain setup”, in which not only several hosts but completely different domains are measured. However, consolidating at your own endpoint does not solve any cross-domain problems. If this is not mapped in the client by true cross domain tracking, each domain will have its own cookies and there will be no overall ID. As I said: This doesn’t have to be tragic — depending on the reason why there are several domains.

Option 2 — Step A: Standard GA4 tag with special handling on property change

In order to somehow avoid the occurrence of further session_start and possibly first_visit events, a different approach can be followed.

In order for all hits to arrive consistently in GA4, the session ID and session number must be retained. It is also about preventing the creation of new sessions in GA4 and the attribution of conversions over an unbroken overall session in a “multi-stream roll-up property”. For this purpose, a special treatment must take place every time a new session (from the point of view of the individual properties with whose measurement IDs the respective site is tagged) arises when the domain changes. For this purpose, an independent session ID and number is generated and used throughout all events. When switching between the individual parts, certain events are skipped. Based on the example above, the different handling for the roll-up property looks like this:

Independent roll-up session for consistent session data

In this way, both the individual properties and the roll-up variant would get exactly what is needed. Sounds good? Then let’s start with the second problem:

Detect property changes

To recognize whether a session already exists, a marker such as a session cookie can be used to see whether the first entry has already been measured, including the session start and, if applicable, the first visit. If it has, a further session start must come from another property. In the very simplest case, you just prevent this hit with a blocking trigger (recognizable from the above-mentioned system properties and a separate cookie) and skip the whole event. However, we solve it differently in this example and close the gap.

If you create a session cookie with any value, you can use it in a trigger to determine whether a session start has already been measured or not. To do this, you need a tag that fires whenever there is no marker cookie yet, but a GA4 request with session start features comes in. You can use e.g. the “Cookie Monster Tag Template” by Simo Ahava. A timestamp as a value works just as well here as any constant, a “1”, the client ID from the GA4 cookie or whatever.

Set your own “session start already measured” marker with a cookie

If you access this cookie value in a standard cookie variable, it can then be used a) for the firing trigger of the Cookie Monster tag (fire if there is no value) and b) the blocking trigger for the GA4 collective property tag.

Blocking trigger for session_start events when property changes

Here, the marker for the session start in incoming events (see last condition) is obtained by reading the “System Properties” (see above) as an event parameter via a standard variable, which serves as the input for an extraction variable that pulls the value for “ss” out of the JSON object.
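What the extraction variable does can be sketched as a small function. This is Python for illustration only, not GTM template code. It assumes the system properties arrive under `x-ga-system_properties` either as an object or as a JSON string with the keys `ss` and `fv`; the helper name `has_marker` is hypothetical.

```python
import json


def has_marker(event_data: dict, key: str) -> bool:
    """Return True if the system properties carry the marker (e.g. "ss") with value 1."""
    props = event_data.get("x-ga-system_properties") or {}
    if isinstance(props, str):
        # Assumption: a string value holds the same object serialised as JSON
        props = json.loads(props)
    return str(props.get(key, "")) == "1"


session_start_hit = {"event_name": "page_view", "x-ga-system_properties": {"ss": "1"}}
regular_hit = {"event_name": "scroll"}

print(has_marker(session_start_hit, "ss"))  # session start marker present
print(has_marker(regular_hit, "ss"))        # no system properties at all
```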

Reading the session start marker from incoming events

If you stop at this point with the special treatment, you will actually already see fewer sessions in the roll-up property, so you could already book this as a partial success. But it’s ugly because events are missing that way (whenever the domain is changed). Also, if you need your own cookies anyway, you can go all the way and gain complete control over the roll-up session.

Option 2 — Step B: Session control with own cookies

In order not to send the possibly inconsistent session IDs and numbers to the roll-up property when visitors switch between the individual properties, an independent session for the roll-up property must be established. This can be managed in a number of ways, e.g. in a common cookie or in separate cookies, managed server-side or in the client. In the client, there are alternatives such as localStorage. On the server you could work with fingerprints and persistence in Firestore… let’s just say, there are choices.

Since cookies (and consent) are required for GA4 anyway, separate cookies set on the client side are sufficient. Managing the roll-up session in the client has the advantage that all hits already carry it in the request when they arrive at the endpoint. If it is created at the endpoint instead, this not only means some effort in the form of variables and triggers, but you are also confronted with batching problems (see below: “Alternative: create a session in ssGTM”). I tried it anyway and was satisfied with the result, but the client-side session is equally suitable for the roll-up property. And the individual properties also work with browser-generated cookies, so why bother?

Session helper tag for GA4 in browser

The recipe for the roll-up session is relatively straightforward: All containers of the individual properties involved (if there are several) require an HTML tag that takes care of the cookies via JavaScript.

It is first determined whether there is already a cookie for the session ID. If not, it is created, the cookie value is read out with the session number (if it exists), the session number is incremented and saved again in a cookie. Here is the code for a corresponding solution that does not require an additional cookie variable for the session number, but instead reads the cookie value directly in the code for the sake of simplicity (just ignore the German comments ;)):

Quick but working roll-up session handling in client-side GTM

So there are two cookies involved: a session ID in a session cookie and a session number in a persistent cookie. As mentioned above, this is not the only solution, but it works and makes using the ssGTM endpoint easier. This saves you the trouble of separating the ID and number in a single (then persistent) cookie and manually renewing “too old” sessions.
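The two-cookie recipe boils down to very little logic. The sketch below models it in Python with a plain dict standing in for the browser's cookie jar; in the browser this would be JavaScript in an HTML tag. The cookie names `rollup_sid` and `rollup_sn` and the timestamp-based ID are assumptions for illustration.

```python
import time
from typing import Optional

SESSION_ID_COOKIE = "rollup_sid"     # hypothetical name, session cookie
SESSION_NUMBER_COOKIE = "rollup_sn"  # hypothetical name, persistent cookie


def ensure_rollup_session(cookies: dict, now: Optional[float] = None) -> dict:
    """Return a copy of the cookie jar with roll-up session ID and number set."""
    updated = dict(cookies)
    if updated.get(SESSION_ID_COOKIE):
        return updated  # a roll-up session is already running
    now = time.time() if now is None else now
    # New session: a timestamp serves as session ID (assumption)
    updated[SESSION_ID_COOKIE] = str(int(now))
    # Read the previous session number (if any), increment it and store it again
    previous = int(updated.get(SESSION_NUMBER_COOKIE) or 0)
    updated[SESSION_NUMBER_COOKIE] = str(previous + 1)
    return updated


first = ensure_rollup_session({}, now=1_700_000_000)
same_session = ensure_rollup_session(first, now=1_700_000_600)
# Browser restart: the session cookie is gone, the persistent one survives
after_restart = ensure_rollup_session({SESSION_NUMBER_COOKIE: first[SESSION_NUMBER_COOKIE]},
                                      now=1_700_090_000)
print(first, same_session, after_restart)
```

Because the ID lives in a session cookie, a "too old" session simply disappears with the browser session and does not need to be expired manually.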

The HTML tag is used as a setup tag of the GA4 configuration tag, so that before any hits get sent, it has a fair chance of recognizing and initializing a roll-up session that may not yet exist. The cookie values then travel to the tracking endpoint along with the hits.

Alternative: create a session in ssGTM

If you want to save yourself the work of putting the above tag in all containers for the individual properties, a helper tag on the server can take care of the whole thing with a little extra work. In this case, the session number and ID must first be determined on the server at the time the first hit arrives, used for that event and then sent back in the response (via Cookie Monster). This requires several variables based on (existing) custom templates such as “Number & String Operations” and “Timestamp”, which generate a “Next Session Number” and a “New Session ID” for the start of a new roll-up session. They are used by a special tag that fires only for the “first hit”. All subsequent hits then use the ID and number from the cookies previously set on the server in another tag.

In addition to the advantage that no adjustments are required in the client in this case, it is also possible to work completely with “server-hardened” cookies (or even httpOnly). There is some risk of a “chicken and egg” situation (which I haven’t seen in testing) from batching. So, in theory, this is the more robust option… while at least one ITP vulnerability can also be mitigated by simply renewing the client-originated session number cookie.

Setup on tracking server

In ssGTM, incoming session information is received in cookie variables. For example the session number.

Access roll-up session data in ssGTM

Then also create a cookie variable for the session ID. Both are then used to override session ID and number (ga_session_number and ga_session_id) in a GA4 tag that fires on all requests arriving in GA4 format:

Override session data in “roll-up” GA4 tag

If this tag is blocked for all domain changes (see above), the state of option 2, step A is reached — but with coherent session ID and numbering.
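Conceptually, the roll-up tag takes the incoming event, swaps the destination and replaces the two session fields with the cookie values. A minimal Python sketch of that idea, assuming the hypothetical cookie names `rollup_sid` and `rollup_sn` and a simplified result structure (the real tag is configured in the ssGTM UI, not coded like this):

```python
def build_rollup_hit(event_data: dict, cookies: dict, rollup_measurement_id: str) -> dict:
    """Copy the event and override session ID and number with the roll-up values."""
    params = dict(event_data)  # leave the original event untouched for other tags
    session_id = cookies.get("rollup_sid")
    session_number = cookies.get("rollup_sn")
    if session_id:
        params["ga_session_id"] = session_id
    if session_number:
        params["ga_session_number"] = int(session_number)
    return {"measurement_id": rollup_measurement_id, "params": params}


incoming = {"event_name": "page_view", "ga_session_id": "1003", "ga_session_number": 2}
cookies = {"rollup_sid": "1700000000", "rollup_sn": "7"}
rollup_hit = build_rollup_hit(incoming, cookies, "G-ROLLUP")  # placeholder ID
print(rollup_hit)
```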

Closing the gap in the property change

What remains is the problem of missing events when changing the property / domain. A normal GA4 tag is no solution because (as tested in option 1) it is not possible to overwrite system properties. Instead (among some other things that require custom code in custom tag templates) you can simply use the Measurement Protocol.

To do this, just send the event via Measurement Protocol as part of the existing session when changing properties, so that there are no session start or first_visit markers (in this case, they cannot even be included). Using the same session ID (the number can be omitted, although it can be sent as session_number), the hit ends up in the context of the web session, which is composed of the individual sessions of the sub-properties by the common session ID. Such events are more or less equivalent to a “normal” hit, which in most cases should be a page view… but it doesn’t have to be. In my tests, I actually found all kinds of other events that had been sent before the page_view occurred (a list can be found at the end of this article).

Filling the gaps with GA4 Measurement Protocol
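For readers without the screenshot, a minimal Measurement Protocol request of this kind can be sketched as follows. The endpoint and the `client_id` / `events` / `params` structure follow the public GA4 Measurement Protocol; the measurement ID, API secret, the `engagement_time_msec` value of 1 and the value of the `serverside_route` debugging parameter are placeholders chosen for this illustration. The function only builds the request and does not send it.

```python
import json
from urllib.parse import urlencode

MP_ENDPOINT = "https://www.google-analytics.com/mp/collect"


def build_mp_request(measurement_id, api_secret, client_id, session_id,
                     event_name, extra_params=None):
    """Build URL and JSON body for one event that joins an existing session."""
    query = urlencode({"measurement_id": measurement_id, "api_secret": api_secret})
    params = {
        "session_id": session_id,            # shared roll-up session ID
        "engagement_time_msec": 1,           # placeholder value
        "serverside_route": "property_change",  # debugging marker, value is made up
    }
    params.update(extra_params or {})
    body = {"client_id": client_id, "events": [{"name": event_name, "params": params}]}
    return f"{MP_ENDPOINT}?{query}", json.dumps(body)


url, body = build_mp_request(
    "G-ROLLUP", "my-api-secret", "123456.7654321", "1700000000",
    "page_view", {"page_location": "https://shop.example.com/"},
)
print(url)
print(body)
```

Since no `_ss` or `_fv` marker exists in this format, the event cannot open a new session or trigger a first visit in the roll-up property.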

Important note: The design of the event shown here only includes the absolute minimum state (apart from the “serverside_route” which I only added for debugging purposes in this experiment). In order to cover all parameters — and e.g. also pass on user properties — either a custom tag template is required that covers all required fields (that’s how I solved it) or the above example has to be supplemented with all relevant event parameters. Since there can also be e-commerce events, incomplete parameters would not be able to fulfill their actual purpose in terms of reporting. For example: With the above minimal setup I saw gaps in conversions and differences in total engagement time due to missing parameters until I forwarded all necessary information by GA4MP as well.

The tag is triggered when a property change occurs. The same trigger is defined as an exception for the “normal” roll-up GA4 tag, so that it is blocked on these events.

Trigger for firing GA4MP tag (and blocking regular roll-up tag)
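Taken together, the marker cookie and the session start flag decide which tag handles a hit for the roll-up property. The following Python sketch spells out that decision; the function and key names are hypothetical, and in ssGTM the same logic is expressed through trigger conditions and exceptions.

```python
from typing import Optional


def route_rollup_hit(session_start: bool, marker_cookie: Optional[str]) -> dict:
    """Decide how a hit is forwarded to the roll-up property."""
    # A session start while the marker already exists means the property changed
    property_change = session_start and bool(marker_cookie)
    return {
        "set_marker_cookie": session_start and not marker_cookie,
        "fire_rollup_ga4_tag": not property_change,
        "fire_ga4mp_tag": property_change,
    }


print(route_rollup_hit(True, None))    # first session start: normal tag, set marker
print(route_rollup_hit(True, "1"))     # property change: Measurement Protocol tag
print(route_rollup_hit(False, "1"))    # any other hit: normal tag
```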

In this way you have created the best possible approximation of a roll-up property for the individual parts of the website. Depending on the reason for which different properties exist, property changes within a session may be so irrelevant that you can simply save yourself the effort of filling them up using the measurement protocol. Tagging all events in a custom property enables filtering of all events created by this tag in GA4 (mostly they should be page_view events).

Option 3: Measurement Protocol “All In”

A completely different approach would be the continuous and sole use of an HTTP request tag to send the hits via Measurement Protocol. If the (reduced) data of all incoming hits of the individual properties are simply passed on to GA4, some problems are eliminated. Since a session ID (generated on the server side and stored in a cookie by Cookie Monster) can easily be generated and used as a common bracket around all hits, the Measurement Protocol is theoretically even the best choice. A session number is not required to achieve the measurement of a common session. I actually tried it, limiting myself to the essential events. Engagement, for example, is sacrificed this way, because in GA4MP it has to be specified on the hit itself. Sure, you can tinker and get engagement timings from A to B, but it wasn’t worth the effort for a test, so I went with a constant default value:

(Too) simple Measurement Protocol hit with HTTP Request tag in ssGTM
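The reduction described above can be sketched as a filter that keeps only essential events and attaches the shared session ID plus a constant engagement value. The allow-list, the constant of 100 ms and the input structure are assumptions for this illustration, not the values used in the test.

```python
ESSENTIAL_EVENTS = {"page_view", "purchase", "generate_lead"}  # hypothetical allow-list
DEFAULT_ENGAGEMENT_MSEC = 100                                  # hypothetical constant


def to_mp_events(incoming_events, session_id):
    """Reduce incoming events to a minimal Measurement Protocol event list."""
    events = []
    for event in incoming_events:
        if event["name"] not in ESSENTIAL_EVENTS:
            continue  # non-essential events are dropped in this approach
        params = {
            "session_id": session_id,
            "engagement_time_msec": DEFAULT_ENGAGEMENT_MSEC,
        }
        if "page_location" in event:
            params["page_location"] = event["page_location"]
        events.append({"name": event["name"], "params": params})
    return events


incoming = [
    {"name": "page_view", "page_location": "https://www.example.com/"},
    {"name": "scroll"},
    {"name": "purchase"},
]
mp_events = to_mp_events(incoming, "1700000000")
print(mp_events)
```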

The resulting reports are not completely unusable… but as I have already described in my post on GA4 website measurement via Measurement Protocol (German), many bells and whistles are missing. It may be that this route becomes more suitable in the future if Google decides to do more with the data that comes from GA4MP. As of now, attribution can only be done manually, in BigQuery or through raw data processing. Nevertheless: If you don’t have an incoming GA4 data stream at all, but still want to create a “roll-up property” in GA4 from various incoming requests for different purposes, you can definitely go this route.

Option 4: Own client

If you assume that incoming GA4 requests should only be forwarded to GA4, then you still typically have a client that translates everything into the event model and a tag that reassembles the same request. This makes sense in most setups, because there is more than one tag waiting for an event on the other side. At least in an experiment, it would be conceivable that you could save all of that completely. In this case, whenever a GA4 request arrives, a separate client would react to it.

It can forward the incoming request including all other events sent as a batch in one go to the GA4 roll-up property (forwarding requests to google-analytics.com including — almost — all parameters and POST payload) and then process it normally and make the events available to tags. The client’s overhead then consists of managing the measurement state of the session start using a separate cookie and reacting accordingly whenever a _ss marker is present in the request. The result should actually be the same in GA4 if you also include a few headers and thus pass on the (shortened) IP and user agent in a controlled manner.
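Since this option remained an experiment, the following is only a sketch of the rewriting step such a client would need, written in Python rather than as a GTM client template. It relies on the GA4 request parameters `tid` (Measurement ID), `sid` (session ID), `sct` (session count), `_ss` and `_fv`, and it only touches the query string; in batched requests the same treatment would presumably be needed for each line of the POST body.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def rewrite_for_rollup(path_and_query, rollup_id, session_id, session_number,
                       start_already_measured):
    """Rewrite an incoming GA4 request so it targets the roll-up property."""
    parts = urlsplit(path_and_query)
    rewritten = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == "tid":
            value = rollup_id
        elif key == "sid":
            value = session_id
        elif key == "sct":
            value = str(session_number)
        elif key in ("_ss", "_fv") and start_already_measured:
            continue  # the roll-up session already started: drop the markers
        rewritten.append((key, value))
    return urlunsplit(("https", "www.google-analytics.com", parts.path,
                       urlencode(rewritten), ""))


incoming = "/g/collect?v=2&tid=G-SHOP&cid=123.456&sid=1003&sct=2&_ss=1&en=page_view"
forwarded = rewrite_for_rollup(incoming, "G-ROLLUP", "1700000000", 7, True)
print(forwarded)
```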

The goal would be to save all the overhead of event processing in client and tag. This should result in less load at the endpoint, which is ideally reflected in the required instances and/or costs. Especially since you don’t have multiple calls for batches, but can process everything in one request and batch it out. If anyone is interested in this experiment — talk to me! 😉

Note: If you are tempted to simply use a kind of “Request Repeater” as tag in ssGTM as a variant of the above idea to pass on all incoming parameters of the requests to GA4, you will quickly run into a problem: Page views or other events are either higher or lower than the sum of the partial properties. This is due to batching, where an incoming request can contain multiple events. When forwarding in a tag instead of a client, the events that may be contained in the request body are either missing or processed several times, depending on the setup. Since conversions would also be affected, it’s not really a solution (I actually tried it 😐).
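The over- and undercounting can be reproduced with a toy model. It assumes that a batched GA4 request carries shared parameters in the query string and one event per line in the POST body, and that a tag fires once per event; the request contents are invented for illustration.

```python
from urllib.parse import parse_qs


def event_names(query_string: str, body: str = "") -> list:
    """List the event names contained in one GA4 request."""
    lines = [line for line in body.splitlines() if line.strip()]
    if not lines:
        # Unbatched request: the single event name sits in the query string
        return parse_qs(query_string).get("en", [])
    names = []
    for line in lines:
        names.extend(parse_qs(line).get("en", []))
    return names


query = "v=2&tid=G-SHOP&cid=123.456&sid=1003"
body = "en=page_view&dl=https%3A%2F%2Fshop.example.com%2F\nen=scroll\nen=view_item"

names = event_names(query, body)
tag_executions = len(names)  # the tag runs once per event in the batch

# Variant A: every tag execution forwards the complete request including the body
forwarded_with_body = tag_executions * len(names)
# Variant B: every tag execution forwards only the query string
forwarded_without_body = tag_executions * len(event_names(query))

print(names, forwarded_with_body, forwarded_without_body)
```

Three original events turn into nine in variant A and into none in variant B, which matches the "either missing or processed several times" observation.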

Results

I checked all the tests using a GDS report, in which I combined selected metrics from the four individual properties involved in the test via data blending and compared them with the data from the roll-up property. Since up to five data sources can be merged, differences between totals and roll-up can also be displayed directly. I used the main website, a blog, a B2B and a B2C shop in my test.

The following figure shows (almost) the expected result when using option 2B (regardless of whether the roll-up session is managed on the client or server side):

  • Users and sessions should be lower, as there is a high probability of switching between the individual parts
  • Pageviews should match
  • Revenue and transactions should also be the same
  • Total conversions should correspond to the roll-up value
  • Event count should be lower because First Visit and Session Start events are missing
  • The distribution of users to individual source / medium combination is independent of the individual properties (not shown in the picture)
Results for roll-up property compared to single properties

There are minor differences in page views (“views”), conversions, and revenue, but overall things seem to be working out very well.

Where do the differences come from?

A certain difference in the events is to be expected / welcomed, because there must be a “hole” if the session_start and first_visit from the individual properties no longer appear in the roll-up if a session has already been started there. The control of individual event types via corresponding filters shows a certain pattern: A few view_item and user_engagement events are missing, so that the measurement of the engagement time is also somewhat wrong. The following view shows that of the 9,681 difference in events, a whopping 9,509 are due to expected session starts and newly recognized users — and engagement events.

Event breakdown for “missing” events
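The two figures quoted above allow a quick check of how much of the gap is explained. The snippet uses only the numbers from the text.

```python
total_event_gap = 9_681  # difference in event count between sum and roll-up
expected_gap = 9_509     # part attributed in the text to expected events

unexplained = total_event_gap - expected_gap
explained_share = expected_gap / total_event_gap

print(unexplained)               # events that remain unexplained
print(f"{explained_share:.1%}")  # share of the gap that is expected
```

So roughly 98 percent of the difference is expected, and fewer than 200 events remain to be explained.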

The explanation (at least I think so) lies in events that arise while a GA4 session of the single property still exists, but the roll-up session in the browser no longer does. This happens e.g. after closing the browser, which ends the life of the session cookie used for this purpose. If events are then sent as part of a batch (= several GA4 events in one request), such gaps may sometimes arise. The reverse case can also have this effect. Therefore there are very small differences. But is that tragic? Certainly not. If you want to close the event gap further, you can find solutions in the administration of the roll-up session, e.g. using a shared cookie with a lifetime independent of the session, possibly hardened by the server or generated directly there (see above).

The difference in transaction revenue remains. I can’t really explain the “around 50 cents per day”, but there was always a certain deviation in the cent range, both during the test period (in which the number of individual properties was limited to allow for a clear test setup with simple evaluation in Google Data Studio) and afterwards. It seems pointless to invest more time in further research here.

As you can see, the differences are marginal or explainable and easy to live with. The roll-up property does its job: A cross-single-session measurement that traces conversions and revenue back to their source, regardless of the “site section” they occur in or where the visitor might have entered.

Important note: Conversions via Measurement Protocol

Since not only page views, but also other events can be affected when a property change takes place and special treatment is necessary, all possible conversions must be marked or entered in the roll-up property in the configuration area. Because while the information that an event is a conversion arrives correctly in the GA4 roll-up property when using the standard GA4 tag, this is not the case with the measurement protocol. In order for such events sent via GA4MP to be correctly recognized as a conversion in the roll-up property, the conversion configuration must also be repeated there. Yes, exactly: I fell for that (like many stumbling blocks that can no longer be seen in this article). Here you can see the various events that were transferred with a different tag and not the normal “GA4 Roll-Up Tag” when the property was changed.

Not just page views: every event can contain a session start marker

As you can see, while property switch events are mostly page_view events, anything else can be there too. If there are also conversions such as generate_lead, those will be missing in conversion reports in a roll-up property without the appropriate configuration.
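A simple way to catch this early is to compare the event names that travel via Measurement Protocol with the conversion lists of both sides. The sketch below is plain set arithmetic in Python; the event lists are invented, and how you obtain them (manually from the admin UI, for example) is up to you.

```python
def missing_conversions(mp_event_names, single_property_conversions, rollup_conversions):
    """Conversions that arrive via Measurement Protocol but are not configured in the roll-up."""
    sent_conversions = set(mp_event_names) & set(single_property_conversions)
    return sorted(sent_conversions - set(rollup_conversions))


mp_events_seen = ["page_view", "generate_lead", "view_item", "purchase"]
configured_in_singles = ["generate_lead", "purchase", "sign_up"]
configured_in_rollup = ["purchase"]

print(missing_conversions(mp_events_seen, configured_in_singles, configured_in_rollup))
```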

Restrictions

With option 2, the results are in my opinion quite usable, both in GA4 itself and in BigQuery. There are some limitations to this method that are difficult to address when they occur. Luckily, not all issues are equally relevant to every setup.

  • There may be inconsistencies in the configuration of event properties… and — much worse — user properties. Once there is a session scope, it doesn’t get any better.
  • Within a session, the status of an “Engaged” session can change back to “Not Engaged” when the property is changed
  • The result quality of the roll-up property is largely dependent on the setup of the individual properties. All partial properties should ideally be set up in a consistent way and kept in sync.
  • Missing first visit information such as “First Session Source” occurs when legacy issues exist, i.e. client IDs are not new. The first entry measured in this way may not be reported as a first visit from the roll-up perspective. In that case, a marker for the first visit is simply missing for this new user and the corresponding dimensions are therefore not assigned either.
  • The limit is reached with cross domain tracking (except for option 1), because your own session cannot be transported between the domains (without further work in the client and sacrificing httpOnly … or switching to fingerprints and server-side state).

Where to start?

In general, setting up option 2 can be done with a few variables and tags, regardless of the long explanation in this article, if an ssGTM is already in use. And following a similar path, other first-party endpoints can be used to consolidate incoming data streams of different properties as well. Even if setting up an ssGTM is associated with costs and maintenance: Depending on how important parallel data evaluation in GA4 (without parallel tagging in the browser) is, these are negligible compared to the costs of the 360 variant. So you don’t necessarily have to buy the “large version” for this reason alone if the other advantages are irrelevant.

If your requirements are simple or you have separate target groups, option 1 might be the best choice (at least to start with). It only requires one additional tag on the server, and you are done. “All-In Measurement Protocol”, on the other hand, only makes sense in exceptional cases. For example, if the incoming data stream is not in GA4 format and only limited event properties are available anyway, the GA4MP may be an option. Building your own client remains untested, but could be profitable with larger amounts of data and theoretically has the best coverage in the roll-up property with minimal computing effort on the server.

In case of doubt which option fits your own requirements best, test yourself using several options or variants (server-side vs. client-side session management, for example). At least that’s how I found my preferred solution. If you want to give it a try: best of luck!
