Path Analysis: A process of determining a sequence of pages visited in a visitor session prior to some desired outcome (a purchase, a sign up, visiting a certain part of the site, etc.). The end goal is to get the sequences of pages, each of which forms a path, that lead to the desired outcome. Usually these paths are ranked by frequency.
Is doing Path Analysis a good use of time? In my humble opinion the answer is a rather emphatic no, with one exception (which I’ll discuss below). Path Analysis almost always tends to be a suboptimal use of our time, our resources and any money spent on buying tools that do “great” Path Analysis.
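As a minimal sketch of what "ranked by frequency" means in practice, the snippet below counts the page sequences that precede a desired outcome. The sessions and page names are made up for illustration; a real tool would build them from log or tag data.

```python
from collections import Counter

# Hypothetical sessions: each is the ordered list of pages viewed.
sessions = [
    ["home", "product", "cart", "purchase"],
    ["home", "search", "product", "cart", "purchase"],
    ["home", "product", "cart", "purchase"],
    ["product", "home"],  # never reaches the outcome
    ["search", "product", "cart", "purchase"],
]


def paths_to_outcome(sessions, outcome):
    """Count the page sequences that precede the first occurrence of `outcome`."""
    counts = Counter()
    for pages in sessions:
        if outcome in pages:
            counts[tuple(pages[: pages.index(outcome)])] += 1
    return counts


# Paths ranked from most to least frequent.
ranked = paths_to_outcome(sessions, "purchase").most_common()
for path, n in ranked:
    print(n, " > ".join(path))
```

Sessions that never reach the outcome are simply dropped here, which is one reason this kind of report says nothing about why people leave.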
We usually strive to do Path Analysis in a quest to find this magic pill that will tell us exactly what “paths” our visitors are following on our website. If they “follow” the path we intended we celebrate.
If, as usually turns out, our visitors don’t “follow” the path that WE want them to follow, then it is back to the drawing board to redesign the site structure / architecture to get them to “follow” the path or, at times, worse: hours of “analysis” on what the heck they were thinking when they clicked on this button or went to that page (bad customers, bad customers!).
Challenges with Path Analysis are:
- Imagine a website with five pages. Page one Start, Page five Finish. With a simple visualization in your mind you can imagine the number of paths that a visitor could take. Now imagine a website with 100 pages, now one with 5,000 pages. The number of possible paths quickly becomes infinity (well not really but you get the point).
- Most of our tools do a terrible job of representing this path: click forward, back to home, click forward, reverse to three pages ago, hit buy. In a world of linear path representation at a page level this is really hard to compute, even harder to depict. Yet this is exactly how our customers browse our websites.
- On most websites the most common path is usually followed by fewer than five percent of visitors, typically about 1%. As responsible analysts, could we make any decision based on something that such a small fraction of site traffic is doing?
- Even if the most common path is followed by 90% of the visitors, current Path Analysis has two fatal flaws:
  - It can’t show / say which page in a series was most influential in convincing a customer to move on.
  - Current tools aggregate traffic into one bucket, when in reality each segment of traffic behaves differently (say DM traffic vs SEM vs “bookmarks” vs Print Ads). Segmentation is always key.
All of the above combine to make it quite hard to glean actionable insights that will help make our websites more endearing to our customers.
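To put a rough number on the first challenge, here is a back-of-the-envelope sketch. It assumes a hypothetical site where every page links to every other page and where a visitor never revisits a page, which understates real browsing (no back button, no loops), so the true count is larger still.

```python
from math import perm


def simple_path_count(n_pages):
    """Loop-free paths from Start to Finish when every page links to every other.

    A path picks k of the (n_pages - 2) middle pages, in some order,
    for every k from 0 up to all of them.
    """
    middle = n_pages - 2
    return sum(perm(middle, k) for k in range(middle + 1))


for n in (5, 10, 100):
    print(n, "pages:", simple_path_count(n), "possible paths")
```

Even with these generous restrictions, the five-page site already has 16 distinct paths, and the 100-page site has more than 10^150.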
There is one exception to this rule. For structured experiences such as a Checkout or a Closed-off DM Landing Page experience (no navigation, just Next – Next – Next – Submit) Path Analysis can identify where the “fall off” occurs. Once that is identified we will still not know the Why (see Qualitative Metrics Post) but Path Analysis is helpful.
Here is an example of a new way of thinking about “Path Analysis” that I think is heading in the right direction. (Please see Disclaimers – Disclosures first.) There are at least three more things I would like to see fixed in this version, but ClickTracks addresses some of the usual fatal flaws here.
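For a closed, strictly ordered process like that, the fall-off calculation is simple. The sketch below is only an illustration: the step names and sessions are hypothetical, and it assumes a visitor cannot reach a step without completing the previous one.

```python
# Hypothetical ordered steps of a closed checkout.
steps = ["cart", "shipping", "payment", "review", "confirmation"]

# Hypothetical sessions: the checkout pages each visitor reached.
sessions = [
    ["cart"],
    ["cart", "shipping"],
    ["cart", "shipping", "payment"],
    ["cart", "shipping", "payment", "review", "confirmation"],
    ["cart", "shipping", "payment", "review", "confirmation"],
    ["cart", "shipping"],
]


def funnel_fall_off(sessions, steps):
    """Return (step, next_step, reached, lost, loss_rate) for each transition."""
    reached = [0] * len(steps)
    for pages in sessions:
        seen = set(pages)
        for i, step in enumerate(steps):
            if step not in seen:
                break  # closed process: later steps cannot be reached
            reached[i] += 1
    report = []
    for i in range(len(steps) - 1):
        lost = reached[i] - reached[i + 1]
        rate = lost / reached[i] if reached[i] else 0.0
        report.append((steps[i], steps[i + 1], reached[i], lost, rate))
    return report


report = funnel_fall_off(sessions, steps)
worst = max(report, key=lambda row: row[4])
print("Biggest fall-off:", worst[0], "->", worst[1], f"({worst[4]:.0%} lost)")
```

This tells us where visitors drop out, not why; the Why still needs qualitative data.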
- It is possible to break down a linear process into one in which we can group a bunch of related pages (say all product pages) into “groups”. This helps fix the problem of linearity because customers can go from A to B to C or C to A to B and it does not matter for related content.
- It is possible for Visitors to show up in any stage at any point (this is actual behavior now with SEO influencing where people land). Google Analytics also has this feature (please correct me if others do as well).
- Perhaps the cutest thing is that it shows which page in the “Path” is most influential in moving people to the next stage. This is awesome because one can simply look at the “darker shaded” pages and know, for example, that no one cares about system requirements but rather the page on our 10 year no questions asked return policy is the most important one in convincing people to add to cart.
- It is also quite easy to view how different segments are influenced by different content; in my unreadable screen shot you can see All Visitors vs Visitors from Google. Imagine this intelligence then turned around and applied to personalization (!).
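To make the ideas in this list concrete, here is a naive sketch of content groups, per-page "influence" and segmentation. It is not ClickTracks' algorithm, which is not described here. As a stand-in, it assumes influence can be approximated by the share of sessions viewing a page that also reached the next content group, regardless of the order in which pages were viewed. The page names, groups and sessions are all hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping of pages to content groups ("stages").
GROUP_OF = {
    "widget-a": "product",
    "system-requirements": "product",
    "return-policy": "product",
    "cart": "cart",
}

# Hypothetical sessions, each tagged with a traffic source for segmentation.
sessions = [
    {"source": "google", "pages": ["widget-a", "return-policy", "cart"]},
    {"source": "google", "pages": ["return-policy", "widget-a", "cart"]},
    {"source": "direct", "pages": ["widget-a", "system-requirements"]},
    {"source": "direct", "pages": ["system-requirements", "widget-a"]},
    {"source": "direct", "pages": ["widget-a", "return-policy", "cart"]},
    {"source": "google", "pages": ["widget-a"]},
]


def page_influence(sessions, group, next_group):
    """For each page in `group`: share of sessions viewing it that reached `next_group`."""
    viewed = defaultdict(int)
    advanced = defaultdict(int)
    for session in sessions:
        pages = session["pages"]
        reached_next = any(GROUP_OF.get(p) == next_group for p in pages)
        # A set, so a page counts once per session and order within the group is ignored.
        for page in {p for p in pages if GROUP_OF.get(p) == group}:
            viewed[page] += 1
            advanced[page] += reached_next
    return {page: advanced[page] / viewed[page] for page in viewed}


all_visitors = page_influence(sessions, "product", "cart")
google_only = page_influence(
    [s for s in sessions if s["source"] == "google"], "product", "cart"
)
print("All visitors:", all_visitors)
print("Google only: ", google_only)
```

In this toy data the return policy page scores highest and the system requirements page lowest. A proxy like this shows association, not proof that the page caused the next step.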
This is not perfect but getting there and I think all the vendors will soon coalesce around this innovation and we will all be greatly empowered.
Path Analysis as it is practiced currently ultimately is like communism (with sincerest apologies to anyone in my audience who might be offended). There are overt/covert intentions to control things, to try to regulate, to say that we know better than you what you want, to push out a certain way of thinking. I know this sounds extreme, and it is, but simply for shock value and not to offend anyone.
The web on the other hand is the ultimate personal medium and one in which we all like different things, we all have specific preferences and opinions and a certain way we want to accomplish something. The beauty of the web is that all that is possible, and cheaply, with easily accessible technology. So why do typical Path Analysis and why try to “push” a certain way of navigation / browsing / buying? Why not get a deep and rich understanding of our customers and then provide them with various options to browse our website the way they want to and get to the end goal the way they want to?
Why not let democracy flourish? On our websites and in our customer experiences?
Agree? Disagree? Does Path Analysis work for you? Please share your feedback via comments.
…Google Analytics also has this feature (please correct me if others do as well)…
Paths by "content group" have been available for quite some time in WebTrends, I believe ever since they implemented a "tag" version. Agree that path analysis is worthless without segmentation, but then again, (as you pointed out previously), just about any analysis is worthless without segmentation!
Avinash, I can't disagree with most of what you say, yet I find that path analysis, at a low level (small sample), is incredibly useful for understanding user behavior on the web. Granted, I still can't see when they clicked the back button and often don't understand how they can start on certain pages. But if I take 100 random successful clickpaths and analyze them (not even analyze them, just read them), I start to have a richer understanding of movement online. I also start to see what customers *don't* do.
Jim: Thanks for checking out the blog and leaving a comment (just the kind of thing that will keep me going, hearing from greats in the industry such as yourself).
I think you are right, but what is different about the latest ClickTracks attempt is that in the "content group" you can get the usage of each page in the content group plus, and this is killer, the "influence" of each page in the content group in convincing our site visitors to move to the next stage.
This is simple in computation but awesome in insights.
Hey Avinash – great blog! Just passing by and (almost) missing this industry… Take care -Xavier
Offhand thoughts from the sidelines:
1) Adding to Robbin's comment… How about Path Analysis providing insight on user-brand/site interaction, awareness, involvement and therefore quality of audience? Thoughts?
2) To draw an association, say Path Analysis is a game of 20 questions – the user tends to or needs to assimilate certain points of information to think/feel about what is being read or seen, so that it can be actioned. If we were to work backward from desired site action (say sign up, purchase etc.) can path analysis tell us what those 20 questions or information points are, that ready up the user for an action? Given this, it only makes sense that content grouping is then a more logical way to arrive at level of importance of each factor to the user.
3) Given the movement to user-generated content, the question of "Why not let democracy flourish?" is very appropriate.
Sulakshana, Thanks for the comment…..
The wonderful thing about analysis, vs reporting, is that there are always multiple answers and those answers are influenced by our experiences and perspectives. There is no right or wrong, just different.
The source of my perspective is that if our website is getting a million visitors a month then I would be on very thin ice inferring anything from the path (or think site interaction) of a thousand visitors.
This is especially true if you pause for a moment and think that any given website exists for multiple purposes. When you pick a small sample you increase the chance of seeing causality where you should not.
On brand / awareness / involvement I would switch from relying on Path Analysis to doing lab testing: get a few people in a room and ask them what they think. This is a bit better than Path because at least you can ask them stuff and get into their heads (even accounting for the Hawthorne Effect).
I stress that this is just a personal point of view, the great thing about our space is there is enough room for many different points of view.
It is midnight in the US as I type this and the illusion of walking backward in a user's path just seems so dreamy …… nice …. interesting. :)
Traditional Path Analysis, I think not. But as you rightly point out "influence rating", as outlined in the post above, in each content group I think certainly yes.
Thanks again for some thought provoking comments.
Yes, that is "killer", I'll have to ask John Marshall what kind of algo he is using there.
I think we basically agree that "unfocused path reports" are useless. Personally, I think path analysis is critical but I never look at "paths" in the traditional sense of the reports, I only look at one path at a time defined by a specific entry page; that is, the entire report includes only one entry page. That way, you "force" the consolidation of visitor paths and you often do see "most popular path" hitting 20% – 50%.
The success of this approach depends quite a bit on the navigation, of course. If you have 100 link choices on the entry page it probably does not work as well, but I would argue that giving people more than, say, 10 link choices is insane anyway, and 5 is probably much better…
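A rough sketch of this one-entry-page-at-a-time view, using made-up sessions and page names: group the sessions by their first page, then ask what share of each group follows that group's most common path.

```python
from collections import Counter, defaultdict

# Hypothetical sessions: each is the ordered list of pages viewed.
sessions = [
    ["landing-a", "pricing", "signup"],
    ["landing-a", "pricing", "signup"],
    ["landing-a", "features"],
    ["landing-a", "pricing"],
    ["home", "blog"],
    ["home", "pricing", "signup"],
]


def top_path_share_by_entry(sessions):
    """For each entry page, return its most common full path and that path's share."""
    by_entry = defaultdict(Counter)
    for pages in sessions:
        if pages:
            by_entry[pages[0]][tuple(pages)] += 1
    result = {}
    for entry, counts in by_entry.items():
        path, n = counts.most_common(1)[0]
        result[entry] = (path, n / sum(counts.values()))
    return result


shares = top_path_share_by_entry(sessions)
path, share = shares["landing-a"]
print("landing-a:", " > ".join(path), f"{share:.0%}")
```

In this toy data the top path is a third of all sessions, but half of the sessions that entered on landing-a, which is the consolidation effect described above.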
Jim, Thanks for the follow up….
I think this idea is quite insightful: you are both increasing the population and using path analysis much like the exception example in the blog post. This is great.
The number of links on a page is one of the reasons path analysis might not yield insights (in the traditional sense, not in the sense you mentioned). It is often not considered in analysis.
:) Like they do in fine art, you turn a painting around to check if it is actually symmetrical; looking at it the right way up is something one gets used to.
I'm so happy to read our design getting these compliments. We spent 10 months trying different prototypes and so many of them suffered from fatal flaws like too much complexity, or inability to compare segments side by side.
If anyone is interested in our approach and how the data expresses persuasiveness, and how segmentation works, please consider attending the next free virtual class in our series: http://www.clicktracks.com/seminars/
Very nice analysis, Avinash. And I certainly feel as though I am in the company of geniuses as I review the comments.
Path analysis has been a very misunderstood element of measurement, and always one of the most disappointing, IMO. I often see that clients whose analytics software has this feature believe that they are missing something. I think that is because it looks like a feature that promises valuable data, but it rarely delivers.
I agree wholeheartedly that the analytics programs (Shout-out to ClickTracks) that allow the user to uncover "the long tail" paths by segmentation are the most valuable.
(I just saw this post, but I'll give it my 2 cents anyway.)
When asked by clients how people navigate their sites, my customary answer is "Only in two ways: a) how you planned it, and b) not", the latter being the most common "pattern".
Apart from doing scenario analysis (in WebTrends; Fallout Analysis in Omniture, etc.) of a preferably closed process (in which you can't go to step 3 if you didn't complete step 2, etc.), which is analyzing a pre-configured model rather than pathing, I too believe that the heuristic value of that type of analysis is very low, if not null in many instances.
I find the idea of determining (linear regression?) the influential/causal weight of critical pages on certain actions (ex. completion of so and so) VERY interesting. So, instead of trying to base optimization decisions on vague pathing, we could identify which pages have a stronger propensity to "deliver" the expected behavior.
I think you're spot on… you need to focus on the page groups, and then the effective pages within a group. This is a good paper on categorizing pages for analysis: http://www.semphonic.com/resources/wpaper_005.pdf.
At each level, pages have different functions. If you look at the pages in groups of what they were designed to do, you can then analyze to see if they are doing what they were intended for.
Currently, we are doing an analysis to determine the time gap from first visit to purchase; the purpose is to see if we can figure out something to make our 'pages' more attractive. But I think it is hard to get insights since it is hard to separate the time_band; there are so many unstable reasons that can cause our customers to delay. How can I determine which time gap is more 'comfortable' for us? I hope I can get some suggestions from you on how to dig the gold from this analysis. Thanks for your advice.
I work with several performers. In looking at their visitor logs, I like to track how many look at the calendar, from there how many look at an individual show/date's details, and from THERE how many buy tickets. It's not the same as a "next, next, next, checkout" type deal, but if they get to the calendar after several prompts to do so, and 1 in 100 buys tickets to a show, we have a conversion rate… KINDA. It's not science, really, but I was thinking that someone who is better with the log analysis tools than I am could make science out of it :). Or is it just like Twain said: "There's lies, damn lies, and statistics"?
Dana :
I think that makes sense, you are essentially trying to understand content consumption, or people moving from one "block" to the next rather than page1, page2, page3… This should be quite possible.
In fact you can use the Google Analytics funnel report to create a view that looks like what you want. Start there, possible next step is this and then that and then goal. You get the idea.
ClickTracks also had a great funnel report. I love the ultimate flexibility of it and the ability to have multiple pieces of content (individual pages or directories) in each step, and of course multiple steps to a goal. Check out this page:
http://clicktracks.com/funnel_report.php
or this flash video:
http://clicktracks.com/demos_small/funnel_report_demo.php
I am sure other tools can do this as well.
Hope this helps a bit.
-Avinash.
The reason I found this article was as part of an investigation around performance testing. Such testing practically relies on having an idea of HOW 'real' users navigate your system (as opposed to just following the 'happy path'). OK, such analysis might not tell you WHY the various paths are followed, but they will help weight the tests/scenarios performed as part of load/performance testing.
I like your point about using the data for personalisation (and qualitative studies). This is on my wishlist to present content to the customer (e.g. main benefits of a product) in a way that is tailored to their segment, rather than presenting/selling in the same way to all…
Kudos on the article… Not sure if I am impressed with the knowledge here or how many super nerds actually exist out there. Call me if you want to talk nerd… 408 x
Is it possible to find the complete path of a visitor through Google Analytics?
I don't want to use the Navigation or Entrance path reports, which are only 1 or 2 levels deep.
@gargi: I use the Navigation Summary, take note of the most popular pages, then click on them and then take note of the popular pages again and so on, this way I can get the information for more than 1-2 levels. I then add this information into a mind manager program to see this information visually.
Hey Avinash
Great post. It takes a lot of patience to read your posts.
The in-depth analysis reflects the dedication that you put into each of your posts.
I am kind of new to Google Analytics, so was wondering if the path analysis is actually available with GA. I mean yes, we can look at goal flow but I am talking more about path analysis that was possible with Yahoo web analytics.
Any suggestions will help.
Cheers!!
Deep: Yes you can, and it is a bit smarter in trying to avoid some of the issues outlined in this post.
The report is called Visitors Flow. You can find it in the Audience folder in GA.
-Avinash.
Although it took me many years to get there, after spending many, many hours trying to find the golden nuggets in site behavior using pathing (Coremetrics, in the early days, had a decent stab at it), I agree completely with Avinash's perspective.
These days, I use session recording tools and their segmentation capabilities based on behavior to illuminate any glaring issues in a flow, and then determine whether they need to be fixed.
The people who make their way successfully through the site to purchase or consume free content can do it 1,000 different ways or more. Happy to leave it at that.