save-up-to-$450-on-expert-led-search-marketing-training

Your chance to save up to $450 on a seat at Search Marketing Expo expires this Saturday night.

Don’t wait — register now and join us February 19-20 in San Jose for an all-new agenda featuring 85 sessions organized into three lanes with no limits: SEO, SEM/PPC, and digital commerce marketing.

Breaking news! Benjamin Spiegel, Chief Digital Officer of P&G Beauty at Procter & Gamble, will join Dana Tan, Sr. Manager of Global SEO at Under Armour, and Search Engine Land Editor-in-Chief Ginny Marvin for a fireside keynote chat about increasing digital commerce sales at the intersection of SEO, PPC, and social. See the agenda!

Make this the year you drive more awareness, traffic, conversions, and sales with actionable tactics and fresh insights from the industry’s top experts.

If you’re planning to attend SMX West, do yourself and your wallet a favor — book by this Saturday, January 18, and enjoy serious savings! Once these rates are gone, they’re gone.

Register now and I’ll see you in San Jose!



About The Author

Lauren Donovan has worked in online marketing since 2006, specializing in content generation, organic social media, community management, real-time journalism, and holistic social befriending. She currently serves as the Content Marketing Manager at Third Door Media, parent company to Search Engine Land, Marketing Land, MarTech Today, SMX, and The MarTech Conference.



make-it-easy-for-search-engines-to-rank-your-website-in-2020

Contributor and SMX speaker Fili Wiese explains why it’s important to audit your website in 2020 to help search engine algorithms better understand your content.

Below is the video transcript:

Hello, my name is Fili. I used to work for Google and now I am at searchbrothers.com. One of the things I want to talk about is what you can do to prepare for 2020.

If you have a website and you are concerned about SEO, you do need to know what goes into search engines.

Search engines work with algorithms, and algorithms work with trends. And this all depends on what you put into those trends. Your website and your content are what you’re responsible for – and that’s what Google wants to rank, what Bing wants to rank.

Search engines want to rank your website, but you need to make it easy for them.

One of the things that you need to do if you haven’t done it yet, is to look at your log files. Do you have them? Do you have access to them? If you don’t, you need to do that today, sooner rather than later.

Also, you want to make sure that you audit your website, and you check all the technical signals that go into those trends. If you are impacted by the “Medic” update or other core updates, then you really want to know what’s going on with the signals you’re sending into the black box of the algorithms. What comes out of that black box are the rankings, the “serves.” And those serves depend heavily on what you put in.

So for 2020, focus on your input and let the search engines focus on the output.



Opinions expressed in this article are those of the guest author and not necessarily Marketing Land.



About The Author

Fili Wiese is a renowned technical SEO expert, ex-Google engineer and was a senior technical lead in the Google Search Quality team. At SearchBrothers he offers SEO consulting services with SEO audits, SEO workshops and successfully recovers websites from Google penalties.



2019-search-engine-patents-you-need-to-know-about

Bill Slawski, director of SEO research for Go Fish Digital, has published his list of the top 10 search engine patents to know from 2019. The list of patents touches on various sectors of search, including Google News, local search knowledge graphs and more, and gives us a peek at the technology that Google is, or may one day be, using to generate search results. 

Knowledge-based patents. The majority of the list pertains to what Slawski categorizes as knowledge-based patents. 

One interesting example is a patent on user-specific knowledge graphs to support queries and predictions. Last year, Google said that “there is very little search personalization” happening within its search result rankings. Although the original patent was filed in 2013, a recent whitepaper from Google on personal knowledge graphs touches on many of the same points.

Local search-based patents. Google’s patent on using quality visit scores from in-person trips to local businesses to influence local search rankings was filed in 2015 but granted in July 2019.

The use of such quality visit scores was mentioned in one of Google’s ads and analytics support pages, which noted that the company may award digital and physical badges to the most visited businesses, designating them as local favorites, Slawski said.

Search-based patents. Slawski also highlighted Google’s automatic query pattern generation patent, which evaluates query patterns in an attempt to extract more information about the intent of a search beyond whether it’s informational, navigational or transactional in nature. 

“That Google is combining the use of query log information with knowledge graph information to learn about what people might search for, and anticipate such questions,” Slawski wrote in his post, “shows us how they may combine information like they do with augmentation queries, and answering questions using knowledge graphs.”

Why we care. Of course, just because a company holds a patent doesn’t mean the technology is now, or ever will be, implemented. But keeping an eye on Google patents can offer an interesting perspective on where the company is steering search and how it’s thinking about the evolution of search.

For the full list of search engine patents to know from 2019, head to Slawski’s original post on SEO by the Sea.



About The Author

George Nguyen is an Associate Editor at Third Door Media. His background is in content marketing, journalism, and storytelling.

save-up-to-$700-off-actionable-search-marketing-tactics


Unbiased content you can trust, a training experience that fits your needs… book your pass now and join us February 19-20 in San Jose!


Make 2020 the year you achieve your traffic, ranking, and revenue goals: Attend Search Marketing Expo – SMX West – February 19-20 in San Jose for actionable tactics from industry experts.

Register by next Saturday, December 21 to save up to $700 off on-site rates! Here’s what you’ll get:

  • Unbiased content you can trust, all programmed by the experts at Search Engine Land, the industry publication of record
  • A training experience that fits your needs… the agenda is organized into three lanes with no limits… choose from up to six concurrent sessions on SEO, SEM, and digital commerce marketing (new!)
  • Access to an engaging community of marketers who love talking shop, trading ideas, and sharing new ways to overcome workplace challenges
  • Time-saving solutions from market-defining vendors and in-depth demos that inform and empower
  • Amenities that keep you productive on-the-go, including hot lunches, free WiFi, delicious refreshments, and more

Don’t wait. Choose your ideal pass and register by Saturday, December 21 to save up to $700 off on-site rates (and use up leftover 2019 budget while you’re at it!).

See you in San Jose 🙂

Psst… Striving to get your department on the same page? Send them to SMX for an unforgettable team-building experience. Find out about our special group rates and register today.



About The Author

Lauren Donovan has worked in online marketing since 2006, specializing in content generation, organic social media, community management, real-time journalism, and holistic social befriending. She currently serves as the Content Marketing Manager at Third Door Media, parent company to Search Engine Land, Marketing Land, MarTech Today, SMX, and The MarTech Conference.




building-a-search-engine-from-scratch

Building a search engine from scratch

A whirlwind tour of the big ideas powering our web search

December 6th, 2019

The previous blog post in this series explored our journey so far in building an independent, alternative search engine. If you haven’t read it yet, we would highly recommend checking it out first!

It is no secret that Google search is one of the most lucrative businesses on the planet. With quarterly revenues of Alphabet Inc. exceeding $40 billion[1], a large portion of them driven by advertising on Google’s search properties, it might be a little surprising to see the lack of competition to Google in this area[2]. We at Cliqz believe that this is partly due to the web search bootstrapping problem: the entry barriers in this field are so massive that even the biggest, most successful companies in the world, the ones with the resources to tackle the problem, shy away from it. This post attempts to detail the bootstrapping problem and explain the Cliqz approach to overcoming it. But let us first start by defining the search problem.

The expectation for a modern web search engine is to be able to answer any user question with the most relevant documents that exist for the topic on the internet. The search engine is also expected to be blazingly fast, but we can ignore that for the time being. At the risk of gross oversimplification, we can define the web search task as computing the content match of each candidate document with respect to the user question (query), computing the current popularity of the document and combining these scores with some heuristic.

The content match score measures how well a given document matches a given query. This could be as simple as an exact keyword match, where the score is proportional to the number of query words present in the document:

query:      avengers endgame
document:   avengers endgame imdb

If we could score all our documents this way, filter the ones that contain all the query words and sort them based on some popularity measure, we would have a functioning, albeit toy, search engine. Let us look at the challenges involved in building a system capable of handling just the exact keyword match scenario at a web scale, which is a bare minimum requirement of a modern search engine.
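
To make this concrete, here is a minimal Python sketch of such a toy engine: exact keyword matching plus a popularity sort. The documents and popularity scores are invented for illustration and nothing here reflects the actual Cliqz implementation.

# Toy search engine: exact keyword match plus a popularity sort.
# Documents and popularity scores are made up for illustration.
documents = {
    "https://example.org/avengers-endgame-imdb": "avengers endgame imdb cast reviews",
    "https://example.org/avengers-endgame-wiki": "avengers endgame 2019 film wikipedia",
    "https://example.org/weekend-recipes": "cooking recipes for the weekend",
}
popularity = {
    "https://example.org/avengers-endgame-imdb": 0.9,
    "https://example.org/avengers-endgame-wiki": 0.8,
    "https://example.org/weekend-recipes": 0.1,
}

def search(query, top_n=10):
    query_words = set(query.lower().split())
    candidates = []
    for url, text in documents.items():
        doc_words = set(text.lower().split())
        # Keep only documents that contain every query word (exact keyword match).
        if query_words <= doc_words:
            candidates.append((popularity[url], url))
    # Sort the surviving candidates by popularity and return the top N.
    return [url for _, url in sorted(candidates, reverse=True)[:top_n]]

print(search("avengers endgame"))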

According to a study published on worldwidewebsize.com, a conservative estimate of the number of documents indexed by Google is around 60 Billion.

1. The infrastructure costs involved in serving a massive, constantly updating inverted index at scale.

Considering just the text content of these documents, this represents at least a petabyte of data. A linear scan through these documents is technically not feasible, so a well understood solution to this problem is to build an inverted index. The big cloud providers like Amazon, Google or Microsoft are able to provide us with the infrastructure needed to serve this system, but it is still going to cost millions of euros each year to operate. And remember, this is just to get started.

2. The engineering costs involved in crawling and sanitizing the web at scale.

The crawler[3] infrastructure needed to keep this data up to date while detecting newer documents on the internet is another massive hurdle. The crawler needs to be polite (some form of domain level rate-limiting), be geographically distributed, handle multilingual data and aggressively avoid link farms and spider traps[4]. A huge portion of the crawlable[5] web is spam and duplicate content; sanitizing this data is another big engineering effort.

Also, a significant portion of the web is cut off from you if your crawler is not famous. Google has a huge competitive advantage in this regard: a lot of site owners allow just GoogleBot (and maybe BingBot), making it an extremely tedious process for an unknown crawler to get whitelisted on these sites. We would have to handle these sites on a case-by-case basis, knowing that getting their owners’ attention is not guaranteed.

3. You need users to get more users (Catch-22)

Even assuming we manage to index and serve these pages, measuring and improving the search relevance is a challenge. Manual evaluation of search results can help get us started, but we would need real users to measure changes in search relevance in order to be competitive.

4. Finding the relevant pages amongst all the noise on the web.

The biggest challenge in search, though, is the removal of noise. The Web is so vast that any query can be answered. The ability to discard the noise in the process makes all the difference between a useless search engine and a great one. We discussed this topic with some rigor in our previous post, providing a rationale for why using query logs is a smarter way to cut through the noise on the web. We also wrote in depth about how to collect these logs in a responsible manner using Human Web. Feel free to check these posts out for further information.

Query/URL pairs, typically referred to as query logs, are often used by search engines to optimize their ranking and for SEO to optimize incoming traffic. Here is a sample from the AOL query logs dataset[6].

Query                                  Clicked URL
google                                 http://www.google.com
wnmu                                   http://www.wnmu.edu
ww.vibe.com                            http://www.vibe985.com
www.accuweather.com                    http://www.accuweather.com
weather                                http://asp.usatoday.com
college savings plan                   http://www.collegesavings.org
pennsylvania college savings plan      http://www.patreasury.org
pennsylvania college savings plan      http://swz.salary.com

We can use these query logs to build a model of the page outside of its content, which we refer to as page models. The example below comes from a truncated version of the page model that we have at the moment for one particular CNN article on Tesla’s Cybertruck launch. The scores associated with the query are computed as a function of its frequency (i.e. the number of times the query/URL pair was seen in our logs) and its recency (i.e. recently generated query logs are a better predictor for relevance).

{
  "queries": [
    [
      "tesla cybertruck",
      0.5111737168808949
    ],
    [
      "tesla truck",
      0.4108341455983614
    ],
    [
      "new tesla car",
      0.022784473090294764
    ],
    ...
    ...
    ...
    [
      "pick up tesla",
      0.020538972510725183
    ],
    [
      "new tesla truck",
      0.019462471432017632
    ],
    [
      "cyber truck tesla price",
      0.006587470155023614
    ],
    [
      "how much is the cybertruck",
      0.003764268660013494
    ],
    ...
    ...
    ...
    [
      "cybertruck unveiling",
      0.0016181605575564585
    ],
    [
      "new tesla cybertruck",
      0.0016181605575564585
    ],
    [
      "cyber truck unveiling",
      0.0016181605575564585
    ]
  ]
}

We have hundreds of queries on the page, but even this small sample should provide you with an intuition on how a page model helps us summarize and understand the contents of the page. Even without the actual page text, the page model suggests that the article is about a new Tesla car called the Cybertruck; it details an unveiling event and it contains potential pricing information.
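
The exact scoring function is not published; the sketch below only illustrates the general idea of weighting a query/URL pair by its frequency and damping it by recency. The logarithmic damping and the 30-day half-life are assumptions made for this example, not Cliqz’s real parameters.

import math
import time

def query_score(frequency, last_seen_ts, half_life_days=30.0, now=None):
    """Toy score for a query/URL pair: frequency damped by exponential recency decay."""
    now = time.time() if now is None else now
    age_days = max(0.0, (now - last_seen_ts) / 86400.0)
    recency = 0.5 ** (age_days / half_life_days)  # 1.0 if seen just now, 0.5 after one half-life
    return math.log1p(frequency) * recency        # log damping so very frequent pairs don't dominate

# Example: a query/URL pair seen 120 times, most recently three days ago.
print(query_score(frequency=120, last_seen_ts=time.time() - 3 * 86400))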

The more unique queries we can gather for a page, the better our model of the page will be. Our use of Human Web also enables us to collect anonymous statistics on the page, a part of which is shown below. This structure shows the popularity of the page in different countries at this moment in time, which is used as a popularity signal. We can see that it is very popular in the UK, less so in Australia, etc.

"counters": {
  "ar": 0.003380009657170449,
  "at": 0.016900048285852245,
  "au": 0.11492032834379527,
  "be": 0.02704007725736359,
  ...
  ...
  ...
  "br": 0.012071463061323033,
  "cn": 0.0014485755673587638,
  "cz": 0.008691453404152583,
  "de": 0.06422018348623854,
  "dk": 0.028971511347175277,
  "fr": 0.025108643167551906,
  ...
  ...
  ...
  "gb": 0.3355866731047803,
  "it": 0.00772573635924674,
  "jp": 0.005311443746982134,
  "ru": 0.0159343312409464,
  "se": 0.0294543698696282,
  "ua": 0.012071463061323033
  ...
}

Now that we understand how page models are generated, we can start stepping through the search process. We can break down this process into different stages, as described below.

The TL;DR Version

This is a high level overview if you want to know how Cliqz search is different.

  1. Our model of a web page is based on queries only. These queries could either be observed in the query logs or could be synthetic, i.e. we generate them. In other words, during the recall phase, we do not try to match query words directly with the content of the page. This is a crucial differentiating factor – it is the reason we are able to build a search engine with dramatically fewer resources than our competitors.
  2. Given a query, we first look for similar queries using a multitude of keyword and word vector based matching techniques.
  3. We pick the most similar queries and fetch the pages associated with them.
  4. At this point, we start considering the content of the page. We utilize it for feature extraction during ranking, filtering and dynamic snippet generation.

1. The Query Correction Stage

When the user enters a query into our search box, we first perform some potential corrections on it. This involves not only some normalization, but also expansions and spell corrections, if necessary. This is handled by a service called suggest – we will have a post detailing its inner workings soon. For now, we can assume that the service provides us with a list of possible alternatives to the user query.
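
As a rough illustration of what such a service might do, here is a hedged Python sketch that combines normalization with a naive spell correction against a toy vocabulary. The vocabulary, the similarity cutoff and the correction rule are invented for this example; the real suggest service is certainly more sophisticated.

import difflib
import unicodedata

VOCABULARY = {"tesla", "cybertruck", "truck", "price", "avengers", "endgame"}  # toy vocabulary

def normalize(query):
    # Lowercase, strip accents and collapse whitespace.
    query = unicodedata.normalize("NFKD", query).encode("ascii", "ignore").decode()
    return " ".join(query.lower().split())

def alternatives(query, max_alts=3):
    """Return the normalized query plus a simple spell-corrected variant."""
    norm = normalize(query)
    corrected = []
    for word in norm.split():
        if word in VOCABULARY:
            corrected.append(word)
        else:
            # Closest vocabulary word, if any, as a naive spell correction.
            match = difflib.get_close_matches(word, VOCABULARY, n=1, cutoff=0.8)
            corrected.append(match[0] if match else word)
    candidates = [norm, " ".join(corrected)]
    return list(dict.fromkeys(candidates))[:max_alts]  # de-duplicate while keeping order

print(alternatives("  Tesla  cybertrukc price "))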

2. The Recall and Precision Stages

We can now start building the core of the search engine. Our index contains billions of pages; the job of the search is to find the N (usually around 50) most relevant pages for a given query. We can broadly split this problem into two parts: the recall and the precision stages.

The recall stage involves narrowing down the billions of pages to a much smaller set of, say, five hundred candidate pages, while trying to get as many relevant pages as possible into this set. The precision stage performs more intensive checks on these candidate pages to filter the top N pages and decide on the final ordering.

2.1 Recall Stage

A common technique used to perform efficient retrieval in search is to build an inverted index. Rather than building one with the words on the page as keys, we use ngrams of the queries in the page model as keys. This lets us build a much smaller and less noisy index.

We can perform various types of matching on this index:

  • Word based exact match: We look for the exact user query in our index and retrieve the linked pages.
  • Word based partial match: We split the user query into ngrams and retrieve the linked pages for each of these.
  • Synonym and stemming based partial match: We stem the words in the user query and retrieve the linked pages for each of its ngrams. We could also replace words in the query with their synonyms and repeat the operation. It should be clear that this approach, if not used with caution, could quickly result in an explosion of candidate pages.

This approach works well in a lot of cases, e.g. when we want to match model numbers, names, codes or rare words: essentially whenever a query token must be present, which is not possible to know beforehand. But it can also introduce a lot of noise, as shown below, because we lack a semantic understanding of the query.

query:                 soldier of fortune game
document 1:            ps2 game soldier fortune
document 2:            soldier of fortune games
document 3 (Noise):    fortune games
document 4 (Noise):    soldier games
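
A minimal Python sketch of this keyword-based recall path follows: an inverted index keyed by ngrams of page-model queries, queried with exact and partial matching. The page models are toy data, and the sketch deliberately reproduces the kind of noise shown above.

from collections import defaultdict

def ngrams(text, max_n=3):
    words = text.lower().split()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])

# Page models: URL -> queries observed (or generated) for that page. Toy data only.
page_models = {
    "https://example.org/sof-ps2": ["ps2 game soldier fortune"],
    "https://example.org/sof": ["soldier of fortune games"],
    "https://example.org/fortune": ["fortune games"],
}

# Inverted index keyed by ngrams of the page-model queries, not by the words on the page.
index = defaultdict(set)
for url, queries in page_models.items():
    for q in queries:
        for gram in ngrams(q):
            index[gram].add(url)

def recall_keyword(user_query):
    """Exact match on the full query, then partial match on each of its ngrams."""
    candidates = set(index.get(user_query.lower(), set()))
    for gram in ngrams(user_query):
        candidates |= index.get(gram, set())
    return candidates

print(recall_keyword("soldier of fortune game"))  # also returns the noisy 'fortune' page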

An alternative approach to recall is to map these queries to a vector space and match in this higher dimensional space. Each query gets represented by a K dimensional vector, 200 in our case, and we find the nearest neighbors to the user query in this vector space.

This approach has the advantage that it can match semantically. But it can also introduce noise through overly aggressive semantic matching, as illustrated below. The technique can also be unreliable when the query contains rare words, like model numbers or names, as their vector space neighbors could be random.

query:                 soldier of fortune game
document 1:            soldier of fortune android
document 2:            sof play
document 3:            soldier of fortune playstation
document 4 (Noise):    defence of wealth game

We train these query vectors based on custom word embeddings learnt from our data. These are trained on billions of <good query, bad query> pairs collected from our data and use the Byte-Pair-Encoding implementation of SentencePiece[7] to address the issue of words missing from our vocabulary.

We then build an index with billions of these query vectors using Granne[8], our in-house solution for memory efficient vector matching. We will have a post on the internals of Granne soon on this blog, you should definitely look out for it if the topic is of interest.
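
Granne itself is the subject of its own post; as a stand-in to illustrate the idea, here is a brute-force cosine-similarity nearest-neighbor sketch over hypothetical, pre-computed query vectors using NumPy. At billions of vectors a linear scan is obviously not viable, which is exactly the problem an approximate nearest-neighbor index like Granne solves.

import numpy as np

DIM = 200
rng = np.random.default_rng(0)

# Hypothetical pre-computed query embeddings: query text -> 200-dimensional vector.
# In production these come from the custom embedding model; here they are random.
known_queries = ["soldier of fortune android", "sof play",
                 "soldier of fortune playstation", "defence of wealth game"]
vectors = rng.normal(size=(len(known_queries), DIM)).astype(np.float32)
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)  # unit-normalize once

def nearest_queries(query_vector, k=2):
    """Brute-force cosine similarity; an ANN index (e.g. Granne) replaces this at scale."""
    q = query_vector / np.linalg.norm(query_vector)
    scores = vectors @ q                    # cosine similarity against every known query
    top = np.argsort(-scores)[:k]
    return [(known_queries[i], float(scores[i])) for i in top]

# Stand-in: reuse one of the stored vectors as the "user query" vector.
print(nearest_queries(vectors[0]))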

We also generate synthetic queries out of titles, descriptions, words in the URL and the actual content of the page. These are, by definition, noisier than the query logs captured by Human Web. But they are needed; otherwise, newer pages with fresh content would not be retrievable.

Which model do we prefer? The need for semantics is highly dependent on the context of the query which is unfortunately difficult to know a priori. Consider this example:

Scenario 1:   "how tall are people in stockholm?"   →   "how tall are people in sweden?"
Scenario 2:   "restaurants in stockholm"            →   "restaurants in sweden"

As you can see, the same semantic matching could result in good or bad matches depending on the query. The semantic matching in Scenario 1 helps us retrieve good results, but the matching in Scenario 2 could return irrelevant results. Both keyword and vector based models have their strengths and weaknesses. Combining them in an ensemble, together with multiple variations of those models, gives us much better results than any particular model in isolation. As one would expect, there is no silver bullet model or algorithm that does the trick.

The recall stage combines the results from both the keyword and vector-based indices. It then scores them with some basic heuristics to narrow down our set of candidate pages. Given the strict latency requirements, the recall stage is designed to be as quick as possible while providing acceptable recall numbers.
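
The combination heuristics are not spelled out in the post; the sketch below merely shows one plausible way to merge the keyword and vector candidate sets under a fixed budget. The weights are arbitrary placeholders, not Cliqz’s actual values.

def combine_candidates(keyword_hits, vector_hits, budget=500, w_kw=0.6, w_vec=0.4):
    """Merge keyword and vector candidates with an arbitrary weighted score.

    keyword_hits / vector_hits map URL -> match score in [0, 1].
    """
    scores = {}
    for url, s in keyword_hits.items():
        scores[url] = scores.get(url, 0.0) + w_kw * s
    for url, s in vector_hits.items():
        scores[url] = scores.get(url, 0.0) + w_vec * s
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [url for url, _ in ranked[:budget]]

# Toy example: a URL found by both indices outranks single-source hits.
print(combine_candidates({"a": 0.9, "b": 0.7}, {"a": 0.8, "c": 0.9}, budget=3))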

2.2 Precision Stage

The top pages from the recall stage enter the precision stage. Now that we are dealing with a smaller set of pages, we can subject them to additional scrutiny. Though the earlier versions of Cliqz search used a heuristic driven approach for this, we now use gradient boosted decision trees[9] trained on hundreds of heuristic and machine learned features. These are extracted from the query itself, the content of the page and the features provided by Human Web. The trees are trained on manually rated search results from our search quality team.
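
The post does not say which GBDT implementation is used; as an illustrative stand-in, here is a sketch using scikit-learn’s GradientBoostingRegressor on synthetic feature vectors and relevance labels. In reality the features would be derived from the query, the page content and Human Web, and the labels from the search quality team’s ratings.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)

# Synthetic training data: each row is a (query, page) feature vector
# (e.g. query/content overlap, Human Web popularity, freshness, ...),
# labelled with a relevance rating. Both are random stand-ins here.
X_train = rng.random((1000, 8))
y_train = 3 * X_train[:, 0] + 2 * X_train[:, 3] + rng.normal(0, 0.1, 1000)

model = GradientBoostingRegressor(n_estimators=200, max_depth=3, learning_rate=0.05)
model.fit(X_train, y_train)

def precision_rank(candidates, features, top_n=50):
    """Score candidate pages with the trained GBDT and keep the top N."""
    scores = model.predict(features)
    order = np.argsort(-scores)[:top_n]
    return [candidates[i] for i in order]

candidate_urls = ["https://example.org/page%d" % i for i in range(5)]
print(precision_rank(candidate_urls, rng.random((5, 8)), top_n=3))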

3. Filter Stage

The pages that survive the precision stage are now subject to more checks, which we call filters. Some of these are:

  • Deduplication Filter: This filter improves diversity of results by removing pages that have duplicated or similar content.
  • Language Filter: This filter removes pages which do not match the user’s language or our acceptable languages list.
  • Adult Filter: This filter is used to control the visibility of pages with adult content.
  • Platform Filter: This filter replaces links with platform appropriate ones e.g. mobile users would see the mobile version of the web page, if available.
  • Status Code Filter: This filter removes obsolete pages, i.e. links we cannot resolve anymore.
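
In code, such a filter stage can be as simple as a chain of functions, each taking and returning a list of results. The predicates below are placeholders for illustration, not Cliqz’s actual filter logic.

def dedup_filter(results):
    # Drop results whose content hash was already emitted (placeholder for real similarity checks).
    seen, out = set(), []
    for r in results:
        if r["content_hash"] not in seen:
            seen.add(r["content_hash"])
            out.append(r)
    return out

def language_filter(results, accepted=("en", "de")):
    return [r for r in results if r["lang"] in accepted]

def status_code_filter(results):
    return [r for r in results if r["status"] == 200]

def apply_filters(results, filters):
    for f in filters:
        results = f(results)
    return results

results = [
    {"url": "https://a.example", "content_hash": "h1", "lang": "en", "status": 200},
    {"url": "https://b.example", "content_hash": "h1", "lang": "en", "status": 200},  # duplicate
    {"url": "https://c.example", "content_hash": "h2", "lang": "fr", "status": 404},
]
print(apply_filters(results, [dedup_filter, language_filter, status_code_filter]))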

4. Snippet Generation Stage

Once the result set is finalized, we enhance the links to provide well formatted and query-sensitive snippets. We also extract any relevant structured data the page may contain to enrich its snippet.

The final result set is returned to the user through our API gateway to the various frontend clients.

Maintaining Multiple Mutable Indices

The section on recall presented an extremely basic version of the index for the sake of clarity. But in reality, we have multiple versions of the index running in various configurations using different technologies. We use Keyvi[10], Granne, RocksDB and Cassandra in production to store different parts of our index based on their mutability, latency and compression constraints.

The total size of our index currently is around 50 TB. If you could find a server with the required disk space along with sufficient RAM, it is possible to run our search on localhost, completely disconnected from the internet. It doesn’t get more independent than that.

Search Quality

Search quality measurement plays an integral part in how we test and build our search. Apart from automated sanity checking of results against our competitors, we also have a dedicated team working on manually rating our results. Over the years, we have collected the ratings for millions of results. These ratings are used not only to test the system but to help train the ranking algorithms we mentioned before.

Fetcher

The query logs we collect from Human Web are unfortunately insufficient to build a good quality search. The actual content of the page is not only necessary to perform better ranking, it is required to be able to provide a richer user experience. Enriching the result with titles, snippets, geolocation and images helps the user make an informed decision about visiting a page.

It may seem like Common Crawl would suffice for this purpose, but it has poor coverage outside of the US and its update frequency is not realistic for use in a search engine.

While we do not crawl the web in the traditional sense, we maintain a distributed fetching infrastructure spread across multiple countries. Apart from fetching the pages in our index at periodic intervals, it is designed to respect politeness constraints and robots.txt while also dealing with blocklists and crawler unfriendly websites.
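
As a small illustration of the politeness checks involved, here is a sketch using Python’s standard robotparser together with a per-domain rate limit. The five-second delay is an assumed value, and a real fetching infrastructure handles far more: retries, robots.txt caching with expiry, distributed coordination across countries, and so on.

import time
from urllib import robotparser
from urllib.parse import urlparse

USER_AGENT = "cliqzbot"   # the fetcher name mentioned in this post
MIN_DELAY = 5.0           # assumed per-domain politeness delay, in seconds
_last_fetch = {}          # domain -> timestamp of the last fetch
_robots_cache = {}        # domain -> parsed robots.txt

def allowed_and_polite(url):
    """Return True if robots.txt allows the fetch and the per-domain delay has elapsed."""
    domain = urlparse(url).netloc
    rp = _robots_cache.get(domain)
    if rp is None:
        rp = robotparser.RobotFileParser()
        rp.set_url("https://%s/robots.txt" % domain)
        try:
            rp.read()                      # network call; may fail for unreachable hosts
        except OSError:
            return False
        _robots_cache[domain] = rp
    if not rp.can_fetch(USER_AGENT, url):
        return False
    if time.time() - _last_fetch.get(domain, 0.0) < MIN_DELAY:
        return False
    _last_fetch[domain] = time.time()
    return True

print(allowed_and_polite("https://example.com/some/page"))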

We still have issues getting our fetcher cliqzbot whitelisted on some very popular domains, like Facebook, Instagram, LinkedIn, GitHub and Bloomberg. If you can help in any way, please do reach out at beta[at]cliqz[dot]com. You’d be improving our search a lot!

Tech Stack

  • We maintain a hybrid deployment of services implemented primarily in Python, Rust, C and Golang.
  • Keyvi is our main index store. We built it to be space efficient and fast, while also providing various approximate matching capabilities through its FST data structure.
  • The mutable part of our indices are maintained on Cassandra and RocksDB.
  • We built Granne to handle our vector based matching needs. It is a lot more memory efficient than other solutions we could find – you can read more about it in tomorrow’s blog post.
  • We use qpick[11] for our keyword matching requirements. It is built to be both memory and space efficient, while being able to scale to billions of queries.

Granne and qpick have been open-sourced under MIT and GPL-2.0 licenses respectively, do check them out!

It is hard to summarize the infrastructural challenges of running a search engine in a small section of this post. We will soon have dedicated blog posts detailing the Kubernetes infrastructure that enables all of the above, so do check them out!

While some of our early decisions allowed us to drastically reduce the effort required in setting up a web scale search engine, it should be clear by now that there are still a lot of moving parts which must work in unison to enable the final experience. The devil is in the details for each of these steps – topics like language detection, adult content filtering, handling redirects or multilingual word embeddings are challenging at web scale. We will have posts on more of the internals soon, but we would love to hear your thoughts on our approach. Feel free to share and discuss this post, we will try our best to answer your questions!

References


  1. Alphabet Q3 2019 earnings release – report

  2. Statista search engine market share 2019 – link

  3. Web Crawler – Wikipedia – link

  4. Spamdexing – Wikipedia – link

  5. Deep Web – Wikipedia – link

  6. Web Search Query Log Downloads – link

  7. SentencePiece – GitHub – link

  8. Granne – GitHub – link

  9. Gradient Boosting – Wikipedia – link

  10. Keyvi – GitHub – link

  11. qpick – GitHub – link

here’s-what-to-expect-in-paid-search,-email-marketing,-spending-and-more-during-the-bfcm-stretch

Thanksgiving Day is upon us, and with it, the busiest shopping week of the year. The five-day stretch between Thanksgiving Day and Cyber Monday poses a unique challenge this year for retailers: 2019’s “Cyber 5” will kick off a shorter-than-usual holiday shopping season with six fewer days between Thanksgiving and Christmas Day.

Fortunately for e-commerce brands and online retailers, the shorter season is not likely to impact how much consumers plan to spend during the next month. eMarketer reports 2019 will be the first year holiday revenue will surpass $1 trillion, a 3.8% jump over last year.

So what can online retailers expect over the next five days? We’ve assembled the following rundown of this year’s holiday shopping forecasts so you know what’s coming.

Cyber Monday to see record sales

Adobe reports this year’s holiday season will bring in $143 billion in online revenue, with $30 billion coming in between Thanksgiving and Cyber Monday. Both Adobe and the National Retail Federation (NRF) predict online holiday sales could see as much as a 14% lift over last year’s numbers.

The NRF expects 70 million consumers will shop online during Cyber Monday. At the same time, Adobe predicts those shoppers will spend $9.4 billion that day, nearly a 20% increase over last year’s Cyber Monday sales.

“Cyber Monday is once again expected to be the biggest online shopping day in US history, with a total that could approach — or even surpass — $10 billion,” reports eMarketer.

Consumer spending habits

Shoppers are expected to spend $892 on their holiday shopping, according to a holiday shopping survey of 2,000 consumers by The Harris Poll and OpenX. Half of these consumers already started their holiday gift buying back in September, with a majority of dollars spent during the holidays happening in online sales (desktop, mobile or tablet). Deloitte’s survey of 4,400 U.S. adults found the same — that most spending (59%) would happen online — but, the participants in Deloitte’s survey reported a much higher average holiday spend: $1,500 with $879 happening online.

Adobe predicts 47% of online holiday revenue will happen via a smartphone, with U.S. consumers spending $14 billion more on their mobile device this holiday season compared to last year.

What’s happening on Amazon?

A recent Episerver survey polling more than 4,500 shoppers across the U.S., UK and other countries revealed 42% would be buying most or all of their holiday gifts on Amazon.

Amazon also pulls more shoppers at the start of their shopping journey during the holidays — according to the survey, 32% of online shoppers who aren’t sure what they want to buy will start their gift search on Amazon compared to 18% who turn to Google. Amazon wins an even larger share when there’s a specific product being searched — for example, when searching for an apparel item, 43% of the shoppers surveyed by Episerver turn to Amazon versus 29% who turn to Google.

Amazon gave shoppers an early look at the discounts it will be offering between November 30 and December 2, with more than 100 deals spanning across various categories — from Amazon devices and Amazon brands to fashion, electronics and toy products.

Google Shopping ads will be key, even if growth is slow

Marketing Land contributor Andy Taylor says retailers can expect Google Shopping to play a key role during the holiday season this year, but is on the fence about whether or not the platform will see the growth it experienced during the last quarter of 2018.

“It’s unclear if Google Shopping has another big push like the one we saw last Q4 in it, or if Google has more or less used up its powder with respect to expanding these ad units to the extent observed at the end of 2018,” writes Taylor, who serves as the director of research for the search marketing firm Tinuiti, “As such, advertisers shouldn’t be shocked if Shopping growth is slower during the holidays this year than last year.”

(Don’t miss the digital commerce marketing track at SMX West 2020!)

Taylor also noted Amazon’s appearance in Google Shopping ads, which has grown over the last year. In October, 2018, Amazon was barely visible in Shopping results for apparel, but this year tells a different story. “Amazon’s impression share is now more than double what apparel retailers saw last December and has held steady for the last three months,” reports Taylor.

Email marketing: “Free shipping” and “% off” discounts win

Last year’s holiday email marketing results were less than inspiring. After sending more holiday-themed campaigns than any previous year during the fourth quarter of 2018, brands saw average open rates at 10.5% and click rates at 1%, according to Yes Marketing’s 2018 holiday report. The holiday-themed email performance rates scored lower than non-themed emails sent during the same time period, which delivered a 12.6% open rate and 1.1% click rate.

For brands wanting to offer discounts via their email messaging, Yes Marketing’s Senior Director of Client Service Kyle Henderick said “% off” promotions performed best last year, delivering more than double the conversion rates of a business-as-usual message. He also says “Free Shipping” offers are an effective way to stand out during the Cyber Monday email surge: “When subscribers are sorting through hundreds of Cyber Monday emails, ensuring free shipping is prominent is almost a sure-fire way to capture their attention.”




About The Author

Amy Gesenhues is a senior editor for Third Door Media, covering the latest news and updates for Marketing Land, Search Engine Land and MarTech Today. From 2009 to 2012, she was an award-winning syndicated columnist for a number of daily newspapers from New York to Texas. With more than ten years of marketing management experience, she has contributed to a variety of traditional and online publications, including MarketingProfs, SoftwareCEO, and Sales and Marketing Management Magazine. Read more of Amy’s articles.



paid-search-trends-to-watch-for-the-2019-holiday-shopping-season

The holidays are here! That means search marketers everywhere are putting the finishing touches on strategies to make the most out of the next few weeks, the most important stretch of sales for many businesses.

However, don’t go into the holiday shopping season without reading up on these key trends which might help wrap up your strategy a little bit tighter.

Google Shopping will likely be the star of retail

It should be absolutely no surprise to retailers that Google Shopping is incredibly important to paid search success, and I’ve written about its rise many times over the years. This continues to be the case today, as Google Shopping accounted for 48% of all Google search spend in Q3 2019 for Tinuiti (my employer) retail advertisers. Advertisers should once again prepare for Shopping to play a key role during the winter holidays this year.

However, this Q4 and the months that follow will be an important time for both Google and advertisers in determining just how long Google Shopping can continue its torrid pace of growth, as we’re beginning to lap some shifts that happened at the end of last year that significantly drove up Google Shopping traffic.

As you can see from the chart below, Google Shopping click growth jumped from 41% last Q3 to 49% in Q4, and while growth has remained strong since, there has been a steady deceleration.

As has long been the case, phones in particular are driving much of the Google Shopping growth, and in Q3 2019 clicks grew 36% on phones compared to 27% overall.

The leap last Q4 coincided with an explosion in Google Shopping impressions, as Google seemed to prioritize Google Shopping over text ads. The impression growth was most pronounced on phones, where impressions increased 127% Y/Y in Q4 compared to 81% in Q3.

Some of this increase can certainly be attributed to newer, growing Shopping variations such as Showcase Shopping Ads, which produce advertiser-specific listings for more general searches.

The queries that trigger these ads tend to be about 20% shorter in terms of character count than the queries triggering traditional Google Shopping listings. While character count is far from a decisive metric for determining how general or focused a search is, it does indicate that Google is now showing Showcase ads for shorter queries that likely include fewer product-specific qualifiers.

However, the impressive Shopping growth that occurred at the end of 2018 wasn’t just a matter of Google finding additional spots to throw Showcase ads, as true traditional Shopping listings also saw an explosion in growth. Taken together, the evidence points to a significant expansion in the share of search queries producing Google Shopping results.

All of this is to say that it’s unclear if Google Shopping has another big push like the one we saw last Q4 in it, or if Google has more or less used up its powder with respect to expanding these ad units to the extent observed at the end of 2018. As such, advertisers shouldn’t be shocked if Shopping growth is slower during the holidays this year than last year.

Nor should we be surprised if Google once again finds a way to push growth back up as it has so many other times. After all, the surge last year was unexpected, and Google’s latest additions of image search and YouTube inventory as well as additional Showcase-eligible product categories may help in a potential rebound.

Regardless, a rather large player you might have heard of stands ready to steal some Shopping clicks from under the tree.

Amazon poised to play Grinch more so than in past years

Much like the importance of Google Shopping, it’s difficult for U.S. retailers to be unaware of the trillion-dollar website in the room – Amazon. Even still, many retailers might be surprised to know just how dominant the e-commerce giant has become in Shopping over the last year.

This is most apparent when looking at Amazon’s Shopping impression share in apparel through Auction Insights reports. As of last October, Amazon was only barely visible in Shopping results against apparel retailers in the U.S., but that has changed rapidly.

Amazon’s impression share is now more than double what apparel retailers saw last December and has held steady for the last three months. In addition to impression share gains over the last year in other categories such as home goods, furniture and electronics, all signs point to Amazon more fully flexing its might in Google Shopping this holiday season.

Of course, given Amazon’s choice to take a couple days off from Shopping during Prime Day, it’s probably unwise for anyone outside of its paid search team to espouse confident opinions on its likely Q4 strategy. But the foundation seems laid for a bigger holiday presence than ever before.

What’s a competitor to do? There’s not much in the way of Amazon-specific advice for competing in Shopping, as competing with Amazon looks a lot like competing with any Shopping advertiser.

Stay on top of the queries triggering ads and funnel traffic effectively using keyword negatives. Keep feeds up to date and out of trouble by responding quickly to any warnings from Google Merchant Center. Take advantage of Shopping variations like Showcase ads and Local Inventory Ads (for brick-and-mortar advertisers) to ensure ads are eligible to show in as many different types of relevant scenarios as possible.

On the last point, Local Inventory Ads (LIA) are a nice differentiator for retailers with physical stores, since Amazon can’t offer the same in-store options. However, Amazon’s impression share is just as strong against LIA campaigns as traditional Shopping for many brands, so don’t think it won’t be lurking for searches with local intent as well.

Speaking of local intent – it’s time for my favorite paid search trend of the year.

Searchers turn to Maps for the Turbo Man dash

When it’s down to the Christmas wire and shipping cutoffs have left the prospect of getting a gift delivered in time shaky, many shoppers are forced to physical stores to make sure Jamie gets the right action figure.

This is readily apparent when looking at the share of Google text ad clicks which are attributed to the “Get location details” (GLD) click type, which comes predominantly from Google Maps according to Google. The chart below shows daily share for one national apparel retailer from last holiday season, for which GLD clicks accounted for 14% of all text ad traffic on 12/23 – the biggest daily share observed between November and December. A close second was Christmas Eve, with 13%.

These figures can vary significantly by advertiser, but the general trend of GLD clicks spiking in the lead up to Christmas relative to other days of the year is very common among brands with a brick-and-mortar presence.

In terms of accounting for this, advertisers often look to last year’s results to determine whether they overspent or underinvested on particular days. If a brick-and-mortar brand assesses the value of traffic in the last days before Christmas using only the online conversions attributed to ads, the picture may understate that traffic’s true value, given the huge offline intent on those days. This is true throughout the year for brands with physical stores, but it becomes more glaring in situations like last-minute holiday shopping.

Given the way the calendar falls this year, last-minute shopping is likely to be hugely important.

Shortest holiday season since 2013 will make for a time crunch

The period between Thanksgiving and Christmas will be a full six days shorter this year than in 2018, and we haven’t had a Thanksgiving occur this late into November since 2013. As such, the race will be on for both consumers and brands alike.

History offers us a helpful test of the effects of a shorter holiday shopping period in the form of a 1939 decision by FDR to move the Thanksgiving holiday one week earlier at the request of retailers who hoped to drive more revenue from the holiday season. 23 states immediately adopted the new date (the second-to-last Thursday of November), while 23 others stuck to the traditional last Thursday of November. Two states chose to celebrate both.

After the holiday season, businesses reported that total consumer spending was similar across states that adopted the earlier date and those that stuck with the later date, indicating a longer period between the two holidays didn’t produce more spending. However, the distribution of sales revenue throughout the holiday season was different between the two, with the bulk of holiday shopping occurring in the last week before Christmas for states with the later date compared to evenly distributed throughout the holiday season for those celebrating the earlier date.

Using this as an indicator for how shopping might shake out this year (though there may have been one or two major developments in retail since 1939…), the shorter holiday season shouldn’t in and of itself reduce holiday-related sales for retailers. However, the last week ahead of Christmas might be especially important this year.

Most importantly, the U.S. settled on the fourth Thursday of November as Thanksgiving Day once and for all in 1941, meaning marketers will only have to deal with one Black Friday and Cyber Monday. And for that, I am thankful. Have a Happy Thanksgiving everyone.



Opinions expressed in this article are those of the guest author and not necessarily Marketing Land.



About The Author

Andy Taylor is director of research at Tinuiti, responsible for analyzing trends across the digital marketing spectrum for best practices and industry commentary. A seasoned marketer with 9-plus years of experience, he speaks frequently at industry conferences and events.



google-search-results-have-more-human-help-than-you-think,-report-finds


Google is sometimes hands-on under the hood, and investigators want to know more.


[Image: Exterior view of a Googleplex building in Mountain View, Calif., the corporate headquarters of Google and parent company Alphabet.]

Google, and its parent company Alphabet, has its metaphorical fingers in a hundred different lucrative pies. To untold millions of users, though, “to Google” something has become a synonym for “search,” the company’s original business—a business that is now under investigation as more details about its inner workings come to light.

A coalition of attorneys general investigating Google’s practices is expanding its probe to include the company’s search business, CNBC reports, citing people familiar with the matter.

Attorneys general for almost every state teamed up in September to launch a joint antitrust probe into Google. The investigation is being led by Texas Attorney General Ken Paxton, who said last month that the probe would first focus on the company’s advertising business, which continues to dominate the online advertising sector.

Paxton said at the time, however, that he’d willingly take the investigation in new directions if circumstances called for it, telling the Washington Post, “If we end up learning things that lead us in other directions, we’ll certainly bring those back to the states and talk about whether we expand into other areas.”

Why search?

Google’s decades-long dominance in the search market may not be quite as organic as the company has suggested, according to The Wall Street Journal, which published a lengthy report today delving into the way Google’s black-box search process actually works.

Google’s increasingly hands-on approach to search results, which has taken a sharp upturn since 2016, “marks a shift from its founding philosophy of ‘organizing the world’s information’ to one that is far more active in deciding how that information should appear,” the WSJ writes.

Some of that manipulation comes from very human hands, sources told the paper in more than 100 interviews. Employees and contractors have “evaluated” search results for effectiveness and quality, among other factors, and promoted certain results to the top of the virtual heap as a result.

One former contractor the WSJ spoke with described down-voting any search results that read like a “how-to manual” for queries relating to suicide until the National Suicide Prevention Lifeline came up as the top result. According to the contractor, Google soon after put out a message to the contracting firm that the Lifeline should be marked as the top result for all searches relating to suicide so that the company algorithms would adjust to consider it the top result.

Or in another instance, sources told the WSJ, employees made a conscious choice for how to handle anti-vax messaging:

One of the first hot-button issues surfaced in 2015, according to people familiar with the matter, when some employees complained that a search for “how do vaccines cause autism” delivered misinformation through sites that oppose vaccinations.

At least one employee defended the result, writing that Google should “let the algorithms decide” what shows up, according to one person familiar with the matter. Instead, the people said, Google made a change so that the first result is a site called howdovaccinescauseautism.com—which states on its home page in large black letters, “They f—ing don’t.” (The phrase has become a meme within Google.)

The algorithms governing Google’s auto-complete and suggestion functions are also heavily subject to review, the sources said. Google says publicly it doesn’t allow for predictions related to “harassment, bullying, threats, inappropriate sexualization, or predictions that expose private or sensitive information,” and that policy’s not new. The engineer who created the auto-complete function in 2004 gave an example using Britney Spears, who at the time was making more headlines for her marriages than for her music.

The engineer “didn’t want a piece of human anatomy or the description of a sex act to appear when someone started typing the singer’s name,” as the paper describes it. The unfiltered search results were “kind of horrible,” he added.

The company has since maintained an internal blacklist of terms that are not allowed to appear in autocomplete, organic search, or Google News, the sources told the WSJ, even though company leadership has said publicly, including to Congress, that the company does not use blacklists or whitelists to influence its results.

The modern blacklist reportedly includes not only spam sites, which get de-indexed from search, but also the type of misinformation sites that are endemic to Facebook (or, for that matter, Google’s own YouTube).

Why antitrust?

Google relying on human intervention and endless tweaks to its algorithms, as the WSJ describes, isn’t an antitrust violation. When it uses its trove of data from one operation to make choices that may harm competitors to its other operations, though, that can draw attention.

All that human intervention and algorithmic tweaking also affects advertising and business results, according to the WSJ. Those tweaks “favor big businesses over smaller ones,” the paper writes, “contrary to [Google’s] public position that it never takes that type of action.”

The largest advertisers, including eBay, have received “direct advice” on how to improve their search results after seeing traffic from organic search drop, sources told the paper. Smaller businesses, however, have not been so lucky, being left instead to try to figure out the systems either bringing them traffic or denying them traffic on their own.

Links to Google’s own features and properties also take up an increasingly large percentage of the search results page, the WSJ notes. For example, if you search for one of today’s chart-toppers, such as Beyoncé, you’re greeted with three large Google modules that take up more than half the screen real estate:

[Image: Most of the results on the page are Google modules (highlighted in red).]

More than half of Google searches are now reportedly “no-click” searches, where individuals look only at the page of results and use the snippets on it rather than clicking through to any of the sources from which Google is drawing that information. That kind of use of data, among others, could be considered harmful to competition, since the company is using data collected from competitors to keep users from going to those competitors.

Google, for its part, disputed the WSJ’s findings throughout, telling the paper, “We do today what we have done all along, provide relevant results from the most reliable sources available.”

search-marketing-expo-is-next-week.-will-i-see-you-there?

Did you know? SMX East is happening next week — November 13-14 in New York City — and there’s still time to buy your ticket.

This year’s agenda is the biggest one the Search Engine Land experts have ever created: 100 search marketing sessions covering SEO, SEM, CRO, agency operations (new!), local search for multi-location brands (new!), analytics, mobile, video, content, tools, and more.

You’ll also access interactive Q&A clinics, full-day training with leading brands including Google and Microsoft Ads, 30 market-defining vendors, exclusive networking events, delicious meals, free WiFi, the SMX mobile app, and downloadable speaker presentations.

The year is almost over… do yourself and your career a favor: Attend SMX East for the expert-led training and actionable tactics you need to generate awareness, drive traffic, and boost conversions in 2020 and beyond.

Ready to register? Smart move! If you book before November 13, you’ll enjoy up to $300 off on-site rates.

See you in NYC 🙂

Psst… Focused on meeting vendors and growing your network? Attend SMX with a free Expo pass to unlock the entire Expo Hall, sponsored sessions, training with Microsoft and Google, Q&A clinics, evening networking events, refreshments, WiFi, the mobile app, downloadable speaker presentations, and more. Grab your Expo pass now!



About The Author

Lauren Donovan has worked in online marketing since 2006, specializing in content generation, organic social media, community management, real-time journalism, and holistic social befriending. She currently serves as the Content Marketing Manager at Third Door Media, parent company to Search Engine Land, Marketing Land, MarTech Today, SMX, and The MarTech Conference.



you’re-about-to-miss-search-marketing-expo

Join us in two weeks at Search Marketing Expo — SMX East — November 13-14 in NYC, for 100 tactic-rich search marketing sessions, two new tracks devoted to agency operations and local search marketing for multi-location brands, empowering keynotes with Rand Fishkin, Google, and Microsoft Advertising, intimate training with industry experts, and invaluable networking that plugs you into a thriving community of engaging marketers.

Here’s a look at what’s in store…

Pick the pass that suits your goals and budget:

  • All Access: The complete SMX experience — all sessions, keynotes, clinics, networking events, and amenities. Book before November 13 and save $150 off on-site rates.
  • All Access Workshop (best value!): Maximize your learning with a full-day pre-conference workshop. Book before November 13 and save $300 off on-site rates.
  • Expo: Interested in growing your network and meeting vendors? This free pass gets you the entire Expo Hall, sponsored sessions, Q&A clinics, refreshments, select networking, WiFi, downloadable speaker presentations, the mobile app, and more. Book now and it’s free! (On-site rates cost $99.)

We guarantee you will walk away with at least one actionable tactic that can help bring your search marketing campaigns (and your career) to a new level of success.

Register now and I’ll see you in two weeks!



About The Author

Lauren Donovan has worked in online marketing since 2006, specializing in content generation, organic social media, community management, real-time journalism, and holistic social befriending. She currently serves as the Content Marketing Manager at Third Door Media, parent company to Search Engine Land, Marketing Land, MarTech Today, SMX, and The MarTech Conference.