Indexed, not submitted in sitemap

One of our large web properties saw a huge drop in Google rankings on December 17th, 2019. Three weeks later, on January 8th, 2020, we had another drop, this time down to zero. All of our 2640 formerly indexed pages simply vanished. Everything on this property was white hat, which means we strictly follow Google’s guidelines for webmasters and content managers. So what happened? Why did we suddenly stop appearing in the Google search result pages? We have no clue, and Google remains quiet.

2021 update

Our website is back and growing. It’s not yet at its peak traffic, but it looks really good. Below we explain what we did; after that we simply waited, for more than a year.

Other people seem to have had good results with trickier actions: migrating the website to a new domain with 301 redirects, and repeating this if it doesn’t work. That seems OK for a less valuable property, but in our case waiting has proven to be the sensible (non-)action.

2020 update

We have learned something new about this. Google started reporting that it changed something on December 15th, 2019, two days before our sites were dropped from Google’s index. Google now says something really cryptic:

The Index Coverage report can now more accurately report on indexed pages. Because of this, some pages that were Crawled – currently not indexed are now known to be indexed. As a result, you may see a transfer of pages from Excluded to Valid state. This does not reflect any changes in your site, but a more accurate accounting system.

We think this “more accurate accounting system” may have something to do with our website disappearing. Anyone with more information on this, please get in touch with us via the chat group or via mail (see further below).

Action points

  • Related to this, one of our conclusions is that something may have gone wrong in how Google factors backlinks into rankings. The problem is therefore off-page, not on-page. All issues reported below, such as the sitemap message, are symptoms of this “penalty” that we and so many others have received.
  • One action point we are advised a lot is to review all those backlinks and to disavow questionable links pointing to our website. Another action point we hear is “wait”. The question for everyone, however, is: how long should you wait?
  • In the meantime we hear from a lot of people that they created new domains and migrated their content. When this is done without a 301 redirect, it goes fine. But truly, how can this be a solution?
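
For readers who do go down the disavow route mentioned in the list above, here is a minimal Python sketch of how such a file could be assembled. It assumes the plain-text format that Google’s disavow tool accepts (one URL or one "domain:" entry per line, with "#" starting a comment). The function name build_disavow and the example domains are hypothetical, and deciding which links are actually harmful remains a manual judgment call that this sketch does not make for you.

```python
from urllib.parse import urlparse


def build_disavow(urls, whole_domains=()):
    """Return disavow-file text: one entry per line, '#' starts a comment."""
    domains = sorted(set(whole_domains))
    lines = ["# Links we ask Google to ignore (hypothetical example)"]
    # A 'domain:' entry covers every link coming from that domain.
    lines += [f"domain:{d}" for d in domains]
    for url in sorted(set(urls)):
        # Skip single URLs already covered by a whole-domain entry.
        if urlparse(url).hostname not in domains:
            lines.append(url)
    return "\n".join(lines) + "\n"


print(build_disavow(
    ["http://spam.example/page1.html", "http://bad.example/links.html"],
    whole_domains=["bad.example"],
))
```

The resulting text would be saved as a .txt file and uploaded through the disavow tool for the affected property. Keeping whole-domain entries separate from single URLs keeps the file short and easy to review later.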

What happened?

For most of 2019 we spent a lot of time and money on building a useful, super fast (mobile PageSpeed score of 100) website in 10 languages (including Swahili), and in less than a month our traffic dropped from 1400 visitors per day to fewer than 40. What is really embarrassing here (for Google) is that this website has really high-quality content, takes user intent as its starting point, and as such was growing nicely and organically.

Timeline of our Google events

At the same time, oddly enough, we also noticed that some of our other, older web properties had gone up in rankings, even though we think our new website is a lot better in terms of quality.

  • November 8th: some O.K. broad and niche sites got improved rankings, but it seems that other O.K. sites across the web got hit on this date and saw their traffic plummet.
  • December 17th: our top property got hit from 1400 to <300 visitors a day. Shortly after this we noticed some traffic recovering slowly.
  • January 8th: our top property got hit again, with <10 visitors a day through Google on some days. We got completely de-indexed.

Many other websites that were hit

We discovered that many other people are dealing with the same problem. We found many posts on Twitter, Reddit and various SEO forums, including Google’s own webmaster support forum. There are clear patterns of whole websites suddenly dropping out of Google’s index, all on exactly the same dates. See also this blog post, which explains the issue very well.

Most of the replies to people’s inquiries seem pretty useless, unfortunately. It really seems like nobody knows anything and most people are just guessing about what’s happening. Others try to sell their SEO services, or tell us to hire external SEOs, while we ourselves are the experts who get hired by others.

https://twitter.com/guaka/status/1228075121019170819

We also noticed that Google’s own Webmaster Trends Analyst John Mueller responded on Twitter about this, but he claims there is no cause for concern at all. In fact, he says that this seems “not a temporary glitch”. In other words, we are f*.

Later on he continued to say that “there is no technical problem involved”.

Which type of website has been affected by this?

  • To us, it looks like many of the properties that were hit are completely white hat with at least reasonable content. We’ve seen travel blogs, retail shops, financial sites and general information sites, some with no affiliate links at all and some with a lot of affiliate links.
  • These sites are in English, French, Spanish and Italian, or multilingual. We see small websites and even large websites such as ikea.com/de that are seriously hurt by this, all on the same date. You can see quite a few domains mentioned in this Twitter thread. Others include autoabos.org, schnelles-wissen.de and techninja.nl. We have also seen big drops at medium.com, according to reports at Ahrefs.com.
  • We have been unable to find a clear pattern. Many people are upset that they have done everything white hat and built quality sites, and yet they are hit for no apparent reason.
  • What’s more, there are no explanations from Google. There is hardly any communication, just denial. Basically, nobody has a clue. Only when a big webshop like Ikea rings the bell does the problem get at least some attention.

The case of Ikea is explained in this video conference call with Google employees. Ikea Germany disappeared from almost all search result pages in Germany. Its results have been replaced by those of Ikea Austria, whose pages are also in German. Technically, everything is correct on Ikea’s part, so this clearly shows a Google error. See this blog post as well.

We have encountered many, many examples of this issue in the past week. We have plenty of people in our chat group who have lost many websites from Google’s index. The whole thing seems totally random, really. At the same time, since the 11th of February, we did notice some websites recovering. But nobody knows what triggered it, when it happens or how.

Weird hints at Google Webmaster Console

We did notice some interesting things happening, though, for example in our dashboard in Google Webmaster Console. See our main findings below.

1. Indexed, not submitted in sitemap

The message reads: “Indexed, not submitted in sitemap”. Google appears to be saying that our pages are indexed, but that they are not in our sitemaps. Yet the sitemaps are found and used by Google. There is nothing wrong with our sitemaps, and Google’s own sitemap report confirms it uses them, but it still says “not submitted”. This applies to all of our pages.
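
One way to double-check this kind of claim independently of the console is to compare the URLs flagged in the report against what the sitemap actually lists. The Python sketch below assumes a standard sitemaps.org XML file; the sample sitemap, the flagged URLs and the function names are all hypothetical, and it says nothing about how Google itself matches URLs.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Collect every <loc> value from a sitemap (or sitemap index) document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.iterfind(".//sm:loc", NS) if loc.text}


def not_in_sitemap(flagged_urls, xml_text):
    """Return the flagged URLs that really are absent from the sitemap."""
    listed = sitemap_urls(xml_text)
    return sorted(url for url in flagged_urls if url not in listed)


# Hypothetical sitemap and hypothetical URLs copied from the report.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about/</loc></url>
</urlset>"""

flagged = ["https://example.com/about/", "https://example.com/about"]
print(not_in_sitemap(flagged, SAMPLE))
```

The comparison is deliberately an exact string match, so it surfaces small mismatches such as a missing trailing slash, http versus https, or www versus no-www. If the result is empty for every flagged URL, the sitemap itself is probably not the problem.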

2. Mobile Usability report is useless

Many people have mentioned that the mobile usability report has dropped to 0. See below a screenshot of the mobile pages report that Google gives us. The number of usable mobile pages went down from more than 1,000 pages to 2 pages. You can’t tell us this isn’t a glitch. Obviously it’s a bug on Google’s part.

3. Links section completely useless

Usually you can find your incoming links (which Google discovered while crawling the web) in the webmaster console. Now, however, Google has completely removed us from its index, except for the front page and one random page. Therefore we cannot see at all which inbound links point to our pages. We can only see the links to our front page, maybe the only indexed page we have. It’s as if all of our pages stopped existing for Google!

These hints could also be a result of our website being indexed “mobile first” by Google. This is a secondary index made by Google. All the reporting we receive is based on this mobile-first index. Now that our website has been dropped from this mobile-first index, all the reporting in Google Search Console has been rendered useless.

4. Returning visitors do get to see us

Although our website is not in the public Google index, some visitors do get to see us. Most notably, if I search for the keywords our website ranks for, I do get to see it. If I am logged out of Google, I do not. In Google Search Console we do see these impressions, as well as those from other returning visitors, and most of the time the rank is number 1. The other visitors that seem to find us are from countries that are not explicitly geo-targeted by any of our 10 languages. And sometimes URLs just appear for an hour or so, or maybe even half a day, and soon after they disappear again, often ranking in only one specific long-tail SERP.

One website owner also noticed something similar. See the tweet below.

Possibilities for not being indexed by Google

There are a couple of possible reasons why the website stopped being indexed by Google. There could be technical reasons on our side, there could be technical reasons on Google’s side, or we may have received a so-called algorithmic penalty from Google.

1. Technical difficulties

We looked at many, many things and we don’t see how our website could have technical difficulties, neither from the perspective of website implementation nor from a technical SEO viewpoint. Also, comparing our website to all the other websites we know have been de-indexed, we don’t really see anything there either. The similarities are few. There are websites on Drupal, on WordPress and even in custom HTML. There are websites doing affiliate marketing and websites doing their own sales, in many different languages. The only thing we have noticed so far is that few of these websites are focused on the US or UK; the ones we discovered are mainly European, Asian and some Latin American ones.

2. Algorithmic penalty

The other option that we have to consider is that we may have received an algorithmic penalty related to web spam.

An algorithmic penalty is a type of penalty that you receive without Google telling you. This is different from a manual penalty. With a manual penalty in place, you are notified via Google Search Console about the issue at hand, and there is an action point for you to undertake. With an algorithmic penalty there is nothing. You normally just go down the ladder and find it hard to rank for specific keywords. However, you don’t just disappear completely from Google with all your pages. This is something we have never heard of before.

Especially in this latter case, it is unacceptable that a search engine that monopolizes search doesn’t communicate anything about why it just removes you from the search pages. There is nothing that anyone can know, understand or learn. All we have is guessing and lots of people giving their opinions without really knowing anything. At the very least there should be a clear message with the reason why the de-indexing happened. Instead, we have entered a big black box. This is nothing other than strange site de-indexing with no clue in GSC (Google Search Console).

As such, the only way to find out what’s happening is to publicly ask Mr. John Bananas via Twitter. And then all you can do is hope he gives you “some answer”. But his answers are often just generic, and most of the time nothing official. In many cases, Google does not want to tell you why your site has been de-indexed.

Let’s assume this is a penalty. Then we could conclude some things. First of all, this is apparently Google’s new way of letting you know there is something wrong with your site. Judging by the information from the participants in our chat and from other sources, there could be a myriad of reasons why these penalties happen. It could be thin content, some type of affiliate site, maybe faulty links, who knows. The problem is that there is simply no normal communication from Google, so what can we know?

Continuing on this assumption, the hints mentioned above could be Google’s way of letting you know you have received a penalty and that your site has been de-indexed. It still doesn’t explain though why some sites that were de-indexed have been recovering already, without the webmasters having done anything about the site, onsite or offsite.

The only rational explanation that comes to mind is that Google received specific indications about these websites, and that those indications got resolved rather quickly. But all of this is and remains guessing. There are also other possibilities, such as Google simply having it wrong.

3. Google has got it wrong

It is quite possible that Google is having technical issues. Google Search is a complex application, and errors can always creep into the product. Besides, the product is far from perfect: in recent months quite a lot of new spam websites have appeared on the search result pages. But what would the technical issue be? We have come across at least two explanations.

One possible cause could be so-called domain crowding. This has happened before and could somehow be related to the fact that we see whole websites disappearing from Google Search without any apparent reason.

Another plausible possibility that has been brought up is that Google is penalizing the wrong websites. Some site owners have been told by Google employees that they are involved in “massive linkbuilding”, resulting in a Google punishment. In truth, nothing like this was happening for those websites. This is the case for many websites whose owners don’t engage in anything like link-building campaigns at all. One website owner even says that:

Maybe there is some bug, which makes the Google Algorithm associate backlinks to the wrong website, resulting in an algorithmic penalty.

Or as another webmaster observes:

Yes I am very convinced there is a bug from Google. An algorithm penalty seems not to be the case because I have many white hat sites with the same strategy and only few of them got hit.

Another experienced SEO consultant who deals a lot with penalties tells us:

My clients got penalty’s for real things like pure spam and linking. But the domain I am talking about does not has this issue not even backlinks. The website was ranking without. I resolved 100+ manual actions in my life, I think this is something else.

Going one step further

It seems a stretch, but more and more people are saying that Google is running into difficulties right now. We are seeing algorithm updates or patches multiple times a week, and that frequency is a lot higher than it used to be. People are also complaining about a lot of bad results in the search engine result pages (SERPs). As one user says in the chat group we’ve created about this:

In my opinion, Google screwed up their algorithm badly, and they know it. This is why we have update after update now. They don’t want to admit to this whole mess and they lie and avoid answers. They aren’t stupid they know what is happening. Even huge sites got hit, so this is not some minor thing.

What are we trying?

  1. Until yesterday we were using the Yoast SEO plugin for our sitemaps. Because of Yoast’s intrusive advertising practices we had already planned to toast it. The sitemap problem was a good reason to push that through. Now we’ll set up new sitemaps with The SEO Framework. (Many people seem to have tried this without success.)
  2. Our robots.txt used to Disallow our affiliate links; we dropped this rule.
  3. We still have some JavaScript that makes some (white hat) changes to links; we’re moving this to PHP.
  4. We are considering moving from no-www to www.
  5. In order to check whether Google could consider us spam, we have been reviewing our incoming links for link spam and possible link schemes. We had been merging a couple of our websites, one by one, into this new single website, and this kind of redirecting can prove tricky if Google thinks you are suddenly receiving too many “unnatural” links.
  6. (We’ll update the list above as we try more or come up with new findings.)
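
To verify a robots.txt change like the one in step 2 before deploying it, the rules can be tested offline. The sketch below uses Python’s standard urllib.robotparser; the "/go/" prefix is a hypothetical stand-in for an affiliate-link path, and the standard-library parser is only an approximation of how Googlebot interprets robots.txt (it does not handle wildcards the way Google does, for instance).

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, url, agent="Googlebot"):
    """Return True if robots_txt lets the given agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical rules: '/go/' stands in for an affiliate-link prefix.
BEFORE = "User-agent: *\nDisallow: /go/\n"
AFTER = "User-agent: *\nDisallow:\n"

link = "https://example.com/go/partner"
print(allowed(BEFORE, link), allowed(AFTER, link))
```

With the old rules the affiliate URL is blocked, and with the rule dropped it becomes crawlable. Running the same check against a handful of ordinary content URLs is a cheap way to rule out an accidental Disallow as the cause of a drop.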

For now, we are just hoping that this veil will soon be lifted from our website, so that we can move forward again. We are also still adding new content and adding relevant information to already published posts. But more than anything, we would like Google to be more transparent about why we have been de-indexed.

At this point we really think this is a problem on Google’s side, not ours. There are some obvious glitches and bugs going on in Google Search.

Analysis and outreach

Meanwhile, we hope to get more people informed about the issue, and maybe even get some more folks united in their efforts, so that we can exchange information and learn from each other’s best practices and mistakes.

For this, we’ve set up a Telegram group to join forces. Join us if you want to think along: https://t.me/googledrop17dec. If you just want to contact us, see the form further below.

Our own social outreach so far

We started using #googledrop on Twitter. We encourage you to use this hashtag on Twitter to highlight the problem and get more attention.

Some other places we have been reaching out to

We’ll keep updating this post as we learn. Send us a message using the box below to share your thoughts privately with us. We will also share new information with everyone who leaves their e-mail address.

About the author

Rankshaper is a real-time, on page SEO ranking tool and keyword discovery tool. It helps you identify new ranking opportunities and gain access to new niche markets. Add the search terms your visitors are using to find you and discover new terms and new combinations, while tracking performance.