Google Search Console - Crawled not indexed

macB

New Member
My site is over 12 months old, my sitemap has grown steadily over this period to 1500 URLs but Search Console reports 1000 excluded under the type "Crawled - not currently indexed".

The help says "These are pages that are intentionally not included" and "The pages won't appear in Google but that was probably intentional" errrr NO!

I've googled quite a few threads with different answers from "you can't do anything" to "you just have to wait".

I've tried comparing the excluded URLs against the valid ones and can't see any difference: the excluded pages are not duplicates, they are not new, and they contain as much "rich" content as the ones Google considers valid.

I've even used "Request indexing" on individual URLs. Google does crawl them (you can see this by the date) but still considers them excluded.

Strangely, the Index coverage stats (i.e. not the sitemap) report 11.5K valid and 47.5K excluded; these are URLs that Googlebot has discovered itself!

I'm confused... and stuck, any pointers anyone?
 
Community

Administrator
Staff member
What type of site is it, and what type of content? This happens a lot with ecommerce stores that have repetitive pages, such as category and filtered URLs, which have not been set to noindex or given correct canonicals.

Google also does this if there is a large amount of similar content: duplicate content as above, thin content, or pages with little content competing for the same search terms (this happens a lot with large quantities of products of the same brand).

It's quite hard to say without seeing examples of the URLs and the content.
 
macB

New Member
It’s for tourist information listings. Each URL submitted in the sitemap relates to either an actual (unique) listing or a search listing page, as in “things to do near London”. As I mentioned, there are no duplicates or thin pages, but I am new at this so there aren’t any canonical references set, and after doing more reading today I think you are right and that’s where my problem lies.

I see that Googlebot has picked up a massive number of URLs that combine my search page with varying URL params. I think I’ll look into canonical tags.

Thanks for your help.
 
Community

Administrator
Staff member
Yes, URL params, whether from filters or search, don't need to be indexed in the majority of cases, as they generate large amounts of duplicate content.

I would start off by adding correct canonicals and, once Google has picked those up, start to noindex the offending pages too. While they linger in the index they won't be doing you any favours.
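
For anyone following along, here is a rough sketch of the canonical step in Python. It assumes the search page takes its filters as query-string params and that only a small allow-list of them should survive in the canonical URL. The names KEEP_PARAMS, canonical_url and canonical_link_tag, and the params "q" and "page", are made up for illustration; they are not from the site being discussed.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical allow-list: the only params assumed to define a distinct page.
# Everything else (sort order, tracking params, extra filters) is dropped.
KEEP_PARAMS = {"q", "page"}


def canonical_url(url: str) -> str:
    """Return the URL with non-allow-listed params and the fragment removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k in KEEP_PARAMS]
    kept.sort()  # stable param order, so equivalent URLs map to one canonical
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> tag to place in the page <head>."""
    href = html.escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


print(canonical_link_tag("https://example.com/search?q=london&page=2&sort=asc"))
```

Every filtered or sorted variation of the same search then points at one URL, which is the duplicate signal described above. Which params belong in the allow-list depends entirely on the site.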
 
macB

New Member
Great, thank you very much. I think in my naivety I thought Google would give my submitted sitemap precedence over what their bot found, but that is, as I say, naive.
 
macB

New Member
So I've overhauled the website, placing the correct canonical, next and prev tags where applicable, and when a search gets too complicated (too many filter options) I've used the noindex, nofollow tags.
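
In case it helps anyone else, the "too many filter options" rule can be sketched roughly like this in Python. The threshold MAX_FILTERS and the function name robots_meta are hypothetical, and counting query params is only one assumed way of deciding that a search is "too complicated".

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical threshold: more params than this counts as "too complicated".
MAX_FILTERS = 2


def robots_meta(url: str, max_filters: int = MAX_FILTERS) -> str:
    """Return the robots meta tag for a search URL based on its param count."""
    param_count = len(parse_qsl(urlsplit(url).query))
    if param_count > max_filters:
        # Heavily filtered search: keep it out of the index.
        return '<meta name="robots" content="noindex, nofollow">'
    # Simple search or plain listing: leave it indexable.
    return '<meta name="robots" content="index, follow">'


print(robots_meta("https://example.com/search?q=london&type=museum&price=free"))
```
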

Is there any way I can ask Google to recrawl the site, or do I just have to wait for it to come round on its own schedule?
 
