SEO issues: strange URLs and 404s

I’ve published a website and submitted the sitemap to Google Search Console.
It has flagged a number of URLs as 404 and 78 URLs as noindexed.

These URLs are not in the sitemap.xml page.
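For anyone wanting to double-check the same thing, here is a minimal sketch of how the flagged URLs could be compared against the sitemap. It assumes the sitemap follows the standard sitemaps.org format and that the flagged URLs have been exported from Search Console into a plain list. The function names and the example.com URLs are hypothetical, for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return the set of <loc> URLs listed in a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    }


def not_in_sitemap(flagged, sitemap_xml):
    """Return the flagged URLs that the sitemap does not list, in order."""
    known = sitemap_urls(sitemap_xml)
    return [url for url in flagged if url not in known]


if __name__ == "__main__":
    # Hypothetical sample data; replace with the real sitemap and GSC export.
    sample_sitemap = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://example.com/</loc></url>"
        "<url><loc>https://example.com/contact</loc></url>"
        "</urlset>"
    )
    flagged = ["https://example.com/contact", "https://example.com/images/old.jpg"]
    print(not_in_sitemap(flagged, sample_sitemap))
```

The comparison is an exact string match, so differences such as a trailing slash or http vs https would count as "not in the sitemap".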

They were never created by us. We never used the images mentioned in the URL in our website gallery.

This is a brand new website.

How is Googlebot finding these URLs?
Is our domain space being shared with anyone else?

Example URL:

Check the hyperlink for this broken image, not the front-end link.

@Sitejet_Admin could I have a reply please?

Hi @Naz_Haque, can you please share the Website ID with us and also a screenshot of the noindex and 404 URLs?

@Franzi

Domain: https://www.westwaydrivingschool.co.uk/

Refer to attached image from Google Search Console.

I can’t upload a Word or Excel file here with all the URLs. There are 70+.

@Franzi

404s

Hey @Naz_Haque - has this website existed before under this domain?

Yes, it has, about three or four years ago.

@Naz_Haque Did you try to crawl it again? It might be possible that the old URLs are still listed. Maybe a new crawl could change that.

Franzi, these were the results from a new crawl taken in 2022.

The domain has been dormant for 3+ years, so there’s no way those URLs were from an old website.

The suspicious URLs didn’t exist on the old website.

This was checked via the Wayback Machine.
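A rough sketch of how that kind of check could be scripted for 70+ URLs, assuming the Internet Archive's public availability endpoint (archive.org/wayback/available) and its usual JSON shape with an "archived_snapshots" / "closest" entry. The helper names are made up for illustration, and the endpoint only reports the closest snapshot, so an empty result is a hint rather than proof that a URL never existed.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the Internet Archive availability API.
WAYBACK_API = "https://archive.org/wayback/available"


def build_query(url):
    """Build the availability-API request URL for one page URL."""
    return WAYBACK_API + "?" + urllib.parse.urlencode({"url": url})


def closest_snapshot(payload):
    """Return the closest snapshot URL from an API response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def check(url, timeout=10.0):
    """Query the API for one URL (needs network access)."""
    with urllib.request.urlopen(build_query(url), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))
```

Calling check() on each suspicious URL and collecting the ones that return None would give a list of URLs with no archived snapshot.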

We created a new website on Sitejet, connected it to GSC and then ran the scan.

Also, a Google site: search for the domain doesn’t show those URLs (search Google for “site:westwaydrivingschool.co.uk”).

So I’m trying to understand how these suspicious URLs were crawled.

@Naz_Haque We assume that those are the URLs from the previous website. Did you try to download the sitemap from the website (SEO with Sitejet - Sitejet Help) and upload it to Google Search Console?

Hi Franzi,

Yes, the sitemap was submitted last month.

It’s possible that Google’s first crawl of the new website included URLs it had on record from 3+ years ago.

I’ve never seen that before.

Let’s see what happens when Google Search Console does a new crawl top to bottom. I’ll keep you posted.

Thanks for your support.

Please keep us updated on this. A very strange issue indeed.


So, strangely, Google still attempted to crawl those URLs.

I’ve never seen this before: a new site with a submitted sitemap where Google tries to find historical URLs.

I’ll instruct GSC to remove those URLs manually.

Hey @Naz_Haque - how was your experience with another site? Did you have similar issues? Or just with this site on Sitejet?