SEO - Common robots.txt

What would be a common robots.txt for all SiteJet websites? Usually, when we develop a website, we disallow the cache folder, for example, and other internal dev folders.

Hi! If you need one, you can create a robots.txt file by following the instructions here: Add a robots.txt file - Sitejet Help
As far as I know, there are no cache or other internal dev folders in Sitejet.

Hope this helps!

Hi @Lucian_Dinu, thank you for your reply.

I have already followed the instructions before asking the question.

My question is about the content of the robots.txt, not about how to create one (which is what your shared link covers).

Usually, when we develop a website, we must exclude all the internal folders that we do not want to expose on the internet.

In the case of SiteJet, it is a black box for us, so I wonder why you don’t share a common robots.txt that we could use as a base for our websites.
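To illustrate the kind of base file meant here, below is a minimal sketch in Python that checks a sample robots.txt against a few URLs using the standard library's urllib.robotparser. The folder names /cache/ and /dev/, the example.com domain, and the helper is_allowed are assumptions for illustration only; they are not paths known to exist on a SiteJet website.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: /cache/ and /dev/ are placeholder folder names,
# not folders known to exist on a SiteJet website.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /cache/
Disallow: /dev/
"""


def is_allowed(robots_text: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(agent, url)


# Placeholder URLs on a placeholder domain.
print(is_allowed(SAMPLE_ROBOTS, "https://www.example.com/cache/page.html"))
print(is_allowed(SAMPLE_ROBOTS, "https://www.example.com/about"))
```

A check like this only shows how crawlers that honor robots.txt would interpret the rules; which paths are actually worth excluding depends on what the platform exposes.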

Hi! This is what is currently available in Sitejet.
I have a similar issue: I also want to exclude pages/collection items. This is why I’ve created this feature request: Add the possibility to exclude pages/collection item pages from sitemap.xml

If you want, you can add your input (+1) there.

Hope this helps!

Hi, this should be answered by the SiteJet dev team. There should be a common base robots.txt file, either shared with us or generated on the website, with the possibility for us to modify it. Maybe @Andre or @Franzi could help us with this.

As for your feature request, I totally agree with you. There should be a possibility to have draft pages, along with the Save Without Publishing feature.

Hi @AJ_Joe

There is already a default robots.txt automatically generated for every website. You can download it (simply append /robots.txt to your website domain), customize it to your needs, and upload it again using the instructions mentioned above. Usually a search engine would index every (linked) page, so a customized robots.txt is not needed in most cases.
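As a rough sketch of the download step, the following Python snippet builds the robots.txt URL by appending /robots.txt to the site's domain and saves the file locally for editing. The helper names robots_url and download_robots, the assumption that the site is served over HTTPS, and the www.example.com domain are illustrative and not part of Sitejet.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(site: str) -> str:
    """Return the robots.txt URL for a site given as a domain or full URL."""
    if "://" not in site:
        site = "https://" + site  # assumption: HTTPS when no scheme is given
    parts = urlsplit(site)
    # robots.txt lives at the root of the host, so any path is dropped.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def download_robots(site: str, target: str = "robots.txt") -> None:
    """Save the site's current robots.txt to a local file for editing."""
    with urlopen(robots_url(site), timeout=10) as response:
        body = response.read()
    with open(target, "wb") as handle:
        handle.write(body)
```

Calling download_robots("www.example.com") with your own domain in place of the placeholder would write the generated file to robots.txt in the current directory; opening the same URL in a browser works just as well.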

If you want to prevent pages from being indexed, it is easier to simply untick the “Index” checkbox in the “Pages” dialog.

Hi @malte, thank you for your reply. Do you mean simply adding /robots.txt to the website domain?