NgSysV2-4.2: SEO (Search Engine Optimisation)
This post series is indexed at NgateSystems.com. You'll find a super-useful keyword search facility there too.
Last reviewed: Nov '24
1. Introduction
Once you've deployed your application into the Google Cloud it becomes a target for the "web spiders" that patrol the web in search of content to add to their keyword "indexes". Once your site is indexed, people may see it in Search Engine returns.
This is great if it all works. The search engine will drive business in your direction and won't charge you a penny. But in practice, you have to encourage the spiders to index your site prominently. This is what "search engine optimisation" (SEO, for short) is all about.
Getting good SEO for your site involves:
- Providing a sitemap to help the spiders navigate your site
- Using SSR (Server-side-rendering) and Pre-rendering to make your "crawl budget" go further
- Helping the bots to locate useful "index-worthy" content in your pages
2. Providing sitemap and robots files to guide web spiders
Your site should provide a sitemap file that lists all the routes you want Google (and other search engines) to index. Indexing spiders will usually discover them anyway, provided pages in your site's "tree" hierarchy are properly linked via <a> anchor links. But problems may arise if your site is large or new and still poorly referenced by other sites.
These problems are fixed by creating a "site map" file. Site maps can be formatted in several ways, but at its simplest, the indexing engine will be happy with a simple text file that lists your pages as follows:
// /static/sitemap.txt - Don't copy this line
https://myProjectURL/inventory-display
https://myProjectURL/inventory-maintenance
etc
Note the following:
- Pages deployed to the Google App Engine are automatically provisioned with an https (encrypted) URL
- "myProjectURL" will most likely be a "custom" URL that you have explicitly linked to your deployment URL.
- You only need to add extensions to the "clean" URLs shown above if these are static ".pdf" files or similar.
- A text sitemap can be called whatever you like, but it's customary to call it "sitemap.txt". In a Svelte webapp, however, you must store this in your project's static folder so that it gets built into your yaml file and deployed to the root of your webapp.
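If your route list changes often, you may prefer to generate the sitemap text rather than maintain it by hand. The following is a minimal sketch only: `buildSitemapText`, `baseUrl` and `routes` are hypothetical names invented for this illustration, not part of Svelte or any Google tooling.

```javascript
// Hypothetical helper: builds the contents of a plain-text sitemap.
// `baseUrl` and `routes` are illustrative inputs, not a SvelteKit API.
function buildSitemapText(baseUrl, routes) {
  const origin = baseUrl.replace(/\/+$/, ""); // drop any trailing slashes
  const lines = routes.map((route) => {
    const path = route.startsWith("/") ? route : `/${route}`;
    return `${origin}${path}`;
  });
  // One absolute URL per line, de-duplicated, in the original order
  return [...new Set(lines)].join("\n") + "\n";
}

const sitemapText = buildSitemapText("https://myProjectURL", [
  "/inventory-display",
  "/inventory-maintenance",
]);
console.log(sitemapText);
```

One way to use this would be to write the returned string to static/sitemap.txt (for example with Node's `fs.writeFileSync`) in a small script that runs before each build.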
The robots file provides a "partner" to the sitemap file that:
- Blocks specific spiders: You can block certain web crawlers from accessing certain parts of your site.
- Blocks specific directories: For example, you might block /admin/ or /private/ to keep those pages out of search engine indexes.
- Specifies the sitemap's location.
Here's an example:
// /static/robots.txt - Don't copy this line
User-agent: *
Disallow: /inventory-maintenance
Sitemap: https://myProjectURL/sitemap.txt
In a Svelte project, the robots.txt file (mandatory filename) must be stored at /static/robots.txt.
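Disallow values are matched against the path part of a URL (the text after the domain), which is why they start with "/". The sketch below illustrates that prefix matching. It is deliberately simplified: `isDisallowedForAll` is a hypothetical helper that only reads the "User-agent: *" group and ignores Allow rules, wildcards and per-crawler groups, all of which real crawlers support.

```javascript
// Simplified sketch: returns true when `path` is blocked for all crawlers.
// Only handles the "User-agent: *" group and plain path prefixes.
function isDisallowedForAll(robotsTxt, path) {
  let inWildcardGroup = false;
  for (const rawLine of robotsTxt.split("\n")) {
    const line = rawLine.split("#")[0].trim(); // strip comments
    const sep = line.indexOf(":");
    if (sep === -1) continue; // not a "field: value" line
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === "user-agent") {
      inWildcardGroup = value === "*";
    } else if (field === "disallow" && inWildcardGroup && value !== "") {
      if (path.startsWith(value)) return true;
    }
  }
  return false;
}

// Illustrative file contents, mirroring the example in this post
const robotsTxt = [
  "User-agent: *",
  "Disallow: /inventory-maintenance",
  "Sitemap: https://myProjectURL/sitemap.txt",
].join("\n");

console.log(isDisallowedForAll(robotsTxt, "/inventory-maintenance")); // true
console.log(isDisallowedForAll(robotsTxt, "/inventory-display")); // false
```

A quick check like this can help confirm that a rule blocks what you intended before you deploy the file.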
You can check that your robots.txt and sitemap.txt files are being correctly deployed to your project's URL root by trying to view them using your browser. Each of the following URLs entered into the browser's address bar should respond by displaying the file contents.
https://myProjectURL/sitemap.txt
https://myProjectURL/robots.txt
Further information on all these issues can be found at "Learn about sitemaps".
Once you've successfully deployed your sitemap you might find it useful to give Google a "heads-up" by submitting the sitemap to the Google Search Console.
You start here by registering a "property" - ie the URL of your site. This involves running a procedure that enables you to assure Google that you own the site. The procedure starts with the console downloading a "site-verification" file into your "downloads" folder. You must copy this into your Svelte static folder and rebuild/redeploy your webapp to upload the file to your remote site. If Google can find the file with the content it is expecting when you click the "Verify" button on the authentication screen, it will be satisfied that you genuinely are the owner.
Clicking on the "sitemaps" tool in the menu on the left of the screen will now enable you to enter your sitemap URL (sitemap.txt) and get a "success" status in the Submitted Sitemaps window.
It may take some time for Google to get around to processing your sitemap. You can speed up the process if you select the "URL inspection" tab on the left-hand side of the screen and use the "Inspect any URL" entry field at the top of the page to enter the addresses of particularly important pages. For example, you might enter "svelte-dev/" (without the quotes) for the "https://svelte-dev/" root page on a "svelte-dev" website.
In response, Google will check the page for any gross deficiencies. If it's satisfied with what it sees, it will then offer you an opportunity to add the page to its priority crawl queue. You might also use this facility to get updated indexing following major changes to a page.
Once your page is reported as "indexed", you should find that a Google search on site keywords returns a results list that contains an entry for your site. Ideally, your site's entry will be at the top of the list! If so, you have just created an "impression" (Google term) for your site - a Google search has triggered a match for your site. The "performance" tab in the Search Console will now show you the number of daily "impressions" generated by the indexing for your site and the number of times that these were then clicked.
Remember that, if you've been making significant changes to an indexed page, it will probably be a good idea to explicitly request re-indexing, as described above.
The Search Console is a sophisticated tool for monitoring the progress of indexing on your site and resolving any problems that might have been reported. See "Get started with Search Console" for further details.
3. Using "Server-side-rendering" and "Pre-rendering" to make your "crawl budget" go further
While, in recent years, search engines have got better at indexing content rendered with client-side JavaScript, they are happier with pages that contain only HTML. Server-side rendered (SSR) content (ie pages whose HTML has already been generated by running database-access JavaScript on the server) is indexed more frequently and reliably. Nobody but Google knows how their indexing engines work, but a reasonable guess runs something like this.
First, your webapp is awarded a "site ranking" (determined in an obscure manner, but probably influenced by the number of "backlinks" on sites that reference your URL). This in turn awards you a certain "crawl budget" - the amount of time the indexing engine is prepared to spend indexing your pages. You'll want to spend this wisely. Server-side rendering eases the bot's workload and makes your budget go further. So, if you want good SEO you should use SSR!
The ultimate expression of server-side rendering is where a "static" page - one that displays data that either never changes or changes only rarely - is rendered at build time by adding the following statement to its +page.js or +page.server.js file:
export const prerender = true;
Because the server now only has to serve ready-made HTML, your crawl budget goes even further and your users receive a lightning-fast response! See Post 4.3 for details of an arrangement to automate pre-rendering builds using a scheduler.
4. Helping the bots to locate useful "index-worthy" content in your pages
Google's docs at "Overview of crawling and indexing topics" contain everything you need to know. Here's a summary:
First of all, you need to get your head around Google's "Mobile first" policy. The Google spider will analyse your site as it would be seen by a browser running on a mobile phone. This means that it will downgrade your site's "reputation" (and its crawl budget) if it considers, for example, that your font size is too small.
If your webapp has been designed for desktop users, this will come as a blow to you. Try your site on your phone and you will likely conclude it is completely useless.
The way out of this is to use "responsive styling" (see Post 4.4) so that the webapp senses the page width of the device it's running on and adjusts things accordingly.
It may be that parts of your webapp aren't appropriate for mobile operation. You may seek to remove these, but Google would remind you that most of its indexing comes from mobile pages. They recommend you gently conceal such content behind tabs or "accordions".
What web spiders are primarily looking for is content - information that search engine customers will find useful. But they need your assistance in locating and interpreting this. Here are some tips on how you might do this:
- Give each page well-written and unique <title>, <meta name="description" content=" ... "> and <link> elements inside a <svelte:head> code block. Here's an example:
<svelte:head>
<title>Product Inventory</title>
<meta name="description" content="List of products sold by the Magical Products company" />
<link rel="canonical" href="https://myUrl" />
</svelte:head>
This arrangement delegates to Svelte the awkward task of inserting the <title>, <meta> and <link> elements of <head> into the DOM. The <link> element here tells the indexing bot which "brand" of a website that might be reachable variously as "https://myUrl" and "https://myUrl/" etc, etc is the "main" or "preferred" version. Ask ChatGPT for a tutorial on the word "canonical" if you'd like the full story.
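If several URL variants reach the same page, you might compute the canonical href with a small helper instead of typing it into every page. The sketch below is an illustration only: `canonicalUrl` is a hypothetical function, and the policy it applies (drop the query string and fragment, and drop trailing slashes except on the root page) is an assumption. Choose whichever form your site actually serves.

```javascript
// Hypothetical helper: reduces URL variants to one "preferred" form.
// The normalisation policy here is an assumption for illustration.
function canonicalUrl(rawUrl) {
  const url = new URL(rawUrl);
  url.hash = ""; // ignore "#fragment" variants
  url.search = ""; // ignore "?query" variants
  if (url.pathname.length > 1) {
    // Remove trailing slashes everywhere except on the root page
    url.pathname = url.pathname.replace(/\/+$/, "");
  }
  return url.toString();
}

console.log(canonicalUrl("https://example.com/inventory-display/?sort=price#top"));
// https://example.com/inventory-display
```

The returned string is what you would place in the href of the canonical <link> element.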
- Ensure that the text content of <a> anchor links clearly describes the content of the linked page or (if this is impractical) is supplemented by a title= attribute. Use an absolute URL in the href= attribute (ie one that includes all its components). Here's an example:
<a href="https://example.com/best-seo-tips"
title="Guidelines for achieving excellent SEO">Best SEO Tips</a>
- Use "structured" data descriptions in sites (such as "recipe" sites) displaying fixed classes of information in a tightly defined format. "Structured data" in this context references a standardized format for providing information about a page and classifying its content. The most common format for structured data on the web is the one published by schema.org. Ask ChatGPT for an example if you'd like to know more about this and how you would use structured data in a Svelte webapp.
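Structured data using the schema.org vocabulary is often embedded in a page as JSON-LD inside a script element. The sketch below shows roughly what that could look like for a recipe page. `jsonLdScript` is a hypothetical helper and the recipe values are invented for illustration; check the schema.org definitions for the properties your own content type needs.

```javascript
// Hypothetical helper: serialises a schema.org description as a JSON-LD script tag.
function jsonLdScript(data) {
  // Escape "<" so that text in the data cannot close the script element early
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  // The closing tag is split so this code can also live inside a Svelte <script> block
  return `<script type="application/ld+json">${json}</` + `script>`;
}

// Illustrative values only
const recipe = {
  "@context": "https://schema.org",
  "@type": "Recipe",
  name: "Magical Pancakes",
  recipeIngredient: ["flour", "milk", "eggs"],
};

console.log(jsonLdScript(recipe));
```

In a Svelte page, one commonly used approach is to render the returned string with {@html jsonLdScript(recipe)} inside the <svelte:head> block.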