A sitemap is a file you need when registering a site with search engines such as Google and Yahoo. It is the file that Search Console (Google's search registration service) asks for. It mainly contains information about the pages you want to appear in search results.
Add the gem to the Gemfile.
* The GitHub repository is here: https://github.com/kjvarga/sitemap_generator
Gemfile
gem 'sitemap_generator'
Install gem
$ bundle install
Run the following command to generate config/sitemap.rb.
$ rails sitemap:install
Edit the generated config/sitemap.rb.
Set SitemapGenerator::Sitemap.default_host to the production host.
Inside the SitemapGenerator::Sitemap.create block, list the pages you want registered for search.
config/sitemap.rb
require 'rubygems'
require 'sitemap_generator'

SitemapGenerator::Sitemap.default_host = "http://www.example.com"

SitemapGenerator::Sitemap.create do
  add '/', changefreq: 'weekly', priority: 0.9
  add '/about', changefreq: 'weekly', priority: 0.5

  # One entry per user; lastmod comes from the user's own timestamp.
  User.all.each do |user|
    add user_path(user), lastmod: user.updated_at
  end
end
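To make the options passed to add more concrete, here is a rough sketch of the kind of url entry the sitemap protocol defines for them. The helper sitemap_url_entry is hypothetical and is not how sitemap_generator builds its output; it only illustrates how loc, lastmod, changefreq and priority relate to the arguments used above.

```ruby
require 'time' # provides Time#iso8601

# Hypothetical helper for illustration only (not part of sitemap_generator).
# Builds one <url> entry in the shape defined by the sitemap protocol.
def sitemap_url_entry(host, path, lastmod: nil, changefreq: nil, priority: nil)
  # Escape characters such as & so the URL is valid inside XML text.
  loc = "#{host}#{path}".encode(xml: :text)

  lines = ['<url>', "  <loc>#{loc}</loc>"]
  lines << "  <lastmod>#{lastmod.utc.iso8601}</lastmod>" if lastmod
  lines << "  <changefreq>#{changefreq}</changefreq>" if changefreq
  lines << "  <priority>#{priority}</priority>" if priority
  lines << '</url>'
  lines.join("\n")
end

puts sitemap_url_entry('http://www.example.com', '/about',
                       changefreq: 'weekly', priority: 0.5)
```

Options that are left out are simply omitted from the entry in this sketch; the gem itself may fill in its own defaults.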
Try running it locally.
$ rails sitemap:refresh
This regenerates the sitemap and notifies (pings) the search engines.
If you don't want to notify the search engines, use the no_ping task.
$ rails sitemap:refresh:no_ping
When you run it, you can see that public/sitemap.xml.gz is generated.
You can download it from http://localhost:3000/sitemap.xml.gz.
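As a quick sanity check of the generated file, a small script along these lines can list the URLs it contains. This is a minimal sketch: sitemap_locations is a hypothetical helper name, and the regex scan is only meant for eyeballing the result, not as a full XML parser. If the file turns out to be a sitemap index, the listed URLs will be the individual sitemap files instead of pages.

```ruby
require 'zlib'

# Hypothetical helper: returns every <loc> value found in a gzipped sitemap.
def sitemap_locations(path)
  xml = Zlib::GzipReader.open(path) { |gz| gz.read }
  xml.scan(%r{<loc>(.*?)</loc>}m).flatten.map(&:strip)
end

# Path produced by sitemap:refresh in this article; skipped if not generated yet.
path = 'public/sitemap.xml.gz'
puts sitemap_locations(path) if File.exist?(path)
```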
Since sitemap.xml.gz needs to be updated whenever a User is created, it is regenerated once a day by a cron job.
Add an endpoint for that.
cron_jobs_controller.rb
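class CronJobsController < ApplicationController
  # Called by the cron job; regenerates the sitemap.
  def refresh
    # Backticks return the command's output, which is written to the log.
    logger.info `bundle exec rails sitemap:refresh`
    head :ok
  rescue StandardError => e
    logger.error e.full_message
    head :internal_server_error
  end
end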
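One thing to be aware of with the action above: backticks return the command's output but do not raise when the command exits with a non-zero status, so the rescue branch only covers cases such as the command failing to start. A minimal sketch of checking the exit status explicitly, using Open3 from the standard library, might look like this. run_command is a hypothetical helper name, and the controller usage in the comment is an assumption about how it could be wired in.

```ruby
require 'open3'

# Hypothetical helper: runs a command and returns [combined_output, success?].
# stdout and stderr are captured together so failures are visible in the log.
def run_command(*cmd)
  output, status = Open3.capture2e(*cmd)
  [output, status.success?]
end

# Possible use inside the controller action (assumption, not the original code):
#   output, ok = run_command('bundle', 'exec', 'rails', 'sitemap:refresh')
#   logger.info output
#   head(ok ? :ok : :internal_server_error)
```

Returning a non-2xx status on failure also lets the failed run show up as failed on the cron side instead of silently succeeding.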
routes.rb
Rails.application.routes.draw do
  ...
  # Must match the url in cron.yaml and the CronJobsController#refresh action.
  get '/cron_jobs/sitemaps', to: 'cron_jobs#refresh'
end
Add the settings to cron.yaml.
cron.yaml
cron:
- description: sitemap
  url: /cron_jobs/sitemaps
  timezone: Asia/Tokyo
  schedule: every day 03:00
Deploy the cron job settings.
$ gcloud app deploy cron.yaml --project=target-project
This completes the setup for periodic execution by cron job.
GAE in the production environment runs 3 instances for scale-out. Therefore, if you generate a file dynamically and place it on one instance, a request has only a 1/3 chance of reaching the instance that has the file.
In the first place, storing generated files on the instance in a PaaS environment is an anti-pattern.
It is better to upload them to external storage (GCS).
This is not a problem if you are using Compute Engine or similar.
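The 1/3 figure can be written out as a small calculation. This sketch assumes the file exists on exactly one instance and that requests are spread evenly across instances; real load balancing may behave differently. hit_probability is a hypothetical helper name.

```ruby
# Hypothetical helper: chance that a request lands on an instance that has the file,
# assuming requests are distributed evenly across all instances.
def hit_probability(instances_with_file, total_instances)
  Rational(instances_with_file, total_instances)
end

puts hit_probability(1, 3) # prints 1/3
```

With the file in external storage, every instance serves the same copy, so the number of instances no longer matters.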
Add the GCS settings to config/sitemap.rb as described in the sitemap_generator documentation.
SitemapGenerator::GoogleStorageAdapter Uses Google::Cloud::Storage to upload to Google Cloud storage. You must require 'google/cloud/storage' in your sitemap config before using this adapter. An example of using this adapter in your sitemap configuration with options:
Source: https://github.com/kjvarga/sitemap_generator#upload-sitemaps-to-a-remote-host-using-adapters
config/sitemap.rb
require 'rubygems'
require 'sitemap_generator'
require 'google/cloud/storage'
SitemapGenerator::Sitemap.default_host = ENV['BASE_URL']
SitemapGenerator::Sitemap.sitemaps_host = "https://storage.googleapis.com/#{ENV['GOOGLE_BUCKET']}"
# Upload the generated files to GCS instead of keeping them on the instance.
SitemapGenerator::Sitemap.adapter = SitemapGenerator::GoogleStorageAdapter.new(
  credentials: ENV['GOOGLE_CREDENTIAL'],
  project_id: ENV['GOOGLE_PROJECT_ID'],
  bucket: ENV['GOOGLE_BUCKET']
)

SitemapGenerator::Sitemap.create do
  add '/', changefreq: 'weekly', priority: 0.9
  add '/about', changefreq: 'weekly', priority: 0.5

  User.all.each do |user|
    add user_path(user), lastmod: user.updated_at
  end
end
Add a route that redirects to sitemap.xml.gz on GCS when a request comes in at https://domain/sitemap.xml.gz.
routes.rb
Rails.application.routes.draw do
  ...
  get '/sitemap.xml.gz', to: redirect("https://storage.googleapis.com/#{ENV['GOOGLE_BUCKET']}/sitemap.xml.gz", status: 301)
end
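The GCS URL is built from ENV['GOOGLE_BUCKET'] in both the sitemap config and the redirect. If that variable is unset, string interpolation quietly produces a URL with an empty bucket name. A minimal sketch of a shared helper that fails loudly instead is shown here; gcs_sitemap_url is a hypothetical name and is not part of Rails or sitemap_generator.

```ruby
# Hypothetical helper: builds the public GCS URL of the sitemap.
# ENV.fetch raises KeyError when GOOGLE_BUCKET is missing, instead of
# silently interpolating an empty string into the URL.
def gcs_sitemap_url(bucket = ENV.fetch('GOOGLE_BUCKET'), filename = 'sitemap.xml.gz')
  "https://storage.googleapis.com/#{bucket}/#{filename}"
end

puts gcs_sitemap_url('example-bucket')
```

Where such a helper would live (an initializer, a small module, and so on) is left open; the point is only to keep the URL in one place.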
Now redeploy the GAE app and the setup is complete.
After the cron job runs, you can download sitemap.xml.gz by accessing https://domain/sitemap.xml.gz.