Measuring a page's average score in Python with the PageSpeed Insights API

Introduction

On the product I currently work on, we use PageSpeed Insights (PSI) to measure page-speed improvements. The goals are to rank higher in search results and to make the site less stressful to browse.

We measure the results before and after each improvement, but depending on when the measurement is taken, the PSI score can vary by 10 to 30 points.

That's why my team decided to measure each page several times instead of once, and to record the following:

- the average score
- the number of measurements (15 or more)
- the lowest and highest scores, if the variation is large

However, the more pages you want to measure, the more tedious it becomes to click the analyze button and wait each time. So this time I tried to streamline the work using the API.

Target readers

- Programming beginners through first-year engineers who want to fetch data through an API

Finished product

kotahashihama/psi-score-collector https://github.com/kotahashihama/psi-score-collector

Getting an API key

To get data from the PageSpeed Insights API, you pass an API key as a request parameter.

You can get it on the following page.

Try using the PageSpeed Insights API | Google Developers https://developers.google.com/speed/docs/insights/v5/get-started

Hard-coding the API key in the project code is bad for security, so I used the python-dotenv library and manage the key as an environment variable in .env.

`Terminal`


pip install python-dotenv

`config.py`


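import os

from dotenv import load_dotenv

# Load the variables defined in .env into the process environment
load_dotenv()

# os.getenv returns None if API_KEY is not defined
API_KEY = os.getenv('API_KEY')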

`.env`


API_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
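
If .env is missing or the variable name is misspelled, os.getenv returns None and the problem only surfaces later as an API error. Below is a minimal sketch of failing early instead. `require_env` is a hypothetical helper, not part of python-dotenv, and it accepts the environment as a mapping so it can be tried without touching real environment variables.

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable, or raise if it is unset or empty.

    `require_env` is a hypothetical helper for illustration.
    """
    if environ is None:
        environ = os.environ
    value = environ.get(name)
    if not value:
        raise RuntimeError(f'{name} is not set; check your .env file')
    return value


# Possible usage in config.py, after load_dotenv():
# API_KEY = require_env('API_KEY')
```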

Prepare setting items

Since we want to measure each URL a set number of times, one after another, the following settings are required.

`main.py`


# Number of measurements per URL and device
measurement_count = 3

# URLs to be measured
url_list = [
  'https://www.google.com/',
  'https://www.yahoo.co.jp/',
  'https://www.bing.com/'
]

Ideally the settings would be kept separate from the processing code, but this time I'll leave them together in main.py.
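
For reference, one way to separate the settings would be to keep them in a small JSON file and load them at startup. This is only a sketch, not what the finished product does. The `load_settings` name and the JSON layout are assumptions made for illustration.

```python
import json


def load_settings(text):
    """Parse settings from a JSON string and validate the two items used above."""
    data = json.loads(text)
    count = int(data['measurement_count'])
    urls = list(data['url_list'])
    if count < 1:
        raise ValueError('measurement_count must be at least 1')
    if not urls:
        raise ValueError('url_list must not be empty')
    return count, urls


# Example contents of a hypothetical settings.json
example = '{"measurement_count": 3, "url_list": ["https://www.google.com/"]}'
measurement_count, url_list = load_settings(example)
```

In a real project the string would come from reading the file, for example with `open('settings.json').read()`.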

Writing the main process

For each URL, the script prints

- its position in the list (for example, 2/3)
- the mobile scores
- the desktop scores

and finally displays a completion message.

`main.py`


print('\n'.join(map(str, url_list)))
print(f'Measuring... ({measurement_count} runs each)')

for i, url in enumerate(url_list):
  # Progress indicator such as (1/3), followed by the URL
  print(f'\n({i + 1}/{len(url_list)}) {url}')

  measure('mobile')
  measure('desktop')

  print('\n' + '=' * 60)

print('\nMeasurement completed!')

Getting the scores

Prepare the URL and parameters for accessing the PSI API.

`main.py`


api_url = 'https://www.googleapis.com/pagespeedonline/v5/runPagespeed'
api_key = config.API_KEY

payload = { 'key': api_key }

The config.py created earlier holds the value read from .env.

`main.py`


import config

The import above loads config, and its API_KEY value is assigned to `api_key`.
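
To see what the request will eventually look like, the sketch below assembles the full query string with the standard library. The `url` and `strategy` parameters are the ones added later in `measure`. `build_request_url` is a hypothetical helper, and the key shown is a placeholder.

```python
from urllib.parse import urlencode

API_URL = 'https://www.googleapis.com/pagespeedonline/v5/runPagespeed'


def build_request_url(page_url, strategy, api_key):
    """Return the full PSI request URL with percent-encoded query parameters."""
    params = {'url': page_url, 'strategy': strategy, 'key': api_key}
    return API_URL + '?' + urlencode(params)


print(build_request_url('https://www.google.com/', 'mobile', 'YOUR_API_KEY'))
```

Note that the page URL is percent-encoded here, which matters once the measured URL contains its own query string.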

The score-fetching process

First, here is the whole function.

`main.py`


import math
from decimal import Decimal, ROUND_HALF_UP

import numpy
import requests

def measure(device):
  # Labels shown in the output for each PSI strategy
  device_name = {
    'mobile': 'Mobile',
    'desktop': 'Desktop'
  }

  print(f'[ {device_name[device]} ]')

  # `url` is the loop variable of the main process; `payload` already holds the API key
  payload['strategy'] = device
  url_name = api_url + "?url=" + url

  scores = []

  for i in range(measurement_count):
    result = requests.get(url_name, params=payload)
    result_json = result.json()
    # The API returns the performance score as a float between 0 and 1
    result_score = result_json['lighthouseResult']['categories']['performance']['score']
    # Convert to the 0-100 scale
    displayed_score = math.floor(result_score * 100)

    scores.append(displayed_score)
    print(displayed_score, end=' ')

  score_average_raw = numpy.average(scores)
  score_average = Decimal(str(score_average_raw)).quantize(Decimal('0.1'), rounding=ROUND_HALF_UP)
  score_max = numpy.amax(scores)
  score_min = numpy.amin(scores)
  print(f'\nAverage: {score_average} points (lowest: {score_min}, highest: {score_max})')
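
One detail in the conversion step is worth checking: `math.floor(result_score * 100)` truncates, and because the raw score is a binary floating-point number, a value such as 0.29 comes out as 28 rather than 29. The sketch below compares flooring with rounding to the nearest integer. Which of the two matches the number shown on the PSI page is an assumption you may want to verify against your own measurements.

```python
import math

raw_score = 0.29  # example value in the 0-1 range returned by the API

floored = math.floor(raw_score * 100)  # 0.29 * 100 evaluates to 28.999999999999996
rounded = round(raw_score * 100)       # rounds to the nearest integer

print(floored, rounded)  # 28 29
```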

The libraries used in this function fall into two groups:

- accessing the API
- calculating the results

(I won't cover it here, but managing dependencies with Poetry or a similar tool should make it easier to share the Python libraries with team members.)

Library to access the API

Use a Python library called requests.

`Terminal`


pip install requests

`main.py`


import requests

`main.py`


result = requests.get(url_name, params=payload)
result_json = result.json()
result_score = result_json['lighthouseResult']['categories']['performance']['score']

get() calls the API, json() parses the JSON response into a dictionary, and the chain of keys walks down to where the score is stored.
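
The lookup above raises a bare KeyError if the response does not contain a score, for example when the key is invalid or the page cannot be analyzed. Here is a hedged sketch of a more defensive lookup. `extract_score` is a hypothetical helper, and the shape of the error payload (`{'error': {'message': ...}}`) is my assumption about how the API reports errors.

```python
def extract_score(result_json):
    """Return the performance score (0-1) from a parsed PSI response.

    Raises RuntimeError with a readable message when the score is missing.
    """
    if 'error' in result_json:
        # Assumed error shape: {'error': {'code': ..., 'message': ...}}
        message = result_json['error'].get('message', 'unknown error')
        raise RuntimeError(f'PSI API error: {message}')
    try:
        score = result_json['lighthouseResult']['categories']['performance']['score']
    except KeyError as exc:
        raise RuntimeError(f'Unexpected response: missing key {exc}') from exc
    if score is None:
        raise RuntimeError('Performance score is empty for this run')
    return score


sample = {'lighthouseResult': {'categories': {'performance': {'score': 0.87}}}}
print(extract_score(sample))  # 0.87
```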

Library to calculate results

- math
- numpy
- decimal

These are Python's numeric libraries. Of the three, only numpy is not part of the standard library, so install it with pip.

`Terminal`


pip install numpy

`main.py`


import numpy

From decimal, only the Decimal class and the rounding constant ROUND_HALF_UP are used, so import just those two names.

`main.py`


from decimal import Decimal, ROUND_HALF_UP

`main.py`


score_average_raw = numpy.average(scores)
score_average = Decimal(str(score_average_raw)).quantize(Decimal('0.1'), rounding=ROUND_HALF_UP)
score_max = numpy.amax(scores)
score_min = numpy.amin(scores)

- numpy calculates the average of the scores and picks out the highest and lowest.
- decimal's Decimal object rounds the average half-up to one decimal place for display.
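
A likely reason for going through Decimal is that the built-in round() rounds exact ties to the nearest even digit, so an average of 72.25 would be displayed as 72.2. The sketch below shows the difference. To keep it runnable without extra installs, it uses statistics.mean and the built-in min/max in place of numpy. `summarize` is a hypothetical helper name.

```python
from decimal import Decimal, ROUND_HALF_UP
from statistics import mean


def summarize(scores):
    """Return (average rounded half-up to one decimal, lowest, highest)."""
    average_raw = mean(scores)
    average = Decimal(str(average_raw)).quantize(Decimal('0.1'), rounding=ROUND_HALF_UP)
    return average, min(scores), max(scores)


scores = [72, 72, 72, 73]       # the mean is exactly 72.25
print(summarize(scores))        # (Decimal('72.3'), 72, 73)
print(round(mean(scores), 1))   # 72.2, because round() rounds ties to even
```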

Trying it out

Once started, the script collects scores on its own, so you can leave it running. It may be handy to signal the end with something like terminal-notifier.
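
As a sketch of that idea, the snippet below builds a terminal-notifier command and runs it only when the tool is installed, falling back to a plain message otherwise. `build_notify_command` and `notify_done` are hypothetical helpers. `-title` and `-message` are terminal-notifier options, and the tool itself is macOS-only.

```python
import shutil
import subprocess


def build_notify_command(message, title='PSI score collector'):
    """Return the terminal-notifier command line as a list of arguments."""
    return ['terminal-notifier', '-title', title, '-message', message]


def notify_done(message='Measurement completed!'):
    """Show a desktop notification if terminal-notifier is available."""
    if shutil.which('terminal-notifier') is None:
        print(message)  # fallback when the tool is not installed
        return False
    subprocess.run(build_notify_command(message), check=False)
    return True
```

Calling `notify_done()` after the final print in main.py would be one place to hook it in.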

`Terminal`


py main.py

Screenshot 2020-07-18 8.26.48 PM.png

The measurement results accumulate in the terminal as shown in the screenshot.

Summary

Fetching and processing data is a very fundamental skill, so it should be useful in many places at work. As an engineer, I'd like to be able to build tools like this as naturally as breathing.

So, that was my first Qiita article. See you!