Headless Chrome: a solution for server-side rendering of JS sites [Part 2] [Translation]: preventing re-rendering and optimizations


Continued from previous post

Preventing re-rendering

Saying that no changes are needed in the client code was not quite true. Our Express application loads the page through Puppeteer and serves the result as the response, but this setup has a problem.

The JS that runs in Headless Chrome on the server runs again when the user's browser loads the returned page. The pre-rendered HTML does nothing to stop that, so the markup is generated twice: once on the server and once on the client.

For our example the fix is simple. We need to tell the page that its HTML has already been generated and does not need to be generated again. At initialization we check whether <ul id="posts"> already exists in the DOM; if it does, the page was rendered on the server and there is no need to render it again. The modified code is as follows:

public/index.html

<html>
<body>
  <div id="container">
    <!-- Populated by JS (below) or by prerendering (server). Either way,
         #container gets populated with the posts markup:
      <ul id="posts">...</ul>
    -->
  </div>
</body>
<script>
...
(async() => {
  const container = document.querySelector('#container');

  // Posts markup is already in the DOM if we're seeing an SSR'd page.
  // Don't re-hydrate the posts here on the client.
  const PRE_RENDERED = container.querySelector('#posts');
  // Only render on the client when the posts markup does not exist yet.
  if (!PRE_RENDERED) {
    const posts = await fetch('/posts').then(resp => resp.json());
    renderPosts(posts, container);
  }
})();
</script>
</html>

Optimizations

Besides caching the pre-rendered results, there are many other interesting optimizations we can make to ssr(). Some pay off immediately, while others need more careful thought before they show results; how much you gain depends largely on the type of page and the complexity of the application.

Abort non-essential requests

Currently, the entire page (and every resource it requests) is loaded unconditionally in Headless Chrome. However, we really only care about two things:

1. The rendered HTML markup

2. The JS requests that produce that markup

Any network request that does not contribute to building the DOM is wasted. Images, fonts, stylesheets and media assets take no part in building the page's HTML. Styles only style and lay out the DOM; they do not create it. So we should tell the browser to ignore those resources. Doing so saves bandwidth and can shorten pre-rendering time, especially for pages that load a lot of resources.

The DevTools protocol supports a powerful feature called network interception, a mechanism that lets us modify requests before the browser actually issues them. Puppeteer enables it through page.setRequestInterception(true) together with a listener for the page's request event. Inside the listener we can abort requests for certain resources and let the rest continue.

ssr.mjs

async function ssr(url) {
  ...
  const page = await browser.newPage();

  // 1. Enable network interception.
  await page.setRequestInterception(true);

  page.on('request', req => {
    // 2. Abort requests for resources that do not build the DOM
    // (images, stylesheets, media).
    const whitelist = ['document', 'script', 'xhr', 'fetch'];
    if (!whitelist.includes(req.resourceType())) {
      return req.abort();
    }

    // 3. Let all other requests through as normal.
    req.continue();
  });

  await page.goto(url, {waitUntil: 'networkidle0'});
  const html = await page.content(); // serialized HTML of page DOM.
  await browser.close();

  return {html};
}
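
The allowlist check can also be pulled out into a small function so it can be tried without launching Chrome. The following is only a minimal sketch and not part of the original article: createRequestFilter and fakeTypedRequest are hypothetical names, and the fake object only mimics the three methods of Puppeteer's request object used above (resourceType(), abort() and continue()).

```javascript
// Hypothetical helper: builds a 'request' event handler that aborts every
// resource type not on the allowlist and lets the rest through.
function createRequestFilter(allowlist = ['document', 'script', 'xhr', 'fetch']) {
  const allowed = new Set(allowlist);
  return req => (allowed.has(req.resourceType()) ? req.continue() : req.abort());
}

// Stand-in for Puppeteer's request object, for illustration only.
// It records which method the handler called.
function fakeTypedRequest(type) {
  const calls = [];
  return {
    calls,
    resourceType: () => type,
    abort: () => calls.push('abort'),
    continue: () => calls.push('continue'),
  };
}

const requestFilter = createRequestFilter();
const imageRequest = fakeTypedRequest('image');
const scriptRequest = fakeTypedRequest('script');
requestFilter(imageRequest);
requestFilter(scriptRequest);
console.log(imageRequest.calls, scriptRequest.calls);
```

Here the image request is aborted and the script request continues. With a real page, the same handler would presumably be registered with page.on('request', createRequestFilter()) after interception has been enabled.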

Inline resource file content

Typically, build tools (such as gulp) inline JS, CSS and other resources directly into the page at build time. This improves initial page load performance by reducing the number of HTTP requests.

Instead of a build tool, we can also use the browser to do the same work: with Puppeteer we can manipulate the page's DOM and inline styles, JavaScript and any other resources before pre-rendering.

This example shows how to inline local CSS resources into style tags in the page by intercepting responses.

import urlModule from 'url';
const URL = urlModule.URL;

async function ssr(url) {
  ...
  const stylesheetContents = {};

  // 1. Stash the responses of local stylesheets.
  page.on('response', async resp => {
    const responseUrl = resp.url();
    const sameOrigin = new URL(responseUrl).origin === new URL(url).origin;
    const isStylesheet = resp.request().resourceType() === 'stylesheet';
    // Stash the contents of stylesheets served from the same origin as the page.
    if (sameOrigin && isStylesheet) {
      stylesheetContents[responseUrl] = await resp.text();
    }
  });

  // 2. Load page as normal, waiting for network requests to be idle.
  await page.goto(url, {waitUntil: 'networkidle0'});

  // 3. Inline the CSS.
  // Replace stylesheets in the page with their equivalent <style>.
  await page.$$eval('link[rel="stylesheet"]', (links, content) => {
    links.forEach(link => {
      const cssText = content[link.href];
      if (cssText) {
        const style = document.createElement('style');
        style.textContent = cssText;
        link.replaceWith(style);
      }
    });
  }, stylesheetContents);

  // 4. Get updated serialized HTML of page.
  const html = await page.content();
  await browser.close();

  return {html};
}

A brief explanation of the code above:

1. Use the page.on('response') event to listen for network responses.

2. Stash the responses for local CSS resources.

3. Find every link tag for a stylesheet, replace it with a style tag, and set its textContent to the content stashed in the previous step.
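
The filter in step 2 can be looked at in isolation. The snippet below is a minimal sketch rather than code from the original article: shouldStash is a hypothetical helper name, and the URLs are made-up examples. It uses the same origin comparison and resource type check as the response handler above.

```javascript
// Hypothetical helper mirroring the check in the response handler: true only
// for stylesheet responses served from the same origin as the prerendered page.
function shouldStash(responseUrl, pageUrl, resourceType) {
  const sameOrigin = new URL(responseUrl).origin === new URL(pageUrl).origin;
  return sameOrigin && resourceType === 'stylesheet';
}

// Example URLs are assumptions for illustration.
const examplePageUrl = 'http://localhost:8080/index.html';
console.log(shouldStash('http://localhost:8080/styles.css', examplePageUrl, 'stylesheet')); // true
console.log(shouldStash('https://cdn.example.com/lib.css', examplePageUrl, 'stylesheet')); // false
console.log(shouldStash('http://localhost:8080/app.js', examplePageUrl, 'script')); // false
```

Because the origin includes the scheme, host and port, a stylesheet served from a different port or a CDN is left as a link tag and is not inlined.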

Automatically minify resources

Another trick you can use with network interception is to modify the content of a response.

For example, suppose you want to minify the CSS in your app but keep it unminified during development. In that case you can have Puppeteer rewrite the response during pre-rendering, as in the following code.

import fs from 'fs';

async function ssr(url) {
  ...

  // 1. Intercept network requests.
  await page.setRequestInterception(true);

  page.on('request', req => {
    // 2. If request is for styles.css, respond with the minified version.
    if (req.url().endsWith('styles.css')) {
      return req.respond({
        status: 200,
        contentType: 'text/css',
        body: fs.readFileSync('./public/styles.min.css', 'utf-8')
      });
    }
    ...

    req.continue();
  });
  ...

  const html = await page.content();
  await browser.close();

  return {html};
}

The key call here is the request.respond method, which is described in the API documentation: https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#requestrespondresponse
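
The same idea can be extended to several files by keeping the replacement bodies in a lookup table. What follows is a sketch under assumptions and not the article's code: createOverrideHandler and fakeUrlRequest are hypothetical names, the CSS string is a placeholder, and the fake object only mimics the url(), respond() and continue() methods of Puppeteer's request object.

```javascript
// Hypothetical helper: answers requests whose URL ends with a known file name
// from an in-memory table, and passes every other request through.
function createOverrideHandler(overrides) {
  return req => {
    const url = req.url();
    for (const [suffix, body] of Object.entries(overrides)) {
      if (url.endsWith(suffix)) {
        return req.respond({status: 200, contentType: 'text/css', body});
      }
    }
    return req.continue();
  };
}

// Stand-in for Puppeteer's request object, for illustration only.
function fakeUrlRequest(url) {
  const log = [];
  return {
    log,
    url: () => url,
    respond: response => log.push({action: 'respond', response}),
    continue: () => log.push({action: 'continue'}),
  };
}

// Placeholder body; in practice it could be read once at startup from
// ./public/styles.min.css instead of on every request.
const overrideHandler = createOverrideHandler({'styles.css': 'body{margin:0}'});
const cssRequest = fakeUrlRequest('http://localhost:8080/styles.css');
const jsRequest = fakeUrlRequest('http://localhost:8080/app.js');
overrideHandler(cssRequest);
overrideHandler(jsRequest);
console.log(cssRequest.log[0].action, jsRequest.log[0].action);
```

The stylesheet request is answered from the table while the script request continues to the network untouched.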

Reuse a single Chrome instance

Launching a new browser instance for every pre-render puts a heavy load on the server. A better approach is to share one instance across different pages and renderers, which saves a lot of server resources and speeds up pre-rendering.

Puppeteer can connect to an existing instance by calling puppeteer.connect() with that instance's remote debugging URL, which avoids creating a new instance. To keep a long-running browser instance, we move the code that launches Chrome out of ssr() and into the Express server's entry file:

server.mjs

import express from 'express';
import puppeteer from 'puppeteer';
import ssr from './ssr.mjs';

let browserWSEndpoint = null;
const app = express();

app.get('/', async (req, res, next) => {
  if (!browserWSEndpoint) {
    const browser = await puppeteer.launch();
    browserWSEndpoint = await browser.wsEndpoint();
  }

  const url = `${req.protocol}://${req.get('host')}/index.html`;
  const {html} = await ssr(url, browserWSEndpoint);

  return res.status(200).send(html);
});

ssr.mjs

import puppeteer from 'puppeteer';

/**
 * @param {string} url URL to prerender.
 * @param {string} browserWSEndpoint Optional remote debugging URL. If
 *     provided, Puppeteer reconnects to the browser instance. Otherwise,
 *     a new browser instance is launched.
 */
async function ssr(url, browserWSEndpoint) {
  ...
  console.info('Connecting to existing Chrome instance.');
  const browser = await puppeteer.connect({browserWSEndpoint});

  const page = await browser.newPage();
  ...
  await page.close(); // Close the page we opened here (not the browser).

  return {html};
}
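
One thing to watch in server.mjs is that two requests arriving at almost the same time can both see browserWSEndpoint as null and each launch a browser. A possible way around this, shown here only as a sketch and not as part of the original article, is to cache the launch promise instead of the resolved endpoint. createEndpointProvider and fakeLaunch are hypothetical names, and the WebSocket URL is made up; a real launcher would call puppeteer.launch() and return browser.wsEndpoint().

```javascript
// Hypothetical helper: wraps an async launcher so that concurrent callers share
// one in-flight launch instead of each starting their own browser.
function createEndpointProvider(launch) {
  let endpointPromise = null;
  return function getEndpoint() {
    if (!endpointPromise) {
      endpointPromise = launch().catch(err => {
        endpointPromise = null; // Allow a retry after a failed launch.
        throw err;
      });
    }
    return endpointPromise;
  };
}

// Stand-in launcher for illustration only; it counts how often it is called
// and returns a made-up endpoint URL.
let launches = 0;
const fakeLaunch = async () => {
  launches += 1;
  return `ws://127.0.0.1:9222/devtools/browser/fake-${launches}`;
};

const getEndpoint = createEndpointProvider(fakeLaunch);
Promise.all([getEndpoint(), getEndpoint()]).then(([first, second]) => {
  console.log(first === second, launches);
});
```

Both callers receive the same endpoint and the launcher runs only once. In the route handler, the if block could then be replaced by a single await on getEndpoint().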

That is the end of the middle part. The next and final part covers running pre-rendering on a schedule and other notes. Stay tuned.

