Headless Chrome: a solution for server-side rendering of JS sites


Link to original article: https://developers.google.com/web/tools/puppeteer/articles/ssr

Note: My English is limited, so this is not a word-for-word translation. You may prefer to read the original article directly.

Tip: A headless browser can serve as an alternative to server-side rendering by converting a JS site into static HTML pages on the server. Running a headless browser on a web server lets you prerender modern JS applications, which makes content load faster and makes the site more SEO friendly.

The techniques in this article show how to use Puppeteer, the Node library for driving headless Chrome, to add server-side rendering (SSR) capabilities to an Express web server. The best part is that the application itself needs almost no code changes: Puppeteer does the heavy lifting, and with a few lines of code you can render almost any page on the server side.

Here is a preview of the code involved:

import puppeteer from 'puppeteer';

async function ssr(url) {
  const browser = await puppeteer.launch({headless: true});
  const page = await browser.newPage();
  await page.goto(url, {waitUntil: 'networkidle0'});
  const html = await page.content(); // serialized HTML of the page
  await browser.close();
  return html;
}

Note: The code in this article uses ES modules, which require Node 8.5+ started with the --experimental-modules flag.

Introduction

If SEO brought you here, you are probably reading this article for one of two reasons. First, you have built a web application that search engines are not indexing. It may be a SPA, a PWA, or something built on another stack; the technology does not really matter. What matters is that you have spent a lot of time building a great app and users cannot find it. Second, you may have read elsewhere that server-side rendering can improve performance, and you are here for a quick way to reduce JavaScript startup cost and improve first-screen rendering.

Tip: Some frameworks, such as Preact, already support server-side rendering. If your framework has its own server-side rendering solution, stick with it; there is no need to introduce a new tool.

Crawling modern web applications

Search engine crawlers have traditionally relied on static HTML markup, but modern web applications have grown more complex. In a JavaScript-based application, most content is rendered on the client by JS, so it can be invisible to crawlers. Some crawlers, such as Google's, have become smarter: Google's crawler uses Chrome 41 to execute JavaScript and obtain the final page, but that approach is still immature and imperfect. For example, pages that use newer ES6 features can cause JS errors in that older browser. As for other search engines, who knows how they handle it?
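
To make the problem concrete, here is a minimal sketch of what a crawler that does not execute JavaScript might extract from a client-rendered page. The visibleText helper and both sample pages are assumptions made for illustration; real crawlers parse HTML far more carefully than these regular expressions do.

```javascript
// Hypothetical helper: approximates the text a non-JS crawler could index.
// It drops <script> blocks, comments, and tags, then collapses whitespace.
function visibleText(html) {
  return html
    .replace(/<script[\s\S]*?<\/script>/gi, '')
    .replace(/<!--[\s\S]*?-->/g, '')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
}

// Client-rendered page: the container stays empty until the script runs.
const clientRendered = `
<html><body>
  <div id="container"><!-- Populated by JS. --></div>
</body>
<script>/* fetches posts and fills #container */</script>
</html>`;

// The same page after prerendering: the markup is already in the HTML.
const prerendered = `
<html><body>
  <div id="container"><ul id="posts"><li class="post"><h2>Title 1</h2></li></ul></div>
</body></html>`;

console.log(JSON.stringify(visibleText(clientRendered)));
console.log(JSON.stringify(visibleText(prerendered)));
```

The first call yields an empty string, because everything the user would see is produced later by the script. The second yields the post title, because the content is present in the static markup.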

Headless Chrome Pre-rendered pages

All crawlers understand HTML, so the problem to solve is how to execute the JS and produce HTML. What if I told you there was a tool that does the following?

  1. It knows how to run all types of JavaScript and output static HTML.
  2. It is continuously updated as the web adds new features.
  3. It can be added to an existing application quickly, with a few settings and little to no code changes.

Sounds good, right? That tool is the browser!

Headless Chrome doesn't care what libraries, frameworks, or toolchains you use; it eats JavaScript for breakfast and spits out static HTML before lunch. Well, hopefully a lot faster than that. - Eric

If you use Node, Puppeteer is a relatively simple way to drive headless Chrome. Its API makes it possible to take a client-side application and prerender (server-side render) its markup. Here is a simple example.

1. JS application

We start with a dynamic page that generates its HTML with JS:

public/index.html

<html>
<body>
  <div id="container">
    <!-- Populated by the JS below. -->
  </div>
</body>
<script>
function renderPosts(posts, container) {
  const html = posts.reduce((html, post) => {
    return `${html}
      <li class="post">
        <h2>${post.title}</h2>
        <div class="summary">${post.summary}</div>
        <p>${post.content}</p>
      </li>`;
  }, '');

  // CAREFUL: assumes html is sanitized.
  container.innerHTML = `<ul id="posts">${html}</ul>`;
}

(async() => {
  const container = document.querySelector('#container');
  const posts = await fetch('/posts').then(resp => resp.json());
  renderPosts(posts, container);
})();
</script>
</html>
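
The CAREFUL comment in renderPosts matters: if /posts returns untrusted text, interpolating it into innerHTML allows markup injection. Below is a minimal sketch of one way to escape post fields before building the markup. The escapeHtml and renderPostsHtml names, and the shape of the sample post, are assumptions for illustration and are not part of the original article.

```javascript
// Hypothetical helper: escapes the characters that are significant in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Same structure as renderPosts, but returns a string and escapes each field.
function renderPostsHtml(posts) {
  const items = posts.map(post => `
      <li class="post">
        <h2>${escapeHtml(post.title)}</h2>
        <div class="summary">${escapeHtml(post.summary)}</div>
        <p>${escapeHtml(post.content)}</p>
      </li>`).join('');
  return `<ul id="posts">${items}</ul>`;
}

// Assumed shape of one entry in the /posts response.
const samplePosts = [
  {title: 'Title <1>', summary: 'Tom & Jerry', content: '<img src=x onerror=alert(1)>'},
];

console.log(renderPostsHtml(samplePosts));
```

In the page itself, the result of renderPostsHtml(posts) could be assigned to container.innerHTML in place of the unescaped string.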

2. SSR (server-side rendering) function

Next, a simple implementation of the ssr function:

ssr.mjs

import puppeteer from 'puppeteer';

// In-memory cache. Key: URL, value: rendered HTML.
const RENDER_CACHE = new Map();

async function ssr(url) {
  if (RENDER_CACHE.has(url)) {
    return {html: RENDER_CACHE.get(url), ttRenderMs: 0};
  }

  const start = Date.now();

  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  try {
    // networkidle0 waits until there have been no network connections for at least 500 ms.
    // The page's JS has likely produced markup by this point, but wait longer
    // if your site lazy loads, etc.
    await page.goto(url, {waitUntil: 'networkidle0'});
    // Wait until #posts is present in the DOM; resolves immediately if it already exists.
    await page.waitForSelector('#posts');
  } catch (err) {
    console.error(err);
    await browser.close(); // Don't leak the browser process when rendering fails.
    throw new Error('page.goto/waitForSelector timed out.');
  }

  const html = await page.content(); // Serialized HTML of the page DOM.
  await browser.close();

  const ttRenderMs = Date.now() - start;
  console.info(`Headless rendered page in: ${ttRenderMs}ms`);

  RENDER_CACHE.set(url, html); // cache rendered page.

  return {html, ttRenderMs};
}

export {ssr as default};

Key points in the code:

  1. Caching. Caching the rendered HTML is the most effective way to speed up responses, because a repeated request for the same URL does not run headless Chrome again. Other optimizations are discussed later.
  2. Error handling for page load timeouts.
  3. A call to page.waitForSelector('#posts'), which ensures that the element with id posts exists in the DOM before the page is serialized. (Puppeteer offers several other waitFor* methods, if you are interested.)
  4. Timing. The code measures how long headless Chrome takes to render the page.
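
The effect of the cache in point 1 can be sketched without launching a browser. In the example below, fakeRender is a stand-in (an assumption for illustration) for the headless Chrome render, and cachedRender mirrors the RENDER_CACHE check in ssr: the second request for the same URL returns the stored HTML and reports a render time of 0.

```javascript
// Mirrors the caching logic of ssr(), with the renderer passed in so the
// behavior can be observed without launching headless Chrome.
const cache = new Map(); // key: URL, value: rendered HTML

async function cachedRender(url, render) {
  if (cache.has(url)) {
    return {html: cache.get(url), ttRenderMs: 0};
  }
  const start = Date.now();
  const html = await render(url);
  const ttRenderMs = Date.now() - start;
  cache.set(url, html); // Store the result for later requests.
  return {html, ttRenderMs};
}

// Hypothetical stand-in for the real render; counts how often it runs.
let renderCalls = 0;
async function fakeRender(url) {
  renderCalls += 1;
  return `<html><body>Rendered ${url}</body></html>`;
}

async function demo() {
  const url = 'http://localhost:8080/index.html';
  await cachedRender(url, fakeRender);                // cache miss: renders
  const second = await cachedRender(url, fakeRender); // cache hit: no render
  console.log(`render calls: ${renderCalls}, cached ttRenderMs: ${second.ttRenderMs}`);
}

demo();
```

Note that a plain Map never evicts entries, so with this approach a changed page is not re-rendered until the process restarts.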

3. Web server code

Finally, an Express server ties it all together. Go straight to the code; it is commented.

server.mjs

import express from 'express';
import ssr from './ssr.mjs';

const app = express();

app.get('/', async (req, res, next) => {
  try {
    // Call the ssr function with the page URL; headless Chrome renders the page
    // and returns the resulting HTML.
    const {html, ttRenderMs} = await ssr(`${req.protocol}://${req.get('host')}/index.html`);
    // Add Server-Timing! See https://w3c.github.io/server-timing/.
    res.set('Server-Timing', `Prerender;dur=${ttRenderMs};desc="Headless render time (ms)"`);
    return res.status(200).send(html); // Serve prerendered page as response.
  } catch (err) {
    return next(err); // Let Express's error handler respond if rendering fails.
  }
});

app.listen(8080, () => console.log('Server started. Press Ctrl+C to quit'));
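
The Server-Timing value set in the route follows the pattern name;dur=<ms>;desc="<text>". As a small illustration, the hypothetical helper below (serverTimingValue is an assumed name, not an Express or Puppeteer API) builds that string. The 1280 ms figure is an invented example value, not a measurement.

```javascript
// Hypothetical helper: formats one Server-Timing metric.
// The description is quoted; double quotes and backslashes inside it are escaped.
function serverTimingValue(name, durMs, desc) {
  const safeDesc = String(desc).replace(/(["\\])/g, '\\$1');
  return `${name};dur=${durMs};desc="${safeDesc}"`;
}

// A fresh render versus a cache hit (ssr() reports ttRenderMs: 0 on a hit).
console.log(serverTimingValue('Prerender', 1280, 'Headless render time (ms)'));
console.log(serverTimingValue('Prerender', 0, 'Headless render time (ms)'));
```

A duration of 0 in the header is therefore a quick way to tell that a response came from the cache.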

The HTML received in the response should then look like this:

<html>
<body>
  <div id="container">
    <ul id="posts">
      <li class="post">
        <h2>Title 1</h2>
        <div class="summary">Summary 1</div>
        <p>post content 1</p>
      </li>
      <li class="post">
        <h2>Title 2</h2>
        <div class="summary">Summary 2</div>
        <p>post content 2</p>
      </li>
      ...
    </ul>
  </div>
</body>
<script>
...
</script>
</html>

This concludes the first part. Stay tuned for the middle and final parts.

