Announcement
Bad news for beloved PhantomJS:
Headless mode allows running Chromium in a headless/server environment. Expected use cases include loading web pages, extracting metadata (e.g., the DOM) and generating bitmaps from page contents -- using all the modern web platform features provided by Chromium and Blink.
To use headless, start Chrome with a command line flag:
$ chrome --headless --remote-debugging-port=9222 https://chromium.org
Chrome Platform Status
Puppeteer
Puppeteer is a Node library which provides a high-level API to control headless Chrome or Chromium over the DevTools Protocol. It can also be configured to use full (non-headless) Chrome or Chromium.GitHub
Usage from Node.js
For example, the chrome-remote-interface Node.js package can be used to extract a page's DOM like this:
const CDP = require('chrome-remote-interface');
CDP((client) => {
// Extract used DevTools domains.
const {Page, Runtime} = client;
// Enable events on domains we are interested in.
Promise.all([
Page.enable()
]).then(() => {
return Page.navigate({url: 'https://example.com'});
});
// Evaluate outerHTML after page has loaded.
Page.loadEventFired(() => {
Runtime.evaluate({expression: 'document.body.outerHTML'}).then((result) => {
console.log(result.result.value);
client.close();
});
});
}).on('error', (err) => {
console.error('Cannot connect to browser:', err);
});
Google Source
Comments
Post a Comment