
Error On Line 1: Content Is Not Allowed In Prolog

I am trying to scrape a table of price data from this website using the following code; function scrapeData() { // Retrieve table as a string using Parser. var url = 'https://stooq

Solution 1:

How about this modification?

Modification points:

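  • The retrieved HTML has to be modified before it can be parsed. For example, in the text returned by var content = UrlFetchApp.fetch(url).getContentText(), the attribute values are not enclosed in quotes, which is not valid XML. The values need to be quoted before the text is passed to XmlService.
  • There is a merged column in the header, so the header row has one cell fewer than each data row.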
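The first point can be tried outside Apps Script. Below is a minimal sketch in plain JavaScript that applies the same two replacements as the modified script to a made-up fragment. The helper name quoteAttributes and the sample markup are assumptions for illustration only, not the actual HTML served by stooq.com.

```javascript
// Hypothetical helper: wraps unquoted attribute values in double quotes
// and removes the valueless "nowrap" attribute, which XML does not allow.
function quoteAttributes(html) {
  return html
    .replace(/=([a-zA-Z0-9\%-:]+)/g, '="$1"')
    .replace(/nowrap/g, "");
}

// Made-up sample cell, not the real page markup.
var sample = '<td id=t1 align=center nowrap>1.5</td>';
var fixed = quoteAttributes(sample);
console.log(fixed);
```

Note that the second replacement removes every occurrence of the word nowrap, including any in cell text, which is acceptable for a table of prices but may not be for other pages.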

When the above points are reflected in the script, it becomes as follows.

Modified script:

function scrapeData() {
  // Retrieve table as a string using Parser.
  var url = "https://stooq.com/q/d/?s=barc.uk&i=d";
  var fromText = '#d9d9d9}</style>';
  var toText = '<table';
  var content = UrlFetchApp.fetch(url).getContentText();
  var scraped = Parser.data(content).from(fromText).to(toText).build();

  // Modify values
  scraped = scraped.replace(/=([a-zA-Z0-9\%-:]+)/g, "=\"$1\"").replace(/nowrap/g, "");

  // Parse table using XmlService.
  var root = XmlService.parse(scraped).getRootElement();

  // Retrieve header and modify it.
  var headerTr = root.getChild("thead").getChildren();
  var res = headerTr.map(function(e) {return e.getChildren().map(function(f) {return f.getValue()})});
  res[0].splice(7, 0, "Change");

  // Retrieve values.
  var valuesTr = root.getChild("tbody").getChildren();
  var values = valuesTr.map(function(e) {return e.getChildren().map(function(f) {return f.getValue()})});
  Array.prototype.push.apply(res, values);

  // Put the result to the active spreadsheet.
  var ss = SpreadsheetApp.getActiveSheet();
  ss.getRange(1, 1, res.length, res[0].length).setValues(res);
}
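The splice call in the script is there because setValues needs every row to have the same number of cells. Below is a minimal sketch of that step in plain JavaScript. The header labels and the data row are made-up placeholders, not values taken from the page.

```javascript
// Made-up header row: "Change" is a single merged cell covering two data columns.
var header = ["No.", "Date", "Open", "High", "Low", "Close", "Change", "Volume"];
// Made-up data row with one value per actual column (nine in total).
var row = ["1", "2 Jan 2019", "1.0", "1.2", "0.9", "1.1", "+1.0%", "+0.01", "1000"];

// Insert a second "Change" label at index 7 so the header matches the row width.
header.splice(7, 0, "Change");

console.log(header.length === row.length);
```

If the merged cell sits at a different position on another page, the index passed to splice has to change with it.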

Note:

  • Before you run this modified script, please install the Parser library for Google Apps Script.
  • This modified script does not work for arbitrary URLs. It is written for the URL in your question. If you want to retrieve values from another URL, please modify the script accordingly.


If this was not what you want, I'm sorry.

