Structured product data underpins many e-commerce features, from price comparisons to recommendation systems. One tool for extracting and processing this data is Amazon Bedrock, a fully managed AWS service that provides API access to foundation models for building AI applications. In this blog post, we’ll explore how to use Node.js to extract product data from a webpage, using a product page on the e-commerce site Noel Leeming as our example.
We’ll focus on the LG 65″ UR78 4K Smart UHD TV product page, found at this URL: https://www.noelleeming.co.nz/p/lg-65-ur78-4k-smart-uhd-tv-2023/N220470.html.
Prerequisites
Before we dive in, make sure you have the following:
- Node.js installed on your machine.
- An AWS account with access to Amazon Bedrock.
- Basic knowledge of JavaScript and web scraping.
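Step 1: Installing Dependencies
From your project directory, install the required packages by running: npm install axios cheerio aws-sdk
Here, we’re installing axios for making HTTP requests, cheerio for parsing HTML, and aws-sdk for interacting with Amazon Bedrock.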
Step 2: Fetching the Webpage
Next, we’ll write a script to fetch the HTML content of the product page. Create a file named index.js and add the following code:
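// axios makes the HTTP request; cheerio will parse the HTML in a later step.
const axios = require('axios');
const cheerio = require('cheerio');

const url = 'https://www.noelleeming.co.nz/p/lg-65-ur78-4k-smart-uhd-tv-2023/N220470.html';

// Fetch the raw HTML of the page. Returns null if the request fails.
async function fetchHTML(url) {
  try {
    const { data } = await axios.get(url);
    return data;
  } catch (error) {
    console.error('Error fetching the HTML:', error);
    return null;
  }
}

// Print the HTML only when the request succeeded.
fetchHTML(url).then((html) => {
  if (html) {
    console.log(html);
  }
});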
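Real product pages sometimes time out or reject requests that lack a browser-like User-Agent, so a slightly more defensive version of fetchHTML can be useful. The following is a minimal sketch, not part of the original script: the function name fetchHTMLWithRetry, the retry count, the delay, the timeout, and the User-Agent string are all assumptions chosen for illustration, and the get option is a hypothetical hook that lets us swap in a fake HTTP client for testing.

```javascript
// Sketch: fetch a page with a timeout, a User-Agent header, and simple retries.
// `get` is a hypothetical injection point; by default it falls back to axios.get.
async function fetchHTMLWithRetry(url, options = {}) {
  const {
    get = (u, config) => require('axios').get(u, config),
    retries = 2,       // assumed: extra attempts after the first failure
    delayMs = 500,     // assumed: base delay between attempts, in milliseconds
    timeoutMs = 10000, // assumed: per-request timeout passed to axios
  } = options;

  // Standard axios request config: `timeout` and `headers`.
  const config = {
    timeout: timeoutMs,
    headers: { 'User-Agent': 'Mozilla/5.0 (compatible; product-data-demo/1.0)' },
  };

  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      const { data } = await get(url, config);
      return data;
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        // Wait a little longer after each failed attempt (linear backoff).
        await new Promise((resolve) => setTimeout(resolve, delayMs * (attempt + 1)));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

Calling fetchHTMLWithRetry(url) in place of fetchHTML(url) returns the same HTML string on success, but it throws after the final failed attempt instead of returning nothing, so the caller should add a .catch() handler.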