Link previews transform bare URLs into rich cards with images, titles, and descriptions. You've seen them everywhere: Slack, Twitter, Discord, iMessage. They make links more engaging and give users context before clicking.
In this guide, we'll build a complete link preview service that generates thumbnail screenshots, extracts metadata, and caches results for performance.
What We're Building
Our service will:

Accept a URL as input
Fetch the page's Open Graph metadata (title, description, image)
Capture a fallback screenshot when the page has no Open Graph image
Cache results so repeat lookups are fast

The response is a JSON payload like this:

{
  "url": "https://example.com",
  "title": "Example Domain",
  "description": "This domain is for use in illustrative examples.",
  "image": "https://your-cdn.com/previews/abc123.png",
  "favicon": "https://example.com/favicon.ico"
}
Architecture Overview
We'll use a simple Node.js server with three main components:
Metadata extractor: Fetches and parses Open Graph tags
Screenshot service: Captures page thumbnails via API
Cache layer: Stores generated previews in Redis
Step 1: Setting Up the Server
A basic Express server with a Redis client for caching:

import express from 'express';
import { createClient } from 'redis';

const app = express();
const redis = createClient();
await redis.connect();

app.get('/preview', async (req, res) => {
  const { url } = req.query;

  if (!url) {
    return res.status(400).json({ error: 'URL is required' });
  }

  try {
    const preview = await getPreview(url);
    res.json(preview);
  } catch (error) {
    res.status(500).json({ error: 'Failed to generate preview' });
  }
});

app.listen(3000, () => {
  console.log('Preview service running on port 3000');
});
Step 2: Extracting Metadata
Open Graph tags live in the <head> of HTML documents. We'll fetch the page and parse them with Cheerio:
import fetch from 'node-fetch';
import * as cheerio from 'cheerio';

async function extractMetadata(url) {
  const response = await fetch(url, {
    headers: { 'User-Agent': 'LinkPreviewBot/1.0' },
    timeout: 5000 // node-fetch v2 option; with v3, use an AbortController instead
  });
  const html = await response.text();
  const $ = cheerio.load(html);

  // Extract Open Graph tags
  const getMetaContent = (property) => {
    return $(`meta[property="${property}"]`).attr('content') ||
      $(`meta[name="${property}"]`).attr('content');
  };

  return {
    title: getMetaContent('og:title') || $('title').text() || '',
    description: getMetaContent('og:description') || getMetaContent('description') || '',
    image: getMetaContent('og:image') || '',
    siteName: getMetaContent('og:site_name') || '',
    favicon: $('link[rel="icon"]').attr('href') ||
      $('link[rel="shortcut icon"]').attr('href') ||
      new URL('/favicon.ico', url).href
  };
}
Always set a descriptive User-Agent header. Some sites block requests without one, and it helps site owners understand their traffic.
Step 3: Capturing Screenshots
When a page lacks an Open Graph image, we'll capture a screenshot instead. This is where the Allscreenshots API shines: there's no browser to install or manage:
async function captureScreenshot(url) {
  const response = await fetch('https://api.allscreenshots.com/v1/screenshots', {
    method: 'POST',
    headers: {
      'X-API-Key': process.env.SCREENSHOT_API_KEY,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      url: url,
      width: 1200,
      height: 630, // Standard OG image ratio
      format: 'png',
      delay: 1000, // Wait for content to load
      blockAds: true
    })
  });

  if (!response.ok) {
    throw new Error('Screenshot capture failed');
  }

  const imageBuffer = await response.arrayBuffer();

  // Upload to your CDN/storage
  const imageUrl = await uploadToStorage(imageBuffer);
  return imageUrl;
}
The 1200x630 dimensions match the standard Open Graph image ratio, so previews will look correct when shared on social platforms.
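The `uploadToStorage` call above is left to your infrastructure. As a placeholder, here's a sketch that content-addresses the image and writes it to local disk; the CDN URL it returns is illustrative, and you'd swap the `writeFile` call for your real object-storage client:

```javascript
import { createHash } from 'node:crypto';
import { mkdir, writeFile } from 'node:fs/promises';

// Placeholder for uploadToStorage: content-address the PNG and write it to
// local disk. The returned URL is illustrative, not a real endpoint.
async function uploadToStorage(imageBuffer) {
  const buffer = Buffer.from(imageBuffer);
  // Hash-based filename: identical screenshots dedupe automatically
  const name = createHash('sha256').update(buffer).digest('hex').slice(0, 16) + '.png';
  await mkdir('./previews', { recursive: true });
  await writeFile(`./previews/${name}`, buffer);
  return `https://your-cdn.com/previews/${name}`;
}
```

Content-addressing means re-capturing an unchanged page never produces a duplicate file.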
Step 4: Caching Results
Link previews rarely change, so aggressive caching makes sense. We'll cache for 24 hours:
const CACHE_TTL = 60 * 60 * 24; // 24 hours in seconds

async function getPreview(url) {
  // Normalize URL for consistent cache keys
  const cacheKey = `preview:${normalizeUrl(url)}`;

  // Check cache first
  const cached = await redis.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }

  // Generate fresh preview
  const metadata = await extractMetadata(url);

  // Capture screenshot if no OG image
  let image = metadata.image;
  if (!image) {
    image = await captureScreenshot(url);
  }

  const preview = {
    url,
    title: metadata.title,
    description: metadata.description,
    image,
    favicon: metadata.favicon,
    siteName: metadata.siteName,
    generatedAt: new Date().toISOString()
  };

  // Store in cache
  await redis.setEx(cacheKey, CACHE_TTL, JSON.stringify(preview));

  return preview;
}

function normalizeUrl(url) {
  const parsed = new URL(url);
  // Remove trailing slashes, lowercase hostname
  return `${parsed.protocol}//${parsed.hostname.toLowerCase()}${parsed.pathname.replace(/\/$/, '')}`;
}
Consider adding a cache-busting mechanism for site owners who update their pages. A simple ?refresh=true parameter works well.
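A minimal sketch of that refresh parameter, factored into a helper so the cache logic is easy to test. `normalizeUrl` mirrors the Step 4 helper, and `maybeBustCache` is a hypothetical name; `redisClient` can be any object with a `del(key)` method, such as the node-redis client:

```javascript
// Mirrors the normalizeUrl helper from the caching step
function normalizeUrl(url) {
  const parsed = new URL(url);
  return `${parsed.protocol}//${parsed.hostname.toLowerCase()}${parsed.pathname.replace(/\/$/, '')}`;
}

// Hypothetical helper: delete the cached preview when ?refresh=true is passed.
// Returns true if the cache entry was busted.
async function maybeBustCache(redisClient, url, query) {
  if (query.refresh !== 'true') return false;
  await redisClient.del(`preview:${normalizeUrl(url)}`);
  return true;
}
```

In the route handler, call `maybeBustCache(redis, url, req.query)` before `getPreview(url)`; the next `getPreview` call then regenerates and re-caches the entry.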
Step 5: Error Handling
Real-world URLs fail in many ways. Handle them gracefully:
async function getPreview(url) {
  try {
    // Validate URL
    new URL(url);
  } catch {
    throw new Error('Invalid URL');
  }

  // Check for blocked domains
  const blocked = ['localhost', '127.0.0.1', '0.0.0.0'];
  const hostname = new URL(url).hostname;
  if (blocked.some(b => hostname.includes(b))) {
    throw new Error('URL not allowed');
  }

  try {
    const metadata = await extractMetadata(url);
    // ... rest of logic
  } catch (error) {
    if (error.code === 'ETIMEDOUT') {
      throw new Error('Request timed out');
    }
    if (error.code === 'ENOTFOUND') {
      throw new Error('Domain not found');
    }
    throw error;
  }
}
Always validate and sanitize URLs before processing. Server-side request forgery (SSRF) attacks can use your preview service to probe internal networks.
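The hostname blocklist above is only a first line of defense, since an attacker can point a public DNS name at an internal address. A stronger check resolves the hostname and rejects private addresses. This is a sketch with an illustrative (not exhaustive) IPv4 range list:

```javascript
import { lookup } from 'node:dns/promises';

// Returns true for loopback, RFC 1918 private, and link-local IPv4 addresses
function isPrivateIPv4(ip) {
  const [a, b] = ip.split('.').map(Number);
  return (
    a === 10 ||
    a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254) // link-local, incl. cloud metadata endpoints
  );
}

// Sketch: resolve the hostname and reject URLs landing on private addresses
async function assertPublicUrl(url) {
  const { hostname } = new URL(url);
  const { address } = await lookup(hostname, { family: 4 });
  if (isPrivateIPv4(address)) {
    throw new Error('URL not allowed');
  }
}
```

Note that DNS rebinding can still defeat a check-then-fetch pattern; for full protection, pin the resolved IP when making the actual request.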
Step 6: Rate Limiting
Protect your service from abuse with rate limiting:
import rateLimit from 'express-rate-limit';

const limiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 100,            // 100 requests per minute
  message: { error: 'Too many requests' }
});

app.use('/preview', limiter);
Performance Optimizations
A few tweaks to make the service faster:
Parallel Requests
Fetch metadata and screenshots concurrently when possible:
async function getPreview(url) {
  // Start the screenshot speculatively so it runs while metadata is fetched.
  // Trade-off: this spends a screenshot API call even when the page turns out
  // to have its own Open Graph image.
  const screenshotPromise = captureScreenshot(url);
  screenshotPromise.catch(() => {}); // avoid an unhandled rejection if unused

  const metadata = await extractMetadata(url);
  if (metadata.image) {
    return buildPreview(url, metadata);
  }

  const screenshot = await screenshotPromise;
  return buildPreview(url, { ...metadata, image: screenshot });
}
Preemptive Caching
For frequently requested URLs, refresh the cache before it expires:
async function getPreview(url) {
  const cacheKey = `preview:${normalizeUrl(url)}`;
  const cached = await redis.get(cacheKey);

  if (cached) {
    const preview = JSON.parse(cached);
    const age = Date.now() - new Date(preview.generatedAt).getTime();

    // Refresh in background if cache is getting stale
    if (age > CACHE_TTL * 0.8 * 1000) {
      refreshPreview(url).catch(() => {}); // Fire and forget
    }

    return preview;
  }

  // Generate fresh preview
  // ...
}
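`refreshPreview` isn't defined above; here's a sketch of what it might look like, assuming the `extractMetadata`, `captureScreenshot`, `normalizeUrl`, `redis`, and `CACHE_TTL` pieces from the earlier steps are in scope:

```javascript
// Sketch of the background refresh helper used above: regenerate the preview
// and overwrite the cache entry. Assumes extractMetadata, captureScreenshot,
// normalizeUrl, redis, and CACHE_TTL from the earlier steps.
async function refreshPreview(url) {
  const metadata = await extractMetadata(url);
  const image = metadata.image || await captureScreenshot(url);

  const preview = {
    url,
    title: metadata.title,
    description: metadata.description,
    image,
    favicon: metadata.favicon,
    siteName: metadata.siteName,
    generatedAt: new Date().toISOString()
  };

  await redis.setEx(`preview:${normalizeUrl(url)}`, CACHE_TTL, JSON.stringify(preview));
  return preview;
}
```

Because callers fire-and-forget it, any error here should be swallowed or logged rather than propagated to a request.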
You now have a production-ready link preview service. The key ingredients:
Metadata extraction for titles and descriptions
Screenshot fallbacks for pages without OG images
Aggressive caching to minimize API calls
Error handling for the messy real world
From here, you could add support for Twitter Cards, oEmbed providers like YouTube, or image resizing for different use cases. The foundation is solid, so build on it.
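As a starting point for the Twitter Cards idea: the fallback chain from Step 2 only needs a couple of extra lookups, and `getMetaContent` already checks both `property` and `name` attributes, which is where `twitter:*` tags live. The helper below is a hypothetical refactor for illustration, not code from the service above:

```javascript
// Hypothetical extension: add twitter:title / twitter:image to the fallback
// chain. getMeta stands in for the getMetaContent helper from Step 2.
function resolveCardFields(getMeta, fallbackTitle) {
  return {
    title: getMeta('og:title') || getMeta('twitter:title') || fallbackTitle || '',
    image: getMeta('og:image') || getMeta('twitter:image') || ''
  };
}
```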