
Case Study

Upwork Scout - Job Scraper Bot

Telegram assistant that surfaces freelance gigs in real time

A scraping and automation bot that monitors Upwork job listings and delivers filtered gig alerts through Telegram. Built to reduce search overhead and surface relevant opportunities shortly after posting.

2/1/2024 | Updated 8/1/2024 | 7 min read
Node.js
Puppeteer
MongoDB
Telegram Bot API
Docker

Project Overview

The Challenge

Searching Upwork manually was slow and inconsistent, which made it easy to miss high-fit jobs early. I needed a reliable assistant that could detect and forward relevant gigs without my having to keep checking the dashboard.

The Solution

I built a Node.js + Puppeteer bot with keyword filtering, deduplication, and Telegram delivery. The bot uses scheduled scraping intervals, resilient retry logic, and MongoDB persistence to maintain state across runs.
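As an illustration of the keyword filtering step, here is a minimal sketch. It assumes each scraped job is a plain object with `title` and `description` fields; the `filter` config shape and the `matchesFilter` name are hypothetical and not taken from the project's code.

```javascript
// Hypothetical filter config; the field names are assumptions for illustration.
const filter = {
  include: ["puppeteer", "web scraping", "node.js"],
  exclude: ["wordpress"],
};

// Returns true when the job mentions at least one include keyword
// and none of the exclude keywords (case-insensitive substring match).
function matchesFilter(job, { include, exclude }) {
  const haystack = `${job.title} ${job.description}`.toLowerCase();
  const hit = (keyword) => haystack.includes(keyword.toLowerCase());
  return include.some(hit) && !exclude.some(hit);
}
```

Substring matching is the simplest option; a real filter might also want word boundaries or per-keyword weights.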

Impact

  • Surfaced matching gigs shortly after publication
  • Reduced manual search time and notification lag
  • Created reusable scraping + alert pipeline patterns
  • Open-sourced implementation for freelancer tooling reuse

Key Metrics

  • Delivery Channel: Telegram (instant push notifications)
  • Persistence: MongoDB (deduplication and state tracking)
  • Runtime: Scheduled (predictable scan intervals)

Technical Implementation

Architecture

The bot uses Puppeteer for listing extraction, MongoDB for persistent deduplication, and Telegram Bot API for delivery. A scheduler controls scan cadence while retry + backoff logic handles transient scraping failures.
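The scheduler and retry + backoff flow described above could look roughly like the sketch below. All names (`backoffDelay`, `withRetry`, `startScheduler`, `scanOnce`) and the default timings are assumptions for illustration; the project's actual values are not stated.

```javascript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at capMs.
function backoffDelay(attempt, baseMs = 1000, capMs = 30000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs `task` up to `maxAttempts` times, sleeping between failed attempts.
// Rethrows the last error if every attempt fails.
async function withRetry(task, { maxAttempts = 4, baseMs = 1000, capMs = 30000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelay(attempt, baseMs, capMs));
      }
    }
  }
  throw lastError;
}

// Hypothetical scheduler: `scanOnce` stands in for one scrape-filter-notify cycle.
function startScheduler(scanOnce, intervalMs) {
  return setInterval(() => {
    withRetry(scanOnce).catch((err) => console.error("scan failed:", err.message));
  }, intervalMs);
}
```

With the defaults, the waits between attempts are 1000 ms, 2000 ms, and 4000 ms. Note that `setInterval` does not wait for a slow scan to finish, so a cron-triggered run or a guard flag may be a safer choice when scans can exceed the interval.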

Technology Stack

  • Frontend: Telegram Chat Interface, Message Formatting, Filter Commands
  • Backend: Node.js, Puppeteer, Scheduler, Retry Logic
  • Database: MongoDB, Job Deduplication, State Tracking
  • Infrastructure: Docker, Cron Scheduling, Logging
  • Tools: Puppeteer Stealth, Telegram Bot API, GitHub, Monitoring Alerts

Key Features

  • Automated Upwork listing extraction
  • Keyword-based relevance filtering
  • Duplicate job prevention via persistence layer
  • Near-real-time Telegram notifications
  • Resilient retry and scheduler flow
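To make the Telegram notification feature concrete, the sketch below builds a payload for the Bot API's `sendMessage` method using HTML parse mode. The job fields (`title`, `budget`, `url`) and the function names are assumptions; `sendAlert` relies on the global `fetch` available in Node.js 18 and later.

```javascript
// Escape the characters that must be entity-encoded in Telegram's HTML parse mode.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the JSON body for sendMessage. The message layout is illustrative only.
function buildAlert(chatId, job) {
  const lines = [
    `<b>${escapeHtml(job.title)}</b>`,
    escapeHtml(job.budget),
    `<a href="${escapeHtml(job.url)}">Open on Upwork</a>`,
  ];
  return { chat_id: chatId, parse_mode: "HTML", text: lines.join("\n") };
}

// Send the payload to the Bot API; `token` is the bot token issued by BotFather.
async function sendAlert(token, payload) {
  const res = await fetch(`https://api.telegram.org/bot${token}/sendMessage`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  if (!res.ok) throw new Error(`Telegram API returned ${res.status}`);
  return res.json();
}
```

Escaping matters here because job titles are untrusted text: an unescaped `<` would make Telegram reject the message as malformed HTML.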

Challenges and Solutions

Challenge

Preventing noisy or duplicate alerts

Solution

Added persistent fingerprinting in MongoDB and structured keyword filters so only useful, new opportunities were delivered.
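One way to fingerprint a listing is to hash a normalized key, as in the sketch below. The choice of fields (URL without query string, plus a lowercased title) is an assumption, and `SeenStore` is an in-memory stand-in for the MongoDB layer, where a unique index on the fingerprint field could serve the same purpose.

```javascript
const { createHash } = require("node:crypto");

// Fingerprint a job by its URL (query string and fragment dropped) and normalized title.
function fingerprint(job) {
  const url = new URL(job.url);
  const key = `${url.origin}${url.pathname}|${job.title.trim().toLowerCase()}`;
  return createHash("sha256").update(key).digest("hex");
}

// In-memory stand-in for the persistence layer (hypothetical; not the project's schema).
class SeenStore {
  constructor() {
    this.seen = new Set();
  }

  // Returns true only the first time a given fingerprint is recorded.
  markIfNew(fp) {
    if (this.seen.has(fp)) return false;
    this.seen.add(fp);
    return true;
  }
}
```

A scan loop would then alert only when `markIfNew(fingerprint(job))` returns true, so a listing that reappears with different tracking parameters is not sent twice.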

Challenge

Maintaining reliable scraping over repeated runs

Solution

Implemented retry with backoff and defensive DOM extraction patterns so that transient failures and page layout changes break fewer runs.
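A defensive extraction pattern might try several selectors in order and discard records that lack required fields, as sketched below. The selectors are placeholders and not Upwork's actual markup. `lookup` is a hypothetical function from selector to text; in Puppeteer it could wrap a call such as `page.$eval`, which throws when nothing matches.

```javascript
// Try each selector in order; return the first non-empty trimmed text, else null.
// A lookup that throws is treated the same as a miss.
function pickFirst(lookup, selectors) {
  for (const selector of selectors) {
    try {
      const text = lookup(selector);
      if (typeof text === "string" && text.trim() !== "") return text.trim();
    } catch {
      // Ignore and fall through to the next selector.
    }
  }
  return null;
}

// Build a job record, or return null when the required title is missing.
// All selectors below are illustrative placeholders.
function extractJob(lookup) {
  const title = pickFirst(lookup, [".job-title", "[data-test='job-title']", "h2"]);
  if (title === null) return null; // Skip the record rather than alert on partial data.
  const budget = pickFirst(lookup, [".job-budget", "[data-test='budget']"]);
  return { title, budget };
}
```

Returning null for incomplete records keeps a layout change from producing empty or misleading alerts; the caller can log the miss and move on.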

Related Projects

Live-Stream Guardian QA

UNFCCC - Automation Engineer. Chrome extension with Node.js and Puppeteer that replaced manual stream monitoring with automated quality checks and fast alerts.

Chrome Extensions, Node.js, Puppeteer

DataHarbor Solutions Portal

Web Development case study using React, Tailwind CSS, Styled-Components, Node.js, and Express for clean, quick data-management operations.

React, Tailwind CSS, Styled-Components