hexsyro/README.md

Hi, I'm Heshan 👋

I build web scraping and automation systems and data products, ranging from scrapers that handle dynamic sites at scale to full production platforms that serve structured data via REST APIs.


🚀 What I Do

  • High-performance web scraping (dynamic + static sites)
  • Multi-source data aggregation pipelines
  • Dataset cleaning, enrichment & normalization
  • Secure REST API development
  • Automated data collection systems

📂 Projects

Production news aggregation platform that indexes 6,900+ articles per run from 250+ live sources including BBC, Reuters, The Guardian, TechCrunch, and more.

Built with a 4-tier RSS fallback chain, Playwright-powered scraper for paywalled and dynamic sources, and an hourly APScheduler pipeline. Features full-text search, keyword alerts, weekly digest emails, and a REST API.

FastAPI · PostgreSQL · Playwright · Next.js · APScheduler · Resend · Railway · Supabase
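
A minimal sketch of how a tiered fallback chain like the one described above might be structured. It is an illustration under stated assumptions, not the platform's actual code: the `fetch_with_fallback` helper, the `Fetcher` type, and the tier names in the usage note are all hypothetical.

```python
from typing import Callable, Iterable, Optional

# Assumption: a fetcher takes a source URL and returns a list of article
# dicts, raising an exception if that tier cannot handle the source.
Fetcher = Callable[[str], list[dict]]


def fetch_with_fallback(
    url: str, tiers: Iterable[tuple[str, Fetcher]]
) -> tuple[Optional[str], list[dict]]:
    """Try each tier in order and return (tier_name, articles) for the
    first tier that yields a non-empty result, or (None, []) if all fail."""
    for name, fetch in tiers:
        try:
            articles = fetch(url)
        except Exception:
            # A failure in one tier simply moves on to the next tier.
            continue
        if articles:
            return name, articles
    return None, []
```

In use, the tiers would be passed cheapest first, for example `[("feed", parse_feed), ("repaired_feed", parse_repaired_feed), ("browser", scrape_with_browser)]`, where each function name is a placeholder for whatever a given tier does.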


Production-grade OSINT dataset marketplace aggregating, enriching, and distributing structured social media data across Reddit, YouTube, GitHub, and Medium.

Features a scalable scraping architecture, automated enrichment pipelines, and secure subscription-based API delivery. Datasets cover AI training, financial sentiment, brand monitoring, and market intelligence.

FastAPI · PostgreSQL · Next.js · Paddle · JWT · AWS S3
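
As a rough illustration of the cleaning and normalization step in a pipeline like this, the sketch below maps raw records onto one schema, drops empty and duplicate rows, and emits JSONL. The field names (`text`, `author`), the `normalize` and `to_jsonl` helpers, and the hash-based `id` are assumptions made for the example, not the marketplace's real schema.

```python
import hashlib
import json
from typing import Iterable


def normalize(record: dict, source: str) -> dict:
    """Map a raw record onto a common schema (hypothetical field names)."""
    # Collapse runs of whitespace so near-identical posts compare equal.
    text = " ".join(str(record.get("text", "")).split())
    author = (record.get("author") or "").strip() or None
    # Stable ID derived from source and text, used for deduplication.
    digest = hashlib.sha256(f"{source}:{text}".encode("utf-8")).hexdigest()
    return {"id": digest[:16], "source": source, "text": text, "author": author}


def to_jsonl(records: Iterable[dict], source: str) -> str:
    """Normalize records, skip empty and duplicate rows, return JSONL text."""
    seen: set[str] = set()
    lines = []
    for raw in records:
        row = normalize(raw, source)
        if not row["text"] or row["id"] in seen:
            continue
        seen.add(row["id"])
        lines.append(json.dumps(row, ensure_ascii=False, sort_keys=True))
    return "\n".join(lines)
```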


Multi-page Python scraper extracting quotes, authors, and tags from Goodreads into structured datasets.

Python · BeautifulSoup · CSV
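
The multi-page flow can be sketched roughly as follows. To stay dependency-free, this version takes page fetching and HTML parsing as an injected `fetch_page` callable (in the real scraper that role would be played by the BeautifulSoup parsing code); `scrape_pages`, `write_csv`, the column names, and the `|` tag separator are assumptions for illustration.

```python
import csv
import io
from typing import Callable, Iterable, Iterator, TextIO

# Assumption: each quote is a dict with "quote", "author", and "tags" keys.
Quote = dict


def scrape_pages(
    fetch_page: Callable[[int], list[Quote]], max_pages: int
) -> Iterator[Quote]:
    """Yield quotes page by page, stopping at the first empty page."""
    for page in range(1, max_pages + 1):
        quotes = fetch_page(page)
        if not quotes:
            break
        yield from quotes


def write_csv(quotes: Iterable[Quote], out: TextIO) -> int:
    """Write quotes as CSV rows and return the number of rows written."""
    writer = csv.DictWriter(out, fieldnames=["quote", "author", "tags"])
    writer.writeheader()
    count = 0
    for q in quotes:
        # Tags are flattened into one cell, joined with "|".
        writer.writerow(
            {"quote": q["quote"], "author": q["author"], "tags": "|".join(q["tags"])}
        )
        count += 1
    return count


# Usage with an in-memory buffer and a stand-in fetcher that returns no pages.
buffer = io.StringIO()
written = write_csv(scrape_pages(lambda page: [], max_pages=3), buffer)
```

Passing the fetcher in as a parameter keeps pagination and CSV output testable without touching the network.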


🛠 Tech Stack

Scraping & Automation

  • Playwright (Python & Node)
  • Selenium
  • BeautifulSoup
  • Requests / HTTPX
  • Asyncio

Data Processing

  • Pandas · NumPy
  • CSV / JSON / JSONL / Parquet exports
  • lxml (XML/RSS repair)

Backend

  • FastAPI
  • PostgreSQL (asyncpg)
  • JWT Authentication
  • APScheduler
  • Resend (transactional email)

Frontend

  • Next.js 15 (App Router)
  • Tailwind CSS
  • TypeScript

Infrastructure

  • Railway (API hosting)
  • Vercel (frontend)
  • Supabase (managed PostgreSQL)
  • Docker

📬 Connect

Pinned

  1. Phonq (Public)

    A music platform that specializes in Phonk genre music.

    JavaScript 1

  2. Goodreads-quote-scraper (Public)

    Scrapes quotes from goodreads.com/quotes and saves to CSV.

    Python 1