ShopParser Marketplace Scraper
Desktop application for structured and polite parsing of marketplace product listings with preview, export, throttling, and resume functionality.
Overview
ShopParser is a Windows desktop application designed for structured and rate-limited parsing of online marketplace product listings.
The application focuses on controlled data extraction workflows, allowing users to:
- preview marketplace pages;
- configure export settings;
- select output fields;
- run throttled crawling sessions;
- resume interrupted operations from checkpoints.
The system is built around reusable marketplace adapters, so multiple marketplaces can be supported within one shared architecture.
Context
The project originated from the need to simplify structured collection and export of marketplace catalog data for analysis, migration, and synchronization workflows.
The target workflow required:
- non-invasive crawling behavior;
- configurable exports;
- marketplace-specific field handling;
- preview and validation before full crawling;
- support for large product catalogs;
- resumable crawling sessions.
A major design requirement was keeping the architecture modular enough to support future marketplace integrations without rewriting the core crawler logic.
Responsibilities
My responsibilities included:
- overall application architecture;
- crawler engine implementation;
- marketplace adapter architecture;
- desktop UI development;
- export system implementation;
- throttling and retry logic;
- checkpoint/resume workflow;
- data model design;
- integration planning for future marketplaces.
Solution
The solution was implemented as a WPF desktop application using a modular marketplace-adapter architecture.
Users can:
- provide marketplace shop URLs;
- preview parsed data before crawling;
- select exported fields;
- choose export formats;
- run full parsing sessions;
- resume interrupted crawls from checkpoints.
The system supports marketplace-specific export logic and configurable parsing workflows while maintaining a shared crawling infrastructure.
Special attention was paid to polite crawling practices:
- request delays;
- retry intervals;
- jitter/randomization;
- controlled HTTP usage.
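The delay, retry, and jitter practices listed above could be combined in a small policy object. The following is a minimal sketch rather than the project's actual code: the class name PoliteDelayPolicy, its parameters, and the doubling backoff rule are all assumptions made for illustration.

```csharp
using System;

// Hypothetical helper: names, parameters, and the backoff rule are illustrative assumptions.
public sealed class PoliteDelayPolicy
{
    private readonly TimeSpan _baseDelay;
    private readonly TimeSpan _maxJitter;
    private readonly TimeSpan _maxRetryDelay;
    private readonly Random _random;

    public PoliteDelayPolicy(TimeSpan baseDelay, TimeSpan maxJitter, TimeSpan maxRetryDelay, Random random)
    {
        _baseDelay = baseDelay;
        _maxJitter = maxJitter;
        _maxRetryDelay = maxRetryDelay;
        _random = random;
    }

    // Delay before a regular request: the base delay plus random jitter up to maxJitter.
    public TimeSpan NextRequestDelay()
    {
        double jitterMs = _random.NextDouble() * _maxJitter.TotalMilliseconds;
        return _baseDelay + TimeSpan.FromMilliseconds(jitterMs);
    }

    // Delay before retry number `attempt` (1-based): the base delay doubles with
    // each attempt, is capped at maxRetryDelay, and then gets jitter added on top.
    public TimeSpan NextRetryDelay(int attempt)
    {
        if (attempt < 1) throw new ArgumentOutOfRangeException(nameof(attempt));
        double backoffMs = _baseDelay.TotalMilliseconds * Math.Pow(2, attempt - 1);
        double cappedMs = Math.Min(backoffMs, _maxRetryDelay.TotalMilliseconds);
        double jitterMs = _random.NextDouble() * _maxJitter.TotalMilliseconds;
        return TimeSpan.FromMilliseconds(cappedMs + jitterMs);
    }
}
```

A crawl loop would typically await Task.Delay(policy.NextRequestDelay()) before each request, and await Task.Delay(policy.NextRetryDelay(attempt)) before repeating a failed one. The jitter keeps requests from arriving at a fixed, machine-like rhythm.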
Technical Details
Stack
- C#
- .NET
- WPF
- AngleSharp
- HttpClient
- ClosedXML
- System.Text.Json
Architecture
The project is organized around several core layers:
- crawl engine;
- marketplace adapters;
- export subsystem;
- UI workflow layer;
- data models.
Marketplace integrations implement a shared adapter interface, allowing new marketplaces to be added through isolated modules without affecting the core system.
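To illustrate the idea of a shared adapter interface, here is a hedged sketch of what such a contract might look like. The project's real interface is not shown in this write-up, so IMarketplaceAdapter, AdapterRegistry, and every member below are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical contract: the project's real interface is not shown in this write-up.
public interface IMarketplaceAdapter
{
    // Human-readable marketplace name, e.g. for logs and export presets.
    string Name { get; }

    // True if this adapter recognises the given shop URL.
    bool CanHandle(Uri shopUrl);

    // Turns the HTML of one product page into field name/value pairs.
    IReadOnlyDictionary<string, string> ParseProduct(string html);
}

// The core engine only talks to the registry and the interface above,
// so it never references a concrete marketplace.
public sealed class AdapterRegistry
{
    private readonly List<IMarketplaceAdapter> _adapters = new List<IMarketplaceAdapter>();

    public void Register(IMarketplaceAdapter adapter)
    {
        if (adapter == null) throw new ArgumentNullException(nameof(adapter));
        _adapters.Add(adapter);
    }

    // Returns the first registered adapter that accepts the URL.
    public IMarketplaceAdapter Resolve(Uri shopUrl)
    {
        foreach (var adapter in _adapters)
        {
            if (adapter.CanHandle(shopUrl)) return adapter;
        }
        throw new NotSupportedException("No adapter registered for " + shopUrl.Host);
    }
}
```

Under this assumption, adding a marketplace means writing one new class that implements the interface and registering it. The crawl engine and export code stay unchanged.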
Functionality
Implemented functionality includes:
- marketplace page preview;
- throttled crawling sessions;
- export field selection;
- CSV export;
- JSONL export;
- XLSX export;
- checkpoint-based resume support;
- retry and delay management;
- structured logging;
- marketplace-specific export presets.
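As an illustration of how export field selection and JSONL export can fit together, the sketch below writes one JSON object per line using System.Text.Json. It is an assumption-based example, not the project's exporter: the JsonlExporter class, its signature, and the choice to write missing fields as null are all hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Hypothetical exporter: class name, signature, and null handling are illustrative assumptions.
public static class JsonlExporter
{
    // Writes one JSON object per line, containing only the selected fields
    // in the order the user selected them.
    public static void Write(
        Stream output,
        IEnumerable<IReadOnlyDictionary<string, string>> products,
        IReadOnlyList<string> selectedFields)
    {
        foreach (var product in products)
        {
            // Disposing the writer flushes the object to the stream;
            // it does not close the stream itself.
            using (var json = new Utf8JsonWriter(output))
            {
                json.WriteStartObject();
                foreach (var field in selectedFields)
                {
                    if (product.TryGetValue(field, out var value))
                        json.WriteString(field, value);
                    else
                        json.WriteNull(field); // keep the same keys on every line
                }
                json.WriteEndObject();
            }
            output.WriteByte((byte)'\n');
        }
    }
}
```

For a product with the fields title, price, and sku and a selection of title, price, and brand, each output line carries exactly the three selected keys, with brand set to null. With the default encoder, non-ASCII characters such as Cyrillic product titles are written as \u escapes, which is still valid JSON.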
The system currently focuses on marketplaces such as:
- Rozetka;
- Prom.ua.
Challenges
The main challenges included:
- designing marketplace-agnostic crawling architecture;
- balancing extensibility and simplicity;
- handling inconsistent marketplace page structures;
- implementing polite crawling behavior;
- supporting resumable long-running parsing sessions;
- organizing export formats with different schema requirements.
A related challenge was keeping parsing logic decoupled from specific platforms, so that future marketplace integrations would not require changes to the core architecture.
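The resumable-session challenge mentioned in the list above can be approached by persisting crawl progress to disk. The sketch below shows one possible shape for this, assuming .NET Core 3.0 or later (for the File.Move overload with an overwrite flag). CrawlCheckpoint, CheckpointStore, and their members are hypothetical and do not describe the project's actual checkpoint format.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Hypothetical checkpoint model: the real project's format is not documented here.
public sealed class CrawlCheckpoint
{
    public string ShopUrl { get; set; } = "";
    public List<string> PendingUrls { get; set; } = new List<string>();
    public List<string> CompletedUrls { get; set; } = new List<string>();

    // Moves a URL from the pending list to the completed list.
    public void MarkCompleted(string url)
    {
        if (PendingUrls.Remove(url)) CompletedUrls.Add(url);
    }
}

public static class CheckpointStore
{
    // Writes to a temporary file first and then replaces the target, so an
    // interrupted write is unlikely to leave a half-written checkpoint behind.
    public static void Save(string path, CrawlCheckpoint checkpoint)
    {
        string tempPath = path + ".tmp";
        File.WriteAllText(tempPath, JsonSerializer.Serialize(checkpoint));
        File.Move(tempPath, path, overwrite: true);
    }

    // Returns false when no checkpoint exists, i.e. the crawl starts from scratch.
    public static bool TryLoad(string path, out CrawlCheckpoint checkpoint)
    {
        checkpoint = null;
        if (!File.Exists(path)) return false;
        checkpoint = JsonSerializer.Deserialize<CrawlCheckpoint>(File.ReadAllText(path));
        return checkpoint != null;
    }
}
```

In this design the crawler would call MarkCompleted and Save after each processed page, and on startup call TryLoad to continue with the remaining PendingUrls instead of re-crawling the whole shop.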
Result
The project successfully demonstrated:
- modular marketplace parsing architecture;
- reusable crawling infrastructure;
- desktop-based parsing workflow;
- configurable export pipelines;
- checkpoint-based session recovery;
- scalable adapter-oriented system design.
The resulting architecture can serve as a foundation for future marketplace automation and structured catalog processing tools.
Media
The gallery contains:
- preview workflow examples;
- export configuration screens;
- crawl progress logs;
- exported dataset examples.
Notes
- Educational, prototype-oriented project.
- Marketplace-specific parsing rules may be simplified or partially omitted.
- Built with extensibility for future marketplace integrations.
- Focused on polite and controlled crawling behavior.