Data Crawler/Web Scraping Staff

February 21, 2025
Application ends: March 21, 2025

Job Description

POSITION INTRODUCTION

  • We are looking for a professional data scraping specialist who can operate a large-scale data collection system and ensure its stability, accuracy, and efficiency.

1. Professional Scraping System Development
Technical Requirements:

System Architecture:

  • Design cross-platform Python crawling scripts
  • Build scalable systems
  • Develop parallel crawling solutions
  • Manage large, multi-threaded data streams

Technologies:

  • Scrapy, BeautifulSoup
  • Selenium
  • Asyncio, Multiprocessing
  • Proxy management
  • IP rotation techniques

2. Data Processing and Normalization

Processing Methods:

  • Develop API data cleaning processes
  • Data transformation algorithms
  • Integrity checks
  • Remove noisy data

Tools:

  • Pandas
  • Data validation techniques
  • Machine Learning preprocessing

3. Database Management

Specialized Skills:

Advanced SQL:

  • Complex queries
  • Performance optimization

4. Monitoring & Optimization

Strategy:

  • Manage scraping system operations
  • Track scraping performance
  • Challenge handling:
    • IP blocking
    • Rate limiting
    • CAPTCHA

PROFESSIONAL REQUIREMENTS

Education
  • Bachelor’s degree (GPA > 3.0)
  • Major:
    • Data science
    • Computer engineering
    • Data-related fields
  • English: TOEIC > 700 or IELTS > 5.5

Technical Skills

Python Ecosystem

  • Asyncio, Multiprocessing
  • Data cleaning techniques
  • Machine Learning preprocessing
  • Advanced error handling

Database & Big Data

  • SQL (Intermediate to Advanced)
  • NoSQL database management
  • PySpark
  • Data warehousing

In-depth Experience

  • Minimum 1-2 years
  • Project implementation:
    • Web scraping
    • Automatic data processing
    • Big data crawling

SOFT SKILLS

  • System analysis
  • Problem solving
  • Independent work and teamwork
  • Time management
  • Logical thinking

NICE-TO-HAVE EXPERIENCE

  • Big Data experience
  • Data pipeline design
  • Working with diverse APIs
  • Professional certifications
  • Creativity and initiative in proposing ideas

BENEFITS

  • Modern technology environment
  • Competitive salary
  • Development opportunities
  • Continuous training

EVALUATION CRITERIA

  • System stability
  • Data quality
  • Processing efficiency
  • Scalability

REPORTING

  • Reports directly to: Manager and Board of Directors
  • Reporting content: according to reporting regulations and reporting content for the technical
  • Types of Reports:
    • Daily Progress Report
    • Weekly Report
    • Monthly Report
    • Milestone Quick Report
    • Incident Report
    • Performance Report

OTHER RELATED FACTORS

  • Working hours: 7 hours/day (08:00 – 11:30 and 13:00 – 16:30), Monday to Friday; Saturday and Sunday off
  • Working equipment: provided
  • Salary: 12 – 18 million/month