Job Description
POSITION INTRODUCTION
- We are looking for a professional Data Scraping specialist, capable of operating a large-scale data collection system, ensuring stability, accuracy and efficiency.
1. Professional Scraping System Development
Technical Requirements:
System Architecture:
- Design cross-platform Python crawling scripts
- Build scalable systems
- Develop parallel crawling solutions
- Manage large, multi-threaded data streams
Technologies:
- Scrapy, BeautifulSoup
- Selenium
- Asyncio, Multiprocessing
- Proxy management
- IP rotation techniques
2. Data Processing and Normalization
Processing Methods:
- Develop API data cleaning processes
- Data transformation algorithms
- Integrity checks
- Remove noisy data
Tools:
- Pandas
- Data validation techniques
- Machine Learning preprocessing
3. Database Management
Specialized Skills:
Advanced SQL:
- Complex queries
- Performance optimization
4. Monitoring & Optimization
Strategy:
- Manage scraping system operations.
- Track scraping performance
- Challenge handling:
- IP blocking
- Speed limiting
- CAPTCHA
PROFESSIONAL REQUIREMENTS
Education
- Bachelor’s degree (GPA > 3.0)
- Major:
- Data science
- Computer engineering
- Data related fields
- English: TOEIC > 700 of IELTS >5.5
Technical Skills
Python Ecosystem
- Asyncio, Multiprocessing
- Data cleaning techniques
- Machine Learning preprocessing
- Advanced error handling
Database & Big Data
- SQL (Intermediate to Advanced)
- NoSQL database management
- PySpark
- Data warehousing
In-depth Experience
- Minimum 1-2 years
- Project implementation:
- Web scraping
- Automatic data processing
- Big data crawling
SOFT SKILLS
- System analysis
- Problem solving
- Independent & team working
- Time management
- Logical thinking
NICE TO HAVE EXPERIENCES
- Big Data experience
- Data pipeline design
- Working with diverse APIs
- Professional certifications
- Creativity and initiative in proposing ideas
BENEFITS
- Modern technology environment
- Competitive salary
- Development opportunities
- Continuous training
EVALUATION CRITERIA
- System stability
- Data quality
- Processing efficiency
- Scalability
REPORTING
- Directly report to: Manager and Board of Directors
- Reporting content: according to reporting regulations and reporting content for the technical
- Types of Reports:
- Daily Progress Report
- Weekly Report
- Monthly Report
- Milestone Quick Report
- Incident Report
- Performance Report
OTHER RELATED FACTORS
- Working hours: 07 hours/day (Morning from 08:00 – 11:30, Afternoon from 13:00 – 16:30), from Monday to Friday, off on Saturday & Sunday.
- Working equipment: provided
- Salary: 12 – 18 million/month