Version: 0.1.21

Extractors Overview

This page helps you choose the right extractor for your run, understand key constraints, and navigate to detailed technical guides.

Extractor chooser

| Extractor | Best use case | Core constraints/dependencies | Notable controls | Output/behavior notes |
|---|---|---|---|---|
| Gradcracker | UK graduate roles from Gradcracker | Crawling stability depends on page structure and anti-bot behavior; tuned for low concurrency | GRADCRACKER_SEARCH_TERMS, GRADCRACKER_MAX_JOBS_PER_TERM, JOBOPS_SKIP_APPLY_FOR_EXISTING | Scrapes listing metadata, then detail pages and apply URL resolution |
| JobSpy | Multi-source discovery (Indeed, LinkedIn, Glassdoor) | Requires Python wrapper execution per term; source availability and quality vary by site/location | JOBSPY_SITES, JOBSPY_SEARCH_TERMS, JOBSPY_RESULTS_WANTED, JOBSPY_HOURS_OLD, JOBSPY_LINKEDIN_FETCH_DESCRIPTION | Produces JSON per term, then orchestrator normalizes and de-duplicates by jobUrl |
| UKVisaJobs | UK visa sponsorship-focused roles | Requires authenticated session and periodic token/cookie refresh | UKVISAJOBS_EMAIL, UKVISAJOBS_PASSWORD, UKVISAJOBS_MAX_JOBS, UKVISAJOBS_SEARCH_KEYWORD | API pagination + dataset output; orchestrator de-dupes and may fetch missing descriptions |
| Manual Import | One-off jobs not covered by scrapers | Inference quality depends on model/provider and input quality; some URLs cannot be fetched reliably | App/API endpoints (/api/manual-jobs/infer, /api/manual-jobs/import) | Accepts text/HTML/URL, runs inference, then saves and scores job after review |
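
The control names in the chooser look like environment-style settings. As a rough illustration only, the sketch below reads the JobSpy controls from a mapping such as os.environ. The value formats (comma-separated lists, integer strings, "true"/"1" for booleans), the fallback defaults, and the helper names (JobSpyControls, load_jobspy_controls) are assumptions made for this example, not documented behavior; see the JobSpy guide for the actual formats.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


def _split_csv(raw: str) -> List[str]:
    """Split a comma-separated string, trimming spaces and dropping blanks."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def _as_bool(raw: str) -> bool:
    """Assumed convention: '1', 'true' or 'yes' (any case) means enabled."""
    return raw.strip().lower() in {"1", "true", "yes"}


@dataclass
class JobSpyControls:
    """Hypothetical container for the JobSpy controls named in the chooser."""
    sites: List[str]
    search_terms: List[str]
    results_wanted: int
    hours_old: int
    linkedin_fetch_description: bool


def load_jobspy_controls(env: Optional[Mapping[str, str]] = None) -> JobSpyControls:
    """Read controls from a mapping (defaults to os.environ).

    The fallback values below are placeholders for this sketch,
    not the project's real defaults.
    """
    env = os.environ if env is None else env
    return JobSpyControls(
        sites=_split_csv(env.get("JOBSPY_SITES", "")),
        search_terms=_split_csv(env.get("JOBSPY_SEARCH_TERMS", "")),
        results_wanted=int(env.get("JOBSPY_RESULTS_WANTED", "0")),
        hours_old=int(env.get("JOBSPY_HOURS_OLD", "0")),
        linkedin_fetch_description=_as_bool(
            env.get("JOBSPY_LINKEDIN_FETCH_DESCRIPTION", "false")
        ),
    )
```

Passing an explicit mapping instead of relying on os.environ keeps the loader easy to exercise in isolation, for example load_jobspy_controls({"JOBSPY_SITES": "indeed, linkedin"}).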

Which extractor should I use?

  • Use JobSpy for broad first-pass sourcing across common boards.
  • Use Gradcracker when targeting graduate pipelines in the UK.
  • Use UKVisaJobs for sponsorship-specific UK searches.
  • Use Manual Import when you already have a specific posting and need direct import.

Many runs combine sources: broad discovery first, then manual import for high-priority jobs that scraping misses.
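
When several sources feed one run, the same posting can arrive more than once. The chooser notes that the orchestrator de-duplicates by jobUrl; the sketch below shows one way such a merge could look. It is not the orchestrator's actual code: the normalization rule, the "first occurrence wins" policy, the choice to keep records without a jobUrl, and the names normalize_url and merge_jobs are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set

Job = Dict[str, str]


def normalize_url(url: str) -> str:
    """Assumed normalization: trim whitespace and any trailing slash."""
    return url.strip().rstrip("/")


def merge_jobs(*batches: Iterable[Job]) -> List[Job]:
    """Merge job batches in order, keeping the first record per jobUrl.

    Records without a jobUrl (for example, a manual import from pasted
    text) cannot be compared, so this sketch keeps all of them.
    """
    seen: Set[str] = set()
    merged: List[Job] = []
    for batch in batches:
        for job in batch:
            key = normalize_url(job.get("jobUrl", ""))
            if key:
                if key in seen:
                    continue  # duplicate of an earlier record
                seen.add(key)
            merged.append(job)
    return merged
```

Calling merge_jobs(scraped_jobs, manual_jobs) with the broad-discovery results first means a scraped record takes precedence over a later manual import that points at the same URL; reversing the argument order flips that preference.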