Google Jobs renders listings dynamically and localizes results, so naïve HTTP requests rarely return usable data. The most reliable approach is to use a scraping API that handles JavaScript rendering and geo-targeting, then parse just the fields you need into CSV.

What you’ll build:

  • CSV files per query and location with job title, company, location, date, salary, source, and share URL.
  • A scalable pipeline that runs multiple queries across multiple geo-locations concurrently.

Method 1 — Oxylabs Web Scraper API (asynchronous, scalable)

Oxylabs’ Web Scraper API (Google Jobs source) runs headless browsers with geo-targeting and a Custom Parser so you receive structured fields instead of raw HTML.

What you’ll need

  • Oxylabs account and API user credentials (username/password). Create an account in the Oxylabs dashboard, start a free trial, and copy your API user and password from the dashboard credentials page.
  • Python 3.11+ on your machine.

Step 1: Create an Oxylabs account and copy your API username and password from the dashboard (Oxylabs).

Step 2: Install Python 3.11 or newer on your system if it’s not installed yet.

Step 3: Install required libraries for async scraping.

pip install aiohttp pandas

Step 4: Create a file named payload.json with parsing rules for Google Jobs (Oxylabs Custom Parser).

{
  "source": "google",
  "url": null,
  "geo_location": null,
  "user_agent_type": "desktop",
  "render": "html",
  "parse": true,
  "parsing_instructions": {
    "jobs": {
      "_fns": [
        { "_fn": "xpath", "_args": ["//div[@class='nJXhWc']//ul/li"] }
      ],
      "_items": {
        "job_title":     { "_fns": [{ "_fn": "xpath_one", "_args": [".//div[@class='BjJfJf PUpOsf']/text()"] }] },
        "company_name":  { "_fns": [{ "_fn": "xpath_one", "_args": [".//div[@class='vNEEBe']/text()"] }] },
        "location":      { "_fns": [{ "_fn": "xpath_one", "_args": [".//div[@class='Qk80Jf'][1]/text()"] }] },
        "date":          { "_fns": [{ "_fn": "xpath_one", "_args": [".//div[@class='PuiEXc']//span[@class='LL4CDc' and contains(@aria-label, 'Posted')]/span/text()"] }] },
        "salary":        { "_fns": [{ "_fn": "xpath_one", "_args": [".//div[@class='PuiEXc']//div[contains(@class,'I2Cbhb') and contains(@class,'bSuYSc')]//span[@aria-hidden='true']/text()"] }] },
        "posted_via":    { "_fns": [{ "_fn": "xpath_one", "_args": [".//div[@class='Qk80Jf'][2]/text()"] }] },
        "URL":           { "_fns": [{ "_fn": "xpath_one", "_args": [".//div[@data-share-url]/@data-share-url"] }] }
      }
    }
  }
}
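
Optionally, you can sanity-check the payload with a single synchronous request before building the async pipeline. The sketch below is an optional check rather than part of the final script; it reuses the Push-Pull endpoint from Step 7, assumes the requests package is installed separately (pip install requests), and fills in example url and geo_location values.

import json
import requests

# Your Oxylabs API credentials (same as in Step 6)
OXY_USER = "USERNAME"
OXY_PASS = "PASSWORD"

with open("payload.json") as f:
    payload = json.load(f)

# Example values for a one-off test run; adjust as needed
payload["url"] = "https://www.google.com/search?q=developer&ibp=htl;jobs&hl=en&gl=us"
payload["geo_location"] = "California,United States"

resp = requests.post(
    "https://data.oxylabs.io/v1/queries",
    auth=(OXY_USER, OXY_PASS),
    json=payload,
)
resp.raise_for_status()
print(resp.json().get("id"), resp.json().get("status"))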

Step 5: Create a Python file named jobs_oxylabs.py.

Step 6: Paste imports and load credentials (Oxylabs).

import asyncio
import aiohttp
import json
import pandas as pd
from aiohttp import ClientSession, BasicAuth

# Replace with your Oxylabs API username/password from the dashboard
OXY_USER = "USERNAME"
OXY_PASS = "PASSWORD"

credentials = BasicAuth(OXY_USER, OXY_PASS)

# Load parsing payload
with open("payload.json", "r") as f:
    PAYLOAD = json.load(f)

Step 7: Add helper functions to submit a job, poll status, and fetch results (Oxylabs Push-Pull endpoint).

async def submit_job(session: ClientSession, payload: dict) -> str:
    async with session.post("https://data.oxylabs.io/v1/queries", auth=credentials, json=payload) as resp:
        data = await resp.json()
        return data["id"]

async def check_status(session: ClientSession, job_id: str) -> str:
    async with session.get(f"https://data.oxylabs.io/v1/queries/{job_id}", auth=credentials) as resp:
        data = await resp.json()
        return data["status"]

async def fetch_jobs(session: ClientSession, job_id: str) -> list:
    async with session.get(f"https://data.oxylabs.io/v1/queries/{job_id}/results", auth=credentials) as resp:
        data = await resp.json()
        return data["results"][0]["content"]["jobs"]

Step 8: Add a function to write parsed jobs to a CSV per query and location.

async def save_csv(query: str, location: str, jobs: list) -> None:
    rows = []
    for j in jobs:
        rows.append({
            "Job title": j.get("job_title"),
            "Company name": j.get("company_name"),
            "Location": j.get("location"),
            "Date": j.get("date"),
            "Salary": j.get("salary"),
            "Posted via": j.get("posted_via"),
            "URL": j.get("URL"),
        })
    df = pd.DataFrame(rows)
    filename = f"{query}_jobs_{location.replace(',', '_').replace(' ', '_')}.csv"
    await asyncio.to_thread(df.to_csv, filename, index=False)

Step 9: Define the coroutine that sets the Google Jobs URL and geo-location, submits work, waits for completion, and saves results (Oxylabs).

async def scrape_jobs(session: ClientSession, query: str, country_code: str, location: str) -> None:
    url = f"https://www.google.com/search?q={query}&ibp=htl;jobs&hl=en&gl={country_code}"
    # Copy the payload template so concurrent tasks don't overwrite each other's URL and geo_location
    payload = {**PAYLOAD, "url": url, "geo_location": location}

    job_id = await submit_job(session, payload)

    # Give the backend time to render before polling
    await asyncio.sleep(12)

    while True:
        status = await check_status(session, job_id)
        if status == "done":
            break
        if status == "failed":
            print(f"Job {job_id} failed for {query} @ {location}.")
            return
        await asyncio.sleep(5)

    jobs = await fetch_jobs(session, job_id)
    await save_csv(query, location, jobs)

Step 10: Add main() to run multiple queries and locations concurrently.

URL_QUERIES = ["developer", "chef", "manager"]
LOCATIONS = {
    "US": ["California,United States", "Virginia,United States", "New York,United States"],
    "GB": ["United Kingdom"],
    "DE": ["Germany"]
}

async def main():
    async with aiohttp.ClientSession() as session:
        tasks = []
        for cc, locs in LOCATIONS.items():
            for loc in locs:
                for q in URL_QUERIES:
                    tasks.append(asyncio.create_task(scrape_jobs(session, q, cc, loc)))
        await asyncio.gather(*tasks)

if __name__ == "__main__":
    asyncio.run(main())
    print("Completed.")

Step 11: Run the script and verify that CSV files are created per query and location.
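
For reference, run the script from the command line:

python jobs_oxylabs.py

With the queries and locations from Step 10, you should see files such as developer_jobs_California_United_States.csv, chef_jobs_United_Kingdom.csv, and manager_jobs_Germany.csv (one file per query and location, following the naming pattern in Step 8).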

Notes:

  • Google Jobs share URLs may only open from an IP in the same country used during scraping; use a matching proxy/VPN when opening them.
  • Use Ctrl+Shift+I on Windows or Option+Command+I on macOS to inspect selectors if you want to extend parsing.

Method 2 — SerpApi Google Jobs API (simple REST, filters and radius)

SerpApi’s google_jobs engine returns structured listings with optional filters like language, country, and radius.

What you’ll need

  • SerpApi account and API key. Sign up at serpapi.com, open the dashboard, and copy your API key from the “API Key” page.
  • Python 3.9+ and the requests and pandas packages.

Step 1: Create a SerpApi account and copy your API key from the dashboard (SerpApi).

Step 2: Install the dependencies.

pip install requests pandas

Step 3: Send your first Google Jobs request and write results to CSV (SerpApi).

import requests
import pandas as pd

API_KEY = "YOUR_SERPAPI_KEY"

params = {
    "engine": "google_jobs",
    "q": "developer new york",
    "hl": "en",
    "gl": "us",
    "api_key": API_KEY,
    # Optional: radius in kilometers (Google may not strictly enforce)
    "lrad": 25
}

r = requests.get("https://serpapi.com/search.json", params=params)
r.raise_for_status()
data = r.json()

rows = []
for j in data.get("jobs_results", []):
    rows.append({
        "Job title": j.get("title"),
        "Company name": j.get("company_name"),
        "Location": j.get("location"),
        "Posted via": j.get("via"),
        "Share link": j.get("share_link"),
        "Description": j.get("description")
    })

pd.DataFrame(rows).to_csv("jobs_serpapi.csv", index=False)

Step 4: Paginate using next_page_token from the response to fetch additional pages (SerpApi).

def fetch_all(params):
    out = []
    while True:
        resp = requests.get("https://serpapi.com/search.json", params=params)
        resp.raise_for_status()
        res = resp.json()
        out.extend(res.get("jobs_results", []))
        token = res.get("serpapi_pagination", {}).get("next_page_token")
        if not token:
            break
        params["next_page_token"] = token
    return out

jobs = fetch_all(params)
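
As a usage follow-up, the combined results can be written to a single CSV with the same columns as Step 3 (a minimal sketch; it assumes the jobs list returned by fetch_all above, and the output filename is just an example):

rows = []
for j in jobs:
    rows.append({
        "Job title": j.get("title"),
        "Company name": j.get("company_name"),
        "Location": j.get("location"),
        "Posted via": j.get("via"),
        "Share link": j.get("share_link"),
        "Description": j.get("description")
    })

pd.DataFrame(rows).to_csv("jobs_serpapi_all_pages.csv", index=False)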

Tips:

  • Use location or uule to simulate city-level searches; avoid combining both in the same request.
  • Filters can arrive as uds strings in the JSON; reuse a filter’s uds value in a new request to refine results, as in the sketch below.
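
For example, a follow-up request that reuses a uds token might look like this (a minimal sketch; the uds value is a placeholder you copy from the filters section of an earlier google_jobs response, and API_KEY is the key from Step 3):

import requests

# Placeholder token; copy a real uds value from the "filters" array of a previous response
filtered_params = {
    "engine": "google_jobs",
    "q": "developer new york",
    "hl": "en",
    "gl": "us",
    "uds": "COPIED_UDS_VALUE",
    "api_key": API_KEY  # API key from Step 3
}

filtered = requests.get("https://serpapi.com/search.json", params=filtered_params).json()
print(len(filtered.get("jobs_results", [])))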

Method 3 — Free local scraper with Selenium (best for small tests)

A headless browser can scroll the Google Jobs list and extract fields with robust selectors. This is practical for prototyping but will require maintenance and can hit anti-bot systems at volume.

What you’ll need

  • Google Chrome and a matching ChromeDriver binary.
  • Python 3.9+ with selenium, beautifulsoup4 (optional), and pandas.

Step 1: Download ChromeDriver that matches your Chrome version and add it to your PATH.

Step 2: Install Selenium and pandas.

pip install selenium pandas

Step 3: Launch Chrome with an option that reduces automation fingerprints.

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("--window-size=1920,1080")
options.add_argument("--disable-blink-features=AutomationControlled")

driver = webdriver.Chrome(options=options)

Step 4: Open a Google Jobs URL built from your query and country code.

query = "developer"
country = "us"
url = f"https://www.google.com/search?q={query}&ibp=htl;jobs&hl=en&gl={country}"
driver.get(url)

Step 5: Scroll the jobs list container to load more results.

import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait for the jobs list to render before scrolling; otherwise the loop may exit immediately
WebDriverWait(driver, 15).until(
    EC.presence_of_element_located((By.XPATH, "//div[@class='nJXhWc']//ul/li"))
)

loaded = 0
while True:
    cards = driver.find_elements(By.XPATH, "//div[@class='nJXhWc']//ul/li")
    if len(cards) == loaded:
        break
    loaded = len(cards)
    # Scroll the last visible card into view to trigger loading of more results
    driver.execute_script("arguments[0].scrollIntoView({block: 'end'});", cards[-1])
    time.sleep(1.5)

Step 6: Extract stable fields with XPaths; use the icon-based selectors for posted date and salary when available.

from selenium.common.exceptions import NoSuchElementException

rows = []
for li in driver.find_elements(By.XPATH, "//div[@class='nJXhWc']//ul/li//div[@role='treeitem']/div/div"):
    def txt(xp):
        try:
            return li.find_element(By.XPATH, xp).get_attribute("innerText")
        except NoSuchElementException:
            return None

    def attr(xp, name):
        try:
            return li.find_element(By.XPATH, xp).get_attribute(name)
        except NoSuchElementException:
            return None

    rows.append({
        "Job title": txt("./div[2]"),
        "Company name": txt("./div[4]/div/div[1]"),
        "Location": txt("./div[4]/div/div[2]"),
        "Source": txt("./div[4]/div/div[3]"),
        "Posted": txt(".//*[name()='path'][contains(@d,'M11.99')]/ancestor::div[1]"),
        "Full/Part": txt(".//*[name()='path'][contains(@d,'M20 6')]/ancestor::div[1]"),
        "Salary": txt(".//*[name()='path'][@fill-rule='evenodd']/ancestor::div[1]"),
        "Logo src": attr("./div[1]//img", "src")
    })

Step 7: Save the extracted data to CSV.

import pandas as pd
pd.DataFrame(rows).to_csv("jobs_selenium.csv", index=False)

Caution:

  • Expect layout changes; revalidate selectors periodically.
  • Throttle requests and respect site terms to avoid blocks; a simple throttling sketch follows this list.
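
One simple way to throttle is a randomized pause before each navigation, so requests don't follow a fixed rhythm. The helper below is a hypothetical sketch (polite_get is not part of Selenium; the delay bounds are arbitrary and should be tuned to your volume):

import random
import time

def polite_get(driver, url, min_delay=3.0, max_delay=8.0):
    # Sleep a random interval before navigating to avoid a fixed request rhythm
    time.sleep(random.uniform(min_delay, max_delay))
    driver.get(url)

# Usage: call polite_get(driver, url) wherever you would call driver.get(url)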

Choosing an approach

For production and scale, Oxylabs’ async pipeline reliably returns parsed jobs across many queries and locations, reducing engineering time and failures. SerpApi is quick to integrate for smaller scripts and lets you add radius and filter tokens. Selenium is best for quick experiments or when you cannot use a third‑party API.


Quick maintenance tip: keep query lists and location dictionaries in separate JSON or YAML files so non-developers can update search scopes without touching code.
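
For instance, a minimal sketch might look like this, assuming a hypothetical config file named search_config.json that mirrors the structure of Step 10:

# search_config.json (example contents):
# {
#   "queries": ["developer", "chef", "manager"],
#   "locations": {
#     "US": ["California,United States"],
#     "GB": ["United Kingdom"],
#     "DE": ["Germany"]
#   }
# }

import json

with open("search_config.json") as f:
    config = json.load(f)

URL_QUERIES = config["queries"]
LOCATIONS = config["locations"]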