Job Board Indexer Requirements

  • Job postings must use template-driven HTML markup:
    • Our indexer parses the HTML contents of the page.
    • It must be able to locate job posts at predictable locations in the HTML.
      • Example: <a class="job-post" href="JOB_POST_URL">JOB TITLE</a> – <span class="employer">EMPLOYER_NAME</span>
    • If job posts are added as ‘unstructured’ content, our indexer will not be able to reliably identify and parse them.
    • If you’ve got a JSON or XML feed that we can index, this is even better!
  • Individual job postings must be accessible via a distinct URL. This URL must be discoverable in the job post “list view”.
  • Provide a consistent and reliable way to detect when a job posting has expired. Ideally, remove the job post and return an HTTP 404 response. Alternatively, mark the posting clearly as expired using parseable, semantic HTML. Example: <span class="job-posting-expired">This job post is expired</span>
  • City, town or community names must be consistently represented.
    • If the job is in Thunder Bay, the location must specify “Thunder Bay” exactly the same way every time. (“Thunder Bay Area”, “Thunder Bay and SomeOtherTown”, or “ThunderBay or AnotherTown” will cause indexing difficulty.)
  • Job posts must have the following information, at a minimum:
    • Job Title
    • Employer Name
    • Job post detail URL
    • City / Town name
  • Other Do’s & Don’ts
    • Do use helpful and semantic CSS identifiers and classes to make it easy to identify and parse job posting data.
    • Don’t post job details in PDF documents. PDFs cannot be indexed, and they also present an accessibility problem.
    • Don’t use JavaScript/AJAX to load job posting data after the web page has already loaded in the browser.
    • Don’t append session-specific tokens or identifiers to job post URLs. The URLs that we extract from your job posts must be shareable and usable by others.
    • Don’t leave job postings up indefinitely. Mark them clearly as expired using a semantic (i.e. parseable) HTML structure, or remove them entirely and deliver a 404 page.
    • Don’t re-use URLs. Re-used URLs confuse search engines. Our system uses the URL as a unique identifier for job posts, which means it will fail to index a new post at a re-used URL.
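
To check your own list view against the requirements above, a small script along the following lines may help. This is only a rough sketch and not our indexer's actual implementation. It assumes markup shaped like the examples above (the "job-post", "employer" and "job-posting-expired" classes), plus a hypothetical "city" class for the location and an expired marker placed next to the job in the list. The example.com URLs and the names JobListParser, parse_jobs and missing_fields are made up for illustration.

```python
from html.parser import HTMLParser

# Minimum information every job post must carry (see the list above).
REQUIRED_FIELDS = ("title", "employer", "url", "city")


class JobListParser(HTMLParser):
    """Collect job posts from markup shaped like the examples above."""

    def __init__(self):
        super().__init__()
        self.jobs = []      # one dict per job post found
        self._field = None  # field whose text is currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "job-post" in classes:
            # Each job-post link starts a new record.
            self.jobs.append({"url": attrs.get("href"), "expired": False})
            self._field = "title"
        elif tag == "span" and self.jobs:
            if "employer" in classes:
                self._field = "employer"
            elif "city" in classes:  # hypothetical class name
                self._field = "city"
            elif "job-posting-expired" in classes:
                # Assumption: the marker follows the job it belongs to.
                self.jobs[-1]["expired"] = True
                self._field = None

    def handle_data(self, data):
        if self._field:
            job = self.jobs[-1]
            job[self._field] = job.get(self._field, "") + data

    def handle_endtag(self, tag):
        if tag in ("a", "span"):
            self._field = None


def parse_jobs(html):
    """Return a list of job dicts with surrounding whitespace trimmed."""
    parser = JobListParser()
    parser.feed(html)
    parser.close()
    for job in parser.jobs:
        for key, value in job.items():
            if isinstance(value, str):
                job[key] = value.strip()
    return parser.jobs


def missing_fields(job):
    """Return the required fields that are absent or empty in a job record."""
    return [name for name in REQUIRED_FIELDS if not job.get(name)]


SAMPLE = """
<ul>
  <li><a class="job-post" href="https://example.com/jobs/101">Welder</a> -
      <span class="employer">Acme Fabrication</span>
      <span class="city">Thunder Bay</span></li>
  <li><a class="job-post" href="https://example.com/jobs/102">Cook</a> -
      <span class="employer">Lakeside Diner</span>
      <span class="job-posting-expired">This job post is expired</span></li>
</ul>
"""

for job in parse_jobs(SAMPLE):
    print(job["title"], "| expired:", job["expired"],
          "| missing:", missing_fields(job))
```

In this sample the second post is flagged as expired and reported as missing its city, which is the kind of gap worth fixing before asking us to index the page.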