Job Listing Site Formats

Here we document the formats of the various job-listing sites. See the samples/ directory for concrete HTML examples.

Some notes:

Software

HRSuite

URLs end in postings/search followed by query arguments, if any. It does not appear that the query arguments for the different search parameters are common across sites (e.g., searching for full-time faculty jobs generates different queries at different schools).

As far as I can tell, search results are displayed on a single page, no matter how many there are.

PeopleAdmin

PeopleAdmin will quickly make you hate the people (if, indeed, they were human at all) who designed it, and life in general. It uses cookies extensively, returning 404s even for GETs if you don't have exactly the right set of cookies.

(This experimentation is based on Portland CC.)

With no cookies, browse to

/applicants/jsp/shared/frameset/Frameset.jsp

This will redirect (via Javascript) to Frameset with a query argument

?time=1430604175982

(or whatever the timestamp is). The time query argument is propagated to GET requests to the subpages of the frameset:
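The example value looks like a Unix timestamp in milliseconds (it decodes to a date in May 2015), so we can probably generate it ourselves rather than waiting for the Javascript redirect. A minimal sketch under that assumption; frameset_url() and the example.edu host are made-up names for illustration, and I have not confirmed that the server accepts a client-generated value:

```python
import time
from datetime import datetime, timezone
from urllib.parse import urlencode

FRAMESET_PATH = "/applicants/jsp/shared/frameset/Frameset.jsp"

def frameset_url(base, now_ms=None):
    """Build the Frameset URL with a ?time= query argument.

    Assumption: the value is milliseconds since the Unix epoch.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return base.rstrip("/") + FRAMESET_PATH + "?" + urlencode({"time": now_ms})

# Sanity check on the value seen in the notes: interpret it as epoch ms.
seen = datetime.fromtimestamp(1430604175982 / 1000, tz=timezone.utc)
print(seen.isoformat())

# Hypothetical host, fixed timestamp so the output is reproducible.
print(frameset_url("https://jobs.example.edu", 1430604175982))
```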

  • Header.jsp

  • Nav.jsp

  • DefaultContent.jsp

  • Welcome.jsp

None of these include the search results, yet.

The following cookies will have been set: JSESSIONID=(session id)

The Nav.jsp frame contains a form that will be POSTed when you click the button to go to the job listing. The POST request will go to

/applicants/Central

Clicking the button actually calls a script:

formSubmit(document.navForm, document.navForm.delegateParameter, 'searchDelegate', document.navForm.actionParameter, 'showSearch', 'PA_1430604174716');

formSubmit() is defined in Validation.js:

function formSubmit(form,actionName,actionValue){
  var f=document.getElementById(actionName);
  f.setAttribute("value",actionValue);
  form.submit();
}

The navForm mentioned above has the following (hidden) fields:

  • delegateParameter

  • actionParameter

  • windowTimestamp=PA_1430604174716 (set in the form)

  • searchType

As mentioned above, the effect of clicking the "search for jobs" button is to set delegateParameter to "searchDelegate" and actionParameter to "showSearch", and then POST the form to /applicants/Central. The POST request contains:

delegateParameter: "searchDelegate"
actionParameter: "showSearch"
windowTimestamp: (the value from the form)
searchType: "8192"

applicants/Central will return a page whose sole purpose is to redirect to

/applicants/jsp/shared/search/SearchResults.jsp?time=1430605051730

(Or whatever the timestamp is.) The redirection is handled by JS, so requests won't follow it for us. No other parameters are passed to SearchResults.jsp. I can only assume that they are stored in the session object on the server.
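Since the redirect lives in a script, one option is to pull the target URL out of the returned page ourselves. The sketch below assumes the SearchResults.jsp URL appears as a quoted string literal somewhere in the page; I have not recorded what the script actually looks like, so the sample markup, the regex, and the example.edu host are all guesses that need checking against a real response:

```python
import re
from urllib.parse import urljoin

# Assumption: the redirect target appears as a quoted string in the page.
REDIRECT_RE = re.compile(r"""["']([^"']*SearchResults\.jsp\?time=\d+)["']""")

def find_js_redirect(html, page_url):
    """Return the absolute URL of the script redirect, or None if not found."""
    match = REDIRECT_RE.search(html)
    if match is None:
        return None
    return urljoin(page_url, match.group(1))

# Hypothetical response body from /applicants/Central.
sample = (
    "<html><head><script>"
    'window.location = "/applicants/jsp/shared/search/SearchResults.jsp?time=1430605051730";'
    "</script></head></html>"
)

target = find_js_redirect(sample, "https://jobs.example.edu/applicants/Central")
print(target)
```

The caller would then issue a plain GET for that URL, with the same session cookie.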

SearchResults.jsp will finally give us the first page of results.

Clicking the "next page" button sends another POST request to Central, with the following parameters:

pageName="/jsp/shared/search/SearchResults.jsp"
functionalityTableName="8192"
delegateParameter="functionalityTableDelegate"
actionParameter="getNextRowSet"
windowTimestamp=(value set in form)
ftPageNumber="1"

Again, the page returned just redirects to SearchResults.

Entering a specific page number and then clicking Go results in a POST to Central, containing

pageName="/jsp/shared/search/SearchResults.jsp"
functionalityTableName="8192"
delegateParameter="functionalityTableDelegate"
actionParameter="jumpToPage"
windowTimestamp=(value set in form)
ftPageNumber="4" (or whatever you entered)

So we can at least use that to get to the desired page.
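A sketch of generating the jumpToPage request bodies for every page. The field names and values are the ones observed above; jump_to_page_body() and all_page_bodies() are made-up helper names, and the assumption that pages are numbered from 1 is mine:

```python
from urllib.parse import urlencode

def jump_to_page_body(window_timestamp, page):
    """Form-encode the fields for a jumpToPage POST to /applicants/Central."""
    if page < 1:
        raise ValueError("page numbers are assumed to start at 1")
    return urlencode({
        "pageName": "/jsp/shared/search/SearchResults.jsp",
        "functionalityTableName": "8192",
        "delegateParameter": "functionalityTableDelegate",
        "actionParameter": "jumpToPage",
        "windowTimestamp": window_timestamp,
        "ftPageNumber": str(page),
    })

def all_page_bodies(window_timestamp, total_pages):
    """Yield (page, body) pairs for pages 1..total_pages."""
    for page in range(1, total_pages + 1):
        yield page, jump_to_page_body(window_timestamp, page)

for page, body in all_page_bodies("PA_1430604174716", 2):
    print(page, body)
```

Each POST presumably still returns the redirect page, so every body has to be followed by a GET of SearchResults.jsp.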

The total number of pages can be found on SearchResults.jsp, in

#ft_8192 > table:nth-child(6) > tbody:nth-child(1) > tr:nth-child(1) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(1) > td:nth-child(2)

Just after the input element is the text "of n", where n is the number of pages. It would probably be easier to locate the input named "ftPageNumber" and then look at the text node following it, rather than use the above selector.
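A sketch of the "find ftPageNumber, then read the following text" approach, using only the standard library's html.parser. The sample markup is invented for illustration; the real page nests this inside several layout tables, which should not matter to this parser:

```python
import re
from html.parser import HTMLParser

class PageCountParser(HTMLParser):
    """Find the "of n" text that follows the input named ftPageNumber."""

    def __init__(self):
        super().__init__()
        self._after_input = False
        self.total_pages = None

    def handle_starttag(self, tag, attrs):
        if tag == "input" and dict(attrs).get("name") == "ftPageNumber":
            self._after_input = True

    def handle_data(self, data):
        # Only look at text that comes after the input, and stop at the first hit.
        if self._after_input and self.total_pages is None:
            match = re.search(r"of\s+(\d+)", data)
            if match:
                self.total_pages = int(match.group(1))

# Hypothetical fragment of SearchResults.jsp.
sample = '<td>Page <input type="text" name="ftPageNumber" value="1"> of 7</td>'

parser = PageCountParser()
parser.feed(sample)
print(parser.total_pages)
```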

What happens if you try to use this site without Javascript enabled? You can't even get past the first request to Frameset, because the redirect to Frameset with the time query argument is handled by Javascript. Really stupid.

What happens if you have cookies disabled is even crazier: The first Frameset constantly redirects to itself, with an ever-increasing time query parameter. It never actually takes you to the set of frames.

Problems:

Procedure:

Parsing Results:

The output table is located at

.tabUnselectedBG > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(1) > table:nth-child(1)

The first row is a header row. Each row contains four cells (columns). In order, they are Title, Open Date, Position Type, and Department. The Title cell contains an element holding the actual title, as well as a (Javascript) link to details that we can ignore. The three other cells just contain plain text.

This is a JSP-based application; searching requires POSTing query criteria to a URL of the form applicants/Central. The server returns a 302 redirect to a URL ending in search/SearchResults_css.jsp, which contains the actual results.

POST query (not body) parameters:

delegateParameter: "searchDelegate"
actionParameter: ""
windowTimestamp: "PA_..."
elementToConfigure: ""
di_21: "-1"
di_75: ""
di_26: "-1"
di_19006: "-1"
searchType: "8192"
formAction: "goSearch"
button_goSearch: "Search"
formAction: "clearSearch"

Obviously some experimentation will be required to figure out which of these are important. Somewhat troubling: removing the timestamp from the request sends us back to the "welcome" page, rather than giving us search results. So we'll either need to issue an initial request to get the timestamp, or figure out how to generate them on our own.
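If we go the "initial request" route, the timestamp can be read out of the form's hidden fields. A sketch using the standard library's html.parser; the sample form is invented (only the field names come from these notes), and HiddenFieldParser is a made-up name:

```python
from html.parser import HTMLParser

class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of all hidden inputs on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""

# Hypothetical form markup; only the field names are taken from the notes.
sample = (
    '<form name="navForm" method="post" action="/applicants/Central">'
    '<input type="hidden" name="delegateParameter" value="">'
    '<input type="hidden" name="windowTimestamp" value="PA_1430604174716">'
    '<input type="hidden" name="searchType" value="8192">'
    '<input type="submit" name="go" value="Search Jobs">'
    "</form>"
)

parser = HiddenFieldParser()
parser.feed(sample)
print(parser.fields["windowTimestamp"])
```

Collecting every hidden field, not just the timestamp, means the later POST can start from whatever defaults the server put in the form.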

Results are paginated: clicking "next page" issues a GET request to search/SearchResults_css.jsp with the following query parameters:

delegateParameter: "functionalityTableDelegate"
actionParameter: "getNextRowSet"
pageName: "/jsp/shared/search/SearchResults_css.jsp"
functionalityTableName: "8192"
windowTimestamp: "PA_..."

I think all the actual work is done by the applicants/jsp/shared/search/SearchResults_css.jsp page. The other pages just redirect to it. But no parameters are sent to this page, so apparently you have to GET/POST to Central and then follow the redirect to get your results. Weird.

The various di_ parameters are linked, I think, to the dropdowns on the search criteria box. E.g., changing position type at one school changes the di_19006 value to 3. This is probably school-specific. But -1 appears to mean "any".

More investigation is needed into exactly what has to be sent via POST vs. GET. The parameters appear to be the same; if we're lucky, the method won't matter.

Search results are output in a table.FunctionalityTable. The first row is a header row (contained in a thead). The tbody contains the actual listings, one per row. Each row has four cells:

  • Job title, with a link to more information (titled "View")

  • Closing date

  • Position type (part time, faculty, etc.)

  • Department
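A sketch of pulling those four columns out of the table with the standard library's html.parser. The sample markup is invented, and two things are assumptions: that the title text sits outside the "View" link (text inside links is dropped), and that the results table contains no nested tables:

```python
from html.parser import HTMLParser

COLUMNS = ("title", "closing_date", "position_type", "department")

class ResultsTableParser(HTMLParser):
    """Collect tbody rows of the table carrying the given CSS class."""

    def __init__(self, table_class):
        super().__init__()
        self.table_class = table_class
        self.rows = []
        self._in_table = self._in_body = self._in_link = False
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "table" and self.table_class in classes:
            self._in_table = True
        elif self._in_table and tag == "tbody":
            self._in_body = True
        elif self._in_body and tag == "tr":
            self._row = []
        elif self._row is not None and tag == "td":
            self._cell = []
        elif self._cell is not None and tag == "a":
            self._in_link = True

    def handle_data(self, data):
        if self._cell is not None and not self._in_link:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False
        elif tag == "td" and self._cell is not None:
            # Collapse runs of whitespace in the cell text.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(dict(zip(COLUMNS, self._row)))
            self._row = None
        elif tag == "tbody":
            self._in_body = False
        elif tag == "table":
            self._in_table = False

# Hypothetical results page fragment.
sample = """
<table class="FunctionalityTable">
<thead><tr><th>Title</th><th>Closes</th><th>Type</th><th>Department</th></tr></thead>
<tbody>
<tr><td>Math Instructor <a href="#">View</a></td><td>05/30/2015</td>
    <td>Faculty</td><td>Mathematics</td></tr>
</tbody>
</table>
"""

parser = ResultsTableParser("FunctionalityTable")
parser.feed(sample)
print(parser.rows)
```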

Links to the job listings themselves also link to Central (GET). In this case the actionParameter is getJobDetail and there is a rowId parameter.

cURL command-line for the initial search request (that brings up the first page of results, unfiltered by any criteria):

curl 'https://jobs.gcccd.edu/applicants/Central' -H 'Host: jobs.gcccd.edu' -H 'User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:37.0) Gecko/20100101 Firefox/37.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'Referer: https://jobs.gcccd.edu/applicants/jsp/shared/search/Search_css.jsp' -H 'Cookie: JSESSIONID=181993CEBC5B1F047E0D96D548A2BAA6.node1; __utma=198381113.2070514108.1428085647.1428085647.1428085647.1; __utmz=198381113.1428085647.1.1.utmcsr=californiacommunitycolleges.cccco.edu|utmccn=(referral)|utmcmd=referral|utmcct=/Districts.aspx' -H 'Connection: keep-alive' --data 'delegateParameter=searchDelegate&actionParameter=&windowTimestamp=PA_1429934436390&elementToConfigure=&di_21=-1&di_75=&di_26=-1&di_19006=-1&searchType=8192&formAction=goSearch&button_goSearch=Search&formAction=clearSearch'
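The same request can be built in Python. The sketch below only constructs the request with urllib and does not send it. One detail worth preserving is that formAction appears twice in the form data, so the fields are kept as a list of pairs rather than a dict. The session id is a placeholder, and I have guessed that the Cookie and Referer headers are the ones most likely to matter:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Fields in the order they appear in the captured request.
# A list of pairs keeps the duplicate formAction key.
fields = [
    ("delegateParameter", "searchDelegate"),
    ("actionParameter", ""),
    ("windowTimestamp", "PA_1429934436390"),
    ("elementToConfigure", ""),
    ("di_21", "-1"),
    ("di_75", ""),
    ("di_26", "-1"),
    ("di_19006", "-1"),
    ("searchType", "8192"),
    ("formAction", "goSearch"),
    ("button_goSearch", "Search"),
    ("formAction", "clearSearch"),
]
body = urlencode(fields)

session_id = "PLACEHOLDER"  # would come from an earlier response's Set-Cookie

req = Request(
    "https://jobs.gcccd.edu/applicants/Central",
    data=body.encode("ascii"),  # supplying data makes this a POST
    headers={
        "Cookie": "JSESSIONID=" + session_id,
        "Referer": "https://jobs.gcccd.edu/applicants/jsp/shared/search/Search_css.jsp",
    },
)

print(req.get_method())
print(body)
```

Dropping entries from the fields list one at a time would be a cheap way to find out which of them the server actually requires.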

Again, how much of that is truly needed will have to be determined by experimentation.

NeoGOV

The only commonality across NeoGOV sites is that the initial search URL ends with default.cfm.

Results are paginated at 25 per page. Requests for subsequent pages require POSTing. But at least there's no weird redirect or timestamping involved (hopefully).

Results are returned in a table.NEOGOV_joblist. There is a thead containing the header row. Each data row contains a single job, with the following columns:

  • Position name (also a link to the position)

  • Employee type (part-time, etc.)

  • Salary

  • Closing date

NeoGOV has a central job-listing site called governmentjobs.com. All of the individual job-listing sites are actually frames served up from this domain.