Semantic Search for Sourcers and Recruiters, Round 2
Looking at my blog stats, it appears I may have struck a nerve when I wrote a post on semantic search for sourcers and recruiters on Monday, December 29th, explaining the concepts of semantic search with regard to how it can be leveraged effectively by sourcers and recruiters.
I received several questions, comments, and requests for more information on the subject. If you research the concept, you likely won’t find much, if anything, specifically focused on user-defined semantic search, let alone user-defined semantic search for sourcing and recruiting. So I figured I would take this opportunity to quickly follow up and expand upon the concepts of leveraging semantic search when creating Boolean search strings.
Irina Shamaeva, of Boolean Strings fame (on both LinkedIn and Recruitingblogs), asked a very insightful question: would I agree that in fact all sourcers and recruiters are trying to perform semantic search when they are looking for candidates who match a job description?
While I would agree that most sourcers and recruiters are trying to find people who have had specific experience performing the role and responsibilities required (and desired) of a job description – I have to say that most sourcers and recruiters are NOT actually performing semantic search. In other words, they are not specifically leveraging semantics in their search tactics and strategies. Throwing a collection of search terms into a Boolean search string based on a job description is just that – a collection of words, which do not necessarily imply any meaning in and of themselves.
Adding keywords that may not be explicitly mentioned in a job description, or using the NOT/- operator to eliminate false positive results, may (or may not) help narrow search results. While these are certainly search best practices, they are not intrinsically semantic search. In many cases, adding or selectively removing search terms simply produces results that contain the added terms (and lack the removed ones) without implying any responsibility associated with those terms.
Using a User Interface Engineering position as an example, there are many people who can mention the words (UI OR user interface OR GUI) in their resume who in fact do not have any significant experience with interface design, even though the words are somewhere in the resume. Even if we added additional UI-related terms such as (wireframe OR human factors OR cognitive OR heuristic) we can still return many resumes of candidates that mention the search terms but who have not been primarily responsible for interface design.
When creating Boolean search strings to find potential candidates – the whole point is to find people who have specific skills and experience performing the role/responsibilities of the position the sourcer/recruiter is looking to fill – in other words, relevant results. Certainly no one sets out to run a Boolean search to specifically find a bunch of people who aren’t qualified. Any results returned from a search of candidates who do NOT have the skills and experience necessary to perform the responsibilities of the position the sourcer/recruiter is looking to fill are essentially irrelevant results. Sure, you can build relationships with them and ask for referrals, and perhaps they fit another position, but there is no denying that they simply do not match the intent of the search.
Sourcers and recruiters encounter this all the time when creating Boolean search strings – the search terms are present in the resume, sometimes in large quantity, but the person has not actually DONE what they need them to have done in their career. That is an excellent example of a high lexical similarity between a search and the results (the words match) and low semantic similarity of the search and the results (the person’s experience does NOT match).
High lexical similarity and low semantic similarity between a search and its results can also be seen in the “technical skills summary” of most resumes, where a laundry list of skills and technologies is present. Just because something is mentioned in a resume does not imply any level of expertise, recency of experience, or even paid experience. Someone could mention Java, Eclipse, and Swing in their resume without having any paid experience developing applications with them (their exposure may have come only from coursework or projects at home). This is a normal experience for sourcers and recruiters, so they assume this is simply “the way it is.”
However, if we use the NEAR operator (or, even better, the more powerful configurable proximity search operators of Lucene and dtSearch), we could add this to a search string on Monster: (develop* OR design*) NEAR (Java OR Eclipse OR Swing). The results MUST then mention Java or Eclipse or Swing within 10 words of develop or design, increasing the likelihood that the results will include resumes that have sentences specifically stating development or design-level responsibility with Java/Eclipse/Swing. This is tapping into semantics: the mere presence of words in a resume does not necessarily imply any meaning, but words in the same sentence do imply meaning in most (but certainly not all) cases.
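For readers who like to see the mechanics, here is a minimal Python sketch of what a proximity match along these lines might do. It is only an illustration and not how Monster, Lucene, or dtSearch actually implement their operators. The function names (tokenize, term_matches, near) are made up for this example, the trailing * is treated as a simple prefix wildcard, and the 10-word window is taken from the description above.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())

def term_matches(term, token):
    """Treat a trailing * as a prefix wildcard (develop* matches 'developed')."""
    term = term.lower()
    if term.endswith("*"):
        return token.startswith(term[:-1])
    return token == term

def near(text, left_terms, right_terms, window=10):
    """Return True if any left term occurs within `window` words of any right term."""
    tokens = tokenize(text)
    left = [i for i, tok in enumerate(tokens)
            if any(term_matches(t, tok) for t in left_terms)]
    right = [i for i, tok in enumerate(tokens)
             if any(term_matches(t, tok) for t in right_terms)]
    return any(abs(i - j) <= window for i in left for j in right)

# Hypothetical resume snippets, invented for illustration only.
hands_on = "Designed and developed desktop applications in Java using Swing."
mention_only = ("Developed marketing plans for regional retail clients across "
                "four states over a period of three years. Hobbies include "
                "chess. Skills: Java")

verbs = ["develop*", "design*"]
skills = ["Java", "Eclipse", "Swing"]

print(near(hands_on, verbs, skills))      # verb and skill share a sentence
print(near(mention_only, verbs, skills))  # skill is 20 words away from the verb
```

Both snippets contain "developed" and "Java", so a plain AND search would return both. Only the first one places the verb and the skill close enough together to suggest actual responsibility.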
The NEAR operator and configurable proximity functionality of search applications such as Lucene and dtSearch are the best ways to leverage semantics when searching because they allow you to target sentence structure, such as when people talk about doing X with Y (configuring routers, reconciling reports, administering a server cluster, implementing SAP, customizing interfaces, performing SOX audits, etc.).
We must also recognize that the candidates that sourcers and recruiters search for are decidedly NOT professional resume writers. However, they don’t have to be. Even though the majority of people are not very good at writing resumes and are clueless as to how sourcers and recruiters search for and analyze them, most people do create simple sentences with verbs and nouns (e.g., performing audits, designing portals, troubleshooting a server, managing software development, etc.). Many people are also very direct about their responsibilities and what they do, and we can take advantage of this by targeting these statements with proximity/semantic searching.
Semantic searching can also be used to specifically defeat the false positives associated with large skill summary/technology lists in resumes. Most of these summary sections are just that: lists of technologies, not sentences with subjects and verbs. Boolean search strings that employ a proximity search operator such as NEAR can essentially eliminate “hits” on search terms buried in such lists, because the operator requires a term to appear near an action word (e.g., developing applications in Java, supervising accountants, maintaining cancer registries, etc.), so resumes that only mention the search terms by themselves are not returned.
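As a rough illustration of why skill lists drop out, the Python sketch below checks whether an action word and a skill appear together in the same sentence or line. Note that this is a simplification I am assuming for the example: it is a sentence-scoped check, not the NEAR operator itself, and the names (VERB_PREFIXES, SKILLS, split_units, responsibility_hits) and the sample resume text are all invented.

```python
import re

# Prefixes standing in for develop* / design*; an assumption for this example.
VERB_PREFIXES = ("develop", "design")
SKILLS = {"java", "eclipse", "swing"}

def split_units(text):
    """Split text into rough sentence-like units on punctuation and line breaks."""
    return [u.strip() for u in re.split(r"[.!?\n]+", text) if u.strip()]

def responsibility_hits(text):
    """Return the units that contain both an action word and a skill term."""
    hits = []
    for unit in split_units(text):
        words = re.findall(r"[a-z]+", unit.lower())
        has_verb = any(w.startswith(VERB_PREFIXES) for w in words)
        has_skill = any(w in SKILLS for w in words)
        if has_verb and has_skill:
            hits.append(unit)
    return hits

# Hypothetical resume text, invented for illustration only.
resume = (
    "Technical Skills: Java, Eclipse, Swing, SQL, XML\n"
    "Managed a help desk team of five.\n"
    "Developed billing applications in Java using Swing."
)

for hit in responsibility_hits(resume):
    print(hit)
```

The skills line mentions all three technologies but contains no action word, so it is ignored. Only the sentence that pairs "Developed" with Java and Swing is reported. A real resume will not always be this tidy, which is why the approach reduces false positives rather than guaranteeing their removal.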
If you are interested in the concept of semantic search for sourcing and recruiting, stay tuned, as I have 2 more posts focused on semantic search coming in January.
Happy New Year!




