Automated Semantic Search: Proceed with Caution
There are many software vendors who offer solutions to HR, recruiting, and sourcing organizations that claim to have automated candidate search and match capability. These applications can take your search, a job description, or an example resume and claim to leverage semantic search, fuzzy logic and/or Artificial Intelligence search technology to return relevant results.
I’ve had the opportunity to use and evaluate 4 “big name” semantic and intelligent search/match applications for identifying candidates, and I am currently implementing one of them, including customizing the taxonomies and ontologies - so I have quite a bit of hands-on, practical experience with semantic search applications. While I do think that intelligent semantic search and match applications definitely have a place in the sourcing and recruiting process, they should not be looked at as solutions to a problem, challenge, or a deficiency in your skills or your team’s.
Before implementing a semantic search application, it is important to first understand the manual search process, how automated semantic search works, and the intrinsic limitations of semantic search applications.
People Do the Work, Computers Move Information
Eiji Toyoda, the fifth president of Toyota Motor Corporation, who collaborated with Taiichi Ohno to fine-tune the concept of Kaizen as well as to develop the core concepts of the ‘Toyota Way’, explains brilliantly: “Society has reached the point where one can push a button and be immediately deluged with…information. This is all very convenient, of course, but if one is not careful there is a danger of losing the ability to think. We must remember that in the end it is the individual human being who must solve the problems.”
As I have written before, thinking is the most critical step in the candidate sourcing process, and regardless of “Artificial Intelligence” and semantic search marketing hype, applications do not have any true cognitive power, nor do they have the ability to be creative or learn as people do. Thus I could not agree more with Jeffrey Liker’s (author of The Toyota Way) assessment - “People do the work, computers move information.”
Do Not Automate What You Cannot Already Perform Manually
If you are not a highly proficient Talent Miner, capable of manually querying databases and systems with Boolean logic and a high degree of precision to find not only the obvious and easy to find candidates, but also those residing in Hidden Talent Pools – how can you even HOPE to begin to evaluate what an automated semantic search solution is claiming to do?
You can’t.
If you’re going to buy and use a product or service of ANY kind, and you don’t really understand what it does or even exactly how it does it (beyond the marketing hype), and you can’t tell if it REALLY does everything it claims to – how and at what level can you determine if the product or service truly meets your needs and will provide true value?
If you or your organization struggles with the challenge of finding the right candidates at the right time on a consistent basis, implementing an automated search and match application will not magically solve this “problem” for you. The real underlying problem is likely that you or your organization currently does not #1 possess the right skills or the right people who are highly proficient at candidate sourcing and/or #2 have documented highly effective candidate sourcing processes and best practices that are consistently trained to and followed. No software application can fix either or both of those issues.
Jane Beseda, Group VP at Toyota, believes that it is best to “First work out the manual process, and then automate it. Try to build into the system as much flexibility as you possibly can so you can continue to kaizen the process as your business changes” because “…you can kaizen (continually improve) people processes very easily, but it is hard to kaizen a machine (or application).” I agree wholeheartedly.
Master the Sourcing Process Manually, THEN Introduce Automation
If you or the people on your team don’t fully understand how to effectively leverage technology for talent identification and you can’t perform it manually, I do not recommend implementing an automated search/match solution. It is critical to #1 First develop your skills and ability (or your team’s) to manually source candidates, #2 Document your sourcing best practices and processes, #3 Make sure that they are consistently trained and applied, and #4 Strive to continually improve them. THEN you can go about evaluating automated search and match solutions because you will actually have the ability to truly assess and understand what the products do, you will be able to determine whether or not they meet your needs and can provide real value to your organization, and you will be able to assess how you can possibly best leverage them into your sourcing efforts.
For example, if you don’t really comprehend the concept of semantic search and how it can be applied in candidate sourcing (that is, neither you nor anyone on your team can manually leverage semantic search in Boolean strings at an expert level), you won’t be able to tell whether or not an application claiming to leverage semantic search is really effectively doing so, if it will provide any real value to your sourcing efforts, or if it is finding all of the best possible matches in the sources it is searching.
Speaking of Semantic Search
Many automated candidate search and match solutions claim to leverage semantic search. I’ve written many articles on the topic of how semantics can be leveraged for sourcing candidates – so I won’t go into too much detail here – but semantics refers to the study of meaning, as inherent at the levels of words, phrases, and sentences.
Sourcers and recruiters can leverage semantics in their sourcing efforts to more quickly find more relevant results. When the search results match the intended MEANING of the search (the INTENT), there is a semantic similarity between the INTENT of the search and its results. In other words – you get what you’re looking for. When search results simply match the search terms but not the INTENT of the search, there is only a lexical similarity between the search and its results. In other words – the words match, but you don’t get what you’re looking for.
For example, if I were manually searching for Linux systems administrators – that is the INTENT of my search – to find people who have been primarily responsible for administering Linux systems, regardless of exactly how they express that experience in their resume. However, with basic keyword and title search, I will find a mix of relevant results (people who have been responsible for Linux administration) and irrelevant lexical-only match results (people who happen to mention the search terms of Linux and systems and administration somewhere in their resumes or profiles, but who have not been primarily responsible for Linux systems administration).
However, as a thinking human being, I can truly learn from the search results and continually improve and adapt my search strategies and keywords to leverage the language used specifically by people who have been primarily responsible for administering Linux systems (using semantics – the actual meaning of specific words and phrases), ensuring more of the results returned actually match the INTENT of my search.
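To make the lexical-versus-semantic distinction concrete, here is a minimal Python sketch. The two resume snippets, the helper names (`keyword_match`, `intent_match`), and the intent phrases are all invented for illustration; they are assumptions on my part, not data or logic taken from any real product.

```python
# Hypothetical sample data: both resumes contain the basic search keywords.
RESUMES = {
    "admin": "Administered 200 Red Hat Linux systems; handled patching, "
             "kernel tuning, and user account management.",
    "developer": "Java developer. Deployed code to Linux systems and worked "
                 "with the systems administration team on releases.",
}


def keyword_match(text, keywords):
    """Lexical match: True if every keyword stem appears anywhere in the text."""
    lowered = text.lower()
    return all(k.lower() in lowered for k in keywords)


def intent_match(text, phrases):
    """Closer to intent: True if any phrase a hands-on admin tends to use appears."""
    lowered = text.lower()
    return any(p.lower() in lowered for p in phrases)


KEYWORDS = ["linux", "system", "administ"]
# Phrases a sourcer might add after reading real admin resumes (assumed).
INTENT_PHRASES = ["administered", "kernel tuning", "patching"]

lexical_hits = [name for name, text in RESUMES.items() if keyword_match(text, KEYWORDS)]
intent_hits = [name for name, text in RESUMES.items() if intent_match(text, INTENT_PHRASES)]

print(lexical_hits)  # both resumes match on words alone
print(intent_hits)   # only the hands-on administrator matches
```

The keyword search returns the developer as well as the administrator, because the words match. The refined search, built from language that people who actually did the work tend to use, returns only the administrator.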
So How Do Applications Achieve Semantic Search?
To provide relevant search results, various automated search and match applications claim to:
- Have “pseudo-AI” – these perform semantic “concept matching” based on a taxonomy/ontology (lists of terms, equivalent terms and their relationships)
- Perform “natural language” search – using pre-programmed phrases and/or proximity matching
- Execute “context-aware” semantic matching of jobs and resumes
- Perform fuzzy matching – returning results matching words that were not specifically searched for by the user, but that the application “thinks” are likely to be relevant
- Have “full AI” – the software is designed with algorithms to create relationships between words, abbreviations and phrases dynamically and without human intervention
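The taxonomy-driven “concept matching” in the first item of the list above can be sketched roughly as follows. This is a simplified illustration under my own assumptions: the `TAXONOMY` contents and the function names are hypothetical, and real products use far larger term lists and more elaborate relationships.

```python
# Hypothetical, hand-built taxonomy: each concept maps to terms treated as equivalent.
TAXONOMY = {
    "linux administration": ["linux administrator", "linux sysadmin", "rhel admin"],
    "java": ["j2ee", "jvm"],
}


def expand_query(term, taxonomy):
    """Return the term plus any equivalents the taxonomy lists for it."""
    key = term.lower()
    return [key] + taxonomy.get(key, [])


def concept_match(text, term, taxonomy):
    """True if the text contains the term or any listed equivalent."""
    lowered = text.lower()
    return any(t in lowered for t in expand_query(term, taxonomy))


# Found only because "linux sysadmin" is listed as an equivalent term.
print(concept_match("Senior Linux SysAdmin at a hosting company",
                    "Linux administration", TAXONOMY))
# Missed: nothing in the taxonomy connects "Ubuntu server engineer" to the concept.
print(concept_match("Ubuntu server engineer",
                    "Linux administration", TAXONOMY))
```

The second call shows the dependency on the term lists: a relevant candidate is missed simply because nobody added the phrasing they used.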
Applications Are Not Mind Readers
No software application is capable of determining the INTENT of any search – it is critical to recognize and understand this, because it’s at the very core of semantic matching, although it’s not often written about when it comes to sourcing candidates. No application can actually *know* what you’re looking for – only YOU know what you’re trying to find, which is a person who has had some sort of specific experience performing certain responsibilities, often in specific environments.
Essentially, automated search/match applications take either your search terms, a job description, and/or an example resume and make their “best guess” as to your intent. Don’t get me wrong – some applications do a very good job of taking what you give them and providing some relevant matches. However, you must always be aware that these are just that – guesses – and on top of it all, they’re guesses made by an application, not a person.
Output is Limited by Input
Search/match applications are only as good as their input - if a user is not especially adept at crafting search strings, the quality of the results the application can produce will be limited. Additionally, we are all aware that resumes and job descriptions are FAR from the best representations of skills and experience, so the results produced by an application interpreting a resume or a job order are intrinsically limited.
There is no replacement for the cognitive and interpretive power of the human mind – people are much more capable of “reading between the lines” of resumes and job descriptions and getting to the true essence of the required skills and experience to determine how to best approach sourcing efforts to find the right people. Applications can claim to do this, but it’s an apples to oranges comparison.
Let’s also remember that a sourcer or recruiter can discuss the position’s requirements with the hiring manager (in most cases – if not, they should be able to!) – this information is typically beyond what any example resume or written job description can (or does) convey. Armed with this critical information and detail, a sourcer or recruiter can translate what they’ve gathered verbally from the hiring manager/team into a sourcing strategy, down to the Boolean search string level. The last time I checked, an application cannot do this. There is no replacement for good analytical and interpretive ability – it’s precisely why you can’t fully automate a business analyst’s, a data analyst’s or a financial analyst’s job.
Part of the intrinsic challenge faced by automated search and matching applications is that they are expected to make the leap from “give me what I said” (match the words) to “give me what I want” – matching your intent. Considering that an application is incapable of “understanding” your intent, how can it hope to deliver results that match your intent when your intent cannot be interpreted from an example resume, a job description, or a poorly defined search?
Intrinsic Limitations of Automated Semantic Search/AI
Be aware that any automated semantic search/match application is only as good as its programming, taxonomy, and ontology:
- Pre-programmed lists of “relevant” or related keywords may in fact not be relevant or related to your specific search, and can get outdated quickly
- Fuzzy matching is by definition “approximate” or “inexact” matching, the very opposite of “precise”
- Applications are essentially “guessing” the intent of your search based on the keywords, resume, or job description you feed it – but finding the best talent is not a guessing game
- Some systems will return results with related words – which may in fact NOT be relevant to your specific need – will you be able to determine which are and which aren’t?
- Applications that claim to “learn” may not actually improve the relevance of results over time
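To illustrate why fuzzy matching is “approximate” by nature, here is a small sketch using `difflib.SequenceMatcher` from the Python standard library. The skill terms, the `fuzzy_matches` helper, and the 0.7 threshold are assumptions chosen for the example; commercial products use their own similarity measures.

```python
from difflib import SequenceMatcher


def fuzzy_matches(query, candidates, threshold=0.7):
    """Return candidates whose similarity ratio to the query meets the threshold."""
    results = []
    for candidate in candidates:
        ratio = SequenceMatcher(None, query.lower(), candidate.lower()).ratio()
        if ratio >= threshold:
            results.append(candidate)
    return results


# Hypothetical terms found in resumes.
SKILLS = ["Java", "Lava", "JavaScript"]
matches = fuzzy_matches("java", SKILLS)
print(matches)
```

With this threshold, the unrelated word “Lava” is returned because it merely looks like “Java”, while “JavaScript” is not. Whether either outcome is correct depends entirely on the intent of the search, which the similarity score knows nothing about.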
The Tough Questions
Are you capable of evaluating an application “under the hood?” When a vendor tells you that their application is “trained to identify the equivalent meanings of terms found in resumes,” do you know how to get to the bottom of exactly how their application accomplishes this, and whether or not their application can actually do it for your specific hiring profiles? Could you manually run searches to objectively evaluate the vendor’s claims? If you don’t already possess the ability to manually source candidates from information systems, what will you do when your automated search/match application fails to produce the right results, or enough of them? Do you have the ability to search your own database to find the candidates that the search/match application cannot find or misses?
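One way to objectively compare an application’s output against your own manual searches is to compute precision (how many returned candidates are truly relevant) and recall (how many of the truly relevant candidates were returned). The sketch below is only an illustration: the candidate IDs are made up, and it assumes a skilled sourcer has already judged which candidates are true matches for the requisition.

```python
def precision_recall(returned, relevant):
    """Compute precision and recall for one search.

    returned: IDs the application returned.
    relevant: IDs a skilled sourcer judged to be true matches.
    """
    returned_set, relevant_set = set(returned), set(relevant)
    true_positives = len(returned_set & relevant_set)
    precision = true_positives / len(returned_set) if returned_set else 0.0
    recall = true_positives / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical candidate IDs for a single requisition.
app_results = ["c1", "c2", "c3", "c4"]
manual_results = ["c1", "c2", "c5", "c6", "c7"]

precision, recall = precision_recall(app_results, manual_results)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case, half of what the application returned is relevant, and it found only two of the five candidates the manual search identified. Without the manual baseline, there would be nothing to measure the application against.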
My Experience and Opinion
I’ve had the opportunity to use a number of automated semantic search/match applications. I’m also currently implementing one and I’m involved in customizing the taxonomies and ontologies. I’m very excited about the features and capability of the semantic search application I’m using. However - I’ve found it does have its limitations, and that there is no perfect solution that magically produces the best candidates available with the push of a button, nor can an automated intelligent/semantic search application fully replace a skilled sourcer or recruiter. Augment and empower - yes. Replace – no.
As someone who is highly proficient in manual Boolean and semantic search, from everything I have gathered, most automated search and match applications simply automate basic sourcing best practices, but none that I have used do so completely or flawlessly.
In fact, some search/match apps appear to do little more than heavily leverage common title and skill terminology matching – which any sourcer or recruiter of average skill can accomplish. Other applications go overboard and get sloppy in the process, suggesting and incorporating alternate search terms they *think* are related and relevant to your search – and in some cases the terms are actually NOT relevant to the intent of the user’s search, causing more harm than good. But if you don’t know what to look for (past the marketing and past the huge cloud of “related” search terms the app suggests) and you don’t know how to get to the core of the search logic and the true relevance based on the intent of your search, you might just actually believe you have the solution to all of your candidate sourcing problems.
While some industries can benefit heavily from automated search and match applications – for example, those with simple and highly consistent titles, lexicology, and phraseology (such as finance and accounting), others vary widely and change rapidly (such as Information Technology) and pose serious challenges for vendors of automated search and match solutions.
Check Your Reasoning
If you’re looking to implement an automated candidate search/match application, simply ask yourself why – and dig down to the very root of it. Is it because Boolean searching is “too hard?” Is it because you think the application can reduce your costs by reducing headcount? Do you think a search/match app can speed up the talent identification cycle?
I’ll let you in on a little secret – Boolean search isn’t that hard with the proper training. It’s more science than art, really. Which is good news, because manual talent mining via Boolean search strings can be broken down to repeatable steps, including the interpretive and analytical processes, which can be continually and dynamically improved – whereas an application cannot (easily, or at all).
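As a rough sketch of how repeatable the mechanical part of this can be, the snippet below assembles a Boolean string from groups of equivalent terms. The function name, the example terms, and the exact operator syntax are assumptions for illustration; operator support (for example `AND NOT` versus `NOT`) varies between search engines, so check the syntax of the system you are searching.

```python
def build_boolean_query(required_groups, excluded_terms=()):
    """Build a Boolean string: OR within each group, AND between groups, NOT for exclusions."""

    def quote(term):
        # Quote multi-word phrases so they are searched as exact phrases.
        return f'"{term}"' if " " in term else term

    clauses = ["(" + " OR ".join(quote(t) for t in group) + ")" for group in required_groups]
    query = " AND ".join(clauses)
    for term in excluded_terms:
        query += f" AND NOT {quote(term)}"
    return query


# Hypothetical term groups a sourcer might settle on after reviewing results.
query = build_boolean_query(
    [["Linux", "RHEL", "Red Hat"], ["administrator", "sysadmin", "systems engineer"]],
    excluded_terms=["recruiter"],
)
print(query)
```

The code only automates the assembly. Deciding which terms belong in each group, and revising them after reading the results, is the interpretive work that stays with the sourcer.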
Also – semantic search/match applications are not a replacement for people, because the applications don’t actually perform real work or deliver value to the end customer. When I refer to “work” and “value,” I’m referring to how Lean and the Toyota Production System define those terms. Recall Jeffrey Liker’s assessment, “People do the work, computers move the information.” The author of The Toyota Way has also explained that “…the only thing that adds value in any type of process…is the physical or information transformation of that product, service, or activity into something the customer wants.”
By customer, Jeffrey means the END customer – the internal hiring manager or the external client. Automated search/match apps don’t transform or produce a product that the end client wants – which is a talented person who is a great match for their need. Automated search/match applications produce resumes and profiles (i.e. “computers move the information”) and the sourcers and/or recruiters analyze and transform those resumes/profiles into fully screened, closed, and qualified candidates for the end customer (i.e., people do the work).
My Suggestions
Automated search/match applications definitely have their place in the world of sourcing and recruiting. I think they are best used to facilitate and augment the talent identification efforts of sourcers and recruiters. Some can be especially useful in quickly and simultaneously searching multiple online sources of resumes and social media profiles and parsing large volumes of results into your internal ATS/CRM – this quick and permanent human capital data capture can be a major benefit of using some search/match solutions.
Ultimately, I feel that sourcers and recruiters using AI/semantic search applications should utilize the results they produce and return as “suggested reading” - but I would never rely solely on an application to exhaustively identify top talent, just as no one would trust a plane full of people to take off and land on autopilot.
Whether or not you are considering purchasing an automated search and match application or service, or you’ve already implemented one, I strongly urge you to #1 First develop your skills and ability (or your team’s) to manually source candidates, #2 Document your sourcing best practices and processes, #3 Make sure that they are consistently trained and applied, and #4 Strive to continually improve them. THEN assess the best way to leverage an automated matching solution to augment your talent identification efforts.
Do not take a poorly functioning sourcing process or team and expect to fix it using an automated search/match application. Fine tune your sourcing process and best practices, develop your sourcers and recruiters with exceptional training, and then surgically insert matching technology to enhance them. Technology is a tool that exists to support your people and your processes – it is not a solution to a problem nor a replacement for a process or a person.
Conclusion
I am in no way against inserting semantic search applications into the sourcing function (remember, I’m using one now!) – but I feel it must be done for the right reasons AND with a full understanding and capability of performing the sourcing processes manually, else you cannot continually improve your sourcing processes, nor will you be capable of picking up sourcing where the application fails to deliver.
If finding some candidates is your end goal, then you can feel comfortable using automated search and match solutions to do all of the “heavy lifting” for you or your associates when it comes to talent identification. However, if your goal is to find all of the best candidates, then you should use an automated matching application as a tool to support your sourcers and recruiters who are well trained and effective in manual sourcing best practices.
Caveat Emptor
If you are looking to purchase an automated candidate search/match solution, or you’ve already implemented one, and you do not have the expertise or experience associated with assessing and implementing such solutions – I STRONGLY urge you to seek a neutral, third party HR/Recruiting technology consultant and involve them in the process. It is all too easy to be sold by a vendor’s marketing and messaging – but you are at a distinct disadvantage if the vendor knows more than you about the sourcing and matching function – be it manual OR automated.
Be wary – do not seek to automate a process that you do not fully understand how to perform manually.
Partial List of Vendors of Matching Technology:
- Pure Discovery: http://www.purediscovery.com/
- Actonomy: http://www.actonomy.com/
- Semetric (Engenium): http://www.krollontrack.com/semetric/
- TalentSpring: http://www.talentspring.com/
- Sovren: http://www.sovren.com/
- BurningGlass: http://www.burning-glass.com/
- ResumeMirror: http://www.talenttech.com/
Are you a vendor and want to be added to this list? Would you like me to evaluate your product? Let me know. Thanks!








I’m not a vendor but I would like to make a couple of mandatory additions from this side of the ocean in addition to Actonomy:
parsing:
Textkernel: http://www.textkernel.nl
DaXtra: http://www.daxtra.com
matching:
WCC: http://www.wcc-group.com
matching&parsing:
Lingway: http://www.lingway.com
“a Tool is just a tool”.
Seems to me that Semantic + Bots = Nail Gun VS. old-school = reg hammer, BUT some of the best furniture/homes are “hand built”.
Hmmmm
Interesting article. I’m not sure matching itself is the issue, nor automation; I think it’s a combination of ‘volume and science’. What I mean by that is that most ‘matching’ providers do not have the candidate volume to make match work well, so they continually water down the ‘math’ proposition in an attempt to provide value – if there were 10 million candidates in a database all in the same data format, then matching would be the absolute best way to narrow down the field, SEO and fuzzy logic would not come close.
Secondly, matching is one person’s view of what is right, it’s not your view, and the ‘your’ needs to be built into candidate selection technology – move away from a pure science to an ‘adaptive’ science – at the end of the day, candidate sourcing technology should do one thing – get a recruiter to a list of qualified and willing candidates who they want to interview – and do it fast.
The technology has to be the enabler in this equation, the recruiter the knowledge – couple adaptive selection with candidate volume and I think you would be in recruiter paradise.
@Mark – thanks for the suggestions! I will incorporate those vendors into the post shortly. If you can think of any more, please let me know.
@Jeremy – nice analogy. Although part of my point is that semantic search engines lack the intrinsic precision of “hand made” searches, and hammers aren’t really known as a precision instrument. But I like your take on the mass production vs. hand-crafted/high quality angle.
@Jason – I firmly believe that it is critical to understand a process manually before seeking to automate it. There’s certainly nothing intrinsically wrong with automation – it just has to be done for the right reasons, in proper support of people making decisions, as you suggest. I’ve had the honor of working with some really good matching software, and we have a resume database numbering in the millions, and IMO it’s sloppy. Some good matches are not enough for me. That’s like implementing robotics on an assembly line and being able to crank your output, but with a higher defect rate. Although matching to some extent is based on perception, ultimately there is a true match between what the client/manager wants and specific candidates, and software can’t match that, IMO.
Good, informative article Glen. I read with interest.
Full disclosure: HireAbility is in the resume parsing business not the matching business. We keep a vendor neutral approach and integrate with the best matching technology our clients choose. Sometimes that’s their own homegrown solutions. We believe better ranked results from matching solutions come from searching on the tagged data that parsers provide. So in theory: the better the parser, the better the data, the better the match. That’s why we choose to keep our focus on continuous advancement of our parser.
I agree with you that evaluations of accuracy can never be “quick and dirty.” You need a good methodology and a clear understanding of what you’re doing. Determining accuracy of probabilistic tools is best done with extremely large sample data sets and by looking at the smallest data elements in those sets for a yes or no (1 or 0) match. This is much, much more than any one prospective buyer has the time or resources to do at the purchase phase.
I would, however, like to clarify a couple Matching Technology vendors you identify. Resume Mirror is a licensed reseller of Engenium. Engenium is on your list already. And Sovren wraps up DT Search’s matching technology under its own label. DT Search is not on your list. It should be.
Amy,
Thanks for reading and leaving such a great comment!
You raise an excellent point that I did not address in the scope of my post – that parsing is critical when it comes to matching solutions, which it absolutely is. Better parsing results in more powerful, targeted and precise searching and matching.
What I also think is critical for prospective customers looking to implement a semantic matching solution is that they make sure they are able to modify taxonomies and ontologies to customize the solution for their specific needs – either themselves, or have the vendor do it for them under their direction. I haven’t seen any “out of the box” matching solutions that are already perfectly and completely relevant for every possible client. In fact – I believe that would be impossible.
You are correct that Resume Mirror uses Engenium – I should have had 2 lists: 1 for the search/match applications (such as Engenium), and 1 that includes parsing as well (such as Resume Mirror). I’ve worked with Engenium for a few years – I have never been impressed with its capability – in fact, I am quite disappointed with it.
And while you are correct that Sovren uses dtSearch as their text search engine, dtSearch, to the best of my knowledge, has no true semantic matching capability. I also could not find any mention of any semantic matching technology in dtSearch’s literature. http://www.dtsearch.com/
I’ve been using dtSearch as a stand alone product for a few years now, and it is a fantastic text search engine that effectively enables manual semantic search through variable term weighting and configurable proximity searching, along with some fuzzy/phonic capability (which I don’t use because it doesn’t increase relevancy in my experience). I am also currently working with Sovren – Sovren provides the Semantic Matching Engine powered by the data extraction, classification, and tagging technologies of the Sovren Resume/CV Parser, and simply uses dtSearch as the text search engine, as far as I understand it.
Do you have any other vendors you recommend I add to the list?
Thanks again for your insightful comment!