<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Boolean Black Belt-Sourcing/Recruiting &#187; Semantic Search</title>
	<atom:link href="http://www.booleanblackbelt.com/category/semantic-search/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.booleanblackbelt.com</link>
	<description>Leveraging LinkedIn, Twitter, Social Media, Resume Databases, and the Internet for Sourcing and Recruiting</description>
	<lastBuildDate>Mon, 30 Jan 2012 14:00:14 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
		<item>
		<title>The Guide to Semantic Search for Sourcing and Recruiting</title>
		<link>http://www.booleanblackbelt.com/2012/01/semantic-search-explained-for-sourcing-and-recruiting/</link>
		<comments>http://www.booleanblackbelt.com/2012/01/semantic-search-explained-for-sourcing-and-recruiting/#comments</comments>
		<pubDate>Mon, 16 Jan 2012 14:00:47 +0000</pubDate>
		<dc:creator>Glen Cathey</dc:creator>
				<category><![CDATA[Semantic Search]]></category>
		<category><![CDATA[5 levels of semantic search]]></category>
		<category><![CDATA[6Sense]]></category>
		<category><![CDATA[Boolean Black Belt]]></category>
		<category><![CDATA[conceptual search]]></category>
		<category><![CDATA[Configurable proximity]]></category>
		<category><![CDATA[contextual search]]></category>
		<category><![CDATA[Extended Boolean]]></category>
		<category><![CDATA[Glen Cathey]]></category>
		<category><![CDATA[grammatical search]]></category>
		<category><![CDATA[hcdir]]></category>
		<category><![CDATA[Human Capital Data]]></category>
		<category><![CDATA[inferential search]]></category>
		<category><![CDATA[information retrieval]]></category>
		<category><![CDATA[Monster]]></category>
		<category><![CDATA[natural language search]]></category>
		<category><![CDATA[Proximity Search]]></category>
		<category><![CDATA[recruiting informatics]]></category>
		<category><![CDATA[Resume Sourcing]]></category>
		<category><![CDATA[resume tagging]]></category>
		<category><![CDATA[Semantic Clustering]]></category>
		<category><![CDATA[semantics]]></category>
		<category><![CDATA[Sourcing]]></category>

		<guid isPermaLink="false">http://www.booleanblackbelt.com/?p=8275</guid>
		<description><![CDATA[If you have nearly any tenure in HR, sourcing or recruiting, you&#8217;ve probably heard something about &#8220;semantic search&#8221; and perhaps you would like to learn more. Well &#8211; you&#8217;ve found the right article. As a follow-up to my recent Slideshare on AI sourcing and matching, I am going to provide an overview of semantic search, [...]]]></description>
			<content:encoded><![CDATA[
<p><a href="http://www.flickr.com/photos/will-lion/2680454123/"><img class="alignright size-full wp-image-10365" title="Semantic Search for Sourcing and Recruiting" src="http://www.booleanblackbelt.com/wp-content/uploads/2012/01/Semantic-Search-Google2.jpg" alt="" width="250" height="250" /></a>If you have nearly any tenure in HR, sourcing or recruiting, you&#8217;ve probably heard something about &#8220;semantic search&#8221; and perhaps you would like to learn more.</p>
<p>Well &#8211; you&#8217;ve found the right article.</p>
<p>As a follow-up to <a title="Glen Cathey's presentation on Talent Sourcing and Matching: Artificial Intelligence vs. Human Cognition" href="http://www.booleanblackbelt.com/2012/01/talent-sourcing-man-vs-aiblack-box-semantic-search/">my recent Slideshare on AI sourcing and matching</a>, I am going to provide an overview of semantic search, review the claims that semantic search vendors often make, explain how semantic search applications actually work, and expose some practical limitations of semantic search recruiting solutions.</p>
<p>Additionally, I will classify the 5 basic levels of semantic search and give you examples of how you can conduct Level 3 Semantic Search (Grammatical/Natural) with Monster, Bing, and any search engine that allows for fixed or configurable proximity.</p>
<p>But first &#8211; let&#8217;s define &#8220;semantic search.&#8221;<span id="more-8275"></span></p>
<h2>What is Semantic Search?</h2>
<p><a class="wp-caption-dd" title="What is semantic search? Wikipedia is glad you are curious!" href="http://en.wikipedia.org/wiki/Semantics" target="_self">Semantics</a> is the study of meaning, inherent at the levels of words, phrases, and sentences.</p>
<p><a title="Semantic search seeks to improve search accuracy by understanding searcher intent and the contextual meaning of terms as they appear in the searchable dataspace, whether on the Web or within a closed system, to generate more relevant results." href="http://en.wikipedia.org/wiki/Semantic_search">Semantic search</a> is most often used to describe searching beyond the literal lexical (exact word for word) match and into the <em><strong>meaning</strong></em> of words and phrases at the conceptual and contextual level, and sentences at the grammatical level.</p>
<p>When sourcing candidates, semantic search can be achieved at the <strong><em>conceptual level</em></strong> when a search for a specific term (e.g., Java) also yields matches on conceptually related terms (e.g., J2EE, EJB, servlets, etc.).</p>
<p>As another example, in the healthcare space, a semantic search for &#8220;cancer&#8221; could also produce positive hits on terms such as oncology, lymphoma, tumor, etc.</p>
<p>Words and phrases by themselves can be somewhat ambiguous, but are less so when taken in <em><strong>context</strong> -</em> using surrounding words or passages that can shed light on the intended meaning.</p>
<p>For example, &#8220;Java&#8221; is a software programming language, but it is also used to refer to coffee, and it is also an Indonesian island. <a class="wp-caption-dd" title="Here is a Twitter query for the search term &quot;java&quot;" href="http://search.twitter.com/search?q=java" target="_self">A quick Twitter search for &#8220;Java&#8221;</a> will typically net you a mix of references to Java. By reading each tweet and the text surrounding &#8220;Java,&#8221; we can easily <a class="wp-caption-dd" title="Funky word, simple meaning" href="http://en.wikipedia.org/wiki/Word_sense_disambiguation" target="_self">disambiguate</a> the reference to &#8220;Java&#8221; and divine the intended meaning.</p>
<p>Below you can see Java referenced on Twitter in 3 very different ways in 3 successive tweets, and the context tells you how to interpret the meaning of &#8220;Java&#8221; in each one:</p>
<p><a href="http://search.twitter.com/search?q=java"><img class="alignnone size-full wp-image-8453" title="Semantic search example using Twitter and Java" src="http://www.booleanblackbelt.com/wp-content/uploads/2011/03/Java_Twitter2.png" alt="" width="600" height="238" /></a></p>
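<p>The idea of letting the surrounding words pick the meaning can be sketched in a few lines of Python. This is only a toy illustration: the clue-word lists and the <code>disambiguate</code> function are made up for this example, and real word sense disambiguation is far more involved.</p>

```python
# Hypothetical clue words for each sense of "Java" (assumptions, not a real lexicon).
SENSES = {
    "programming language": {"code", "developer", "jvm", "class", "compile"},
    "coffee": {"cup", "morning", "brew", "drink", "espresso"},
    "island": {"indonesia", "jakarta", "travel", "volcano", "island"},
}


def disambiguate(text):
    """Pick the sense whose clue words overlap most with the text.

    Returns None when no clue word appears at all.
    """
    words = {w.strip(".,!?#@") for w in text.lower().split()}
    overlaps = {
        sense: len(words.intersection(clues)) for sense, clues in SENSES.items()
    }
    best = max(overlaps, key=overlaps.get)
    return best if overlaps[best] else None
```

<p>With these made-up clue lists, a tweet such as &#8220;Need a cup of java this morning&#8221; resolves to the coffee sense, while &#8220;Hiring a senior Java developer&#8221; resolves to the programming language.</p>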
<h2>Why Should HR/Recruiting Professionals Care about Semantic Search?</h2>
<p>There is more information available about more people today than ever, the volume is only going to increase, and the rate at which it accumulates is accelerating.</p>
<p>Sifting through an ever-growing amount of human capital data in the form of resumes, social media profiles (LinkedIn, Twitter, Facebook, etc.), blogs and other sources is a significant challenge.</p>
<p>The promise and potential of semantic search is that it can help you more quickly and easily cut through massive volumes of potential candidate information to help you find more of the right people faster than standard methods.</p>
<h2><span style="color: #ff0000;"><strong>Choose Your Own Adventure!</strong></span></h2>
<p>Now that you understand semantics and the basic concepts of semantic search, you have a choice:</p>
<ol>
<li>If you don&#8217;t particularly care to get into the details of how solutions that claim to use semantic search actually work and achieve their claims, you can skip all the way to the end for a presentation on the 5 Levels of semantic search. In that presentation you can find a couple of examples of how to achieve Level 3 semantic search with Monster or any search engine that offers proximity search, which allows you to control how close your search terms are to each other.</li>
<li>If you currently use a matching application that claims to leverage semantic search (e.g., <a title="Learn more about Monster's 6Sense semantic search/match solution" href="http://www.youtube.com/watch?v=SFXYGxptm_0">Monster&#8217;s 6Sense</a>), if you&#8217;re considering purchasing/implementing such a solution, or if you&#8217;re just curious how these kinds of applications achieve their claims, don&#8217;t skip ahead &#8211; keep reading.</li>
</ol>
<h2>Semantic Search Claims for Sourcing and Recruiting</h2>
<p>Many vendors are quick to explain that their semantic search solution can help you and/or (wink) your team to &#8220;stop wasting time trying to create difficult and complex Boolean search strings&#8221;, and instead, let &#8220;intelligent search and match&#8221; applications do the work for you.</p>
<p>Some claim that &#8220;a single query will give you the results you need &#8211; no more re-querying, no more waste of time!&#8221;</p>
<p>Going further, semantic search solutions for the recruiting industry commonly state that their offerings:</p>
<ul>
<li>Understand titles, skills, and concepts</li>
<li>Automatically analyze and define relationships between words and concepts</li>
<li>Intuit and infer experience by context</li>
<li>Perform pattern recognition</li>
<li>Perform fuzzy matching</li>
</ul>
<h2>Sounds Great &#8211; But How Do They Really Work?</h2>
<p>Over the years, I&#8217;ve had many people attempt to sell me on the benefits of semantic search when it comes to sourcing potential candidates, and I have also had the opportunity to use and evaluate quite a few semantic search solutions, including pretty much all of the usual suspects in the space.</p>
<p>My experience and skill with human capital data <a class="wp-caption-dd" title="Talent mining is human capital data information retrieval" href="http://www.booleanblackbelt.com/2010/10/talent-mining-and-talent-analytics-sourcecon-2010/" target="_self">information retrieval</a> afford me some unique insight into how the technologies and techniques that semantic search vendors use to back up their claims actually work, as well as their limitations specific to human capital data. More on that last bit later.</p>
<p>First, let&#8217;s get into how semantic search applications for recruiting actually work.</p>
<p>When semantic search vendors make claims that their applications can automatically understand titles, skills, and concepts, analyze and define relationships between words and concepts, intuit and infer experience by context, perform pattern recognition and fuzzy matching, they are typically using 1 or more of the following to do so:</p>
<h2>Resume Parsing</h2>
<p>Parsing slices and dices resumes and extracts useful information contextually based on the structure of most resumes.</p>
<p>A good parser can take a resume and break it down to its component parts and &#8220;understand&#8221; a person&#8217;s experience.</p>
<p>Resume parsing can be used to extract skill words and differentiate between terms mentioned in skills summaries vs. those that are mentioned in the body of the resume &#8211; the latter having a higher probability of being indicative of real experience. Resume parsers can also typically extract titles and employers and some can even reliably identify the most recent title and employer.</p>
<p>Solid parsing technology can correctly identify addresses and education information and &#8220;realize&#8221; that &#8220;George Washington&#8221; in an address is likely a street name, but in an education section is likely a university.</p>
<p>Some parsers can even determine current vs. dated experience with specific skills, as well as automatically calculate years of experience with specific skills, management, and overall years of work experience based on date analysis. Being able to control years of experience can help you screen out people who are under- or overqualified, or who are unlikely to be in the compensation range of the opening you are sourcing/recruiting for.</p>
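<p>To make the date analysis concrete, here is a rough Python sketch. It assumes the parser has already turned a resume into a list of jobs with skill terms and start/end dates; the <code>JOBS</code> data and both function names are hypothetical, and no vendor has told me this is how their product does it.</p>

```python
from datetime import date

# Hypothetical parsed work history: (skill terms, start date, end date).
# A real parser would extract these from resume text; they are hard-coded
# here so the date arithmetic is easy to follow.
JOBS = [
    ({"java", "j2ee"}, date(2006, 9, 1), date(2012, 1, 1)),
    ({"java", "sql"}, date(2003, 3, 1), date(2006, 9, 1)),
    ({"cobol"}, date(1999, 1, 1), date(2003, 3, 1)),
]


def years_with_skill(jobs, skill):
    """Sum the length of every job that mentions the skill, in years."""
    total_days = sum(
        (end - start).days for skills, start, end in jobs if skill in skills
    )
    return round(total_days / 365.25, 1)


def most_recent_use(jobs, skill):
    """Return the latest end date among jobs mentioning the skill, or None."""
    ends = [end for skills, start, end in jobs if skill in skills]
    return max(ends) if ends else None
```

<p>For this sample history, <code>years_with_skill(JOBS, "java")</code> comes out to 8.8 years with the most recent use in 2012, while COBOL shows 4.2 years that ended in 2003, which is the kind of current vs. dated distinction described above.</p>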
<p>Resume parsing can result in highly structured data, which can enable a recruiter to move beyond free text search and to search for information contextually in specific sections/fields, such as current title, current experience, education, etc.</p>
<p>A more automated way of achieving semantic search via parsed resume data is to take basic search terms entered by a sourcer or recruiter and weight search results based on recency of related titles and experience, based on data parsed and identified as more recent, as well as calculated years of experience (e.g., Java and related terms mentioned in most recent work experience, dated &#8217;9/06 to Present&#8217;).</p>
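<p>One simple way to picture recency weighting is a score in which each mention of a skill counts for less the older it is. The sketch below is my own illustration: the half-life decay and the <code>recency_weighted_score</code> name are assumptions, not any vendor&#8217;s actual ranking formula.</p>

```python
def recency_weighted_score(last_used_years, current_year, half_life_years=3.0):
    """Score one skill on a resume so recent use outweighs dated use.

    last_used_years: the year the skill was last used in each job.
    half_life_years: assumed decay rate; a mention this many years old
    counts half as much as a current one.
    """
    score = 0.0
    for last_used in last_used_years:
        age = max(0, current_year - last_used)
        score += 0.5 ** (age / half_life_years)
    return score
```

<p>Under these assumptions, a candidate who used Java through 2012 scores 1.0 for that mention, while a candidate whose only Java experience ended in 2009 scores 0.5, so the first candidate ranks higher for a Java search run in 2012.</p>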
<p>So now you know that when you hear that a semantic search application can &#8220;automatically understand titles and skills&#8221; and can &#8220;intuit and infer experience by context,&#8221; not only do you know what they&#8217;re talking about, you know at least one of the ways they try to make good on that claim.</p>
<h2>Taxonomies and Ontologies</h2>
<p>Some semantic search solutions for recruiting leverage ontologies and taxonomies.</p>
<p><a class="wp-caption-dd" title="Taxonomy fully explained" href="http://en.wikipedia.org/wiki/Taxonomy" target="_self">Taxonomy</a> is the science which deals with the study of identifying, grouping, and naming things according to their established natural relationship. An <a class="wp-caption-dd" title="More than you wanted to know about ontologies, perhaps" href="http://en.wikipedia.org/wiki/Ontology_(information_science)" target="_self">ontology</a> is a &#8220;formal representation of knowledge as a set of concepts within a domain, and the relationships between those concepts.&#8221;</p>
<p>As complex as those definitions may sound, they are really quite easy to understand when it comes to how vendors utilize taxonomies and ontologies to achieve semantic search.</p>
<p>Taxonomies and ontologies are leveraged by semantic search solutions for recruiting and staffing as a back-end list of keywords organized by concept and relationship so that when you search for a term or phrase, the solution can compare your search against terms and phrases it &#8220;knows&#8221; are conceptually related.</p>
<p>A common taxonomy used in recruiting solutions is a parent-child, hierarchical (directional, one way) taxonomy. Wikipedia uses this simple way of explaining the parent-child relationship: A car is a subtype of vehicle, so any car is also a vehicle, but not every vehicle is a car.</p>
<h2>Hierarchical Taxonomy</h2>
<p>With a hierarchical taxonomy for accounting terminology, if you searched for &#8220;<a class="wp-caption-dd" title="What the heck is SOX 404?" href="http://en.wikipedia.org/wiki/SOX_404_top-down_risk_assessment" target="_self">SOX 404</a>,&#8221; you should get positive hits and relevance ranking from the term &#8220;SOX 404&#8221; as well as &#8220;accounting,&#8221; because the system can recognize that &#8220;SOX 404&#8221; is an accounting-related &#8220;child&#8221; term/concept tied to the &#8220;parent&#8221; term &#8220;accounting.&#8221;</p>
<p>In a true hierarchical taxonomy, if you searched for &#8220;accounting,&#8221; you should only get positive hits and ranking on the term &#8220;accounting&#8221; and not on mentions of &#8220;SOX 404,&#8221; because not all accounting-related work involves SOX 404.</p>
<p>In other words, SOX 404 is accounting-related work, but not all accounting work is SOX 404-related.</p>
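<p>The one-way nature of a parent-child taxonomy is easy to see in code. In this minimal sketch, the <code>PARENT</code> table and <code>expand_query</code> function are hypothetical stand-ins for a vendor&#8217;s back-end keyword list (&#8220;accounts payable&#8221; is my own added example):</p>

```python
# Hypothetical child -> parent links; a real taxonomy would hold thousands.
PARENT = {
    "sox 404": "accounting",
    "accounts payable": "accounting",
}


def expand_query(term):
    """Return the search term followed by its parent terms.

    Expansion only walks upward (child to parent), so searching for a
    parent never pulls in its children.
    """
    terms = [term.lower()]
    while terms[-1] in PARENT:
        terms.append(PARENT[terms[-1]])
    return terms
```

<p>Here <code>expand_query("SOX 404")</code> returns both &#8220;sox 404&#8221; and &#8220;accounting,&#8221; while <code>expand_query("accounting")</code> returns only &#8220;accounting.&#8221;</p>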
<h2>Conceptual Search</h2>
<p>A semantic search solution using a hierarchical taxonomy can help you find terms and phrases other than the ones you specifically searched for, because they compare your search terms with the taxonomy and return results that not only mention your keywords, but also related terminology.</p>
<p>This is a form of &#8220;conceptual search&#8221; &#8211; you search for 1 term, and you can get results mentioning all related concepts as well as your original search term.</p>
<p>In addition to hierarchical relationships, semantic search solutions may also perform conceptual searching based on synonymous terms and phrases.</p>
<p>For example, if you searched for &#8220;Director of Tax,&#8221; a well-developed taxonomy would also return results for all of the equivalent title variants you didn&#8217;t actually search for, such as &#8220;Tax Director,&#8221; &#8220;Director, Tax,&#8221; etc. This form of conceptual search can be useful for finding common abbreviations for phrases, such as &#8220;CPA&#8221; and &#8220;C.P.A.&#8221; from a search for &#8220;Certified Public Accountant,&#8221; and vice versa.</p>
<p>A comprehensive taxonomy can be especially helpful for Information Technology sourcers and recruiters, as it can be difficult to know or even remember all of the various ways certain technologies can be referenced (SQL 2008, SQL Server, MSSQL, etc.).</p>
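<p>Unlike the parent-child case, synonym matching works in both directions: any variant should find all the others. Here is a small sketch of that idea which also shows the Boolean OR statement a sourcer would otherwise have to type by hand. The synonym groups and function names are hypothetical, and treating the SQL Server variants as interchangeable is a simplification.</p>

```python
# Hypothetical synonym groups: every member is treated as equivalent.
SYNONYM_GROUPS = [
    {"director of tax", "tax director", "director, tax"},
    {"certified public accountant", "cpa", "c.p.a."},
    {"sql server", "mssql", "sql 2008"},
]


def synonyms_for(term):
    """Return every variant in the term's group, or just the term itself."""
    key = term.lower()
    for group in SYNONYM_GROUPS:
        if key in group:
            return sorted(group)
    return [key]


def to_boolean_or(term):
    """Build the equivalent hand-written Boolean OR statement."""
    return "(" + " OR ".join('"' + t + '"' for t in synonyms_for(term)) + ")"
```

<p>For example, <code>to_boolean_or("Tax Director")</code> produces <code>("director of tax" OR "director, tax" OR "tax director")</code>, and a search for &#8220;CPA&#8221; expands to the same three variants as a search for &#8220;Certified Public Accountant.&#8221;</p>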
<h2>Statistical Methods</h2>
<p>Rather than relying on pre-built taxonomies to define relationships between titles, terms and concepts, some semantic search solutions use complex statistical methods in an attempt to automatically &#8220;understand&#8221; language and relationships between words.</p>
<p>While I am not aware of any semantic search vendor supplying solutions to the recruiting industry that publicly explains their statistical methods, thankfully the Normalized Google distance, a published measure built on Google hit counts, gives us a tiny bit of insight into how such an approach works.</p>
<p>Keywords with the same or similar meanings in a natural language sense tend to be &#8220;close&#8221; in units of <a title="Google distance is a semantic similarity measure derived from the number of hits returned by the Google search engine for a given set of keywords. Keywords with the same or similar meanings in a natural language sense tend to be &quot;close&quot; in units of Google distance, while words with dissimilar meanings tend to be farther apart." href="http://en.wikipedia.org/wiki/Normalized_Google_distance">Google distance</a>, while words with dissimilar meanings tend to be farther apart.</p>
<p>Here is the equation for the Google distance, which is a measure of semantic interrelatedness derived from the number of hits returned by the Google search engine for a given set of keywords.</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2011/03/Google_Distance.png"><img class="alignnone size-full wp-image-8455" title="Google Distance Equation for Semantic Interrelatedness" src="http://www.booleanblackbelt.com/wp-content/uploads/2011/03/Google_Distance.png" alt="" width="483" height="79" /></a></p>
<p>That was easy, right?</p>
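<p>For those who prefer code to equations, here is the same measure as I read it from the linked Wikipedia page: the distance between terms x and y is (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), where f(x) and f(y) are the hit counts for each term, f(x, y) is the hit count for pages containing both, and N is the total number of pages indexed. The function and parameter names are my own, and any hit counts you feed it are your own inputs rather than real Google numbers.</p>

```python
import math


def normalized_google_distance(hits_x, hits_y, hits_xy, total_pages):
    """Compute the Normalized Google distance from raw hit counts.

    hits_x, hits_y: pages mentioning each term on its own.
    hits_xy: pages mentioning both terms.
    total_pages: size of the index (N in the equation).
    All counts must be greater than zero, since the formula takes logs.
    """
    log_x, log_y = math.log(hits_x), math.log(hits_y)
    log_xy = math.log(hits_xy)
    numerator = max(log_x, log_y) - log_xy
    denominator = math.log(total_pages) - min(log_x, log_y)
    return numerator / denominator
```

<p>Two terms that always appear together (the same hit count alone and combined) come out at a distance of 0, and the distance grows as the terms share fewer pages.</p>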
<h2>Semantic Clustering, Machine Learning, Pattern Recognition &#8211; Oh My!</h2>
<p>I don&#8217;t pretend to understand semantic clustering and machine learning at the technical level, but I do have a good understanding of what they are used for and how they work at a high level, specifically with regard to sourcing and matching candidates from human capital data.</p>
<p>Semantic clustering is a non-interactive and unsupervised <a class="wp-caption-dd" title="Click here and be a human learning about machine learning :-)" href="http://en.wikipedia.org/wiki/Machine_learning" target="_self">machine learning</a> technique seeking to automatically analyze and define relationships between words and concepts.</p>
<p>For candidate sourcing purposes, algorithms are created to automatically learn to recognize complex patterns, &#8220;learn&#8221; and draw relationships from human capital data (resumes, social network profiles, etc.).</p>
<p>Rather than relying on a static taxonomy, semantic clustering allows for dynamic concept matching.</p>
<p>Based on statistical analysis/algorithms and <a class="wp-caption-dd" title="Learn more about pattern recognition" href="http://en.wikipedia.org/wiki/Pattern_recognition" target="_self">pattern recognition</a>, an application can &#8220;learn&#8221; that C# is related to .Net, due in part to keyword frequency and proximity that it has analyzed across thousands to millions of documents.</p>
<p>A query cloud offers an excellent visualization of semantic clustering &#8211; you can see and choose from a group of terms and phrases that the semantic search solution has determined to be related to your search term.</p>
<p>Here is an example of a query cloud for C#:</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2011/03/semantic-query-cloud.png"><img class="alignnone size-full wp-image-8457" title="Semantic search query cloud from eGrabber powered by Pure Discovery" src="http://www.booleanblackbelt.com/wp-content/uploads/2011/03/semantic-query-cloud.png" alt="" width="521" height="308" /></a></p>
<p>While semantic clustering can quickly and easily find related terms, you still have to ask whether the related terms are actually relevant. Only the person conducting the search can make that determination.</p>
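<p>To give a feel for how an application might &#8220;learn&#8221; that C# is related to .Net without a pre-built taxonomy, here is a deliberately simple co-occurrence count in Python. It is a stand-in for the idea, not a real clustering algorithm: it looks only at which terms share a document and ignores proximity, and the tiny <code>RESUMES</code> corpus and function names are invented for the example.</p>

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each set holds the skill terms found in one resume.
RESUMES = [
    {"c#", ".net", "sql server"},
    {"c#", ".net", "asp.net"},
    {"c#", ".net"},
    {"java", "j2ee", "sql server"},
    {"java", "j2ee"},
]


def cooccurrence_counts(documents):
    """Count how many documents mention each unordered pair of terms."""
    counts = Counter()
    for terms in documents:
        for a, b in combinations(sorted(terms), 2):
            counts[(a, b)] += 1
    return counts


def related_terms(term, documents, top_n=3):
    """Rank other terms by how often they share a document with `term`."""
    scores = Counter()
    for (a, b), n in cooccurrence_counts(documents).items():
        if a == term:
            scores[b] += n
        elif b == term:
            scores[a] += n
    return [t for t, _ in scores.most_common(top_n)]
```

<p>On this toy data, the top related term for &#8220;c#&#8221; is &#8220;.net,&#8221; but &#8220;sql server&#8221; also shows up in the list, which illustrates the point above: related is not automatically relevant.</p>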
<h2>Fuzzy Logic</h2>
<p>When an application claims to perform fuzzy matching, it is applying fuzzy logic to the search, which finds approximate matches to a pattern in a string.</p>
<p>Fuzzy logic is especially useful to automatically search for slight phrase variations and word misspellings. Most sourcers/recruiters do not take the time to search for misspellings, and understandably so as it is quite laborious. However, a good fuzzy matching solution will find your exact search terms as well as any slight spelling variation, intentional or unintentional.</p>
<p>If you don&#8217;t search for misspellings, you&#8217;re missing people:</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2012/01/Fuzzy_Logic_Search_can_help_with_misspellings.png"><img class="alignnone size-full wp-image-10367" title="Here is just one example of a misspelling on LinkedIn - unless you search for it directly, you can't find it, so results like these become Dark Matter to 99% of sourcers and recruiters" src="http://www.booleanblackbelt.com/wp-content/uploads/2012/01/Fuzzy_Logic_Search_can_help_with_misspellings.png" alt="" width="600" height="216" /></a></p>
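<p>Approximate matching of this kind can be tried out with Python&#8217;s standard <code>difflib</code> module. This is just a sketch of the general technique: the <code>fuzzy_find</code> wrapper, the 0.8 similarity cutoff, and the misspellings in the usage note are my own choices, not how any particular recruiting product works.</p>

```python
import difflib


def fuzzy_find(term, words, cutoff=0.8):
    """Return the words that approximately match `term`, best match first.

    cutoff is the minimum similarity (0 to 1) a word needs to count as a hit.
    """
    lowered = [w.lower() for w in words]
    return difflib.get_close_matches(term.lower(), lowered, n=10, cutoff=cutoff)
```

<p>Searching for &#8220;recruiter&#8221; in a list containing &#8220;Recruiter,&#8221; &#8220;Recuiter,&#8221; &#8220;Recruitter,&#8221; and &#8220;Engineer&#8221; returns the exact match plus both misspellings, and leaves &#8220;Engineer&#8221; out.</p>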
<h2>The 5 Levels of Semantic Search</h2>
<p>Now that you have a basic understanding of the concept of semantic search and how applications using semantic search actually work, I&#8217;d like to introduce you to what I believe are the 5 basic levels of semantic search.</p>
<p>Intended for HR professionals, sourcers and recruiters, this presentation explains and explores the concepts of semantics and semantic search, including the 5 Levels of Semantic Search: Conceptual Search, Contextual Search, Grammatical/Natural Language Search, Inferential Search, and Tagging.</p>
<p>You&#8217;ll also see some examples of how you can achieve Level 3 semantic search using Monster (classic search) or any search engine that allows for fixed or configurable proximity search.</p>
<div id="__ss_11065012" style="width: 595px;">
<p><strong style="display: block; margin: 12px 0 4px;"><a title="Semantic Search for Sourcing and Recruiting" href="http://www.slideshare.net/glencathey/semantic-search-for-slideshare" target="_blank">Semantic Search for Sourcing and Recruiting</a></strong> <iframe src="http://www.slideshare.net/slideshow/embed_code/11065012" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" width="595" height="497"></iframe></p>
<div style="padding: 5px 0 12px;">View more <a href="http://www.slideshare.net/" target="_blank">presentations</a> from <a href="http://www.slideshare.net/glencathey" target="_blank">Glen Cathey</a></div>
</div>
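<p>If you have never used proximity search, the underlying check is simple: are two terms within a set number of words of each other? Here is a minimal Python sketch of that check. The function name and the word-splitting rules are my own simplifications, and each search engine has its own proximity operators and syntax.</p>

```python
def within_proximity(text, term_a, term_b, max_distance=5):
    """Return True if the two terms occur within max_distance words.

    Words are split on whitespace and stripped of common punctuation,
    so this only handles single-word terms.
    """
    words = [w.strip(".,;:()") for w in text.lower().split()]
    pos_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(max_distance >= abs(i - j) for i in pos_a for j in pos_b)
```

<p>Given the sentence &#8220;Managed a team of five Java developers and built reporting tools,&#8221; a search for &#8220;managed&#8221; within 3 words of &#8220;team&#8221; matches, while &#8220;managed&#8221; within 3 words of &#8220;tools&#8221; does not.</p>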
<h2>Semantic Search for Recruiting: The Good</h2>
<p>I love technology and anything that can make me better and faster at what I do. Semantic search solutions for sourcing candidates can provide many benefits, including:</p>
<ul>
<li>Reducing the time to find relevant matches</li>
<li>Lessening or eliminating the need for recruiters to have deep and specialized knowledge within an industry or skill set</li>
<li>Reducing and even eliminating time spent on initial research</li>
<li>The ability to go beyond literal, identical lexical matching</li>
<li>Leveling the playing field for those with less sourcing experience or ability</li>
<li>Making an inexperienced person look like a sourcing wizard</li>
<li>Boosting teams with low search/sourcing capability</li>
<li>Working well for positions where titles effectively identify matches and where there is a low volume and variety of keywords</li>
<li>Working well for organizations with a high volume of unchanging hiring needs</li>
</ul>
<h2>Semantic Search for Recruiting: The Bad</h2>
<p>On the other hand, you should be aware of some issues associated with blindly trusting semantic search solutions, including:</p>
<ul>
<li>Just because terms are <em><strong>related</strong></em>, it doesn&#8217;t automatically make them <em><strong>relevant</strong></em> to the search</li>
<li>Removing thought from the talent identification process</li>
<li>The danger of eliminating the need for recruiters to understand what they’re actually searching for</li>
<li>Difficulty with information technology, healthcare, and other sectors/verticals with ever-changing technology and terminology</li>
<li>Finding some people, but eliminating and/or burying others</li>
<li>Finding the best matches based on keywords present, as opposed to the best people</li>
<li>The inability to search for what isn&#8217;t explicitly stated - applications will only return results that mention required keywords and their variants</li>
<li>The fact that many people have skills and experience that are simply not mentioned anywhere in their resumes and thus they cannot be retrieved via any direct search method</li>
<li>They level the playing field &#8211; if competing companies use the same software solution, they will both find (and miss!) the exact same people</li>
<li>The fact that a single search cannot find all of the best people &#8211; every search both includes and excludes qualified candidates</li>
<li>They can favor keyword-rich resumes/profiles, yet keyword-poor resumes/profiles may in fact represent better candidates than keyword-rich ones</li>
</ul>
<h2>Semantic Search for Sourcing &amp; Recruiting: The Bottom Line</h2>
<p>The potential of semantic search for talent identification and acquisition is powerful and exciting!</p>
<p>However, it&#8217;s important to realize that with technology that&#8217;s been on the market for over a decade, sourcers and recruiters have already been able to &#8220;manually&#8221; achieve Levels 1-4 semantic search for a while now, and there are some solutions available today that allow for searchable tagging as well (Level 5).</p>
<p>On the other hand, using software for automating semantic search/match can allow you to quickly, easily and somewhat reliably achieve Levels 1-2 semantic search, depending on the vendor/solution you choose. At this time, true Level 3-5 semantic search is beyond the reach of today&#8217;s semantic search/match applications (IMHO).</p>
<p>One of the main and inescapable problems with any automated semantic search/match solution is that human capital data is quite often incomplete and unstructured. Let&#8217;s face it &#8211; no company is looking to find people because they mention specific keywords and titles &#8211; everyone&#8217;s looking for their next great hire who has specific skills and experience which may not even be explicitly mentioned in a resume, on a LinkedIn profile, in a Twitter bio, etc.</p>
<p>Matching software can work with what&#8217;s there (text that&#8217;s present), but it can&#8217;t match on what&#8217;s not there (text that isn&#8217;t present). On the other hand, one thing that humans do incredibly well is instantly perform dynamic inference, more commonly known as &#8220;reading between the lines.&#8221; Perhaps at some point in the future, software will be able to somewhat reliably infer experience and capability beyond text that is present, but it can&#8217;t be done today beyond guessing (e.g., &#8220;Were you looking for _____________?&#8221;).</p>
<p>Food for thought &#8211; how would you like to explain to a candidate that they weren&#8217;t considered for a job because your semantic search application didn&#8217;t think they were a match based on their resume? How would you feel if you were turned down for a job because a software solution didn&#8217;t &#8220;like&#8221; your resume? Do we really want to rely 100% on a software solution that seems to make our lives easier when it can result in missing and altogether eliminating some of the best people available?</p>
<p>While software can retrieve and move data, <a title="The terms data, information and knowledge are frequently used for overlapping concepts. The main difference is in the level of abstraction being considered. Data is the lowest level of abstraction, information is the next level, and finally, knowledge is the highest level among all three.[citation needed] Data on its own carries no meaning. For data to become information, it must be interpreted and take on a meaning. For example, the height of Mt. Everest is generally considered as &quot;data&quot;, a book on Mt. Everest geological characteristics may be considered as &quot;information&quot;, and a report containing practical information on the best way to reach Mt. Everest's peak may be considered as &quot;knowledge&quot;." href="http://en.wikipedia.org/wiki/Data#Meaning_of_data.2C_information_and_knowledge">data requires analysis to yield information and produce knowledge</a> which can facilitate decision making. That&#8217;s why these solutions are referred to as <a title="A decision support system (DSS) is a computer-based information system that supports business or organizational decision-making activities. DSSs serve the management, operations, and planning levels of an organization and help to make decisions, which may be rapidly changing and not easily specified in advance. DSSs include knowledge-based systems. A properly designed DSS is an interactive software-based system intended to help decision makers compile useful information from a combination of raw data, documents, personal knowledge, or business models to identify and solve problems and make decisions." href="http://en.wikipedia.org/wiki/Decision_support_system">Decision Support Systems</a> - the operative word being &#8220;support,&#8221; because they don&#8217;t (and should not!) make the decisions for you &#8211; these solutions provide you with data to interpret for information to make an informed decision.</p>
<p>In the case of sourcing/recruiting &#8211; it&#8217;s deciding who to engage, screen, and potentially recruit.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.booleanblackbelt.com/2012/01/semantic-search-explained-for-sourcing-and-recruiting/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Talent Sourcing: Man vs. AI/Black Box Semantic Search</title>
		<link>http://www.booleanblackbelt.com/2012/01/talent-sourcing-man-vs-aiblack-box-semantic-search/</link>
		<comments>http://www.booleanblackbelt.com/2012/01/talent-sourcing-man-vs-aiblack-box-semantic-search/#comments</comments>
		<pubDate>Mon, 09 Jan 2012 14:00:58 +0000</pubDate>
		<dc:creator>Glen Cathey</dc:creator>
				<category><![CDATA[Artificial Intelligence Matching]]></category>
		<category><![CDATA[Boolean Logic]]></category>
		<category><![CDATA[Dark Matter]]></category>
		<category><![CDATA[Extended Boolean]]></category>
		<category><![CDATA[Future of Sourcing and Recruiting]]></category>
		<category><![CDATA[HCDIR]]></category>
		<category><![CDATA[Human Capital Data]]></category>
		<category><![CDATA[Information Retrieval]]></category>
		<category><![CDATA[Recruiting Technology]]></category>
		<category><![CDATA[Resume Sourcing]]></category>
		<category><![CDATA[Search Automation]]></category>
		<category><![CDATA[Semantic Search]]></category>
		<category><![CDATA[Sourcing]]></category>
		<category><![CDATA[Sourcing Automation]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Boolean]]></category>
		<category><![CDATA[Boolean Black Belt]]></category>
		<category><![CDATA[dtSearch]]></category>
		<category><![CDATA[Glen Cathey]]></category>
		<category><![CDATA[hcdir]]></category>
		<category><![CDATA[Human Capital]]></category>
		<category><![CDATA[information retrieval]]></category>
		<category><![CDATA[Lucene]]></category>
		<category><![CDATA[matching solutions]]></category>
		<category><![CDATA[NLP]]></category>
		<category><![CDATA[Recruiting]]></category>
		<category><![CDATA[Resume Matching]]></category>
		<category><![CDATA[resume parsing]]></category>
		<category><![CDATA[Semantic Clustering]]></category>
		<category><![CDATA[Sourcing solutions]]></category>
		<category><![CDATA[Talent Identification]]></category>

		<guid isPermaLink="false">http://www.booleanblackbelt.com/?p=10315</guid>
		<description><![CDATA[Back in March 2010, I had the distinct honor of delivering the keynote presentation at SourceCon on the topic of resume search and match solutions claiming to use artificial intelligence in comparison with people using their natural intelligence for talent discovery and identification. Now that nearly 2 years have passed, and given that in that [...]]]></description>
			<content:encoded><![CDATA[
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2012/01/AI_Brain.png"><img class="alignright  wp-image-10319" title="Talent Sourcing and Matching: Artificial Intelligence and Black Box Semantic Search vs. Human Cognition and Sourcing Capability." src="http://www.booleanblackbelt.com/wp-content/uploads/2012/01/AI_Brain.png" alt="" width="219" height="239" /></a>Back in March 2010, I had the distinct honor of delivering the keynote presentation at <a title="Sourcing News and Knowledge - Beyond the Obvious." href="http://www.sourcecon.com/">SourceCon</a> on the topic of resume search and match solutions claiming to use artificial intelligence in comparison with people using their natural intelligence for talent discovery and identification.</p>
<p>Now that nearly 2 years have passed, and given that in that time I&#8217;ve had even more hands-on experience with a number of the top AI/semantic search applications available (I won&#8217;t be naming names, sorry), I decided it was time to revisit a topic I am <em><strong>very</strong></em> passionate about.</p>
<p>If you&#8217;ve ever been curious about semantic search applications that &#8220;do the work for you&#8221; when it comes to finding potential candidates, you&#8217;re in the right place, because I&#8217;ve updated the slide deck and published it to Slideshare. Here&#8217;s what you&#8217;ll find in the 86-slide presentation:</p>
<ul>
<li>A deep dive into the deceptively simple challenge of sourcing talent via human capital data (resumes, social network profiles, etc.)</li>
<li>How resume and LinkedIn profile sourcing and matching solutions claiming to use artificial intelligence, semantic search, and <a title="Natural language processing (NLP) is a field of computer science and linguistics concerned with the interactions between computers and human (natural) languages; it began as a branch of artificial intelligence.[1] In theory, natural language processing is a very attractive method of human–computer interaction. Natural language understanding is sometimes referred to as an AI-complete problem because it seems to require extensive knowledge about the outside world and the ability to manipulate it." href="http://en.wikipedia.org/wiki/Natural_language_processing">NLP</a> actually work and achieve their claims</li>
<li>The pros, cons, and limitations of automated/<a title="A black box is a device, system or object which can be viewed solely in terms of its input, output and transfer characteristics without any knowledge of its internal workings. For resume search and match, a black box solution gives you no understanding of exactly WHY it's returned certain results or considers them relevant" href="http://en.wikipedia.org/wiki/Black_box">black box</a> matching solutions</li>
<li>An insightful (and funny!) video of <a title="Dr. Michio Kaku is a theoretical physicist, best-selling author, and popularizer of science. He’s the co-founder of string field theory (a branch of string theory), and continues Einstein’s search to unite the four fundamental forces of nature into one unified theory." href="http://mkaku.org/home/?page_id=5">Dr. Michio Kaku</a> and his thoughts on the limitations of artificial intelligence</li>
<li>Examples of what sourcers and recruiters can do that even the most advanced automated search and match algorithms can’t do</li>
<li>The concept of Human Capital Data <a title="To any sourcer or recruiter not still in the Stone Age, this should sound like a really good description of what you do when you use any sort of technology to find people or information about people: Information retrieval (IR) is the area of study concerned with searching for documents, for information within documents, and for metadata about documents, as well as that of searching structured storage, relational databases, and the World Wide Web. " href="http://en.wikipedia.org/wiki/Information_retrieval">Information Retrieval</a> and Analysis (HCDIR &amp; A)</li>
<li>Boolean and <a title="Extended Boolean typically incorporates the ability to weight each term in a Boolean search string, allowing the searcher to choose which terms are the most relevant, as well as configurable proximity - the ability to specify how close search terms are to each other, which enables powerful semantic search at the sentence level. " href="https://www.google.com/search?aq=f&amp;sourceid=chrome&amp;ie=UTF-8&amp;q=extended+Boolean">extended Boolean</a></li>
<li>Semantic search</li>
<li>Dynamic inference</li>
<li><a title="Dark Matter is a term I use to describe resumes, LinkedIn profiles, and other human capital data that exists to be found, but cannot be retrieved through direct or conventional search methods." href="http://www.booleanblackbelt.com/2011/03/linkedins-dark-matter-undiscovered-profiles/">Dark Matter</a> resumes and social network profiles</li>
<li>What I believe to be the ideal resume search and matching solution</li>
</ul>
<div>Enjoy, and let me know your thoughts.</div>
<div id="__ss_10891808" style="width: 595px;">
<p><strong style="display: block; margin: 12px 0 4px;"><a title="Talent Sourcing and Matching - Artificial Intelligence and Black Box Semantic Search vs. Human Cognition and Sourcing" href="http://www.slideshare.net/glencathey/talent-sourcing-and-matching-artificial-intelligence-and-black-box-semantic-search-vs-human-cognition-and-sourcing" target="_blank">Talent Sourcing and Matching &#8211; Artificial Intelligence and Black Box Semantic Search vs. Human Cognition and Sourcing</a></strong> <iframe src="http://www.slideshare.net/slideshow/embed_code/10891808" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" width="595" height="497"></iframe></p>
<div style="padding: 5px 0 12px;">View more <a href="http://www.slideshare.net/" target="_blank">presentations</a> from <a href="http://www.slideshare.net/glencathey" target="_blank">Glen Cathey</a></div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.booleanblackbelt.com/2012/01/talent-sourcing-man-vs-aiblack-box-semantic-search/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Why So Many People Stink at Searching</title>
		<link>http://www.booleanblackbelt.com/2011/12/why-so-many-people-stink-at-searching/</link>
		<comments>http://www.booleanblackbelt.com/2011/12/why-so-many-people-stink-at-searching/#comments</comments>
		<pubDate>Mon, 19 Dec 2011 14:00:20 +0000</pubDate>
		<dc:creator>Glen Cathey</dc:creator>
				<category><![CDATA[Artificial Intelligence Matching]]></category>
		<category><![CDATA[Human Capital Data]]></category>
		<category><![CDATA[Information Retrieval]]></category>
		<category><![CDATA[Iterative Search]]></category>
		<category><![CDATA[Search Process]]></category>
		<category><![CDATA[Semantic Search]]></category>
		<category><![CDATA[Blackbox Search]]></category>
		<category><![CDATA[Critical Thinking]]></category>
		<category><![CDATA[Dark Matter Search]]></category>
		<category><![CDATA[HCIR]]></category>
		<category><![CDATA[How to get better search results]]></category>
		<category><![CDATA[information retrieval]]></category>
		<category><![CDATA[Intelligent Search]]></category>
		<category><![CDATA[NLP]]></category>
		<category><![CDATA[Search Algorithms]]></category>
		<category><![CDATA[Search Relevance]]></category>
		<category><![CDATA[Search Results]]></category>

		<guid isPermaLink="false">http://www.booleanblackbelt.com/?p=10211</guid>
		<description><![CDATA[The trouble with search today is that people put too much trust in search engines &#8211; online, resume, social, or otherwise. I can certainly understand and appreciate why people and companies would want to try and create search engines and solutions that &#8220;do the work for you,&#8221; but unfortunately the &#8220;work&#8221; being referenced here is [...]]]></description>
			<content:encoded><![CDATA[
<p><a href="http://www.flickr.com/photos/stickergiant/4793776078/"><img class="alignright  wp-image-10219" title="Don't implicitly trust any search engine - use your brain, think, and analyze the results for relevance." src="http://www.booleanblackbelt.com/wp-content/uploads/2011/12/Be_Careful_This_Machine_Has_No_Brain_Use_Your_Own_2.png" alt="" width="235" height="203" /></a></p>
<p>The trouble with search today is that people put too much trust in search engines &#8211; online, resume, social, or otherwise.</p>
<p>I can certainly understand and appreciate why people and companies would want to try and create search engines and solutions that &#8220;do the work for you,&#8221; but unfortunately the &#8220;work&#8221; being referenced here is <em><strong>thinking</strong></em>.</p>
<p>I read an article by Clive Thompson in Wired magazine the other day titled, &#8220;<a title="An interesting little article that takes a look into the issues of trusting search engines and not analyzing the search results - essentially, &quot;putting too much trust in the machine.&quot; Critical thinking should never be removed from any search process!" href="http://www.wired.com/magazine/2011/11/st_thompson_searchresults/">Why Johnny Can&#8217;t Search</a>,&#8221; and the author opens up with the common assumption that young people tend to be tech-savvy.</p>
<p>Interestingly, although <a title="Generation Z (also known as Generation M, the Net Generation, or the Internet Generation) is a common name in the US and other Western nations for the group of people born from the early to mid 1990s to the present.[1][2][3][4][5] The generation has grown up with the World Wide Web, which became increasingly available after 1991[6]. The youngest of the generation were born during a minor fertility boom around the time of the US Global financial crisis of the late 2000s decade, ending around the year 2010, with the next unnamed generation succeeding." href="http://en.wikipedia.org/wiki/Generation_Z">Generation Z</a> is also known as the &#8220;Internet Generation&#8221; and is comprised of &#8220;digital natives,&#8221; they apparently aren&#8217;t very good at online search.</p>
<p>The article cites a few studies, including one in which a group of college students were asked to use Google to look up the answers to a handful of questions. The researchers found that the students tended to rely on the top results.</p>
<p>Then the researchers changed the order of the results for some of the students in the experiment. More often than not, the students still went with the (falsely) top-ranked pages.</p>
<p>The professor who ran the experiment concluded that &#8220;students aren’t assessing information sources on their own merit—they’re putting too much trust in the machine.&#8221;</p>
<p>I believe that the vast majority of people put too much trust in the machine &#8211; whether it be Google, LinkedIn, Monster, or their ATS.</p>
<p>Trusting top search results certainly isn&#8217;t limited to Gen Z &#8211; I believe it is a much more widespread issue, which is only exacerbated by <a title="All is not perfect with intelligent search" href="http://www.submitedge.com/news/intelligent-search/">&#8220;intelligent&#8221; search engines</a> and applications using semantic search and <a title="Natural Language Processing, which began as a branch of Artificial Intelligence" href="http://en.wikipedia.org/wiki/Natural_language_processing">NLP</a> that lull searchers into the false sense of security that the search engine &#8220;knows&#8221; what they&#8217;re looking for.<span id="more-10211"></span></p>
<h2>This is Your Search Without a Brain</h2>
<p>It&#8217;s easy to see why people and companies create search products and services using semantic search and NLP that claim to be able to make searching &#8220;easier&#8221; &#8211; they are looking to sell a product based on the value of making your life easier, at least when it comes to finding stuff.</p>
<p>If you take a look at some of the marketing materials for intelligent search and match search products, you&#8217;ll find value propositions such as &#8220;Stop wasting time trying to create difficult and complex Boolean search strings,&#8221; &#8220;Let intelligent search and match applications do the work for you,&#8221; and &#8220;A single query will give you the results you need &#8211; no more re-querying, no more waste of time!&#8221;</p>
<p>I love saving time and getting to what I want faster, but my significant issue with &#8220;intelligent search and match&#8221; applications is that they try to determine what&#8217;s relevant to me.</p>
<p>And that&#8217;s a rather large issue, because only I know what I am looking for.</p>
<p>It&#8217;s critical to be reminded that the <a title="In information science and information retrieval, relevance denotes how well a retrieved document or set of documents meets the information need of the user." href="http://en.wikipedia.org/wiki/Relevance_(information_retrieval)">definition of &#8216;relevance,&#8217; specifically with regard to information science and information retrieval</a>, is &#8220;how well a retrieved document or set of documents meets the information need of the user.&#8221;</p>
<p>The only person that can make the judgment of how well a search result meets their information need is the person conducting the search, because it&#8217;s their specific information need.</p>
<p>Any reference to &#8220;relevance&#8221; by a search engine, whether it be Google, Bing, LinkedIn, Monster, etc., is based purely on the keywords, operators, and/or facets used.</p>
<p>Search engines don&#8217;t know what you want &#8211; they only know what you typed into or selected from the search interface.</p>
<p>Poor use of keywords, operators or facets won&#8217;t stop you from getting results. All searches &#8220;work,&#8221; as I am fond of saying &#8211; but the quality or relevance will likely be low.</p>
<p>Of course, that assumes that the person conducting the search is actually proficient at judging the quality or the relevance of the results &#8211; comparing results to their specific information need and experimenting with different combinations of keywords, operators and facets to look for changes in relevance.</p>
<h2>Related Does Not Equal Relevant</h2>
<p>I personally never implicitly trust first page or top ranked search results online, nor top ranked results on LinkedIn, Monster, or anywhere I search. Some of the best search results I have ever found were buried deep in result sets &#8211; far past where most people would typically review, and essentially in the territory of results the search engine deemed least &#8220;relevant.&#8221; <a title="Indicating disapproval, irritation, impatience or disbelief." href="http://en.wiktionary.org/wiki/pshaw">Pshaw</a>!</p>
<p>One reason for this is that I understand that any search engine I use, no matter how &#8220;dumb&#8221; (straight keyword matching) or &#8220;intelligent&#8221; (semantic/NLP), can only work with the terms I give it. What do you think the most &#8220;intelligent&#8221; search engine can do with poor user input?</p>
<p>When it comes to searching, unfortunately everyone&#8217;s a winner, because every search &#8220;works&#8221; and returns results. The problem is that few searchers know how to critically examine search results for relevance.</p>
<p>Regardless of how &#8220;intelligent&#8221; a search engine might be, it can only try to find terms and concepts related to my user input.</p>
<p>This is an often overlooked but critical issue &#8211; just because terms might be related, <em><strong>it does not mean they are relevant to my information need</strong></em>.</p>
<p>It certainly helps to understand that some of the most relevant search results can&#8217;t actually be retrieved by the obvious keywords, titles or phrases, or even those that a semantic search algorithm deems related to them. In fact, some of the best results simply cannot be directly retrieved &#8211; see my post on <a title="Most searches only return the tip of the iceberg when it comes to available and truly relevant results." href="http://www.booleanblackbelt.com/2011/03/linkedins-dark-matter-undiscovered-profiles/">Dark Matter</a> for more information on the concept.</p>
<p>However, to appreciate the concept that no single search, no matter how enhanced by technology, can find all of the relevant (by human standards and judgment) results available to be retrieved, you have to know a thing or two about information retrieval in the first place.</p>
<p>And if you already lack the ability to critically judge search results and evaluate them for relevance, how can you be expected to be able to evaluate and critically examine the search results returned by intelligent search and match applications?</p>
<p>The &#8220;<a title="In science and engineering, a black box is a device, system or object which can be viewed solely in terms of its input, output and transfer characteristics without any knowledge of its internal workings, that is, its implementation is &quot;opaque&quot; (black)." href="http://en.wikipedia.org/wiki/Black_box">black box</a>&#8221; matching algorithms of intelligent search and match applications pose significant issues to users in that searchers have absolutely no insight as to <em><strong>why</strong></em> the search engine returns the results it does. Without this, what option does a user have other than to implicitly trust the search engine&#8217;s matching algorithm?</p>
<h2>Searching Ain&#8217;t Easy</h2>
<p>Who says search has to be easy anyway?</p>
<p>Just because you might want it to be, should it be? Does it have to be?</p>
<p>Let&#8217;s face it &#8211; a lot of people look for the easy way out. The sheer volume of advertisements pushing diet supplements that claim you can lose a ton of weight without having to watch what you eat and exercise is evidence that people want to get the results they want without working for them.</p>
<p>You know the best way to lose weight? A healthy diet combined with regular exercise. The problem with eating healthy and exercising regularly is that it requires discipline and hard work.</p>
<p>I&#8217;m not saying there isn&#8217;t a better way to search &#8211; I am a fan of Thomas Edison&#8217;s belief that &#8220;There is always a better way.&#8221;</p>
<p>However, I believe that the better way, specifically when it comes to information retrieval, involves discipline and the hard work of people using <a title="Critical thinking is the process of thinking that questions assumptions. It is a way of deciding whether a claim is true, false; sometimes true, or partly true. The origins of critical thinking can be traced in Western thought to the Socratic method of Ancient Greece and in the East, to the Buddhist kalama sutta and Abhidharma. Critical thinking is an important component of most professions. It is a part of the education process and is increasingly significant as students progress through university to graduate education, although there is debate among educators about its precise meaning and scope.[1]" href="http://en.wikipedia.org/wiki/Critical_thinking">critical thought</a> in the search process &#8211; not short-cutting or completely removing it from the equation.</p>
<p>And I am not alone.</p>
<p>There is already considerable work being done to create new kinds of search systems that <em><strong>depend on </strong><strong>continuous human control of the search process.</strong></em> It&#8217;s called <a title="Human–computer information retrieval (HCIR) is the study of information retrieval techniques that bring human intelligence into the search process. The fields of human–computer interaction (HCI) and information retrieval (IR) have both developed innovative techniques to address the challenge of navigating complex information spaces, but their insights have often failed to cross disciplinary borders. Human–computer information retrieval has emerged in academic research and industry practice to bring together research in the fields of IR and HCI, in order to create new kinds of search systems that depend on continuous human control of the search process." href="http://en.wikipedia.org/wiki/Human%E2%80%93computer_information_retrieval">Human-Computer Information Retrieval (HCIR)</a> - which is the study of information retrieval techniques that bring human intelligence into the search process.</p>
<p>Truly intelligent search systems should not involve limiting or removing human thought, analysis, and influence from the search process &#8211; in fact, they should and can involve and encourage user influence.</p>
<p>When you break it down, the information retrieval process has 2 basic parts:</p>
<ol>
<li>The user enters a query, which is a formal statement of their information need</li>
<li>The search engine returns results</li>
</ol>
<p>The key, in my opinion, is that the search engine should return results in a &#8220;Is this what you were looking for?&#8221; manner and allow you to intelligently refine your results, as opposed to a &#8220;This <em><strong>is</strong></em> what you were looking for&#8221; manner.</p>
<p>There&#8217;s a BIG difference.</p>
<p>The former begs for user influence and input; the latter does not &#8211; it makes the assumption that it found what you wanted.</p>
<p>The bottom line is that no matter what you are using to search for information, only <em><strong>you</strong></em> know what you&#8217;re looking for, and therefore only you can judge the relevance of the search results returned.</p>
<p>Intelligent search isn&#8217;t easy, because you actually have to think before and after hitting the search button.</p>
<h2>The Intelligent Search Process</h2>
<p>As I have written before, searching should not be a once-and-done affair &#8211; there is no mythical &#8220;one search to find them all.&#8221;</p>
<p><a title="The real “magic” and work of sourcing talent via human capital data is the iterative, intelligent, and cognitively challenging process of selecting a combination of words and phrases, and in some cases strategically excluding others, analyzing the results returned, making changes to the query based on observed relevance, and repeating the process until an acceptable quantity of highly qualified and well-matched candidates are identified." href="http://www.booleanblackbelt.com/2011/04/sourcing-is-an-investigative-and-iterative-process/">Searching is ideally an iterative process that requires intelligent user input</a>.</p>
<p>Here is an example of an intelligent, iterative search process applied to sourcing talent:</p>
<ol>
<li>Analyzing, understanding, and interpreting job opening/position requirements</li>
<li>Taking that understanding and intelligently selecting titles, skills, technologies, companies, responsibilities, terms, etc. to include (<em><strong>or purposefully exclude!</strong></em>) in a query employing appropriate Boolean operators and/or facets and query modifiers</li>
<li>Critically reviewing the results of the initial search to assess relevance as well as scanning the results for additional and alternate relevant search terms, phrases, and companies</li>
<li>Based upon the observed relevance of and intel gained from the search results, modifying the search string appropriately and running it again</li>
<li>Repeating steps 3 and 4 until an acceptably large volume of highly relevant results is achieved</li>
</ol>
<p>Anyone can enter search terms and hit the &#8220;search&#8221; button, but not everyone can effectively and intelligently search.</p>
<p>Until you&#8217;ve witnessed intelligent and iterative search in action, you likely wouldn&#8217;t know the difference between &#8220;great&#8221; search results, &#8220;good&#8221; search results and &#8220;bad&#8221; search results.</p>
<p>It&#8217;s as dramatic as the difference between an experienced professional offshore fisher, a recreational fisher, and someone going offshore fishing for the first time.</p>
<p>The ocean holds the same fish for everyone fishing it. While a first-time or recreational fisher can get lucky every once in a while, only a person who really knows what they&#8217;re doing can get &#8220;lucky&#8221; on a consistent basis and catch the fish the recreational fisher only dreams of catching.</p>
<h2>Final Thoughts</h2>
<p>The ability to enter in some search terms and click the &#8220;search&#8221; button doesn&#8217;t convey any supernatural search ability, but it does certainly make people feel like they are good at searching, because unless you mistype something, everyone&#8217;s a winner.</p>
<p>Ultimately, search engines of all types retrieve information, but information requires analysis, and only humans can analyze and interpret for relevance.</p>
<p>Eiji Toyoda, the former President of Toyota Motor Corp., has observed that “Society has reached the point where one can push a button and immediately be deluged with…information. This is all very convenient, of course, but if one is not careful there is a danger of losing the ability to think.”</p>
<p><a title="Critical thinking has been described as “reasonable reflective thinking focused on deciding what to believe or do.”[2] It has also been described as &quot;thinking about thinking.&quot;[3] It has been described in more detail as &quot;the intellectually disciplined process of actively and skillfully conceptualizing, applying, analyzing, synthesizing, and/or evaluating information gathered from, or generated by, observation, experience, reflection, reasoning, or communication, as a guide to belief and action&quot;[4] More recently, critical thinking has been described as &quot;the process of purposeful, self-regulatory judgment, which uses reasoned consideration to evidence, context, conceptualizations, methods, and criteria.&quot;[5] " href="http://en.wikipedia.org/wiki/Critical_thinking">Critical thinking</a> is perhaps <a title="Critical thinking is the skill most demanded by employers around the world when assessing job candidates, according to organisational and people development consultancy, APM Group." href="http://www.nationmultimedia.com/2011/05/04/business/Importance-of-critical-thinking-30154554.html">the most important skill a knowledge worker can possess</a>.</p>
<p>The reason so many people stink at search is that most people simply don&#8217;t think before or after they search, and they place too much trust in the machine.</p>
<p>Additionally, the quality of the search terms/info entered directly affects the quality of the results. &#8220;Garbage in = garbage out&#8221; certainly applies here. And effective searching is rarely a &#8220;once and done&#8221; affair &#8211; the ability to critically evaluate search results for relevance and successively refine the search criteria to increase relevance is the key to true &#8220;intelligent search.&#8221;</p>
<p>&#8220;<a title="In science and engineering, a black box is a device, system or object which can be viewed solely in terms of its input, output and transfer characteristics without any knowledge of its internal workings, that is, its implementation is &quot;opaque&quot; (black)." href="http://en.wikipedia.org/wiki/Black_box">Black box</a>&#8221; matching algorithms can be wonders of technology and engineering, but they pose significant problems in that searchers have absolutely no insight as to <em><strong>why</strong></em> they return the results they do, and in many cases, the engineers creating these semantic/NLP matching algorithms assume they know what their users are looking for better than the users themselves. <del>I&#8217;m sorry if I am the only person offended by such an assumption.</del></p>
<p>Okay, I&#8217;m not sorry.</p>
<p>I love technology, and I use and have used some of the best matching technology available, but I also know it&#8217;s not a good idea to try to limit or remove intelligent critical thinking from the search process and completely replace it with matching algorithms.</p>
<p>The term human–computer information retrieval was coined by <a title="Learn more about Gary Marchionini" href="http://www.ils.unc.edu/~march/">Gary Marchionini</a> whose main thesis is that “HCIR aims to empower people to explore large-scale information bases <strong><em>but demands that</em></strong> <strong><em>people also take responsibility for this control by expending cognitive and physical energy</em></strong>.” (emphasis mine)</p>
<p>For those who simply want information systems to magically provide them with the most relevant results at the click of a button, you should take special note of the fact that experts in the field of HCIR do not believe that people should step out of the information retrieval process and let semantic search/NLP algorithms/AI be solely responsible for the search process.</p>
<p>If you want to get better search results, use the latest technologies, but don&#8217;t put too much trust in the machine.</p>
<p>Instead, put some skin in the game, take responsibility for the search process, and expend some cognitive energy critically thinking through not only your search input, but also the results for relevance.</p>
<p>&#8220;In the age of information sciences, the most valuable asset is <a title="Knowledge is a familiarity with someone or something unknown, which can include information, facts, descriptions, or skills acquired through experience or education. It can refer to the theoretical or practical understanding of a subject. It can be implicit (as with practical skill or expertise) or explicit (as with the theoretical understanding of a subject); and it can be more or less formal or systematic.[1] In philosophy, the study of knowledge is called epistemology, and the philosopher Plato famously defined knowledge as &quot;justified true belief.&quot; There is however no single agreed upon definition of knowledge, and there are numerous theories to explain it. Knowledge acquisition involves complex cognitive processes: perception, learning, communication, association and reasoning; while knowledge is also said to be related to the capacity of acknowledgment in human beings.[2]" href="http://en.wikipedia.org/wiki/Knowledge">knowledge</a>, which is a creation of human imagination and creativity. We were among the last to comprehend this truth and we will be paying for this oversight for many years to come.&#8221; — Mikhail Gorbachev, 1990</p>
<h2>Strictly For the Search Geeks</h2>
<p>Check out this <a title="The HCIR 2011 Challenge focuses on the case where recall is everything – namely, the problem of information availability. The information availability problem arises when the seeker faces uncertainty as to whether the information of interest is available at all. Instances of this problem include some of the highest-value information tasks, such as those facing national security and legal/patent professionals, who might spend hours or days searching to determine whether the desired information exists." href="https://sites.google.com/site/hcirworkshop/hcir-2011/challenge">HCIR Challenge</a>, and at least read the introduction, which compares and contrasts precision vs. recall and references iterative query refinement.</p>
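<p>For readers less familiar with those two measures, here is a minimal Python sketch of how precision and recall are conventionally computed for a single search. The function name and the result IDs are hypothetical and used only for illustration; they do not come from the HCIR Challenge.</p>

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    precision = share of the returned results that are relevant
    recall    = share of all relevant items that were returned
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved.intersection(relevant)
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 results returned, 3 relevant items exist, 2 overlap.
precision, recall = precision_recall({"a", "b", "c", "d"}, {"a", "b", "e"})
print(precision, round(recall, 2))  # 0.5 0.67
```

<p>Iterative query refinement amounts to changing the search, re-running it, and watching how both numbers move.</p>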
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.booleanblackbelt.com/2011/12/why-so-many-people-stink-at-searching/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Bing&#8217;s Semantic Search, Phonetics and Undocumented Operator</title>
		<link>http://www.booleanblackbelt.com/2011/11/bings-semantic-search-phonetics-and-undocumented-operator/</link>
		<comments>http://www.booleanblackbelt.com/2011/11/bings-semantic-search-phonetics-and-undocumented-operator/#comments</comments>
		<pubDate>Mon, 14 Nov 2011 14:00:39 +0000</pubDate>
		<dc:creator>Glen Cathey</dc:creator>
				<category><![CDATA[Bing]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Semantic Search]]></category>
		<category><![CDATA[x-ray search]]></category>
		<category><![CDATA[Bing Phonetic Search]]></category>
		<category><![CDATA[Bing Plus Sign]]></category>
		<category><![CDATA[Bing Search]]></category>
		<category><![CDATA[Bing Search Operators]]></category>
		<category><![CDATA[Bing Semantic Search]]></category>
		<category><![CDATA[Bing Undocumented Operator]]></category>
		<category><![CDATA[Bing vs. Google]]></category>
		<category><![CDATA[LinkedIn Search]]></category>
		<category><![CDATA[LinkedIn X-Ray]]></category>
		<category><![CDATA[Phonetic Search]]></category>

		<guid isPermaLink="false">http://www.booleanblackbelt.com/?p=10036</guid>
		<description><![CDATA[I was recently performing some searches on Bing and came across something curious that I had never noticed before. I&#8217;m not exactly sure if what I found is new or simply something I&#8217;ve overlooked in the past. I updated Twitter with &#8220;Did you know that Bing supports the + query modifier?&#8221; on November 10th, wondering if it [...]]]></description>
			<content:encoded><![CDATA[
<p>I was recently performing some searches on Bing and came across something curious that I had never noticed before.</p>
<p>I&#8217;m not exactly sure if what I found is new or simply something I&#8217;ve overlooked in the past. <a title="My original Twitter update regarding my finding that Bing search supports the +/Plus sign" href="http://twitter.com/#!/GlenCathey/status/134662838814900225">I updated Twitter with &#8220;Did you know that Bing supports the + query modifier?&#8221;</a> on November 10th, wondering if it was something that other people knew about.</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus.png"><img class="alignnone size-full wp-image-10071" title="My Twitter update asking sourcers and recruiters if they knew that Bing supports the +/Plus sign in searches" src="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus.png" alt="" width="520" height="233" /></a></p>
<p>I only received a few responses, including a couple from noted sourcing luminaries, and the consensus was that I didn&#8217;t find anything because it <a title="Bing Search operator/functionality documentation" href="http://msdn.microsoft.com/en-us/library/ff795620.aspx">wasn&#8217;t documented</a> anywhere and they could not get it to work.</p>
<p>However, the +/Plus sign does in fact work when searching Bing &#8211; just not like it used to in Google.</p>
<p>It&#8217;s always a little exciting to think you are one of the first people to stumble across something most people don&#8217;t know about, although I won&#8217;t get my hopes up that I&#8217;m the only person outside of some folks at Microsoft who&#8217;s ever figured out that Bing supports the +/Plus sign in searches.</p>
<p>This discovery also led me to proof of Bing leveraging semantic and <a title="Phonetic search is a method of locating information in which an algorithm is used to locate combinations of characters that sound similar to a specified combination." href="http://www.answers.com/topic/phonetic-search">phonetic search</a>. <span id="more-10036"></span></p>
<h2>Bing Search Supports the +/Plus Sign</h2>
<p>So I was tinkering around on Bing testing <em><strong>very</strong></em> basic LinkedIn X-Ray searches (more on that later), and here&#8217;s my original Bing search of LinkedIn: <a title="Here's my original Bing X-Ray search of LinkedIn" href="http://www.bing.com/search?q=site:linkedin.com+%22location+Houston%22+java+&amp;qs=n&amp;sk=&amp;sc=1-42&amp;form=QBRE">site:linkedin.com &#8220;location Houston&#8221; java</a></p>
<p>Here are the results I found &#8211; notice anything odd?</p>
<p>&nbsp;</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus3.png"><img class="alignnone size-full wp-image-10040" title="Bing results of a basic X-Ray search of LinkedIn: site:linkedin.com &quot;location Houston&quot; java" src="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus3.png" alt="" width="600" height="738" /></a></p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p>I immediately noticed that the 3rd, 4th, and 5th results highlighted keyword hits of &#8220;Coffee.&#8221;</p>
<p>My first response was confusion &#8211; I could not recall Bing ever trying to so obviously perform <a title="If you're not familiar with semantic search, you can learn more here. I will also be posting an extensive article on semantic search on my Boolean Black Belt website in the near future, so stay tuned!" href="http://en.wikipedia.org/wiki/Semantic_search">semantic search</a> and attempt to guess what I might be looking for by returning results with related terms I didn&#8217;t actually search for.</p>
<p>Then I scanned back up the page and noticed something similar to what I see on Google all the time, typically when Google thinks I might have misspelled something:</p>
<p>&nbsp;</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus4.png"><img class="alignnone size-full wp-image-10041" title="Bing decided it might know what I was searching for and returned some results with a related word other than the actual search term I used" src="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus4.png" alt="" width="600" height="163" /></a></p>
<p>When I clicked on &#8220;Do you want results for site:linkedin.com &quot;location Houston&quot; java,&#8221; this is what I saw:</p>
<p><a href="http://www.bing.com/search?q=%2bsite%3alinkedin.com+%22location+Houston%22+java+&amp;FORM=RCRE"><img class="alignnone size-full wp-image-10042" title="Bing search results, without Bing trying to perform semantic search and guess as to what I was searching for and give me search results with words I did not search for" src="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus5.png" alt="" width="600" height="668" /></a></p>
<p>&nbsp;</p>
<p>The first thing I noticed was the +/Plus sign.</p>
<p>I could not recall ever seeing it before when searching Bing.</p>
<p>Then I looked at the results, and it was obvious that the +/Plus sign was serving to remove Bing&#8217;s attempt at semantic search and only return results with the exact terms I searched for.</p>
<p>No more results mentioning &#8220;coffee&#8221; when I was searching for Java.</p>
<p>If you think my observation of the +/Plus sign was a fluke, the very next day I was helping one of my associates with a search and noticed he used &#8220;HSCM&#8221; in an OR statement for a PeopleSoft FSCM position. I had never encountered HSCM before on a resume making reference to anything PeopleSoft SCM related, so I Binged it.</p>
<p>My search was simply <a title="Bing search for PeopleSoft HSCM" href="http://www.bing.com/search?q=PeopleSoft+HSCM&amp;go=&amp;qs=n&amp;sk=&amp;sc=8-15&amp;form=QBRE">PeopleSoft HSCM</a>.</p>
<p>When I saw the results, I noticed the &#8220;Including results for peoplesoft hcm,&#8221; even though I searched for HSCM.</p>
<p>&nbsp;</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus6.png"><img class="alignnone size-full wp-image-10044" title="Bing search results that included results for what it thought I might be looking for, returning results with words I did not search for" src="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus6.png" alt="" width="600" height="546" /></a></p>
<p>&nbsp;</p>
<p>In this case, I don&#8217;t think Bing was trying to perform semantic search and return a related search term &#8211; I think Bing was actually steering me towards a spelling variant that is more common to Bing&#8217;s index, perhaps assuming that I misspelled the term in my original search.</p>
<p>When I clicked on &#8220;Do you want results for PeopleSoft HSCM,&#8221; there were only 31 results, and the +/Plus sign was there, preceding the search string:</p>
<p>&nbsp;</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus7.png"><img class="alignnone size-full wp-image-10045" title="The +/Plus sign on Bing serves to return only results with the exact search terms you specified, without variants or suggestions" src="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus7.png" alt="" width="600" height="523" /></a></p>
<p>&nbsp;</p>
<p>If you try the same search on Google, Google doesn&#8217;t give you the benefit of the doubt: it simply assumes you misspelled your search term and returns results for what it thinks you were searching for.</p>
<p>&nbsp;</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus8.png"><img class="alignnone size-full wp-image-10048" title="Google doesn't even give me results with my original search term, jumps to the conclusion I must have misspelled it, and gives me search results for what Google thinks I was searching for." src="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus8.png" alt="" width="600" height="368" /></a></p>
<p>&nbsp;</p>
<p>How rude.</p>
<p>I know it&#8217;s a stretch, but there are some people who actually do know what they are searching for and would rather not have their searches hijacked.</p>
<h2>Bing vs. Google</h2>
<p>I was a very <a title="Here's what you would have seen if you used Google back in 1998" href="http://web.archive.org/web/19981202230410/http://www.google.com/">early adopter of Google&#8217;s search engine</a> (think 1998), preferring it over what most &#8220;power searchers&#8221; were using back then (think <a title="Yes, AltaVista still exists, albeit in neutered form" href="http://www.altavista.com/">AltaVista</a>).</p>
<p>For many years I was a Google extremist &#8211; I used Google search for literally all of my searching needs and never bothered to search using any other Internet search engine except for experimental poking around.</p>
<p>However, not too long ago, after getting frustrated with the junk Google was returning in my LinkedIn searches as well as <a title="Google can get a bit overzealous with more complex searches, forcing you to prove you're human before giving you your search results" href="http://www.booleanblackbelt.com/2010/05/what-to-do-if-google-thinks-youre-not-human/">Google more frequently questioning my humanity by forcing me to jump through CAPTCHA hoops</a>, my experimental poking around with Bing got more serious.</p>
<p>At this time, I use Bing more than I use Google &#8211; I&#8217;d estimate a 60/40 split.</p>
<p>Part of this is driven by the fact that I find <a title="Here's a Boolean Black Belt article focused on why I feel Bing beats Google when it comes to X-Ray searching LinkedIn" href="http://www.booleanblackbelt.com/2010/09/bing-beats-google-for-the-best-way-to-x-ray-search-linkedin/">Bing X-Ray searches of LinkedIn are so much &#8220;cleaner&#8221; and not subject to as much &#8220;noise&#8221; as Google search results</a>. I also find searching for LinkedIn profile headline phrases in Bing to do a very good job of returning the profile I&#8217;m looking for, even if I don&#8217;t use the site: command to specifically search LinkedIn.</p>
<p>And of course I love the fact that <a title="Learn more about the big deal about Bing when it comes to sourcing potential candidates on the Internet, LinkedIn, Twitter, etc." href="http://www.booleanblackbelt.com/2010/12/the-big-deal-about-bing-for-sourcing-and-recruiting/">Bing supports configurable proximity with the NEAR:X search functionality, allowing me to perform feats of magic and semantic search at the sentence level</a>.</p>
<p>I also like the fact that, as I showed above, Bing will by default include your search terms along with results of terms it thinks you might find relevant.</p>
<p>With similar searches, Google just assumes you don&#8217;t really know what you were searching for and gives you results of what it thinks you were searching for.</p>
<p>And if you happen to be searching for flights, Bing&#8217;s Price Predictor totally rocks!</p>
<p><a href="http://www.bing.com/travel/flights?FORM=TR2AFL"><img class="alignnone size-full wp-image-10050" title="Bing Travel's Price Predictor hasn't failed me yet, and it's saved me hundreds of dollars already by helping me wait until the right time to buy tickets" src="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus9.png" alt="" width="515" height="185" /></a></p>
<p>Unrelated to sourcing and recruiting, I know &#8211; but a gem nonetheless!</p>
<h2>Bing Searchers Beware of Semantic and Phonetic Search</h2>
<p>Now that I am on the lookout for Bing&#8217;s semantic search, I&#8217;ve noticed that sometimes Bing will slip in semantic search results without giving you the &#8220;Including results for ____ / Do you want results for _____&#8221; heads-up that lets you know Bing has included results with terms you didn&#8217;t actually search for but that Bing thinks are related and relevant.</p>
<p>For example, here are the first page search results for a Java search that returns &#8220;Coffee&#8221; and more interestingly &#8220;Coffey&#8221; &#8211; which means that Bing is not only going semantic by returning words that may have a similar meaning in certain contexts, but also <em><strong>phonetic,</strong></em> returning words that sound similar to the search term.</p>
<p>&nbsp;</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus10.png"><img class="alignnone size-full wp-image-10054" title="Example of Bing performing not only semantic search but also leveraging phonetics in a Bing X-Ray search of LinkedIn" src="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus10.png" alt="" width="600" height="1027" /></a></p>
<p>&nbsp;</p>
<p><a title="Here's Bing's cached result for my LinkedIn X-Ray search for Java, among other search terms" href="http://cc.bingj.com/cache.aspx?q=site%3alinkedin.com+java+%22location+Tampa%2fSt.+Petersburg%22+%22project+manager%22+&amp;d=4708237981387615&amp;mkt=en-US&amp;setlang=en-US&amp;w=e70d497d,d899caf5">If you explore the cached page for the Coffey result</a>, you will notice that there isn&#8217;t any mention of Java anywhere, so the only thing I can conclude is that Bing took my search term of Java and leveraged semantics to also search for coffee as well as phonetic variants, such as Coffey.</p>
<p>I know there have to be a few fellow search geeks who find that prospect quite interesting. It looks like the folks behind Bing search have been busy!</p>
<p>In any event, the real lesson here is that Bing didn&#8217;t give me a heads-up that it decided to also return results with terms I didn&#8217;t actually search for.</p>
<p>So, if you&#8217;re using Bing to search for anything and you don&#8217;t want it taking any liberties with semantic search because you only want results with the exact search terms you used, be sure to add a +/Plus sign to the beginning of your search, like so:</p>
<p>&nbsp;</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus111.png"><img class="alignnone size-full wp-image-10056" title="Be sure to use the +/Plus sign when searching Bing and you don't want Bing to use semantic search and only return results with your exact search terms." src="http://www.booleanblackbelt.com/wp-content/uploads/2011/11/BingPlus111.png" alt="" width="527" height="29" /></a></p>
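<p>If you build Bing search links programmatically, the same idea can be sketched in a few lines of Python. This is only an illustration: the helper name is hypothetical, the URL pattern is simply the one visible in the Bing links earlier in this post, and because the +/Plus sign is undocumented there is no guarantee Bing will keep honoring it.</p>

```python
from urllib.parse import quote_plus


def bing_exact_url(query):
    """Build a Bing search URL with a leading "+" on the whole query.

    Hypothetical helper: the "+" prefix mirrors the behavior observed
    in this post and is not a documented Bing operator.
    """
    exact = query if query.startswith("+") else "+" + query
    return "http://www.bing.com/search?q=" + quote_plus(exact)


print(bing_exact_url('site:linkedin.com "location Houston" java'))
# http://www.bing.com/search?q=%2Bsite%3Alinkedin.com+%22location+Houston%22+java
```

<p>Note that the plus sign has to be URL-encoded as %2B, because an unencoded + in a query string is read as a space.</p>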
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.booleanblackbelt.com/2011/11/bings-semantic-search-phonetics-and-undocumented-operator/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>What is a Boolean Black Belt Anyway?</title>
		<link>http://www.booleanblackbelt.com/2011/10/what-is-a-boolean-black-belt-anyway/</link>
		<comments>http://www.booleanblackbelt.com/2011/10/what-is-a-boolean-black-belt-anyway/#comments</comments>
		<pubDate>Mon, 10 Oct 2011 13:00:28 +0000</pubDate>
		<dc:creator>Glen Cathey</dc:creator>
				<category><![CDATA[Boolean]]></category>
		<category><![CDATA[Extended Boolean]]></category>
		<category><![CDATA[Information Retrieval]]></category>
		<category><![CDATA[Semantic Search]]></category>
		<category><![CDATA[Beyond Boolean]]></category>
		<category><![CDATA[Boolean Black Belt]]></category>
		<category><![CDATA[Boolean Logic]]></category>
		<category><![CDATA[Boolean Search]]></category>
		<category><![CDATA[Boolean Search Strings]]></category>
		<category><![CDATA[information retrieval]]></category>
		<category><![CDATA[Query Modifiers]]></category>

		<guid isPermaLink="false">http://www.booleanblackbelt.com/?p=9902</guid>
		<description><![CDATA[I&#8217;ve been blogging nearly 3 years now, and I realized I&#8217;ve never come out and actually defined the term &#8221;Boolean Black Belt.&#8221; The concept seems pretty self explanatory, but there has been at least 1 person who&#8217;s taken the opportunity to point out (and gain some traffic in the process &#8211; but it&#8217;s all good!) that it could be perceived as a [...]]]></description>
			<content:encoded><![CDATA[
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2009/05/black-belt-by-quedalapalabra-via-creative-commons.jpg"><img class="alignright" title="black-belt-by-quedalapalabra-via-creative-commons" src="http://www.booleanblackbelt.com/wp-content/uploads/2009/05/black-belt-by-quedalapalabra-via-creative-commons.jpg" alt="" width="240" height="117" /></a>I&#8217;ve been blogging for nearly 3 years now, and I realized I&#8217;ve never come out and actually defined the term &#8220;Boolean Black Belt.&#8221;</p>
<p>The concept seems pretty self-explanatory, but there has been at least 1 person who&#8217;s taken the opportunity to point out (and gain some traffic in the process &#8211; but it&#8217;s all good!) that it could be perceived as a bit of an oxymoron to be an &#8220;expert&#8221; in something as simple as 3 Boolean operators.</p>
<p>Interestingly, however, I&#8217;ve found that most sourcers and recruiters don&#8217;t even fully exploit the various powers of the OR and NOT operators &#8211; not even close.</p>
<p>So what is a &#8220;Boolean Black Belt&#8221; anyway?<span id="more-9902"></span></p>
<h2>Black Belt</h2>
<p>I use the term &#8220;Black Belt&#8221; in reference to the widely known way of describing an expert in martial arts, where the black belt is commonly the highest belt color used and denotes a high degree of competence.</p>
<p>That&#8217;s the easy part; the &#8220;Boolean&#8221; part isn&#8217;t so simple to define.</p>
<h2>Boolean</h2>
<p>I&#8217;d like to take the opportunity to clear up some misconceptions about, and disambiguate, my use of &#8220;Boolean&#8221; in &#8220;Boolean Black Belt&#8221; and in pretty much any article in which I refer to Boolean.</p>
<p>When I refer to &#8220;Boolean,&#8221; I am not referring only to the basic Boolean operators of AND, OR, and NOT. I&#8217;m actually referring to the entire process of:</p>
<ol>
<li>Analyzing, understanding, and interpreting job opening/position requirements</li>
<li>Taking that understanding and intelligently selecting titles, skills, technologies, companies, responsibilities, terms, etc. to include (or purposefully exclude!) in a query employing appropriate Boolean operators and query modifiers</li>
<li>Reviewing the results of the initial search to assess relevance as well as scanning the results for additional and alternate relevant search terms, phrases, and companies</li>
<li>Based upon the observed relevance of and intel gained from the search results, modifying the search string appropriately and running it again</li>
<li>Repeating steps 3 and 4 until an acceptably large volume of highly relevant results is achieved</li>
</ol>
<p>Instead of trying to put all of that into a domain name and a concise catch phrase, hopefully you can appreciate why I chose to summarize that entire process as &#8220;Boolean.&#8221;</p>
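<p>For anyone who thinks better in code, steps 2 through 4 can be sketched in Python as a tiny search-review-refine pass. Everything here is hypothetical: the profiles are made-up strings, the matching is plain substring matching, and the function name is mine. It is not how any real resume database or search engine evaluates a query.</p>

```python
def matches(doc, all_of, any_of, none_of):
    """AND across all_of, OR across any_of, NOT across none_of."""
    text = doc.lower()
    return (
        all(term in text for term in all_of)
        and (not any_of or any(term in text for term in any_of))
        and not any(term in text for term in none_of)
    )


# Hypothetical, made-up profiles.
profiles = [
    "Java/J2EE developer, Houston, Spring",
    "Barista, Houston, Java coffee roasting",
    "J2EE engineer, Houston, Spring",
]

# Steps 2 and 3: run a first search, then review the results.
first = [p for p in profiles if matches(p, ["houston"], ["java"], [])]

# Step 4: the review surfaces a false positive ("coffee") and an alternate
# term ("J2EE"), so the search is modified and run again.
second = [p for p in profiles
          if matches(p, ["houston"], ["java", "j2ee"], ["coffee"])]

print(len(first), len(second))  # 2 2
```

<p>Both passes return 2 profiles, but the second pass swaps the barista for the J2EE engineer that the first search missed. The counts are the same, and the relevance is not.</p>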
<h2>Beyond Boolean Logic</h2>
<p>Admittedly, the basic Boolean operators are easy to learn &#8211; after all, there are only 3 of them!</p>
<p>However, anyone who&#8217;s adept at leveraging databases and information systems for talent identification knows that the &#8220;magic&#8221; does not lie in the operators themselves, but in all of the steps detailed above.</p>
<p>The &#8220;real&#8221; work of creating effective Boolean search strings lies in the interpretive analysis of the need, determining what terms to include and exclude from searches and in what specific combination, in the analysis of the relevance of the initial search results, and the adaptive process of learning from the results to further refine the Booleans to find a large quantity of highly relevant results &#8211; people who are highly likely to be (or know!) the right match for your hiring needs.</p>
<p>What I just described is actually the process of <a title="Sourcing isn't about Boolean logic/search, it's about Information Retrieval - read more on the subject here" href="http://www.booleanblackbelt.com/2011/04/beyond-boolean-human-capital-information-retrieval/">Information Retrieval</a> (IR), but no matter how much I write on the subject, people still cling to &#8220;Boolean.&#8221;</p>
<h2>Sourcing isn&#8217;t so Simple</h2>
<p>While learning about the concepts of basic Boolean logic is easy, there is nothing inherently easy about creating Boolean search strings for talent identification.</p>
<p>To say that searching databases and information systems to identify talent is &#8220;easy&#8221; because it&#8217;s defined only by 3 simple Boolean operators is to admit that you have little to no understanding or appreciation of online, database, or social network sourcing.</p>
<p>That would be like saying that a challenging math-based brain teaser is simple because everyone understands addition, subtraction, division, and multiplication.</p>
<p>For example, this classic puzzle should be easy for anyone who understands basic math, right?</p>
<p>&#8220;My grandson is about as many days as my son is weeks, and my grandson is as many months as I am in years. My grandson, my son and I together are 100 years. Can you tell me my age in years?&#8221;</p>
<p>After all, it only requires 3 basic and simple mathematical operations: addition, multiplication, and division. If that one is too &#8220;easy&#8221; for you, give <a title="Tough Brain Teaser" href="http://www.braingle.com/brainteasers/44101/six-villages.html" target="_blank">this brain teaser</a> a try &#8211; it too only requires basic math to solve.</p>
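<p>Here is one way to set the first puzzle up, sketched in Python. The variable names are mine, and the sketch assumes that &#8220;about&#8221; can be read as exact: as many days as weeks makes the son 7 times the grandson&#8217;s age, and as many months as years makes the grandparent 12 times the grandson&#8217;s age.</p>

```python
from fractions import Fraction

# Ages expressed as multiples of the grandson's age (assumed exact ratios).
son_multiple = 7           # a week has 7 days
grandparent_multiple = 12  # a year has 12 months

# grandson + 7 * grandson + 12 * grandson = 100 years
grandson = Fraction(100, 1 + son_multiple + grandparent_multiple)
son = son_multiple * grandson
grandparent = grandparent_multiple * grandson

print(grandson, son, grandparent)  # 5 35 60
```

<p>The arithmetic takes one line; deciding that 7 and 12 are the numbers that matter is the actual work.</p>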
<p>It should be obvious that the real challenge of math-based problems comes from being able to understand the puzzle in the first place, and then determining precisely what types of equations and operations are required to solve the problem.</p>
<p>The analysis and understanding are primary and the mathematical operators secondary, as the operators are useless without a proper understanding of how, specifically, they need to be applied.</p>
<p>It&#8217;s the same thing with Boolean search strings.</p>
<h2>Extended Boolean</h2>
<p>Beyond the 3 &#8220;standard&#8221; Boolean operators, there lies extended Boolean, which typically includes proximity operators and term weighting/boosting.</p>
<p>While not every search engine supports extended Boolean, those that do afford users the ability to dramatically increase the relevance of search results, effectively enabling user-defined semantic search.</p>
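<p>To make proximity search concrete, here is a minimal Python sketch of one plausible reading of a &#8220;within N words&#8221; operator. The function name and the sample text are hypothetical, and this is not a description of how any particular search engine implements proximity internally.</p>

```python
import re


def near(text, term_a, term_b, max_words):
    """True if term_a and term_b occur with at most max_words words between them."""
    words = re.findall(r"\w+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(
        i != j and max_words >= abs(i - j) - 1
        for i in pos_a
        for j in pos_b
    )


# Hypothetical resume sentence.
sentence = "Managed a team of Java developers in Houston"
print(near(sentence, "managed", "developers", 4))  # True
print(near(sentence, "managed", "developers", 3))  # False
```

<p>A plain AND would match any document that mentions both words anywhere. Requiring them to sit close together is what lets a searcher target a statement such as &#8220;managed ... developers&#8221; rather than two unrelated mentions.</p>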
<h2>Semantic Search</h2>
<p>Semantic search can be defined as search techniques that leverage the actual meaning in words and phrases and can return results that more closely match the &#8220;meaning&#8221; or intent of the search rather than simply returning results that match the words of the search.</p>
<p>The whole goal of searching databases, the Internet, social media, or other information systems is ostensibly to find people who have a high likelihood of being (or knowing!) a potential match for a hiring need that you have now, or will have in the future.</p>
<p>The more skill and ability you have in being able to craft and execute Boolean and extended Boolean search strings that find more of the right people more quickly, the more effective you can be as a Sourcer or Recruiter.</p>
<p>By &#8220;effective&#8221; I mean filling more positions with high quality talent while reducing time-to-fill.</p>
<p>More. Faster. Better.</p>
<p>Whenever I refer to &#8220;Boolean&#8221; in articles or even in the name of this blog, I&#8217;m actually referring to extended Boolean and user-defined semantic search as well as the basic Boolean operators.</p>
<h2>Query Modifiers</h2>
<p>Boolean search strings are often composed of more than just search terms and Boolean operators.</p>
<p>There are also query modifiers, and depending on the search engine, they can include: *, &quot; &quot;, inurl:, ~, ( ), w/, and many more.</p>
<p>Anyone hoping or claiming to have a high degree of competence with sourcing not only has to have a solid command of the basic Boolean operators, but also how to leverage the available and appropriate query modifiers.</p>
<h2>Final Thoughts</h2>
<p>I use the term &#8220;Boolean Black Belt&#8221; to describe someone with a high degree of competence in the entire process of interpreting and understanding a specific talent need, determining what terms to include and/or exclude from searches and in what specific combination, crafting search strings making effective and appropriate use of Boolean operators, query modifiers, search terms, and semantic search techniques, the analysis of the relevance of the initial search results, and the adaptive process of learning from the results to further refine the Booleans to find a large quantity of highly relevant results &#8211; people who are highly likely to be (or know!) the right match for their hiring need.</p>
<p>I believe that when most people in sourcing and recruiting roles refer to &#8220;Boolean,&#8221; they are not simply referring to AND, OR, and NOT.</p>
<p>To say that mastering the use of Boolean search strings for talent identification is limited to the understanding of the functions of 3 Boolean operators would be ridiculous and an obvious sign of ignorance.</p>
<p>Most people would agree that Barack Obama is an excellent orator, yet he does not use words most people do not understand. For the most part, he uses common words that everyone is familiar with. But his ability as an orator cannot be defined by or limited to the common words he uses &#8211; it lies in how he organizes his thoughts and how he arranges and delivers his sentences to convey his intended meaning.</p>
<p>Most sculptors, golfers, jiu jitsu practitioners, and orators use the same tools, clubs, moves, and words. However, mastery does not come from the specific tools, clubs, movements, or words &#8211; it&#8217;s in the appropriate and effective APPLICATION of them, typically in response to a challenge or to achieve a specific goal.</p>
<p>Knowing what golf clubs are and how to swing them does not make you a world-class golfer. Having a good vocabulary does not make you an excellent public speaker. Knowing how to punch and kick will not ensure you can win any martial arts/MMA competitions. Owning a hammer and chisel does not make you a world-renowned sculptor.</p>
<p>Similarly, having a command of 3 Boolean operators does not ensure that you can understand the positions you are sourcing or recruiting for and effectively leverage electronic sources of human capital data (databases, ATS/CRMs, social media, the Internet, job boards, etc.) to find more of the best candidates available for your hiring needs more quickly.</p>
<p>Nor does it define a Boolean Black Belt, if such a thing can or should exist.</p>
<p> <img src='http://www.booleanblackbelt.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.booleanblackbelt.com/2011/10/what-is-a-boolean-black-belt-anyway/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Beyond Boolean Search: Proximity and Weighting</title>
		<link>http://www.booleanblackbelt.com/2011/06/beyond-boolean-search-proximity-and-weighting/</link>
		<comments>http://www.booleanblackbelt.com/2011/06/beyond-boolean-search-proximity-and-weighting/#comments</comments>
		<pubDate>Mon, 27 Jun 2011 13:00:17 +0000</pubDate>
		<dc:creator>Glen Cathey</dc:creator>
				<category><![CDATA[Bing]]></category>
		<category><![CDATA[Boolean]]></category>
		<category><![CDATA[Boolean Logic]]></category>
		<category><![CDATA[Extended Boolean]]></category>
		<category><![CDATA[Semantic Search]]></category>
		<category><![CDATA[beyond basic Boolean]]></category>
		<category><![CDATA[Boolean Search]]></category>
		<category><![CDATA[natural language search]]></category>
		<category><![CDATA[NEAR Operator]]></category>
		<category><![CDATA[Proximity Search]]></category>
		<category><![CDATA[term weighting]]></category>
		<category><![CDATA[Text Operators]]></category>

		<guid isPermaLink="false">http://www.booleanblackbelt.com/?p=9017</guid>
		<description><![CDATA[Beyond Basic Boolean Most sourcing, recruiting, and staffing professionals are familiar with the basic Boolean operators of AND, OR, and NOT. However, I have found that few are familiar with what some refer to as “extended” Boolean functionality, such as proximity search and term weighting. Proximity and term weighting, where supported, are not actually logical [...]]]></description>
			<content:encoded><![CDATA[
<p><a href="http://www.flickr.com/photos/kipbot/2626903702/"><img class="alignright" title="Boolean word scramble" src="http://www.booleanblackbelt.com/wp-content/uploads/2008/11/boolean-word-scramble-by-kipbot-300x89.png" alt="" width="300" height="89" /></a></p>
<h2>Beyond Basic Boolean</h2>
<p>Most sourcing, recruiting, and staffing professionals are familiar with the basic Boolean operators of AND, OR, and NOT. However, I have found that few are familiar with what some refer to as “extended” Boolean functionality, such as <a title="More on proximity search" href="http://en.wikipedia.org/wiki/Proximity_search_%28text%29">proximity search</a> and term weighting.</p>
<p>Proximity and term weighting, where supported, are not actually logical (Boolean) operators &#8211; they are more accurately referred to as text or content operators.</p>
<p>Whatever you call them &#8211; extended Boolean or text operators &#8211; they offer sourcers and recruiters significantly more control, power and precision when executing searches, and in the hands of an expert, they can enable semantic search.<span id="more-9017"></span></p>
<h2>Relevance is Everything!</h2>
<p>When it comes to search &#8211; relevance rules.</p>
<p>Ultimately, any sourcing or recruiting professional knows that what’s most critical in running Boolean searches on LinkedIn, the Internet, a job board, or in an internal resume database is getting relevant results.</p>
<p>However, few people talk about exactly what determines relevance &#8211; and I think I know why.</p>
<p>According to Wikipedia, “<a title="Definition of relevance on Wikipedia" href="http://en.wikipedia.org/wiki/Relevance_(information_retrieval)" target="_blank">relevance</a>” denotes how well a retrieved set of documents (or a single document) meets the information need of the user.</p>
<p>The problem is that no search engine, social networking site, or database can &#8220;know&#8221; what is relevant to you &#8211; only <em><strong>you</strong></em> can determine how relevant results are because only you know what you were looking for in the first place!</p>
<p>For sourcing and recruiting, relevant results are typically defined as resumes or profiles of (or information about) potential candidates whose experience and capabilities closely match the hiring profile or job opening that the sourcer or recruiter is trying to find candidates for.</p>
<p>I’d argue that the value of any source of information (LinkedIn, a resume database, the Internet, etc.) lies less in the information contained within, and more in the ability of a user to extract precisely and completely what they need – finding and retrieving any and all appropriately qualified candidates.</p>
<p>Information has no value to you if you are unable to find it and take action on it.</p>
<p>So how can extended Boolean help sourcers and recruiters find more relevant results?</p>
<p>Let’s take a look at proximity first.</p>
<h2>Proximity Search</h2>
<p>Proximity search functionality enables a user to search for specific terms that are mentioned within a certain distance of other specific terms.</p>
<p>Being able to control how close search terms are to each other can be especially helpful when leveraging the structure of certain websites and pages &#8211; I&#8217;ll demonstrate this later in the post using LinkedIn and Twitter as examples.</p>
<p>In my opinion, the more powerful application of proximity search lies in the ability to perform natural language or semantic search.</p>
<p>Semantic search uses the science of meaning in language to produce highly relevant search results rather than have a user sort through a list of loosely related keyword results. Words that are close together are often in the same sentence, and when you can search for meaning at the sentence level, you can target people based on what they actually do/what their responsibilities have been.</p>
<p>Being able to target sentences in which people detail their specific responsibilities and level of responsibility is absurdly more powerful than basic keyword search (Level 1 Talent Mining), which is prone to low levels of relevance and false positives.</p>
<p>There are 3 main types of proximity searching: fixed proximity, variable (configurable) proximity, and adjacency. For the purposes of this post – I will focus only on fixed and configurable proximity.</p>
<h2>Fixed Proximity Search</h2>
<p>Fixed proximity is most commonly represented by the NEAR operator. The search engines that do recognize and support the NEAR operator typically define NEAR proximity as within 1 to 10 words (specific search engines can differ – check their documentation). Monster&#8217;s resume database supports the NEAR operator (which doesn&#8217;t have to be capitalized, btw) at a fixed distance of up to 10 words.</p>
<p>How could you leverage fixed proximity to find more relevant search results?</p>
<p>If you were looking for a Windows and Exchange administrator, any basic keyword and title search can pull tons of resumes that mention all of the search terms, including a high percentage of false positives. False positives in this example are resumes that mention all of the search terms and titles, even though the person has never been primarily responsible for administering Windows and Exchange servers. A helpdesk professional with 1 year of experience can show up in these results because all they have to do is mention the keywords somewhere in their resume.</p>
<p>Leveraging fixed proximity, you could craft this (purposefully basic) search using the NEAR operator: Windows and Exchange NEAR admin* and server*.</p>
<p>That search will ONLY return results of resumes/profiles that mention Exchange within 1 to 10 words of any word starting with the root of admin (administrator, administration, administer, administered, etc.).</p>
<p>Requiring that Exchange be mentioned in close proximity to admin* will dramatically improve the relevance of the search results, typically returning candidates who have a title using both terms and/or who talk about being responsible for Exchange administration.</p>
<div>Here are some examples of sentences from results that demonstrate the variety of relevant results that can be retrieved with the above search:</div>
<ul>
<li>Managed &amp; <strong>administered</strong> more than 300 <strong>Exchange Servers</strong></li>
<li>Provisioned &amp;<strong> administer</strong> multiple <strong>Exchange</strong> 5.5/2003 <strong>servers</strong></li>
<li>Not only are there <strong>administration</strong> duties for <strong>Exchange</strong> and Blackberry&#8230;</li>
<li><strong>Exchange</strong>/RightFax <strong>administrator</strong></li>
<li>Installing, Configuring, and <strong>Administering</strong> Microsoft <strong>Exchange</strong> 2000 <strong>Server</strong></li>
<li><strong>Administer</strong> a Microsoft <strong>Exchange</strong> 2003/2007 environment</li>
<li>8+ years of expertise as a System <strong>Administrator</strong> in Windows 2003 family, Windows 2000 family, MS <strong>Exchange</strong> 5.5, MS <strong>Exchange</strong> 2000, and <strong>Exchange</strong> 2003</li>
<li>I am proficient with the following skills; planning, installation and <strong>administration</strong> of <strong>Windows </strong>Active Directory, <strong>Windows Servers</strong>, <strong>Exchange Server</strong></li>
<li><strong>Windows Server</strong> Support, Active Directory,<strong>Exchange Server</strong> 2000, 2003<strong> administration</strong> and Blackberry <strong>Server administration</strong></li>
<li><strong>Administer Exchange </strong>2003 <strong>Server</strong> for corporate email</li>
</ul>
<p>As you can see, being able to control the proximity of specific search terms essentially increases the likelihood of returning results of candidates who have had administrative responsibility for Exchange servers, effectively increasing the relevance of the results, because that&#8217;s what we were actually trying to find and identify!</p>
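<p>To make the mechanics concrete, here is a minimal Python sketch of what a fixed-proximity NEAR match might look like. It is only an illustration, not how Monster or any other search engine actually implements NEAR: the tokenizer, the 10-word window, the helper names, and the sample helpdesk text are all assumptions made for the example.</p>

```python
import re

NEAR_WINDOW = 10  # assumed fixed window, mirroring the "1 to 10 words" described above


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_matches(token, term):
    """Match a token against a term; a trailing * means 'starts with this root'."""
    if term.endswith("*"):
        return token.startswith(term[:-1].lower())
    return token == term.lower()


def near(text, term_a, term_b, window=NEAR_WINDOW):
    """Return True if term_a and term_b occur within `window` words, in any order."""
    tokens = tokenize(text)
    positions_a = [i for i, tok in enumerate(tokens) if term_matches(tok, term_a)]
    positions_b = [i for i, tok in enumerate(tokens) if term_matches(tok, term_b)]
    return any(window >= abs(a - b) for a in positions_a for b in positions_b)


# One of the relevant sentences listed above.
relevant = "Managed and administered more than 300 Exchange Servers"

# Hypothetical false positive: both terms appear, but far apart.
false_positive = (
    "Helpdesk administrator for desktop users. "
    + "Reset passwords daily. " * 5
    + "Users read email through Exchange."
)

print(near(relevant, "Exchange", "admin*"))
print(near(false_positive, "Exchange", "admin*"))
```

<p>In this sketch the first sentence matches because "administered" sits 4 words away from "Exchange," while the made-up helpdesk text does not, even though a plain keyword search would have returned both.</p>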
<h2>Configurable Proximity</h2>
<p>A search engine that supports configurable proximity affords users the ability to precisely control the distance between specific search terms.</p>
<p>This can produce even more relevant results than the NEAR operator, because the NEAR operator’s maximum range of 10 can allow for some non-relevant results to be returned. The farther words are mentioned apart from each other, the less likely it is that they are semantically related. In fact, at a distance close to 10 words, each word could easily be mentioned in separate bullet points or in separate sentences on a resume and be completely unrelated.</p>
<p>However, with configurable proximity, a sourcer or recruiter can choose the maximum distance between search terms.</p>
<p>Instead of being limited to a distance of 10 or fewer words, a search engine that allows for configurable proximity allows you to create searches that force terms to be quite close together &#8211; as close as you like.</p>
<p>For example, you could choose to search for only people who mention Exchange within 5 words of any word starting with the root of admin (administrator, administration, administer, administered, etc.), regardless of order. A maximum distance of 5 words will dramatically increase the relevance of the search results because mentioning those 2 search terms at such a close range makes it more likely that they are mentioned in the same bullet point or sentence and thus more likely to be semantically related.</p>
<p>Essentially, this search will only return results of people who specifically mention something about being responsible for administering Exchange at least once in their resume. By employing this kind of search, a sourcer is actually performing a semantic search, targeting sentence-level meaning, as they are looking specifically for people who talk about having a particular responsibility – not just looking for documents that happen to contain the search terms.</p>
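<p>The difference between a fixed window of 10 and a configurable window of 5 can be sketched in a few lines of Python. Again, this is a rough illustration under stated assumptions: the function names are hypothetical, distance is counted as the difference in word positions, and the second sample text is invented to show two unrelated sentences.</p>

```python
import re


def min_word_distance(text, term_a, root_b):
    """Smallest gap in words between term_a and any word starting with root_b.

    Returns None when either term is missing from the text.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    positions_a = [i for i, tok in enumerate(tokens) if tok == term_a.lower()]
    positions_b = [i for i, tok in enumerate(tokens) if tok.startswith(root_b.lower())]
    gaps = [abs(a - b) for a in positions_a for b in positions_b]
    return min(gaps) if gaps else None


def within(text, term_a, root_b, max_distance):
    """Configurable proximity: True when the terms are at most max_distance words apart."""
    gap = min_word_distance(text, term_a, root_b)
    return gap is not None and max_distance >= gap


# A sentence from the examples earlier in this post.
same_sentence = "Administer a Microsoft Exchange 2003/2007 environment"

# Hypothetical resume text where the two terms sit in unrelated sentences.
separate_bullets = (
    "System administrator for the Windows 2003 family. "
    "Migrated user mailboxes to Exchange."
)

for label, text in [("same sentence", same_sentence), ("separate bullets", separate_bullets)]:
    print(label, min_word_distance(text, "exchange", "admin"),
          within(text, "exchange", "admin", 10),
          within(text, "exchange", "admin", 5))
```

<p>With these sample texts, the unrelated pair is 10 words apart, so it would pass a window of 10 but fail a window of 5, while the genuine "administer Exchange" sentence passes both.</p>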
<h2>Leveraging Website and Page Structure with Proximity Search</h2>
<p>Once you have noticed a consistent pattern to the structure of certain websites and pages, you can use Internet search engines that support proximity search to target the distance between search terms to yield highly relevant search results.</p>
<p><a title="Did you know Google had an undocumented search operator specifically for proximity?" href="http://www.labnol.org/internet/google-around-search-operator/18251/">Although Google supposedly supports proximity search with their undocumented AROUND(x) search operator</a>, I have found its reliability to be suspect. Perhaps that&#8217;s why it&#8217;s not officially documented? <img src='http://www.booleanblackbelt.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>The good news is that Bing&#8217;s configurable proximity search functionality of NEAR:x seems to work quite well and consistently.</p>
<p>To leverage the structure of certain websites such as LinkedIn, here is a quick example of how you can target current titles and companies when using Bing.</p>
<p><a title="Bing LinkedIn X-Ray search results for various types of engineers at Google." href="http://www.bing.com/search?q=site:linkedin.com+powered+current+near:3+%22engineer+at+Google%22+%22san+francisco+bay+area%22&amp;go=&amp;form=QBRE&amp;qs=n&amp;sk=">site:linkedin.com current near:3 “engineer at Google” “san francisco bay area”</a></p>
<p>In this query, all of the results must have the phrase &#8220;engineer at Google&#8221; within 3 words of the word &#8220;Current,&#8221; which is on every LinkedIn profile.</p>
<p>If you click on any of the <a title="You do check out cached results right? If not, you're missing out on multi-colored search result goodness!" href="http://cc.bingj.com/cache.aspx?q=site%3alinkedin.com+powered+current+near%3a3+%22engineer+at+Google%22+%22san+francisco+bay+area%22&amp;d=4522630854874848&amp;mkt=en-US&amp;setlang=en-US&amp;w=2fbb37b2,5324d474">cached results</a>, you can see how Bing happily returned results of people who have the phrase “engineer at Google” in their current title field:</p>
<p><a href="http://www.bing.com/search?q=site:linkedin.com+powered+current+near:3+%22engineer+at+Google%22+%22san+francisco+bay+area%22&amp;go=&amp;form=QBRE&amp;qs=n&amp;sk="><img title="Bing X-Ray search of LinkedIn using configurable proximity to search for Google engineers" src="http://www.booleanblackbelt.com/wp-content/uploads/2010/12/Bing3.png" alt="" width="372" height="170" /></a></p>
<p>With Bing’s NEAR:x functionality, it is remarkably simple to X-Ray Twitter and target people in specific locations who mention specific titles and/or skill terms in their bios.</p>
<p>For example, let’s say you wanted to find Twitter profiles of user experience professionals who live in the New York area. You could run a search like this on Bing to force the search engine to return only results that mention UX within 15 words of &#8220;Bio&#8221; and &#8220;New York&#8221; within 3 words of &#8220;Location:&#8221;.</p>
<p><a title="Very good Bing X-Ray results from Twitter of UX pros in the New York area" href="http://www.bing.com/search?q=site%3Atwitter.com+bio+near%3A15+UX+location+near%3A3+new+york&amp;go=&amp;form=QBRE">site:twitter.com bio near:15 UX location near:3 new york</a></p>
<p>You can see how Bing’s proximity search helps you target terms in Twitter bios and location text:</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2010/12/Bing9.png"><img title="Bing9" src="http://www.booleanblackbelt.com/wp-content/uploads/2010/12/Bing9.png" alt="" width="600" height="362" /></a></p>
<p>Viewing a cached result shows Bing’s flawless execution of NEAR:x:</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2011/06/Bing10.png"><img title="Bing X-Ray search of Twitter using configurable proximity to find people who mention specific terms in their bios as well as live in a specific location" src="http://www.booleanblackbelt.com/wp-content/uploads/2011/06/Bing10.png" alt="" width="191" height="186" /></a></p>
<p>How&#8217;s that for a relevant result?</p>
<p>Basically as good as it gets &#8211; I wanted someone who lives in the NY area who is a User Experience professional, and that&#8217;s exactly what I got! <em><strong>That</strong></em> is relevance!</p>
<p>Of course, <a title="You have to think outside the box to effectively search social networks like Twitter" href="http://www.booleanblackbelt.com/2009/04/searching-social-media-requires-outside-the-box-thinking/" target="_self">when searching Twitter, it is especially important to realize that people can be very creative in how they may describe themselves</a> (titles, skills, etc.), their experience, and their location – they can enter whatever they want.</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2010/12/Bing11.png"><img title="Bing11" src="http://www.booleanblackbelt.com/wp-content/uploads/2010/12/Bing11.png" alt="" width="182" height="123" /></a></p>
<p>As such, you could not find the above Twitter bio by searching only for &#8220;Drupal.&#8221;</p>
<h2>Performing Semantic Search with Configurable Proximity</h2>
<p>You can perform basic semantic search by targeting sentence-level meaning using Bing’s support of configurable proximity.</p>
<p>For example, let&#8217;s say you were searching for resumes on the Internet and wanted to find people who have had a specific responsibility, such as configuring juniper routers.</p>
<p>You could run a basic search like this: <a title="Bing search for resumes using configurable proximity to perform semantic, sentence-level search" href="http://www.bing.com/search?q=%28inurl%3Aresume+OR+intitle%3Aresume%29+configuring+near%3A5+juniper+juniper+near%3A5+routers&amp;go=&amp;form=QBRE">(inurl:resume OR intitle:resume) configuring near:5 juniper juniper near:5 routers</a></p>
<p>And see results like this:</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2010/12/Bing12.png"><img title="Bing12" src="http://www.booleanblackbelt.com/wp-content/uploads/2010/12/Bing12.png" alt="" width="587" height="112" /></a></p>
<p>Of course, there are many different ways to run that search – I only wanted to demonstrate the power of being able to control how close search terms are to each other, especially when targeting responsibilities, typically stated in verb/noun combinations. This allows you to perform semantic search <strong><em>at the sentence level</em></strong>.</p>
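<p>If you run a lot of these searches, you could assemble them programmatically. The following Python sketch rebuilds the Juniper query above from verb/noun pairs and URL-encodes it. The helper names are hypothetical, and the sketch assumes only the near:x syntax and the Bing search URL already shown in this post.</p>

```python
from urllib.parse import urlencode


def near_clause(left, right, distance):
    """Format two terms joined by the near:x operator used in this post."""
    return f"{left} near:{distance} {right}"


def bing_search_url(*clauses):
    """Join query clauses with spaces and URL-encode them into a Bing search URL."""
    query = " ".join(clauses)
    return "http://www.bing.com/search?" + urlencode({"q": query})


query_parts = [
    "(inurl:resume OR intitle:resume)",
    near_clause("configuring", "juniper", 5),
    near_clause("juniper", "routers", 5),
]

print(" ".join(query_parts))
print(bing_search_url(*query_parts))
```

<p>Swapping in a different verb, noun, or distance then becomes a one-line change, which makes it easy to try several variations of the same sentence-level search.</p>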
<p>Now that we&#8217;ve played around a bit with proximity search, let&#8217;s move on to the other half of extended Boolean &#8211; variable term weighting.</p>
<h2>Variable Term Weighting</h2>
<p>Talented sourcers and recruiters know that not all terms are equally important in a query.</p>
<p>In most queries and searches, certain search terms are more important than others. When running standard Boolean queries, all search terms are considered/weighted equally &#8211; and this is the stone that the makers of so-called semantic search applications often throw at Boolean search.</p>
<p>Unfortunately, many search engines and database search interfaces simply assign relevance to results by the number of search term “hits” in each document. In most cases, the simple frequency of search terms does not correlate with relevance. This is where the derisive description “buzzword bingo” comes from, most often used to suggest that there is little skill involved in running Boolean searches that merely count matched keywords.</p>
<p>Using an information technology hiring profile as an example – if a sourcer were looking for candidates who have significant experience administering Windows servers and Exchange email servers, they might create a simple Boolean query such as this: Windows AND Exchange AND server* AND admin*.</p>
<p>That search is highly likely to return Windows systems administrators who mention Windows many times in their resume/profile and Exchange only once or twice, and to rank them as highly relevant because of the number of “hits” for Windows – which is by nature a very common term in resumes.</p>
<p>This would leave the sourcer to sort through a large volume of false positive results (resumes that contain the keywords, but are not of people who have been primarily responsible for administering Windows and Exchange servers) to find the candidates who actually<em><strong> have</strong></em> been primarily responsible for administering Exchange servers as well as Windows servers.</p>
<p>Search engines that offer users the ability to assign different weights to each search term enable sourcers and recruiters to move beyond simple buzzword matching and take control of the relevance of the results. Essentially, with variable term weighting you can assign a number value to words to increase their weight when ranking retrieved documents – which changes not the total number of results, but the ORDER of the results.</p>
<p>Using the same example as above, a sourcer using a search engine that supports variable term weighting could create a Boolean search string to more heavily weight the term &#8220;Exchange.&#8221; That Boolean query would pull the same number of results as the first search that had no term weighting – however, it would sort and rank the results heavily favoring resumes/profiles that mention Exchange more often in relation to the other search terms, increasing the likelihood that the sourcer can quickly identify candidates who have had experience being responsible for administering and supporting Exchange servers.</p>
<p>By employing variable term weighting, you can positively affect the relevance of the search results.</p>
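<p>Here is a small Python sketch of the idea. It is not any particular vendor's ranking formula: the scoring rule (term frequency multiplied by a weight), the weight values, and the two toy resumes are all assumptions chosen to show the effect.</p>

```python
import re
from collections import Counter


def score(text, weights):
    """Sum of (term frequency times weight) over the query terms."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    return sum(counts[term] * weight for term, weight in weights.items())


def rank(resumes, weights):
    """Return resume names ordered from highest to lowest score."""
    return sorted(resumes, key=lambda name: score(resumes[name], weights), reverse=True)


# Two made-up resumes that both contain every search term.
resumes = {
    "windows_admin": "Windows server admin. Windows Windows Windows. Some Exchange exposure.",
    "exchange_admin": "Exchange server admin. Exchange migrations. Exchange on Windows.",
}

# Equal weighting behaves like simple hit counting.
equal = {"windows": 1, "exchange": 1, "server": 1, "admin": 1}
# Hypothetical weighting that favors the term that matters most.
weighted = {"windows": 1, "exchange": 5, "server": 1, "admin": 1}

print(rank(resumes, equal))
print(rank(resumes, weighted))
```

<p>Both rankings contain the same two resumes. Only the order changes: with equal weights the Windows-heavy resume comes first, and with "exchange" weighted more heavily the Exchange-focused resume moves to the top.</p>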
<h2>Final Thoughts</h2>
<p>Hopefully I&#8217;ve shed some light on how being able to control the proximity of two search terms can yield results that are FAR more relevant than results that simply mention the two terms anywhere in a document or form – this is the critical difference between the semantic similarity between a search and its results vs. the lexical similarity between a search and its results.</p>
<p>There are countless ways you can apply extended Boolean functionality such as variable term weighting and proximity searching to nearly any industry/hiring profile to create searches that return highly relevant results &#8211; results that are more relevant than those that can be achieved with standard Boolean logic.</p>
<p>Using a search engine that supports both variable proximity and variable term weighting can empower sourcers and recruiters to quickly find large volumes of highly relevant results, increasing productivity and achieving <a title="Learn more about the concept of Lean, Just In Time Sourcing and Recruiting" href="http://www.booleanblackbelt.com/2011/02/what-is-lean-just-in-time-recruiting/">Just-In-Time sourcing and recruiting</a>.</p>
<p>I wish the makers of search engines would seek less to &#8220;dumb down&#8221; search interfaces and functionality and incorporate more powerful search capability that allows users to take significant control over the relevance of their search results.</p>
<p>There are a few search engines and ATS/CRM systems that support both configurable proximity search and variable term weighting.</p>
<p>Does yours?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.booleanblackbelt.com/2011/06/beyond-boolean-search-proximity-and-weighting/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Sourcers and Recruiters &#8211; Don&#8217;t Fear Watson or Semantic Search</title>
		<link>http://www.booleanblackbelt.com/2011/03/sourcers-and-recruiters-dont-fear-watson-or-semantic-search/</link>
		<comments>http://www.booleanblackbelt.com/2011/03/sourcers-and-recruiters-dont-fear-watson-or-semantic-search/#comments</comments>
		<pubDate>Mon, 28 Mar 2011 13:00:09 +0000</pubDate>
		<dc:creator>Glen Cathey</dc:creator>
				<category><![CDATA[Artificial Intelligence Matching]]></category>
		<category><![CDATA[Semantic Search]]></category>
		<category><![CDATA[Talent Intelligence]]></category>
		<category><![CDATA[Artificial Intelligence Resume Matching]]></category>
		<category><![CDATA[The Future of Recruiting]]></category>
		<category><![CDATA[The Future of Sourcing]]></category>

		<guid isPermaLink="false">http://www.booleanblackbelt.com/?p=8497</guid>
		<description><![CDATA[I&#8217;ve read a few articles recently talking about IBM&#8217;s Watson and how the technology they developed may be the harbinger of unemployment for people in many professions. Here&#8217;s one from Fortune magazine, asking if IBM&#8217;s Watson will put your job in jeopardy. Here&#8217;s another suggesting that those who train others in Internet, social media, ATS, [...]]]></description>
			<content:encoded><![CDATA[
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2011/03/IBM-Watson21-e1301244151715.jpg"><img class="alignright size-full wp-image-8636" title="IBM Watson wants your job :-)" src="http://www.booleanblackbelt.com/wp-content/uploads/2011/03/IBM-Watson21-e1301244151715.jpg" alt="" width="200" height="181" /></a></p>
<p>I&#8217;ve read a few articles recently talking about IBM&#8217;s Watson and how the technology they developed may be the <a title="one that presages or foreshadows what is to come" href="http://www.merriam-webster.com/dictionary/harbinger">harbinger</a> of unemployment for people in many professions.</p>
<p>Here&#8217;s <a title="Will IBM's Watson be putting you out of a job?" href="http://management.fortune.cnn.com/2011/02/15/will-ibm%E2%80%99s-watson-put-your-job-in-jeopardy/">one</a> from Fortune magazine, <a title="Will IBM's Watson put your job in jeopardy?" href="http://management.fortune.cnn.com/2011/02/15/will-ibm%E2%80%99s-watson-put-your-job-in-jeopardy/">asking if IBM&#8217;s Watson will put your job in jeopardy</a>.</p>
<p>Here&#8217;s <a title="I was not aware that there was a &quot;Boolean Cash Cow&quot; I've certainly never seen one, let alone profited from one. :-)" href="http://www.fistfuloftalent.com/2011/03/21411-the-day-ibms-watson-tapped-out-the-boolean-cash-cow.html">another</a> suggesting that those who train others in Internet, social media, ATS, and resume database sourcing techniques and strategies will be eventually eliminated by semantic search solutions.</p>
<h2>Watson Winning at Jeopardy isn&#8217;t Surprising</h2>
<p>First, let&#8217;s recognize that it&#8217;s an apples to oranges comparison between Jeopardy and sourcing/recruiting.<span id="more-8497"></span></p>
<p>The ability to quickly research and answer trivia questions (or provide questions for the answers, in the case of Jeopardy) is a far cry from having to boil a hiring need (skills, capabilities, and specific responsibilities in specific industries and environments) down to a series of queries to mine flawed and incomplete human capital data (i.e., resumes and social media profiles) in order to return people who have a high probability of not only being qualified for the position, but also interested in the job (i.e. &#8220;recruitable&#8221;).</p>
<p>With trivia, all of the facts and information are readily accessible, complete and identifiable on the Internet, or in the case of Watson, saved on a multi-TB hard drive array.</p>
<p>It&#8217;s not really shocking that a highly specialized $900,000,000 to $1,800,000,000 (<a title="Watson wasn't cheap!" href="http://money.cnn.com/galleries/2010/technology/1008/gallery.biggest_tech_gambles/3.html?iid=EL">estimated 3 year cost of developing Watson</a>) NLP (Natural Language Processing) computer can sort through 200 million pages of structured and unstructured content, including the full text of Wikipedia, to retrieve information faster than a human relying on memory alone.</p>
<p>Why is anyone surprised that Watson spanked people?</p>
<p>I wasn&#8217;t.</p>
<p>However, I can&#8217;t pass up the opportunity to point out that Watson did make mistakes &#8211; <a title="You can see evidence of the mistake here on Flickr" href="http://www.flickr.com/photos/ken_duffy/5452548946/">here&#8217;s one example in which Watson thought the answer was &#8220;Who is Picasso?&#8221; when the correct answer was &#8220;What is modern art?&#8221;</a></p>
<p>Who knew that to err is human, as well as inhuman?</p>
<p> <img src='http://www.booleanblackbelt.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<h2>The Unique Challenge of Human Capital Data</h2>
<p>Unlike finding the answers to trivia questions, when it comes to finding and identifying qualified and talented people based on their resumes and social media profiles and updates, <strong><em>the information is often incomplete, and in many cases, critical bits of identifying data are simply not present.</em></strong></p>
<p>For example, how do you find someone with <a title="This is a real-world example of a challenge that one of my recruiters is tackling now" href="http://en.wikipedia.org/wiki/Spring_Framework">Spring MVC</a> experience when many people don&#8217;t mention it on their resume, nor on LinkedIn, Twitter, blogs, etc.?</p>
<p>I recently gave the world <a title="When it comes to searching LinkedIn, you don't know what you're missing - literally!" href="http://www.booleanblackbelt.com/2011/03/linkedins-dark-matter-undiscovered-profiles/">a tiny glimpse into the Dark Matter of LinkedIn</a> &#8211; direct keyword, title, and even concept/relational search methods, used by humans or algorithms, can only retrieve results based on existing text.</p>
<p>Quite simply &#8211; if the text isn&#8217;t there to be retrieved or analyzed, a semantic search/NLP algorithm can&#8217;t do anything with it.</p>
<p>Good sourcers really do &#8220;read between the lines&#8221; of both the job description and requirements as well as the human capital data they are searching for and analyzing.</p>
<p>There is <strong><em>much more</em></strong> to high-level sourcing than keyword and title search/match.</p>
<p>There have been semantic solutions on the market for quite some time that can do keyword, title and concept matching reasonably well (as well as some that claim to, but don&#8217;t). The issue with those solutions that no one seems to (or wants to) realize is that they have limitations &#8211; they find some matches, exclude some, and bury others.</p>
<p><strong><em>The real question is how and why some matches are found and ranked highly, while others are excluded, and still others are ranked low even though they actually represent the best talent?</em></strong></p>
<h2>What Do I Know?</h2>
<p>I have hands-on, practical experience (read: trying to find people to fill real jobs) with many of the &#8220;top shelf&#8221; semantic search applications out there, specifically designed for human capital data, so when I write or speak on the matter of semantic search, I&#8217;m not throwing around empty opinions.</p>
<p>I&#8217;ve seen what these solutions can do, and I&#8217;ve also directly experienced their limitations, including what they simply can&#8217;t do.</p>
<p>Unlike many people who write on the subject of semantic search, I have to personally find people and help others find and recruit talented, qualified candidates with highly specialized skills and experience within 24-48 hours of receiving a client request <strong><em>on a daily basis</em></strong>. If semantic search solutions (including the one I have access to) could speed up that process and help me find more and better candidates faster &#8211; trust me, I would use them!</p>
<p>I&#8217;ve witnessed a sourcer with 9 months of total experience find better qualified matches (and faster) than a big-name semantic search solution in front of one of the senior technical managers responsible for developing the product. It was eye-opening and even somewhat confusing for them, to say the least.</p>
<p>I&#8217;ve also spoken with sourcing/recruiting managers at Fortune 500 companies who evaluated leading semantic search solutions and passed on purchasing them because the solutions did not find more and better results faster than their own sourcing/recruiting teams.</p>
<p>Ultimately, it&#8217;s not about humans vs. technology &#8211; it&#8217;s about results.</p>
<h2>The Solution is Part of the Problem</h2>
<p>I&#8217;ve found that the creators of semantic search products don&#8217;t seem to like it, or to really listen, when you point out the flaws and limitations of their creations &#8211; and I&#8217;ve had exchanges with people who hold patents in this space.</p>
<p>I&#8217;ve also gotten the sense from talking with semantic search solutions providers that some of these folks believe that sourcers, recruiters and HR professionals don&#8217;t (and/or can&#8217;t!) really understand semantic search and more complex information retrieval strategies.</p>
<p>In their defense, if their perception (based on experience or otherwise) is that recruiters and HR professionals struggle with Boolean search &#8211; <strong><em>the most basic query &#8220;language&#8221;</em></strong> &#8211; why wouldn&#8217;t they assume that the average recruiter could not possibly understand and appreciate what&#8217;s going on &#8220;under the hood&#8221; of semantic search solutions?</p>
<p>However, it is folly to apply that stereotype to all sourcing, recruiting and HR professionals &#8211; there are plenty of us who actually know more about the specific challenges posed by human capital data and the practical needs and concerns of recruiting organizations than the people who are developing the solutions that we are supposed to intrinsically trust to automatically find the best people available.</p>
<p>I don&#8217;t hate &#8211; I appreciate semantic search. I simply want these solutions to live up to their hype. Semantic search vendors &#8211; listen to your current and potential customers &#8211; they just want your product to work better!</p>
<p>I&#8217;d like to extend an open invitation to any semantic search/NLP vendor &#8211; I will happily evaluate your product and make suggestions for improvements&#8230;for free! If you&#8217;re very confident in your solution, I&#8217;ll also write a review online. If you&#8217;d rather not have your product exposed publicly, I can also evaluate products privately. I really do want to accelerate the efficacy of semantic search applications for sourcing and recruiting!</p>
<p>I also want to educate others who may be buying these kinds of solutions so they are better informed about the pros and cons, capabilities and limitations of these solutions, and not sold simply on impressive sales pitches, techno-speak and &#8220;see how many results?&#8221; demonstrations. If you&#8217;re a potential customer of semantic search solutions, please be sure to include your best sourcers/recruiters in the evaluation process. If the only people evaluating a semantic search solution are HR, management, and procurement professionals who don&#8217;t actually search for top talent on a daily basis and won&#8217;t be using the proposed solution, you can easily be sold on a product that doesn&#8217;t work as well as the sales presentation suggests.</p>
<p>When it comes to buying a new flat-screen TV or car, <strong><em>anyone</em></strong> can read reviews online, try the product out, and compare it to competing products. I find it interesting (and telling!) that you can&#8217;t do the same thing when it comes to recruiting and HR software.</p>
<p>When you buy a house &#8211; you get it inspected by a specialized professional before you buy it so you really know what you&#8217;re getting beneath the surface. Before you buy a semantic search solution, you should have it evaluated by a person who specializes in human capital information retrieval (who is also ideally a neutral third party!).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.booleanblackbelt.com/2011/03/sourcers-and-recruiters-dont-fear-watson-or-semantic-search/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
	</channel>
</rss>

