<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Boolean Black Belt-Sourcing/Recruiting &#187; Boolean Logic</title>
	<atom:link href="http://www.booleanblackbelt.com/category/boolean-logic/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.booleanblackbelt.com</link>
	<description>Leveraging LinkedIn, Twitter, Social Media, Resume Databases, and the Internet for Sourcing and Recruiting</description>
	<lastBuildDate>Mon, 30 Jan 2012 14:00:14 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
		<item>
		<title>Talent Sourcing: Man vs. AI/Black Box Semantic Search</title>
		<link>http://www.booleanblackbelt.com/2012/01/talent-sourcing-man-vs-aiblack-box-semantic-search/</link>
		<comments>http://www.booleanblackbelt.com/2012/01/talent-sourcing-man-vs-aiblack-box-semantic-search/#comments</comments>
		<pubDate>Mon, 09 Jan 2012 14:00:58 +0000</pubDate>
		<dc:creator>Glen Cathey</dc:creator>
				<category><![CDATA[Artificial Intelligence Matching]]></category>
		<category><![CDATA[Boolean Logic]]></category>
		<category><![CDATA[Dark Matter]]></category>
		<category><![CDATA[Extended Boolean]]></category>
		<category><![CDATA[Future of Sourcing and Recruiting]]></category>
		<category><![CDATA[HCDIR]]></category>
		<category><![CDATA[Human Capital Data]]></category>
		<category><![CDATA[Information Retrieval]]></category>
		<category><![CDATA[Recruiting Technology]]></category>
		<category><![CDATA[Resume Sourcing]]></category>
		<category><![CDATA[Search Automation]]></category>
		<category><![CDATA[Semantic Search]]></category>
		<category><![CDATA[Sourcing]]></category>
		<category><![CDATA[Sourcing Automation]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Boolean]]></category>
		<category><![CDATA[Boolean Black Belt]]></category>
		<category><![CDATA[dtSearch]]></category>
		<category><![CDATA[Glen Cathey]]></category>
		<category><![CDATA[hcdir]]></category>
		<category><![CDATA[Human Capital]]></category>
		<category><![CDATA[information retrieval]]></category>
		<category><![CDATA[Lucene]]></category>
		<category><![CDATA[matching solutions]]></category>
		<category><![CDATA[NLP]]></category>
		<category><![CDATA[Recruiting]]></category>
		<category><![CDATA[Resume Matching]]></category>
		<category><![CDATA[resume parsing]]></category>
		<category><![CDATA[Semantic Clustering]]></category>
		<category><![CDATA[Sourcing solutions]]></category>
		<category><![CDATA[Talent Identification]]></category>

		<guid isPermaLink="false">http://www.booleanblackbelt.com/?p=10315</guid>
		<description><![CDATA[Back in March 2010, I had the distinct honor of delivering the keynote presentation at SourceCon on the topic of resume search and match solutions claiming to use artificial intelligence in comparison with people using their natural intelligence for talent discovery and identification. Now that nearly 2 years have passed, and given that in that [...]]]></description>
			<content:encoded><![CDATA[
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2012/01/AI_Brain.png"><img class="alignright  wp-image-10319" title="Talent Sourcing and Matching: Artificial Intelligence and Black Box Semantic Search vs. Human Cognition and Sourcing Capability." src="http://www.booleanblackbelt.com/wp-content/uploads/2012/01/AI_Brain.png" alt="" width="219" height="239" /></a>Back in March 2010, I had the distinct honor of delivering the keynote presentation at <a title="Sourcing News and Knowledge - Beyond the Obvious." href="http://www.sourcecon.com/">SourceCon</a> on the topic of resume search and match solutions claiming to use artificial intelligence in comparison with people using their natural intelligence for talent discovery and identification.</p>
<p>Now that nearly 2 years have passed, and given that in that time I&#8217;ve had even more hands-on experience with a number of the top AI/semantic search applications available (I won&#8217;t be naming names, sorry), I decided it was time to revisit this topic, which I am <em><strong>very</strong></em> passionate about.</p>
<p>If you&#8217;ve ever been curious about semantic search applications that &#8220;do the work for you&#8221; when it comes to finding potential candidates, you&#8217;re in the right place, because I&#8217;ve updated the slide deck and published it to Slideshare. Here&#8217;s what you&#8217;ll find in the 86-slide presentation:</p>
<ul>
<li>A deep dive into the deceptively simple challenge of sourcing talent via human capital data (resumes, social network profiles, etc.)</li>
<li>How resume and LinkedIn profile sourcing and matching solutions claiming to use artificial intelligence, semantic search, and <a title="Natural language processing (NLP) is a field of computer science and linguistics concerned with the interactions between computers and human (natural) languages; it began as a branch of artificial intelligence.[1] In theory, natural language processing is a very attractive method of human–computer interaction. Natural language understanding is sometimes referred to as an AI-complete problem because it seems to require extensive knowledge about the outside world and the ability to manipulate it." href="http://en.wikipedia.org/wiki/Natural_language_processing">NLP</a> actually work and achieve their claims</li>
<li>The pros, cons, and limitations of automated/<a title="A black box is a device, system or object which can be viewed solely in terms of its input, output and transfer characteristics without any knowledge of its internal workings. For resume search and match, a black box solution gives you no understanding of exactly WHY it's returned certain results or considers them relevant" href="http://en.wikipedia.org/wiki/Black_box">black box</a> matching solutions</li>
<li>An insightful (and funny!) video of <a title="Dr. Michio Kaku is a theoretical physicist, best-selling author, and popularizer of science. He’s the co-founder of string field theory (a branch of string theory), and continues Einstein’s search to unite the four fundamental forces of nature into one unified theory." href="http://mkaku.org/home/?page_id=5">Dr. Michio Kaku</a> and his thoughts on the limitations of artificial intelligence</li>
<li>Examples of what sourcers and recruiters can do that even the most advanced automated search and match algorithms can’t do</li>
<li>The concept of Human Capital Data <a title="To any sourcer or recruiter not still in the Stone Age, this should sound like a really good description of what you do when you use any sort of technology to find people or information about people: Information retrieval (IR) is the area of study concerned with searching for documents, for information within documents, and for metadata about documents, as well as that of searching structured storage, relational databases, and the World Wide Web. " href="http://en.wikipedia.org/wiki/Information_retrieval">Information Retrieval</a> and Analysis (HCDIR &amp; A)</li>
<li>Boolean and <a title="Extended Boolean typically incorporates the ability to weight each term in a Boolean search string, allowing the searcher to choose which terms are the most relevant, as well as configurable proximity - the ability to specify how close search terms are to each other, which enables powerful semantic search at the sentence level. " href="https://www.google.com/search?aq=f&amp;sourceid=chrome&amp;ie=UTF-8&amp;q=extended+Boolean">extended Boolean</a></li>
<li>Semantic search</li>
<li>Dynamic inference</li>
<li><a title="Dark Matter is a term I use to describe resumes, LinkedIn profiles, and other human capital data that exists to be found, but cannot be retrieved through direct or conventional search methods." href="http://www.booleanblackbelt.com/2011/03/linkedins-dark-matter-undiscovered-profiles/">Dark Matter</a> resumes and social network profiles</li>
<li>What I believe to be the ideal resume search and matching solution</li>
</ul>
<div>Enjoy, and let me know your thoughts.</div>
<div id="__ss_10891808" style="width: 595px;">
<p><strong style="display: block; margin: 12px 0 4px;"><a title="Talent Sourcing and Matching - Artificial Intelligence and Black Box Semantic Search vs. Human Cognition and Sourcing" href="http://www.slideshare.net/glencathey/talent-sourcing-and-matching-artificial-intelligence-and-black-box-semantic-search-vs-human-cognition-and-sourcing" target="_blank">Talent Sourcing and Matching &#8211; Artificial Intelligence and Black Box Semantic Search vs. Human Cognition and Sourcing</a></strong> <iframe src="http://www.slideshare.net/slideshow/embed_code/10891808" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" width="595" height="497"></iframe></p>
<div style="padding: 5px 0 12px;">View more <a href="http://www.slideshare.net/" target="_blank">presentations</a> from <a href="http://www.slideshare.net/glencathey" target="_blank">Glen Cathey</a></div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.booleanblackbelt.com/2012/01/talent-sourcing-man-vs-aiblack-box-semantic-search/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Beyond Boolean Search: Proximity and Weighting</title>
		<link>http://www.booleanblackbelt.com/2011/06/beyond-boolean-search-proximity-and-weighting/</link>
		<comments>http://www.booleanblackbelt.com/2011/06/beyond-boolean-search-proximity-and-weighting/#comments</comments>
		<pubDate>Mon, 27 Jun 2011 13:00:17 +0000</pubDate>
		<dc:creator>Glen Cathey</dc:creator>
				<category><![CDATA[Bing]]></category>
		<category><![CDATA[Boolean]]></category>
		<category><![CDATA[Boolean Logic]]></category>
		<category><![CDATA[Extended Boolean]]></category>
		<category><![CDATA[Semantic Search]]></category>
		<category><![CDATA[beyond basic Boolean]]></category>
		<category><![CDATA[Boolean Search]]></category>
		<category><![CDATA[natural language search]]></category>
		<category><![CDATA[NEAR Operator]]></category>
		<category><![CDATA[Proximity Search]]></category>
		<category><![CDATA[term weighting]]></category>
		<category><![CDATA[Text Operators]]></category>

		<guid isPermaLink="false">http://www.booleanblackbelt.com/?p=9017</guid>
		<description><![CDATA[Beyond Basic Boolean Most sourcing, recruiting, and staffing professionals are familiar with the basic Boolean operators of AND, OR, and NOT. However, I have found that few are familiar with what some refer to as “extended” Boolean functionality, such as proximity search and term weighting. Proximity and term weighting, where supported, are not actually logical [...]]]></description>
			<content:encoded><![CDATA[
<p><a href="http://www.flickr.com/photos/kipbot/2626903702/"><img class="alignright" title="Boolean word scramble" src="http://www.booleanblackbelt.com/wp-content/uploads/2008/11/boolean-word-scramble-by-kipbot-300x89.png" alt="" width="300" height="89" /></a></p>
<h2>Beyond Basic Boolean</h2>
<p>Most sourcing, recruiting, and staffing professionals are familiar with the basic Boolean operators of AND, OR, and NOT. However, I have found that few are familiar with what some refer to as “extended” Boolean functionality, such as <a title="More on proximity search" href="http://en.wikipedia.org/wiki/Proximity_search_%28text%29">proximity search</a> and term weighting.</p>
<p>Proximity and term weighting, where supported, are not actually logical (Boolean) operators &#8211; they are more accurately referred to as text or content operators.</p>
<p>Whatever you call them &#8211; extended Boolean or text operators &#8211; they offer sourcers and recruiters significantly more control, power and precision when executing searches, and in the hands of an expert, they can enable semantic search.<span id="more-9017"></span></p>
<h2>Relevance is Everything!</h2>
<p>When it comes to search &#8211; relevance rules.</p>
<p>Ultimately, any sourcing or recruiting professional knows that what’s most critical in running Boolean searches on LinkedIn, the Internet, a job board, or in an internal resume database is getting relevant results.</p>
<p>However, few people talk about exactly what determines relevance &#8211; and I think I know why.</p>
<p>According to Wikipedia, “<a title="Definition of relevance on Wikipedia" href="http://en.wikipedia.org/wiki/Relevance_(information_retrieval)" target="_blank">relevance</a>” denotes how well a retrieved set of documents (or a single document) meets the information need of the user.</p>
<p>The problem is that no search engine, social networking site, or database can &#8220;know&#8221; what is relevant to you &#8211; only <em><strong>you</strong></em> can determine how relevant results are because only you know what you were looking for in the first place!</p>
<p>For sourcing and recruiting, relevant results are typically defined as resumes or profiles of (or information about) potential candidates whose experience and capabilities closely match the hiring profile or job opening that the sourcer or recruiter is trying to find candidates for.</p>
<p>I’d argue that the value of any source of information (LinkedIn, a resume database, the Internet, etc.) lies less in the information contained within, and more in the ability of a user to extract precisely and completely what they need – finding and retrieving any and all appropriately qualified candidates.</p>
<p>Information has no value to you if you are unable to find it and take action on it.</p>
<p>So how can extended Boolean help sourcers and recruiters find more relevant results?</p>
<p>Let’s take a look at proximity first.</p>
<h2>Proximity Search</h2>
<p>Proximity search functionality enables a user to search for specific terms that are mentioned within a certain distance of other specific terms.</p>
<p>Being able to control how close search terms are to each other can be especially helpful when leveraging the structure of certain websites and pages &#8211; I&#8217;ll demonstrate this later in the post using LinkedIn and Twitter as examples.</p>
<p>In my opinion, the more powerful application of proximity search lies in the ability to perform natural language or semantic search.</p>
<p>Semantic search uses the science of meaning in language to produce highly relevant search results rather than have a user sort through a list of loosely related keyword results. Words that are close together are often in the same sentence, and when you can search for meaning at the sentence level, you can target people based on what they actually do/what their responsibilities have been.</p>
<p>Being able to target sentences in which people detail their specific responsibilities and level of responsibility is absurdly more powerful than basic keyword search (Level 1 Talent Mining), which is prone to low levels of relevance and false positives.</p>
<p>There are 3 main types of proximity searching: fixed proximity, variable (configurable) proximity, and adjacency. For the purposes of this post, I will focus only on fixed and variable proximity.</p>
<h2>Fixed Proximity Search</h2>
<p>Fixed proximity is most commonly represented by the NEAR operator. The search engines that do recognize and support the NEAR operator typically define NEAR proximity as within 1 to 10 words (specific search engines can differ – check their documentation). Monster&#8217;s resume database supports the NEAR operator (which doesn&#8217;t have to be capitalized, btw) at a fixed distance of up to 10 words.</p>
<p>How could you leverage fixed proximity to find more relevant search results?</p>
<p>If you were looking for a Windows and Exchange administrator, any basic keyword and title search can pull tons of resumes that mention all of the search terms, a high percentage of which will be false positives. False positives in this example are resumes that mention all of the search terms and titles, but belong to people who have never been primarily responsible for administering Windows and Exchange servers. A helpdesk professional with 1 year of experience can show up in these results because all they have to do is mention the keywords somewhere in their resume.</p>
<p>Leveraging fixed proximity, you could craft this (purposefully basic) search using the NEAR operator: Windows and Exchange NEAR admin* and server*.</p>
<p>That search will ONLY return results of resumes/profiles that mention Exchange within 1 to 10 words of any word starting with the root of admin (administrator, administration, administer, administered, etc.).</p>
<p>Requiring that Exchange be mentioned in close proximity to admin* dramatically improves the relevance of the search results, typically returning candidates who have a title using both terms and/or who talk about being responsible for Exchange administration.</p>
<div>Here are some examples of sentences from results that demonstrate the variety of relevant results that can be retrieved with the above search:</div>
<ul>
<li>Managed &amp; <strong>administered</strong> more than 300 <strong>Exchange Servers</strong></li>
<li>Provisioned &amp;<strong> administer</strong> multiple <strong>Exchange</strong> 5.5/2003 <strong>servers</strong></li>
<li>Not only are there <strong>administration</strong> duties for <strong>Exchange</strong> and Blackberry&#8230;</li>
<li><strong>Exchange</strong>/RightFax <strong>administrator</strong></li>
<li>Installing, Configuring, and <strong>Administering</strong> Microsoft <strong>Exchange</strong> 2000 <strong>Server</strong></li>
<li><strong>Administer</strong> a Microsoft <strong>Exchange</strong> 2003/2007 environment</li>
<li>8+ years of expertise as a System <strong>Administrator</strong> in Windows 2003 family, Windows 2000 family, MS <strong>Exchange</strong> 5.5, MS <strong>Exchange</strong> 2000, and <strong>Exchange</strong> 2003</li>
<li>I am proficient with the following skills; planning, installation and <strong>administration</strong> of <strong>Windows </strong>Active Directory, <strong>Windows Servers</strong>, <strong>Exchange Server</strong></li>
<li><strong>Windows Server</strong> Support, Active Directory,<strong>Exchange Server</strong> 2000, 2003<strong> administration</strong> and Blackberry <strong>Server administration</strong></li>
<li><strong>Administer Exchange </strong>2003 <strong>Server</strong> for corporate email</li>
</ul>
<p>As you can see, being able to control the proximity of specific search terms essentially increases the likelihood of returning results of candidates who have had administrative responsibility for Exchange servers, effectively increasing the relevance of the results, because that&#8217;s what we were actually trying to find and identify!</p>
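<p>To make the mechanics concrete, here is a minimal Python sketch of what a fixed-proximity match does. It is not how Monster or any particular search engine implements NEAR; the function names (<code>tokenize</code>, <code>near</code>), the word-splitting rule, and the second sample resume are assumptions made purely for illustration.</p>

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def near(text, term, prefix, max_distance=10):
    """Return True if `term` appears within `max_distance` words of any
    word starting with `prefix` (the admin* wildcard), in either order."""
    tokens = tokenize(text)
    term_positions = [i for i, t in enumerate(tokens) if t == term.lower()]
    prefix_positions = [i for i, t in enumerate(tokens) if t.startswith(prefix.lower())]
    return any(abs(i - j) <= max_distance
               for i in term_positions
               for j in prefix_positions)

# A sentence taken from the list of relevant results above.
relevant = "Managed & administered more than 300 Exchange Servers"

# A made-up false positive: both keywords are present, but far apart.
false_positive = (
    "Escalated tickets to the network administrator. "
    "Reset passwords, imaged laptops, replaced printers, and trained "
    "new hires on desktop software. Users read email through Exchange."
)

print(near(relevant, "Exchange", "admin"))        # True
print(near(false_positive, "Exchange", "admin"))  # False
```

<p>A plain keyword search would return both resumes, because both contain the keywords. The proximity check keeps only the one where the terms sit close enough together to plausibly describe a single responsibility.</p>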
<h2>Configurable Proximity</h2>
<p>A search engine that supports configurable proximity affords users the ability to precisely control the distance between specific search terms.</p>
<p>This can produce even more relevant results than the NEAR operator, because the NEAR operator’s maximum range of 10 can allow for some non-relevant results to be returned. The farther words are mentioned apart from each other, the less likely it is that they are semantically related. In fact, at a distance over 10 words, each word could easily be mentioned in separate bullet points or in separate sentences on a resume and be completely unrelated.</p>
<p>However, with configurable proximity, a sourcer or recruiter can choose the maximum distance between search terms.</p>
<p>Instead of being limited to a distance of 10 or fewer words, a search engine that allows for configurable proximity allows you to create searches that force terms to be quite close together &#8211; as close as you like.</p>
<p>For example, you could choose to search for only people who mention Exchange within 5 words of any word starting with the root of admin (administrator, administration, administer, administered, etc.), regardless of order. A maximum distance of 5 words will dramatically increase the relevance of the search results because mentioning those 2 search terms at such a close range makes it more likely that they are mentioned in the same bullet point or sentence and thus more likely to be semantically related.</p>
<p>Essentially, this search will only return results of people who specifically mention something about being responsible for administering Exchange at least once in their resume. By employing this kind of search, a sourcer is actually performing a semantic search, targeting sentence-level meaning, as they are looking specifically for people who talk about having a particular responsibility – not just looking for documents that happen to contain the search terms.</p>
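<p>The effect of tightening the distance can be sketched in a few lines of Python. This is an illustration only, under the assumption that distance is counted in word positions; the helper names (<code>min_word_distance</code>, <code>within</code>) and the second sample text are hypothetical and not taken from any search engine.</p>

```python
import re

def min_word_distance(text, term, prefix):
    """Smallest gap, in word positions, between `term` and any word
    starting with `prefix`. Returns None if either one is missing."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    term_pos = [i for i, t in enumerate(tokens) if t == term.lower()]
    prefix_pos = [i for i, t in enumerate(tokens) if t.startswith(prefix.lower())]
    if not term_pos or not prefix_pos:
        return None
    return min(abs(i - j) for i in term_pos for j in prefix_pos)

def within(text, term, prefix, max_distance):
    """True if the two terms are at most `max_distance` words apart."""
    distance = min_word_distance(text, term, prefix)
    return distance is not None and distance <= max_distance

# One responsibility stated in a single sentence.
same_sentence = "Administer a Microsoft Exchange 2003/2007 environment"

# Made-up example: the two terms land in separate, unrelated sentences.
separate_bullets = ("Windows administrator for 40 desktops. "
                    "Occasionally reset mailboxes on Exchange.")

print(min_word_distance(same_sentence, "Exchange", "admin"))     # 3
print(min_word_distance(separate_bullets, "Exchange", "admin"))  # 8
print(within(separate_bullets, "Exchange", "admin", 10))         # True
print(within(separate_bullets, "Exchange", "admin", 5))          # False
```

<p>At a distance of 10 the second text still matches, even though its two terms belong to different sentences. At a distance of 5 only the first text survives.</p>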
<h2>Leveraging Website and Page Structure with Proximity Search</h2>
<p>Once you have noticed a consistent pattern to the structure of certain websites and pages, you can use Internet search engines that support proximity search to target the distance between search terms to yield highly relevant search results.</p>
<p><a title="Did you know Google had an undocumented search operator specifically for proximity?" href="http://www.labnol.org/internet/google-around-search-operator/18251/">Although Google supposedly supports proximity search with their undocumented AROUND(x) search operator</a>, I have found its reliability to be suspect. Perhaps that&#8217;s why it&#8217;s not officially documented? <img src='http://www.booleanblackbelt.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>The good news is that Bing&#8217;s configurable proximity search functionality of NEAR:x seems to work quite well and consistently.</p>
<p>To leverage the structure of certain websites such as LinkedIn, here is a quick example of how you can target current titles and companies when using Bing.</p>
<p><a title="Bing LinkedIn X-Ray search results for various types of engineers at Google." href="http://www.bing.com/search?q=site:linkedin.com+powered+current+near:3+%22engineer+at+Google%22+%22san+francisco+bay+area%22&amp;go=&amp;form=QBRE&amp;qs=n&amp;sk=">site:linkedin.com current near:3 "engineer at Google" "san francisco bay area"</a></p>
<p>In this query, all of the results must have the phrase &#8220;engineer at Google&#8221; within 3 words of the word &#8220;Current,&#8221; which is on every LinkedIn profile.</p>
<p>If you click on any of the <a title="You do check out cached results right? If not, you're missing out on multi-colored search result goodness!" href="http://cc.bingj.com/cache.aspx?q=site%3alinkedin.com+powered+current+near%3a3+%22engineer+at+Google%22+%22san+francisco+bay+area%22&amp;d=4522630854874848&amp;mkt=en-US&amp;setlang=en-US&amp;w=2fbb37b2,5324d474">cached results</a>, you can see how Bing happily returned results of people who have the phrase “engineer at Google” in their current title field:</p>
<p><a href="http://www.bing.com/search?q=site:linkedin.com+powered+current+near:3+%22engineer+at+Google%22+%22san+francisco+bay+area%22&amp;go=&amp;form=QBRE&amp;qs=n&amp;sk="><img title="Bing X-Ray search of LinkedIn using configurable proximity to search for Google engineers" src="http://www.booleanblackbelt.com/wp-content/uploads/2010/12/Bing3.png" alt="" width="372" height="170" /></a></p>
<p>With Bing’s NEAR:x functionality, it is remarkably simple to X-Ray Twitter and target people in specific locations who mention specific titles and/or skill terms in their bios.</p>
<p>For example, let’s say you wanted to find Twitter profiles of user experience professionals who live in the New York area. You could run a search like this on Bing to force the search engine to return only results that mention UX within 15 words of &#8220;Bio&#8221; and &#8220;New York&#8221; within 3 words of &#8220;Location:&#8221;</p>
<p><a title="Very good Bing X-Ray results from Twitter of UX pros in the New York area" href="http://www.bing.com/search?q=site%3Atwitter.com+bio+near%3A15+UX+location+near%3A3+new+york&amp;go=&amp;form=QBRE">site:twitter.com bio near:15 UX location near:3 new york</a></p>
<p>You can see how Bing’s proximity search helps you target terms in Twitter bios and location text:</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2010/12/Bing9.png"><img title="Bing9" src="http://www.booleanblackbelt.com/wp-content/uploads/2010/12/Bing9.png" alt="" width="600" height="362" /></a></p>
<p>Viewing a cached result shows Bing’s flawless execution of NEAR:x:</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2011/06/Bing10.png"><img title="Bing X-Ray search of Twitter using configurable proximity to find people who mention specific terms in their bios as well as live in a specific location" src="http://www.booleanblackbelt.com/wp-content/uploads/2011/06/Bing10.png" alt="" width="191" height="186" /></a></p>
<p>How&#8217;s that for a relevant result?</p>
<p>Basically as good as it gets &#8211; I wanted someone who lives in the NY area who is a User Experience professional, and that&#8217;s exactly what I got! <em><strong>That</strong></em> is relevance!</p>
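<p>If you build X-Ray searches like these often, a small script can assemble the query string and the search URL for you. The Python sketch below only does string building with the standard library; the helper names (<code>near_clause</code>, <code>xray_query</code>, <code>bing_url</code>) are hypothetical, and it assumes the <code>near:x</code> syntax shown above. Whether Bing honors that operator at any given time is something to verify yourself.</p>

```python
from urllib.parse import urlencode

def near_clause(left, right, distance):
    """Join two terms with the near:x proximity syntax used in this post."""
    return f"{left} near:{distance} {right}"

def xray_query(site, *clauses):
    """Prefix the clauses with a site: restriction to form an X-Ray query."""
    return " ".join([f"site:{site}", *clauses])

def bing_url(query):
    """URL-encode the query for Bing's search endpoint."""
    return "http://www.bing.com/search?" + urlencode({"q": query})

query = xray_query(
    "twitter.com",
    near_clause("bio", "UX", 15),
    near_clause("location", "new york", 3),
)

print(query)
# site:twitter.com bio near:15 UX location near:3 new york
print(bing_url(query))
# http://www.bing.com/search?q=site%3Atwitter.com+bio+near%3A15+UX+location+near%3A3+new+york
```

<p>Swapping in a different title, skill term, location, or distance is then a one-line change, which makes it easier to experiment with how tight the proximity needs to be.</p>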
<p>Of course, <a title="You have to think outside the box to effectively search social networks like Twitter" href="http://www.booleanblackbelt.com/2009/04/searching-social-media-requires-outside-the-box-thinking/" target="_self">when searching Twitter, it is especially important to realize that people can be very creative in how they may describe themselves</a> (titles, skills, etc.), their experience, and their location – they can enter whatever they want.</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2010/12/Bing11.png"><img title="Bing11" src="http://www.booleanblackbelt.com/wp-content/uploads/2010/12/Bing11.png" alt="" width="182" height="123" /></a></p>
<p>As such, you could not find the above Twitter bio by searching only for &#8220;Drupal.&#8221;</p>
<h2>Performing Semantic Search with Configurable Proximity</h2>
<p>You can perform basic semantic search by targeting sentence-level meaning using Bing’s support of configurable proximity.</p>
<p>For example, let&#8217;s say you were searching for resumes on the Internet and wanted to find people who have had a specific responsibility, such as configuring Juniper routers.</p>
<p>You could run a basic search like this: <a title="Bing search for resumes using configurable proximity to perform semantic, sentence-level search" href="http://www.bing.com/search?q=%28inurl%3Aresume+OR+intitle%3Aresume%29+configuring+near%3A5+juniper+juniper+near%3A5+routers&amp;go=&amp;form=QBRE">(inurl:resume OR intitle:resume) configuring near:5 juniper juniper near:5 routers</a></p>
<p>And see results like this:</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2010/12/Bing12.png"><img title="Bing12" src="http://www.booleanblackbelt.com/wp-content/uploads/2010/12/Bing12.png" alt="" width="587" height="112" /></a></p>
<p>Of course, there are many different ways to run that search – I only wanted to demonstrate the power of being able to control how close search terms are to each other, especially when targeting responsibilities, typically stated in verb/noun combinations. This allows you to perform semantic search <strong><em>at the sentence level</em></strong>.</p>
<p>Now that we&#8217;ve played around a bit with proximity search, let&#8217;s move on to the other half of extended Boolean &#8211; variable term weighting.</p>
<h2>Variable Term Weighting</h2>
<p>Talented sourcers and recruiters know that not all terms are equally important in a query.</p>
<p>In most queries and searches, certain search terms are more important than others. When running standard Boolean queries, all search terms are considered/weighted equally &#8211; and this is the stone that the makers of so-called semantic search applications often throw at Boolean search.</p>
<p>Unfortunately, many search engines and database search interfaces simply assign relevance to results by the number of search term “hits” in each document. In most cases, the simple frequency of search terms does not correlate with relevance. This is where the derisive description “buzzword bingo” comes from, most often used to suggest that running Boolean searches takes little skill beyond matching and counting keywords.</p>
<p>Using an information technology hiring profile as an example – if a sourcer were looking for candidates who have significant experience administering Windows servers and Exchange email servers, they might create a simple Boolean query such as this: Windows AND Exchange AND server* AND admin*.</p>
<p>That search is highly likely to return Windows systems administrators who mention Windows many times in their resume/profile and happen to mention Exchange once or twice, and to rank them as highly relevant because of the number of “hits” for Windows – which is by nature a very common term in resumes.</p>
<p>This would leave the sourcer having to sort through a large volume of false positive results (that contain the keywords, but are not of people who have been primarily responsible for administering Windows and Exchange servers) to find the candidates who actually <em><strong>have</strong></em> been primarily responsible for administering Exchange servers as well as Windows servers.</p>
<p>Search engines that offer users the ability to assign different weights to each search term enable sourcers and recruiters to move beyond simple buzzword matching and take control of the relevance of the results. Essentially, with variable term weighting you can assign a number value to words to increase their weight when ranking retrieved documents – which changes the ORDER of the results, not the total number of results.</p>
<p>Using the same example as above, a sourcer using a search engine that supports variable term weighting could create a Boolean search string to more heavily weight the term &#8220;Exchange.&#8221; That Boolean query would pull the same number of results as the first search that had no term weighting – however, it would sort and rank the results heavily favoring resumes/profiles that mention Exchange more often in relation to the other search terms, increasing the likelihood that the sourcer can quickly identify candidates who have had experience being responsible for administering and supporting Exchange servers.</p>
<p>By employing variable term weighting, you can positively affect the relevance of the search results.</p>
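<p>Here is a minimal Python sketch of that idea: score each document as the sum of term counts multiplied by a weight, then sort. The scoring formula, the weight values, the function names (<code>score</code>, <code>rank</code>), and the two tiny sample resumes are all assumptions for illustration; real search engines use their own ranking formulas and weighting syntax.</p>

```python
import re
from collections import Counter

def score(text, weights):
    """Sum of (number of occurrences x weight) over the query terms."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    return sum(counts[term] * weight for term, weight in weights.items())

def rank(resumes, weights):
    """Return resume names, highest score first.
    Every resume is kept; only the order can change."""
    return sorted(resumes, key=lambda name: score(resumes[name], weights),
                  reverse=True)

# Two made-up resumes that both match Windows AND Exchange.
resumes = {
    "windows_admin": ("Windows administrator. Windows servers, Windows "
                      "desktops, Windows patching. Exchange mailbox resets."),
    "exchange_admin": ("Exchange administrator. Exchange servers, "
                       "Exchange migration on Windows."),
}

unweighted = {"windows": 1, "exchange": 1}  # plain hit counting
weighted = {"windows": 1, "exchange": 5}    # Exchange matters more

print(rank(resumes, unweighted))  # ['windows_admin', 'exchange_admin']
print(rank(resumes, weighted))    # ['exchange_admin', 'windows_admin']
```

<p>Both rankings contain the same two resumes. With plain hit counting, the frequent mentions of Windows push the Windows administrator to the top; weighting Exchange more heavily moves the Exchange administrator ahead.</p>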
<h2>Final Thoughts</h2>
<p>Hopefully I&#8217;ve shed some light on how being able to control the proximity of two search terms can yield results that are FAR more relevant than results that simply mention the two terms anywhere in a document or form – this is the critical difference between the semantic similarity between a search and its results vs. the lexical similarity between a search and its results.</p>
<p>There are countless ways you can apply extended Boolean functionality such as variable term weighting and proximity searching to nearly any industry/hiring profile to create searches that return highly relevant results - results that are more relevant than those that can be achieved with standard Boolean logic.</p>
<p>Using a search engine that supports both variable proximity and variable term weighting can empower sourcers and recruiters to quickly find large volumes of highly relevant results, increasing productivity and achieving <a title="Learn more about the concept of Lean, Just In Time Sourcing and Recruiting" href="http://www.booleanblackbelt.com/2011/02/what-is-lean-just-in-time-recruiting/">Just-In-Time sourcing and recruiting</a>.</p>
<p>I wish the makers of search engines would seek less to &#8220;dumb down&#8221; search interfaces and functionality and incorporate more powerful search capability that allows users to take significant control over the relevance of their search results.</p>
<p>There are a few search engines and ATS/CRM systems that support both configurable proximity search and variable term weighting.</p>
<p>Does yours?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.booleanblackbelt.com/2011/06/beyond-boolean-search-proximity-and-weighting/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Beyond Boolean: Human Capital Information Retrieval</title>
		<link>http://www.booleanblackbelt.com/2011/04/beyond-boolean-human-capital-information-retrieval/</link>
		<comments>http://www.booleanblackbelt.com/2011/04/beyond-boolean-human-capital-information-retrieval/#comments</comments>
		<pubDate>Mon, 18 Apr 2011 13:00:26 +0000</pubDate>
		<dc:creator>Glen Cathey</dc:creator>
				<category><![CDATA[Boolean]]></category>
		<category><![CDATA[Boolean Logic]]></category>
		<category><![CDATA[Human Capital Data]]></category>
		<category><![CDATA[Information Retrieval]]></category>
		<category><![CDATA[Internet Sourcing]]></category>
		<category><![CDATA[Myths and Misconceptions]]></category>
		<category><![CDATA[Beyond Boolean]]></category>
		<category><![CDATA[Boolean Search]]></category>
		<category><![CDATA[Boolean Strings]]></category>
		<category><![CDATA[Candidate Sourcing]]></category>
		<category><![CDATA[HCIR]]></category>
		<category><![CDATA[Human Capital Information Retrieval]]></category>
		<category><![CDATA[Human computer information retrieval]]></category>
		<category><![CDATA[Sourcing]]></category>

		<guid isPermaLink="false">http://www.booleanblackbelt.com/?p=8294</guid>
		<description><![CDATA[When I recently spoke at SourceCon in New York, I showed an example Boolean search string that could be used as a challenge or an evaluation of a person&#8217;s knowledge and ability. The search string looked something like this: (Director or &#8220;Project Manage*&#8221; or &#8220;Program Manage*&#8221; or PM*) w/250 xfirstword and (truck* or ship* or [...]]]></description>
			<content:encoded><![CDATA[
<p><a href="http://www.flickr.com/photos/engladgut/1466195037/"><img class="alignright size-full wp-image-8842" title="Boolean Operators" src="http://www.booleanblackbelt.com/wp-content/uploads/2011/04/Boolean-Operators.jpg" alt="" width="240" height="180" /></a></p>
<p>When I recently spoke at SourceCon in New York, I showed an example Boolean search string that could be used as a challenge or an evaluation of a person&#8217;s knowledge and ability.</p>
<p>The search string looked something like this:</p>
<p>(Director or &#8220;Project Manage*&#8221; or &#8220;Program Manage*&#8221; or PM*) w/250 xfirstword and (truck* or ship* or rail* or transport* or logistic* or &#8220;supply chain*&#8221;) w/10 (manag* or project)* and (Deloitte or Ernst or &#8220;E&amp;Y&#8221; or KPMG or PwC or PricewaterhouseCoopers or &#8220;Price Waterhouse*&#8221;)</p>
<p>During the presentation, an audience member asked me why there wasn&#8217;t any use of site:, inurl:, intitle:, etc. I responded by acknowledging that for many, sourcing and Boolean search seem to be synonymous with Internet search &#8211; however, this is <a title="There is much more to Boolean search than the Internet!" href="http://www.booleanblackbelt.com/2009/02/boolean-search-does-not-internet-search/">definitely not the case</a>.<span id="more-8294"></span></p>
<h2>Boolean Logic is Simply the Simplest Way to Search</h2>
<p>Some (but I hope not too many!) sourcing and recruiting professionals may be surprised to learn that <a title="Boolean logic is over 150 years old!" href="http://en.wikipedia.org/wiki/Boolean_logic">Boolean logic</a> significantly predates the Internet as well as computers – by over a century!</p>
<p>I still run into sourcers and recruiters who are not aware that the word “Boolean” comes from the man who invented Boolean Logic in the 19th century – <a title="I still run into people who have no idea that Boolean comes from George Boole!" href="http://en.wikipedia.org/wiki/George_Boole">George Boole</a>. Boolean Logic is the basis of modern computer logic, and George Boole is regarded in hindsight as one of the founders of the field of computer science.</p>
<p>Given that Boolean logic was created in the 1800s, it’s pretty obvious that Boolean logic is not just for searching for people and information on the Internet.</p>
<p>Practically any information system from which you need to search and retrieve information “speaks” Boolean.</p>
<p>This is understandable, because using Boolean logic is the <strong><em>simplest way to construct a search.</em></strong> When you want a combination of terms/phrases you use AND, when you want at least one of a group of terms/phrases you use OR, and when you don&#8217;t want something you use NOT. It really doesn&#8217;t get any easier than that.</p>
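<p>One way to picture the three operators is as set operations over the documents that contain each term. The sketch below is a minimal illustration in Python; the three profiles and the helper named having are invented for this example and do not describe how any particular search engine is implemented.</p>

```python
# Hypothetical profiles; each maps to the set of words it contains.
PROFILES = {
    "p1": {"java", "developer", "manager"},
    "p2": {"java", "architect"},
    "p3": {"python", "developer"},
}

def having(term):
    """Names of all profiles that contain the term."""
    return {name for name, words in PROFILES.items() if term in words}

# AND keeps only profiles found by both terms (set intersection).
java_and_developer = having("java").intersection(having("developer"))

# OR keeps profiles found by at least one term (set union).
java_or_python = having("java").union(having("python"))

# NOT removes profiles that contain the unwanted term (set difference).
developer_not_manager = having("developer").difference(having("manager"))
```

<p>Here java AND developer yields only p1, java OR python yields all three profiles, and developer NOT manager yields only p3.</p>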
<p>When anyone types more than a single word or phrase into Google, Bing, LinkedIn, Amazon, eBay, etc., they&#8217;re performing Boolean search, because spaces are automatically converted to ANDs. Billions of people across the globe are running basic Boolean strings whether they are aware of this or not, which is a testament to how easy Boolean search is.</p>
<h2>Sourcing isn&#8217;t about Boolean Search Strings</h2>
<p>Sourcing candidates is much more than Boolean search strings &#8211; they are but <strong><em>one</em></strong> <strong><em>aspect</em></strong> of sourcing.</p>
<p>Sourcing talent is more accurately and completely defined and described as <strong><em>human capital information retrieval</em></strong>.</p>
<p><a title="Information Retrieval goes way beyond Boolean!" href="http://en.wikipedia.org/wiki/Information_retrieval">Information retrieval</a> (IR) is &#8220;the science of searching for documents, for information within documents, and for metadata about documents, as well as that of searching relational databases and the World Wide Web.&#8221;</p>
<p>Leveraging information systems for talent discovery and identification is about searching documents, for information within documents, and for metadata about documents, as well as that of searching relational databases and the Internet for human capital information, including titles, companies, responsibilities, skills, technologies, social network updates, blog posts, resume information, event and association lists, etc.</p>
<p>With IR, an information retrieval process begins when a user enters a <strong><em>query</em></strong> into an interface.</p>
<p>Queries are simply formal statements of information needs. For a sourcer or recruiter, their information need is typically to find information that will lead them to discover and identify people with specific skills, experience, capabilities, education, etc.</p>
<p>While using Boolean operators is arguably the easiest way to construct a query, IR queries do not have to be limited solely to Boolean logic, as can be seen in the various non-Boolean query modifiers of Internet search engines (here are some of <a title="A partial list of Google's search modifiers/operators" href="http://www.google.com/intl/en/help/operators.html">Google&#8217;s</a> and <a title="A list of Bing's advanced operators" href="http://msdn.microsoft.com/en-us/library/ff795620.aspx">Bing&#8217;s</a>), <a title="Learn more about the powerful yet least utilized search capability of LinkedIn" href="http://www.booleanblackbelt.com/2009/01/linkedins-advanced-search-operators/">LinkedIn&#8217;s advanced search operators</a>, <a title="Learn more about faceted search" href="http://en.wikipedia.org/wiki/Faceted_search">faceted search</a> (e.g., LinkedIn&#8217;s filters), etc.</p>
<p>The &#8220;hard&#8221; part of creating queries for human capital information retrieval isn&#8217;t deciding which Boolean operators to use. AND/OR/NOT is the <strong><em>easy</em></strong> part. In fact, my daughter learned about Boolean logic last year, including constructing Venn diagrams &#8211; in her 1st grade public school class!</p>
<p>The <strong><em>hard</em></strong> part of creating queries is intelligently selecting a combination of words and phrases, and in some cases <a title="Some relevant search cannot be found via direct search methods - see LinkedIn's &quot;Dark Matter&quot;" href="http://www.booleanblackbelt.com/2011/03/linkedins-dark-matter-undiscovered-profiles/">strategically excluding some words and phrases</a>, that will return highly relevant results &#8211; people who are not only likely to be qualified for the position being sourced for, but also highly likely to be interested in the opportunity (i.e., &#8220;recruitable&#8221;).</p>
<p>Yes &#8211; you actually have to <strong><em>think</em></strong> in order to create effective queries that return highly <a title="See definition #2 - &quot;the ability (as of an information retrieval system) to retrieve material that satisfies the needs of the user&quot;" href="http://www.merriam-webster.com/dictionary/relevance">relevant</a> results.</p>
<h2>Human-Computer Information Retrieval</h2>
<p><a title="Learn more about Human-computer information retrieval!" href="http://en.wikipedia.org/wiki/HCIR">Human–computer information retrieval</a> (HCIR) is &#8220;the study of information retrieval techniques that bring human intelligence into the search process.&#8221;</p>
<p>According to Wikipedia, which <a title="Watson had access to all of Wikipedia when competing on Jeopardy" href="http://www.booleanblackbelt.com/2011/03/sourcers-and-recruiters-dont-fear-watson-or-semantic-search/">IBM&#8217;s Watson used heavily to compete in Jeopardy</a>, &#8220;The fields of human–computer interaction (<a title="Human–computer interaction (HCI) is the study, planning and design of the interaction between people (users) and computers. It is often regarded as the intersection of computer science, behavioral sciences, design and several other fields of study." href="http://en.wikipedia.org/wiki/Human%E2%80%93computer_interaction">HCI</a>) and information retrieval (<a title="Information retrieval (IR) is the area of study concerned with searching for documents, for information within documents, and for metadata about documents, as well as that of searching relational databases and the World Wide Web. There is overlap in the usage of the terms data retrieval, document retrieval, information retrieval, and text retrieval, but each also has its own body of literature, theory, praxis, and technologies. IR is interdisciplinary, based on computer science, mathematics, library science, information science, information architecture, cognitive psychology, linguistics, and statistics." href="http://en.wikipedia.org/wiki/Information_retrieval">IR</a>) have both developed innovative techniques to address the challenge of navigating complex information spaces&#8230;[and] Human–computer information retrieval has emerged in academic research and industry practice to bring together research in the fields of IR and HCI, in order to create new kinds of search systems that <strong><em>depend on continuous human control of the search process</em></strong>.&#8221; (emphasis mine)</p>
<p>The term human–computer information retrieval was coined by <a title="Learn more about Gary Marchionini" href="http://www.ils.unc.edu/~march/">Gary Marchionini</a> whose main thesis is that &#8220;HCIR aims to empower people to explore large-scale information bases <strong><em>but demands that</em></strong> <strong><em>people also take responsibility for this control by expending cognitive and physical energy</em></strong>.&#8221; (emphasis mine again)</p>
<p>For those who simply want information systems to magically provide them with the most relevant results at the click of a button, you should take special note of the fact that experts in the field of HCIR do not believe that people should step out of the information retrieval process and let semantic search/NLP algorithms/AI be solely responsible for the search process.</p>
<p>If you&#8217;re interested in learning more about HCIR, I suggest you read this <a title="If you have anything to do with sourcing and recruiting, you really should read this blog" href="http://thenoisychannel.com/">blog</a> &#8211; you may be surprised and interested to see who the author is, where he&#8217;s been, what he&#8217;s done, where he is now, and what&#8217;s on his mind.</p>
<h2>Talent Mining</h2>
<p>In my opinion and experience, Boolean search neither adequately describes nor gives proper credit to what sourcers and recruiters are really doing when they leverage the Internet, resume databases, ATS/CRM applications and social networking sites such as LinkedIn to find candidates, and to what some very talented and highly skilled professionals are able to accomplish with human capital information.</p>
<p>At <a title="SourceCon 2010 Agenda, held at the International Spy Museum" href="http://www.sourcecon.com/2010dc/agenda-at-a-glance/">SourceCon 2010</a>, I spoke about a specialized form of HCIR which I call talent mining, which is essentially human capital information retrieval &#8211; a specialized form of IR involving querying and analyzing human capital data (resumes, social network profiles and updates, blogs, etc.) for talent discovery, identification, and ultimately acquisition.</p>
<p>I believe there are at least five distinct levels of <a title="You can view my slide deck on Talent Mining here" href="http://www.slideshare.net/glencathey/source-con-talent-mining-12-no-video">Talent Mining</a>:</p>
<ol>
<li>Skill/Title Search</li>
<li>Concept Search</li>
<li>Implicit Search</li>
<li>Semantic/Natural Language Search</li>
<li>Indirect Search</li>
</ol>
<p>Talent Mining is not defined by nor limited to Boolean search &#8211; any and all information retrieval methods that can be leveraged to discover and return human capital data are applicable and should be used.</p>
<p>At the strategic level, talent mining is the process of transforming human capital data into an informational and competitive advantage, which is much more than simply writing Boolean search strings.</p>
<p>Only the simplest and most basic level 1 talent mining can be performed without much thought &#8211; slapping titles and keywords taken directly from a job description into a Boolean search string and hitting &#8220;search.&#8221;</p>
<p>Beyond that, more advanced level 1 and most certainly levels 2 through 5 talent mining require significant &#8220;cognitive energy&#8221; and involve continual improvement.</p>
<p>In fact, effective sourcing can and should be an <a title="Learn more about iterative development and you will see the parallels with the sourcing process lifecycle" href="http://en.wikipedia.org/wiki/Iterative_and_incremental_development">iterative</a> process.</p>
<h2>Beyond Boolean &amp; Internet Search</h2>
<p>I believe that those who equate sourcing with basic Boolean Internet search don&#8217;t fully understand or appreciate the power of human capital data, its many forms and sources, and the many ways that it can be leveraged.</p>
<p>While the Internet has a lot of information, it is also full of garbage (others would call it &#8220;noise&#8221;) and it does not hold as many &#8220;findable&#8221; resumes as you may have been led to believe.</p>
<p>There is no denying that non-resume human capital data is valuable, but searching the Internet for non-resume information can easily spiral into an exercise in low ROI, time consuming garbage-sifting. Many don&#8217;t realize (or want to recognize) that non-resume data offers shallow information at best and thus has less qualitative and predictive value.</p>
<p>Additionally, the Internet isn&#8217;t a <a title="A database is a system intended to organize, store, and retrieve large amounts of data easily. It consists of an organized collection of data for one or more uses, typically in digital form." href="http://en.wikipedia.org/wiki/Database">database</a> &#8211; it&#8217;s a <a title="It irks me when people call the Internet the biggest &quot;database&quot; in the world. It's not a database!" href="http://en.wikipedia.org/wiki/Internet">network of networks</a> and the information stored on those networks is largely unstructured.</p>
<p>Structured data is an <a title="An order of magnitude is the class of scale or magnitude of any amount, where each class contains values of a fixed ratio to the class preceding it. In its most common usage, the amount being scaled is 10 and the scale is the (base 10) exponent being applied to this amount (therefore, to be an order of magnitude greater is to be 10 times as large)." href="http://en.wikipedia.org/wiki/Order_of_magnitude">order of magnitude</a> (it could easily be argued <strong><em>many</em></strong> orders of magnitude) more valuable and searchable than unstructured data, if for no other reason than its intrinsically high predictive value.</p>
<p>LinkedIn offers a good example of the power of structured human capital data, although a large percentage of LinkedIn profiles are information-anemic. Even so, all profiles are required to have employer and title information, and both are structured, fully searchable fields.</p>
<p>Additionally, corporate ATS&#8217;s and major job board resume databases have hundreds of thousands to tens of millions of candidate records &#8211; with deep and sometimes well-structured data. I&#8217;m perpetually confused as to why there is so much written on Internet sourcing and why I don&#8217;t see more people writing and speaking about mining all of the rich human capital data hiding in resume databases and applicant tracking systems.</p>
<p>Perhaps one of the reasons why the sourcing function and role isn&#8217;t highly regarded or respected by some is because those people equate sourcing with basic Boolean search. If all they think sourcers and recruiters are doing is directly searching for keywords and titles from job descriptions, then I can actually understand why some people would think of sourcing as an entry level role or function.</p>
<p>However, sourcing isn&#8217;t just about Boolean search, it&#8217;s about human capital information retrieval.</p>
<p>While Boolean logic is the simplest way to construct an IR query and practically all information systems accept basic Boolean operators, <strong><em>the real &#8220;magic&#8221; and work of sourcing talent is the iterative, intelligent, and cognitively challenging process of selecting a combination of words and phrases, and in some cases <a title="Some relevant search cannot be found via direct search methods - see LinkedIn's &quot;Dark Matter&quot;" href="http://www.booleanblackbelt.com/2011/03/linkedins-dark-matter-undiscovered-profiles/">strategically excluding others</a>, analyzing the results returned, making changes to the query based on observed relevance, and repeating the process until an acceptable quantity of highly qualified and matched candidates are identified.</em></strong></p>
<p>I would personally like to see more sourcing, recruiting and HR conferences and blogs address human capital information retrieval, specifically with regard to focusing on the sourcing <strong><em>process</em></strong>, as well as deep and structured human capital data. If this happens, I don&#8217;t think it will be long before companies start to realize that sourcing can offer a serious strategic competitive advantage, and perhaps <strong><em>invest more</em></strong> in technologies and talented people to achieve a competitive advantage based on human capital data for talent discovery, identification, acquisition, and retention.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.booleanblackbelt.com/2011/04/beyond-boolean-human-capital-information-retrieval/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Boolean Search Does Not = Internet Search</title>
		<link>http://www.booleanblackbelt.com/2009/02/boolean-search-does-not-internet-search/</link>
		<comments>http://www.booleanblackbelt.com/2009/02/boolean-search-does-not-internet-search/#comments</comments>
		<pubDate>Wed, 18 Feb 2009 13:00:09 +0000</pubDate>
		<dc:creator>Glen Cathey</dc:creator>
				<category><![CDATA[Boolean 101]]></category>
		<category><![CDATA[Boolean Logic]]></category>
		<category><![CDATA[Myths and Misconceptions]]></category>
		<category><![CDATA[George Boole]]></category>
		<category><![CDATA[Internet Recruiting]]></category>
		<category><![CDATA[Limits of Boolean Search on the Internet]]></category>

		<guid isPermaLink="false">http://www.booleanblackbelt.com/?p=622</guid>
		<description><![CDATA[If you read certain sourcing and recruiting blogs and discussion groups, you might get the impression that Boolean search pretty much equals Internet search - such as searching for people and profiles using Google, Yahoo, or other search engines. Some sourcing and recruiting professionals may be surprised to learn that Boolean logic significantly predates the Internet and even computers [...]]]></description>
			<content:encoded><![CDATA[
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2009/02/george_boole.jpg"><img class="alignright size-full wp-image-1502" title="george_boole" src="http://www.booleanblackbelt.com/wp-content/uploads/2009/02/george_boole.jpg" alt="" width="215" height="211" /></a></p>
<p>If you read certain sourcing and recruiting blogs and discussion groups, you might get the impression that Boolean search pretty much equals Internet search - such as searching for people and profiles using Google, Yahoo, or other search engines. Some sourcing and recruiting professionals may be surprised to learn that Boolean logic significantly predates the Internet and even computers &#8211; by a century or more!</p>
<p>The word &#8220;Boolean&#8221; comes from the man who invented Boolean Logic in the 19th century &#8211; <a class="wp-caption-dd" title="Learn more about George Boole on Wikipedia" href="http://en.wikipedia.org/wiki/George_Boole" target="_blank">George Boole</a>. <a class="wp-caption-dd" title="Learn more about Boolean Logic" href="http://en.wikipedia.org/wiki/Boolean_logic" target="_blank">Boolean Logic </a>is the basis of modern computer logic, and George Boole is regarded in hindsight as one of the founders of the field of computer science.</p>
<p>Now that you know Boolean logic was created in the 1800s &#8211; it&#8217;s pretty obvious that Boolean logic is not just for searching for people and information on the Internet. Practically any information system from which you need to search and retrieve information &#8220;speaks&#8221; Boolean to some extent, whether you realize it or not.</p>
<h2>Applicant Tracking Systems</h2>
<p>I was first exposed to Boolean search back in 1997 B.G. (Before Google) when my sole source of candidates was a Lotus Notes resume database by the name of CPAS, made by <a class="wp-caption-dd" title="VCG Software" href="http://www.vcgsoftware.com/" target="_blank">VCG</a>. Although the CPAS product (which no longer exists) was far from a fully featured <a class="wp-caption-dd" title="Applicant tracking systems explained on Wikipedia" href="http://en.wikipedia.org/wiki/Applicant_Tracking_System" target="_blank">Applicant Tracking System</a>, thankfully it did support full Boolean logic, with very few limitations. If it didn&#8217;t support full Boolean logic, this blog probably would not exist &#8211; and if it did, I wouldn&#8217;t be writing it. Thank you CPAS!</p>
<p>The CPAS search interface allowed me to hand-code highly precise and effective Boolean search strings using all three standard Boolean operators: AND, OR, and NOT. While there are some applicant tracking systems on the market that do support full Boolean logic, it is an unfortunate fact that too many ATS&#8217;s available today do not support creating searches using full Boolean logic, which significantly handicaps sourcers and recruiters from leveraging their internal corporate candidate databases.</p>
<h2>Job Boards</h2>
<p>In contrast &#8211; all of the major job board resume databases (Monster, Careerbuilder, Hotjobs, Dice, etc.) support full Boolean logic. As I have written about many times before, Monster even supports <a class="wp-caption-dd" title="Extended Boolean" href="http://www.booleanblackbelt.com/2008/11/extended-boolean-proximity-and-weighting/" target="_blank">&#8220;extended&#8221; Boolean search functionality</a> with the incredibly powerful NEAR operator.</p>
<h2>Social Networks</h2>
<p>While most social networks are painfully difficult to search with their extremely limited search interfaces, LinkedIn does support creating search strings employing full Boolean logic. In fact, it appears that you can create Boolean search strings of unprecedented length and complexity on LinkedIn. If you haven&#8217;t already, please read this post I wrote that compares <a class="wp-caption-dd" title="LinkedIn search: Internal vs. External" href="http://www.booleanblackbelt.com/2009/02/free-linkedin-search-internal-vs-x-ray/" target="_blank">searching LinkedIn using LinkedIn&#8217;s search interface with searching LinkedIn using Google and the x-ray technique</a>. I got tired of entering words into LinkedIn&#8217;s search bar after cramming 316,638 characters into it. That&#8217;s the equivalent of a Boolean search string that contains over 60,000 words and is approximately 120 pages long!</p>
<h2>Internet Search</h2>
<p>What&#8217;s especially ironic about the widespread perception that Boolean = Internet search is that most Internet search engines don&#8217;t even support full Boolean logic. For example, although Google supports Boolean search strings containing AND, OR, and NOT (with the minus sign) functionality, you cannot use the NOT/- operator on an OR statement.</p>
<p>Let&#8217;s look at the results when we try to run this search string on Google:<span id="more-622"></span></p>
<p>(inurl:resume | intitle:resume) &#8220;business analyst&#8221; (requirement | requirements) -(job OR jobs OR sample)</p>
<p>According to the Boolean logic of the search, we should not have any results with the words &#8220;job,&#8221; &#8220;jobs,&#8221; or &#8220;sample.&#8221; Here is a screenshot of the first page of results &#8211; you can easily see that the search is actually returning results with the words sample, job, and jobs, defying the Boolean logic of the search string.</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2009/02/google-fails-to-support-the-not-operator.png"><img class="alignnone size-full wp-image-1499" title="google-fails-to-support-the-not-operator" src="http://www.booleanblackbelt.com/wp-content/uploads/2009/02/google-fails-to-support-the-not-operator.png" alt="" width="500" height="404" /></a></p>
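<p>In strict Boolean logic, negating an OR group is equivalent to negating each term separately (De Morgan&#8217;s law). The short Python check below illustrates only that logical equivalence, using made-up function names; whether a particular search engine honors either form is something to test on that engine.</p>

```python
from itertools import product

TERMS = ("job", "jobs", "sample")

def not_group(present):
    """NOT (job OR jobs OR sample): reject when any listed term is present."""
    return not any(present[t] for t in TERMS)

def separate_nots(present):
    """(NOT job) AND (NOT jobs) AND (NOT sample): each term excluded on its own."""
    return all(not present[t] for t in TERMS)

# Try every combination of the three terms being present or absent.
cases = [dict(zip(TERMS, flags)) for flags in product([False, True], repeat=3)]
equivalent = all(not_group(c) == separate_nots(c) for c in cases)
```

<p>The two forms agree on all eight combinations, so a page passes one exactly when it passes the other.</p>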
<p>Google also does not allow users to create searches with the following logic:</p>
<p>(cfa AND analyst) OR (mba AND marketing)</p>
<p>According to the Boolean logic, that search should return any result that mentions both CFA and analyst, or both MBA and marketing; either side of the OR should be enough on its own, because each parenthesized group is an independent alternative.  Let&#8217;s see what Google does with it:</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2009/02/google-cfa-mba-results.png"><img class="alignnone size-full wp-image-1500" title="google-cfa-mba-results" src="http://www.booleanblackbelt.com/wp-content/uploads/2009/02/google-cfa-mba-results.png" alt="" width="500" height="329" /></a></p>
<p>As you can see, Google once again does not follow the Boolean logic of the search: the results mention one or more terms from both sides of the OR operator, which suggests the two parenthesized groups are not being evaluated as independent alternatives. You could of course simply split the single (cfa AND analyst) OR (mba AND marketing) search into two separate searches, but the point is that you should not have to, and you would not have to if Google actually adhered to basic Boolean logic.</p>
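<p>As a minimal sketch of why grouping matters, the Python snippet below evaluates the query two ways: as written, and under an alternative reading in which OR binds only the two terms next to it. That second reading is an assumption chosen for illustration, not a documented description of how any engine parses the query, and the three sample documents are invented.</p>

```python
def grouped(doc):
    """(cfa AND analyst) OR (mba AND marketing), evaluated as written."""
    return ("cfa" in doc and "analyst" in doc) or ("mba" in doc and "marketing" in doc)

def or_binds_neighbors(doc):
    """cfa AND (analyst OR mba) AND marketing: an assumed alternative reading."""
    return "cfa" in doc and ("analyst" in doc or "mba" in doc) and "marketing" in doc

# Hypothetical documents, reduced to the set of query terms they mention.
DOCS = {
    "cfa_analyst_only": {"cfa", "analyst"},
    "mba_marketing_only": {"mba", "marketing"},
    "cfa_mba_marketing": {"cfa", "mba", "marketing"},
}

# For each document: (matches as written, matches under the alternative reading).
results = {name: (grouped(doc), or_binds_neighbors(doc)) for name, doc in DOCS.items()}
```

<p>As written, the query matches all three documents. Under the alternative reading only the document that mixes terms from both groups matches, and the two documents that satisfy just one side of the OR are lost. Running the two halves as separate searches avoids depending on how the grouping is parsed.</p>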
<p>In case you&#8217;re curious &#8211; Yahoo and Ask also do not properly execute the (cfa AND analyst) OR (mba AND marketing) search as the Boolean logic of the query dictates. However, MS Live does in fact execute the search properly. <a class="wp-caption-dd" title="MS Live wins the Boolean challenge" href="http://search.live.com/results.aspx?q=%28cfa+AND+analyst%29+OR+%28mba+AND+marketing%29&amp;go=&amp;form=QBLH" target="_blank">Click here to see for yourself</a>.</p>
<h2>Search String Length</h2>
<p>However &#8211; not all is perfect in MS Live Search land. MS Live apparently limits searches to a maximum of 10 search terms.  I&#8217;ve read this on several sites and decided to test it just to make sure it was accurate. When creating searches on MS Live, I could definitely type more than 10 search terms into my searches and the searches ran, but I routinely could not find search terms beyond the 10th search term in my search string in my results. So while Live Search supports full Boolean logic, you cannot create search strings of anything beyond basic complexity due to the extremely low limit on the number of search terms it will actually process.</p>
<p>Google isn&#8217;t much better with regard to the number of search terms you can include and execute in a search &#8211; Google limits you to 32 words.</p>
<p><a href="http://www.booleanblackbelt.com/wp-content/uploads/2009/02/google-32-word-limit.png"><img class="alignnone size-full wp-image-1505" title="google-32-word-limit" src="http://www.booleanblackbelt.com/wp-content/uploads/2009/02/google-32-word-limit.png" alt="" width="500" height="19" /></a></p>
<p>Unlike MS Live, at least Google had the manners to tell me it was ignoring some of my search terms. While 32 words might seem like a lot of search terms, as a comparison, Monster allows you up to 400 characters (including spaces) in its search bar, which can often mean you can create Boolean search strings with nearly DOUBLE Google&#8217;s limit of 32 search terms. And yes, there are times when you will want (and actually NEED) to create search strings with 60 search terms to target highly precise and relevant results.</p>
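<p>If you want to check a search string against limits like these before running it, a rough Python sketch follows. The two limits are the figures quoted above; the constant names, the sample query, and the counting rule (quoted phrases count as one term, operators are not counted) are assumptions, since each engine counts terms in its own way.</p>

```python
import re

# Limits quoted in the text; the constant names are made up for this sketch.
GOOGLE_WORD_LIMIT = 32
MONSTER_CHAR_LIMIT = 400

OPERATORS = {"AND", "OR", "NOT"}

def search_terms(query):
    """Split a query into terms, keeping quoted phrases whole and dropping operators."""
    tokens = re.findall(r'"[^"]*"|[^\s()]+', query)
    return [t for t in tokens if t.upper() not in OPERATORS]

# Hypothetical query used only to exercise the helper.
query = '(java OR j2ee) AND "project manager" AND (agile OR scrum) NOT recruiter'
terms = search_terms(query)

report = {
    "terms": len(terms),
    "characters": len(query),
    "within_word_limit": GOOGLE_WORD_LIMIT - len(terms) >= 0,
    "within_char_limit": MONSTER_CHAR_LIMIT - len(query) >= 0,
}
```

<p>The sample query has six search terms and sits well inside both limits. A long OR list of titles or skills is what typically pushes a string past a word-based limit while still fitting a character-based one.</p>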
<h2>Exalead</h2>
<p>Like MS Live Search, <a class="wp-caption-dd" title="Exalead Internet search engine" href="http://www.exalead.com/search" target="_blank">Exalead</a> is a search engine that does support full Boolean search functionality. In fact, when it comes to Boolean searching, Exalead trumps MS Live and even Monster&#8217;s search capability by supporting <a class="wp-caption-dd" title="Learn more about configurable proximity searching using Exalead" href="http://www.booleanblackbelt.com/2009/01/semantic-search-using-the-near-boolean-operator/" target="_blank">configurable proximity searching</a>. However, for all of its search power, Exalead does not appear to index nearly as many pages/sites as any of the &#8220;major&#8221; search engines (Google, Live, Yahoo, Ask), so for now Exalead must be relegated to the &#8220;minor&#8221; search engine category.</p>
<h2>Conclusion</h2>
<p>If you don&#8217;t have access to a major job board resume database or an applicant tracking system that supports Boolean search, or you don&#8217;t search LinkedIn using Boolean search strings, or you are completely new to sourcing and recruiting &#8211; then <strong><em>perhaps</em></strong> I can understand why you might think that Boolean search is synonymous with Internet search.</p>
<p>However, the cat&#8217;s out of the bag &#8211; George Boole invented Boolean logic back in the 1800s, LONG before the invention of computers and the Internet. Also, you&#8217;ve now seen that the &#8220;almighty&#8221; Google doesn&#8217;t even support full Boolean logic searching &#8211; among major Internet search engines, only MS Live can claim to do that. And there are certainly many other resources you can use that do support full Boolean logic and don&#8217;t limit you to 10 or even 32 search terms &#8211; such as the major job board resume databases, some applicant tracking systems, and LinkedIn.</p>
<p>So when it comes to Boolean search, it is perhaps a more correct statement to say that Internet search = limited and conditional Boolean search.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.booleanblackbelt.com/2009/02/boolean-search-does-not-internet-search/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Basic Boolean Operators Explained</title>
		<link>http://www.booleanblackbelt.com/2008/12/basic-boolean-operators-explained/</link>
		<comments>http://www.booleanblackbelt.com/2008/12/basic-boolean-operators-explained/#comments</comments>
		<pubDate>Fri, 19 Dec 2008 15:30:04 +0000</pubDate>
		<dc:creator>Glen Cathey</dc:creator>
				<category><![CDATA[Boolean Logic]]></category>
		<category><![CDATA[Boolean Operators]]></category>

		<guid isPermaLink="false">http://www.booleanblackbelt.com/?p=723</guid>
		<description><![CDATA[Basic Boolean Operators Explained No, those aren&#8217;t my hands. I never cease to be amazed by what you can find on the Internet and what people take pictures of. Now that I have your attention, this post is going to focus on the basic Boolean operators and search symbols and will not go into any detail of [...]]]></description>
			<content:encoded><![CDATA[<div class="tweetmeme_button" style="float: left; margin-right: 10px;">
			<a href="http://api.tweetmeme.com/share?url=http%3A%2F%2Fwww.booleanblackbelt.com%2F2008%2F12%2Fbasic-boolean-operators-explained%2F"><br />
				<img src="http://api.tweetmeme.com/imagebutton.gif?url=http%3A%2F%2Fwww.booleanblackbelt.com%2F2008%2F12%2Fbasic-boolean-operators-explained%2F&amp;source=GlenCathey&amp;style=compact&amp;b=2" height="61" width="50" /><br />
			</a>
		</div>
<p><strong><a href="http://www.flickr.com/photos/jennafreedman/3000801429/"><img class="alignright size-medium wp-image-749" title="andornot-hands-jenna-freedman" src="http://www.booleanblackbelt.com/wp-content/uploads/2008/12/andornot-hands-jenna-freedman-300x136.jpg" alt="" width="300" height="136" /></a>Basic Boolean Operators Explained</strong></p>
<p>No, those aren&#8217;t my hands. I never cease to be amazed by what you can find on the Internet and what people take pictures of.</p>
<p>Now that I have your attention, this post is going to focus on the basic Boolean operators and search symbols and will not go into any detail on the special Internet-only search commands/operators. Although a great many people seem to think that Boolean = Internet search, Boolean logic and searching have been around since WAY before the Internet. And here&#8217;s a quick fact: you don&#8217;t have to capitalize Boolean operators on any of the major job boards and many of the major ATS&#8217;s. Go ahead &#8211; try it. Nothing will explode and your searches will execute.</p>
<p><strong>And now, back to the Boolean basics&#8230;</strong></p>
<p><strong>AND </strong></p>
<p>The AND operator limits your search &#8211; it should be used for targeting the required skills, experience, technologies, or titles you would like to limit your results to. Unless you are searching for common words, every AND you add to your Boolean query will reduce the number of results you get.</p>
<p>Example: Java and Oracle and SQL and AJAX</p>
<p><strong>OR </strong></p>
<p>The OR operator typically broadens your search. Essentially, using an OR means &#8220;at least one of/one or more.&#8221; OR statements need to be encapsulated by parentheses in order to execute properly.</p>
<p>Example: Java and Oracle and SQL and AJAX and (apache or weblogic or websphere)</p>
<p>The returned results must mention <strong><em>at least one</em></strong> of the following: apache, weblogic, websphere. However, candidates who mention 2 or all 3 will also be returned, and some search engines will rank them as more relevant results because they match more of the terms.</p>
<p>The best ways to use OR statements are #1 to think of all of the alternate ways a particular skill or technology can be expressed, e.g., (CPA or &#8220;C.P.A&#8221; or &#8220;Certified Public Accountant&#8221;), and/or #2 to search for a list of desired skills where you would be pleased if a candidate had experience with at least one, e.g., (apache or weblogic or websphere).</p>
<p><strong>ASTERISK *</strong></p>
<p>The asterisk can be used on most resume databases and non-Internet search engines as a root word/stem/truncation search. In other words, the search engine will return and highlight any word that begins with the root/stem of the word truncated by the asterisk.</p>
<p>For example: admin* will return: administrator, administration, administer, administered, etc.</p>
<p>The asterisk is a time saver for search engines that recognize it (most major job boards and ATS&#8217;s) because it saves you from creating long OR statements and having to think of every way a particular word can be expressed.</p>
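<p>As a rough illustration of what a trailing asterisk does, here is a minimal Python sketch. The function name <code>matches_truncation</code> and the word list are hypothetical, and real resume databases may implement truncation differently; the sketch only shows the basic idea of matching any word that begins with the stem.</p>

```python
# Hypothetical helper: emulate a trailing-asterisk (truncation) search term.
def matches_truncation(pattern, word):
    """Return True if word matches a term such as 'admin*' (case-insensitive)."""
    if pattern.endswith("*"):
        stem = pattern[:-1].lower()
        # Any word that begins with the stem counts as a match.
        return word.lower().startswith(stem)
    # Without an asterisk, only the exact word matches.
    return word.lower() == pattern.lower()


# Made-up sample words, not taken from any real database.
words = ["administrator", "administration", "administer", "administered", "manager"]
hits = [w for w in words if matches_truncation("admin*", w)]
print(hits)
```

<p>One short term stands in for what would otherwise be a long OR statement such as (administrator or administration or administer or administered).</p>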
<p><strong>PARENTHESES ( )</strong></p>
<p>Parentheses must be used to encapsulate OR statements for the search engines to execute them properly.</p>
<p>For example: (apache or weblogic or websphere)</p>
<p>If you don’t enclose all of your OR statements, your search may run but it will NOT run as intended.</p>
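<p>To see why the parentheses matter, consider this small Python sketch. It assumes the common convention, which Python itself follows, that AND is evaluated before OR; individual search engines may order the operators differently, which is one more reason to be explicit. The three flags describe a hypothetical resume that mentions weblogic but not Java or apache.</p>

```python
# Each flag records whether a hypothetical resume mentions the term.
java = False
apache = False
weblogic = True

# Without parentheses, Python evaluates "and" before "or",
# so this is read as: (java and apache) or weblogic
without_parens = java and apache or weblogic

# With parentheses, the OR group is evaluated first,
# so Java stays a hard requirement.
with_parens = java and (apache or weblogic)

print(without_parens)  # True: the resume matches even though Java is missing
print(with_parens)     # False: Java is required, so the resume is excluded
```

<p>The unparenthesized version runs without complaint, but it returns a resume that has no Java at all.</p>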
<p><strong>QUOTATION MARKS &#8220; &#8221;</strong></p>
<p>Quotation marks must be used when searching for exact phrases of more than one word, or else some search engines will split the phrase up into single word components.</p>
<p>For example: “Director of Tax” will only return &#8220;Director of Tax.&#8221; If you search for Director of Tax without the quotation marks, some search engines will split the phrase into the individual words Director and Tax and highlight them as relevant matches even when they do not appear as an exact phrase.</p>
<p><strong><em>Bonus:</em></strong> Google auto-stems every search term, so if you are looking specifically for the word manager, it will still return managed, management, etc. – even if you don’t want it to. If you put quotation marks on a single word in Google, it will defeat the auto-stemming feature and only return that specific word.</p>
<p><strong>NOT </strong></p>
<p>The NOT operator excludes specific search terms and will not return any results with that term (or terms) in them.</p>
<p>Example: If you were searching for an I.T. Project Manager, you may want to use the NOT operator to eliminate false positive results &#8211; results that mention your search terms but do not in fact match your target hiring profile. In this case, you could run: &#8220;project manager&#8221; and not construction &#8211; this search will return results that mention &#8220;project manager&#8221; but exclude any that also contain the word &#8220;construction.&#8221;</p>
<p>On all of the major job boards and some ATS&#8217;s, you can use the NOT operator in conjunction with an OR statement.</p>
<p>Example: .Net and not (Java or JSP or J2EE) &#8211; that search will not return any results with any mention of Java, JSP, and/or J2EE.</p>
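<p>The logic of that search can be sketched in a few lines of Python. The function <code>matches</code> and the sample term sets are hypothetical; the point is only that a result is excluded as soon as it mentions any one of the terms inside the NOT group.</p>

```python
def matches(terms):
    """Evaluate: .Net and not (Java or JSP or J2EE) against a set of lowercase terms."""
    excluded_group = "java" in terms or "jsp" in terms or "j2ee" in terms
    return ".net" in terms and not excluded_group


# Made-up resumes, each reduced to the set of terms it mentions.
dotnet_only = {".net", "c#", "sql"}
dotnet_and_jsp = {".net", "jsp"}
java_only = {"java", "j2ee"}

print(matches(dotnet_only))     # True: mentions .Net and none of the excluded terms
print(matches(dotnet_and_jsp))  # False: a single excluded term is enough to drop it
print(matches(java_only))       # False: never mentions .Net
```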
<p><strong><em>Bonus: </em></strong>NOT has 2 main uses<br />
#1 Excluding words you do not want to retrieve to reduce false positive results (most common usage)<br />
#2 Starting with a very restrictive search with many search terms, you can use the NOT operator to systematically and progressively loosen the search into mutually exclusive result sets (not so common usage, but very effective strategy)</p>
<p>For example:<br />
Search #1 “Project Manager” and SQL and Spanish<br />
Search #2 “Project Manager” and SQL and not Spanish<br />
Search #3 “Project Manager” and not SQL and Spanish</p>
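<p>A quick way to convince yourself that these three searches return mutually exclusive result sets is to model them in Python. Everything here is an assumption made for illustration: the five resumes are invented, and the <code>search</code> helper simply requires every listed term and rejects any excluded term.</p>

```python
# Hypothetical resumes: each id maps to the set of terms it mentions.
resumes = {
    1: {"project manager", "sql", "spanish"},
    2: {"project manager", "sql"},
    3: {"project manager", "spanish"},
    4: {"project manager"},
    5: {"sql", "spanish"},
}


def search(required, excluded=()):
    """Return ids of resumes with every required term and no excluded term."""
    return {
        rid
        for rid, terms in resumes.items()
        if all(t in terms for t in required)
        and not any(t in terms for t in excluded)
    }


search_1 = search(["project manager", "sql", "spanish"])
search_2 = search(["project manager", "sql"], excluded=["spanish"])
search_3 = search(["project manager", "spanish"], excluded=["sql"])

print(search_1, search_2, search_3)

# No resume appears in more than one result set.
print(search_1.isdisjoint(search_2))
print(search_1.isdisjoint(search_3))
print(search_2.isdisjoint(search_3))
```

<p>Because each later search uses NOT to rule out what an earlier search already covered, you never review the same resume twice as you loosen the requirements.</p>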
<p>In the near future, I will be writing posts reviewing the search operators and symbols of the major Internet search engines, as well as powerful extended Boolean operators and functionality. Check back often or simply subscribe to my feed.</p>
<p>If there is something you would like to see me post about with regard to Boolean logic and search tactics and strategies &#8211; let me know. Thanks!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.booleanblackbelt.com/2008/12/basic-boolean-operators-explained/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
	</channel>
</rss>

