Boolean Black Belt

Leveraging social networks, resume databases, and the Internet for sourcing and recruiting

  • FREE Sourcing + Recruiting Resources
  • Who is the Boolean Black Belt?
  • Contact Me
  • Copyright, Disclaimer, Photos


Extended Boolean: Proximity and Weighting

Posted on November 10, 2008

Most sourcing, recruiting, and staffing professionals are familiar with the “standard” Boolean operators of AND, OR, and NOT. However, I have found that few are familiar with “extended” Boolean functionality, such as proximity (or adjacency) and term weighting.

Beyond Basic Boolean

Extended Boolean offers sourcers and recruiters significantly more control, power, and precision when executing searches, and in the hands of an expert it can enable semantic search. Semantic search uses the science of meaning in language to produce highly relevant search results, rather than leaving the user to sort through a list of loosely related keyword matches.

Relevance is Key

Ultimately, any sourcing or recruiting professional knows that what’s most critical in running Boolean searches on the Internet, a job board, or in an internal resume database, is getting relevant results. According to Wikipedia, “relevance” denotes how well a retrieved set of documents (or a single document) meets the information need of the user.

For sourcing and recruiting, relevant results are typically defined as resumes or profiles of (or information about) potential candidates whose experience and capabilities closely match the hiring profile or job opening that the sourcer or recruiter is trying to find candidates for.

I’d argue that the value of any source of information (resume database, the Internet, etc.) lies less in the information contained within, and more in the ability of a user to extract out precisely and completely what the user needs – finding and retrieving any and all appropriately qualified candidates. Information has no value to you if you are unable to find it and take action on it.

So how can extended Boolean help sourcers and recruiters find more relevant results? Let’s take a look at term weighting first.

Variable Term Weighting

Talented sourcers and recruiters know that not all terms are equally important in a query; in most searches, certain terms matter more than others. In a standard Boolean query, however, all search terms are weighted equally. Unfortunately, many search engines and database search interfaces simply assign relevance by the number of search term “hits” in each document, and in most cases the simple frequency of search terms does not correlate with relevance. This is where the derisive description “buzzword matching” comes from: it implies that little skill is involved in running Boolean searches that merely count matched keywords.

Using an information technology hiring profile as an example: a sourcer looking for candidates with significant experience administering Windows servers and Exchange email servers might create a simple Boolean query such as this: Windows AND Exchange AND server* AND admin*. Because Windows is by nature a very common term in resumes, that search is highly likely to rank as most relevant the Windows systems administrators who mention Windows many times in their resume/profile and happen to mention Exchange only once or twice. The sourcer is then left to sort through a large volume of results to find the candidates who have actually been primarily responsible for administering Exchange servers as well as Windows servers.

Search engines that offer users the ability to assign different weights to each search term enable sourcers and recruiters to move beyond simple buzzword matching and take control of the relevance of the results. Essentially, with variable term weighting you can assign a number value to words to increase their weight when ranking retrieved documents – which does not change the TOTAL number of results, but the ORDER of the results.

Using the same example, a sourcer using a search engine that supports variable term weighting could create a Boolean search string such as this: Windows AND Exchange:30 AND server* AND admin* (the exact weighting syntax varies by search engine). That query will return the same number of results as the unweighted search. However, it will rank the results heavily in favor of resumes/profiles that mention Exchange more often relative to the other search terms, increasing the likelihood that the sourcer can quickly identify candidates who have been responsible for administering and supporting Exchange servers. By employing variable term weighting, the sourcer has increased the relevance of the top-ranked results.
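To make the difference concrete, here is a minimal Python sketch of how a hit-count ranker behaves with and without a weight on Exchange. Everything in it is an assumption for illustration: the three toy resumes, the function names, and the scoring rule (hits multiplied by weight, summed) are hypothetical and are not how any particular search engine computes relevance.

```python
import re

def tokenize(text):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def term_hits(tokens, term):
    """Count hits for a term; a trailing * matches any word with that prefix."""
    if term.endswith("*"):
        prefix = term[:-1].lower()
        return sum(1 for t in tokens if t.startswith(prefix))
    return tokens.count(term.lower())

def score(text, weighted_terms):
    """AND logic: return None if any term is missing, else the weighted hit total."""
    tokens = tokenize(text)
    total = 0
    for term, weight in weighted_terms.items():
        hits = term_hits(tokens, term)
        if hits == 0:
            return None
        total += hits * weight
    return total

def rank(resumes, weighted_terms):
    """Return the names of matching resumes, highest score first."""
    scored = [(name, score(text, weighted_terms)) for name, text in resumes.items()]
    matched = [(name, s) for name, s in scored if s is not None]
    return [name for name, s in sorted(matched, key=lambda pair: pair[1], reverse=True)]

# Hypothetical resumes, invented for this illustration.
resumes = {
    "windows_admin": "Windows server administrator. Windows 2003, Windows 2000, "
                     "Windows desktop support. Some Exchange exposure.",
    "exchange_admin": "Administered Exchange servers. Exchange 2003 and "
                      "Exchange 2007 migration on Windows.",
    "developer": "Java developer on Windows.",
}

# Windows AND Exchange AND server* AND admin*
unweighted = {"windows": 1, "exchange": 1, "server*": 1, "admin*": 1}
# Windows AND Exchange:30 AND server* AND admin*
weighted = {"windows": 1, "exchange": 30, "server*": 1, "admin*": 1}

print(rank(resumes, unweighted))  # the Windows-heavy resume ranks first
print(rank(resumes, weighted))    # the Exchange-heavy resume ranks first
```

Both searches return the same two resumes; only the order changes, which is the behavior described above.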

Now, let’s take a look at proximity functionality:

Proximity

Proximity search functionality enables a user to search for specific terms that are mentioned close to other specific terms. An adept sourcer or recruiter knows that documents with the word “computer” mentioned close to the word “science” will often have a different meaning and relevance than documents that simply mention the words “computer” and “science” anywhere within them.

There are three main types of proximity searching: fixed proximity, variable (configurable) proximity, and adjacency. For the purposes of this post, I will focus only on fixed and configurable proximity.

Fixed Proximity

Fixed proximity is most commonly represented by the NEAR operator. The search engines that recognize and support the NEAR operator typically define it as a fixed maximum distance between terms, usually somewhere between 10 and 16 words (specific search engines differ, so check their documentation).

Using the example of a Windows and Exchange administrator, a sourcer could craft this search using the NEAR operator: Windows AND Exchange NEAR admin* AND server*. On an engine where NEAR means within 16 words, that search will ONLY return resumes/profiles that mention Exchange within 16 words of any word starting with the root admin (administrator, administration, administer, administered, etc.). Requiring that Exchange be mentioned in close proximity to admin* will dramatically improve the relevance of the search results, typically returning candidates who have a title using both terms and/or who talk about being responsible for Exchange administration.

Here are some examples taken from actual resumes that demonstrate the variety of relevant results that can be retrieved with the above search:
  • Managed & administered more than 300 Exchange Servers
  • Provisioned & administer multiple Exchange 5.5/2003 servers
  • Not only are there administration duties for Exchange and Blackberry…
  • Exchange/RightFax administrator
  • Installing, Configuring, and Administering Microsoft Exchange 2000 Server
  • Administer a Microsoft Exchange 2003/2007 environment
  • 8+ years of expertise as a System Administrator in Windows 2003 family, Windows 2000 family, MS Exchange 5.5, MS Exchange 2000, and Exchange 2003
  • I am proficient with the following skills; planning, installation and administration of Windows Active Directory, Windows Servers, Exchange Server
  • Windows Server Support, Active Directory,Exchange Server 2000, 2003 administration and Blackberry Server administration
  • Administer Exchange 2003 Server for corporate email

As you can see, being able to control the proximity of specific search terms essentially increases the likelihood of returning results of candidates who have had administrative responsibility for Exchange servers, effectively increasing the relevance of the results.
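As a rough illustration of what a NEAR check does, here is a small Python sketch. It assumes a fixed window of 16 words and measures distance as the difference between word positions; the constant `NEAR_WINDOW` and the function names are hypothetical, and real search engines may count distance differently.

```python
import re

NEAR_WINDOW = 16  # assumed fixed distance; actual engines differ

def tokenize(text):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def positions(tokens, term):
    """Word positions of a term; a trailing * matches any word with that prefix."""
    if term.endswith("*"):
        prefix = term[:-1].lower()
        return [i for i, t in enumerate(tokens) if t.startswith(prefix)]
    return [i for i, t in enumerate(tokens) if t == term.lower()]

def near(text, left, right, window=NEAR_WINDOW):
    """True if any occurrence of left is within `window` words of right, in either order."""
    tokens = tokenize(text)
    return any(
        abs(i - j) <= window
        for i in positions(tokens, left)
        for j in positions(tokens, right)
    )

# One of the resume lines quoted above: the terms sit next to each other.
close = "Administer Exchange 2003 Server for corporate email"

# A made-up document where the two terms are more than 16 words apart.
far = "Exchange user. " + " ".join(["filler"] * 20) + " Network administrator"

print(near(close, "Exchange", "admin*"))  # True
print(near(far, "Exchange", "admin*"))    # False
```

A plain Exchange AND admin* search would match both documents; the NEAR version keeps only the first.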

Fun fact:

  • Did you know that Monster and Exalead support the NEAR operator?

Configurable Proximity

A search engine that supports configurable proximity affords users the ability to precisely control the distance between specific search terms. This can produce even more relevant results than the NEAR operator, because the NEAR operator’s maximum range of 10-16 words can allow some non-relevant results to be returned. The farther apart words are mentioned, the less likely it is that they are semantically related. In fact, at 10-16 words, each could be mentioned in a separate bullet point or sentence on a resume and be completely unrelated.

However, with configurable proximity, a sourcer or recruiter can choose the maximum distance between search terms. Although search engines vary in their exact syntax, here is an example of the Windows and Exchange admin search using configurable proximity: Windows AND Exchange w/5 admin* AND server*. That search can ONLY return resumes or profiles that mention Exchange within 5 words of any word starting with the root admin (administrator, administration, administer, administered, etc.), regardless of order. A maximum distance of 5 words will dramatically increase the relevance of the search results, because two terms mentioned at such close range are more likely to appear in the same bullet point or sentence and thus more likely to be semantically related. Essentially, this search will only return people who specifically mention something about being responsible for administering Exchange at least once in their resume. By employing this kind of search, a sourcer is actually performing a semantic search, as they are looking specifically for people who talk about having a particular responsibility, not just for documents that contain certain words.
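Here is a minimal Python sketch of the w/N idea, showing how tightening the distance filters out a resume where the two terms fall in separate sentences. The parser only understands a single clause of the form `term w/N term`, the function names are hypothetical, and the first sample sentence is invented for illustration; this is an assumption-laden sketch, not the matching logic of any real search engine.

```python
import re

def tokenize(text):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def positions(tokens, term):
    """Word positions of a term; a trailing * matches any word with that prefix."""
    stem = term.lower().rstrip("*")
    if term.endswith("*"):
        return [i for i, t in enumerate(tokens) if t.startswith(stem)]
    return [i for i, t in enumerate(tokens) if t == stem]

def min_distance(text, left, right):
    """Smallest word distance between any occurrences of the two terms, or None."""
    tokens = tokenize(text)
    gaps = [abs(i - j) for i in positions(tokens, left) for j in positions(tokens, right)]
    return min(gaps) if gaps else None

def matches(text, clause):
    """Evaluate one clause such as 'Exchange w/5 admin*' against a document."""
    parsed = re.match(r"^(\S+)\s+w/(\d+)\s+(\S+)$", clause.strip())
    if parsed is None:
        raise ValueError("expected a clause like 'Exchange w/5 admin*'")
    left, limit, right = parsed.group(1), int(parsed.group(2)), parsed.group(3)
    gap = min_distance(text, left, right)
    return gap is not None and gap <= limit

# Invented example: both terms appear, but in unrelated sentences.
unrelated = ("Supported Exchange mailboxes for 400 users. "
             "Later promoted to network administrator for the Cisco infrastructure.")

# One of the resume lines quoted earlier in this post.
related = "Administer a Microsoft Exchange 2003/2007 environment"

print(matches(unrelated, "Exchange w/16 admin*"))  # True: a NEAR-sized window lets it through
print(matches(unrelated, "Exchange w/5 admin*"))   # False: the tighter window rejects it
print(matches(related, "Exchange w/5 admin*"))     # True
```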

Fun facts:

  • Did you know that Exalead supports configurable proximity searching?
  • Did you know that you can integrate a free, open source search engine that supports configurable proximity and variable term weighting into your ATS or resume database? Check out Lucene.
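For readers who want to try Lucene, here is a small Python sketch that builds query strings in what I understand to be Lucene's classic query parser syntax, where a boost is written term^N and proximity is written as a quoted phrase followed by ~N. The helper names are hypothetical, and two caveats are assumptions worth checking against the Lucene documentation for your version: the classic parser does not expand wildcards inside a quoted phrase (so the sketch uses the full word administrator), and Lucene's ~N "slop" is not defined exactly the same way as "within N words".

```python
def boost(term, weight):
    """Term weighting in Lucene classic syntax, e.g. Exchange^30."""
    return f"{term}^{weight}"

def within(term_a, term_b, distance):
    """Proximity in Lucene classic syntax, e.g. "Exchange administrator"~5."""
    return f'"{term_a} {term_b}"~{distance}'

def all_of(*clauses):
    """Join clauses with the AND operator (uppercase, as Lucene expects)."""
    return " AND ".join(clauses)

weighted_query = all_of("Windows", boost("Exchange", 30), "server*", "admin*")
proximity_query = all_of("Windows", within("Exchange", "administrator", 5), "server*")

print(weighted_query)   # Windows AND Exchange^30 AND server* AND admin*
print(proximity_query)  # Windows AND "Exchange administrator"~5 AND server*
```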

Conclusion

Hopefully you can see how being able to control the proximity of two search terms can yield results that are FAR more relevant than results that simply mention the two terms anywhere in a document or form – this is the critical difference between the semantic similarity between a search and its results vs. the lexical similarity between a search and its results.

There are countless ways you can apply extended Boolean functionality such as variable term weighting and proximity searching to nearly any industry or hiring profile to create searches that return highly relevant results - results that are more relevant than those that can be achieved with standard Boolean logic. Using a search engine that supports both configurable proximity and variable term weighting can empower sourcers and recruiters to quickly find large volumes of highly relevant results, increasing productivity and achieving JIT talent identification and acquisition.

Extended Boolean, Semantic Search

Comments

14 Responses to “Extended Boolean: Proximity and Weighting”
  • Amber Lynn says:

    Saw this blog today and thought you might be able to help. I’m new to boolean logic and tasked with finding software engineers on Facebook from top schools. Can you blog about creating strings for such a search?
    Better yet, could you reply with example Boolean Strings for finding software engineers on Facebook who are from Top 10 schools (Stanford, Berkeley, MIT, CMU, etc) and live in the Silicon Valley?
    Thanks, Amber
    headhunt AT ymail

  • Cyclocross says:

    “Did you know that AltaVista supports configurable proximity searching?”

    How?? I cut and pasted your suggested search in Alta Vista and the “w/5″ function didn’t work at all – it gave me results that had the text “w/5″ instead. switching to “NEAR/5″ also didn’t work.

    What’s the correct format for configuring proximity in Alta Vista?

  • Boolean Black Belt says:

    Cyclocross,
    Oops – you got me on that one – that was supposed to read “Did you know that Exalead supports configurable proximity searching?” I’ve since changed it – thank you for pointing out this error.

    AltaVista used to support configurable proximity searching (http://www.searchengineshowdown.com/features/av/review.html), but since Yahoo took over, they no longer support most of the advanced search features AltaVista once boasted.

  • April says:

    Which search engine that supports variable term weighting?

  • Andrea says:

    In need of a boolean search string for Sr Software Engineers with who have experience with Java, C++ and or Ojective-C will be developing software applications for mobile devices in Seattle – can you help?

  • Stephen says:

    Guys, this is a guide. He’s not going to do the work for you.



About Me

I have significant experience with and passion for leveraging technology and Lean principles to achieve high quality hires in a Just-In-Time manner. I'm a power user of Social Media, ATS and CRM applications, job board resume databases, the Internet, Boolean queries and semantic search for recruiting.

Copyright 2010 Boolean Black Belt. All rights reserved
