IT guys, is it costly to update a (name) search engine?

Grrr · March 17, 2017, 11:17pm

At work we have an employee search engine.

If you don’t know their email, or phone number (or whatever) just type in their name and up pops the information.

The thing is though, if you get even ONE letter wrong, the engine has no clue what you’re looking for. It gives suggestions, but the suggestions it gives make no freaking sense.
This search engine has been like this for well over a decade (probably closer to 20 years) and I’m just wondering why nobody in IT has bothered to fix it.
Is updating a simple name search engine costly?

snfaulkner · March 17, 2017, 11:19pm

Are you talking about Active Directory?

Grrr · March 17, 2017, 11:26pm

I don’t believe I am. I’m talking about an intranet-based search function that finds all employees of the company.

Mangetout · March 17, 2017, 11:33pm

Without knowing what you’re actually describing, it’s impossible to say.

If you’re talking about a contacts search that’s built into your mail client or CRM, then it might not be amenable to change - your IT people may just be delivering an implementation of the system that the business has bought from a third party.

If it’s a search engine in an intranet page that has been developed in-house, yes, it could probably be made better. Has anyone asked your IT department to make it better? They might not even know you’re having a problem with it. Maybe they never use it themselves - it’s a really common misconception that the folks in IT should know every feature and process of every application as well as the users do - but it’s simply not possible for us to know the whole of everyone else’s job in addition to our own.

Grrr · March 17, 2017, 11:47pm

It’s this. It’s just a little box in the corner of the company’s homepage (that is only accessible to employees) that allows you to search for employees one of four ways: employee number, phone number, email or, more commonly, name.
And it’s a big worldwide company, I can’t imagine they would be oblivious to it. But I suppose it’s possible.

Shagnasty · March 17, 2017, 11:57pm

I agree that it could cost any amount of money. I work for a mega-corp and such a change would most likely cost in the many tens to hundreds of thousands of dollars and take many months at best but that is an extreme case.

How big a company are we talking about in terms of people? I could build an Access database with pretty good search capabilities in an afternoon but it wouldn’t be enterprise ready. I could build a SQL Server database with a web interface in about a week as long as I had access to employee data, but getting that access is a problem on its own for many reasons.

I would just try to account for common errors however. What you are talking about requires some sophisticated fuzzy logic like Google uses and that is a lot harder than it sounds to develop on your own. You may think that computers should just be able to tell that “Marey Johson” = “Mary Johnson” but they can’t very easily because they are fundamentally dumb and don’t have any common sense.

There is obviously a way to do it because large search engines do it but it isn’t cheap or easy. There may be a commercial product that already does this type of thing well but I have never seen one personally and it wouldn’t be cheap or easy to integrate with existing systems even if you find the perfect enterprise-level 3rd party software.

slash2k · March 18, 2017, 12:14am

Soundex, metaphone, and Levenshtein algorithms are readily available in practically any major database implementation, so accounting for common errors is quite doable. These sorts of algorithms are what drive spell-checkers and “sounds like” applications, so getting from Marey Johson to Mary Johnson is not at all difficult or expensive.

That said, somebody at the company has to want to make the change, and either they don’t know it needs doing or it’s never risen to the top of the priority list.

Grrr · March 18, 2017, 12:18am

Thanks Shag. Yeah, the company has over 40k employees, many of whom are from foreign countries. So spelling the name isn’t always that straightforward.

I was thinking there has to be some software out there that would remedy this situation. But I guess not. Thanks for the elucidating post.

slash2k · March 18, 2017, 12:38am

“SELECT employee_name FROM database WHERE levenshtein(employee_name, my_guess_of_name) <= 4” is how it looks in PostgreSQL (an open-source database); Oracle has something very similar. Levenshtein distance is a measure of the similarity of two strings; basically, how many characters do you need to add, change, or delete to get from one to the other. To get from ‘Marey Johson’ to ‘Mary Johnson’, you need two modifications (drop the ‘e’ in Marey, insert an ‘n’ in Johson), so the Levenshtein distance is 2. The Levenshtein distance from Voitiwa to Wojtyla is 4, for another example. Adjust how similar the two words have to be to make your search fuzzier or not-so-fuzzy.
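To make the Levenshtein idea above concrete, here is a minimal Python sketch of the distance calculation and a threshold search. It is only an illustration, not the PostgreSQL implementation; `fuzzy_search`, its `max_distance` default, and the sample names are assumptions made up for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    # previous[j] = distance between the prefix of a handled so far
    # (before the current character) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def fuzzy_search(guess, names, max_distance=2):
    """Hypothetical helper: names within max_distance edits, closest first."""
    g = guess.casefold()
    scored = [(levenshtein(g, n.casefold()), n) for n in names]
    return [n for d, n in sorted(scored) if d <= max_distance]


directory = ["Mary Johnson", "Mark Jones", "Karol Wojtyla"]
print(levenshtein("Marey Johson", "Mary Johnson"))
print(fuzzy_search("Marey Johson", directory))
```

Raising `max_distance` makes the search fuzzier; comparing the guess against every name like this is fine for a sketch but a real directory would probably want the database to do the work.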

There are other algorithms out there; this is merely one that I like for fuzzy matching of names.

Shagnasty · March 18, 2017, 1:15am

That is an intelligent answer that will probably work for my “Marey Johson” example but that is only a small subset of the real problem. I work with tons of international employees as well (mainly from India, Russia and China) and stock algorithms simply aren’t going to cut it for looking up many of those names. Many of their real names aren’t even written in the Latin alphabet and it is hard to guess how most of them are pronounced at all even when transliterated.

This is a multi-tiered problem that isn’t very easy to solve. For example, many people go by names other than their first name (I have myself since birth). It could be a designated nickname or a middle name. Women often change their last name when they get married. Other people have names with spellings that don’t correspond well to their pronunciation. There are countless other complications as well that take some pretty sophisticated data and pseudo-AI rather than a single algorithm to solve.

Many of those issues are somewhat addressable but I have never seen a good solution to the problem as a whole.

slash2k · March 18, 2017, 4:11am

I guess I’m taking a narrower interpretation of the problem to be solved.

For example, if the task at hand is “how do I find this person I know who works at the company when I’m not sure how to spell their name?,” then presumably I have at least an idea of what name they use professionally, in which case I don’t care if this is their “legal” name, a nickname, a married name, whatever. The problem is how is this person known to fellow employees, not how is this person identified on their passport. (And if you really need alternative names, an “a.k.a.” table or set of fields and a query with an OR clause doesn’t add a lot of complexity.)
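The “a.k.a.” table with an OR clause could look something like the sketch below, using Python’s built-in `sqlite3` module with an in-memory database. The table names, column names, and sample rows are all hypothetical, chosen only for illustration; a real directory would have its own schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per employee, plus any number of aliases
# (nicknames, maiden or married names, and so on).
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, display_name TEXT);
    CREATE TABLE employee_aliases (employee_id INTEGER, alias TEXT);
    INSERT INTO employees VALUES (1, 'Mary Johnson'), (2, 'William Chen');
    INSERT INTO employee_aliases VALUES (1, 'Mary Smith'), (2, 'Bill Chen');
""")


def find_employee(conn, name):
    """Match the display name OR any recorded alias, ignoring case."""
    return conn.execute(
        """
        SELECT DISTINCT e.id, e.display_name
        FROM employees AS e
        LEFT JOIN employee_aliases AS a ON a.employee_id = e.id
        WHERE lower(e.display_name) = lower(?)
           OR lower(a.alias) = lower(?)
        ORDER BY e.id
        """,
        (name, name),
    ).fetchall()


print(find_employee(conn, "bill chen"))
print(find_employee(conn, "Mary Smith"))
```

This sketch only does exact matching on each name; the same OR structure would apply if the equality tests were swapped for a fuzzy comparison.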

As far as international employees, what are the parameters of the employee directory? Is the intent that employees in North America or Western Europe will be typing in Cyrillic or kanji to search the directory, or are non-Western employees known professionally to their Western colleagues by names in the Latin alphabet (and/or vice versa)? Spellings that don’t correspond well to their pronunciation aren’t limited to non-Western names (see Irish Gaelic or Polish names for more familiar examples), but Soundex, Daitch–Mokotoff Soundex, and similar phonetic algorithms can give reasonably decent results.

A perfect solution is indeed a complex problem, but a “good enough” solution isn’t, and a “good enough” solution is a vast improvement over the status quo identified by the OP (“if you get even ONE letter wrong, the engine has no clue what you’re looking for”).

Shagnasty · March 18, 2017, 4:44am

I agree for the most part. Simple search tools in Microsoft products like Outlook and Communicator suck and they could easily be improved upon for some degrees of “easy” but that doesn’t mean that a local IT department can just do that on its own. I work in a heavily Portuguese facility and it is difficult even for me to find people that I know quite well, including my office-mate sitting 8 feet away (in case you don’t know, Portuguese last names commonly have a weird mix of vowels that are very hard for English speakers to spell even if they know how to pronounce them). It is ridiculous that I can’t just type in his uncommon first name and the first few letters of his last name and get a result but that is the way it works now.

I have been a systems analyst and developer by profession for 20 years and I was estimating how much time and effort it would take to fix such a thing for a mega-corp. It isn’t a weekend project and would take big money to solve the problem as a whole.

slash2k · March 18, 2017, 5:51am

I work IT too (database admin), and have worked on projects for “good enough” name searches. I will have to respectfully disagree with your estimates of big money, at least for the case of a project within the control of the local IT department. True, the project I have in mind particularly worked mainly with names of European origin, and names as recorded by a government agency in the U.S.; while writing the test cases and doing all of the validation/checking took more time (and I’ve added some refinements over the years), the actual development to add a fuzzy name search to their database application was the work of only a day or so. It’s not a perfect solution, but it handles the multiple millions of names in their database to their satisfaction.

Soundex, for example, or one of its variants might be a “good enough” solution to a Portuguese name problem: skip the vowels. If you can get the consonants (or at least an approximation thereof) in the right order, you’ll find the name. [Rapozo and Reposa, e.g., have the same Soundex value.] Or this may not be good enough for your particular problem, and you might need some other kinds of fuzziness. There are well-regarded software packages, both open-source and commercial, that can handle some pretty complex problem sets. (Metaphone 3, e.g., is a substantial improvement on earlier algorithms; the source code is available for purchase in Java, C#, Python, or several other variants for the princely sum of $240.)
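For anyone curious what “skip the vowels” means in practice, here is a small Python sketch of classic American Soundex. It assumes plain ASCII names and is meant as an illustration rather than a production implementation; databases that ship a Soundex function may differ in edge cases.

```python
# Letter-to-digit table for American Soundex. Vowels (and Y) get no
# code; H and W are handled separately in the loop below.
CODES = {}
for digit, letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                       "4": "L", "5": "MN", "6": "R"}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """First letter plus up to three digits, padded with zeros."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


print(soundex("Rapozo"), soundex("Reposa"))
```

Indexing each name by its Soundex value and looking up the value of whatever the user typed would give a “sounds like” search, at the cost of some false matches.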

Now this won’t work if the company is running exclusively off-the-shelf software without any access to the code; only Microsoft can add to the base code of Microsoft Outlook. However, my experience has been that big government/corporate projects involve a fair amount of writing customized code, even if only to glue together disparate systems. Your experience may be different.

edwardcoast · March 20, 2017, 5:05am

Depends on the platform, how it was written, and how much bureaucracy the company follows before anything gets done.

To get a real answer, find out who has ownership of that code to maintain it and ask them to estimate how many hours it would take to add your request. I suggest doing this before making a formal request.

Mangetout · March 20, 2017, 7:46am

Has anything else about the company’s intranet changed recently? (in a structural/functional sense - not just content).

My prognostication for the outcome here is that it will be something like this (at least, in my experience, the reason why companies use sub-par software often is something like this):

[ul]
[li]The company bought the intranet solution as a packaged deal from an external developer some time ago (or someone developed it in-house, and that person has left).[/li]
[li]There was no handover of the development process - only the product itself[/li]
[li]The original developer had his own ideas about coding best practice, so the thing is difficult to understand, let alone maintain[/li]
[li]Changing anything now is difficult and risky[/li]
[li]Replacing it is somewhere on the roadmap, but at a very low priority due to scoring of cost and necessity - it will never happen until a new requirement comes along, or the thing breaks and cannot be restored[/li]
[/ul]

Dead_Cat · March 20, 2017, 12:08pm

In my (limited) experience it will be the last option - it’s simply not enough of a priority to justify a business decision to spend time and money on it.

There is a school of thought that says you don’t want to work for a company that has the time to do these things, as it means their IT/development team doesn’t have anything worthwhile to do - which in turn could mean the company is stagnating and primed for failure :).
