The term name-centric processing refers to any automated or manual process in which a successful outcome is wholly or principally dependent on technologies and human judgment applied to named entities (NEs). Such processes can range in complexity from a simple membership look-up to sophisticated, real-time assessments of vital intelligence data for military decision-making.
Examples of prominent business processes that are typically fully or principally name-centric:
- List-screening for anti-money laundering, customer identity processing and other mandated financial compliance activities
- Border control, and visa and passport issuance
- Intelligence integration and corroboration for anti-terrorism, anti-drug and law-enforcement entities
- Passenger Name Record curation and “no-fly list” name-checking for air carriers
- Customer data integration, record linkage/conflation and Single Customer View aggregation
Although the nature and the importance of this class of processes can differ profoundly, and the resources (human and technical) used to drive the processes can range from modest to state-of-the-art, the common thread they share is that names interact in ways that determine, fully or in significant part, the processing outcome. Decisions are made, and results are delivered.
It is difficult to imagine any organization, commercial or public, in which there are no name-centric processes to be found. They may be superficial in some organizations and essential in others, but they are widely, if not universally, present.
Seen from this somewhat unusual but unifying perspective, name-centric processes (NCPs) of varying complexity and bottom-line impact across an organization can all be held to the same set of metrics, to determine whether the resources (human and technical) allocated are sufficient to deliver the required outcome reliably. For example, a simple exact-match name look-up may be more than enough to find a customer record in a retail setting, but more sophistication and flexibility are certainly appropriate when vetting passenger names against a “no-fly list” comprising known terrorists and hijackers.
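To make the contrast between exact-match look-up and more flexible matching concrete, here is a minimal sketch in Python. It assumes a generic string-similarity ratio from the standard library (`difflib.SequenceMatcher`) as a stand-in for "more sophistication"; it is not a name-matching method endorsed here, and the function names, the sample list and the 0.7 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def exact_lookup(name, name_list):
    """Return True only if the name appears verbatim (ignoring case)."""
    return name.casefold() in {entry.casefold() for entry in name_list}


def fuzzy_lookup(name, name_list, threshold=0.7):
    """Return (entry, score) pairs whose similarity to `name` meets the threshold.

    The threshold is an illustrative assumption, not a recommended setting.
    """
    hits = []
    for entry in name_list:
        score = SequenceMatcher(None, name.casefold(), entry.casefold()).ratio()
        if score >= threshold:
            hits.append((entry, round(score, 2)))
    return hits


# Hypothetical screening list and query spelling.
screening_list = ["Mohammed", "Smith"]
query = "Muhammad"

print(exact_lookup(query, screening_list))   # exact match misses the variant spelling
print(fuzzy_lookup(query, screening_list))   # similarity matching surfaces it
```

A generic character-similarity score like this one knows nothing about names as such, which is one reason the choice of algorithm deserves scrutiny for high-stakes processes.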
The metrics most often applied to NCPs are those shared with most other information-retrieval processes: precision, the share of returned items that are relevant, and recall, the share of relevant items that are returned (see: http://en.wikipedia.org/wiki/Precision_and_recall).
These metrics are also sometimes combined mathematically to yield a single value, the F-measure. Many documented claims for accuracy in commercial and custom NCPs are expressed with one or more of these basic measures.
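As a rough illustration of how these measures are computed, the sketch below scores a set of returned names against a set of names judged relevant. The function name, the sample names and the `beta` parameter (which weights recall against precision; 1.0 gives the familiar harmonic mean) are assumptions made for the example.

```python
def precision_recall_f(returned, relevant, beta=1.0):
    """Compute precision, recall and F-measure for two collections of names."""
    returned, relevant = set(returned), set(relevant)
    true_positives = len(returned & relevant)

    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(returned) if returned else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0

    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0

    beta_sq = beta * beta
    f_measure = (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
    return precision, recall, f_measure


# Hypothetical example: four names returned, three names judged relevant.
returned_names = {"Zhang Wei", "Chang Wei", "Zheng Wei", "Zhao Wei"}
relevant_names = {"Zhang Wei", "Chang Wei", "Cheong Wei"}

precision, recall, f_measure = precision_recall_f(returned_names, relevant_names)
print(f"precision={precision:.2f} recall={recall:.2f} F={f_measure:.2f}")
```

Note that the calculation takes the set of relevant names as given; everything depends on who decided which names belong in it.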
Undoubtedly, NCPs share many characteristics with other kinds of information-retrieval systems. However, they also differ in important ways from those systems, and those differences can weaken the value of the precision/recall metrics for assessment of NCPs. Calculation of precision and recall is only possible when there is a consistent distinction between relevant NEs (“positives”) and irrelevant NEs (“negatives”). As a practical matter, this is where the assessment mechanism breaks down, because the relevance of NCP outcomes is effectively bounded by the comprehension of the user(s). For example, a monolingual, English-speaking file clerk cannot be expected to know the more than five hundred observed variant spellings of the common Islamic name Mohammed, or that the Chinese surname represented by the character 張, which is shared by something like 100 million Chinese at present, can be Romanized variously as Zhang, Chang, Zoeng, or Cheong, among others (see: http://en.wikipedia.org/wiki/List_of_common_Chinese_surnames).
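The effect of user comprehension on the metrics can be sketched with a small, entirely hypothetical example: the same set of returned records is scored twice, once against the relevance judgments of a reviewer who recognizes only the spelling Zhang, and once against those of a reviewer who knows the four Romanizations mentioned above. The record names, the returned set and the helper functions are assumptions made for illustration.

```python
# Hypothetical mini-collection, surname first. "Zheng" is treated here as a
# different surname and therefore as a false match for 張.
COLLECTION = ["Zhang Wei", "Chang Wei", "Cheong Wei", "Zoeng Wei", "Zheng Wei"]

# What a hypothetical matching system returned for the query "Zhang Wei".
RETURNED = {"Zhang Wei", "Chang Wei", "Zheng Wei"}


def relevant_for(known_surnames):
    """Records a reviewer would call relevant, given the spellings they recognize."""
    return {name for name in COLLECTION if name.split()[0] in known_surnames}


def precision_recall(returned, relevant):
    """Precision and recall of `returned` against one reviewer's judgments."""
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


clerk_view = relevant_for({"Zhang"})
specialist_view = relevant_for({"Zhang", "Chang", "Zoeng", "Cheong"})

clerk_scores = precision_recall(RETURNED, clerk_view)
specialist_scores = precision_recall(RETURNED, specialist_view)

print("clerk:      precision=%.2f recall=%.2f" % clerk_scores)
print("specialist: precision=%.2f recall=%.2f" % specialist_scores)
```

In this sketch the clerk sees perfect recall and poor precision, while the specialist sees better precision and only half the relevant records retrieved, although the system output is identical in both cases.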
Put another way, the precision and recall of NCPs, whether manual or automated or some combination of the two, are largely in the eye of the beholder. For that reason, Onomastic Resources supports a holistic view of NCPs within an organization that balances across three dimensions:
- The algorithm: what automated processes are used to control name interactions; what are their strengths and limitations; are they well-suited for the types of NEs on which they will operate?
- The data: how are NEs collected, represented, annotated, stored and retrieved by NCPs across the organization; for personal names, what name-model(s) can be used to assign internal structure in a name, if it is not provided by the bearer; what formal and/or structural constraints within the data-stores and data-management resources can degrade NEs?
- The user: what skills and training are requisite or provided for the users who consume NCP outcomes; what guidelines and ancillary data elements are provided to help users make consistent relevancy decisions; how well do user relevancy decisions align with corporate goals and requirements?
[See State of the Art to learn more about the most common techniques used across name-processing products in today’s marketplace.]
For many organizations, legacy NCPs represent a perceived balance between risk and opportunity, whether those processes are in-house or outsourced. Onomastic Resources offers a fresh perspective on that balance, and a more comprehensive, motivated way to allocate the human and technological resources that cope with names. Call us today for an assessment of name-processing in your organization, including risk evaluation, competitive advantage opportunities, and a defined program for continued improvement.