What Is Named Entity Recognition (NER) and How Does It Work?

Written by Jessica Schulze

The NER technique is used in many industries, from entertainment to health care. Learn why it’s popular and how it works in this article.

[Featured Image] Two artificial intelligence engineers discuss how named entity recognition will help their chatbot answer questions more effectively.

Named entity recognition (NER) is a technique in natural language processing (NLP), a subfield of artificial intelligence (AI) that relies heavily on machine learning (ML). Although it isn’t exactly a household name, named entity recognition powers much of the technology we use every day. It helps search engines return the results we seek and helps chatbots answer our questions in a human-like, conversational manner. In this article, you can learn more about how this technique works, who uses it, and why.

NER definition 

Named entity recognition, or NER, is a process that extracts information from text. It’s also referred to as entity chunking, entity extraction, or entity identification. The goal is to locate mentions of entities in text and sort them into predefined categories. Breaking this term down into two parts can help us better understand it:

Named Entity: A named entity is any object that can be referenced by name in text.

Recognition: NER systems are trained to recognize these objects and sort them into helpful classifications called entity types.
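
One way to picture the output of an NER system is as a list of labeled spans. The sketch below is illustrative only: the Entity class, the make_entity helper, and the label names ORG and PERSON are assumptions made for this example rather than part of any particular library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """One recognized mention: its text, its position, and its entity type."""
    text: str
    start: int  # character offset where the mention begins
    end: int    # character offset just past the mention
    label: str  # entity type, e.g. "ORG" or "PERSON" (hypothetical label names)


def make_entity(sentence: str, mention: str, label: str) -> Entity:
    """Locate `mention` in `sentence` and wrap it as an Entity."""
    start = sentence.index(mention)  # raises ValueError if the mention is absent
    return Entity(mention, start, start + len(mention), label)


sentence = "A24 released a movie starring Mia Goth"
entities = [
    make_entity(sentence, "A24", "ORG"),
    make_entity(sentence, "Mia Goth", "PERSON"),
]

for ent in entities:
    print(f"{ent.text!r} [{ent.start}:{ent.end}] -> {ent.label}")
```

Storing character offsets alongside the label lets downstream code highlight or link each mention without searching the text again.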

4 types of named entity recognition models

  1. Dictionary-based: Dictionary-based NER systems reference terms listed in dictionaries to identify their presence in text. Dictionaries can be any collection of words related to a specific field or domain. You can create one yourself or use public sources such as databases. 

  2. Rule-based: Rule-based NER systems rely on a set of instructions for extracting named entities from text. You must create the rules, which fall into two types: pattern-based rules, which relate to word forms and structure, and context-based rules, such as “if an abbreviation such as Mr. or Ms. precedes a name, then that abbreviation is the person’s honorific title.” These rules can also be combined with dictionaries.

  3. Machine learning-based: Machine learning-based NER systems use statistical models to identify entity names. To develop an ML-based NER system, you must train the model on annotated documents, in which each entity mention has been labeled with its entity type. The model learns patterns from these labeled examples and applies them to text it has not seen before.

  4. Hybrid systems: Hybrid NER systems combine more than one of the approaches listed above. 
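
To make the first two approaches concrete, here is a minimal sketch of a hybrid recognizer that combines a dictionary lookup with one context-based rule and one pattern-based rule. Everything in it is an assumption for illustration: the DICTIONARY contents, the two regular expressions, the label names, and the find_entities function are hypothetical and far simpler than a production system.

```python
import re

# Hypothetical dictionary (gazetteer): lowercase surface form -> entity type.
DICTIONARY = {
    "a24": "ORG",
    "netflix": "ORG",
    "new york": "LOCATION",
}

# Context-based rule: an honorific followed by one or two capitalized words
# suggests that those words are a person's name.
HONORIFIC_RULE = re.compile(
    r"\b(?:Mr|Ms|Mrs|Dr)\.\s+([A-Z][a-z]+(?:\s+[A-Z][a-z]+)?)"
)

# Pattern-based rule: a dollar sign followed by digits looks like money.
MONEY_RULE = re.compile(r"\$\d[\d,]*(?:\.\d+)?")


def find_entities(text):
    """Return (start, end, mention, label) tuples sorted by position."""
    found = []
    # Dictionary lookup: search for every known term, ignoring case.
    for term, label in DICTIONARY.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            found.append((m.start(), m.end(), m.group(0), label))
    # Rules: only the captured name is kept, not the honorific itself.
    for m in HONORIFIC_RULE.finditer(text):
        found.append((m.start(1), m.end(1), m.group(1), "PERSON"))
    for m in MONEY_RULE.finditer(text):
        found.append((m.start(), m.end(), m.group(0), "MONEY"))
    return sorted(found)


text = "Ms. Goth met Netflix executives in New York to discuss a $2,500,000 deal."
for start, end, mention, label in find_entities(text):
    print(f"{mention} -> {label}")
```

A sketch like this also shows the main cost of these approaches: every new entity or pattern has to be added by hand.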

Why is named entity recognition useful? 

NER is especially useful for analyzing unstructured text. In the context of data sets, “unstructured” refers to data that lacks a predefined organization or database format. For example, the body of an email or a news article is unstructured, while the same information entered into a table with labeled columns is structured. NER systems reduce the need for time- and resource-intensive human analysis, making them ideal for situations that involve large quantities of text.

What are examples of named entity recognition industry applications?

  • Customer service: NER models are used in customer service to power chatbots and organize data related to customer care. For example, a chatbot can identify entities in a user’s query, such as product names or order numbers, to determine context and respond appropriately. A customer support system can also route users to the appropriate departments by categorizing their complaints and matching them to resolutions.

  • Health care: Medical professionals use NER models to analyze large amounts of documentation regarding diseases, drugs, and patients. Being able to quickly identify and extract the most pertinent information from lengthy, unstructured text helps reduce research time. 

  • Finance: In the financial field, NER can be used to monitor trends and inform risk analyses. Aside from financial information such as loans and earnings reports, NER models can analyze company names and other relevant mentions on social media to monitor developments that may affect stock prices. 

  • Entertainment: Recommendation systems such as the ones you see on Netflix, Spotify, and Amazon can use NER models to pick out titles, artists, and other entities from your search history and the content you’ve recently interacted with.
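
As a rough illustration of the finance use case above, the following sketch counts how often watched companies are mentioned across a handful of social media posts. The company names, the WATCHLIST mapping, and the count_company_mentions function are all hypothetical, and a simple dictionary match stands in for a trained NER model.

```python
import re
from collections import Counter

# Hypothetical watchlist: lowercase surface form -> canonical company name.
WATCHLIST = {
    "acme": "Acme Corp",
    "acme corp": "Acme Corp",
    "globex": "Globex",
}

# Longest forms go first so "acme corp" wins over "acme" at the same position.
_FORMS = sorted(WATCHLIST, key=len, reverse=True)
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(form) for form in _FORMS) + r")\b",
    flags=re.IGNORECASE,
)


def count_company_mentions(posts):
    """Count how often each watched company is named across a list of posts."""
    counts = Counter()
    for post in posts:
        for match in _PATTERN.finditer(post):
            counts[WATCHLIST[match.group(1).lower()]] += 1
    return counts


posts = [
    "Acme Corp just announced record earnings!",
    "Is Globex buying Acme?",
    "Not a fan of the new Globex app.",
]
print(count_company_mentions(posts))
```

Mapping each surface form to a canonical name means that "Acme" and "Acme Corp" are counted as the same company.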

Named entity recognition example in NLP

Named entity recognition systems can be used to enhance other natural language processing tasks, such as parsing. For example, NER can increase the efficiency of part-of-speech tagging, which is the categorization of words by their part of speech depending on context.

How does named entity recognition work?

The named entity recognition process can be broken down into five steps:

  1. Tokenization: Text must first be split into smaller pieces, called tokens, that the NER system can process. Tokens are usually individual words or parts of words, although text may also be segmented into sentences first. For example, “A24 released a movie starring Mia Goth” may be split into the following tokens: A24, released, a, movie, starring, Mia, Goth.

  2. Identification: This step is where statistical methods or semantic rules come into play. The NER system can identify entities by format or capitalization. For example, the capitalization in “Mia” and the subsequent word “Goth” indicates a proper noun. 

  3. Classification: Now that the text has been broken down into identifiable pieces, each token can be sorted into predefined categories. Examples of these categories may include “company,” “person,” or “location.”

  4. Contextual analysis: To improve output accuracy, NER systems use context clues. Using the previous example, “Goth” will be recognized as a last name rather than a subculture since the identification process determined it to be a proper noun and the classification process placed it under the category of “person.” 

  5. Post-processing: The post-processing phase is used to refine the NER system’s results. You might use a knowledge base to enrich the data set it’s working with or fine-tune categorization rules to resolve ambiguities.
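
The five steps above can be sketched as a small pipeline. This is a toy illustration under stated assumptions, not how any particular NER library works: the COMPANIES, FIRST_NAMES, and PERSON_CUES lexicons and every function name are hypothetical, and simple lookups stand in for the statistical models a real system would use.

```python
import re

# Hypothetical lexicons standing in for a trained model or knowledge base.
COMPANIES = {"A24"}
FIRST_NAMES = {"Mia"}
PERSON_CUES = {"starring", "actor", "director"}  # words that often precede a person


def tokenize(text):
    """Step 1: split the text into word tokens."""
    return re.findall(r"[A-Za-z0-9]+", text)


def identify(tokens):
    """Step 2: treat runs of capitalized tokens as candidate entities.

    Returns (start, end) token index pairs, with `end` exclusive.
    """
    spans, start = [], None
    for i, tok in enumerate(tokens):
        if tok[0].isupper():
            if start is None:
                start = i
        elif start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tokens)))
    return spans


def classify(tokens, span):
    """Step 3: assign a category from the lexicons, or None if unsure."""
    start, end = span
    if " ".join(tokens[start:end]) in COMPANIES:
        return "company"
    if tokens[start] in FIRST_NAMES:
        return "person"
    return None


def use_context(tokens, span, label):
    """Step 4: fall back on the preceding word when the lexicons give no answer."""
    start, _ = span
    if label is None and start > 0 and tokens[start - 1].lower() in PERSON_CUES:
        return "person"
    return label


def post_process(entities):
    """Step 5: drop unlabeled candidates and remove duplicates, keeping order."""
    seen, cleaned = set(), []
    for text, label in entities:
        if label is not None and (text, label) not in seen:
            seen.add((text, label))
            cleaned.append((text, label))
    return cleaned


def recognize(text):
    """Run all five steps and return (mention, category) pairs."""
    tokens = tokenize(text)
    entities = []
    for span in identify(tokens):
        label = use_context(tokens, span, classify(tokens, span))
        entities.append((" ".join(tokens[span[0]:span[1]]), label))
    return post_process(entities)


print(recognize("A24 released a movie starring Mia Goth"))
```

Note that a capitalized sentence opener such as "The" is picked up as a candidate in step 2 and only discarded in step 5, which is one reason capitalization alone is a weak signal.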

Pros and cons of using named entity recognition systems

Advantages:

  • Automates information extraction from large volumes of text

  • Applicable in nearly every industry

  • Helps reduce human errors, such as oversights, during text analysis

Disadvantages:

  • Defining rules and providing NER models with vocabulary can be time-consuming.

  • Human language evolves constantly, requiring NER systems to be updated to avoid false-positive identifications.

  • The NER process does not evaluate text for truthfulness.

  • NER systems can struggle with spelling variations and spoken word that’s been converted to text.

  • Machine learning-based NER outputs can be challenging to explain.

Learn more about named entity recognition with Coursera 

You can strengthen your knowledge of natural language processing and machine learning with expert-level guidance on Coursera. In the IBM Machine Learning Professional Certificate, offered by IBM, you can build the practical skills and knowledge machine learning experts use in their daily roles. By the end, you’ll predict course ratings by training a neural network and constructing regression and classification models.

This content has been made available for informational purposes only. Learners are advised to conduct additional research to ensure that courses and other credentials pursued meet their personal, professional, and financial goals.