Improve ChatGPT with Knowledge Graphs

Leveraging knowledge graphs for LLMs using LangChain

Large Language Models
Author

Maxime Labonne

Published

June 20, 2023

Find a lot more architectures and applications using graph neural networks in my book, Hands-On Graph Neural Networks Using Python 👇

ChatGPT has shown impressive capabilities in processing and generating human-like text. However, it is not without its imperfections. A primary concern is the model’s propensity to produce either inaccurate or obsolete answers, often called “hallucinations.”

The New York Times recently highlighted this issue in their article, “Here’s What Happens When Your Lawyer Uses ChatGPT.” It presents a lawsuit where a lawyer leaned heavily on ChatGPT to assist in preparing a court filing for a client suing an airline. The model generated fictional court decisions to back its arguments, which didn’t go unnoticed. This incident underscores the need for solutions to ground AI models like ChatGPT and improve their performance.

To address this, we propose an approach that augments ChatGPT with a knowledge graph. This method aims to provide structured context, ensuring the model's outputs are not only accurate but also relevant and up-to-date. By bridging the gap between the unstructured textual world of ChatGPT and the structured clarity of knowledge graphs, we strive to enhance the effectiveness and reliability of AI language models.

All the code used in this article is available on Google Colab and on GitHub.

What is a knowledge graph?

A knowledge graph is a structured format of knowledge representation, usually composed of entities and relationships. In a typical knowledge graph, entities are the nodes, and the relationships between them are the edges. The graph-based representation allows complex relationships to be modeled in a way that’s intuitive and closer to human understanding. Here is a simple illustration of a knowledge graph:

Source: Wikipedia. CC BY-SA 4.0


Google has been using knowledge graphs since 2012 to provide additional contextual information and sources. The structured representation of data offers a new dimension of context to the AI model, grounding it in validated knowledge.
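
To make this representation concrete, here is a minimal, dependency-free Python sketch of a knowledge graph stored as (subject, relation, object) triples, with a toy lookup function (the facts and the function are our own illustration):

# A tiny knowledge graph stored as (subject, relation, object) triples
triples = [
    ("Leonardo da Vinci", "painted", "Mona Lisa"),
    ("Mona Lisa", "is exhibited in", "Louvre"),
    ("Louvre", "is located in", "Paris"),
]

def facts_about(entity):
    """Return every triple mentioning the given entity."""
    return [t for t in triples if entity in (t[0], t[2])]

print(facts_about("Mona Lisa"))
# [('Leonardo da Vinci', 'painted', 'Mona Lisa'), ('Mona Lisa', 'is exhibited in', 'Louvre')]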

Applying Knowledge Graphs to Improve ChatGPT

A crucial limitation of ChatGPT is its lack of real-time information updates. Since the model was last trained using data up until 2021, it doesn’t have access to events, data, or context after that year. This leads to ChatGPT having outdated or incomplete information about events, technological advancements, or other critical happenings post-2021.

Let’s illustrate this limitation by asking ChatGPT about a recent event, “When did Apple announce the Vision Pro?”. Given the model’s knowledge cutoff in 2021, we would expect it to be unaware of this announcement, which happened in 2023.

!pip install -q openai langchain
import os
import openai

os.environ['OPENAI_API_KEY'] = "your-OpenAI-API-key"
openai.api_key = os.environ['OPENAI_API_KEY']

question = "When did Apple announce the Vision Pro?"
completion = openai.ChatCompletion.create(model="gpt-3.5-turbo",
                                          temperature=0,
                                          messages=[{"role": "user",
                                                     "content": question}])
print(completion["choices"][0]["message"]["content"])
As an AI language model, I do not have access to current events or real-time information. However, as of my last training data, Apple has not announced any product called "Vision Pro." It is possible that this product does not exist or has not been announced yet.

As expected, ChatGPT is unable to provide the correct answer due to its training data limitations. This clearly highlights the need for constant updates to the model’s knowledge base, which can be addressed by integrating it with a continuously updated knowledge graph.

By implementing such a knowledge graph, we can ensure that ChatGPT can provide accurate, current, and reliable information, effectively addressing the “hallucination” issues as well as the knowledge cutoff limitations.
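
The core mechanism behind this idea is simple: retrieve a relevant fact and inject it into the prompt as context. As a minimal illustration using the same API as above (the fact string and prompt wording are our own, not retrieved from any graph yet):

# Manual grounding: inject a known fact into the prompt as context
fact = "Apple announced the Vision Pro in 2023."
grounded_prompt = f"Context: {fact}\n\nQuestion: {question}\nAnswer using only the context."

completion = openai.ChatCompletion.create(model="gpt-3.5-turbo",
                                          temperature=0,
                                          messages=[{"role": "user",
                                                     "content": grounded_prompt}])
print(completion["choices"][0]["message"]["content"])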

Experiment 1: Sentence-Level Knowledge Graphs

To demonstrate this, we'll use the LangChain library, a popular framework for building applications around large language models. It includes a component called GraphIndexCreator, which can parse a sentence and create a knowledge graph. This component is currently limited and cannot process long corpora of text, but it serves as a perfect starting point for our experiment.

Let’s start with a straightforward sentence: “Apple announced the Vision Pro in 2023.”

from langchain.llms import OpenAI
from langchain.indexes import GraphIndexCreator
from langchain.chains import GraphQAChain
from langchain.prompts import PromptTemplate

text = "Apple announced the Vision Pro in 2023."

index_creator = GraphIndexCreator(llm=OpenAI(temperature=0))
graph = index_creator.from_text(text)
graph.get_triples()
[('Apple', 'Vision Pro', 'announced'),
 ('Vision Pro', '2023', 'was announced in')]

By feeding this sentence to the GraphIndexCreator, we obtain a knowledge graph: it identifies the sentence's entities and relationships and stores them as triplets. Note that get_triples() returns them in the order (source node, target node, relation), as the output above shows. Because of the inherent ambiguity of natural language, the GraphIndexCreator can sometimes confuse relations and target nodes.
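
As a quick sanity check, we can iterate over these triples with the correct unpacking order (in the LangChain version used here, get_triples() yields tuples ordered as source, target, relation, matching the output above):

# Print each triple as "source --[relation]--> target"
for source, target, relation in graph.get_triples():
    print(f"{source} --[{relation}]--> {target}")
# Apple --[announced]--> Vision Pro
# Vision Pro --[was announced in]--> 2023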

Even though it’s a tiny graph based on a single sentence, we can represent it visually using popular Python libraries such as matplotlib and networkx.

import networkx as nx
import matplotlib.pyplot as plt

# Create graph
G = nx.DiGraph()
G.add_edges_from((source, target, {'relation': relation})
                 for source, target, relation in graph.get_triples())

# Plot the graph
plt.figure(figsize=(8,5), dpi=300)
pos = nx.spring_layout(G, k=3, seed=0)

nx.draw_networkx_nodes(G, pos, node_size=2000)
nx.draw_networkx_edges(G, pos, edge_color='gray')
nx.draw_networkx_labels(G, pos, font_size=12)
edge_labels = nx.get_edge_attributes(G, 'relation')
nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels, font_size=10)

# Display the plot
plt.axis('off')
plt.show()

Now, let’s enhance ChatGPT using this knowledge graph. To this end, we will use another component of the LangChain library: GraphQAChain.

After initializing the GraphQAChain, we ask the same question as before: “When did Apple announce the Vision Pro?”. This time, ChatGPT leverages the knowledge graph we’ve just built.
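
Before running it, it helps to know roughly what GraphQAChain does: it asks the LLM to extract entities from the question, gathers the triples mentioning those entities, and passes them to the LLM as context. Here is a simplified sketch of that logic (our own illustration, not the library's actual implementation; entity extraction is reduced to a naive string match):

def graph_qa(llm, graph, question):
    """Simplified sketch of the GraphQAChain logic."""
    # Collect every node name in the graph
    nodes = set()
    for head, tail, relation in graph.get_triples():
        nodes.update([head, tail])
    # Step 1: "extract" entities with a naive string match
    # (the real chain prompts the LLM for this step)
    entities = [n for n in nodes if n.lower() in question.lower()]
    # Step 2: gather the triples mentioning those entities as context
    context = "\n".join(f"{head} {relation} {tail}"
                        for head, tail, relation in graph.get_triples()
                        if head in entities or tail in entities)
    # Step 3: answer the question using only the retrieved context
    return llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")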

chain = GraphQAChain.from_llm(OpenAI(temperature=0), graph=graph, verbose=True)
chain.run(question)


> Entering new GraphQAChain chain...
Entities Extracted:
 Apple, Vision Pro
Full Context:
Apple announced Vision Pro
Vision Pro was announced in 2023

> Finished chain.
' Apple announced Vision Pro in 2023.'

This time, ChatGPT was able to output the correct information! Note that we don’t necessarily need a parser to build our knowledge graphs: we can also reuse existing ones. In the next experiment, let’s try a bigger graph and see if it performs just as well.

Experiment 2: Bigger Knowledge Graphs

In this experiment, we manually create a more complex graph by adding a list of triplets to an empty graph with the add_triple() method. Each triplet represents a distinct piece of knowledge related to Apple, such as the products it has created or where it is located.

from langchain.graphs.networkx_graph import KnowledgeTriple

# Knowledge graph
kg = [
    ('Apple', 'is', 'Company'),
    ('Apple', 'created', 'iMac'),
    ('Apple', 'created', 'iPhone'),
    ('Apple', 'created', 'Apple Watch'),
    ('Apple', 'created', 'Vision Pro'),

    ('Apple', 'developed', 'macOS'),
    ('Apple', 'developed', 'iOS'),
    ('Apple', 'developed', 'watchOS'),

    ('Apple', 'is located in', 'USA'),
    ('Steve Jobs', 'co-founded', 'Apple'),
    ('Steve Wozniak', 'co-founded', 'Apple'),
    ('Tim Cook', 'is the CEO of', 'Apple'),

    ('iOS', 'runs on', 'iPhone'),
    ('macOS', 'runs on', 'iMac'),
    ('watchOS', 'runs on', 'Apple Watch'),

    ('Apple', 'was founded in', '1976'),
    ('Apple', 'owns', 'App Store'),
    ('App Store', 'sells', 'iOS apps'),

    ('iPhone', 'announced in', '2007'),
    ('iMac', 'announced in', '1998'),
    ('Apple Watch', 'announced in', '2014'),
    ('Vision Pro', 'announced in', '2023'),
]

# Create an empty graph and fill it with our triplets
graph = index_creator.from_text('')
for (node1, relation, node2) in kg:
    graph.add_triple(KnowledgeTriple(node1, relation, node2))

Although we could include many more triplets (real-world knowledge graphs often encompass millions of nodes), our graph is large enough for this demonstration. When visualized, this more extensive knowledge graph exhibits greater complexity and a richer depiction of information.

# Create directed graph
G = nx.DiGraph()
for node1, relation, node2 in kg:
    G.add_edge(node1, node2, label=relation)

# Plot the graph
plt.figure(figsize=(25, 25), dpi=300)
pos = nx.spring_layout(G, k=2, iterations=50, seed=0)

nx.draw_networkx_nodes(G, pos, node_size=5000)
nx.draw_networkx_edges(G, pos, edge_color='gray', edgelist=G.edges(), width=2)
nx.draw_networkx_labels(G, pos, font_size=12)
edge_labels = nx.get_edge_attributes(G, 'label')
nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels, font_size=12)

# Display the plot
plt.axis('off')
plt.show()

With this larger graph, we once again ask ChatGPT the question: “When did Apple announce the Vision Pro?” Leveraging the GraphQAChain object, ChatGPT processes the information embedded in the knowledge graph.

chain = GraphQAChain.from_llm(OpenAI(temperature=0), graph=graph, verbose=True)
chain.run(question)


> Entering new GraphQAChain chain...
Entities Extracted:
 Apple, Vision Pro
Full Context:
Apple is Company
Apple created iMac
Apple created iPhone
Apple created Apple Watch
Apple created Vision Pro
Apple developed macOS
Apple developed iOS
Apple developed watchOS
Apple is located in USA
Apple was founded in 1976
Apple owns App Store
Vision Pro announced in 2023

> Finished chain.
' Apple announced the Vision Pro in 2023.'

ChatGPT successfully extracts the correct information from the more expansive knowledge graph. This result shows that the approach not only scales to larger graphs but can also retrieve the relevant facts from a broader knowledge base.

The possibilities of implementing larger and more diverse knowledge graphs are practically endless. They can be populated with data from various sources, such as legal documents, code documentation, scientific literature, and more, enhancing the AI’s understanding and response accuracy across multiple domains. The integration of ChatGPT and knowledge graphs thus holds immense promise for future AI development.
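
As a sketch of how this could look with the tools used above, GraphIndexCreator can be run over several documents and the extracted triples merged into a single graph before querying it (the document strings below are placeholders for real corpora; note the reordering between get_triples() and KnowledgeTriple):

# Merge triples extracted from several sources into one graph
documents = [
    "Apple announced the Vision Pro in 2023.",
    "Tim Cook is the CEO of Apple.",
]

merged_graph = index_creator.from_text(documents[0])
for doc in documents[1:]:
    for head, tail, relation in index_creator.from_text(doc).get_triples():
        # get_triples() yields (head, tail, relation), while
        # KnowledgeTriple expects (subject, predicate, object)
        merged_graph.add_triple(KnowledgeTriple(head, relation, tail))

chain = GraphQAChain.from_llm(OpenAI(temperature=0), graph=merged_graph, verbose=True)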

Conclusion

As seen in our experiments, knowledge graphs can significantly help ground and improve ChatGPT’s outputs. A key challenge with large knowledge graphs is predicting missing links between distant nodes, a problem known as knowledge graph completion. Successfully addressing this issue would allow ChatGPT to make insightful connections and propose new ideas based on the information stored in the knowledge graph.

However, the process of integrating knowledge graphs into language models like ChatGPT is still an evolving field. To further explore the various applications and delve into the details of implementing knowledge graphs, consider the book Hands-On Graph Neural Networks Using Python, which provides a comprehensive guide on this subject. Through this type of research and experimentation, we can continuously improve AI’s ability to understand and generate text, moving us closer to more reliable and grounded AI models.

If you found this article helpful, follow me on Twitter @maximelabonne for more graph-related content!