Talking with More Nuance about Knowledge Graphs in Generative AI Systems
Am I the only one who feels there is conflation in how we speak about the value proposition of Knowledge Graphs in today’s Generative AI systems?
I like to think of the evolution of Knowledge Representation (KR) in AI as going through three eras:
- Classical period
- Semantic Web era
- Generative AI (today)

During the first period, and within the subset of the field interested in logic-based AI, the value proposition was the precision and expressive power of the representations used to capture knowledge and to facilitate reproducible machine understanding and logical inference.
In the subsequent era, the insight was to marry this (and other methods) with the architecture of the web, its connectivity, and the collective use of open standards to achieve a particular vision. This specific vision (“The Semantic Web” [1]) was never realized. Still, earnest efforts to achieve it furthered much of our understanding of the value of these approaches in the software systems we want to build for the future and of best practices for addressing semantic interoperability.

The KR standards of that era included RDF, RDFS, OWL, RIF, and many others. What they all had in common was interoperability with RDF (the core layer that connected KR to the web). Still, they spanned a widening spectrum of expressive power in how logic was used to represent knowledge.
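To make that core layer concrete, here is a minimal sketch (pure Python, with made-up example URIs) of the RDF data model: a graph is just a set of subject–predicate–object triples whose nodes are web-addressable identifiers, and the most basic form of querying is pattern matching over that set.

```python
# Minimal sketch of the RDF layer: a graph as a set of
# (subject, predicate, object) triples whose nodes are URIs.
# All URIs below are illustrative, not real vocabulary terms.

triples = {
    ("http://ex.org/alice", "http://ex.org/knows", "http://ex.org/bob"),
    ("http://ex.org/bob",   "http://ex.org/knows", "http://ex.org/carol"),
    ("http://ex.org/alice", "http://ex.org/name",  "Alice"),
}

def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# Who does Alice know?
print(match(triples, s="http://ex.org/alice", p="http://ex.org/knows"))
```

Note that nothing here says what `knows` *means*; that is exactly the gap the more expressive layers of the stack were designed to fill.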

The fact that the common layer was graph-based wasn’t the most pronounced value proposition. A more pronounced value proposition was the expressive mathematical logic represented by the graph structures and the machine understanding and semantic interoperability it facilitated. However, the most pronounced part of the value proposition was the ‘web’ part of the Semantic Web, i.e., that the nodes in the graphs were web-addressable resources and thus benefited from all the architectural strengths of the World Wide Web: the REST software architectural style [3].
Hindsight is always 20/20, but the fact that this was the most pronounced value proposition might be one of the reasons we haven’t realized the vision despite a lot of fanfare about what could be done with the sum of the parts. Still, so much was learned through assembling those individual parts.
As a consequence of trying to do all this efficiently and communicate how it differed from traditional relational database systems, a significant amount of effort was focused on the practical challenges of representing natively graph-like knowledge in storage systems that were not optimized for this. This focus was purely a matter of data representation and query efficiency.
As a person who spent a lot of effort [4] during that earlier time designing adaptations to relational databases to make them efficient frameworks for storing and processing RDF data, I started off thinking (in my naive hubris) that graph-based representation was inherently superior to relational database management systems in all cases.
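As one toy illustration of the kind of adaptation involved (a sketch of the general tactic, not the specific approach in [4]): triples can be stored in a single relational table, and each SPARQL basic graph pattern compiles into self-joins over that table, one join per shared variable. Here Python’s built-in sqlite3 stands in for a real RDBMS:

```python
import sqlite3

# Toy sketch: triples in one relational table, with a SPARQL-like
# basic graph pattern compiled by hand into a self-join.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
con.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ex:alice", "ex:knows", "ex:bob"),
        ("ex:bob",   "ex:knows", "ex:carol"),
    ],
)

# SPARQL:  SELECT ?x ?z WHERE { ?x ex:knows ?y . ?y ex:knows ?z }
# compiles to a self-join on the shared variable ?y:
rows = con.execute(
    """
    SELECT t1.s, t2.o
    FROM triples t1 JOIN triples t2 ON t1.o = t2.s
    WHERE t1.p = 'ex:knows' AND t2.p = 'ex:knows'
    """
).fetchall()
print(rows)  # friend-of-a-friend pairs
```

The practical challenge was making chains of such self-joins efficient at scale, which is what much of the engineering effort of that era went into.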
We are now in a third era of how we can use AI to build the systems we envisioned back in the classical period. Currently, there is growing interest [5] in how graph-based representation systems can help Generative AI systems. I am very much in the camp that the lessons of the previous era will be critical in this current one, helping us move beyond the concerns that still prevent us from (as one example, and in my opinion as an experienced medical informatician) responsibly using Large Language Models in all aspects of operational medical information systems.

Still, the way we communicate that value proposition is focused almost entirely on the structure of the data (i.e., that they are ‘graphs’). When you dig into this era’s studies of measurable benefits for Generative AI, the analysis seems superficial and almost entirely ignores the separate and orthogonal value proposition of any mathematical logic these representations use and the reasoning they facilitate, despite how much the ultimate goal is to build systems that ‘reason.’
For example, we consistently refer to these approaches to KR as Knowledge Graphs (KGs) and rarely refer to “Ontologies.” Sometimes, it feels like that has now become a dated term. As we understood in the previous era, there is a difference in machine understandability between a purely RDF graph and an RDF graph with reference to an OWL ontology or a RIF document that describes the semantics of the resources. Logic and web addressability were the secret sauce for intelligent behavior and machine interoperability beyond the fact that the described resources happened to be in a graph.
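The difference is easy to see in miniature. A bare graph only stores what is asserted; attaching even simple schema-level semantics lets a machine derive facts that were never written down. Below is a hand-rolled sketch of two RDFS-style entailment rules (illustrative URIs, not a real reasoner):

```python
# A bare graph vs. a graph plus logic: a hand-rolled sketch of
# RDFS-style entailment (illustrative URIs, not a real reasoner).

RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

asserted = {
    ("ex:MyocardialInfarction", SUBCLASS, "ex:HeartDisease"),
    ("ex:HeartDisease", SUBCLASS, "ex:Disease"),
    ("ex:case42", RDF_TYPE, "ex:MyocardialInfarction"),
}

def entail(graph):
    """Apply two RDFS rules to a fixpoint:
    subclass transitivity and type propagation."""
    g = set(graph)
    changed = True
    while changed:
        changed = False
        new = set()
        for (a, p1, b) in g:
            for (c, p2, d) in g:
                if p1 == SUBCLASS and p2 == SUBCLASS and b == c:
                    new.add((a, SUBCLASS, d))   # transitivity
                if p1 == RDF_TYPE and p2 == SUBCLASS and b == c:
                    new.add((a, RDF_TYPE, d))   # propagation
        if not new <= g:
            g |= new
            changed = True
    return g

inferred = entail(asserted)
# The raw graph never says case42 is a Disease; the logic does.
print(("ex:case42", RDF_TYPE, "ex:Disease") in inferred)
```

A system that only pattern-matches the asserted triples misses the entailed fact entirely; that gap, not the graph shape, is what an ontology buys you.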
Despite my earlier naive hubris about graph-based KR, my maturation process as an information system architect has led me to find many situations where relational database systems were perfectly fine, especially where the schema was sufficient for most important use cases and not likely to change.
For example, I often use SNOMED-CT in my medical terminology software development. Although it is a first-class ontology and facilitates logical clinical inference [6], it is not primarily distributed in OWL, though OWL is one of its distribution formats. Its relational format is sufficient for the underlying mathematical logic that captures the semantics of SNOMED-CT and for the operational needs of how it is versioned and distributed.
In addition, when I use it, I rely entirely on the relational format, have never needed the OWL format or to query it as a KG, and can do so efficiently, with the full benefit of an open-source relational database like Postgres, without sacrificing semantic interoperability or query efficiency.
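As a sketch of why the relational form suffices, a subsumption ("is this concept a kind of that one?") query is just a recursive join over the relationship table. The concept IDs below are made up for illustration, except 116680003, which is SNOMED CT's "is a" relationship type; the real RF2 tables carry more columns than this. Python's built-in sqlite3 stands in for Postgres:

```python
import sqlite3

# Sketch of an IS-A subsumption query over a SNOMED-CT-style
# relationship table. 116680003 is SNOMED CT's "is a" type;
# the other concept IDs are made up for illustration, and the
# real RF2 tables have more columns than this.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE relationship "
    "(sourceId INTEGER, destinationId INTEGER, typeId INTEGER)"
)
IS_A = 116680003
con.executemany(
    "INSERT INTO relationship VALUES (?, ?, ?)",
    [
        (101, 100, IS_A),  # 101 is-a 100
        (102, 101, IS_A),  # 102 is-a 101
        (103, 102, IS_A),  # 103 is-a 102
    ],
)

# All ancestors of concept 103 via a recursive CTE:
ancestors = [
    row[0]
    for row in con.execute(
        """
        WITH RECURSIVE up(id) AS (
            SELECT destinationId FROM relationship
            WHERE sourceId = 103 AND typeId = :isa
            UNION
            SELECT r.destinationId FROM relationship r
            JOIN up ON r.sourceId = up.id AND r.typeId = :isa
        )
        SELECT id FROM up
        """,
        {"isa": IS_A},
    )
]
print(sorted(ancestors))
```

The same recursive-CTE pattern runs unchanged on Postgres, which is part of why a dedicated graph store is not required for this workload.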

The way we speak about graph-based KRs and how Generative AI systems can benefit from them needs to be more nuanced. What we now call KGs used to be called Linked Data. The shift in emphasis from their linkage to the “Knowledge” part of KR may be a justifiable departure from the original vision. Still, the focus on the “Graph” part may unintentionally bury some of what we learned in previous eras about how logic-based KR can help with the persistent problem of semantic interoperability and facilitate the machine understanding we still very much desire.
[1] Berners-Lee, Tim; Hendler, James; Lassila, Ora (May 17, 2001). “The Semantic Web” (PDF). Scientific American. Vol. 284, no. 5. pp. 34–43. JSTOR 26059207. S2CID 56818714.
[2] Hooshmand, Y., Resch, J., Wischnewski, P., & Patil, P. (2022). From a monolithic PLM landscape to a federated domain and data mesh. Proceedings of the Design Society, 2, 713–722.
[3] Fielding, Roy Thomas (2000). “Chapter 5: Representational State Transfer (REST)”. Architectural Styles and the Design of Network-based Software Architectures (Ph.D.). University of California, Irvine
[4] Elliott, B., Cheng, E., Thomas-Ogbuji, C., & Ozsoyoglu, Z. M. (2009, September). A complete translation from SPARQL into efficient SQL. In Proceedings of the 2009 International Database Engineering & Applications Symposium (pp. 31–42).
[5] Pan, S., Luo, L., Wang, Y., Chen, C., Wang, J., & Wu, X. (2024). Unifying large language models and knowledge graphs: A roadmap. IEEE Transactions on Knowledge and Data Engineering.
[6] Baader, F., & Suntisrivaraporn, B. (2008). Debugging SNOMED CT using axiom pinpointing in the description logic EL. KR-MED 2008, 1.