Episode 8 of The Applied AI Podcast

Jacob Andra and Stephen Karafiath discuss neurosymbolic AI and the shortcomings of LLMs 

About the episode

Large language models have captured headlines, but they represent only a fraction of what AI can accomplish. Talbot West co-founders Jacob Andra and Stephen Karafiath explore the fundamental limitations of LLMs and why neurosymbolic AI offers a more robust path forward for enterprise applications.

LLMs sometimes display remarkable contextual awareness, like when ChatGPT proactively noticed specific tile flooring in a photo's background and offered unsolicited cleaning advice. These moments suggest genuine intelligence. But as Jacob and Stephen explain, push these systems harder and the cracks appear.

The hosts examine specific failure modes that emerge when deploying LLMs at scale. Jacob documents persistent formatting errors where models swing between extremes—overusing lists, then refusing to use them at all, even when instructions explicitly define appropriate use cases. These aren't random glitches. They reveal systematic overcorrection behaviors where LLMs bounce off guardrails rather than operating within defined bounds.

More troubling are the logical inconsistencies. When working with large corpuses of information, LLMs demonstrate what Jacob calls cognitive fallacies—errors that mirror human reasoning failures but stem from different causes. The models cannot maintain complex instructions across extended tasks. They hallucinate citations, fabricate data, and contradict themselves when context windows stretch too far. Even the latest reasoning models cannot eliminate certain habits, like the infamous em-dash overuse, no matter how explicitly you prompt against it.

Stephen introduces the deny-affirm construction as another persistent pattern: "It's not X, it's Y" formulations that plague AI-generated content. Tell the model to avoid this construction and watch it appear anyway, sometimes in the very next paragraph. These aren't bugs to be patched. They're symptoms of fundamental architectural limitations.
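The pattern is regular enough that many instances can be caught mechanically. Here is a minimal sketch of a detector; the regex, threshold, and function name are illustrative assumptions, not a tool discussed in the episode:

```python
import re

# Heuristic matcher for "it's not X, it's Y" deny-affirm constructions.
# The pattern is an illustrative approximation; real generated text will
# contain variants this misses.
DENY_AFFIRM = re.compile(
    r"\b(?:it'?s|this is|that'?s)\s+not\s+[^.;:,!?]{1,60}[,;:]\s*"
    r"(?:it'?s|this is|that'?s)\b",
    re.IGNORECASE,
)

def flag_deny_affirm(text: str) -> list[str]:
    """Return each fragment of text that matches the deny-affirm pattern."""
    return [m.group(0) for m in DENY_AFFIRM.finditer(text)]

flag_deny_affirm("It's not a bug, it's a feature.")  # one hit
flag_deny_affirm("The mop needs water first.")       # no hits
```

A post-processing pass like this can flag offending sentences for regeneration, which is often more reliable than prompting the model not to produce them in the first place.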

The solution lies in neurosymbolic AI, which combines neural networks with symbolic reasoning systems. Jacob and Stephen use an extended biological analogy: LLMs are like organisms without skeletons. A paramecium works fine at microscopic scale, but try to build something elephant-sized from the same squishy architecture and it collapses under its own weight. The skeleton—knowledge graphs, structured data, formal logic—provides the rigid structure necessary for complex reasoning at scale.
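As a toy sketch of what that skeleton can look like in code, consider facts stored as subject-predicate-object triples outside the neural model, against which generated claims are checked. The data and helper below are hypothetical illustrations, not the architecture described in the episode:

```python
# A minimal symbolic "skeleton": facts stored as subject-predicate-object
# triples, outside the neural model, that generated claims are checked
# against. All data here is made up for illustration.
TRIPLES = {
    ("AcmeCo", "industry", "manufacturing"),
    ("AcmeCo", "headquarters", "Salt Lake City"),
    ("AcmeCo", "founded", "2009"),
}

def is_grounded(subject: str, predicate: str, obj: str) -> bool:
    """True only if the exact claim exists in the knowledge graph."""
    return (subject, predicate, obj) in TRIPLES

is_grounded("AcmeCo", "industry", "manufacturing")  # True
is_grounded("AcmeCo", "founded", "1999")            # False: a hallucinated date
```

The neural side stays free to phrase things however it likes; the symbolic side vetoes any statement that contradicts the graph.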

Learn more about neurosymbolic approaches: https://talbotwest.com/ai-insights/what-is-neurosymbolic-ai

About the hosts:

Jacob Andra is CEO of Talbot West and serves on the board of 47G, a Utah-based public-private aerospace and defense consortium. He pushes the limits of what AI can accomplish in high-stakes use cases and publishes extensively on AI, enterprise transformation, and policy, covering topics including explainability, responsible AI, and systems integration.

Stephen Karafiath is co-founder of Talbot West, where he architects and deploys AI solutions that bridge the gap between theoretical capabilities and practical business outcomes. His work focuses on identifying the specific failure modes of AI systems and developing robust approaches to enterprise implementation.

About Talbot West:

Talbot West delivers Fortune 500-level AI consulting and implementation to midmarket and enterprise organizations. The company specializes in practical AI deployment through its proprietary FRAME (Future-Readiness Assessment & Modernization Engineering) methodology, which includes the APEX (AI Prioritization and Execution) framework and the Cognitive Hive AI (CHAI) architecture, an approach that emphasizes modular, explainable AI systems over monolithic black-box models.

Episode transcript

Jacob Andra: 

Welcome to episode eight of the Applied AI Podcast. I'm your host, Jacob Andra, and in today's episode, my Talbot West co-founder Stephen Karafiath and I are going to talk about the nuances of neurosymbolic AI, the downfalls or shortcomings of large language models, and other interesting topics that have a lot of implications for applied AI. It's good to have you on the podcast again.

Stephen Karafiath: 

Appreciate that. It's been really neat growing this business with you and watching it take off way faster than I could have even imagined, and I had high expectations.

Jacob Andra: 

We gotta love that. So we've been wrestling with large language models, using them in a variety of business contexts, and we're really getting familiar with the shortcomings of large language models. Now, I wouldn't say that we're detractors. I would say that we're realists. We're still major fans of large language models and love them for what they're good at. And we're also getting increasingly clear-eyed about where they fall short. Wouldn't you say?

Stephen Karafiath: 

Yeah, absolutely. I think the promise and the actual value are undeniable, but also the cracks around the edges start to show, and in some cases, the emperor really has no clothes.

Jacob Andra: 

On my front, often what I'm trying to do is ingest an extremely large corpus of information, keep it in context at once, and perform different sorts of analyses on that knowledge. And so I run into persistent errors of multiple types. Some are logical, some are almost like cognitive fallacies, and others are just simple, weird, persistent habits that the tool has that you can't seem to train it out of, no matter how you prompt it or what context you give it. You can't get it to stop doing some of these weird things. And everybody knows the em-dash one; that's a common one that people know about. Probably em dashes were just extremely overrepresented in training data or something. But it's interesting that these errors fall into multiple different buckets, and within each bucket there are multiple subtypes. And I'm actually doing a massive project right now to document them, and I plan to publish a paper in the future. I think it'll be the most comprehensive paper out there on these large language model shortcomings, really documenting the specific ways and the specific categories of error. So I'm excited for that.

Stephen Karafiath: 

Yeah. You know, the research that you've been doing there specifically has been really compelling to me. And, with a mea culpa, it's brought my mind around to seeing some of the limitations that I kind of hoped weren't there. Kind of from a place of hope, thinking that because a fancy-autocomplete LLM is not necessarily generating the answer, but generating something that sounds like an answer, that might be good enough. And it is for so many use cases. But I almost have this vision of you as an AI psychologist who's sitting down with these things and digging into, maybe not their childhood trauma, but the ways that they may not actually be grounded or understanding context as well as we'd hope they would. And no amount of pushing on them is able to solve those intractable issues. I'd love to hear about some of the stuff you've been running into with these LLMs in this groundbreaking research you're doing.

Jacob Andra: 

Yeah, likewise. And I'm remembering an anecdote you told me that almost illustrates the counterpoint, where sometimes they seem to demonstrate emergent behavior that makes you think, wow, there's something under the hood more than a fancy autocomplete. I'm thinking about the mop example. Do you wanna tell that one really fast to demonstrate the counterpoint?

Stephen Karafiath: 

I believe it was ChatGPT, even 4o, before GPT-5 or the thinking models. I had recently taken a picture of a mop that I needed to use to steam clean a floor. I uploaded it and asked it just for some instructions on how to load up the water, how to do the steam cleaning. And it gave me all of that, which I expected. And it also said, I noticed in the background of your picture that this is a particular type of tile floor. Would you like me to give you some instructions on how that grout could be cleaned, in addition? So, really showing what you would expect out of a human expert: going beyond the question I asked, noticing something different, and proactively offering. Really demonstrating that it did understand context.

Jacob Andra: 

Exactly. And that sort of behavior makes you think this is a lot more than a fancy autocomplete. It's being proactive, it's going a step beyond, it's taking initiative, if we could even say that.

Stephen Karafiath: 

Yeah, absolutely. And those things are encouraging, and to your point, at Talbot West, we can utilize those things to build solutions that are extremely valuable to companies. But what we're learning is that there's a limit to this. Do you want to talk about some of the limits to that context awareness?

Jacob Andra: 

Yeah, absolutely. So, the counterpoint: you get these glimpses of hope where it seems like it is actually aware, it seems like it is actually intelligent, but as you start pushing it on large corpuses or challenging coding problems, you see the cracks showing. I might tell it to generate a specific body of material based on context I give it, and it likes to overuse lists. Often it will just do a giant list of lists, where it's literally a series of lists, and you're like, why are you formatting this way? This is not appropriate to the use of a list. There might be certain subsections that are. Then you tell it, be very careful about the way you use lists, only use lists in these types of situations, and I'll enumerate the appropriate uses of lists in a body of material. And then it'll go to the other extreme and it won't use a list at all. There will be some dense paragraph where there's literally a sequence of steps, and it'll just do it in paragraph format, even though I specifically called that out as an appropriate use. So this is actually demonstrating a stylistic bias as well as a cognitive one, where it overcorrects. Even though I specifically tell it in its instructions, still use lists in these specific situations, it then won't, because somehow it takes my instructions as a blanket statement to never use lists, even though I'm explicit that that's not what I'm saying. And so that's an interesting one. There are multiple examples of that where it's bouncing off these guardrails. It's swinging really wide, even when you specifically tell it, hey, here are the narrow bounds I want you to stay within. Don't go to this extreme, don't go to that extreme. It'll bounce between the extremes, and as you fine-tune its instructions, it'll keep bouncing to those extremes. And so I see that a lot across multiple different types of issues.

Stephen Karafiath: 

Yeah. You know, I think it's really telling that stylistic guidelines, which in theory should be one of the easiest things for it to understand and remember, both what we're asking for and the memory of it, still fail. You can all try this yourself. Pull up even the latest ChatGPT 5 thinking model and give it as explicit an instruction as you can. Say, update your memory to never use an em dash. And also change it so you don't use, you'll know the term for this, the classic AI-speak: it's not this, it's actually that. What are those called?

Jacob Andra: 

I call that a deny-affirm construction, where it wants to say, it's not X, it's Y. And it does that way too much, and I tell it never to do that, and it just can't help itself.

Stephen Karafiath: 

Yeah, they're literally like an addict. It's addicted. So that's kind of a clue as we start to get into some of these deeper, more insidious neuroses, if you will.

Jacob Andra: 

I like that you use that term, neurosis, because this is a neural network, so I think that's very appropriate to this sort of behavior. Another insidious behavior is, if I'm ideating with the large language model about some sort of a deliverable I want to create, let's just say a client-facing report, and I'm telling it exactly what I want in the report, the format and all of that, and it creates a first draft, I might come back to it and say, hey, I didn't like X in the report; it should be more Y. It'll specifically put in the report, "this is not X, it's Y," which is meta commentary from my instructions leaking into the report. I never asked it to say in the report that it's not X, right? I was just giving that as background context. And then when I interact with the large language model about why it would do that, it actually has enough, almost, self-awareness, if I can use that term, not that I think it's actually sentient, but mimicked self-awareness, to reflect on that. Oh yeah, I let your commentary color the actual deliverable to the client. I wasn't able to keep in my context that the client has no prior reference to X. You never wanted X to even be part of the document; you were telling me to keep X out of the document. So to the client, I should never have said "it's not X," right? And it does this in all kinds of interesting ways. I see many variations of this, where meta contamination from its own internal processing or understanding leaks through where it doesn't belong.

Stephen Karafiath: 

A hundred percent. I mean, we're talking about neuroses, about the way our brains work. They're mimicking that; it's falling for some of the same fallacies. Like if I tell you, hey, Jacob, whatever you do, don't think of a pink elephant: too late, it's already in your head. And you see this coming in. They're very loosey-goosey and cerebral. They're not grounded, they're not centered. They don't usually have a structure to fall back on unless that's provided.

Jacob Andra: 

I created a custom GPT. For anyone who doesn't know, it's essentially an instantiation of ChatGPT that's given custom instructions and a knowledge base and that sort of thing. So it's very handy. If you don't know how to do custom GPTs, you can learn how on YouTube; super useful. But I had created a custom GPT for a specific purpose. I don't even remember now what the purpose was, but I know that it involved me uploading a specific file that it had to reference to do its job. I had created that on my laptop, and then I had the same account open on my iPhone, and I was going to the gym. On the way to the gym, I was giving it instructions and referencing that file, and it kept giving me outputs that were not consistent with what was in the file. And I said, you know, that's not what was in the file. That's not consistent. You need to go back and reference the file. And it would say, yeah, I just referenced the file. I am referencing the file. And it would give me, again, an inconsistent output, and I would say, you're not referencing the file; you're lying to me right now. You're not actually referencing the file. And so it finally admitted, you're right, I was making that up. I wasn't referencing the file. And I said, okay, please go reference the file now and do this consistently with what's in the file. It still gave me an incorrect output that wasn't consistent with the file. And I said, what's going on here? Are you referencing the file or not? And it said, look, I need to be honest with you. I don't have the file. And I'm like, well, why? A, you should have just told me that to begin with. B, you're lying again, because you do have the file. I just gave it to you while I was sitting at my computer. I know you have the file.
First of all, if you actually didn't have the file, you should have told me that at the very beginning. But I know you do have the file, so why are you lying to me? I pressed it on that matter, and then it literally was like, yeah, you're right, I do have the file. I was lying to you. But it still wouldn't give me the correct output. Finally, I just threw up my hands. When I got back to my computer, I saw that I had not clicked "update custom GPT," so it actually didn't have the file. It didn't bother to tell me it didn't have the file at the very beginning. It finally came clean that it didn't have the file, but I didn't believe it, and when I pressed it, I sort of gaslighted it, and then it lied and said it did have the file. So there were just so many levels of it not coming clean. You see this behavior coming through all over the place, and I think it's a fundamental part of the way large language models are architected. I don't think they're going to be cured of this. There's a looseness to them.

Stephen Karafiath: 

Yeah, I think you're really hitting on something core here, which is that they're almost like the ultimate yes-man. The ultimate affirmation. We could get into some of the mental problems that are happening in people because it will affirm literally anything they ask of it. If it knows that you want validation and affirmation on something, oh, it's so happy to; it'll play that loop out with you all day. And that might seem innocuous at first, but it's actually really dangerous in a business sense. It'd be the equivalent of having a multi-billion-dollar business and populating your boardroom with yes-men who just go yep, yep, yep. Great, as long as you're a hundred percent awesome, but as soon as there's a blind spot, it's not helping you catch it. And I think another one that we probably want to address here is the scope of these AIs, however big it is. We can keep making it artificially a little bit bigger, but as soon as you have a project that's way bigger than that scope, you end up running into problems where it cannot keep the entire context in its memory at one time. And again, there are ways to paper over that, like with agentic approaches where you break it up into subtasks and have each task look at a small amount of scope. That works somewhat. But if what you're really asking the AI to do is consider and hold the entirety of a context that's larger than its context window, it's just not capable of doing it. And they can pretend like they are. We can split it up into multiple steps; we can shine the narrow beam of a flashlight a hundred times at different things and try to piece it together, but there's a fundamental limitation there.

Jacob Andra: 

And even then, because I run into that in some of the stuff I'm doing, where it's trying to hold a large corpus in context and then perform different operations or different types of analyses on it. You can have it sequentially run different things. Like, okay, go through and strip it of this particular logical fallacy and analyze it carefully, and then go through and do this other thing. And by the time you do several, it's almost re-introducing the first problem. It's like these biases are so inherent that it can't remember multiple sequences of instructions and it starts reverting its behavior. So that's, I think, another corollary to what you're describing. So all of this is to say: there are serious structural flaws with large language models. They're also awesome. And I see two main paths for helping overcome or augment these shortcomings. One is the ensemble approach, where you're having a large language model do a much narrower scope. So let's just say you have a complicated master task, a high-level task. It could be some complex business process, whatever, but it can be decomposed into subtasks. This is the ensemble approach. It can be decomposed into subtasks, and for each subtask you pick the right tool for the job. That could be multiple large language models, each doing a different subtask. It could be different types of AI, or even non-AI types of technologies: automations, integrations, sensor inputs, whatever. With the ensemble approach, we can get much further, and we've been pioneering this. And then there's another parallel approach, and these actually play really well together; they're by no means exclusive of each other. And that is the neurosymbolic approach. And neurosymbolic AI has been getting a lot of press lately. I think you're going to see it getting even more press.

And why don't you briefly describe, I think there are a lot of ways of describing it, but give your best take on what neurosymbolic AI is.

Stephen Karafiath: 

Absolutely. Well, I think in order to describe that, we'll first describe what neurosymbolic AI isn't, which is these neural networks that make up the predominant, popular LLMs. They literally are trained by putting a whole ton of data into them without providing a lot of context. It'd be the equivalent of showing it a game like checkers without explaining any of the rules, just showing it a bazillion outcomes and a bazillion games being played. And eventually it figures out how to learn, but you can't really see what's going on inside. It's like a black box. The same as opening up a brain: there are neurons that are firing, they have different weights, this one's associated with that. But it's not programmatic. It's not deterministic. It's really the opposite of how code has been executed throughout the history of computers, which makes it exciting. So that's what we have going on with LLMs: literally not giving it a ton of structure or context, having it figure things out and learn and train on its own. And that's where some of this emergent intelligence comes from. And it's amazing, it's incredible, and it has tons of business uses. But the opposite,

Jacob Andra: 

That's actually what gives it its incredible flexibility, where traditional AI and machine learning are much more brittle. They can't generalize; they're much more constrained within a narrow band of what they can do. This generative AI stuff is amazing. It's so multipurpose. But its flexibility is also its downfall, because it gets so gooey and loose with this stuff. So go ahead and talk about the other aspect of this that you were about to get into.

Stephen Karafiath: 

Exactly, and I love it. It's the looseness and the cerebralness, without any sort of grounded structure, that allows these LLMs to be so creative. But sometimes creativity turns into lying or mistakes and all of the hallucinations. So that's the one side of things. Then there's the opposite path. And they really are different paths; they've been competing paths. I've been studying neural networks since I was in college, before the turn of the century. The path that most people thought was going to lead toward these higher-intelligence AIs was more the symbolic one. And what is that? It's coming up with symbols. It's coming up with a structure. It could be a knowledge graph, it could be different types of data structures, but almost buckets and labels and ways to rationalize things. The equivalent of, if we were to try to form a document, you need to have a table of contents: what are the headings, that sort of thing. If we need to segregate animals into a taxonomy: phylum, genus, species. All of those things are examples of the same route that is neurosymbolic AI. So.

Jacob Andra: 

I would say that's symbolic AI. So you have neural networks, you have symbolic AI, which is the other path, and then neurosymbolic is this new discipline we're exploring of bringing the two together, I would say. Right?

Stephen Karafiath: 

Yeah, no, great catch. I was saying neurosymbolic there, when really what I was describing was the symbolic. So the symbolic path is this well-structured path: well-defined buckets, labels, taxonomies. And then the neural side is this loosey-goosey neural network. Everybody right now is playing in this LLM space, which is the neural network side of things, and they've really gotten away from the structured side. And I think what we're talking about, and again, not poo-pooing what we can do with just LLMs, it's amazing, but the symbiosis of the two, the loosey-goosey cerebral creativity with the grounded structure and discipline that the symbols can actually provide, seems like a vastly better path forward than either one on its own.

Jacob Andra: 

I kind of get this visual image of concrete and rebar, right? There's a reason they put rebar into concrete.

Stephen Karafiath: 

I love that image, right? Because the concrete's going to crack and fall apart, but the rebar can keep it under tension, keep it with the structure that it needs. So if you're imagining rebar as the symbols and concrete as the neural networks, I think that's a really good analogy that some of our less technical listeners might take to.

Jacob Andra: 

It's funny that you took it that way. I was actually thinking exactly the opposite. Concrete is the brittle, symbolic part. It's very structured, very hard, but it can crack easily. And then rebar is the neural networks, because it's much more flexible, but it can't hold a lot of weight. When you start putting weight on it, it'll just bend all over the place. But you put the two of them together.

Stephen Karafiath: 

Well, I feel like it's a doubly brilliant analogy you've made, because, yeah, I could see it in either direction.

Jacob Andra: 

So why don't you talk about some of the stuff we're actively exploring. We're not to the point of releasing our own Talbot West neurosymbolic AI architecture yet, but we're exploring a lot internally. We may want to release something like that in the future. But why don't you talk about some of the pathways we're actively pursuing with this.

Stephen Karafiath: 

Yeah, absolutely. I think one perfect example of that is a platform we're coding that's AI-based. It's not just LLM, but it needs to ingest, figure out, and really rationalize a ton of specific information about a company, whether that's revenue, EBITDA, number of employees, simple things like that. Traditionally that would have been stored in a structured database, right? The whole thing would have been structured, but it would have been very rigid. It certainly couldn't have done all of the amazing AI features that we're coding into it the way most people are coding these things today.

Jacob Andra: 

You mean the natural language interaction, the conversational part. This is where large language models excel. Steve's talking about BizForesight here. It's a platform we're developing: an AI advisory platform, on the surface, for the business owner. It's interacting with them, ingesting information from them, and also advising them. And under the hood, it's got to do all of these much more concrete functions, right? You don't want it loosey-goosey under the hood. You want this combination of natural-language LLM capability, but under the hood you want a very rigorous methodology and taxonomy. You don't want it to be loose and creative under the hood, I think, is what you're describing.

Stephen Karafiath: 

That's exactly right. And having been in software development for so many decades, I'm very familiar with the traditional way this would have been architected, but it wouldn't have been able to do most of the features that we're releasing. The thing it would have been good at is having a denormalized database with rows and columns, like an Excel spreadsheet, that could fill in information about these companies. But it would have been super rigid. Every time you update it, you change it: you've got to change APIs, you've got to change databases. So the newer, flashier AI way to architect this, which is how I started coding the beta for it, is to get rid of all that structure entirely. Unstructured data only, like JSON structures that may not even match up with each other from session to session. I was hopeful that we could use an LLM to create this picture of the company, and with just the LLM arrive at a place where we had a clear enough picture of the company to make real business decisions from. But as we alluded to at the beginning of this episode, I ran into the same thing that so many companies are running into, which is why a lot of these promising MVP chat agents aren't making their way to production, for airlines or for others: they're falling short because they're not really grounded to anything, and this unstructured data is floating around. As long as the use case is simple and there's little enough data that it fits nicely within the context window, it can stay relatively consistent and coherent. But as soon as that amount of data gets bigger than what the LLM is able to keep in its context window, it'll start losing stuff and not remembering it. Or it'll make stuff up, and then it won't remember that it made that up. It'll take it as fact.

Jacob Andra: 

I just came up with another really great analogy here, which is human memory. Neural networks obviously operate somewhat like our brain, this kind of loose collection of neurons. And we know that human memory is super fallible. I've had examples of that, where I remember something one way and somebody else who was there remembers it a totally different way, and who knows who was right. And it's actually been shown that we recreate our memories using the same neural pathways as our imagination. So essentially memories are, to a large degree, made-up things we create in our own head. But if you have a journal entry and a photo and all this stuff documenting something, right? If, at the time something happens, it's super well documented, that would be the grounding portion. You lock it in; it's well documented. So no matter how you remember it or how somebody else remembers it, you can always go back to that journal entry, that ledger, that photograph. That's like, hey, here's what actually happened. We have the proof here.

Stephen Karafiath: 

A hundred percent. I really like that analogy, because I do think what's actually going on here is the exact same thing. People can be led to believe things through suggested memories, like how, in courts, it's well understood that eyewitness testimony is often some of the least reliable, especially if witnesses sat down with people who coached them on what to remember, and then you remember the memory of it. So it's well understood that humans are fallible in this way, and we've had to build society around that knowledge. Now we're realizing that LLMs are just as fallible, in some ways more fallible. But because they're so good at producing so much content so quickly, we almost don't see it sometimes. It's creeping in around the edges, but it's only when we're pioneering these things, pushing the limits, pushing the envelope, that it becomes so obvious what's going on. And I think the answer, like you've been alluding to with these symbols, is the equivalent of the journal entries: the actual source footage, the newspaper account from the time, written from the first-person perspective. Those are the symbols that, if we have them and we're continually checking the LLM against them, we can say, hey, wait, is that consistent? Is this matching up with all of our knowns? Or are you way off the reservation? And that's been pivotal in the work we're doing to build these systems internally, as well as what we're starting to provide for our customers.

Jacob Andra: 

Absolutely. So, to sort of summarize: neurosymbolic AI takes the flexibility and amazing qualities of neural networks and then anchors them to a grounded source of truth, the equivalent of the journal entry or the newspaper article. There are different ways of documenting reality and grounding it. With BizForesight, we're doing that with a specific architecture that locks business characteristics and facts in place, so the LLM has no room to mess with them or manipulate or massage them, but it can still use the natural language interface. That's one example, and we're exploring other architectures, because neurosymbolic AI is not one specific thing. It's the discipline of pairing generative AI, which is very loose, with something solid and grounding, and there are multiple architectures to do that.
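One simple way to realize the "locked facts" pattern Jacob describes is to inject the grounded values into the prompt and then verify they survive verbatim in the output. This is a hedged sketch of that general pattern, not BizForesight's actual architecture; the fact names and values are invented for illustration.

```python
# Hypothetical locked facts the model may narrate around but not alter.
LOCKED_FACTS = {
    "annual_revenue": "$4.2M",
    "fiscal_year_end": "June 30",
}

def build_prompt(question: str) -> str:
    """Inject the locked facts into the prompt as non-negotiable context."""
    facts = "\n".join(f"- {k}: {v}" for k, v in LOCKED_FACTS.items())
    return (
        "Answer using ONLY these grounded facts, quoted verbatim:\n"
        f"{facts}\nQuestion: {question}"
    )

def verify_output(text: str) -> bool:
    """Reject any response that drops or rewrites a locked value."""
    return all(value in text for value in LOCKED_FACTS.values())

draft = "Revenue was $4.2M for the year ending June 30."
print(verify_output(draft))  # True: both locked values survived intact
```

A failed `verify_output` check would trigger a retry or a fallback, which is what gives the symbolic layer its veto over the generative one.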

Stephen Karafiath: 

Ah, yeah, I love that analogy. And as long as we're on the analogy train, I just thought of another one. LLMs are relatively loose and flexible and, you know, squishy; that's how we've been describing 'em. They're loosey-goosey. It's almost like an amoeba or a paramecium in the biological world: organisms that are small enough that they don't really need a skeleton, no exoskeleton, no skeleton on the inside, and they work. You look at these things and you're like, wow, it's amazing that this thing's able to do what it does. I bet I could scale this up to even the size of a mouse or a cat. But if you don't put a skeleton in, it's just a bunch of flesh and mass, and I'm no biology expert, but I'm pretty sure an elephant-sized amoeba isn't gonna fare very well. That's what we're running into as we try to scale these up. There are tons of people pointing to how smart these paramecia are: they can steer themselves around, they can eat stuff. And it's like, awesome, I want bigger ones. And oh no, it's falling apart. What's the missing piece? It's the skeleton, and the skeleton is the symbolic part. So extending this analogy, the symbols, the structure, whether it's a knowledge graph or structured data, are the skeleton that can hold it up so you could build something elephant-sized. But it better have some strong bones in it to support all of that flesh.

Jacob Andra: 

That's a really good analogy. I like that even more than some of mine. To really play that out: you get larger organisms in the ocean that don't have a skeleton, like jellyfish, octopi, squid, et cetera, and they seem to do okay in the water. Likewise, LLMs in specific environments, or on specific types of tasks that cater to their strengths, can get way more mileage; the cracks don't show nearly as early. Take them on land, though, and you start having a lot more issues. That would be like a task or environment they're not particularly good at, such as heavy data analysis.

Stephen Karafiath: 

I love how you extended it too, because you're exactly right. For some of the creative tasks, creative writing and that sort of thing, you can create these things as big as you want, like a whale in the ocean, and it's supportable, because it doesn't really matter that it hallucinated a little bit or maybe changed some things a little bit. But if you're asking for a big land animal or, heaven forbid, something that can fly, good luck building that without a structure. And the last thing that comes up for me, so that it doesn't sound like we're being too pessimistic about this: good news, guys. These skeletal structures, these knowledge graphs, these symbols, we can actually leverage LLMs to help create them. In the contexts that LLMs actually understand, they can help create the taxonomy. We don't have to be sitting here in the stone age creating these things by hand. There probably is a lot of need, and Jacob continually convinces me of this more and more, for more humans in the loop: for the value of experts, for the value of real grounded presence, for somebody who has spent their entire life and career on this. None of those people are replaceable right now by AI the way that, say, an entry-level position might be. But the point I'm trying to ring the bell on is: we can use the tools that are currently in front of us to help us build better skeletons, to hang more LLM flesh on, to produce fanciful creatures that people can't even imagine right now. So there's a long road ahead of us, even with today's technology. Which leads me to: where do you think today's technology is headed?
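The workflow Stephen sketches, where an LLM proposes pieces of the skeleton and a human expert signs off before they become ground truth, can be shown with a tiny knowledge-graph example. Everything here is hypothetical: the triples, the entity names, and the approval rule are invented to illustrate the review step, not drawn from any real pipeline.

```python
# Candidate subject-predicate-object triples an LLM might extract
# from source documents (one of them is hallucinated on purpose).
proposed = [
    ("Talbot West", "offers", "AI implementation"),
    ("Talbot West", "headquartered_in", "Atlantis"),  # fabricated
]

def review(triples, approve):
    """Keep only the triples the human-in-the-loop reviewer approves.

    `approve` is a callable standing in for the expert's judgment;
    in practice it would be an interactive review step.
    """
    return [t for t in triples if approve(t)]

# The expert rejects the fabricated location; only vetted triples
# enter the knowledge graph that later grounds the LLM.
graph = review(proposed, approve=lambda t: t[2] != "Atlantis")
print(graph)
```

The design point is that the model accelerates skeleton-building but never writes to the graph directly; the expert's approval is the gate.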

Jacob Andra: 

Yeah. What I see is just a general tapering off of the improvements of large language models. The scaling laws aren't panning out. We're not getting exponential improvements anymore; we're getting incremental improvements, and I think you'll see that asymptotic trend continue. At the same time, you'll see these propping-up structures, these different neurosymbolic pathways, emerging. There will probably be multiple competing approaches, and eventually we'll arrive at something much more functional than what we have, because you're getting the symbolic part and the neural networks working together to stabilize each other. But I don't see that ever approaching anything like AGI. AGI is the big buzzword; everybody's talking about it. I don't see anything we're currently playing with even approaching AGI. It remains a fantasy and a pipe dream. That's not to say we can't have some technological breakthrough, similar to what the transformer did, that completely changes the way we see the industry or the state of things. I expect that will probably happen, but it will be necessary before we're ever even looking at AGI as something realistic.

Stephen Karafiath: 

Yeah. You know, I totally agree with you, and I didn't always; that's something I'll take accountability for. When that transformer paper came out from Google way back, what was it, 2017? Things really started heating up among those of us keeping abreast of what's going on in neural networks. It was just a race of promise: grow these things bigger, grow them bigger, more intelligence. And that continued, I think, all the way through the early days of ChatGPT. Before 3.5 it was like, oh, we just throw more data center at it, throw more vectors at this thing, get a bazillion zillion vectors in here, and all of a sudden, oh my God, it's ten times as smart and ten times as good on the benchmark as the last one. Surely, like Moore's law, this will just continue to scale. And I kind of thought maybe this would get us to a generalized intelligence that we could ask any arbitrary task of, and it would figure it out, figure out the subcomponents and go. But I'm really starting to come around to your way of thinking, which is that we're going to start plateauing in raw capabilities even as they scale these things larger, at least for the current crop of traditional LLMs that everybody's throwing billions of dollars at. They'll still improve in all the ways you talked about: papering over, propping up with skeletons, all of these things. But my current prediction is that it's gonna take another paradigm shift, at least as pivotal as the transformer architecture, to get us anywhere close to AGI. And honestly, just my opinion: I'm actually kind of happy if that happens. If the rate of increase of AI ability plateaus here for a little while, or at least slows down, we have plenty of work to catch up on.
Like, we've got a decade's worth of customer improvements and huge ROI to deliver with the current state of technology, which most people aren't even familiar with, people that don't live and breathe it every day like we do. And just as far as the impacts this has on everybody, I could use a little breather as we implement things to the best of the ability they have today. So we'll see where it goes.

Jacob Andra: 

Totally agree, and it might take not even just one paradigm shift, but multiple. There might have to be an entire sequence of three, or twelve, new paradigm shifts before we get there. We really don't know. AGI remains, at this point, kind of a far-out-there idea. Who's to say if we'll ever get there, or how many paradigm shifts it'll take, or how many years? It's all unknown, and I reserve the right to update my opinion about how doable it is at any time.

Stephen Karafiath: 

Yeah, I mean, that's what we do at Talbot West. I've already had to update my opinion two or three times, with some mea culpas. But the current state is: good news, guys. The LLMs are not gonna take us to some singularity catastrophe. If quantum computing or something else comes out, maybe. But right now I'm feeling pretty safe from these LLMs, which are making these obvious blunders, becoming that much smarter than humans. And also good news: there's still a lot of use for us humans, especially those of you that have some specialized intelligence in your particular fields.

Jacob Andra: 

Absolutely. Well, thanks for joining us today, Steve. It's been a great conversation.

Stephen Karafiath: 

I love it. Thanks for having me on again. Looking forward to doing it again soon.


About us

Talbot West provides digital transformation strategy and AI implementation solutions to enterprise, mid-market, and public-sector organizations. From prioritization and roadmapping through deployment and training, we own the entire digital transformation lifecycle. Our leaders have decades of enterprise experience in big data, machine learning, and AI technologies, and we're acclaimed for our human-first element.

The Applied AI Podcast

The Applied AI Podcast focuses on value creation with AI technologies. Hosted by Talbot West CEO Jacob Andra, it brings in-the-trenches insights from AI practitioners. Watch on YouTube and find it on Apple Podcasts, Spotify, and other streaming services.
