

What is a large language model?

By Jacob Andra / Published June 26, 2024

Last Updated: July 29, 2024

Large language models (LLMs) are a type of generative AI that specializes in processing and producing human-like text based on vast amounts of training data and complex neural network architectures. LLMs are a hot topic: a bibliometric review by Fan et al. (2024) covers more than 5,000 academic publications on them from 2017 to 2023.

LLMs power popular AI chatbots such as ChatGPT and Claude.

As we described in our explanation of generative AI, think of a parrot, only unimaginably sophisticated. Where a parrot learns to repeat phrases, LLMs learn the underlying patterns of language. They don’t actually “understand”; they’re just incredibly good at learning and repeating patterns.

This pattern-matching ability unlocks some pretty awesome use cases for LLMs, which we’ll explore.

Main takeaways

LLMs amplify human abilities across a wide range of disciplines.

To date, LLMs are no substitute for human judgment and creativity.

LLMs have some pretty glaring shortcomings.

Humans who leverage LLMs will outcompete their peers across most knowledge tasks.

We show you how to make the most of LLMs while minimizing their downsides.

How do LLMs work?

Let's take a closer look at how LLMs work under the hood. While the details can get complex, here's a simplified overview of the main components or aspects:

Model design. LLMs use a structure called a transformer. This architecture includes attention mechanisms that help the model understand how words relate to each other in sentences, leading to more natural and coherent text generation.
Neural networks. LLMs rely on artificial neural networks, which are loosely inspired by the human brain: interconnected nodes process and pass along information. The transformer is a specific type of neural network architecture that excels at handling sequential data, such as text.
Training process. The model is trained on massive datasets, usually text, and learns from both the structure and the content by repeatedly predicting the next word (or token) in a passage. Backpropagation adjusts the weights of the neural network to reduce prediction errors. Over time, the model becomes better at using context and generating relevant text.
Specialization. After the initial training, LLMs can be fine-tuned on specific types of text to excel at particular tasks, such as customer service, legal analysis, or medical advice. This fine-tuning process helps the model adapt to specialized vocabularies and contexts.
Practical use. Once ready, LLMs can tackle many different tasks. They analyze input, apply what they've learned, and produce relevant text—whether answering questions, translating, or creating content.
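To make the training step above more concrete, here is a minimal sketch in Python. It is a toy stand-in, not how any production LLM is built: the corpus, the variable names, and the single layer of weights are all our own illustrative assumptions. A real LLM backpropagates errors through many transformer layers, but the core loop is the same idea: predict the next word, measure the error, and nudge the weights to shrink it.

```python
import math

# Toy corpus; every name here is illustrative, not taken from a real LLM codebase.
corpus = "the cat sat on the mat the cat sat on the rug".split()
vocab = sorted(set(corpus))
index = {word: i for i, word in enumerate(vocab)}
pairs = [(index[a], index[b]) for a, b in zip(corpus, corpus[1:])]

# One weight (logit) per (current word, next word) pair, all starting at zero.
weights = [[0.0] * len(vocab) for _ in vocab]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def average_loss():
    """Mean cross-entropy of the true next word under the current weights."""
    total = sum(math.log(softmax(weights[cur])[nxt]) for cur, nxt in pairs)
    return -total / len(pairs)


def train(epochs=200, learning_rate=0.5):
    """Gradient descent on next-word prediction (a one-layer stand-in for backpropagation)."""
    for _ in range(epochs):
        for cur, nxt in pairs:
            probs = softmax(weights[cur])
            for j in range(len(vocab)):
                # Gradient of cross-entropy with respect to logit j.
                grad = probs[j] - (1.0 if j == nxt else 0.0)
                weights[cur][j] -= learning_rate * grad


loss_before = average_loss()
train()
loss_after = average_loss()
print(f"loss before training: {loss_before:.3f}, after: {loss_after:.3f}")
```

The loss drops as training proceeds, which is the small-scale version of a model "becoming better" at predicting language.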

How can LLMs help my company?


Large language models have enterprise applicability far beyond content creation (their most intuitive use case). Here are ten use cases; there are many more:

Customer support automation
Market research and competitive analysis
Personalized marketing and sales
Legal document review and compliance
Financial analysis and forecasting
Human resources and talent management
Supply chain optimization
Product development and innovation
Internal knowledge management
Fraud detection and risk management

Let’s dig a bit deeper to see how LLMs apply to these (and many other) business applications.

Customer support automation

LLMs power sophisticated chatbots and virtual assistants that provide 24/7 customer support.

Benefits:

Reduces need for human agents, lowering operational costs
Provides instant responses, improving customer satisfaction
Handles a wide range of issues, from troubleshooting to order tracking

Example: a telecommunications company could implement an LLM-powered chatbot, potentially reducing call center volume by 30% and improving first-contact resolution rates.

Considerations:

Integration with existing CRM systems is crucial for seamless operation
Human oversight is still necessary for complex issues
Regular updates are needed to keep the AI current with products and policies

Market research and competitive analysis

LLMs analyze vast amounts of market data, customer reviews, and competitor information to generate insights.

Benefits:

Automates collection and analysis of market data
Identifies trends, customer sentiments, and emerging opportunities
Provides actionable insights for strategic decision-making

Example: a retail chain might use LLM analysis of social media and review sites to identify an emerging consumer preference, potentially leading to a successful new product line.

Considerations:

Data quality and source diversity are crucial for accurate insights
LLM analysis should complement, not replace, traditional market research methods
Interpreting results still requires human expertise and industry knowledge

Personalized marketing and sales

LLMs create highly personalized marketing messages and sales pitches by analyzing customer data and preferences.

Benefits:

Increases effectiveness of marketing campaigns
Enhances customer engagement and loyalty
Improves conversion rates by delivering tailored content

Example: an e-commerce platform could use LLM-generated personalized product recommendations, potentially resulting in a 15% increase in average order value.

Considerations:

Integration with existing marketing automation tools is key
Privacy concerns must be addressed, ensuring compliance with data protection regulations
Regular A/B testing is necessary to optimize LLM-generated content

Legal document review and compliance

LLMs assist in reviewing legal documents, assessing compliance, and identifying potential risks.

Benefits:

Speeds up document review process
Reduces likelihood of human error
Ensures contracts and agreements comply with relevant regulations

Example: a multinational corporation might use LLMs to review thousands of contracts for GDPR compliance, potentially completing in minutes what would have taken months manually.

Considerations:

Human oversight is crucial, especially for high-stakes documents
LLMs must be trained on up-to-date legal information and precedents
Confidentiality and data security are paramount when handling sensitive legal documents

Financial analysis and forecasting

LLMs analyze financial data, generate reports, and provide forecasts based on historical data and market trends.

Benefits:

Enhances accuracy of financial predictions
Automates generation of financial reports
Identifies potential investment opportunities and risks

Example: An investment firm could use LLM-powered analysis to predict market trends, potentially outperforming traditional forecasting methods by 20%.

Considerations:

Integration with existing ERP and financial systems is necessary
LLM analysis should complement, not replace, human financial expertise
Regular model updates are needed to account for changing economic conditions

Human resources and talent management

LLMs streamline HR processes such as recruitment, employee onboarding, and performance evaluations.

Benefits:

Automates resume screening and candidate matching
Provides personalized onboarding experiences
Analyzes employee performance data to identify areas for improvement

Example: A tech company might use LLM-powered resume screening to reduce time-to-hire by 40% while increasing the quality of candidates interviewed.

Considerations:

Bias in training data must be addressed to ensure fair hiring practices
Integration with HRIS systems is crucial for seamless operation
Employee privacy concerns must be carefully managed

Supply chain optimization

LLMs analyze supply chain data to optimize logistics, inventory management, and demand forecasting.

Benefits:

Improves supply chain efficiency and reduces costs
Enhances inventory management by predicting demand more accurately
Identifies potential disruptions and suggests mitigation strategies

Example: a global manufacturer could use LLM analysis to optimize its supply chain, potentially reducing inventory costs by 15% and improving on-time deliveries by 10%.

Considerations:

Integration with existing supply chain management software is key
Data quality from various sources (suppliers, logistics partners) is crucial
Regular model updates are needed to account for changing global conditions

Product development and innovation

LLMs assist in ideation and development of new products by analyzing customer feedback and market trends.

Benefits:

Accelerates product development cycle
Identifies unmet customer needs and market gaps
Generates innovative ideas based on data-driven insights

Example: A consumer electronics company might use LLM analysis of customer reviews and support tickets to identify a key feature for their next product, potentially leading to improved sales.

Considerations:

LLM insights should complement, not replace, traditional R&D processes
Intellectual property concerns must be addressed when using AI-generated ideas
Balancing AI-driven insights with human creativity and intuition is crucial

Internal knowledge management

LLMs are an important component of intelligent knowledge management systems that provide employees with quick access to information. These can take the form of retrieval-augmented generation (RAG) setups or other architectures, but the core goal is the creation of an internal AI expert for your operations and processes.

Benefits:

Enhances employee productivity by reducing time spent searching for information
Ensures knowledge is easily accessible and up-to-date
Facilitates better collaboration and information sharing

Example: A consulting firm could implement an LLM-powered knowledge base, potentially reducing time spent on research by 30% and improving project delivery times.

Considerations:

Proper data preprocessing prepares your knowledge base for AI ingestion
Regular updates and maintenance of the knowledge base are crucial
Balancing accessibility with data security and confidentiality is important
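For readers curious about what a RAG setup does under the hood, here is a minimal sketch of the retrieval half. Everything in it is an assumption for illustration: the three-sentence knowledge base, the function names, and the use of simple word-overlap (cosine similarity over word counts) in place of the learned embeddings and vector databases that production systems typically use. The idea it shows is real, though: find the most relevant internal documents first, then hand them to the LLM alongside the question.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into words and numbers."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, top_k=2):
    """Return the top_k documents most similar to the question."""
    query = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents, top_k=2):
    """Assemble the text that would be sent to an LLM (the model call itself is omitted)."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical internal knowledge base.
knowledge_base = [
    "Expense reports are due on the fifth business day of each month.",
    "New laptops are requested through the IT service portal.",
    "Client proposals must be reviewed by a partner before sending.",
]

prompt = build_prompt("When are expense reports due?", knowledge_base, top_k=1)
print(prompt)
```

Because the model only sees the retrieved context, the quality of the knowledge base and of the retrieval step largely determines the quality of the answer.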

Fraud detection and risk management

LLMs analyze transaction data and identify patterns indicative of fraudulent activities or potential risks.

Benefits:

Enhances accuracy of fraud detection systems
Reduces financial losses
Provides real-time monitoring and alerts for suspicious activities

Example: A financial institution might implement LLM-powered fraud detection, potentially reducing false positives by 40% and catching sophisticated fraud schemes that traditional methods could miss.

Considerations:

Compliance with financial regulations is critical when implementing AI-powered systems
Regular model updates are needed to stay ahead of evolving fraud tactics
Human oversight is still necessary for investigating and confirming potential fraud

Other uses of LLMs in business

Large language models have almost unlimited use cases in enterprise. Here at Talbot West, we regularly employ them to drive efficiencies in all sorts of tasks, from custom image generation to data analysis to project management.

If you’re interested in the applicability of AI systems to your business, we’ll be happy to talk to you about your needs and priorities, and we can recommend specific tools and technologies to address those.


LLM capabilities

LLMs are much more than spell-checkers. Let's explore some of their most impressive abilities:

Natural language understanding
Text generation
Translation
Sentiment analysis

Natural language understanding

LLMs handle context, syntax, and semantics with remarkable precision. They can often infer intent, even through sarcasm or complex jargon. This ability powers better chatbots and virtual assistants, which leads to smoother conversations with customers.

Text generation

These artificial intelligence systems craft coherent, contextually appropriate content, from articles and marketing copy to creative pieces such as stories and poems. An AI can write a compelling product description or pen a sonnet about artificial intelligence.

Content creators in the media, marketing, and entertainment sectors use this capability to boost productivity and creativity.

Translation

LLMs break down language barriers. They excel in translation by grasping the nuances of different languages. They produce high-quality translations that preserve the original text's meaning and tone.

A global e-commerce company can use LLMs to accurately translate product descriptions, customer reviews, and support documents into multiple languages.

Sentiment analysis

LLMs uncover underlying sentiments in text. This ability gives businesses valuable insight into customer opinions and emotions. A hotel chain can analyze thousands of online reviews in minutes, identify trends in customer satisfaction, and pinpoint areas for improvement. Such information shapes marketing strategies, drives product development, and enhances customer service.
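Here is a rough sketch of how the hotel review scenario could be wired up. It is deliberately model-agnostic: `complete` stands for whatever function sends a prompt to your LLM of choice and returns its reply, and `fake_model` is a keyword-based stand-in we made up so the example runs offline. The prompt wording, labels, and function names are all illustrative assumptions.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")


def build_sentiment_prompt(review):
    """Wrap a review in instructions that ask for a one-word label."""
    return (
        "Classify the sentiment of this hotel review as positive, negative, "
        "or neutral. Reply with one word.\n\n"
        f"Review: {review}"
    )


def classify(review, complete):
    """`complete` is any function that maps a prompt string to the model's reply."""
    reply = complete(build_sentiment_prompt(review)).strip().lower()
    # Fall back to "neutral" if the model replies with something unexpected.
    return reply if reply in LABELS else "neutral"


def summarize(reviews, complete):
    """Count how many reviews fall under each sentiment label."""
    return Counter(classify(review, complete) for review in reviews)


def fake_model(prompt):
    """Keyword stand-in so the sketch runs offline; not a real LLM."""
    text = prompt.split("Review:", 1)[1].lower()
    if "dirty" in text or "rude" in text:
        return "Negative"
    if "great" in text or "friendly" in text:
        return "Positive"
    return "Neutral"


reviews = [
    "Great location and friendly staff.",
    "The room was dirty.",
    "Checked in at noon.",
]
print(summarize(reviews, fake_model))
```

Swapping `fake_model` for a real model call is the only change needed to run the same tally over thousands of reviews; validating the model's replies against a fixed label set, as `classify` does, guards against off-script answers.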


Types of large language models

LLMs are a recent offshoot of the fields of natural language processing (NLP) and AI. They can understand, generate, and manipulate human language in ways that were previously unimaginable. While there are many LLMs available, they generally fall into the following (often overlapping) categories based on their architecture, training data, and use cases.

Transformer-based models
Autoregressive models
Encoder-decoder models
Multimodal models
Specialized domain models

Transformer-based models

In 2017, a group of researchers introduced the transformer architecture, and it turned the natural language processing world upside down. Here's why: instead of processing words one after another like a slow reader, transformer models look at all the words in a sentence at once. This parallel processing helps the model grasp context much better, like understanding that "bank" means something different in "the river overflowed its bank" versus "I set up my bank account" versus "bank the airplane."
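The "look at all the words at once" idea can be sketched with a stripped-down version of self-attention. This is a simplification under stated assumptions: the two-dimensional word vectors are invented for illustration (one dimension loosely meaning "water," the other "money"), and we skip the learned query, key, and value projections that real transformers apply. What remains is the core mechanism: each word scores every other word, and its new representation is a weighted blend of its neighbors.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values = vectors."""
    scale = math.sqrt(len(vectors[0]))
    outputs, all_weights = [], []
    for query in vectors:
        # Score this word against every word in the sentence, itself included.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in vectors]
        weights = softmax(scores)
        # Blend all word vectors according to the attention weights.
        mixed = [
            sum(w * value[d] for w, value in zip(weights, vectors))
            for d in range(len(query))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical embeddings: dimension 0 ~ "water", dimension 1 ~ "money".
sentence = {
    "river": [1.0, 0.0],
    "bank": [0.6, 0.6],
    "overflowed": [0.9, 0.1],
}
outputs, weights = self_attention(list(sentence.values()))
bank_after = outputs[1]
print("bank before:", sentence["bank"], "bank after:", [round(x, 3) for x in bank_after])
```

On its own, "bank" sits evenly between the two meanings. After attending to "river" and "overflowed," its vector leans toward the "water" dimension, which is the toy version of resolving the word from context.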

Autoregressive models

Picture a word-guessing game where you predict the next word based on what came before. That's how autoregressive models work. They excel at text generation tasks, churning out human-like text that can range from a few sentences to entire paragraphs or even stories.
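The word-guessing game can be written in a few lines of Python. This sketch uses a bigram model, which predicts each word from only the single word before it; that is a drastic simplification we chose for illustration, since real autoregressive LLMs condition on long stretches of preceding text. The sample text and function names are our own.

```python
import random
from collections import Counter, defaultdict


def build_bigram_counts(text):
    """Count how often each word follows each other word."""
    words = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, start, max_words=8, seed=0):
    """Generate text one word at a time, each sampled from what followed the previous word."""
    rng = random.Random(seed)
    output = [start]
    for _ in range(max_words - 1):
        options = counts.get(output[-1])
        if not options:
            break  # The last word never appeared mid-text, so there is nothing to predict.
        words, freqs = zip(*options.items())
        output.append(rng.choices(words, weights=freqs, k=1)[0])
    return " ".join(output)


counts = build_bigram_counts("the cat sat on the mat and the cat ran to the door")
print(generate(counts, "the"))
```

Each generated word becomes part of the input for the next prediction, which is exactly what "autoregressive" means.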

Encoder-decoder models

Encoder-decoder models take in one form of text (the input sequence) and flip it into another (the output sequence). This makes them perfect for tasks such as translating "Hello, how are you?" into "Bonjour, comment allez-vous?" or condensing a long article into a brief summary.

Multimodal models

Multimodal models can work with text, images, audio, and even video. As AI applications grow more complex, these versatile models become increasingly important. An AI that can describe what's in a photo, transcribe a podcast, and then summarize it all in a neat paragraph uses multimodal models.

Specialized domain models

Specialized domain models focus on specific industries or tasks, using domain-specific data to boost their performance. For example, a model trained on medical literature will likely outperform a general-purpose model when it comes to analyzing patient symptoms or suggesting treatment options.

How do LLMs impact different industries?

For any enterprise, LLMs offer efficiencies in marketing, HR, and routine knowledge tasks. For certain industries, LLMs will bring even deeper disruption.

Healthcare
Legal
Education
Finance
Manufacturing

Healthcare

In healthcare, LLMs can analyze complex medical data and improve diagnostic accuracy and treatment planning. They can process patient records, lab results, and imaging studies to spot patterns human doctors might overlook.

In drug discovery, LLMs can speed up the process by helping predict molecular interactions and potential side effects. They can also contribute to personalized treatment plans based on a patient's genetic profile and medical history.

Legal

In the legal field, LLMs can streamline contract analysis, due diligence, and case law research. They quickly pinpoint relevant precedents across large databases of legal documents, which saves lawyers hours of manual search time.

LLMs also support the drafting of legal documents, maintaining consistency with established legal language and minimizing errors. For litigation, these models help predict case outcomes through analysis of historical data and current case details.

Education

LLMs are poised to transform education with adaptive learning systems. They analyze student performance data to create personalized learning paths and adjust difficulty levels and content as students progress.

For language learning, LLMs offer real-time feedback on pronunciation and grammar and mimic natural conversations. They also assist educators who create diverse, engaging content and assessments tailored to different learning styles.

Finance

In the financial sector, LLMs can strengthen risk assessment and fraud detection. They analyze market trends, news, and company reports to generate insights for investment decisions.

LLMs also power advanced chatbots that handle complex financial queries and assist with financial planning. For algorithmic trading, these models help refine strategies and process vast amounts of market data.

Manufacturing

LLMs are set to revolutionize manufacturing with predictive maintenance and supply chain optimization. They analyze sensor data from machinery to forecast failures before they occur and reduce downtime.

For product design, LLMs offer engineers improvement suggestions based on performance data and customer feedback. They can also fine-tune supply chains, forecast demand fluctuations, and identify potential disruptions.

Benefits of large language models


LLMs promise to revolutionize many aspects of business, offering unprecedented capabilities and efficiencies. These models enhance productivity, boost customer experiences, and drive smarter decision-making.

Improved efficiency. LLMs automate routine tasks such as customer support, data entry, and content creation, freeing up human resources for more strategic activities.
Enhanced accuracy. Given enough data, LLMs can identify patterns and trends that humans might miss. This leads to more accurate insights and predictions, improving decision-making across various business functions.
Scalability. LLMs handle large volumes of data and interactions, and scale operations without a proportional increase in cost.
Personalization. LLMs analyze customer data to deliver personalized experiences. Whether through tailored marketing messages, product recommendations, or customized support, businesses can enhance customer satisfaction and loyalty.
Cost reduction. Automating tasks with LLMs reduces the need for extensive human labor, leading to significant cost savings. Improved efficiency and accuracy further reduce operational costs and errors.
Knowledge management. LLMs help in organizing and retrieving information efficiently. They can sift through large datasets, documents, and reports to provide relevant information quickly, supporting better knowledge management within organizations.
Innovation and creativity. By generating new ideas, drafting content, and even assisting in product design, LLMs foster innovation and creativity. They provide a fresh perspective and can help in brainstorming and developing new concepts.
Data-driven insights. LLMs process and analyze data to generate actionable insights. These insights help businesses understand market trends, customer behavior, and operational performance.
Enhanced collaboration. LLMs facilitate better communication and collaboration by summarizing documents, translating languages, and providing real-time information. This supports teams in working together more effectively, regardless of location.

LLM limitations and challenges

While LLMs offer remarkable capabilities, they bring some major challenges and limitations:

LLMs can inadvertently learn and perpetuate biases present in their training data.
The decision-making process of LLMs is often opaque, so it can be difficult to understand how they arrive at certain conclusions.
The performance of LLMs heavily depends on the quality of the training data.
While LLMs excel at many tasks, they can struggle with domain-specific knowledge or highly specialized tasks without additional fine-tuning.
The potential misuse of LLMs for generating fake news, deepfakes, or other malicious content poses significant ethical challenges.
Keeping LLMs up-to-date with the latest information requires ongoing maintenance and periodic retraining, which can be resource-intensive and complex.
As the size and complexity of LLMs increase, deploying and scaling these models across different platforms and applications becomes more challenging.
The high energy consumption associated with training and running large models raises concerns about their environmental footprint.

Examples of popular large language models

The large language models below showcase the rapid advancements in AI and natural language processing. Each model brings unique strengths and innovations, pushing the boundaries of what is possible in understanding and generating human language.

Here are some of the most popular LLMs today:

Model	Developer	Features
GPT-4	OpenAI	Advanced language understanding and generation capabilities; handles complex queries and generates coherent, contextually relevant responses; fine-tuned on diverse datasets for broad task performance
Gemini	Google	State-of-the-art text understanding and generation; enhanced multimodal capabilities combining text and visual information; fine-tuned for specific applications, improving accuracy and contextual relevance
T5	Google	Converts all NLP tasks into a text-to-text format; simplifies the fine-tuning process for different tasks; handles translation, summarization, and question-answering tasks with high accuracy
RoBERTa	Facebook AI	Optimized version of BERT (Google's LLM); extended training on larger datasets; utilizes higher computational power to improve performance on a wide range of NLP tasks
Megatron	NVIDIA	Handles large datasets and complex computations; performs well on many NLP tasks with high efficiency; scalable architecture for extensive model training
XLNet	Google AI	Combines autoregressive and autoencoding models; generates the next token based on all permutations of the sentence for a comprehensive understanding of the context; enhanced performance in understanding and generating contextually accurate text

Contact Talbot West

Talbot West excels in harnessing the power of large language models to generate valuable insights for businesses. Whether you're looking to create engaging content or streamline customer support with AI, we offer targeted insights for your use case.

Discover how our cutting-edge approach can elevate your data strategy and drive your business forward. Schedule a free consultation with our AI experts.


LLMs FAQ

What is the difference between NLP and an LLM?

NLP is a broad field that focuses on the interaction between computers and human language. LLMs are specific AI systems trained on vast amounts of text data to perform language-related tasks. LLMs are a powerful tool within the NLP domain.

Is ChatGPT a large language model?

ChatGPT is a chatbot that uses a large language model called GPT, developed by OpenAI. GPT uses billions of parameters and advanced machine learning techniques to understand and generate human-like text. It has been trained on massive amounts of data and demonstrates impressive capabilities across a wide range of tasks, from content generation to language translation.

What are deep learning models?

Deep learning models are artificial neural networks with multiple layers between the input and output. These models use complex mathematical functions to process data through hidden layers. They excel at tasks such as image recognition, natural language processing, and speech recognition. Deep learning powers many modern AI applications.
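As a small illustration of "layers between the input and output," here is a sketch of a forward pass through a network with one hidden layer. The weights are hand-picked by us (not learned) so that the network computes XOR, a classic function that a network with no hidden layer cannot represent. The function names and layer format are assumptions made for this example.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum of the inputs plus a bias per neuron."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """Non-linear activation: negative values become zero."""
    return [max(0.0, v) for v in values]


def sigmoid(value):
    """Squash a number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def forward(inputs, layers):
    """Pass inputs through each hidden layer (dense + ReLU), then a sigmoid output."""
    *hidden, (out_weights, out_biases) = layers
    activations = inputs
    for weights, biases in hidden:
        activations = relu(dense(activations, weights, biases))
    return sigmoid(dense(activations, out_weights, out_biases)[0])


# Hand-picked weights: one hidden layer with two neurons, then one output neuron.
xor_layers = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]),
    ([[10.0, -20.0]], [-5.0]),
]

for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        print(int(a), int(b), "->", round(forward([a, b], xor_layers), 3))
```

Deep learning models stack many such layers and learn the weights from data instead of having them set by hand.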

Can LLMs help with programming?

Large language models understand code syntax, structure, and patterns. LLMs can generate code snippets, explain programming concepts, debug errors, and even translate between different programming languages.

How do attention mechanisms improve LLMs?

Attention mechanisms enhance LLMs by allowing them to focus on the relevant context within input data. This leads to more accurate and coherent outputs, especially in tasks such as machine translation and generating entire sentences with human-like quality.

What are foundation models in AI?

Foundation models are large, pre-trained models that serve as a base for different downstream tasks. They handle complex NLP tasks such as language generation and machine translation, and they require significant computing power and extensive training to build.

Why is human intervention important in using LLMs?

Human intervention helps ensure the outputs of LLMs are accurate and appropriate. Despite their ability to generate human-like text, LLMs may produce incorrect or biased information, which is why human oversight is important for maintaining quality and relevance.

Resources

Fan, L., Li, L., Lee, S., Yu, H., & Hemphill, L. (2024). A Bibliometric Review of Large Language Models Research from 2017 to 2023. https://arxiv.org/pdf/2304.02020
Araci, D. (2019). FinBERT: Financial Sentiment Analysis with Pre-trained Language Models. In arXiv [cs.CL]. arXiv. http://arxiv.org/abs/1908.10063
Beltagy, I., Lo, K., & Cohan, A. (2019). SciBERT: A Pretrained Language Model for Scientific Text. In arXiv [cs.CL]. arXiv. http://arxiv.org/abs/1903.10676
Carlini, N., Tramer, F., Wallace, E., Jagielski, M., Herbert-Voss, A., Lee, K., Roberts, A., Brown, T., Song, D., Erlingsson, U., Oprea, A., & Raffel, C. (2020). Extracting training data from large language models. In arXiv [cs.CR]. arXiv. https://www.usenix.org/system/files/sec21-carlini-extracting.pdf
Chen, M., Tworek, J., Jun, H., Yuan, Q., de Oliveira Pinto, H. P., Kaplan, J., Edwards, H., Burda, Y., Joseph, N., Brockman, G., Ray, A., Puri, R., Krueger, G., Petrov, M., Khlaaf, H., Sastry, G., Mishkin, P., Chan, B., Gray, S., … Zaremba, W. (2021). Evaluating Large Language Models Trained on Code. In arXiv [cs.LG]. arXiv. http://arxiv.org/abs/2107.03374
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In arXiv [cs.CL]. arXiv. http://arxiv.org/abs/1810.04805
Grootendorst, M. (2022). BERTopic: Neural topic modeling with a class-based TF-IDF procedure. In arXiv [cs.CL]. arXiv. http://arxiv.org/abs/2203.05794
Okerlund, J., Klasky, E., Middha, A., Kim, S., Rosenfeld, H., Kleinman, M., & Parthasarathy, S. (2022). What’s in the Chatterbox? Large Language Models, Why They Matter, and What We Should Do About Them. University of Michigan. https://stpp.fordschool.umich.edu/sites/stpp/files/2022-05/large-language-models-TAP-2022-final-051622.pdf
Jurafsky, D., & Martin, J. H. (2023). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. https://web.stanford.edu/~jurafsky/slp3/ed3book_jan72023.pdf
Kim, B., Kim, H., Lee, S.-W., Lee, G., Kwak, D., Jeon, D. H., Park, S., Kim, S., Kim, S., Seo, D., Lee, H., Jeong, M., Lee, S., Kim, M., Ko, S. H., Kim, S., Park, T., Kim, J., Kang, S., … Sung, N. (2021). What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers. In arXiv [cs.CL]. arXiv. http://arxiv.org/abs/2109.04650
Meta AI. (2023, February 24). Introducing LLaMA: A foundational, 65-billion-parameter large language model. Meta AI. https://ai.facebook.com/blog/large-language-model-llama-meta-ai/
Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018, June 11). Improving language understanding by generative pre-training. OpenAI. https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf
Shang, J., Ma, T., Xiao, C., & Sun, J. (2019). Pre-training of Graph Augmented Transformers for Medication Recommendation. In arXiv [cs.AI]. arXiv. http://arxiv.org/abs/1906.00346
Shoeybi, M., Patwary, M., Puri, R., LeGresley, P., Casper, J., & Catanzaro, B. (2019). Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism. In arXiv [cs.CL]. arXiv. http://arxiv.org/abs/1909.08053
Thoppilan, R., De Freitas, D., Hall, J., Shazeer, N., Kulshreshtha, A., Cheng, H.-T., Jin, A., Bos, T., Baker, L., Du, Y., Li, Y., Lee, H., Zheng, H. S., Ghafouri, A., Menegali, M., Huang, Y., Krikun, M., Lepikhin, D., Qin, J., … Le, Q. (2022). LaMDA: Language Models for Dialog Applications. In arXiv [cs.CL]. arXiv. http://arxiv.org/abs/2201.08239
Wei, J., Tay, Y., Bommasani, R., Raffel, C., Zoph, B., Borgeaud, S., Yogatama, D., Bosma, M., Zhou, D., Metzler, D., Chi, E. H., Hashimoto, T., Vinyals, O., Liang, P., Dean, J., & Fedus, W. (2022). Emergent Abilities of Large Language Models. In arXiv [cs.CL]. arXiv. http://arxiv.org/abs/2206.07682
Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R. R., & Le, Q. V. (2019). XLNet: Generalized autoregressive pretraining for language understanding. Advances in Neural Information Processing Systems, 32. https://proceedings.neurips.cc/paper/2019/hash/dc6a7e655d7e5840e66733e9ee67cc69-Abstract.html
Zhao, W. X., Zhou, K., Li, J., Tang, T., Wang, X., Hou, Y., Min, Y., Zhang, B., Zhang, J., Dong, Z., Du, Y., Yang, C., Chen, Y., Chen, Z., Jiang, J., Ren, R., Li, Y., Tang, X., Liu, Z., … Wen, J.-R. (2023). A Survey of Large Language Models. In arXiv [cs.CL]. arXiv. http://arxiv.org/abs/2303.18223

About the author

Jacob Andra is the CEO of Talbot West as well as of BizForesight, an AI-powered M&A platform built and partially owned by Talbot West. He hosts The Applied AI Podcast and spends his time pushing the limits of what AI can accomplish in real-world applications. Jacob speaks, writes, and publishes extensively on digital transformation, AI integration, and business process improvement. His expertise spans multiple disciplines, including business strategy, systems integration, digital transformation, and applied artificial intelligence. He's the co-developer of Cognitive Hive AI (CHAI), a modular, composable ensemble framework, and the developer of the Talbot West AI Prioritization and EXecution (APEX) methodology for mapping business opportunities and surfacing the best opportunities for applied AI.

Industry insights

We stay up to speed in the world of AI so you don’t have to.

Talbot West & Lucidity Sciences Announce Partnership, Joint Advisory Board Appointments

Talbot West Adds Legal & Compliance Expertise to Advisory Board With Adam Wardel

Digital transformation strategy: how to do it the right way

From blowtorches to boardrooms: why I partnered with Jacob Andra

What is neurosymbolic AI?

Big Consulting is realizing that they can't continue to justify their billable-hour model for strategic analysis when AI delivers better analysis in minutes.

McKinsey in WSJ: how Big Consulting is adapting to the age of AI, and how Talbot West is already there

Composable AI is AI architecture built from modular, interchangeable components that can be rapidly assembled, updated, or reconfigured. In short, it’s another term for Talbot West’s Cognitive Hive AI (CHAI) architecture that we’ve been championing for a long time now.

Composable AI: the future of intelligent enterprise

Most treat “build vs buy” as a straightforward choice between speed and customization, cost and control. They're wrong. It’s a complex optimization problem disguised as a simple choice. Organizations think they're weighing two options when they're actually navigating dozens of variables they don't know exist.

Buy or build an AI solution? How to evaluate your options.

APEX (AI Prioritization and EXecution) cuts through the noise. Our process identifies your single best AI opportunity and hands you the blueprint to deploy it.

AI Prioritization and Execution (APEX): a decisionmaking framework

Total organizational intelligence is inevitable by 2030, according to digital transformation advisory Talbot West

The Talbot West 5-year thesis

AI efficiency for mergers and acquisitions lifecycle

AI across the M&A lifecycle

BizForesight is an AI-powered business assessment platform that serves two distinct audiences while creating value for both. For business owners, it delivers sophisticated valuation insights and strategic guidance based on proprietary data from thousands of actual transactions. The platform helps owners understand their company's worth and identify optimal paths forward—whether growing, transitioning management, or planning an exit. Simultaneously, BizForesight functions as a qualified lead generation engine for professional service providers in the M&A ecosystem. The platform intelligently matches business owners with relevant professionals who can help implement their chosen strategies. Led by Bill McCalpin, Chair of the Alliance of Mergers & Acquisitions Advisors, and powered by Talbot West's AI technology, BizForesight has 400 business owners queued for its summer 2025 launch. This positions the platform to become the industry's largest deal flow driver by year-end 2025.

BizForesight: an AI-powered business assessment tool

What is reinforcement learning in CHAI?

Build an efficient sales engine with AI capabilities

Why do professional services firms love to refer their business clients to Talbot West?

An AI central nervous system for your organization

Physical AI: Where gen AI, natural language, and robotics meet in the physical world

Invisible AI for law firms: a new paradigm for legal tech

What is vector embedding and why does it matter?

What is AI middleware and how does it make my business more efficient?

Invisible AI: the evolution of SaaS and why your team doesn’t need another “product” to learn

Use AI to turn fixed-fee legal work into a profit center for your firm

How to fight advanced persistent threats (APTs) with AI

AI and law: the opportunity of AI for the legal profession

What is a variational autoencoder and what is its usefulness for enterprise?

AI and cybersecurity: How AI can help us defend ourselves

AI-powered OSINT: A system of systems approach to intelligence

What are hyperparameters in neural networks?

What is a small language model?

The future of AI and the power of modular systems: thoughts from Stephen Karafiath

How AI can make government more efficient while unlocking new capabilities

Gray zone warfare part 5: We need better detection capabilities

Gray zone warfare part 4: Deterrence in the gray zone

Why system of systems is the future of AI deployment
