Steeped in history, built for the future, the new Rewire office in Amsterdam is a hub for innovation, collaboration, and impact
Rewire has taken a big step forward: a new home in Amsterdam! But this isn’t just an office. It’s a reflection of our growth, vision, and commitment to driving business impact through data & AI. With this new office, we aim to create an inspiring home away from home for our employees, and a great place to connect and collaborate with our clients and partners.
Steeped in history, built for the future
Amsterdam is a city where history and innovation coexist. Our new office embodies this spirit. Nestled in scenic Oosterpark, which dates back to 1891, and set in the beautifully restored Amstel Brewery stables from 1912, our location is a blend of tradition and modernity.


The fully refurbished Rewire office in Amsterdam once served as the stables of the Amstel brewery.
But this isn’t just about aesthetics—our presence within one of Amsterdam’s most vibrant areas is strategic. Surrounded by universities, research institutions, and cultural and entertainment landmarks, we’re positioned at the crossroads of academia, industry, and creativity. Moreover, the city has established itself as a global hub for Data & AI, attracting talent, startups, research labs, and multinational companies from around the world. Thus, we’re embedding ourselves in an environment that fosters cross-disciplinary collaboration and promotes impact-driven solutions. All in all, our new location ensures that we stay at the forefront of AI while providing an inspiring setting to work, learn, and imagine the future.
A space for collaboration & growth
At Rewire, we believe that knowledge-sharing and hands-on learning are at the core of meaningful AI adoption. That’s why our new Amsterdam office is more than just a workspace—it’s a hub for collaboration, education, and innovation.

More than just an office, it's a hub for collaboration, education, and innovation.
We’ve created dedicated spaces for GAIN’s in-person training, workshops, and AI bootcamps, reinforcing our commitment to upskilling talent and supporting the next generation of AI professionals. Whether through hands-on coding sessions, strategic AI leadership discussions, or knowledge-sharing events, this space is designed to develop the Data & AI community of tomorrow.
Beyond structured learning, we’ve designed our office to be an environment where teams can engage in deep problem-solving, collaborate on projects, and push the boundaries of what can be achieved. Our goal is to bridge the gap between research and real-world application, thus helping clients leverage AI’s full potential.


The new office includes a variety of spaces, from small quiet spaces for deep thinking to large open spaces for group work and socializing.
Sustainability & excellence at the core
Our new office is built to the highest standards of quality and sustainability, incorporating modern energy-efficient design, eco-friendly materials, and thoughtfully designed workspaces.

Eco-friendly and energy efficient, the new office retains the fixtures of the original Amstel stables.
We’ve curated an office that balances dynamic meeting spaces, collaborative areas, and quiet zones for deep thinking—all designed to support flexibility, focus, and innovation. Whether brainstorming the next AI breakthrough or engaging in strategic discussions, our employees have a space that fosters creativity, problem-solving, and impactful decision-making.

The office is not just next to Oosterpark, one of Amsterdam's most beautiful parks. It is also home to thousands of plants.
We’re just getting started
Our move to a new office in Amsterdam represents a new chapter in Rewire’s journey, but this is only the beginning. As we continue to expand, build, and collaborate, we look forward to engaging with the broader Data & AI community, fostering innovation, and shaping the future of AI-driven impact.
We’re excited for what’s ahead. If you’re interested in working with us, partnering on AI initiatives, or visiting our new space—contact us!

The team that made it happen.
How a low-cost, open-source approach is redefining the AI value chain and changing the reality for corporate end-users
At this point, you’ve likely heard about it: DeepSeek. Founded in 2023 by Liang Wenfeng, co-founder of the quantitative hedge fund High-Flyer, this Hangzhou-based startup is rewriting the rules of AI development with its low-cost model development and open-source approach.
Quick recap first.
From small steps to giant leaps
The startup has launched two model families: DeepSeek-V3 and DeepSeek-R1.
V3 is a Mixture-of-Experts (MoE) large language model (LLM) with 671 billion parameters. Thanks to a number of optimizations, it can provide similar or better performance than other large foundational models, such as GPT-4o and Claude-3.5-Sonnet. What’s even more remarkable is that V3 was trained in around 55 days at a fraction of the cost of similar models developed in the U.S.: less than US$6 million for DeepSeek-V3, compared with tens of millions, or even billions, of dollars in investments for its Western counterparts.
R1, released on January 20, 2025, is a reasoning LLM that uses innovations applied to the V3 base model to greatly improve its performance in reasoning. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. The kicker is that R1 is published under the permissive MIT license, which allows developers worldwide to modify the model for proprietary or commercial use, paving the way for accelerated innovation and adoption.
Less is more: reshaping the AI value chain
As other companies emulate DeepSeek’s achievements, its cost-effective approach and open-sourcing of its technology signal three upcoming shifts:
1. Lower fixed costs and increased competition. DeepSeek shows that cutting-edge LLMs no longer require sky-high investments. Users have reported running the V3 model on consumer Mac hardware and even foresee its use on devices as lightweight as a Raspberry Pi. This opens the door for smaller players to develop their own models.
2. Higher variable costs driven by higher query costs. As reasoning models become widely available and more sophisticated, their per-query costs rise due to the energy-intensive reasoning processes. This dynamic creates new opportunities for purpose-built models – as opposed to general-purpose models like OpenAI’s ChatGPT.
3. Greater value creation down the AI value chain. As LLM infrastructure becomes a commodity (much like electricity) and companies develop competing models that dent the monopolistic power of big tech (although the latter will continue to push the boundaries with new paradigm breakthroughs), the development of solutions in the application layer – where most of the value for end-users resides – becomes much more accessible and cost-effective. Hence, we expect accelerated innovations in the application layer and the AI market to move away from “winner-takes-all” dynamics, with the emergence of diverse applications tailored to specific industries, domains, and use cases.
So what’s next for corporate end-users?
Most corporate end-users will find that having a reliable and “good enough” model matters more than having the absolute best model. Advances in reasoning such as R1 could be a big step for AI agents that deal with customers and perform tasks in the workplace. If those become available more cheaply, corporate adoption will increase and bottom lines will improve.
Taking a step back, a direct consequence of the proliferation of industry and function-specific AI solutions built in the application layer is that AI tools and capabilities will become a necessary component for any company seeking to build competitive advantage.
At Rewire, we help clients prepare themselves for this new reality by building solutions and capabilities that leverage this new “AI commodity” and turning AI potential into real-world value. To explore how we can help you harness the next wave of AI, contact us.
From customer support to complex problem-solving: exploring the core components and real-world potential of Agentic AI systems
For decades, AI has captured our imaginations with visions of autonomous systems like R2-D2 or Skynet, capable of independently navigating and solving complex challenges. While such visions remain firmly rooted in science fiction, the emergence of Agentic AI signals an exciting step in that direction. Powered by GenAI models, these systems promise adaptability and decision-making beyond passive interactions or narrow tasks.
Consider the example of e-commerce, where customer support is critical. Traditional chatbots handle basic inquiries, but when a question becomes more complex—such as tracking a delayed shipment or providing tailored product recommendations—the need for human intervention quickly becomes apparent. This is where GenAI agents can step in, bridging the gap between basic automation and human-level problem solving.
The promise of GenAI-driven agents isn’t about overnight industry transformation but lies in their ability to augment workflows, enhance creative processes, and tackle complex challenges with a degree of autonomy. Yet, alongside this potential come significant technological, ethical, and practical challenges that demand thoughtful exploration and development.
In this blog, we’ll delve into what sets these systems apart, examine their core components, and explore their transformative potential through real-world examples. Let’s start by listing a few examples of possible use cases for GenAI agents:
- Personal assistants that schedule meetings based on availability and send out invitations.
- Content creators that generate blog posts, social media copy, and product descriptions with speed and precision.
- Code assistants that assist developers in writing, debugging, and optimizing code.
- Healthcare assistants that analyze medical records and provide diagnostic insights.
- AI tutors that personalize learning experiences, offering quizzes and tailored feedback to students.
GenAI agents handle tasks that previously required significant human effort, freeing up valuable time and resources for strategic thinking and innovation. But what exactly are these GenAI agents, how do they work and how to design them?
Understanding the core components of GenAI agents
Unlike traditional chatbots, GenAI agents go beyond simple text generation. What sets them apart can be captured in three core principles:
- They act autonomously. These agents are capable of taking goal-driven actions, such as querying a database or generating a report, without explicit human intervention.
- They plan and reason. Leveraging advanced reasoning capabilities, they can break down complex tasks into actionable steps.
- They integrate with tools. While LLMs are great for answering questions, GenAI agents use tools and external systems to access real-time data, perform calculations, or retrieve historical information.
What makes these capabilities possible? At the heart of every GenAI agent lies the power of GenAI models, specifically large language models (LLMs). LLMs are the engine behind the agent's ability to understand natural language, adapt to diverse tasks, and simulate reasoning. Without them, these agents couldn’t achieve the nuanced communication or versatility required for autonomy or to handle complex inputs.
But how do all the pieces come together to create such a system? To understand the full picture, we need to look beyond LLMs and examine the other key components that make GenAI agents work. Together, these components (Interface, Memory, Planning, Tools, and Action) enable the agent to process information, make decisions, and execute tasks. Let’s explore each of these building blocks in detail.

1. Interface: the bridge between you and the AI
The interface is the gateway through which users communicate with the GenAI agent. It serves as the medium for input and output, allowing users to ask questions, give commands, or provide data. Whether it’s a text-based chat, a voice command, or a more complex graphical user interface (GUI), the interface ensures the agent can understand human input and convert it into actionable data.
2. Memory: remembering what matters
Memory is what allows a GenAI agent to learn from past experiences and adapt over time. It stores both short-term and long-term information, helping the agent maintain context across interactions and deliver personalized experiences, based on past conversations and preferences.
3. Planning: charting the path to success
The planning component is the brain behind the agent’s decision-making process. This is essentially using the reasoning of an LLM to break down the problem into smaller tasks. When faced with a task or problem, the agent doesn’t just act blindly. Instead, it analyses the situation, sets goals and priorities, and devises a strategy to accomplish them. This ability to plan ensures that the agent doesn’t simply react in predefined ways, but adapts its actions for both simple and complex scenarios.
4. Tools: extending the agent’s capabilities
No GenAI agent is an island: it needs to access additional resources to solve more specialized problems. Tools can be external resources, APIs, databases, or even other specialized (GenAI) agents that the agent can use to extend its functionality. By invoking these tools when needed, the agent can perform tasks that go beyond its core abilities, making it more powerful and versatile.
5. Action: bringing plans to life
Once the agent has formulated a plan, it’s time to take action. The action component is where the agent moves from theory to practice, executing tasks, sending responses, or interacting with tools and external systems. It’s the moment where the GenAI agent delivers value by fulfilling its purpose, completing a task, or responding to a user request.
The core components of a GenAI agent in action
Now that we’ve broken down the core components of a GenAI agent, let’s see how they come together in a real-world scenario. Let’s get back to our customer support example and imagine a customer who is inquiring about the status of their order. Here’s how the agent seamlessly provides a thoughtful, efficient response:

- Interface: The agent receives the customer query, "What is the status of my order?"
- Memory: The agent remembers the customer placed an order on January 15th 2025 with order ID #56789 from a previous interaction.
- Planning: The agent breaks down the task into steps: retrieve order details, check shipment status, and confirm delivery date.
- Tools: The agent accesses the DHL API with the customer’s order reference to get the most recent status update and additionally queries the CRM to confirm order details.
- Action: The agent retrieves the shipping status, namely “Out for delivery” as of January 22nd, 2025.
- Interface: The agent sends the response back to the customer: "Your order is on its way and is expected to arrive on January 22nd, 2025. Here’s the tracking link for more details: [link]. Would you like to set a delivery reminder?"
If the customer opts to set a reminder, the agent can handle this request as well by using its available tools to schedule the reminder accordingly, something that basic automation would not be able to do.
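To make this flow concrete, here is a minimal sketch of a single agent loop in Python. It is an illustration only, not Rewire’s implementation: the LLM call, the CRM lookup, and the carrier query are stubbed with hypothetical functions, and a real agent would let the model choose which tools to call rather than hard-coding the plan.

```python
# Minimal sketch of a single GenAI agent handling the order-status example above.
# call_llm, crm_lookup and shipment_status are hypothetical stand-ins for a model
# API, a CRM system, and a carrier (e.g. DHL) tracking API.

def call_llm(prompt: str) -> str:
    # Stand-in for a real LLM call (e.g. a chat-completion API).
    return f"[LLM response based on prompt of {len(prompt)} characters]"

def crm_lookup(order_id: str) -> dict:
    # Stand-in for a CRM query confirming order details.
    return {"order_id": order_id, "order_date": "2025-01-15", "items": ["bread maker"]}

def shipment_status(order_id: str) -> dict:
    # Stand-in for a carrier tracking API.
    return {"order_id": order_id, "status": "Out for delivery", "eta": "2025-01-22"}

TOOLS = {"crm_lookup": crm_lookup, "shipment_status": shipment_status}

def handle_query(user_query: str, memory: dict) -> str:
    # Interface + Memory: combine the new query with stored context.
    order_id = memory["order_id"]

    # Planning: in a real agent the LLM would decide which tools to call;
    # here the plan is hard-coded to keep the sketch short.
    plan = ["crm_lookup", "shipment_status"]

    # Tools + Action: execute the planned steps and collect results.
    results = {name: TOOLS[name](order_id) for name in plan}

    # Interface: ask the LLM to turn raw tool output into a friendly answer.
    return call_llm(
        f"Question: {user_query}\nTool results: {results}\n"
        "Write a short status update and offer to set a delivery reminder."
    )

print(handle_query("What is the status of my order?", {"order_id": "#56789"}))
```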
Unlocking efficiency with multi-agent systems for complex queries
A single GenAI agent can efficiently handle straightforward queries, but real-world customer interactions often involve complex, multifaceted issues. For instance, a customer may inquire about order status while also addressing delayed shipments, refunds, and promotions—all in one conversation. Managing such tasks sequentially with one agent increases the risk of delays and errors.
This is where multi-agent systems come in. By breaking down tasks into smaller, specialized subtasks, each handled by a dedicated agent, it’s easier to ensure and track efficiency and accuracy. For example, one agent handles order tracking, another manages refunds, and a third addresses promotions.
Splitting tasks this way unlocks several benefits: simpler models can tackle reduced complexity, agents can be given specific instructions to specialize in certain areas, and parallel processing allows for approaches like majority voting to build confidence in outputs.
With specialized GenAI agents working together, businesses can scale support for complex queries while maintaining speed and precision. This multi-agent approach outperforms traditional customer service, offering a more efficient and accurate solution than escalating issues to human agents.
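As a rough sketch of this idea, the snippet below routes parts of a customer message to specialized agents. The specialist logic is stubbed and the routing is keyword-based for brevity; in practice an LLM-based coordinator would split the message into sub-queries and dispatch each one.

```python
# Minimal multi-agent sketch: a coordinator dispatches sub-queries to
# specialized agents. All agent logic here is stubbed and illustrative.

def order_tracking_agent(query: str) -> str:
    return "Your order #56789 is out for delivery, expected January 22nd."

def refund_agent(query: str) -> str:
    return "Your refund request has been registered and will be processed within 5 days."

def promotions_agent(query: str) -> str:
    return "Code WINTER10 gives 10% off your next order."

SPECIALISTS = {
    "order": order_tracking_agent,
    "refund": refund_agent,
    "promotion": promotions_agent,
}

def coordinator(customer_message: str) -> str:
    # A production coordinator would use an LLM to split the message into
    # sub-queries; keyword matching keeps this sketch short.
    message = customer_message.lower()
    answers = [agent(customer_message) for topic, agent in SPECIALISTS.items() if topic in message]
    return " ".join(answers) or "Could you tell me a bit more about your request?"

print(coordinator("Where is my order, and do you have any promotions running?"))
```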

What is needed to design GenAI agents?
Whether you’re automating customer support, enhancing business processes, or revolutionizing healthcare, building effective GenAI agents is a strategic endeavor that requires expertise across multiple disciplines.
At the core of building a GenAI agent is the integration of key components that enable its functionality and adaptability:
- Data engineering. The foundation of any GenAI agent is its ability to access and process data from multiple sources. This includes integrating APIs, databases, and real-time data streams, ensuring the agent can interact with the world outside its own environment. Effective data engineering enables an agent to access up-to-date information, making it dynamic and capable of handling a variety of tasks.
- Prompt engineering. For a large language model to perform effectively in a specific domain, it must be tailored through prompt engineering. This involves crafting the right inputs to guide the agent’s responses and behavior. Whether it’s automating customer inquiries or analyzing medical data, domain-specific prompts ensure that the agent’s actions are relevant, accurate, and efficient.
- Experimentation. Designing workflows for GenAI agents requires balancing autonomy with accuracy. During experimentation, developers must refine the agent’s decision-making process, ensuring it can handle complex tasks autonomously while maintaining the precision needed to deliver valuable results. This iterative process of testing and optimizing is key to developing agents that operate efficiently in real-world scenarios.
- Ethics and governance. With the immense potential of GenAI agents comes the responsibility to ensure they are used ethically. This means designing systems that protect sensitive data, comply with regulations, and operate transparently. Ensuring ethical behavior isn’t just about compliance—it’s about building trust with users and maintaining accountability in AI-driven processes.
Challenges and pitfalls with GenAI agents
While GenAI agents hold tremendous potential, there are challenges that need careful consideration:
- Complexity in multi-agent systems. Coordinating multiple specialized agents can introduce complexity, especially in ensuring seamless communication and task execution.
- Cost vs. benefit. With resource-intensive models, balancing the sophistication of GenAI agents with the cost of deployment is crucial.
- Data privacy and security. As GenAI agents require access to vast data and tools, ensuring the protection of sensitive information becomes a top priority.
- Ethical and regulatory considerations. Bias in training data, dilemmas around autonomous decision-making, and challenges in adhering to legal and industry-specific standards create significant risks. Deploying these agents responsibly requires addressing ethical concerns while navigating complex regulatory frameworks to ensure compliance.
- Performance management. Implementing agent systems effectively requires overcoming the complexity of monitoring and optimizing their performance. From breaking down reasoning steps to preparing systems and data for smooth access, managing the performance of these agents as they scale remains a significant hurdle.
Unlocking the future of GenAI agents with insights and discoveries ahead - stay tuned!
GenAI agents represent an exciting shift in how we approach problem-solving and human-computer interaction. With their promise to reason, plan, and execute tasks autonomously, these agents have the potential to tackle even the most complex workflows. As we continue our exploration of these systems, our focus will remain on understanding both their immense promise and the challenges they present.
In the coming weeks, we will delve deeper into the areas where GenAI agents excel, where they struggle, and what can be done to enhance their real-world effectiveness. We invite you to follow along with us as we share our experiments, findings, and practical insights from this ongoing journey. Stay tuned for our next blog post, where we will explore the evolving landscape of GenAI agents, along with the valuable lessons we have learned through experimentation.
This article was written by Mirte Pruppers, Data Scientist, Phebe Langens, Data Scientist, and Simon Koolstra, Principal at Rewire.
Why combining Retrieval-Augmented Generation with Knowledge Graphs holds transformative potential for AI.
Google it! How many times have you said that before? Whether you're settling a friendly debate or hunting for that perfect recipe, Google is our go-to digital oracle. But have you ever wondered what happens behind that simple search bar? At the heart of Google's search magic lies something called a Knowledge Graph—a web of interconnected information that's revolutionizing how AI understands our queries. And guess what? This same technology, along with RAG (Retrieval-Augmented Generation), is transforming how businesses handle their own data. In this article we’ll explore the journey from data to insight, where RAG and Knowledge Graphs are the stars of the show!
The power of RAG: enhancing AI's knowledge base
Retrieval-Augmented Generation, or RAG, is a fascinating approach that's changing the game in AI-powered information retrieval and generation. But what exactly is RAG, and why is it causing such excitement in the AI community?
At its core, RAG addresses a fundamental limitation in GenAI models like ChatGPT: their knowledge is strictly bounded by their training data. Consider this: if you ask ChatGPT's API (which doesn’t have the web search capability) about the 2024 Euro Cup winner, it can’t give you the answer as this event occurred after its last knowledge or training update. The same applies to company-specific queries like "What were our Q3 financial results?" The model inherently cannot access such private, domain-specific information because it's not part of its training data. This inability to access real-time or specialized information significantly limits the range of responses these models can generate.
RAG offers a solution by combining user prompts with relevant information retrieved from external sources, such as proprietary databases or real-time news feeds. This augmented input enables GenAI models to generate responses that are both contextually richer and up-to-date, allowing them to answer questions even when their original training data doesn't contain the necessary information. By bridging the gap between AI training and the dynamic world of information, RAG expands the scope of what GenAI can achieve, especially for use cases that demand real-time relevance or access to confidential data.
How does RAG work its magic?
Now that we've established how RAG addresses the limitations of traditional GenAI models, let’s break down the four essential steps to getting RAG to work: data preparation, retrieval, augmentation, and generation.
Like most AI systems, effective RAG depends on data: the external data must be prepared in an accessible way that is optimized for retrieval. When a query is made, relevant information is retrieved and appended to that query. The model finally generates a well-informed response, leveraging both its natural language capabilities and the retrieved information. Here’s how these steps unfold (a short code sketch after the four steps shows them end to end).
1. Data Preparation: This crucial first step involves parsing raw input documents into an appropriate format (usually text) and splitting them into manageable chunks. Each text chunk is converted into a high-dimensional numerical vector that encapsulates its meaning, called an embedding. These embeddings are then stored in a vector database, often with an index for efficient retrieval.

2. Retrieval: When a user asks a question, RAG embeds the query (with the same embedding model that was used to embed the original text chunks), searches the database for similar vectors (similarity search) and then retrieves the most relevant chunks.

3. Augmentation: The most relevant results from the similarity search are used to augment the original prompt.
4. Generation: The augmented prompt is sent to the Large Language Model (LLM), which generates the final response.
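The sketch below walks through these four steps on a toy corpus. It is a simplified illustration, not a production setup: TF-IDF vectors stand in for learned embeddings, an in-memory matrix stands in for a vector database, and the generation step is stubbed.

```python
# Compact sketch of the four RAG steps on a toy corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# 1. Data preparation: chunk documents and embed each chunk.
chunks = [
    "Q3 financial results: revenue grew 12 percent year over year.",
    "Project Alpha kickoff is planned for March with a team of five.",
    "The 2024 Euro Cup was won by Spain.",
]
vectorizer = TfidfVectorizer()
chunk_vectors = vectorizer.fit_transform(chunks)  # stand-in for a vector database

def generate(prompt: str) -> str:
    # Stand-in for the LLM generation step.
    return f"[LLM answer grounded in prompt of {len(prompt)} characters]"

def rag_answer(question: str, top_k: int = 2) -> str:
    # 2. Retrieval: embed the query with the same model and find similar chunks.
    query_vector = vectorizer.transform([question])
    scores = cosine_similarity(query_vector, chunk_vectors)[0]
    best = scores.argsort()[::-1][:top_k]

    # 3. Augmentation: prepend the retrieved chunks to the user prompt.
    context = "\n".join(chunks[i] for i in best)
    prompt = f"Context:\n{context}\n\nQuestion: {question}"

    # 4. Generation: send the augmented prompt to the LLM.
    return generate(prompt)

print(rag_answer("Who won the 2024 Euro Cup?"))
```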

The beauty of RAG is that it doesn't fundamentally change the model's behavior or linguistic style. Instead, it enhances the model's knowledge base, allowing it to draw upon a vast repository of information to provide more informed and accurate responses.
While RAG has numerous applications, its impact on business knowledge management stands out. In organizations with 50+ employees, vital knowledge is often scattered across systems like network drives, SharePoint, S3 buckets, and various third-party platforms, making it difficult and time-consuming to retrieve essential information.
Now, imagine if finding company data was as easy as a Google search or asking ChatGPT. RAG makes this possible, offering instant, accurate access to everything from project details to financial records. By connecting disparate data sources, RAG centralizes knowledge, streamlines workflows and turns scattered information into a strategic asset that enhances productivity and decision-making.
Knowledge Graphs: mapping the connections
While RAG excels at retrieving relevant information, it doesn’t always capture the broader context of interconnected data. In complex business environments, where relationships between projects, departments, and clients are crucial, isolated facts alone may not tell the full story. This is where RAG falls short: it retrieves data, but it doesn’t always "understand" how these pieces fit together. Here, Knowledge Graphs (KGs) come in, mapping connections between entities to create a structured, context-rich network of relationships that deepens AI’s understanding.
How do Knowledge Graphs work?
A Knowledge Graph (KG) is a network of interconnected concepts, where nodes represent entities (e.g., people, projects, or products), and edges show the relationships between them. KGs are especially powerful because they can easily incorporate new information and relationships, making them adaptable to changing knowledge.

For businesses, a unified view of their data is invaluable. Knowledge Graphs organize complex relationships between employees, teams, projects, and departments, transforming scattered data into a structured network of meaningful connections. This structured approach empowers GenAI by enabling more intelligent, context-aware outputs based on deep, interconnected insights.
Imagine a GenAI-powered workflow: when a user poses a question, the system translates it into a query, perhaps leveraging a graph query language like Gremlin. The Knowledge Graph responds by surfacing relevant relationships—such as past interactions with a client or the expertise of specific team members. GenAI then synthesizes this enriched data into a clear, actionable response. Whether HR is identifying the ideal candidate for a project or sales is retrieving a client's complete history, integrating KGs into AI workflows allows businesses to extract these interconnected insights.
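To make the idea tangible, here is a toy knowledge graph in Python with a hand-written traversal. The entities, relations, and the networkx representation are purely illustrative; a production system would typically use a graph database and a query language such as Gremlin or Cypher.

```python
# Tiny illustration of a knowledge graph: nodes are entities, edges carry
# typed relationships. networkx is used only for illustration.
import networkx as nx

kg = nx.MultiDiGraph()
kg.add_edge("Alice", "Project A", relation="worked_on")
kg.add_edge("Bob", "Project A", relation="worked_on")
kg.add_edge("Project A", "Client X", relation="delivered_for")
kg.add_edge("Alice", "Graph databases", relation="expert_in")

def who_worked_for(client: str) -> list[str]:
    # Walk the graph: client <- delivered_for - project <- worked_on - person
    people = []
    for project, _, data in kg.in_edges(client, data=True):
        if data["relation"] == "delivered_for":
            for person, _, d in kg.in_edges(project, data=True):
                if d["relation"] == "worked_on":
                    people.append(person)
    return people

print(who_worked_for("Client X"))  # ['Alice', 'Bob']
```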
Bringing knowledge graphs and RAG together: introducing GraphRAG
Both RAG and Knowledge Graphs tackle a fundamental challenge in AI: enabling machines to understand and use information more like humans do. Each technology excels in different areas—RAG is particularly strong at retrieving unstructured, text-based information, while Knowledge Graphs are powerful for organizing and connecting information in meaningful ways. Notably, Knowledge Graphs can now also incorporate unstructured data processed by LLMs, allowing them to reliably retrieve and utilize information that was originally unstructured. This synergy between RAG and Knowledge Graphs creates a complementary system capable of managing diverse information types, making their integration especially valuable for internal knowledge management in businesses, where a wide range of data must be effectively utilized.
Here's how this powerful combination works:
1. Building the Knowledge Graph with RAG: We start by setting up a Knowledge Graph based on the relationships in the company's data, using RAG right from the start. This process involves chunking all internal documents and embedding these chunks. By applying similarity searches on these embeddings, RAG uncovers connections within the data, helping to shape the structure of our Knowledge Graph as it is being built.
2. Connecting Documents to the Graph: Once we have our Knowledge Graph, we connect the embeddings of the chunked documents to the corresponding end nodes. For instance, all embedded documents regarding Project A are connected to the Project A node in the graph. The result is a rich Knowledge Graph where some nodes are linked to embedded chunks of internal documents.

3. Leveraging RAG for Complex Queries: This is where RAG comes into play again. For questions that can be answered purely from the Knowledge Graph structure, we can quickly provide answers. But for queries requiring detailed information from documents, we use RAG (see the sketch after these steps):
- We navigate to the relevant node in the Knowledge Graph (e.g., Project A).
- We retrieve all connected embeddings (e.g., all embedded chunks connected to Project A).
- We perform a similarity search between these embeddings and the user's question.
- We augment the original user prompt with the most relevant chunks (using database keys to obtain the chunks corresponding to the relevant embeddings).
- Finally, we pass this augmented prompt to an LLM to generate a comprehensive answer.
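A condensed sketch of this query path, under the same simplifying assumptions as before (TF-IDF instead of learned embeddings, an in-memory dictionary instead of a graph database, stubbed generation):

```python
# Sketch of the GraphRAG query path: find the relevant node, gather only the
# chunks linked to that node, run a similarity search on that subset, then
# pass the best chunks plus the question to an LLM.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Chunks of internal documents, each linked to a node in the knowledge graph.
NODE_CHUNKS = {
    "Project A": [
        "Project A delivered a demand-forecasting model for Client X in 2024.",
        "The Project A team consisted of Alice, Bob and two client engineers.",
    ],
    "Project B": [
        "Project B focused on a pricing engine for the retail division.",
    ],
}

def generate(prompt: str) -> str:
    # Stand-in for the LLM generation step.
    return f"[LLM answer based on prompt of {len(prompt)} characters]"

def graph_rag_answer(question: str, node: str, top_k: int = 1) -> str:
    # Steps 1-2: navigate to the node and collect only its connected chunks.
    chunks = NODE_CHUNKS[node]

    # Step 3: similarity search restricted to this small subset.
    vectorizer = TfidfVectorizer().fit(chunks + [question])
    scores = cosine_similarity(
        vectorizer.transform([question]), vectorizer.transform(chunks)
    )[0]
    best = scores.argsort()[::-1][:top_k]

    # Steps 4-5: augment the prompt with the best chunks and generate.
    context = "\n".join(chunks[i] for i in best)
    return generate(f"Context:\n{context}\n\nQuestion: {question}")

print(graph_rag_answer("Who was on the Project A team?", node="Project A"))
```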

This hybrid approach combines the best of both worlds. The Knowledge Graph offers a structured overview of the company's information landscape, enabling quick responses to straightforward queries while also efficiently guiding RAG to the most relevant subset of document chunks. This significantly reduces the number of chunks involved in similarity searches, optimizing retrieval. Meanwhile, RAG excels at performing deeper, more detailed searches when necessary, providing comprehensive answers without the computational burden of scanning all documents for each query. Together, they create a more efficient, scalable, and intelligent system for handling business knowledge.
Unlocking actionable insights: the future with RAG and knowledge graphs
Practically speaking, RAG and Knowledge Graphs revolutionize how we interact with data by making information both accessible and deeply connected. It’s like searching your company’s entire database as effortlessly as using Google, but with precise, up-to-date answers. The impact? Streamlined workflows, faster decision-making, and the discovery of connections you may not have realized were there.
The impact of combining RAG with Knowledge Graphs extends far beyond basic knowledge management, reshaping various industries through its ability to link real-time, context-aware insights with complex data structures. In customer service, this technology could enable support bots to deliver highly personalized assistance by connecting past interactions, product histories, and troubleshooting steps into a seamless, context-aware experience. The financial sector benefits from enhanced fraud detection capabilities, as GraphRAG can map intricate transactional relationships and retrieve specific records for thorough investigation of suspicious patterns. Additionally, healthcare organizations can harness the technology’s potential to revolutionize patient care by creating comprehensive connections between diagnoses, treatments, and real-time medical records, while simultaneously matching patients with relevant clinical trials based on their detailed medical histories. In supply chain management, GraphRAG can empower teams with real-time disruption alerts and relationship mapping among suppliers and inventory, enabling more agile responses to sudden changes. And as a last example, market intelligence teams may gain a significant advantage through dynamic insights that link competitor data, emerging trends, and current information, facilitating proactive strategic planning.
By combining retrieval and relationship mapping, RAG and Knowledge Graphs turn complex data into actionable insights—making it easier than ever to find exactly what you need, when you need it.
The challenge of building GraphRAG: unlocking data’s potential with precision and scalability
While combining RAG with KGs holds transformative potential, implementing this hybrid solution introduces significant technical and organizational challenges. The core complexity lies in seamlessly integrating two sophisticated systems—each with unique infrastructures, requirements, and limitations—into a unified, scalable architecture that can evolve with growing data demands. Other key challenges include:
- Data Consistency and Integration: Consolidating data from multiple sources like SharePoint, S3, and internal databases requires a robust integration framework to maintain cross-platform consistency, data integrity, and efficient real-time updates with minimal latency.
- Graph Structure and Adaptability: Designing a Knowledge Graph that accurately models an organization’s data relationships requires a deep understanding of interconnected entities, processes, and history. This graph must be adaptable to incorporate new data and evolving relationships, necessitating rigorous planning and proactive maintenance.
- System Optimization for Real-Time Performance: For large-scale deployments, the system must support complex queries with minimal latency. Achieving this balance requires an optimized allocation of computational resources to ensure high performance, especially during real-time retrieval and similarity searches.
- Security and Access Control: Integrating RAG with Knowledge Graphs requires careful handling of sensitive data, including strong access controls, clear permission settings, and regulatory compliance to keep proprietary information secure and private. For example, a knowledge management system should restrict access to sensitive files—such as confidential financial reports or HR documents—so that employees see only information they are authorized to access.
By addressing these challenges through a structured, security-focused approach, organizations can unlock the full potential of their data, building a scalable foundation for advanced insights and innovation.
Technology and innovation expert Tom Goodwin on the merits of GenAI and how to leverage its potential.
During Rewire LIVE, we had the pleasure of hosting Tom Goodwin, a friend of Rewire, pragmatic futurist, and transformation expert who advises Fortune 500 companies on emerging technologies such as GenAI. Over the past 20 years, he has studied the impact of new technology, new consumer behaviors and the changing rules of business, which makes him uniquely suited to understand the significance of GenAI today.
At the core of Tom’s thinking lies a question that all leaders should ponder: if, knowing everything you know now, you were to build your company from scratch, what would it look like? At times counter-intuitive, Tom’s insights, steeped in history, provide valuable clues to answer this question. In this article, we share a handful of them.
INSIGHT 1: Technology revolution happens in two stages. In the first stage we add to what was done before. In the second stage we rethink. That’s when the revolution really happens.
Tom’s insight is derived from the Perez Framework, developed by Carlota Perez, a scholar specialized in technology and socio-economic development. The framework – based on the analysis of all the major technological revolutions since the industrial revolution – stipulates that technological revolutions first go through an installation phase, then a deployment stage. In the installation phase, the technology comes to market and the supporting infrastructure is built. In the deployment phase, society fully adopts the technology. (The transition between the two phases is typically marked by a financial crash and a recovery.)
During the first phase, there’s a frenzy – not dissimilar to the hype that currently surrounds GenAI. Everyone jumps on the technology, everyone talks about it. However, nothing profound really changes. For the most part, the technology only adds to the existing ways of doing things. In contrast, during the second stage, people finally make sense of the technology and use it to rethink the way things are done. That’s when the value is unleashed.
Take electricity as an example. In the first stage, electricity brought the electric iron, the light, the fan, the oven. These were all things that existed before. In the second stage, truly revolutionary innovations emerged: the radio, the TV, the telephone, the microwave, the microwave dinner, factories that operate 24/7, and so on. The second stage required a completely different mindset vis-à-vis what could be done and how people would behave.
This begs the question: what will the second stage of GenAI – and more broadly AI – be? What will be the telephone, radio, microwave for AI? Tom’s assertion here is that the degree of transformation is less about how exciting the technology is and much more about how deeply you change. Better AI will be about systems that are completely rethought and deep integrations, rather than UI patches.
Watch the video clip.
INSIGHT 2: Having category expertise, knowing how to make money, having relationships, and having staff who really know what they’re doing is probably more important than technology expertise.
Across many industries, battle lines are drawn between large traditional companies that have been around for a long time and the digitally enabled, tech-first, mobile-centric startup types. Think Airbnb vs Marriott, Tesla vs. BMW, SpaceX vs NASA, and so on.
The question is who’s going to win. Is it the digitally native companies who have created themselves for the modern era? Or is it the traditional companies that have been around for a long time? Put another way, is it easier to be a tech company that learns how to make money in your target industry, or to be a big company that already knows how to make money but must now understand what a technology means and adapt accordingly?
Up until recently, the assumption was that the tech companies would win the battle. This proved true for a while: Netflix vs. Blockbuster, Apple vs. Nokia, etc. The assumption was that this would carry on. Understanding the technology was more important than understanding the category.
Tom’s observation is that in the past four years, these assumptions have been challenged. For example, traditional banks have got really good at understanding technology. Neobanks might be good at getting millennials to share the cost of a pizza, but they’re not that good at making money. So there’s this slow realisation that maybe digital-first tech companies are not going to win – because big companies are getting pretty good at change.
Taking a step back, it seems that the narrative of disrupt or die isn’t always true: a lot of the rules of business have not changed; incumbents just need to get a bit more excited about technology. Ultimately, having category expertise, knowing how to make money, having relationships, and having staff who really know what they’re doing is probably more important than tech expertise.
Watch the video clip.
INSIGHT 3: The AI craze is enabling a more flexible investment climate. This is an incentive for leaders to be bold.
Generative AI has spurred heated debates about the evolution of AI and divided experts and observers into two opposing groups: the AI cheerleaders and the sceptics. The former believe that AI is going to change everything immediately. The latter think that it’s a bubble.
History is littered with innovations that went nowhere. A handful of them, however, proved to be transformational, if only in the long run. Only time will tell which group GenAI will join. In the meantime, there’s a growing realization that significant investment may be required to make meaningful steps with AI, hence a more flexible climate for capex – which is an incentive for leaders to be bold.
Tom’s insight reflects this situation: change is hard and expensive, and so regardless of one’s position in the debate, GenAI provides a unique window of opportunity to secure investment that you wouldn’t normally get. It is an amazing time to have an audience who normally wouldn’t listen to you.
Conclusion
These were but a handful of the many insights that Tom shared with us during Rewire LIVE. Taking a step back, it is clear that we are far from having realized the full value of GenAI – and, more broadly, AI. In the words of Tom, AI is a chance to dream really big and leave your mark on the world. It is yours for the taking.
About Tom Goodwin
Tom Goodwin is a four-time #1 “Voice in Marketing” on LinkedIn with over 725,000 followers on the platform. He currently heads up “All We Have Is Now”, a digital business transformation consultancy, working with clients as varied as Stellantis, Merck, Bayer, and EY to rethink how they use technology.
Tom hosts “The Edge”, a TV series focusing on technology and innovation, and “My Wildest Prediction”, a podcast produced and distributed by Euronews. He has published the book “Digital Darwinism” with Kogan Page, and has spoken in over 100 cities across 45 countries.
With a 23-year career that spans creative, PR, digital and media agencies, Tom is an industry provocateur as a columnist for the Guardian, TechCrunch and Forbes and frequent contributor to GQ, The World Economic Forum, Ad Age, Wired, Ad Week, Inc, MediaPost and Digiday.
To find out more about Tom, visit www.tomgoodwin.co
Rewire CEO Wouter Huygen reviews the arguments for and against GenAI heralding the next industrial revolution, and how business leaders should prepare.
Is generative AI under- or overhyped? Is it all smoke and mirrors, or is it the beginning of a new industrial revolution? How should business leaders respond? Should they rush to adopt it or should they adopt a wait-and-see approach?
Finding clear-cut answers to these questions is a challenge for most. Experts in the field are equally divided between the cheerleaders and the skeptics, which adds to the apparent subjectivity of the debate.
The GenAI cheerleaders can point to the fact that performance benchmarks keep being beaten. Here the underlying assumption is the “AI Scaling Hypothesis”. That is, as long as we throw in more data and computing power, we’ll make progress. Moreover, the infrastructure required for GenAI at scale is already there: an abundance of cloud-based data and software; the ability to interact with the technology using natural language. Thus, innovation cycles have become shorter and faster.
On the other hand, GenAI skeptics make the following points: first, the limitations of GenAI are not bugs, they’re features. They’re inherent to the way the technology works. Second, GenAI lacks real world understanding. Third, LLMs demonstrate diminishing returns. In short, there are hard limits to the capabilities of GenAI.
The lessons of history indicate that while there might be some overhype around GenAI, the impact could be profound – in the long run. Leaders should therefore develop their own understanding of GenAI and use it to define their vision. Shaping the future is a long-term game that starts today.
Watch the video (full transcript below).
The transcript has been edited for clarity and length.
Generative AI: the new magic lantern?
Does anyone recognize this? If you look closely, not much has changed since. Because this is a basic slide projector. It’s the Magic Lantern, invented around 1600. But it was not only used as a slide projector. It was also used by charlatans, magicians, people entertaining audiences to create illusions. This is the origin of the saying “smoke and mirrors”. Because they used smoke and mirrors with the Magic Lantern to create live projections in the air, in the smoke. So the Magic Lantern became much more than a slide projector – actually a way of creating illusions that were by definition not real.
You could say that Artificial Intelligence is today’s Magic Lantern. We’ve all seen images of Sora, OpenAI’s video production tool. And if you look at OpenAI’s website, they claim that they’re not just working on video production: they actually intend to model the physical world. That’s a very big deal if that is true. Obviously it’s not true. At least I think I’m one of the more sceptical ones. But those are the claims being made. If we can actually use these models to model the physical world, that’s a big step towards artificial general intelligence.
Is GenAI overhyped? Reviewing the arguments for and against
If AI is today’s Magic Lantern, it begs the question, where are the smoke and where are the mirrors? And people who lead organizations should ponder a few questions: How good are AI capabilities today? Is AI overhyped? What is the trajectory? Will it continue to go at this pace? Will it slow down? Re-accelerate? How should I respond? Do we need to jump on it? Do we need to wait and see? Let everybody else do the first experience, experience the pains, and then we will adopt whatever works? What are the threats and what are the risks? These are common questions, but given the pace of things, they are crucial.
To answer these questions, one could look to the people who develop all this new technology. But the question is whether we can trust them. Sam Altman is looking for $7 trillion. I think the GDP of Germany is what? $4 trillion or $5 trillion. Last week Eric Schmidt, ex-Google CEO, stated on TV that AI is underhyped. He said the arrival of a non-human intelligence is a very, very big deal. Then the interviewer asked: is it here? And his answer was: it’s here, it’s coming, it’s almost here. Okay, so what is it? Is it here or is it coming? Anyway, he thinks it’s underhyped.
We need to look at the data, but even that isn’t trivial. Because if you look at generative AI, Large Language Models and how to measure their performance, it’s not easy. Because how do you determine if a response is actually accurate or not? You can’t measure it easily. In any case, we see the field progressing, and we’ve all seen the news around models beating bar exams and so on.
The key thing here is that all this progress is based on the AI scaling hypothesis, which states that as long as we throw more data and compute at it, we’ll advance. We’ll get ahead. This is the secret hypothesis that people are basing their claims on. And there are incentives for the industry to make the world believe that we’re close to artificial general intelligence. So we can’t fully trust them in my opinion, and we have to keep looking at the data. But the data tells us we’re still advancing. So what does that mean? Because current systems are anything but perfect. You must have seen ample examples. One is from Air Canada. They deployed a chatbot for their customer service, and the chatbot gave away free flights. It was a bug in the system.
That brings us to the skeptical view. What are the arguments? One is about large language modelling or generative AI in general: the flaws that we’re seeing are not bugs to be fixed. The way this technology works, by definition, has these flaws. These flaws are features, they’re not bugs. And part of that is that the models do not represent how the world works. They don’t have an understanding of the world. They just produce text in the case of a Large Language Model.
On top of that, they claim that there are diminishing returns. If you analyze the performance, for instance, of the OpenAI stuff that’s coming out, they claim that if you look at the benchmarks, it’s not really progressing that much anymore. And OpenAI hasn’t launched GPT-5, so they’re probably struggling. And all the claims are based on these scaling laws, and those scaling laws can’t go on forever. We’ve used all the data in the world, all the internet by now. So we’re probably hitting a plateau. This is the skeptical view. So on the one hand we hear all the progress and all the promises, but there are also people saying, “Look, that’s actually not the case if you really look under the hood of these systems.”
As for questions asked by organization leaders: “What do I need to do?” “How fast is this going?” Here, the predictions vary. In the Dutch Financial Times, here’s an economist saying it’s overhyped, it’s the same as always, all past technology revolutions took time and it will be the same this time. On the other hand, a recent report that came out saying this time is different: generative AI is a different type of technology and this is going to go much faster. The implication being that if you don’t stay ahead, if you don’t participate as an organization, you will be left behind soon.
The argument for generative AI is that the infrastructure is already there. It’s not like electricity, where we had to build power lines. For generative AI, the infrastructure is there. The cloud is rolled out. Software has become modular. And the technology itself is very intuitive. It’s very easy for people to interact with it because it’s based on natural language. All of those arguments are the basis for saying that this is going to go much faster. And I think some of us recognize that.
Looking ahead: how leaders should prepare
There’s a difference between adopting new tools and really changing your organization. When we think about the implications, at Rewire we try to make sense of these polarized views and form our own view of what is really happening and what it means for our clients, for our partners, and the people we work with. We have three key takeaways.
The first one is that we firmly believe that everybody needs to develop their own intuition and understanding of AI. Especially because we’re living in the smoke and mirror phase. It means that it’s important for people who have the role of shaping their organization to understand the technology and develop their own compass of what it can do, to navigate change.
The second is that you need to rethink the fundamentals. You need to think about redesigning things, re-engineering things, re-imagining your organization, asking what if, rather than adopting a tool or a point solution. You must think about how your organization is going to evolve, what it will look like in five years’ time, and how to get there.
The third is that, yes, I agree with Andrew McAfee, the economist who says generative AI is different because it goes faster. To a certain extent that’s true. But not to the point where full business models and full organizations and end-to-end processes change. Because that’s still hard work, it’s transformational work that doesn’t happen overnight. So the answers are nuanced. It’s not one extreme or the other. It is a long-term game to reap the benefits of this new technology.
What Generative AI can and cannot do. And what it means for the future of business.
Meet Edmond de Belamy, the portrait painting displayed above. On Thursday, October 25th 2018, Edmond was auctioned off by Christie’s for a whopping $432,500. The signature on the bottom right of Edmond shows its artist to be
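The signature, as widely reproduced in coverage of the auction, reads:

$$\min_G \max_D \; \mathbb{E}_x\big[\log D(x)\big] + \mathbb{E}_z\big[\log\big(1 - D(G(z))\big)\big]$$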

This impressive-looking formula represents a GAN network: a generative AI-model avant-la-lettre. Edmond de Belamy became the first AI-generated artwork to be auctioned off by a major auction house.
The Belamy portrait was the first time we were truly impressed by the capabilities of a GenAI model. Nowadays, its quality is nothing special. In fact, we have rapidly gotten used to image generation models producing photorealistic images, text generation models that generate e-mail texts and meeting summaries better than we could ourselves, and even LLMs that support us developers and data scientists in writing code.

The breakthrough of GenAI, heralded by the release of ChatGPT in November 2022, has truly been amazing. It may almost lead you to believe that the age of humans is over, and the age of machines has now truly begun.
That is, until you start asking these GenAI models the hard questions. Such as:
Can you give me an odd number that does not have the letter “E” in it?

Anyone who can read could tell you GPT-4 botched that question. The correct answer would have been: no, I cannot, because such a number does not exist. Despite its ‘reasoning’, ChatGPT manages to confidently announce the number “two” has an “e” in it (note: it doesn’t), and continues to produce “two hundred one” as an example of an odd number that does not have an “e” in it (note: it does).
Another favorite pastime of ours is to ask LLM models to play chess with us. Our typical experience: they will claim they know the rules, and proceed to happily play illegal moves halfway through the game, or conjure up ghost pieces that manage to capture your knight out of nowhere.
The internet is full of entertaining examples of ChatGPT getting tasks wrong that humans solve without a second thought. These issues are not ChatGPT-specific either. Other LLM-providers, such as Anthropic, Google, Meta, and so on, face similar challenges.
So, on the one hand, GenAI models have proven to be an impressive breakthrough. On the other hand, it is clear that they are far from infallible. This begs the question:
What is the breakthrough of GenAI really?
And more importantly:
What does this mean for someone wanting to make the most of this technology?
Dissecting the GenAI breakthrough
In the 21st century we have seen the breakthrough of machine learning ending the ‘AI winter’ of the 1990s. Machine learning is essentially computers generalizing patterns in data. Three things are needed to do that well: algorithms for identifying patterns, large amounts of data to identify the patterns in, and computation power to run the algorithms.
By the start of the 2000s, we did know a fair bit about algorithms. The first Artificial Neural Networks, which form the foundation of ‘deep learning’, were published in 1943 (McCulloch & Pitts, 1943). Random forest algorithms, the precursors to modern tree ensemble algorithms, were first proposed in 1995 and later formalized by Breiman (Breiman, 2001).
The rise of the digital age added the other two ingredients. We saw exponential growth in the amount of stored data worldwide. Simultaneously computing power kept getting exponentially cheaper and more easily accessible. All of a sudden, ‘machine learning’ and therefore AI, started to fly. Apparently, tossing large amounts of data, and huge amounts of computing power at a suitable algorithm can achieve a lot.
With machine learning we managed to have a go at quite a few problems that were previously considered difficult: image recognition, speech-to-text, playing chess. Some problems remained remarkably difficult, however. One of which was automated understanding and generation of human language.
This is where LLMs come in. In 2018, the same year our good fellow Edmond was created, OpenAI published a remarkable paper. They introduced a class of algorithms for language understanding called Generative Pre-trained Transformers, for short: “GPT” (Radford et al., 2018). This combined two key ideas: one that allowed them to “pre-train” language models on low-quality text as well as beautifully crafted pieces, and another that allowed these algorithms to benefit efficiently from large amounts of computing power. Essentially, they asked these models to learn how to predict the next word in a sentence, ideally enabling them to generate their own sentences if they became good enough at that.
GPT-1 was born with this paper in 2018, but OpenAI did not stop there. With their new algorithms they were able to scale GPT-1 both in model size, but just as importantly in terms of how much data it could be fed. GPT-3, the model that powered ChatGPT at its launch in 2022, was reportedly trained on almost 500 billion text tokens (which is impressive, but not even close to the ‘entire internet’, as some enthusiasts claim), and with millions of dollars of computing power.
Despite its insane jump in performance, the fundamental task of GPT-3 remains the same as it was for GPT-1. It is a very large, very complex machine learning language model trained to predict the next word in a sentence (or predict the next ‘token’, to be more precise). What is truly remarkable, however, is the competencies that emerge from this skill, which impressed even the most sceptical.
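To make this ‘next-token prediction’ concrete, here is a minimal sketch using the Hugging Face transformers library and GPT-2, an early, openly available GPT model that stands in for its much larger successors. The prompt is purely illustrative.

```python
# Minimal sketch: what a GPT-style model fundamentally does is score every
# possible next token given the text so far. GPT-2 stands in for larger models.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The capital of the Netherlands is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

# Turn the scores at the last position into a probability distribution over
# the vocabulary, i.e. the model's belief about what the next token should be.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
print([tokenizer.decode(token_id) for token_id in top.indices])
```

Generating a whole answer is simply this step repeated: pick a likely token, append it to the prompt, and predict again.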
Emergent competencies of Large Language Models
Predicting the next word is cool, but why are these LLMs heralded by some as the future of everything (and by others as the beginning of the end of humanity)?
Despite being trained as ‘next-word predictors’, LLMs have begun to show competencies that truly make them game changers. To be able to predict the next word based on a prompt, they have to do three things:
- Understand: making sense of unstructured data from humans and systems;
- Synthesize: processing information to ‘connect the dots’ and formulate a single, sensible response;
- Respond: generating responses in human or machine-readable language that provide information or initiate next steps.

Let’s unpack all this by looking at an example.

In this example, we asked ChatGPT powered by GPT-4 to solve a problem that may seem easy to anyone with basic arithmetic knowledge. Let’s pause to appreciate why ChatGPT’s ability to solve the problem is remarkable regardless.
Firstly, ChatGPT understands what we are asking: it grasps that the question requires logical reasoning and gives an answer that includes argumentation and a calculation. Secondly, it synthesizes a correct response to our request, generating essentially ‘new’ information that was not present in the original prompt: the fact that we will have 3 loaves of bread left at the end of the day. Finally, it responds with an answer that makes sense. In this case it is a human-language response, wrapped in some nice formatting, but ChatGPT can also generate ‘computer language’ in the form of programming code.
To put this into perspective: previous systems able to complete such prompts would generally have been composed of many different components working together, tailored to this particular kind of problem. We would have a text-understanding algorithm that extracts the calculation to be done from the input prompt, a simple calculator to perform the calculation, and a ‘connecting’ algorithm that feeds the output of the first component into the calculator. Finally, we would have a text-generating system that inserts the result into a pre-defined text template. With ChatGPT, one model performs this whole sequence, all by itself. Lovely.
To put it in even more perspective: these types of problems were still difficult for older versions of ChatGPT. It shows that these emergent competencies are still improving in newer versions of state-of-the-art models.
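To illustrate the contrast, here is a rough sketch of such a pre-LLM pipeline. The helper names and numbers are made up for illustration (the original screenshot is not reproduced here); the point is that every step is a separate, special-purpose component.

```python
import re

def extract_calculation(prompt: str) -> tuple[int, int]:
    """Text-understanding step: pull the relevant numbers out of a known prompt format."""
    baked, sold = map(int, re.findall(r"\d+", prompt))
    return baked, sold

def calculate_remaining(baked: int, sold: int) -> int:
    """'Calculator' step: a single hard-coded operation."""
    return baked - sold

def render_answer(remaining: int) -> str:
    """Text-generation step: a pre-defined template."""
    return f"You will have {remaining} loaves of bread left at the end of the day."

# The 'connecting' logic that chains the components together.
prompt = "I baked 10 loaves this morning and sold 7 during the day. How many are left?"
print(render_answer(calculate_remaining(*extract_calculation(prompt))))

# With an LLM, the same request is a single call, e.g. (sketch using the openai client):
#   from openai import OpenAI
#   reply = OpenAI().chat.completions.create(
#       model="gpt-4", messages=[{"role": "user", "content": prompt}])
```

Change the wording of the question even slightly and the hand-built pipeline breaks, whereas the single model simply copes.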

So far, we haven’t even considered models that can understand, synthesize, and generate across multiple types of inputs and outputs (text, images, sound, video, etc.). These multi-modal models have been all the rage since early 2024. The world of GenAI is truly moving quickly.
So what does it mean for businesses?
The talents of GenAI are truly impressive, making it a very versatile tool that opens up endless possibilities. It can craft beautiful e-mails, help generate original logos, or speed up software development.
Even more promising is GenAI’s ability to make sense of unstructured data at scale. Estimates put unstructured data at about 80-90% of all the data that companies possess. Extracting value from this data through traditional means is time-consuming and challenging at best, which is why unstructured data has been largely ignored in many commercial use cases. Yet GenAI can plough through it and generate outputs tailored to the desired use case. You can uncover the needles in haystacks to fuel human decisions, or enable more traditional AI systems to learn from this data. Imagine how powerful these systems would become if you increased the amount of useful information available to them by a factor of 5 to 10.
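As a sketch of what ‘ploughing through unstructured data’ can look like in practice, the snippet below asks a chat model to turn a free-text note into structured JSON. The model name, prompt, and example note are illustrative assumptions, not part of any particular product.

```python
# A minimal sketch of using an LLM to turn unstructured text into structured,
# machine-readable output.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

document = """
Customer called on 12 March about a delayed shipment of 40 units to the
Rotterdam warehouse; the agent promised a follow-up within two working days.
"""

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative choice; any capable chat model works
    messages=[
        {"role": "system", "content": "Extract date, issue, quantity and promised action as JSON."},
        {"role": "user", "content": document},
    ],
)
print(response.choices[0].message.content)  # e.g. a JSON object ready for downstream systems
```

Run over thousands of such notes, this kind of extraction is what turns the ignored 80-90% of company data into something downstream systems can actually use.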
Now, if you’ve been following the AI news lately, you’ll know we can take it one step further and think about ‘Agentic AI’, that is, agents powered by AI. These are systems that can not only think, but actually act. In the future, they will likely enable large-scale automation at first, and later complete organizational transformations.
Research into how to make this work is in full swing as of the summer of 2024. The extent to which autonomous AI agents are already feasible is a hotly debated topic. At any rate, ‘simple’ agents are now being developed that are beginning to capitalize on the AI promise. Making the most of these will require their users to carefully consider how to manage their performance, balancing two opposing characteristics: hallucinations (i.e. nonsensical results, which some argue are an inescapable feature of LLMs) and effectiveness.
Against this backdrop, it won’t be long before the early adopters get a head start on the laggards. Teams and organizations that take the time to identify opportunities and capitalize on them are set to move far ahead of the competition.
Fine-tuning can worsen factual correctness in specialized application domains. We discuss the implications.
This article was originally published on LinkedIn. The author, Wouter Huygen, is partner and CEO at Rewire.
A new paper reveals that fine-tuning is not a wonder drug for curing LLM hallucinations. Rather the reverse: fine-tuning can actually worsen performance when the aim is to improve factual correctness in specialized application domains.
Using supervised learning on new knowledge fine-tunes hallucinations, instead of enhancing accuracy
These findings could have profound implications. What if precisely those specialized domains are where most of the value of LLM use cases lies?
The hard problem of hallucinations
The beauty of LLMs is that they are very generic and general-purpose: they contain “knowledge” on a very wide range of subjects covered in the training data. This forms the basis for the claim that the current path will get us (close) to AGI. I don’t think that is true, but that’s for another day.
Clearly, generative AI currently works only up to a point. Measuring hallucination rates is notoriously difficult, but roughly speaking the tech works well in about 80% of cases. And yes, performance depends on many factors, including the prompting abilities of the user. That being said, getting rid of the remaining 20% is arguably the biggest headache of the AI industry.
A long standing question in neuroscience and philosophy is how consciousness arises in the brain. How does a bunch of molecules and electromagnetic waves produce the miracle of our conscious experience? This is referred to as the hard problem of consciousness. But what if science has the premise all wrong? What if consciousness does not arise from matter, but matter is (an illusion) formed in consciousness?
Hallucinations are the current hard problem-to-crack for AI
Similarly, hallucinations are baked into how generative AI works rather than being an incidental side effect: the technology is designed to dream up content, based on probabilistic relationships captured in the model parameters.
Big tech proclaims that the issue can be solved through further scaling, but experts in the field increasingly recognize that we have to view it as a feature, not a bug. After all, who would not be hallucinating after reading the entire internet ;-)
For the short term, the applicability of LLMs, despite their amazing feats, remains more limited than we might hope, especially in high-stakes situations and/or very specialized areas. And these might just be the areas that hold the most value (e.g. in healthcare, providing accurate diagnostic and treatment recommendations).
Unless fundamental algorithmic breakthroughs come along, or scaling proves to work after all, we have to learn how to make the best of what we've got. Work with the strengths, while minimizing downside impact.
Using fine-tuning to develop domain-specific applications
Since the beginning of the GenAI hype, fine-tuning has been touted as one of the ways to improve performance in specific application areas. The approach is to use supervised learning on domain-specific data (e.g. proprietary company data) to fine-tune a foundational (open-source) model, specializing it for a certain use case and increasing factuality.
Intuitively, this makes sense. The foundation model is pre-trained on generic text prediction, with a very broad base of foundational knowledge. Further fine-tuning would then provide the required specialization, based on proprietary and company-specific facts.
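For concreteness, this is roughly what that supervised fine-tuning recipe looks like with the Hugging Face transformers library. The model, the file name, and the hyperparameters are illustrative assumptions rather than recommendations.

```python
# A minimal sketch of supervised fine-tuning a small open-source causal LM on
# domain-specific text.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "gpt2"  # stand-in for any open-source foundation model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Proprietary, domain-specific documents, one example per line (hypothetical file).
dataset = load_dataset("text", data_files={"train": "company_knowledge.txt"})
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-model", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The paper’s warning applies exactly here: the larger the share of company_knowledge.txt that the base model has never seen before, the more this training run risks increasing hallucinations rather than factuality.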
Fine-tuning does not work well for new information
A recent paper investigates the impact of fine-tuning on new information. The authors set out to test the hypothesis that new knowledge can have an unexpected negative impact on model performance, rather than improving it in a specific area. The outcomes are surprising, counter-intuitive at first glance, and impactful.
Fine-tuning on new knowledge proceeds much more slowly than fine-tuning on existing knowledge (i.e. knowledge that was already included in the pre-training data set). But most importantly, beyond a certain point of training, new knowledge deteriorates model performance on existing knowledge. In other words, incorporating specific new information during fine-tuning increases hallucinations. Worse yet, the hallucination rate grows linearly with more training on unknown content.
In intuitive terms, it seems as if the model gets confused with new information and “unlearns” existing knowledge.
Exhibit 1. Train and development accuracies as a function of the fine-tuning duration, when fine-tuning on 50% Known and 50% Unknown examples.

Source: paper from Zorik Gekhman et al.
These conclusions have serious implications for anyone aiming to develop specialized LLM use cases. Fine-tuning remains useful for strengthening model performance in known areas, as well as for improving the form and structure of the desired output. But using fine-tuning to increase factuality on new information does not work well and has undesirable, opposite effects.
The unfortunate correlation between accuracy and value
Using LLMs to build knowledge assistants is a promising use case across many fields. These assistants thrive in knowledge-intensive industries, allowing users to query situation-specific information on demand: healthcare workers, pharmaceutical advisors, customer service agents, professional services, and so on. Not only do they increase the effectiveness and efficiency of their users, they also allow enterprise knowledge and IP to be accumulated in a much more sustainable and scalable manner. They become like digital co-workers that never resign, unless you fire them.
As long as humans can be in the loop to verify output, or when the impact of inaccurate information is low, current LLM technology is already good enough. But in many situations, most of the value would actually come from reliability and factual correctness, rather than from an 80% answer that can be manually adjusted (like a draft email).
What to do instead?
To enhance performance in specific application areas amidst existing technological constraints, companies and developers must adopt a pragmatic, empirical engineering approach, employing a combination of techniques to forge optimal solutions. Innovations like Retrieval-Augmented Generation (RAG), fine-tuning processes that account for new versus existing knowledge, advanced context embedding, and post-processing output verification are reshaping our methodologies daily.
The insights discussed here demonstrate the importance of staying abreast of this fast-developing field in order to keep pushing the performance boundaries of GenAI applications. Until new breakthroughs happen in foundation models, we have to keep learning new tricks of the trade to get the most out of today's state of the art.
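Of these techniques, RAG is the most common alternative to fine-tuning for factuality: instead of baking new facts into the model’s weights, you retrieve the relevant documents at question time and let the model answer from them. Below is a minimal sketch, with illustrative model names and toy documents.

```python
# A minimal Retrieval-Augmented Generation (RAG) sketch: ground the model's
# answer in retrieved documents instead of relying on fine-tuned 'memory'.
import numpy as np
from openai import OpenAI

client = OpenAI()

documents = [
    "Our premium support plan includes a 4-hour response time on weekdays.",
    "Refunds are processed within 14 days of receiving the returned item.",
]

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vectors = embed(documents)

def answer(question: str) -> str:
    q_vec = embed([question])[0]
    # Cosine similarity to find the most relevant document.
    scores = doc_vectors @ q_vec / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q_vec))
    context = documents[int(np.argmax(scores))]
    reply = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": f"Answer only using this context: {context}"},
            {"role": "user", "content": question},
        ],
    )
    return reply.choices[0].message.content

print(answer("How fast do you respond on a premium plan?"))
```

Because the facts live in the retrieved context rather than in the model parameters, updating the knowledge base requires no retraining at all.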