Unlocking true intelligence with memory, strategic planning, and transparent performance management
The rise of intelligent AI agents marks a shift from traditional task automation toward more adaptive, decision-making systems. While many AI-powered workflows today rely on predefined rules and deterministic processes, true agentic AI goes beyond fixed automation—it incorporates memory, reasoning, and self-improvement to adapt and autonomously achieve specific goals.
The previous agentic AI blog post explained the building blocks (memory, planning, tools and actions), use cases, and potential for autonomy. It positioned AI agents not just as tools for automating repetitive tasks but as systems capable of enhancing workflow efficiency and tackling complex problems with some degree of independent decision-making. In this blog post, we explore how these agents can be seamlessly integrated into real-world scenarios, striking a balance between cognitive science theory and the practical realities of human-agent interaction, with a particular focus on the memory and planning building blocks, alongside the addition of performance management.
Moving from models to real-world AI agents
Scaling AI agents presents challenges. While they can improve efficiency, their success depends on integration with existing systems. Use cases like personal assistants and content generators demonstrate clear value, whereas others struggle with reliability and adaptability.
Current LLM-based workflow automation relies on knowledge—whether through large reasoning models or knowledge bases. However, these agents often lack persistent memory, meaning they re-solve the same issues repeatedly. Without storing and leveraging past experiences, they remain reactive rather than truly intelligent. To bridge this gap, AI agents need:
- Memory – Retaining and applying past experiences.
- Ability to plan – Setting goals, adapting strategies, and managing complexity.
- Transparent performance management – Ensuring alignment, oversight, and trust.
These elements go beyond the building blocks of tools and actions, which have already been widely discussed in AI agent design. Here we focus on memory, planning, and performance management, as they represent critical design choices to move toward AI agents that are not just reactive task automators, but intelligent, adaptable decision-makers capable of handling more sophisticated tasks in real-world scenarios. Let’s start by exploring what intelligence in that sense even means.
Memory: the key to true intelligence
True intelligence goes beyond automation. An AI agent must not only process information but also learn from past experiences to improve over time. Without memory, an AI agent would remain static, unable to adapt or evolve. By integrating memory, reasoning, and learning, an intelligent AI agent moves beyond simply performing predefined tasks. In our previous blog post on AI agents, we explained that memory, planning, tools and actions are the building blocks of agents. Now, let's examine how memory plays a crucial role in enhancing an AI agent's capabilities. In cognitive science, memory is divided into:
- Semantic memory – A structured database of facts, rules, and language that provides foundational knowledge.
- Episodic memory – Past experiences that inform future decisions and allow for adaptation.
The interplay between semantic and episodic memory enables self-improvement: experiences enrich knowledge, while knowledge structures experiences. When agents lack episodic memory, they struggle with contextual awareness and must rely solely on predefined rules or external prompting to function effectively. Episodic memory is therefore crucial to building intelligent agents. By organizing past interactions into meaningful units (through chunking), agents can recall relevant solutions, compare them to new situations, and refine their approach. This form of memory actively supports an agent’s ability to reflect on past actions and outcomes.
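To make this concrete, below is a minimal sketch of an episodic memory that stores past situations as embeddings and recalls the most similar ones. The `embed` stub and all class and variable names are illustrative assumptions, not a reference implementation; a real system would use an actual sentence-embedding model.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: swap in any real sentence-embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)

class EpisodicMemory:
    """Stores past episodes and recalls the most similar ones."""

    def __init__(self):
        self.episodes = []  # (situation, outcome, embedding) triples

    def store(self, situation: str, outcome: str) -> None:
        self.episodes.append((situation, outcome, embed(situation)))

    def recall(self, situation: str, k: int = 3):
        query = embed(situation)
        scored = [
            (float(query @ vec / (np.linalg.norm(query) * np.linalg.norm(vec))), sit, out)
            for sit, out, vec in self.episodes
        ]
        return sorted(scored, reverse=True)[:k]  # top-k most similar episodes

memory = EpisodicMemory()
memory.store("Shipment delayed by storm on the NL-DE route",
             "Rescheduled carrier and notified the planner")
print(memory.recall("Heavy snowfall expected on the NL-DE route"))
```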
To illustrate this, let's consider an example from supply chain management, as depicted in the image below. An AI agent that tracks delivery data, inventory, and demand can improve logistics by learning from past experiences. If the agent identifies patterns, such as delays during certain weather conditions or peak seasons, it can proactively adjust shipping schedules and notify relevant stakeholders. Without memory, the agent would simply repeat tasks without optimizing them, leading to inefficiencies and missed opportunities for improvement.

Figure 1 - Memory in AI Agents - An example from supply chain management
Ability to plan: the foundation for autonomous decision-making
Intelligent AI agents must dynamically plan and break complex problems into manageable tasks—mirroring the analytical nature of the human mind. Unlike rule-based automation, these agents should be able to assess different strategies, evaluate potential outcomes, and adjust their approach based on real-time feedback. Planning allows an agent to remain flexible, ensuring it can pivot when conditions change rather than blindly following predefined sequences.
LLMs serve as the reasoning engines of AI agents, showcasing increasingly advanced cognitive abilities. However, they struggle with long-term memory and sustained focus—much like the human mind under information overload. This limitation poses challenges in designing AI agents that must retain context across extended interactions or tackle complex problem-solving tasks.
A critical design question is whether an agent should retain plans internally or offload them to an external tool. Keeping plans within an LLM provides full information access but may be limited by context constraints. For example, an AI managing a real-time chat-based customer support system could benefit from internal memory to dynamically adapt to an ongoing conversation, keeping track of the customer's previous questions and preferences without relying on external systems. This allows the agent to provide personalized responses without the delay of querying an external database. On the other hand, external tools lighten the cognitive load but can introduce rigidity if not well-integrated. For instance, an AI-powered weather application might be better off using an external tool to retrieve up-to-date weather data rather than relying on its internal model, which could become outdated or too complex to manage. This allows the system to focus on processing and presenting the information without overloading its internal resources.

A balanced approach ensures adaptability without overloading the agent’s working memory. Ultimately, the necessity of such a tool depends on the LLM's ability to retrieve, retain, and adjust information—an advanced reasoning model might even eliminate the need for external tools.
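As a hedged illustration of the offloading option, the sketch below keeps the plan in a minimal external store that the agent reads and updates between reasoning steps, so the LLM context only needs to carry the goal and the current step. All names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Plan:
    """External plan store: the LLM context only carries the current step."""
    goal: str
    steps: list[str]
    done: list[str] = field(default_factory=list)

    def current_step(self):
        return self.steps[0] if self.steps else None

    def complete_step(self) -> None:
        self.done.append(self.steps.pop(0))

plan = Plan(
    goal="Resolve delayed shipment #4821",
    steps=["Fetch carrier status", "Check inventory buffer", "Notify stakeholders"],
)
while plan.current_step():
    # A real agent would prompt the LLM with only the goal and current step here.
    plan.complete_step()
print(plan.done)
```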
For example, an AI-powered financial advisor might need to balance long-term investment strategies with short-term market fluctuations. If it relies too heavily on immediate context, it might make impulsive decisions based on temporary trends. On the other hand, if it solely adheres to a rigid external planning framework, it might fail to adapt to new opportunities. The ideal approach blends both—leveraging structured knowledge while maintaining the ability to dynamically reassess and adjust strategies.
Transparent performance management: balancing efficiency and trust
Human-AI agent interaction is shaped by the trade-off between efficiency and trust: the more autonomous an AI agent becomes, the more it can streamline operations and reduce human workload—yet the less transparent its decision-making may feel. In scenarios where tasks are low-risk and repetitive, full automation makes sense as errors have minimal impact, and efficiency gains outweigh the downsides. However, in high-stakes environments like financial trading or medical diagnosis, the costs of a wrong decision are simply too high. Transparent performance management is thus essential.
The challenge is that AI agents, while improving, are still fallible, inheriting issues like hallucinations and biases from LLMs. AI must operate within defined trust thresholds—where automation is reliable enough to act independently yet remains accountable. Rather than requiring continuous human oversight, performance management should focus on designing mechanisms that allow AI agents to function autonomously while ensuring reliability. This involves self-monitoring, self-correction, and explainability.
Mechanisms like agent self-critique mitigate these issues by enabling agents to evaluate their own decisions before execution. Also known as LLM-as-a-judge, self-critique involves sending both the input and the generated output to a separate LLM that is unaware of the wider agentic workflow, which then assesses whether the response logically follows from the input. This process helps catch hallucinations, biases, and inconsistencies before decisions are finalized, improving the reliability of autonomous AI agents.
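A minimal sketch of such a self-critique step might look as follows, assuming a generic `call_llm` helper that wraps whichever model client you use; the helper and the judge prompt are illustrative assumptions.

```python
def call_llm(prompt: str) -> str:
    """Placeholder: replace with whichever model client you use."""
    raise NotImplementedError

def self_critique(task_input: str, draft_output: str) -> bool:
    """Ask a separate judge model whether the output follows from the input."""
    verdict = call_llm(
        "You are a strict reviewer with no knowledge of the wider workflow.\n"
        f"Input:\n{task_input}\n\nProposed output:\n{draft_output}\n\n"
        "Does the output logically follow from the input, without unsupported "
        "claims? Answer only PASS or FAIL."
    )
    return verdict.strip().upper().startswith("PASS")

# if self_critique(user_request, agent_answer):
#     execute(agent_answer)  # only act once the judge passes the output
```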
In the early stages of AI agent deployment, human experts play a crucial role in shaping and refining performance management processes. However, as these agents evolve, the goal is to reduce direct human intervention while maintaining oversight through structured performance metrics. Instead of requiring constant check-ins, AI agents should be designed to self-monitor and adapt, ensuring alignment with objectives without excessive human involvement. By incorporating mechanisms for self-assessment, AI agents can achieve greater autonomy while maintaining accountability. The ultimate aim is to develop fully autonomous agents that balance efficiency with transparency—operating independently while ensuring performance remains reliable.
For example, consider an AI agent managing IT system maintenance in a large enterprise. Such an agent monitors server performance, security threats, and software updates. Instead of relying on human intervention for every decision, it can autonomously detect anomalies, apply minor patches, and optimize system configurations based on historical performance data. However, major decisions—such as deploying a company-wide software update—may still require validation through transparent reporting mechanisms. If the AI agent consistently demonstrates accuracy in its assessments and risk predictions, human involvement can gradually decrease, ensuring both operational efficiency and system integrity.
What’s next for AI agents?
AI agents are evolving beyond simple automation. To become truly intelligent, they must adapt, plan, and learn from experience. Without memory, an agent is static. Without planning, it lacks direction. Without transparent performance management, it risks unreliability.
By integrating memory, planning, and performance management, AI agents can move beyond task execution toward strategic problem-solving. Future AI will not merely automate processes but will actively contribute to decision-making, helping organizations navigate complexity with greater precision and efficiency.
The future belongs to AI that doesn’t just execute tasks but remembers, adapts, and improves. An agent without intelligence is merely automation with an attitude.
Sources
Greenberg DL, Verfaellie M. "Interdependence of episodic and semantic memory: evidence from neuropsychology." J Int Neuropsychol Soc. 2010;16(5):748-753. doi:10.1017/S1355617710000676.
"Does AI Remember? The Role of Memory in Agentic Workflows." (2025).
"RULER: What's the Real Context Size of Your Long-Context Language Models?" (2024). arXiv:2404.06654.
This article was written by Gijs Smeets, Data Scientist at Rewire and Mirte Pruppers, Data Scientist at Rewire.
An introduction to the world of LLM output quality evaluation: the challenges and how to overcome them in a structured manner
Perhaps you’ve been experimenting with GenAI for some time now, but how do you determine when the output quality of your Large Language Model (LLM) is sufficient for deployment? Of course, your solution needs to meet its objectives and deliver reliable results. But how can you evaluate this effectively?
In contrast, assessing traditional machine learning models is often a relatively straightforward process: metrics like Area Under the Curve for classification or Mean Absolute Percentage Error for regression give you valuable insights into the performance of your model. Evaluating LLMs is another ball game, since GenAI generates unstructured, subjective outputs (texts, images, or videos) that often lack a definitive "correct" answer. This means that you're not just assessing whether the model produces accurate outputs; you also need to consider, for example, relevance and writing style.
For many LLM-based solutions (except those with very specific tasks, like text translation), the LLM is just one piece of the puzzle. LLM-based systems are typically complex, since they often involve multi-step pipelines, such as retrieval-augmented generation (RAG) or agent-based decision systems, where each component has its own dependencies and performance considerations.
In addition, system performance (latency, cost, scalability) and responsible GenAI (bias, fairness, safety) add more layers of complexity. LLMs operate in ever-changing contexts, interacting with evolving data, APIs, and user queries. Maintaining consistent performance requires constant monitoring and adaptation.
With so many moving parts, figuring out where to start can feel overwhelming. In this article, we’ll purposefully over-simplify things by answering the question: “How can you evaluate the quality of your LLM output?” First, we explain what areas you should consider in the evaluation of the output. Then, we discuss the methods needed to evaluate output. Finally, to make it concrete, we bring everything together in an example.
What are the evaluation criteria of LLM output quality?
High-quality outputs build trust and improve user experience, while poor-quality responses can mislead users and foster misinformation. Building an evaluation (eval) starts with the end goal of the model. The next step is to define the quality criteria to be evaluated. Typically these are:
- Correctness: Are the claims generated by the model factually accurate?
- Relevance: Is the information relevant to the given prompt? Is all required information provided (by the end user, or in the training data) to adequately answer the given prompt?
- Robustness: Does the model consistently handle variations and challenges in input, such as typos, unfamiliar question formulations, or types of prompts that the model was not specifically instructed for?
- Instruction and restriction adherence: Does the model comply with predefined restrictions or is it easily manipulated to jailbreak the rules?
- Writing style: Does the tone, grammar, and phrasing align with the intended audience and use case?
How to test the quality of LLM outputs?
Now that we’ve identified what to test, let’s explore how to test. A structured approach involves defining clear requirements for each evaluation criterion listed in the previous section. There are two aspects to this: references to steer your LLM towards the desired output and the methods to test LLM output quality.
1. References for evaluation
In LLM-based solutions, the desired output is referred to as the golden standard, which contains reference answers for a set of input prompts. Moreover, you can provide task-specific guidelines, such as model restrictions, and evaluate how well the solution adheres to those guidelines.
While a golden standard and task-specific guidelines can effectively steer your model in the desired direction, creating them often requires a significant time investment and may not always be feasible. Alternatively, performance can be assessed through open-ended evaluation. For example, you can use another LLM to assess relevance, execute generated code to verify its validity, or test the model on an intelligence benchmark.
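Concretely, a golden standard can start as little more than a list of prompt–reference pairs plus any guidelines; the entries below are invented for illustration.

```python
golden_standard = [
    {
        "prompt": "How many r's are in strawberry?",
        "reference": "There are three 'r's' in 'strawberry'.",
    },
    {
        "prompt": "Summarize the return policy in one sentence.",
        "reference": "Items can be returned within 30 days with proof of purchase.",
    },
]
task_guidelines = ["Answer in English.", "Never give financial advice."]
```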
2. Methods for assessing output quality
Selecting the right method depends on factors like scalability, interpretability, and the evaluation requirement being measured. In this section we explore several methods, and assess their strengths and limitations.
2.1. LLM-as-a-judge
An LLM isn’t just a text generator—it can also assess the outputs of another LLM. By assessing outputs against predefined criteria, LLMs provide an automated and scalable evaluation method.
Let’s demonstrate this with an example: ask the famous question, "How many r's are in strawberry?" to ChatGPT's 4o mini model. It responds with, "The word 'strawberry' contains 1 'r'.", which is obviously incorrect. With the LLM-as-a-judge method, we would like the evaluating LLM (in this case, also 4o mini) to recognize and flag this mistake. In this example, there is a golden reference answer, "There are three 'r's' in 'strawberry'.", which can be used to evaluate the correctness of the answer.

Indeed, the evaluating LLM appropriately recognizes that the answer is incorrect.
The example shows that LLMs can evaluate outputs consistently and at scale thanks to their ability to quickly assess several criteria. On the other hand, LLMs may struggle to understand complex, context-dependent nuances or subjective cases. Moreover, they may amplify biases present in their training data and can be costly to use as an evaluation tool.
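To sketch how this could look in code, the snippet below grades an answer against the golden reference; the `call_llm` helper and the prompt wording are assumptions, not a specific product's API.

```python
def call_llm(prompt: str) -> str:
    """Placeholder: replace with whichever chat-completion client you use."""
    raise NotImplementedError

def judge_correctness(question: str, answer: str, reference: str) -> str:
    """Have an evaluating LLM grade an answer against a golden reference."""
    return call_llm(
        "You are an impartial grader.\n"
        f"Question: {question}\n"
        f"Candidate answer: {answer}\n"
        f"Reference answer: {reference}\n"
        "Reply CORRECT or INCORRECT, then give one sentence of justification."
    )

# judge_correctness(
#     "How many r's are in strawberry?",
#     "The word 'strawberry' contains 1 'r'.",
#     "There are three 'r's' in 'strawberry'.",
# )  # Expected verdict: INCORRECT.
```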
2.2. Similarity metrics for texts
When a golden reference answer is available, similarity metrics provide scalable and objective assessments of LLM performance. Well-known examples include NLP metrics like BLEU and ROUGE, and more advanced embedding-based metrics like cosine similarity and BERTScore. These methods provide quantitative insight by measuring the overlap in words and sentence structure without the computational burden of running full-scale LLMs. This can be beneficial when outcomes must closely align with provided references – for example, in the case of summarization or translation.
While automated metrics provide fast, repeatable, and scalable evaluations, they can fall short on interpretability and often fail to capture deeper semantic meaning and factual accuracy. As a result, they are best used in combination with human evaluation or other evaluation methods.
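As an illustration, here is a minimal ROUGE computation using the open-source rouge-score package (assuming `pip install rouge-score`); the reference and candidate texts are invented.

```python
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
reference = "The order shipped on Monday and arrives on Wednesday."
candidate = "The order was shipped Monday and should arrive on Wednesday."

scores = scorer.score(reference, candidate)
for name, score in scores.items():
    print(f"{name}: precision={score.precision:.2f}, "
          f"recall={score.recall:.2f}, f1={score.fmeasure:.2f}")
```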
2.3. Human evaluation
Human evaluation is a powerful method due to its flexibility. In the early stages of model development, it is used to thoroughly evaluate errors such as hallucinations, reasoning flaws, and grammar mistakes, providing insights into model limitations. As the model improves through iterative development, groups of evaluators can systematically score outputs on correctness, coherence, fluency, and relevance. To reduce workload and enable real-time human evaluation after deployment, pairwise comparison can be used: two outputs for the same prompt are compared to determine which performs better. This is in fact implemented in ChatGPT.
It is recommended to use both experts and non-experts in the human evaluation of your LLM. Experts can validate the model’s approach based on their expertise, while non-experts play a crucial role in identifying unexpected behaviors and offering fresh perspectives on real-world system usage.
While human evaluation offers deep, context-aware insights and flexibility, it is resource- and time-intensive. Moreover, different evaluators can produce inconsistent judgments when they are not aligned.
2.4. Benchmarks
Lastly, there are standardized benchmarks that offer an approach to assess the general intelligence of LLMs. These benchmarks evaluate models on various capabilities, such as general knowledge (SQuAD), natural language understanding (SuperGLUE), and factual consistency (TruthfulQA). To maximize their relevance, it’s important to select benchmarks that closely align with your domain or use case. Since these benchmarks test broad abilities, they are often used to identify an initial model for prototyping. However, standardized benchmarks can provide a skewed perspective due to their lack of alignment with your specific use case.
2.5. Task specific evaluation
Depending on the task, other evaluation methods are appropriate. For instance, when testing a categorization LLM, accuracy can be measured using a predefined test set alongside a simple equality check of the predicted category versus the actual category. Similarly, the structure of outputs can be tested by counting line breaks or checking for required headers and keywords, as sketched below. Although these techniques are not easily generalizable across different use cases, they offer a precise and efficient way to verify model performance.
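A hedged sketch of two such checks, with invented data and an assumed three-section structure:

```python
def category_accuracy(predictions: list[str], actuals: list[str]) -> float:
    """Simple equality check for a categorization LLM."""
    hits = sum(p.strip().lower() == a.strip().lower()
               for p, a in zip(predictions, actuals))
    return hits / len(actuals)

def has_required_structure(output: str,
                           headers=("Summary", "Details", "Next steps")) -> bool:
    """Verify a three-section structure by counting blocks and checking headers."""
    sections = [s for s in output.split("\n\n") if s.strip()]
    return len(sections) == 3 and all(h in output for h in headers)

print(category_accuracy(["refund", "tracking"], ["refund", "complaint"]))  # 0.5
```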
Putting things together: writing an eval to measure LLM summarization performance
Consider a scenario where you're developing an LLM-powered summarization feature designed to condense large volumes of information into three structured sections. To ensure high-quality performance, we evaluate the model for each of our five evaluation criteria. For each criterion, we identify a key question that guides the evaluation. This question helps define the precise metric needed and determines the appropriate method for calculating it.
| Criterion | Key question | Metric | How |
|---|---|---|---|
| Correctness | Is the summary free from hallucinations? | Number of statements in the summary that can be verified against the source text | Use LLM-as-a-judge to check whether each statement can be answered from the source texts; use human evaluation to verify the correctness of outputs |
| Relevance | Is the summary complete? | Number of key elements present with respect to a reference guideline or golden standard summary | Cross-reference statements in summaries with LLM-as-a-judge and measure the overlap |
| Relevance | Is the summary concise? | Number of irrelevant statements with respect to the golden standard; length of the summary | Cross-reference statements in summaries with LLM-as-a-judge and measure the overlap; count the number of words in generated summaries |
| Robustness | Is the model prone to noise in the input text? | Similarity of the summary generated for the original text to the summary generated for text with noise, such as typos and inserted irrelevant information | Compare statements with LLM-as-a-judge, or compare textual similarity with ROUGE or BERTScore |
| Instruction & restriction adherence | Does the summary comply with the required structure? | Presence of three structured sections | Count the number of line breaks and check the presence of headers |
| Writing style | Is the writing style professional, fluent, and free of grammatical errors? | Rating of tone of voice, fluency, and grammar | Ask LLM-as-a-judge to rate fluency and professionality and mark grammatical errors; rate writing style with human evaluation |
| Overarching: alignment with golden standard | Do generated summaries align with golden standard summaries? | Textual similarity with respect to the golden standard summary | Calculate similarity with ROUGE or BERTScore |
The table shows that the proposed evaluation strategy leverages multiple tools and combines reference-based and reference-free assessments to ensure a well-rounded analysis. And so we ensure that our summarization model is accurate, robust, and aligned with real-world needs. This multi-layered approach provides a scalable and flexible way to evaluate LLM performance in diverse applications.
Final thoughts
Managing LLM output quality is challenging, yet crucial to build robust and reliable applications. To ensure success, here are a few tips:
- Proactively define the evaluation criteria. Establish clear quality standards before model deployment to ensure a consistent assessment framework.
- Automate when feasible. While human evaluation is essential for subjective aspects, automate structured tests for efficiency and consistency.
- Leverage GenAI to broaden your evaluation. Use LLMs to generate diverse test prompts, simulate user queries, and assess robustness against variations like typos or multi-language inputs.
- Avoid reinventing the wheel. There are already various evaluation frameworks available on the internet (for instance, DeepEval). These frameworks provide structured methodologies that combine multiple evaluation techniques.
Achieving high-quality output is only the beginning. Generative AI systems require continuous oversight to address challenges that arise after deployment. User interactions can introduce unpredictable edge cases, exposing the gap between simulated scenarios and real-world usage. In addition, updates to models and datasets can impact performance, making continuous evaluation crucial to ensure long-term success. At Rewire, we specialize in helping organizations navigate the complexities of GenAI, offering expert guidance to achieve robust performance management and deployment success. Ready to take your GenAI solution to the next level? Let’s make it happen!
This article was written by Gerben Rijpkema, Data Scientist at Rewire, and Renske Zijm, Data Scientist at Rewire.
Steeped in history, built for the future, the new Rewire office in Amsterdam is a hub for innovation, collaboration, and impact
Rewire has taken a new step forward: a new home in Amsterdam! But this isn’t just an office. It’s a reflection of our growth, vision, and commitment to driving business impact through data & AI. With this new office, we aim to create an inspiring home away from home for our employees, and a great place to connect and collaborate with our clients and partners.
Steeped in history, built for the future
Amsterdam is a city where history and innovation coexist. Our new office embodies this spirit. Nestled in scenic Oosterpark, which dates back to 1891, and set in the beautifully restored Amstel Brewery stables from 1912, our location is a blend of tradition and modernity.


The fully refurbished Rewire office in Amsterdam once served as the stables of the Amstel brewery.
But this isn’t just about aesthetics—our presence within one of Amsterdam’s most vibrant areas is strategic. Surrounded by universities, research institutions, cultural and entertainment landmarks, we’re positioned at the crossroads of academia, industry, and creativity. Moreover, the city has established itself as a global hub for Data & AI, attracting global talent, startups, research labs, and multinational companies. Thus, we’re embedding ourselves in an environment that fosters cross-disciplinary collaboration and promotes impact-driven solutions. All in all, our new location ensures that we stay at the forefront of AI while providing an inspiring setting to work, learn, and imagine the future.
A space for collaboration & growth
At Rewire, we believe that knowledge-sharing and hands-on learning are at the core of meaningful AI adoption. That’s why our new Amsterdam office is more than just a workspace—it’s a hub for collaboration, education, and innovation.

More than just an office, it's a hub for collaboration, education, and innovation.
We’ve created dedicated spaces for GAIN’s in-person training, workshops, and AI bootcamps, reinforcing our commitment to upskilling talent and supporting the next generation of AI professionals. Whether through hands-on coding sessions, strategic AI leadership discussions, or knowledge-sharing events, this space is designed to develop the Data & AI community of tomorrow.
Beyond structured learning, we’ve designed our office to be an environment where teams can engage in deep problem-solving, collaborate on projects, and push the boundaries of what can be achieved. Our goal is to bridge the gap between research and real-world application, thus helping clients leverage AI’s full potential.


The new office includes a variety of spaces, from small quiet rooms for deep thinking to large open spaces for group work and socializing.
Sustainability & excellence at the core
Our new office is built to the highest standards of quality and sustainability, incorporating modern energy-efficient design, eco-friendly materials, and thoughtfully designed workspaces.

Eco-friendly and energy efficient, the new office retains the fixtures of the original Amstel stables.
We’ve curated an office that balances dynamic meeting spaces, collaborative areas, and quiet zones for deep thinking—all designed to support flexibility, focus, and innovation. Whether brainstorming the next AI breakthrough or engaging in strategic discussions, our employees have a space that fosters creativity, problem-solving, and impactful decision-making.

The office is not just next to Oosterpark, one of Amsterdam's most beautiful parks. It is also home to thousands of plants.
We’re just getting started
Our move to a new office in Amsterdam represents a new chapter in Rewire’s journey, but this is only the beginning. As we continue to expand, build, and collaborate, we look forward to engaging with the broader Data & AI community, fostering innovation, and shaping the future of AI-driven impact.
We’re excited for what’s ahead. If you’re interested in working with us, partnering on AI initiatives, or visiting our new space—contact us!

The team that made it happen.
How a low-cost, open-source approach is redefining the AI value chain and changing the reality for corporate end-users
At this point, you’ve likely heard about it: DeepSeek. Founded in 2023 by Liang Wenfeng, co-founder of the quantitative hedge fund High-Flyer, this Hangzhou-based startup is rewriting the rules of AI development with its low-cost model development and open-source approach.
Quick recap first.
From small steps to giant leaps
There are actually two model families launched by the startup: DeepSeek-V3 and DeepSeek R1.
V3 is a Mixture-of-Experts (MoE) large language model (LLM) with 671 billion parameters. Thanks to a number of optimizations, it can provide performance similar to or better than that of other large foundational models, such as GPT-4o and Claude-3.5-Sonnet. What’s even more remarkable is that V3 was trained in around 55 days at a fraction of the cost of similar models developed in the U.S.: less than US$6 million for DeepSeek V3, compared with tens of millions, or even billions, of dollars in investments for its Western counterparts.
R1, released on January 20, 2025, is a reasoning LLM that uses innovations applied to the V3 base model to greatly improve its performance in reasoning. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. The kicker is that R1 is published under the permissive MIT license: this license allows developers worldwide to modify the model for proprietary or commercial use, which paves the way for accelerated innovation and adoption.
Less is more: reshaping the AI value chain
As other companies emulate DeepSeek’s achievements, its cost-effective approach and open-sourcing of its technology signal three upcoming shifts:
1. Lower fixed costs and increased competition. DeepSeek shows that cutting-edge LLMs no longer require sky-high investments. Users have reported running the V3 model on consumer Mac hardware and even foresee its use on devices as lightweight as a Raspberry Pi. This opens the door for smaller players to develop their own models.
2. Higher variable costs driven by higher query costs. As reasoning models become widely available and more sophisticated, their per-query costs rise due to the energy-intensive reasoning processes. This dynamic creates new opportunities for purpose-built models – as opposed to general-purpose models like OpenAI’s ChatGPT.
3. Greater value creation down the AI value chain. As LLM infrastructure becomes a commodity (much like electricity) and companies develop competing models that dent the monopolistic power of big tech (although the latter will continue to push the boundaries with new paradigm breakthroughs), the development of solutions in the application layer – where most of the value for end-users resides – becomes much more accessible and cost-effective. Hence, we expect accelerated innovations in the application layer and the AI market to move away from “winner-takes-all” dynamics, with the emergence of diverse applications tailored to specific industries, domains, and use cases.
So what’s next for corporate end-users?
Most corporate end-users will find that having a reliable and “good enough” model matters more than having the absolute best model. Advances in reasoning such as R1 could be a big step for AI agents that deal with customers and perform tasks in the workplace. If those capabilities become available more cheaply, corporate adoption will increase and bottom lines will benefit.
Taking a step back, a direct consequence of the proliferation of industry and function-specific AI solutions built in the application layer is that AI tools and capabilities will become a necessary component for any company seeking to build competitive advantage.
At Rewire, we help clients prepare themselves for this new reality by building solutions and capabilities that leverage this new “AI commodity” and turning AI potential into real-world value. To explore how we can help you harness the next wave of AI, contact us.
From customer support to complex problem-solving: exploring the core components and real-world potential of Agentic AI systems
For decades, AI has captured our imaginations with visions of autonomous systems like R2-D2 or Skynet, capable of independently navigating and solving complex challenges. While those visions remain firmly rooted in science fiction, the emergence of Agentic AI signals an exciting step in that direction. Powered by GenAI models, these systems promise to improve adaptability and decision-making beyond passive interactions or narrow tasks.
Consider the example of e-commerce, where customer support is critical. Traditional chatbots handle basic inquiries, but when a question becomes more complex—such as tracking a delayed shipment or providing tailored product recommendations—the need for human intervention quickly becomes apparent. This is where GenAI agents can step in, bridging the gap between basic automation and human-level problem solving.
The promise of GenAI-driven agents isn’t about overnight industry transformation but lies in their ability to augment workflows, enhance creative processes, and tackle complex challenges with a degree of autonomy. Yet, alongside this potential come significant technological, ethical, and practical challenges that demand thoughtful exploration and development.
In this blog, we’ll delve into what sets these systems apart, examine their core components, and explore their transformative potential through real-world examples. Let’s start by listing a few examples of possible use cases for GenAI agents:
- Personal assistants that schedule meetings based on availability and send out invitations.
- Content creators that generate blog posts, social media copy, and product descriptions with speed and precision.
- Code assistants that assist developers in writing, debugging, and optimizing code.
- Healthcare assistants that analyze medical records and provide diagnostic insights.
- AI tutors that personalize learning experiences, offering quizzes and tailored feedback to students.
GenAI agents handle tasks that previously required significant human effort, freeing up valuable time and resources for strategic thinking and innovation. But what exactly are these GenAI agents, how do they work and how to design them?
Understanding the core components of GenAI agents
Unlike traditional chatbots, GenAI agents transcend simple text generation. Three core principles capture what sets them apart:
- They act autonomously. These agents are capable of taking goal-driven actions, such as querying a database or generating a report, without explicit human intervention.
- They plan and reason. Leveraging advanced reasoning capabilities, they can break down complex tasks into actionable steps.
- They integrate with tools. While LLMs are great for answering questions, GenAI agents use tools and external systems to access real-time data, perform calculations, or retrieve historical information.
What makes these capabilities possible? At the heart of every GenAI agent lies the power of GenAI models, specifically large language models (LLMs). LLMs are the engine behind the agent's ability to understand natural language, adapt to diverse tasks, and simulate reasoning. Without them, these agents couldn’t achieve the nuanced communication or versatility required for autonomy or to handle complex inputs.
But how do all the pieces come together to create such a system? To understand the full picture, we need to look beyond LLMs and examine the other key components that make GenAI agents work. Together, these components (Interface, Memory, Planning, Tools, and Action) enable the agent to process information, make decisions, and execute tasks. Let’s explore each of these building blocks in detail.

1. Interface: the bridge between you and the AI
The interface is the gateway through which users communicate with the GenAI agent. It serves as the medium for input and output, allowing users to ask questions, give commands, or provide data. Whether it’s a text-based chat, a voice command, or a more complex graphical user interface (GUI), the interface ensures the agent can understand human input and convert it into actionable data.
2. Memory: remembering what matters
Memory is what allows a GenAI agent to learn from past experiences and adapt over time. It stores both short-term and long-term information, helping the agent maintain context across interactions and deliver personalized experiences, based on past conversations and preferences.
3. Planning: charting the path to success
The planning component is the brain behind the agent’s decision-making process. This is essentially using the reasoning of an LLM to break down the problem into smaller tasks. When faced with a task or problem, the agent doesn’t just act blindly. Instead, it analyses the situation, sets goals and priorities, and devises a strategy to accomplish them. This ability to plan ensures that the agent doesn’t simply react in predefined ways, but adapts its actions for both simple and complex scenarios.
4. Tools: extending the agent’s capabilities
No GenAI agent is an island—it needs to access additional resources to solve more specialized problems. Tools can be external resources, APIs, databases, or even other specialized (GenAI) agents that the agent can use to extend its functionality. By invoking tools when needed, the agent can perform tasks that go beyond its core abilities, making it more powerful and versatile.
5. Action: bringing plans to life
Once the agent has formulated a plan, it’s time to take action. The action component is where the agent moves from theory to practice, executing tasks, sending responses, or interacting with tools and external systems. It’s the moment where the GenAI agent delivers value by fulfilling its purpose, completing a task, or responding to a user request.
The core components of a GenAI agent in action
Now that we’ve broken down the core components of a GenAI agent, let’s see how they come together in a real-world scenario. Let’s get back to our customer support example and imagine a customer who is inquiring about the status of their order. Here’s how the agent seamlessly provides a thoughtful, efficient response:

- Interface: The agent receives the customer query, "What is the status of my order?"
- Memory: The agent remembers the customer placed an order on January 15th 2025 with order ID #56789 from a previous interaction.
- Planning: The agent breaks down the task into steps: retrieve order details, check shipment status, and confirm delivery date.
- Tools: The agent accesses the DHL API with the customer’s order reference to get the most recent status update, and additionally queries the CRM to confirm order details.
- Action: The agent retrieves the shipping status, namely “Out for delivery” as of January 22nd, 2025.
- Interface: The agent sends the response back to the customer: "Your order is on its way and is expected to arrive on January 22nd, 2025. Here’s the tracking link for more details: [link]. Would you like to set a delivery reminder?"
If the customer opts to set a reminder, the agent can handle this request as well by using its available tools to schedule the reminder accordingly, something that basic automation would not be able to do.
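A hedged sketch of this flow in code, with fake stand-ins for the CRM, carrier API, and LLM client (all names are hypothetical, not real integrations):

```python
def call_llm(prompt: str) -> str:
    """Placeholder for the real model client; echoes its input here."""
    return f"[LLM reply based on]\n{prompt}"

class FakeCRM:
    def get_order(self, order_id: str) -> dict:
        return {"id": order_id, "placed": "2025-01-15", "tracking_ref": "JVGL123"}

class FakeCarrierAPI:
    def get_status(self, tracking_ref: str) -> str:
        return "Out for delivery, expected January 22nd, 2025"

crm, dhl_api = FakeCRM(), FakeCarrierAPI()
memory = {"cust-001": {"order_id": "#56789"}}  # recalled from a previous interaction

def handle_order_query(customer_id: str, message: str) -> str:
    order_id = memory[customer_id]["order_id"]              # Memory
    # Planning: fixed here; a real agent would let the LLM derive these steps.
    details = crm.get_order(order_id)                       # Tool: CRM
    shipment = dhl_api.get_status(details["tracking_ref"])  # Tool: carrier API
    return call_llm(                                        # Action via the Interface
        f"Customer asked: {message}\n"
        f"Order: {details}\nShipment status: {shipment}\n"
        "Write a short, friendly status update and offer a delivery reminder."
    )

print(handle_order_query("cust-001", "What is the status of my order?"))
```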
Unlocking efficiency with multi-agent systems for complex queries
A single GenAI agent can efficiently handle straightforward queries, but real-world customer interactions often involve complex, multifaceted issues. For instance, a customer may inquire about order status while also addressing delayed shipments, refunds, and promotions—all in one conversation. Managing such tasks sequentially with one agent increases the risk of delays and errors.
This is where multi-agent systems come in. By breaking down tasks into smaller, specialized subtasks, each handled by a dedicated agent, it’s easier to ensure and track efficiency and accuracy. For example, one agent handles order tracking, another manages refunds, and a third addresses promotions.
Splitting tasks this way unlocks several benefits: simpler models can tackle reduced complexity, agents can be given specific instructions to specialize in certain areas, and parallel processing allows for approaches like majority voting to build confidence in outputs.
With specialized GenAI agents working together, businesses can scale support for complex queries while maintaining speed and precision. This multi-agent approach outperforms traditional customer service, offering a more efficient and accurate solution than escalating issues to human agents.
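As a minimal illustration, a naive router could dispatch sub-queries to specialized agents like this; the keyword routing and agent stubs are simplifying assumptions (a real system would likely use an LLM classifier).

```python
def order_tracking_agent(query: str) -> str:
    return "Tracking: your parcel is out for delivery."

def refund_agent(query: str) -> str:
    return "Refund: your refund was issued on January 20th."

def promotions_agent(query: str) -> str:
    return "Promotions: code WINTER10 is valid until Friday."

AGENTS = {"track": order_tracking_agent,
          "refund": refund_agent,
          "promotion": promotions_agent}

def route(query: str) -> list[str]:
    """Naive keyword router; a real system would use an LLM classifier."""
    replies = [agent(query) for key, agent in AGENTS.items() if key in query.lower()]
    return replies or ["Escalating to a human agent."]

print(route("Where is my refund, and do you have any promotions?"))
```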

What is needed to design GenAI agents?
Whether you’re automating customer support, enhancing business processes, or revolutionizing healthcare, building effective GenAI agents is a strategic endeavor that requires expertise across multiple disciplines.
At the core of building a GenAI agent is the integration of key components that enable its functionality and adaptability:
- Data engineering. The foundation of any GenAI agent is its ability to access and process data from multiple sources. This includes integrating APIs, databases, and real-time data streams, ensuring the agent can interact with the world outside its own environment. Effective data engineering enables an agent to access up-to-date information, making it dynamic and capable of handling a variety of tasks.
- Prompt engineering. For a large language model to perform effectively in a specific domain, it must be tailored through prompt engineering. This involves crafting the right inputs to guide the agent’s responses and behavior. Whether it’s automating customer inquiries or analyzing medical data, domain-specific prompts ensure that the agent’s actions are relevant, accurate, and efficient.
- Experimentation. Designing workflows for GenAI agents requires balancing autonomy with accuracy. During experimentation, developers must refine the agent’s decision-making process, ensuring it can handle complex tasks autonomously while maintaining the precision needed to deliver valuable results. This iterative process of testing and optimizing is key to developing agents that operate efficiently in real-world scenarios.
- Ethics and governance. With the immense potential of GenAI agents comes the responsibility to ensure they are used ethically. This means designing systems that protect sensitive data, comply with regulations, and operate transparently. Ensuring ethical behavior isn’t just about compliance—it’s about building trust with users and maintaining accountability in AI-driven processes.
Challenges and pitfalls with GenAI agents
While GenAI agents hold tremendous potential, there are challenges that need careful consideration:
- Complexity in multi-agent systems. Coordinating multiple specialized agents can introduce complexity, especially in ensuring seamless communication and task execution.
- Cost vs. benefit. With resource-intensive models, balancing the sophistication of GenAI agents with the cost of deployment is crucial.
- Data privacy and security. As GenAI agents require access to vast data and tools, ensuring the protection of sensitive information becomes a top priority.
- Ethical and regulatory considerations. Bias in training data, dilemmas around autonomous decision-making, and challenges in adhering to legal and industry-specific standards create significant risks. Deploying these agents responsibly requires addressing ethical concerns while navigating complex regulatory frameworks to ensure compliance.
- Performance management. Implementing agent systems effectively requires overcoming the complexity of monitoring and optimizing their performance. From breaking down reasoning steps to preparing systems and data for smooth access, managing the performance of these agents as they scale remains a significant hurdle.
Unlocking the future of GenAI agents with insights and discoveries ahead - stay tuned!
GenAI agents represent an exciting shift in how we approach problem-solving and human-computer interaction. With their promise to reason, plan, and execute tasks autonomously, these agents have the potential to tackle even the most complex workflows. As we continue our exploration of these systems, our focus will remain on understanding both their immense promise and the challenges they present.
In the coming weeks, we will delve deeper into the areas where GenAI agents excel, where they struggle, and what can be done to enhance their real-world effectiveness. We invite you to follow along with us as we share our experiments, findings, and practical insights from this ongoing journey. Stay tuned for our next blog post, where we will explore the evolving landscape of GenAI agents, along with the valuable lessons we have learned through experimentation.
This article was written by Mirte Pruppers, Data Scientist, Phebe Langens, Data Scientist, and Simon Koolstra, Principal at Rewire.
From interoperability to generative data management, here are the key trends shaping the data management agenda
Curious about what 2025 holds for Data Management? This article offers a strategic perspective on the trends that we expect will demand the most attention.
But first things first: let’s look at where we are today.
It is widely recognized that Data Management is the main bottleneck when scaling AI solutions. More and more companies see the benefits of good data management when building scalable data & AI capabilities. In fact, to do anything meaningful with data, your business must organize for it.
However, there is no one-size-fits-all data management strategy. How to organize depends on what you want to accomplish and what your business considers critical. Regardless of whether your organization has adopted a defensive or an offensive data strategy, 2024 was marked by the fact that nearly all companies were addressing these three challenges:
- Balancing centralization versus decentralization. Data mesh principles bring data federation into focus. That is, strategically determining the level of domain autonomy versus standardization in order to balance agility with consistency.
- Explicitly linking data management with use case value. Many companies have started properly building a business case for data. Both value and costs have become part of the equation, as part of a shift from big data to high quality data.
- Managing the trade-off between flexibility, scalability and technical debt. This involves deciding between highly automated, low-code platforms for users of varying technical skills vs. high-code frameworks for experienced developers who work on tailored solutions.
Moving on to 2025, we expect the key trends to be:
- Interoperability – to overcome heterogeneous technologies and data semantics across organizational siloes.
- Generative data management – to enable both GenAI use cases built on the information contained in the data and GenAI data governance agents.
- Balancing data democratization with people capabilities – to foster high-quality decision-making. This involves both high data trustworthiness and building people's ability to work with the data.
Now let's review them one by one.
Trend #1. Interoperability
Building scalable data management typically evolves to a stage where decentralized domain teams take ownership, and data definitions and discoverability are effectively addressed. However, this creates a landscape with both heterogeneous technology and data across domains. For companies that have adopted offensive data strategies, this means that domain-transcending use cases are hampered because the underlying data cannot be integrated. For companies that have adopted defensive data strategies, it means that regulatory requirements (e.g. CSRD) demanding company-wide integration are at risk. Either way, the promise of interoperability is to reap the value of cross-domain use cases.
When designing interoperability solutions, we need to start by asking what data should be common across domains and how to handle heterogeneity across domains to deliver value in cross-domain use cases. Here, the challenge posed by heterogeneous data can be expected to be more complex than the one stemming from heterogeneous technology.
What about implementation? While property graphs, ontologies, query federation engines, and semantic mapping technologies are serious contenders for providing a strong technological basis for interoperability, scaling these solutions remains unproven. However, we anticipate that 2025 will be a breakthrough year, with leading companies in data management demonstrating successful large-scale implementations—potentially integrating access management solutions as an additional layer atop the interoperability framework.
The journey doesn’t stop there: achieving end-to-end interoperability hinges on the collaboration of your people. Addressing use cases with cross-domain questions demands a coalition of domain experts working together to craft semantic models and establish a shared language, essential for breaking down silos and fostering seamless integration.
Trend #2. Generative data management
Whereas the general public enthusiastically adopted GenAI in 2024, we expect 2025 to be marked by the integration of GenAI into business processes. However, successfully embedding this new class of data solutions will depend heavily on effective data management.
Drawing parallels with predictive AI, it’s clear that integrating GenAI into business operations requires robust data management frameworks. Without this foundation, businesses risk a proliferation of siloed point solutions, which later require costly redevelopment to scale. Moreover, the risks associated with the classic "garbage in, garbage out" rule are even more pronounced with GenAI, making high-quality data a critical prerequisite. In short: fix your data first.
Interestingly, GenAI itself can play a pivotal role in addressing data management challenges. For instance, GenAI-powered data governance stewards or agents can automate tasks like metadata creation and tracking data provenance. (Check out our article on Knowledge Graphs and RAG to learn more about their transformational powers.)
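As a hedged sketch, such a governance agent could draft column-level metadata for steward review; the `call_llm` helper and prompt are assumptions rather than a specific tool's API.

```python
def call_llm(prompt: str) -> str:
    """Placeholder: replace with your model client."""
    raise NotImplementedError

def draft_column_descriptions(table: str, columns: list[str],
                              sample_rows: list[dict]) -> str:
    """Ask an LLM to propose metadata, to be reviewed by a human steward."""
    return call_llm(
        f"Table: {table}\nColumns: {', '.join(columns)}\n"
        f"Sample rows: {sample_rows}\n"
        "Draft a one-line business description per column. "
        "Flag any column that looks like personal data."
    )

# proposed = draft_column_descriptions(
#     "sales_orders", ["order_id", "customer_email", "net_amount"],
#     [{"order_id": 1, "customer_email": "a@b.com", "net_amount": 119.5}],
# )  # A human steward reviews the draft before the catalog is updated.
```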
For companies with offensive data strategies, thriving with GenAI requires aligning its deployment with a solid data management strategy. Meanwhile, defensive strategies can leverage GenAI to automate tasks such as data classification, metadata generation, and tracking data lineage across systems.
To make all this work, companies must solve the challenges of technological interoperability, allowing GenAI use cases to access and benefit from diverse datasets across platforms. Since GenAI relies on a variety of labeled data to perform optimally, addressing these challenges is critical.
All in all, we believe 2025 will be the year GenAI begins transforming data management for businesses that recognize it requires a fundamentally different approach than predictive AI. The companies that adapt will lead the way in this next wave of innovation.
Trend #3. Balancing democratization of data with people capabilities
Many companies began their data transformation journeys on a small scale, often driven by technology. This has now led to broader data availability within organizations, bringing both opportunities and challenges. Two key outcomes are emerging:
First, to empower the right people, it's important to evaluate where you are today, and what responsibilities and competencies you require from the people within your organization. This feeds back into the level of abstraction of technical capabilities, the training curriculum and the purpose of your accelerator teams. For more details check out this blog post on data management adoption.
Second, trustworthiness and quality of the data becomes more critical than ever. Indeed, low quality data and lack of trust in the data can cause significant organizational noise and costs. (Consider the impact of erroneous sales data accessed by 100 users and systems, with all reports and decisions based on it propagating those inaccuracies).
For companies with offensive data strategies, high-quality data products enable scalability, allowing less technical domain experts to contribute more effectively to building solutions. For defensive strategies, robust trust and quality checks pave the way for "data management by design," enabling a shift from "human in the loop" to "human on the loop".
In sum, we believe 2025 will mark a turning point in data management, characterized by higher levels of automation and an improved developer experience. This will make data management more accessible to a wider range of people while requiring a step up in both people skills and data quality. Striking the right balance between these elements will be critical for success.
The road ahead
The market around data & AI is rapidly changing. It is also filled with noise, from the latest data management tools to emerging paradigms (e.g. data mesh, grid, fabric, and so on). Without a focused approach, it’s easy to lose sight of what truly matters for your organization.
Addressing this challenge is much like conducting an orchestra, where people, technology, and organizational elements must work in harmony on a transformational journey. Missteps can lead to isolated point solutions or inadvertently create a new bottleneck. But with the right mix of experience, skill, and knowledge, success is absolutely achievable. In an earlier blogpost we shared our perspective on making data management actionable. Are you ready to step into the role of the conductor, orchestrating data management for your organization? If so, contact us to share your experience!
This article was written by Nanne van't Klooster, Program Manager, Freek Gulden, Lead Data Engineer, and Frejanne Ruoff, Principal at Rewire.
It takes a few principles to turn data into a valuable, trustworthy, and scalable asset.
Imagine a runner, Amelia, who finishes her run every morning eager to pick up a nutritious drink from her favorite smoothie bar, "Running Smoothie", at the corner of the street. But there’s a problem: Running Smoothie has no menu, no information about ingredients or allergens, and no standards for cleanliness or freshness. Even worse, some drinks should be restricted based on age — like a post-run mimosa — but there is no way to identify the drinks that are for adults only. For long-time customers like Amelia, who know all the products by heart, this isn’t much of an issue. But new customers, like Bessie, find the experience confusing and unpleasant, often deciding not to return.
Sounds strange, right? Yet, this is exactly how many organizations treat their data.
This scenario parallels the typical struggles organizations face in data management. Data pipelines can successfully transfer information from one system to another, but this alone doesn’t make the data findable, usable, or reliable for decision-making. Take a Sales team processing orders, for instance. That same data could be highly valuable for Finance, but it won’t deliver any value if the Finance team isn’t even aware the pipeline exists. This highlights a broader issue: simply moving data from point A to B falls short of a successful data exchange strategy.

Effectively sharing Data as a Product requires sharing described, observable and governed data to ensure a smooth and scalable data exchange.
In response to this, leading organizations are embracing data as a product — a shift in mindset from viewing data as an output, to treating data as a strategic asset for value creation. Transforming data into a strategic asset requires attention to three core principles:
- Describe: ensure users can quickly find and understand the data they need, just as a well-labeled menu listing ingredients and allergens helps Bessie know what she can order.
- Observe: share data quality and reliability over time, both expected standards and unexpected deviations — like information on produce freshness and flagging when customers must wait longer than usual for their drink.
- Govern: manage who can access specific data so only authorized individuals can interact with sensitive information, similar to restricting alcoholic menu items based on an age threshold.
By embedding these foundational principles, data is not just accessible but is transformed into a dependable asset to create value organization-wide. This involves carefully designing data products with transparency and usability in mind, much like one would expect from a reputable restaurant's menu.
In this blog, we will explore why each principle is essential for an effective data exchange.
Describe: ensure data is discoverable and well-described
For data to create value, consumers need to be able to find, understand, and use it. If your team produces a dataset that’s crucial to multiple departments, but it remains tucked away on a platform no one knows about, or is so poorly described that its meaning stays ambiguous, that crucial dataset might as well be invisible to potential consumers.
Findable data requires a systematic approach to metadata: think of it as the digital “menu listing” of data that helps others locate and understand it. Key metadata elements include the data schema, ownership details, data models, and business definitions. By embedding these in, for example, a data catalog, data producers help their consumers not only discover data but also interpret it accurately for their specific needs.
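To make this concrete, below is a minimal sketch in Python of what such a metadata record could look like. The `DataProductMetadata` and `ColumnSpec` classes, the field names, and the example values are all illustrative assumptions, not a reference to any particular catalog tool.

```python
from dataclasses import dataclass, field

@dataclass
class ColumnSpec:
    """One schema entry: a column with its type and business definition."""
    name: str
    dtype: str
    definition: str  # the business meaning, not just the technical type

@dataclass
class DataProductMetadata:
    """The digital 'menu listing' a consumer needs to find and interpret a dataset."""
    name: str
    description: str
    owner: str              # accountable team or person
    domain: str             # e.g. "sales", "finance"
    update_frequency: str   # e.g. "daily", "hourly"
    schema: list[ColumnSpec] = field(default_factory=list)

# Hypothetical example: metadata a Sales team could publish for its orders data.
orders = DataProductMetadata(
    name="sales.orders",
    description="All confirmed customer orders, one row per order line.",
    owner="sales-data-team@example.com",
    domain="sales",
    update_frequency="daily",
    schema=[
        ColumnSpec("order_id", "string", "Unique identifier of the order"),
        ColumnSpec("amount_eur", "decimal", "Order line value in euros, incl. VAT"),
    ],
)
```

In practice these records would live in a catalog tool rather than in code, but the point stands: a dataset is only as findable as the metadata published alongside it.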
Observe: monitor data quality and performance
The next step is to share data quality: consumers need to know that what they are accessing is reliable. Data without quality standards leaves users guessing whether it is recent, consistent, and complete. Without transparency, consumers might hesitate to rely on the data, or worse, make flawed decisions based on outdated or erroneous information.
By defining and sharing clear standards around data quality and availability (such as timeliness, completeness, and accuracy), you enable consumers to determine whether the data meets their needs. Providing observability into performance metrics, such as publishing data update frequency or tracking issues over time, allows users to trust the data and promotes accountability for data quality.
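As an illustration, here is a small sketch of how a producer might compute two such metrics and publish them alongside the data. The function names, the example rows, and the freshness window are assumptions made for the example, not a standard.

```python
from datetime import datetime, timedelta, timezone

def timeliness(last_updated: datetime, max_age: timedelta) -> bool:
    """Is the data fresh enough, given the published update frequency?"""
    return datetime.now(timezone.utc) - last_updated <= max_age

def completeness(rows: list[dict], required_fields: list[str]) -> float:
    """Share of rows in which all required fields are populated."""
    if not rows:
        return 0.0
    complete = sum(
        all(row.get(f) not in (None, "") for f in required_fields) for row in rows
    )
    return complete / len(rows)

# Publishing these metrics with the data lets consumers check them against
# their own thresholds before relying on it.
metrics = {
    "timely": timeliness(
        last_updated=datetime(2025, 1, 6, tzinfo=timezone.utc),
        max_age=timedelta(days=1),
    ),
    "completeness": completeness(
        rows=[
            {"order_id": "A1", "amount_eur": 10.0},
            {"order_id": "A2", "amount_eur": None},
        ],
        required_fields=["order_id", "amount_eur"],
    ),
}
print(metrics)  # e.g. {'timely': False, 'completeness': 0.5}
```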
Govern: manage data access and security
Finally, a successful data product strategy is built on well-managed data access. While data should ideally be accessible to any team or individual who can create value from it, data sensitivity and compliance requirements must be taken into account. Yet, locking all data behind rigid policies slows down collaboration and might lead teams to take risky shortcuts.
A well-considered access policy strikes the right balance between accessibility and security. This involves categorizing data access levels based on potential use cases, and establishing clear guidelines on who can view, modify, or distribute data. Effectively managed access not only safeguards sensitive information but also builds trust among data producers, who can rest assured their data is treated confidentially. Meanwhile, consumers can access and use data confidently, without friction or fear of misuse.
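The sketch below shows one possible way to encode such a tiered access policy in code. The sensitivity levels and role names are placeholders for whatever categories and roles your organization actually defines.

```python
from enum import Enum

class Sensitivity(Enum):
    PUBLIC = 1      # open to the whole organization
    INTERNAL = 2    # any authenticated employee
    RESTRICTED = 3  # named roles only, e.g. PII or financial data

# Which roles may read data at each sensitivity level (illustrative values).
READ_POLICY = {
    Sensitivity.PUBLIC: {"*"},
    Sensitivity.INTERNAL: {"employee", "analyst", "finance"},
    Sensitivity.RESTRICTED: {"finance"},
}

def may_read(role: str, level: Sensitivity) -> bool:
    """Check a role against the policy; '*' means open to everyone."""
    allowed = READ_POLICY[level]
    return "*" in allowed or role in allowed

assert may_read("analyst", Sensitivity.INTERNAL)
assert not may_read("analyst", Sensitivity.RESTRICTED)
```

The design choice here is to categorize once, at the data-product level, rather than negotiating access dataset by dataset: that is what keeps the policy both safe and scalable.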
Sounds easy, but the devil is in the details
For many, these foundational principles may seem straightforward. Yet, we often see companies fall into the trap of relying solely on technology to solve their Data & AI challenges, neglecting to apply these principles holistically. This tech-first approach often results in poor adoption and missed opportunities due to a lack of focus on organizational context and value delivery.
Take data catalogs, for example: essential tools for data discoverability. While it may seem like a simple matter of choosing the right tool, driving real change requires a comprehensive approach that incorporates best practices from the Playbook to Scalable Data Management. Without these practices, companies face long-term risks where:
- Due to a lack of standards, the catalog fills with duplicated data, inconsistent definitions, and unclear or circular data lineage. For data consumers this makes the catalog difficult to navigate, eroding its usefulness.
- Due to a lack of requirements, the catalog is helpful for some teams but useless for others, inviting a proliferation of alternative tools that further complicates data access and reduces overall adoption.
This illustrates that something as fundamental as a data catalog isn’t just a technological fix. Instead, it requires a coordinated, cross-functional effort that aligns with business priorities and data strategy: it is not about implementing the right tool, but about implementing the tool the right way.
Conclusion: Data as a Product, not just Data
In the end, successfully sharing data across an organization is about more than just setting up access points and handing over datasets. It demands a holistic approach to data discoverability, observability, and governance suited to your organization. By embedding these principles, organizations can overcome common pitfalls in data sharing and set up a robust foundation that turns data into a true organizational asset. It’s not only a strategic shift in data management but also a cultural one that lays the foundation for scalable, data-driven growth.
This article was written by Femke van Engen, Data Scientist, Simon Beets, Data & AI engineer, and Freek Gulden, Lead Data Engineer at Rewire.
Demystifying the enablers and principles of scalable data management.
In the first instalment of our series of articles on scalable data management, we saw that companies that master the art of data management consider three enablers: (1) data products, (2) organizations, and (3) platforms. In addition, throughout the entire data management transformation, they follow three principles: value-driven, reusable, and iterative. The process is shown in the chart below.
Exhibit 1. The playbook for successful scalable data management.

Now let’s dive deeper into the enablers and principles of scalable data management.
Enabler #1: data products
Best practice dictates that data should be treated not just as an output, but as a strategic asset for value creation with a comprehensive suite of components: metadata, contract, quality specs, and so on. This means approaching data as a product, and focusing on quality and the needs of customers.
There are many things to consider, but the most important questions concern the characteristics of the data sources and consumption patterns. Specifically:
- What is the structure of the data? Is there a high degree of commonality in data types, formats, schemas, velocities? How could these commonalities be exploited to create scalability?
- How is the data consumed? Is there a pattern? Is it possible to standardize the format of output ports?
- How do data structure and data consumption considerations translate into reusable code components to create and use data products faster over time? (A sketch of one such component follows this list.)
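As promised above, here is a minimal sketch of what a standardized output port could look like as a reusable code component. The `OutputPort` interface and the in-memory `OrdersPort` implementation are illustrative assumptions; a real data product would expose a table, file, or API behind the same interface.

```python
from abc import ABC, abstractmethod
from typing import Iterator

class OutputPort(ABC):
    """A standardized interface every data product exposes, regardless of the
    storage or format behind it. Consumers code against this contract, which
    is what makes the component reusable across data products."""

    @abstractmethod
    def schema(self) -> dict[str, str]:
        """Column name -> type, so consumers can validate before reading."""

    @abstractmethod
    def read(self) -> Iterator[dict]:
        """Yield records in an agreed, product-independent format."""

class OrdersPort(OutputPort):
    """Illustrative implementation backed by an in-memory list."""

    def schema(self) -> dict[str, str]:
        return {"order_id": "string", "amount_eur": "decimal"}

    def read(self) -> Iterator[dict]:
        yield from [{"order_id": "A1", "amount_eur": 10.0}]
```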
Enabler #2: organization
This mainly concerns the structure of data domains and clarifying the scope of their ownership (more below). This translates into organizational choices such as whether data experts are deployed centrally or decentrally. Determining factors include data and AI ambitions, use case complexity, data management maturity, and the ability to attract, develop, and retain data talent. To that end, leading companies consider the following:
- What is the right granularity and topology of the data domains?
- What is the scope of ownership in these domains? Does ownership merely cover definitions, still relying on a central team for implementation, or do domains have real end-to-end ownership over data products?
- Given choices on these points, what does it mean for how to distribute data experts (e.g. data engineers, data platform engineers)? Is that realistic given the organization’s size and its ability to attract and develop talent, or should the choices be reconsidered?
Enabler #3: platforms
This enabler covers technology platforms, specifically the required (data) infrastructure and services that support the creation and distribution of data products within and between domains. Organizations need to consider:
- How best to select services and building blocks to construct a platform? Should one opt for off-the-shelf solutions, proprietary (cloud-based) services, or open-source building blocks?
- How much focus on self-service is required? For instance, a high degree of decentralization typically means a greater focus on self-service within the platform and on the ability of building blocks to work in a federated way.
- What are the main privacy and security concerns and what does that mean for how security-by-design principles are incorporated into the platform?
Bringing things together: the principles of scalable data management
Although all three enablers are important on their own, the full value of AI can only be unlocked by leaders who prudently balance them throughout the whole data management transformation. For example, too much focus on platform development typically leads to organizations that struggle to create value, because the data (or rather, its value to the business) has been overlooked. Conversely, overly data-centric companies often struggle to scale, because they haven’t arranged the governance, people, skills, and platforms required to remain in control of large-scale data organizations.
In short, how the key enablers are combined is as important as the enablers on their own. Hence the importance of developing a playbook that spells out how to bring things together. It begins with value, and balances the demands on data, organization and platform to create reusable capabilities that drive scalability in iterative, incremental steps. This emphasis on (1) value, (2) reusability and (3) iterative approach lies at the heart of what companies who lead in the field of scalable data management do.
Let’s review each of these principles.
Principle #1: value, from the start
The aim is to avoid two common pitfalls: starting a data management transformation without a clear perspective on value, and failing to demonstrate value early in the transformation. (Data management transformation projects can last for years, and failing to demonstrate value early in the process erodes momentum and political capital.) Instead of spreading effort across many small initiatives, it is essential to prioritize the most valuable use cases. The crucial, and arguably hardest, part is to consider not only the impact and feasibility of individual use cases but also the synergies between them.
Principle #2: reusable capabilities
Here the emphasis is on collecting, formalizing, and standardizing the capabilities built for core use cases, then reusing them for other use cases, thereby achieving scalability. Reusable capabilities encompass people capabilities, methodologies, standards, and blueprints. Think of data product blueprints that include standards for data contracts, minimum requirements on metadata and data quality, standards on outputs and inputs, and methods for organizing ownership, development, and deployment.
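As a concrete illustration, the sketch below shows how minimum blueprint requirements on a data contract might be checked automatically. The required fields and the 95% completeness threshold are hypothetical values chosen for the example, not prescribed numbers.

```python
# Hypothetical minimum requirements a blueprint could impose on every
# data contract; the field names and thresholds are illustrative.
REQUIRED_CONTRACT_FIELDS = {"name", "owner", "schema", "update_frequency", "quality_slo"}

def validate_contract(contract: dict) -> list[str]:
    """Return the list of blueprint violations (empty means compliant)."""
    issues = [f"missing field: {f}" for f in REQUIRED_CONTRACT_FIELDS - contract.keys()]
    slo = contract.get("quality_slo", {})
    if slo.get("completeness", 0) < 0.95:
        issues.append("completeness SLO below the 95% blueprint minimum")
    return issues

contract = {
    "name": "sales.orders",
    "owner": "sales-data-team@example.com",
    "schema": {"order_id": "string"},
    "update_frequency": "daily",
    "quality_slo": {"completeness": 0.99},
}
assert validate_contract(contract) == []
```

Encoding the blueprint as an automated check is one way to make a capability genuinely reusable: every new data product inherits the standard instead of renegotiating it.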
Principle #3: building iteratively
Successful data transformations progress iteratively toward their ultimate objectives, with each step designed as the optimal iteration in light of future ones. Usually this requires (1) assessing the data needs of the highest-value use cases and developing data products that address those needs, and then (2) considering where this impacts the organization and taking steps toward the new operating model. The key is to identify the most essential platform components early. Since these typically have long lead times, it’s important to bridge gaps with pragmatic solutions, for example by having technical teams assist non-technical end users, or by temporarily implementing manual processes.
Unlocking the full value of data
Data transformations are notoriously costly and time consuming. But it doesn’t have to be that way: the decoupled, decentralized nature of modern technologies and data management practices allows for a gradual, iterative, yet targeted approach to change. When done right, this approach to data transformation provides enormous opportunities for organizations to leapfrog their competitors and create the data foundation for boundless ROI.
This article was written by Freek Gulden, Lead Data Engineer, Tamara Kloek, Principal, Data & AI Transformation, and Wouter Huygen, Partner & CEO.
In this first of a series of articles, we discuss the gap between the theory and practice of scalable data management.
Fifteen trillion dollars. That’s the impact of AI by 2030 on global GDP according to PwC. Yet MIT research shows that, while over 90% of large organizations have adopted AI, only 1 in 10 report significant value creation. (Take the test to see how your organization compares here.) Granted, these numbers are probably to be taken with a grain of salt. But even if these numbers are only directionally correct, it’s clear that while the potential from AI is enormous, unlocking it is a challenge.
Enter data management.
Data management is the foundation for successful AI deployment. It ensures that the data driving AI models is as effective, reliable, and secure as possible. It is also a rapidly evolving field: traditional approaches, based on centralized teams and monolithic architectures, no longer suffice in a world of exploding data. In response to that, innovative ideas have emerged, such as data mesh, data fabric, and so on. They promise scalable data production and consumption, and the elimination of bottlenecks in the data value chain. The fundamental idea is to distribute resources across the organization and enable people to create their own solutions. Wrap this with an enterprise data distribution mechanism, and voilà: scalable data management! Power to the people!
A fully federated model is not the end goal. The end goal is scalability, and the degree of decentralization is secondary.
Tamara Kloek, Principal at Rewire, Data & AI Transformation Practice Area.
There is one problem, however: the theoretical concepts are well known, but they fall short in practice. That’s because there are too many degrees of freedom when implementing them. Moreover, a fully federated model is not always the end goal. The end goal is scalability, and the degree of decentralization is secondary. So to capitalize on the scalability promise, one must navigate these degrees of freedom carefully, which is far from trivial. Ideally, there would be a playbook with unambiguous guidelines for determining the optimal choices, and explanations of how to apply them in practice.
So how do we get there? Before answering this question, let’s take a step back and review the context.
Data management: then and now
In the 2000s, when many organizations undertook their digital transformation, data was used and stored in transactional systems. For rudimentary analytical purposes, such as basic business intelligence, operational data was extracted into centralized data warehouses by a centralized team of what we now call data engineers.
This setup no longer works. What has changed? Demand, supply, and data complexity. All three have surged, largely driven by the ongoing expansion of connected devices. Estimates vary by source, but by 2025 the number of connected (IoT) devices is projected to reach between 30 and 50 billion globally. This trend creates new opportunities and narrows the gap between operational and analytical data: analytics and AI are being integrated into operational systems, using operational data to train prediction models. And vice versa: AI models generate predictions to steer and optimize operational processes. The boundary between analytical and operational data is becoming blurred, requiring a reset of how and where data is managed. Lastly, privacy and security standards are ever increasing, driven not least by a new geopolitical context and by business models that require data sharing.
Organizations that have been slow to adapt to these trends are feeling the pain. Typically they experience:
- Slow use-case development, missing data, data being trapped in systems that are impossible to navigate, or bottlenecks due to centralized data access;
- Difficulties in scaling proofs-of-concepts because of legacy systems or poorly defined business processes;
- Lack of interoperability due to siloed data and technology stacks;
- Vulnerable data pipelines, with long resolution times if they break, as endless point-to-point connections were created in an attempt to bypass the central bottlenecks;
- Rising costs as they patch their existing system by adding people or technology solutions, instead of undertaking a fundamental redesign;
- Security and privacy issues, because they lack end-to-end observability and security-by-design principles.
The list of problems is endless.
New paradigms but few practical answers
About five years ago, new data management paradigms emerged to provide solutions. They are based on the notion of decentralized (or federated) data handling, and aim to facilitate scalability by eliminating the bottlenecks that occur in centralized approaches. The main idea is to introduce decentralized data domains. Each domain takes ownership of its data by publishing data products, with emphasis on quality and ease of use. This makes data accessible, usable, and trustworthy for the whole organization.
Domains need to own their data. Self-serve data platforms allow domains to easily create and share their data products in a uniform manner. Access to essential data infrastructure is democratized, and, as data integration across different domains is a common requirement, a federated governance model is defined. This model aims to ensure interoperability of data published by different domains.
In sum, the concepts and theories are there. However, how you make them work in practice is neither clear nor straightforward. Many organizations have jumped on the bandwagon of decentralization, yet they keep running into challenges. That’s because the guiding principles on data, domain ownership, platform and governance provide too many degrees of freedom. And implementing them is confusing at best, even for the most battle-hardened data engineers.
That is, until now.
Delivering on the scalable data management promise: three enablers and a playbook
Years of implementing data models at clients have taught us that the key to success lies in doing two things in parallel, touching on the “what” and the “how” of scalable data management. The first step is to translate the high-level principles of scalable data management into organization-specific design choices. This process is structured around three enablers, the what of scalable data management:
- Data, where data is viewed as a product.
- Organization, which covers the definition of data domains and organizational structure.
- Platforms, which by design should be scalable, secure, and decoupled.
The second step addresses the how of scalable data management: a company-specific playbook that spells out how to bring things together. This playbook is characterized by the following principles:
- Value-driven: the goal is to create value from the start, with data as the underlying enabler.
- Reusable: capabilities are designed and developed in a way that they are reusable across value streams.
- Iterative: the process of value creation balances the demands on data, organization and platform with reusable capabilities that drive scalability in iterative, incremental steps.
The interplay between the three enablers (data, organization, platforms) and the playbook principles (value-driven, reusable, and iterative) is summarised in the chart below.
Exhibit 1. The playbook for successful scalable data management.

Delivering on the promise of scalability provides enormous opportunities for organizations to leapfrog the competition. The playbook to scalable data management - designed to achieve just that - has emerged through collaborations with clients across a range of industries, from semiconductors to finance and consumer goods. In future blog posts, we discuss the finer details of its implementation and the art of building scalable data management.
This article was written by Freek Gulden, Lead Data Engineer, Tamara Kloek, Principal, Data & AI Transformation, and Wouter Huygen, Partner & CEO.