Fixing unstructured information is the first step to better AI results

Businesses know they need AI. However, once they start applying it, many realise that a lot of their information is scattered across silos.

When that happens, AI can’t tell what’s accurate or relevant. That’s why structuring data properly is an essential first step.  

In this Q&A, Joakim Jannerfeldt, Global Solution Director, Compliance & BC, EIM at Columbus Sweden, discusses the most common data challenges companies face when adopting AI, how to organise information effectively, and real-world examples of organisations putting these ideas into practice. 

When companies start exploring AI, what do they typically discover about the state of their information? 

According to recent market studies, around 70–80% of a company’s business-critical data is stored in different file formats, which is the unstructured information we’re talking about here. This is the data that sits outside core business systems like ERP and CRM.

While applying AI in those systems comes with its own set of challenges, this wider pool of information is still a key part of what organisations rely on day-to-day. Right now, a lot of it is stored as PDFs, Excel files, images, or even videos — and it often ends up scattered across different locations. 

When organisations start applying AI on top of this, they realise this content isn’t properly managed — and in many cases, they don’t know where all of it is stored. It might be on local computers, servers, Dropbox, SharePoint, Teams, or buried in emails. 

Finding and connecting the right data for AI becomes a major challenge. Much of this data is also uncontrolled in terms of quality or validity. There can be multiple copies of the same document, slightly different versions of the same file, or information that’s outdated and no longer relevant. 

When AI is applied to that data, it doesn’t understand or distinguish what’s accurate — it simply uses everything available. The result is predictable: the more poor-quality data organisations feed AI, the more inaccurate its results become. This quickly becomes a cycle where wrong information turns into “truth.” 

When businesses get wrong answers from AI, they blame the technology. But the problem isn’t the AI; it’s the data behind it.

Can AI itself fix poor data quality? 

The short answer is no — AI doesn’t know whether data is correct or not. But what it can help with is creating and handling new data, especially when dealing with large volumes of unstructured information. 

For example, when an organisation receives a new document or agreement to store, AI can help classify it and make it more structured. It can identify certain triggers and create useful metadata — information that can later be used to find or analyse it more effectively. 

AI can also help organise existing data to some extent. But simply applying a prompt and saying, “please correct this,” won’t work. AI doesn’t understand context, accuracy, or relevance. That part still requires human input. Organisations need to validate their data before AI can be used confidently.  

What does it mean to organise information so that AI can understand and use it effectively? 

The first step is to create structure. One way organisations can do that is by classifying information and adding metadata. As I mentioned earlier, metadata helps describe what a document is, what it relates to, and its context.

For example, if an organisation has a new agreement, what kind of agreement is it? Is it a supplier contract, a customer agreement, or an employment contract? That next layer of detail helps AI understand context. From there, organisations can add more metadata such as who the parties are, when the document was created, its start and end dates, and whether it’s still valid.  

Another important aspect is security. With detailed metadata, organisations can make sure that when AI is applied, it excludes sensitive information, such as a CEO’s contract, and only draws from approved sources.
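As a rough illustration of the metadata and security points above, here is a minimal Python sketch. The class name `DocumentMetadata`, its fields, the `sensitive` flag, and the repository name `contracts_repo` are all assumptions made for the example; they do not describe any particular platform.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DocumentMetadata:
    """Hypothetical metadata record for one stored agreement."""
    title: str
    doc_type: str              # e.g. "supplier_contract", "employment_contract"
    parties: tuple[str, ...]   # who the agreement is between
    created: date
    start_date: date
    end_date: date
    source: str                # repository the file lives in (assumed name)
    sensitive: bool = False    # e.g. a CEO's contract

    def is_valid_on(self, day: date) -> bool:
        """True if `day` falls inside the agreement's start and end dates."""
        return self.start_date <= day <= self.end_date


def select_for_ai(docs, approved_sources, today):
    """Keep only non-sensitive, currently valid documents from approved sources."""
    return [
        d for d in docs
        if not d.sensitive
        and d.source in approved_sources
        and d.is_valid_on(today)
    ]


# Illustrative records only; the names and dates are made up.
docs = [
    DocumentMetadata("Supplier agreement", "supplier_contract", ("Buyer", "Supplier"),
                     date(2024, 1, 10), date(2024, 2, 1), date(2026, 1, 31),
                     "contracts_repo"),
    DocumentMetadata("CEO contract", "employment_contract", ("Company", "CEO"),
                     date(2023, 5, 1), date(2023, 6, 1), date(2027, 5, 31),
                     "contracts_repo", sensitive=True),
    DocumentMetadata("Old customer agreement", "customer_agreement", ("Company", "Customer"),
                     date(2020, 1, 1), date(2020, 1, 1), date(2022, 12, 31),
                     "contracts_repo"),
    DocumentMetadata("Copy on a laptop", "supplier_contract", ("Buyer", "Supplier"),
                     date(2024, 1, 10), date(2024, 2, 1), date(2026, 1, 31),
                     "local_drive"),
]

usable = select_for_ai(docs, approved_sources={"contracts_repo"}, today=date(2025, 6, 1))
titles = [d.title for d in usable]
print(titles)
```

In this sketch only the supplier agreement passes the filter: the CEO's contract is flagged as sensitive, the customer agreement has expired, and the laptop copy sits outside the approved repository.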

The next step is for organisations to define what counts as business-critical information and set clear policies for how those documents should be created, stored, and used. The same applies to emails, where companies need to decide what is and isn’t worth keeping. Employees need clear guidance and governance so that everyone handles information in a consistent way. 

There’s also the process side. When organisations create an agreement, there’s a lifecycle — it starts as a draft, then it goes for review and approval, and finally becomes valid. Later it might be renewed or become obsolete. AI needs to recognise where in that process the document sits. 

By building that process into the data structure, organisations enable AI to distinguish between valid and outdated documents. That way, when someone asks AI to generate a draft or summary, it will only use approved and up-to-date material. 
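One way to picture building the lifecycle into the data structure is a small status model. The sketch below is an assumption-laden example: the status names, the allowed transitions, and the helper functions `advance` and `usable_by_ai` are hypothetical and simply mirror the draft, review, valid, and obsolete stages described above.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in_review"
    VALID = "valid"
    OBSOLETE = "obsolete"


# Assumed transitions, following the lifecycle described in the text.
ALLOWED = {
    Status.DRAFT: {Status.IN_REVIEW},
    Status.IN_REVIEW: {Status.DRAFT, Status.VALID},  # sent back or approved
    Status.VALID: {Status.VALID, Status.OBSOLETE},   # renewed or retired
    Status.OBSOLETE: set(),                          # end of the lifecycle
}


def advance(current, new):
    """Return the new status, or raise if the lifecycle does not allow the move."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new


def usable_by_ai(documents):
    """Given a dict of name -> Status, return the names of valid documents, sorted."""
    return sorted(name for name, status in documents.items() if status is Status.VALID)


# A document moves from draft through review to valid.
status = advance(Status.DRAFT, Status.IN_REVIEW)
status = advance(status, Status.VALID)

documents = {
    "agreement_v1": Status.OBSOLETE,
    "agreement_v2": status,
    "agreement_v3": Status.DRAFT,
}
print(usable_by_ai(documents))
```

Because the status is stored with each document, a request for a draft or summary can be limited to the approved version, here `agreement_v2`, while the obsolete and unfinished versions are left out.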

In practical terms, how does a business start building this structure? 

Usually, a company invests in a tool or platform to govern this process. In some cases, they can add new processes while still maintaining control of the information in their existing storage setup. But it’s important to have a governance tool that oversees all content and repositories, whether that’s one central system or several connected ones. 

There needs to be a solution that handles indexation and stores the metadata for everything. From there, organisations can link to the relevant locations where the information sits. As I mentioned earlier, a clear policy is also essential for how to handle certain documents. Companies need to identify which documents are business-critical and which information will be used to improve ways of working, boost efficiency, and support operational excellence.

But this isn’t just about the system. It’s about creating a structured way of working. Employees need to understand that this is the new standard for how the organisation will handle information. In some cases, organisations can even automate certain actions through their systems. For example, AI agents today can process incoming emails and route them correctly based on rules that have been set. 

For other types of information, such as content created in Word or Excel where users have more freedom, a more controlled process may be needed. Teams might work through specific templates or dedicated platforms to ensure consistency. By doing this, companies can control how information is managed, stored, and governed. 

If a business realises it needs help getting its data in order, what support is available? 

Most companies need some level of guidance. It often depends on their size and maturity. For example, in quality management, which is a very controlled area, there’s usually someone responsible who already knows how that information should be structured and maintained.

At Columbus, we advise organisations on what good looks like, how it can be achieved, and which technologies can best support them. The approach we take depends on factors like company size, information volume, and how critical that data is.  

At the same time, it’s also about new habits. There’s usually a change in how people work and manage emails, templates, and unstructured information overall. We support organisations through this change as well, helping them build a clear change strategy that teams understand, embrace, and see the value in because it makes their everyday work easier.

At a practical level, we help organisations decide the best structure for them. This could mean one central repository, a single platform, or a solution that connects their existing systems together. In some cases, migration to a unified platform makes sense; in others, it’s better to keep the structure distributed. It depends on the amount and type of data, and how it will be used. 

In larger organisations, the amount of unstructured information can be overwhelming. That’s why it’s better to start small with policies that set direction and focus on the most business-critical areas first. For example, if a company develops a lot of products, it likely generates huge volumes of documentation. Structuring that data properly allows the business to reuse knowledge, improve development efficiency, and get faster results. 

We help companies identify those opportunities, prioritise the low-hanging fruit, and build from there. Through analysis and clear policies, we guide organisations in understanding the best starting point for their journey. 

It starts with awareness at the C-level. Leaders need to understand that without control, data can’t be used effectively.

Can you share some real-life examples of companies putting this into practice? 

We’ve been working with a company called INFICON, who manufacture leak detection equipment for gas and liquid. When they develop these products, they also work with external partners for components, which go through testing such as air quality checks. This information, including protocols and related documents from development, is stored in a repository. 

They want to use that repository to identify specific examples of past work, such as successful projects, the criteria used, and the results achieved. This information exists across several documents, but by structuring it properly, they’ll be able to retrieve what they need efficiently and use it to improve future projects.

We’re also working with Viscaria, a mining company who are currently establishing a new mine. We’re helping them manage asset-related documentation, from equipment instructions and service manuals to drawings, sketches and protocols. 

They hold several project meetings, and the key decisions made in these meetings go into one controlled repository. We’re helping them apply AI to this repository to help them find the information they need, create reports, identify bottlenecks, and spot patterns. In the longer term, the company will be able to tie this to record-based or transactional data for deeper analysis. 

A good example can also be found in everyday business processes, like accounts payable (AP) automation. AI can be used to identify the type of invoice, extract relevant data, put it through a process, correct it if needed, and then send it for payment or internal review. 

But the first step is to define how that process should work. In an invoice workflow, for example, there are already established rules such as the “four-eyes principle,” meaning no invoice should be paid until at least two people have reviewed it. 

Organisations can build these checks into the process using AI — ensuring the right people approve the right invoices. For example, if an employee isn’t authorised to approve payments above a certain amount, AI can automatically route that invoice to the correct person. 

That way, companies maintain compliance and accuracy. When invoices are stored in a structured format, tagged with supplier, arrival date, and accounting details, the data becomes traceable and can be linked directly to the original invoice image for verification. 
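The invoice checks described above can be sketched in a few lines of Python. Everything named here is an assumption for illustration: the `Invoice` fields, the approver names, and the approval limits in `APPROVAL_LIMITS` are invented, and a real AP workflow would have more rules than this.

```python
from dataclasses import dataclass, field


@dataclass
class Invoice:
    """Hypothetical structured invoice record."""
    supplier: str
    arrival_date: str   # ISO date string, e.g. "2025-03-14"
    amount: float
    image_ref: str      # link back to the original invoice image
    approvals: set = field(default_factory=set)


# Assumed approval limits per person (illustrative values only).
APPROVAL_LIMITS = {"anna": 10_000, "erik": 100_000, "maria": 100_000}


def eligible_approvers(invoice, limits=APPROVAL_LIMITS):
    """Names of people whose limit covers the invoice amount, sorted."""
    return sorted(name for name, limit in limits.items() if invoice.amount <= limit)


def approve(invoice, approver, limits=APPROVAL_LIMITS):
    """Record an approval, refusing anyone not authorised for this amount."""
    if approver not in eligible_approvers(invoice, limits):
        raise PermissionError(f"{approver} may not approve {invoice.amount}")
    invoice.approvals.add(approver)  # a set, so one person counts only once


def ready_for_payment(invoice):
    """Four-eyes principle: at least two different people have approved."""
    return len(invoice.approvals) >= 2


invoice = Invoice("Example Supplier", "2025-03-14", 25_000.0, "scans/invoice-0001.png")

routed_to = eligible_approvers(invoice)  # anna's limit is too low for this amount
approve(invoice, "erik")
approve(invoice, "erik")                 # a repeat approval does not count twice
after_one = ready_for_payment(invoice)
approve(invoice, "maria")
after_two = ready_for_payment(invoice)
print(routed_to, after_one, after_two)
```

Keeping the approvals in a set is a deliberate choice in this sketch: the same person approving twice never satisfies the two-reviewer rule, and the `image_ref` field keeps the structured record tied to the original invoice image for verification.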

Looking ahead, how do you see AI evolving and what should businesses focus on next? 

If organisations don’t take control of their unstructured data and continue to apply AI on top of it, they risk creating a new version of “truth” based on inaccurate information. Once that happens, those inaccuracies can quickly become accepted as reality. 

That’s why data quality and validation are so important. AI can analyse, summarise, and identify patterns, but it can’t determine what’s factually correct without reliable data behind it. The responsibility for that still lies with us. Humans provide the context, validation, and governance that AI cannot. 

From a business perspective, this is also why some organisations remain hesitant to invest heavily in AI. They’ve run early pilots or proofs of concept, but the results haven’t met expectations. I’ll go back to my earlier point — the problem isn’t the AI itself, it’s the data. If the data isn’t accurate, consistent, or well-structured, the insights won’t be right either. That’s why cleaning, validating, and structuring data is so critical. Once that foundation is in place, AI can truly deliver on its potential. 

 

Key takeaways: 

  • Most business-critical information is unstructured and scattered in several locations, making it impossible for AI to understand what’s accurate or relevant. 
  • Structuring information with metadata, governance and defined processes gives AI the context it needs to work effectively. 
  • Adopting new technologies like AI often requires a different way of working, meaning organisations must support their people in creating new habits for managing unstructured information. 

 

Joakim Jannerfeldt Vice President EIM and Security Sweden
