It’s easy to understand the excitement around generative AI: a conversation with a computer seems like the simplest and most obvious next step in technology development. However, this excitement often creates expectations that go beyond what is delivered or experienced. It is therefore essential to understand the fundamental elements that underpin the successful deployment of AI systems within the IP profession. It starts with data and its quality.
The importance of training data in AI development
All applications that leverage AI depend on the quality of the input data. While it seems obvious that poor-quality input data will lead to poor-quality output, it is difficult to apply this rule of thumb to the multitude of AI solutions readily available. For example, with machine translation, you input the source text in one language and generate the target text in another. There are significant differences in quality between the many algorithms available, some of which can be attributed to how and when the algorithm was trained, while others depend on the technology deployed.
The development of AI dates back to Alan Turing in the 1950s, or even earlier. It comes in many forms, including unsupervised and supervised machine learning, and now generative AI. Even if we focus only on generative AI, there are many large language models to choose from, with fundamentally different capabilities. For example, in a Statista ranking, Claude 3 Opus (Anthropic) scored 60.1% on math problem-solving, while Gemini Pro (Google) scored 32.6% (and the upgraded Gemini 1.5 Pro scored 58.5%).
Choosing the right approach to a problem
Elon Musk has suggested that generative AI has the potential to usher in a new era of human innovation, which has significant implications for the intellectual property profession. For an innovator at the heart of an R&D team, it could accelerate a project or open new avenues of research. For a patent professional, it could automate drafting or reduce the administrative burden of getting filings before an examiner.
Several strategic considerations lie at the heart of the risk and value associated with patent rights. These include:
- the shift from quantity to quality of patents;
- competitive monitoring and benchmarking for disruptive technologies;
- risk management in times of increased NPE activity and significant geopolitical changes;
- highlighting investment in sustainable technologies by mapping patented technologies against the UN SDGs – WIPO’s recent report, “Mapping Innovations: Patents and the Sustainable Development Goals”, demonstrates the importance of this data; and
- facilitating evidence-based financial outcomes for standard-essential patent (SEP) licensing in the telecommunications sector.
This overview doesn’t do justice to the vast array of opportunities that IP offers, but therein lies the challenge. Many people are calling for more AI with the sole intention of increasing efficiency. For IP professionals, this means delivering solutions that are better, faster, and cheaper than current approaches. In the world of AI, that triangle must take on new dimensions, such as trust and transparency.
Data quality
The answer to these puzzles can be illustrated by patent analysis. The ability to digitally search patents only began in 1998, and by 2006, patent data was ubiquitous and recognized as a key source of scientific information. Today, there are dozens of proprietary patent data products, and many more if we include the services offered by national and international patent offices. It is essential to focus on the following elements when choosing the right source for patent analysis.
- Accuracy – Patent data is messy and, while access to public data from a national patent office may be free, it is rarely clean. The most common problem is ownership, where no attempt is made to group patents held by members of the same corporate group. If you cannot attribute a patent to its current owner, any analysis built on that data is of little value (see the sketch after this list).
- Completeness – A key aspect here is the concept of the patent family, where patents filed in multiple jurisdictions for the same invention are treated as a single invention. Worldwide coverage is also crucial.
- Accessibility – Patent analysis was originally only of interest to specialists involved in building patent portfolios. Today, demand is driven by non-IP teams, for whom speed and usability are essential.
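To make the ownership and family points concrete, here is a minimal sketch of how an analyst might deduplicate a raw patent list. The field names (pub_no, assignee, family_id) are hypothetical and not taken from any specific data product:

```python
from collections import defaultdict

# Illustrative records; field names are hypothetical placeholders.
patents = [
    {"pub_no": "US1111111B2", "assignee": "Acme Corp.",       "family_id": "F100"},
    {"pub_no": "EP2222222B1", "assignee": "ACME Corporation", "family_id": "F100"},
    {"pub_no": "US3333333B2", "assignee": "Acme GmbH",        "family_id": "F200"},
]

# Crude normalization: lower-case and strip common legal-entity suffixes.
# Real assignee harmonization needs far more than this (subsidiaries,
# reassignments, typos), which is exactly why data quality matters.
SUFFIXES = ("corporation", "corp.", "gmbh", "inc.", "ltd.", "llc")

def normalize(name: str) -> str:
    cleaned = name.lower().strip()
    for suffix in SUFFIXES:
        if cleaned.endswith(suffix):
            cleaned = cleaned[: -len(suffix)].strip(" ,")
            break
    return cleaned

# Group publications by normalized owner and patent family, so the same
# invention filed in several jurisdictions is counted once.
families_by_owner = defaultdict(set)
for p in patents:
    families_by_owner[normalize(p["assignee"])].add(p["family_id"])

for owner, families in families_by_owner.items():
    print(f"{owner}: {len(families)} patent families")
# Expected output: acme: 2 patent families
```

Without the normalization step, the same three records would look like three unrelated owners holding three unrelated patents – precisely the kind of distortion that undermines any downstream analysis, AI-driven or not.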
Patent data continues to grow in importance, but it is becoming increasingly clear that it is often insufficient on its own. Here are some examples of how patent analysis can be used strategically:
- quality, as many leading patent scoring algorithms (e.g., the Patent Asset Index) rely on both citation data and gross national income data to account for the relative importance of patent rights in countries of different sizes (a toy illustration follows this list);
- risk, of which patent litigation is a good indicator – and this data needs to be integrated and aligned;
- technology, because companies think in terms of technology trends, so the ability to analyze patents from this perspective is essential; and
- SEPs – although there are databases of patents declared standard-essential, this data can be enriched by mapping each patent to the relevant standard.
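Commercial scores such as the Patent Asset Index are proprietary, so the following is only a loose sketch of the general idea described above – weighting a family’s forward citations by the economic size of the jurisdictions it covers – and not the actual formula. The GNI shares below are invented placeholder values:

```python
# Toy illustration of citation-plus-GNI scoring. This is NOT the Patent
# Asset Index formula (which is proprietary); it only shows the general
# idea of weighting citations by market coverage.

# Placeholder GNI shares for a few jurisdictions; a real analysis would
# use World Bank GNI figures.
GNI_SHARE = {"US": 0.25, "EP": 0.22, "CN": 0.18, "JP": 0.05}

def toy_family_score(forward_citations: int, jurisdictions: list[str]) -> float:
    """Scale a family's forward citations by its combined market coverage."""
    market_coverage = sum(GNI_SHARE.get(j, 0.01) for j in jurisdictions)
    return forward_citations * market_coverage

# A family with 40 forward citations, protected in the US, Europe and Japan:
print(toy_family_score(40, ["US", "EP", "JP"]))  # 40 * 0.52 = 20.8
```

The point of the sketch is the dependency, not the arithmetic: a score like this is only as good as the citation counts, family definitions, and jurisdiction data feeding it.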
Key takeaways
While AI is essential, the starting point is understanding why. Knowing the problem you want to solve will guide you to the data you need. In patents, not all data is created equal. Even with seemingly identical datasets, it’s critical to focus on quality. If you’re working with incomplete or inaccurate datasets, AI can’t help. In the world of patent analysis, “garbage in, garbage out” still applies. This is true whether you’re reviewing the data manually or using the latest generation of AI.