Neurosymbolic Artificial Intelligence
The necessary hybridisation: symbolic and sub-symbolic AI

For decades, artificial intelligence has been split by a false dichotomy. On one side, the symbolic or semantic approach focuses on endowing machines with the ability to interpret the world through deep language understanding, enabling the creation of knowledge representation systems based on computable graphs and first-order logic. On the other, the sub-symbolic or statistical approach seeks to identify patterns in large volumes of data, developing machine learning processes that allow these systems to evolve and anticipate events dynamically.
This divide has generated an apparently irresolvable tension: the structured rigidity of symbols versus the adaptive opacity of neural networks. Symbolic systems reason transparently but struggle to scale in the face of natural language complexity. Sub-symbolic systems learn from experience but operate as black boxes that cannot be audited.
Neurosymbolic AI has transcended this conflict, resolving the paradox through bidirectional interfaces that force the black box to be accountable to logical structure. This is not simply a matter of combining technologies, but of differentiable compositionality: systems where deep learning is anchored in unalterable world models and where logical rules are embedded directly into neural tensors.
According to Gartner's RADAR, this semantic core is precisely the element that endows AI systems with reliability, enabling them to operate with a grounding in reality that purely statistical systems cannot achieve. We are currently in the era of hybridisation, where organisations combine the strengths of both paradigms to build systems that can interact naturally with humans whilst maintaining a commitment to truth and objective reality.
The development of a truly intelligent and comprehensive AI requires the careful integration of diverse approaches within a large hybrid or composite framework that must necessarily have a symbolic heart. This heart is what ensures that machines can interpret the propositions they generate contextually and evaluatively. One of the most pressing challenges in deploying artificial intelligence in high-risk services is precisely to ensure that language models and machine learning and deep learning systems operate reliably and in compliance with regulations.
Combining advances in language processing and generative technologies with a deeper understanding of cognition and meaning brings us closer to systems capable of maintaining that fundamental commitment to truth and objective reality.
Predictive integration. The bridge between the two approaches
Learning in AI systems can benefit from integrating Linked Data technologies, associated with knowledge graphs, with approaches based on machine learning and deep learning techniques, among which generative GPT (Generative Pre-trained Transformer) technologies stand out for their language-generation capabilities.
The former can add facts to entities recognised as equivalent, whilst the latter can help integrate new facts identified through statistical patterns. This operation, known as predictive integration, is the basis for extending knowledge graphs and for consolidating and ontologically interpreting inductive learning, so that logical operations can be built on top of it.
Predictive integration is bidirectional and benefits both knowledge graphs and LLMs. On the one hand, it enables LLMs to be used to enrich knowledge graphs with new facts and relationships discovered through pattern analysis in large volumes of text, or to humanise the relationship between people and machines by introducing dialogue through natural language. On the other hand, knowledge graphs provide the context and semantic structure necessary to improve the performance of LLMs and prevent hallucinations.
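As a minimal sketch of the knowledge-graph side of predictive integration, the fragment below merges candidate facts (such as an LLM extractor might propose from text) into a toy graph only when they respect the graph's ontological schema. The schema, entities, and triples are all illustrative, not a real system.

```python
# Predictive integration sketch: candidate triples proposed by an extraction
# model are merged into the knowledge graph only when subject and object
# resolve to known entities of the right type and the predicate is declared
# in the schema. All names are illustrative.

SCHEMA = {"painted": ("Artist", "Painting"), "influenced": ("Artist", "Artist")}

kg = {
    "entities": {"Velazquez": "Artist", "Rubens": "Artist", "Las Meninas": "Painting"},
    "triples": {("Velazquez", "painted", "Las Meninas")},
}

candidates = [  # facts a statistical model might propose from text
    ("Rubens", "influenced", "Velazquez"),   # valid: types match the schema
    ("Rubens", "painted", "Velazquez"),      # rejected: object is not a Painting
    ("Goya", "painted", "Las Meninas"),      # rejected: unknown subject entity
]

def integrate(kg, triple):
    """Add the triple only if it is consistent with the schema and entity types."""
    s, p, o = triple
    expected = SCHEMA.get(p)
    if expected is None:
        return False
    types = kg["entities"]
    if types.get(s) != expected[0] or types.get(o) != expected[1]:
        return False
    kg["triples"].add(triple)
    return True

accepted = [t for t in candidates if integrate(kg, t)]
print(accepted)  # only the schema-consistent candidate survives
```

The semantic structure thus acts as a gatekeeper: statistically discovered facts enrich the graph, but only through the filter of its ontology.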
Combating hallucinations in Generative AI
A hallucination occurs when a generative AI model responds with false or misleading information. For example, you might ask for references from the Museo del Prado on the influence of Flemish art on Spanish religious painting, only to find that the model returns a synthetic citation composed of fragments of real information (the Boletín del Museo del Prado, real academics, or an apparently plausible article title) that together form a false text making claims that were never actually made.
To better understand hallucinations and why they occur, it is important to understand how LLMs work. These models are designed and trained to perform a single task: to predict the next most probable token based on a given sequence of text. A token can be a word, part of a word, or even a series of characters.
It is precisely this ability to predict the next most probable token that enables LLMs to generate realistic text, complete tasks, respond to dialogue, and draft content that appears authentic. However, herein lies the paradox: when it comes to answering questions, predicting the next token does not require LLMs to provide truthful or accurate responses. Their function is limited to producing text that, statistically, is likely to appear plausible based on their training data, without verifying any correspondence with reality.
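The mechanism can be illustrated with a deliberately tiny "language model": a bigram counter that always emits the statistically most frequent continuation seen in its training data. The corpus is invented for the example, but the point carries over to real LLMs at scale: nothing in the prediction step checks the output against reality.

```python
from collections import Counter, defaultdict

# Toy bigram "language model": predict the most likely next word given the
# previous one. The mechanism optimises plausibility, not truth, which is
# why fluent but false continuations (hallucinations) can emerge.

corpus = ("the museum holds paintings . "
          "the museum holds sculptures . "
          "the museum holds paintings").split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(word):
    # return the most frequent continuation observed in training
    return counts[word].most_common(1)[0][0]

print(predict_next("holds"))  # "paintings": most frequent, not necessarily true
```

If the true answer in some context were "sculptures", this predictor would still say "paintings", because that is what the training statistics favour.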
Whilst certain improvements to LLMs can help reduce the likelihood of hallucinations, some aspects are an integral part of the generative capability of these models. The very characteristics that make LLMs useful for generating text that appears fluid and creative also increase the risk of introducing hallucinations.
So how can we approach the use of LLMs for reliable and useful tasks whilst at the same time mitigating the prevalence of hallucinations?
The superior nature of Neurosymbolic AI outputs
As we have already noted, the superiority of this hybrid approach rests on two complementary aspects. On the one hand, there are demonstrated theoretical limitations in purely connectionist or sub-symbolic architectures that restrict their capacity to address logical processes and formal reasoning reliably. On the other hand, the integration of symbolic components into the neurosymbolic architecture brings significant computational advantages, as it enables the incorporation of logical structures and rules that confer consistency and explainability on AI outputs.
This fusion of symbolic and sub-symbolic approaches — whose general implications we have already explored — now materialises in a set of specific properties that characterise the operational capacity of these systems to carry out deterministic inferences and verifiable reasoning.
1. Traceability of AI outputs
The system must be able to identify where a given piece of information has been sourced so that the user can trace it back and verify that it is indeed what the system used to produce its results. This tracing capability is not an added feature but an architectural consequence of operating on explicit symbolic representations.
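A minimal sketch of what this looks like operationally: every derived statement carries identifiers of the source facts it was produced from, so a user can walk back from an answer to its evidence. The facts and source labels here are illustrative.

```python
# Traceability sketch: derived statements carry explicit provenance, i.e. the
# source facts they were produced from. Source labels are illustrative.

sources = {
    "f1": ("Velazquez painted Las Meninas", "museum catalogue record"),
    "f2": ("Las Meninas is dated 1656", "curatorial dating study"),
}

def derive(statement, used):
    # attach the evidence each conclusion was derived from
    return {"statement": statement, "provenance": [sources[i] for i in used]}

answer = derive("Velazquez completed Las Meninas in 1656", ["f1", "f2"])
for text, origin in answer["provenance"]:
    print(text, "<-", origin)
```

Because the system operates on explicit symbolic facts, attaching provenance is a bookkeeping step rather than a post-hoc reconstruction.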
2. Verifiability of AI outputs
The system offers formal guarantees of logical consistency on the basis of its ontological foundation, maintaining verifiable logical coherence across all its inferences and conclusions. Whereas sub-symbolic systems can produce mutually contradictory results without having any mechanism to detect this, a neurosymbolic system can implement formal verification of the consistency of its outputs, which can be supported by automatic theorem-proving systems or by validation of the corresponding reasoning processes.
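The simplest instance of such a check, sketched below for propositional assertions: scan the system's asserted outputs and detect any proposition claimed both true and false. Real verifiers operate over richer logics, but the principle is the same; the proposition names are illustrative.

```python
# Consistency check sketch: detect whether any proposition is asserted both
# true and false among a system's outputs. Proposition names are illustrative.

def consistent(assertions):
    """Return False as soon as a proposition is asserted with both truth values."""
    truth = {}
    for prop, value in assertions:
        if prop in truth and truth[prop] != value:
            return False
        truth[prop] = value
    return True

outputs = [
    ("influence_is_flemish", True),
    ("painting_in_prado", True),
    ("influence_is_flemish", False),  # contradicts the first assertion
]

print(consistent(outputs))  # False: the contradiction is detected
```

A purely statistical generator has no analogous hook: its outputs are strings, not assertions, so there is nothing for a checker to compare.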
3. Unambiguous recognition, annotation, and extraction of named entities
The system guarantees the unambiguous recognition, annotation, and extraction of named entities in the documents and resources processed by the AI system.
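Unambiguity here means entity linking: different surface forms in text must resolve to one and the same identifier. The sketch below does this with an explicit gazetteer; the surface forms and identifiers are invented for the example.

```python
# Entity-linking sketch: surface forms found in text resolve to unique
# identifiers via an explicit gazetteer, so "Prado" and "Museo del Prado"
# denote the same entity. All names are illustrative.

GAZETTEER = {
    "museo del prado": "entity:museo_del_prado",
    "prado": "entity:museo_del_prado",
    "rubens": "entity:peter_paul_rubens",
}

def link_entities(text):
    """Map every gazetteer surface form present in the text to its entity id."""
    found = {}
    lowered = text.lower()
    for surface, entity_id in GAZETTEER.items():
        if surface in lowered:
            found.setdefault(entity_id, []).append(surface)
    return found

print(link_entities("The Museo del Prado holds several works by Rubens."))
```

In a production system the gazetteer would be backed by the knowledge graph itself, with contextual disambiguation for genuinely ambiguous names.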
4. Reliable automatic classification
The system provides reliable automatic classification of the results of the unambiguous recognition, annotation, and extraction of named entities. In particular, since a chronology is an ordering system (or, if preferred, a specific case of classification), the system must be specifically capable of producing reliable chronologies.
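Producing a reliable chronology from pairwise "happened before" constraints is, symbolically, a topological sort, which also detects contradictory orderings as cycles. The event names below are illustrative; the standard library's `graphlib` does the ordering.

```python
from graphlib import TopologicalSorter

# Chronology sketch: "before" constraints extracted from annotated documents
# (each key lists the events that must precede it). Event names are illustrative.

before = {
    "painting_commissioned": {"artist_active"},
    "painting_completed": {"painting_commissioned"},
    "painting_exhibited": {"painting_completed"},
}

chronology = list(TopologicalSorter(before).static_order())
print(chronology)  # earliest event first
```

If an extracted constraint contradicted the others (a cycle), `static_order` would raise `graphlib.CycleError` instead of silently emitting an impossible timeline, which is precisely the reliability property at stake.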
5. Deterministic, reliable, and unsupervised reasoning
Reasoning is grounded in first-order or higher-order logic as the basis for AI outputs. On this foundation, deterministic reasoning can be identified with strong reasoning, which, from a technical standpoint, can be defined as reasoning based on decidable fragments of first-order logic. Because such reasoning is reliable and deterministic, its outputs can, for certain processes, run unsupervised.
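A classic decidable fragment is the Horn-clause one, where forward chaining to a fixpoint is guaranteed to terminate with a unique set of conclusions: the same facts and rules always yield the same result. The rules below are a propositionalised, illustrative example.

```python
# Forward chaining over Horn clauses, a decidable fragment of first-order
# logic (propositionalised here for brevity). The fixpoint is unique, so the
# same inputs always yield the same conclusions. Rule names are illustrative.

rules = [
    ({"flemish_school", "religious_theme"}, "flemish_religious"),
    ({"flemish_religious", "in_prado"}, "relevant_to_query"),
]
facts = {"flemish_school", "religious_theme", "in_prado"}

changed = True
while changed:
    changed = False
    for body, head in rules:
        if body <= facts and head not in facts:  # all premises hold, head is new
            facts.add(head)
            changed = True

print(sorted(facts))
```

Determinism is what licenses unsupervised use: rerunning the loop, in any rule order, can only converge to the same fixpoint.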
6. Third-party reproducibility of AI outputs
AI outputs must be explainable in the strong sense of being reproducible by third parties: given the same facts, rules, and inputs, an independent party can re-derive the same conclusions.
7. Complete reversibility of reasoning
This condition is connected to the previous one — or, if preferred, reproducibility is one way of describing the reversibility of reasoning. It refers to the ability to traverse inference chains bidirectionally, enabling both forward (deductive) and backward (abductive) reasoning over the same symbolic representations. This makes possible the formal verification of corresponding hypotheses and the generation of complete causal explanations, as is the case with reasoning outputs at the Consorcio de Compensación de Seguros.
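The same rule base can be traversed in both directions, as in the sketch below: forward chaining derives consequences from facts, while backward chaining asks which base facts would explain a goal (a simple abductive query). The insurance-flavoured rule names echo the domain mentioned above but are purely illustrative, not that institution's actual rules.

```python
# Bidirectional reasoning sketch over one rule base. Forward: facts -> derived
# conclusions. Backward: goal -> the base facts that would explain it.
# Rule names are illustrative.

rules = {
    "flood_damage": ["heavy_rain", "river_overflow"],
    "claim_payable": ["flood_damage", "policy_active"],
}

def forward(base_facts):
    """Deductive direction: derive every consequence of the base facts."""
    facts = set(base_facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules.items():
            if head not in facts and all(b in facts for b in body):
                facts.add(head)
                changed = True
    return facts

def explain(goal):
    """Abductive direction: which base facts would have to hold for the goal?"""
    if goal not in rules:
        return {goal}  # a base fact explains itself
    needed = set()
    for sub in rules[goal]:
        needed |= explain(sub)
    return needed

print(forward({"heavy_rain", "river_overflow", "policy_active"}))
print(explain("claim_payable"))
```

The backward traversal is exactly what produces a complete causal explanation: the chain from conclusion back to evidence.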
8. Hierarchical abstraction with monotonic inheritance
The system must be able to construct and manipulate conceptual taxonomies where properties are inherited in a predictable and verifiable manner across levels of abstraction. This characteristic, fundamental in systems based on formal ontologies, enables guaranteed inferences that sub-symbolic systems cannot assure.
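A toy taxonomy makes the guarantee concrete: each concept carries all the properties of its ancestors, and since inheritance is monotonic, adding new subclasses can only add properties, never retract inherited ones. Class and property names are illustrative.

```python
# Monotonic inheritance sketch: a concept's properties are the union of its
# own and all its ancestors'. Class and property names are illustrative.

parents = {"Fresco": "Painting", "Painting": "Artwork", "Artwork": None}
own_props = {
    "Artwork": {"has_creator"},
    "Painting": {"uses_pigment"},
    "Fresco": {"on_plaster"},
}

def properties(concept):
    """Walk up the taxonomy, accumulating inherited properties."""
    props = set()
    while concept is not None:
        props |= own_props.get(concept, set())
        concept = parents.get(concept)
    return props

print(sorted(properties("Fresco")))  # inherits from Painting and Artwork
```

A formal ontology guarantees this inference for every subclass; a sub-symbolic model can only approximate it from examples, with no assurance it holds for unseen classes.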
9. Manipulation of universal and existential quantifiers
Neurosymbolic systems correctly process statements with variable scope such as "all X that have property Y also have property Z, except those that...". They can handle these structures using predicate logic, whereas sub-symbolic systems can only approximate these patterns statistically without formal guarantees.
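Such a statement can be evaluated as an explicit universally quantified check over the data rather than a statistical tendency, as in the illustrative sketch below: the rule holds exactly when no non-exception counterexample exists.

```python
# Quantifier sketch: evaluate "all X with property Y also have property Z,
# except those marked as exceptions" as an explicit universal check.
# The items are illustrative.

items = [
    {"id": 1, "Y": True, "Z": True, "exception": False},
    {"id": 2, "Y": True, "Z": False, "exception": True},   # permitted exception
    {"id": 3, "Y": False, "Z": False, "exception": False}, # rule does not apply
]

def rule_holds(items):
    # universally quantified: every applicable, non-exception item must have Z
    return all(x["Z"] for x in items if x["Y"] and not x["exception"])

print(rule_holds(items))  # True: the only failure of Z is a declared exception

items.append({"id": 4, "Y": True, "Z": False, "exception": False})
print(rule_holds(items))  # False: a genuine counterexample is found
```

The answer is exact for any dataset, whereas a statistical approximation of the same pattern offers no guarantee on inputs unlike its training distribution.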
10. Systematic compositionality with explicit semantics
This characteristic refers to the ability to combine known concepts in order to understand entirely new situations by following formal compositional rules. For example, if the system understands "triangle", "blue", and "rotate", it must be able to correctly interpret "the blue triangle rotates" without having encountered this combination during training. Purely sub-symbolic systems fail systematically at these compositional generalisation tasks.
11. Deterministic counterfactual reasoning
The system has the capacity to answer questions of the type "what would have happened if..." through formal manipulation of structured causal models. This requires an explicit representation of causal relationships in symbolic form, something that cannot be guaranteed in purely connectionist, statistically based systems.
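A tiny structural causal model makes the mechanism visible: each variable is a deterministic function of its parents, and a counterfactual is answered by intervening (fixing a variable) and recomputing everything downstream. The variables and mechanisms are illustrative.

```python
# Structural causal model sketch: variables are deterministic functions of
# their parents; a counterfactual fixes one variable and recomputes the rest.
# Variable names and mechanisms are illustrative.

def evaluate(exogenous, interventions=None):
    interventions = interventions or {}
    v = dict(exogenous)

    def setv(name, fn):
        # an intervention overrides the variable's structural mechanism
        v[name] = interventions[name] if name in interventions else fn()

    setv("rain", lambda: v["storm"])
    setv("flood", lambda: v["rain"] and not v["barrier"])
    setv("damage", lambda: v["flood"])
    return v

world = {"storm": True, "barrier": False}
print(evaluate(world)["damage"])                   # True: the factual outcome
print(evaluate(world, {"rain": False})["damage"])  # False: "had it not rained..."
```

Because the causal structure is explicit, the counterfactual answer is deterministic, not a guess extrapolated from correlations.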
12. Integration of inviolable hard constraints
Finally, the system can impose and guarantee compliance with absolute constraints that can never be violated, regardless of the patterns learned. For example, physical, legal, or logical constraints that must be maintained even if the training data suggests otherwise. This capability completes the set of properties that make neurosymbolic AI the superior architecture for critical systems requiring formal guarantees.
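One common way to realise this, sketched below with invented figures, is a constraint layer that filters candidate outputs: any answer violating a hard constraint is rejected outright, no matter how highly the statistical model scored it.

```python
# Hard-constraint sketch: candidate answers ranked by a hypothetical model are
# filtered through inviolable constraints; the best admissible one is returned.
# Constraints, scores, and figures are illustrative.

CONSTRAINTS = [
    lambda a: a["payout"] >= 0,                  # no negative payouts
    lambda a: a["payout"] <= a["policy_limit"],  # legal cap on compensation
]

candidates = [  # (model score, candidate answer)
    (0.9, {"payout": 120_000, "policy_limit": 100_000}),  # violates the cap
    (0.7, {"payout": 80_000, "policy_limit": 100_000}),
]

def best_admissible(candidates):
    """Return the highest-scoring candidate that satisfies every constraint."""
    for score, answer in sorted(candidates, key=lambda c: c[0], reverse=True):
        if all(check(answer) for check in CONSTRAINTS):
            return answer
    return None  # no admissible answer: refuse rather than violate a constraint

print(best_admissible(candidates))
```

The key design point is the asymmetry: learned scores rank the admissible answers, but admissibility itself is decided by the symbolic constraints alone, even when the training data would suggest otherwise.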