Mastering Symbolic Operations: Augmenting Language Models with Compiled Neural Networks

1The Laboratory of Cognition and Decision Intelligence for Complex Systems, IA, CAS
2School of Artificial Intelligence, University of Chinese Academy of Sciences
3College of Electrical and Information Engineering, Hunan University
#Corresponding author

Abstract

Language models' (LMs) proficiency in handling deterministic symbolic reasoning and rule-based tasks remains limited because of their dependence on implicit learning from textual data.

To endow LMs with genuine rule comprehension abilities, we propose "Neural Comprehension" - a framework that synergistically integrates compiled neural networks (CoNNs) into the standard transformer architecture. CoNNs are neural modules designed to explicitly encode rules through artificially generated attention weights. By incorporating CoNN modules, the Neural Comprehension framework enables LMs to accurately and robustly execute rule-intensive symbolic tasks.

Extensive experiments demonstrate the superiority of our approach over existing techniques in terms of length generalization, efficiency, and interpretability for symbolic operations. Furthermore, it can be applied to LMs across different model scales, outperforming tool-calling methods in arithmetic reasoning tasks while maintaining superior inference efficiency. Our work highlights the potential of seamlessly unifying explicit rule learning via CoNNs and implicit pattern learning in LMs, paving the way for true symbolic comprehension capabilities.
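
To make the integration concrete, the sketch below illustrates one way a CoNN's rule-faithful output distribution could be mixed with an LM's next-token distribution through a per-position gate. This is a minimal illustration under our own assumptions, not the released implementation: the function name, tensor shapes, and gating scheme are ours.

# Illustrative sketch only: NOT the released Neural Comprehension code.
# It shows how an LM's next-token distribution could be combined with a
# rule-encoding CoNN's distribution via a per-position gate.
import torch

def combine_lm_and_conn(lm_logits: torch.Tensor,
                        conn_logits: torch.Tensor,
                        gate: torch.Tensor) -> torch.Tensor:
    """Mix per-token distributions from the LM and a CoNN.

    lm_logits, conn_logits: (batch, seq_len, vocab) over a shared vocabulary.
    gate: (batch, seq_len, 1) in [0, 1]; 1 means "trust the CoNN's rule output".
    """
    lm_probs = torch.softmax(lm_logits, dim=-1)
    conn_probs = torch.softmax(conn_logits, dim=-1)
    return gate * conn_probs + (1.0 - gate) * lm_probs

# Toy usage: on positions the gate marks as rule-governed, the CoNN decides.
batch, seq_len, vocab = 1, 4, 10
lm_logits = torch.randn(batch, seq_len, vocab)
conn_logits = torch.randn(batch, seq_len, vocab)
gate = torch.tensor([0.0, 0.0, 1.0, 1.0]).view(1, seq_len, 1)
mixed = combine_lm_and_conn(lm_logits, conn_logits, gate)
print(mixed.shape)  # torch.Size([1, 4, 10])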

Code

Install


git clone https://github.com/WENGSYX/Neural-Comprehension
cd Neural-Comprehension
pip install .
          

Create your CoNNs!

from NeuralCom.AutoCoNN import AutoCoNN

# Natural-language description of the symbolic operation (SOp) to compile.
INSTRUCT = 'Create an SOp that is the last letter of a word'
# Token vocabulary the CoNN will operate on.
VOCAB = ['a','b','c','d','e','f','g']
# Input/output demonstrations: each item is [input_tokens, output_tokens].
EXAMPLE = [[['a','b','c'],['c','c','c']],[['b','d'],['d','d']]]

auto = AutoCoNN()
# Compile a CoNN model and its tokenizer from the instruction, vocabulary, and examples.
model, tokenizer = auto(instruct=INSTRUCT, vocab=VOCAB, example=EXAMPLE)
          
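Assuming the compiled pair exposes the same interface as the pretrained CoNNs shown below (this exact call is our assumption, not documented behaviour), it can then be run directly:

# Assumed usage, mirroring the pretrained-CoNN example below.
output = model(tokenizer('a b c').unsqueeze(0))
print(tokenizer.decode(output.argmax(2)))
# Expected, per EXAMPLE above, to map ['a','b','c'] -> ['c','c','c'].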

Use CoNN from huggingface!

from NeuralCom.CoNN.modeling_conn import CoNNModel
from NeuralCom.CoNN import Tokenizer

# Load a pretrained CoNN that reverses its input sequence.
model = CoNNModel.from_pretrained('WENGSYX/CoNN_Reverse')
tokenizer = Tokenizer(model.config.input_encoding_map,
                      model.config.output_encoding_map,
                      model.config.max_position_embeddings)

# Tokenize a space-separated character sequence and run the CoNN.
output = model(tokenizer('r e v e r s e').unsqueeze(0))
print(tokenizer.decode(output.argmax(2)))
>>> [['bos', 'e', 's', 'r', 'e', 'v', 'e', 'r']]
          

Experiments

Symbolic Operations

We compare Neural Comprehension with learning-based methods on symbolic operations of varying digit lengths. Neural Comprehension maintains high accuracy across all lengths, whereas learning-based methods struggle on out-of-distribution lengths, demonstrating its robustness and interpretability on symbolic tasks.
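
For concreteness, the evaluation protocol can be pictured with the schematic sketch below. It is written under our own assumptions: the `predict` placeholder and the reversal task are illustrative, not the full benchmark suite.

# Hypothetical length-generalization probe for a symbolic task (here: reversal).
# `predict` is a placeholder for any model under test; it is not part of the repo.
import random
import string

def reverse_reference(tokens):
    return tokens[::-1]

def exact_match_accuracy(predict, lengths, trials=100):
    """Accuracy per input length, counting only exact sequence matches."""
    results = {}
    for n in lengths:
        correct = 0
        for _ in range(trials):
            tokens = [random.choice(string.ascii_lowercase) for _ in range(n)]
            if predict(tokens) == reverse_reference(tokens):
                correct += 1
        results[n] = correct / trials
    return results

# In-distribution lengths (e.g. 5-10) and out-of-distribution lengths (e.g. 20-40)
# use the same loop; a rule-faithful model keeps accuracy near 1.0 on both.
print(exact_match_accuracy(reverse_reference, lengths=[5, 10, 20, 40]))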


Symbolic Reasoning

We investigate the performance of Neural Comprehension on symbolic reasoning tasks. Our hypothesis is that pretrained language models lack the capacity for symbolic reasoning and that incorporating CoNNs can improve it. To assess the rule-comprehension component, we devise an experiment that measures the model's accuracy in a "Chain of Thought"-like manner. The results suggest that Neural Comprehension improves symbolic reasoning capabilities and converges faster than vanilla fine-tuning.
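
As an illustration of the division of labour this measurement probes, consider last-letter concatenation: the language model identifies the operands in its chain-of-thought text, while the deterministic rule itself is executed by a CoNN rather than predicted token-by-token. The task choice and helper below are illustrative assumptions, not the exact benchmark harness.

# Illustrative decomposition only (not the benchmark harness).
def last_letter_concat(words):
    """The rule a CoNN would encode explicitly: last letter of each word, concatenated."""
    return ''.join(w[-1] for w in words)

# Words extracted by the LM's reasoning step (hard-coded here for the sketch).
words = ['Neural', 'Comprehension']
print(last_letter_concat(words))  # -> 'ln'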


Arithmetic Reasoning

The experiments demonstrate that Neural Comprehension, which plugs arithmetic CoNNs such as Addition and Subtraction into the language model, significantly improves performance on arithmetic reasoning tasks involving longer digit lengths compared with vanilla chain-of-thought models. The results also highlight Neural Comprehension's advantages over tool-based approaches in efficiency, adaptability, and scalability.
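
One way to picture how an Addition CoNN slots into a chain-of-thought trace is sketched below, under our own assumptions rather than the released pipeline: arithmetic spans in the generated reasoning are detected and resolved by the rule module instead of by token prediction. `conn_add` is a stand-in for the Addition CoNN, not a real API.

import re

def conn_add(a: int, b: int) -> int:
    """Stand-in for an Addition CoNN: a module that encodes the carry rule exactly."""
    return a + b

def resolve_additions(cot_text: str) -> str:
    """Replace 'x + y =' spans in a chain-of-thought trace with exact results."""
    pattern = re.compile(r'(\d+)\s*\+\s*(\d+)\s*=')
    def _fill(match):
        a, b = int(match.group(1)), int(match.group(2))
        return f"{a} + {b} = {conn_add(a, b)}"
    return pattern.sub(_fill, cot_text)

trace = "She had 123456789 apples and bought 987654321 more, so 123456789 + 987654321 = ..."
print(resolve_additions(trace))
# -> "... so 123456789 + 987654321 = 1111111110 ..."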


BibTeX

@inproceedings{weng2024mastering,
  title     = {Mastering Symbolic Operations: Augmenting Language Models with Compiled Neural Networks},
  author    = {Yixuan Weng and Minjun Zhu and Fei Xia and Bin Li and Shizhu He and Kang Liu and Jun Zhao},
  booktitle = {The Twelfth International Conference on Learning Representations},
  year      = {2024},
  url       = {https://openreview.net/forum?id=9nsNyN0vox}
}