Research Echoes #1

Echos de la recherche #1
Publications

Research Echoes #1

ChatGPT, yes... but... and...

Keeping up with academic research is crucial for anyone looking to work effectively in Deep Learning. This field remains highly academic, with each week bringing its share of small revolutions. While it is necessary to maintain a certain distance from these advancements, we offer you a follow-up on recent research with a focus on some recent works that have caused a bit of a stir: language models in general, and ChatGPT in particular.

No one could have ignored the arrival of the new ChatGPT offering from the partnership between OpenAI and Microsoft. The release of ChatGPT3 has created many waves in society. OpenAI recently published a technical report on ChatGPT4 [https://cdn.openai.com/papers/gpt-4.pdf], and Google found itself obliged to follow suit by hastening the presentation of the Bard [https://blog.google/technology/ai/bard-google-ai-search-updates/] model.

For us, it is always vital to maintain some distance from commercial frenzies. We are indeed obliged to think in terms of tools, results, and therefore limits. ChatGPT is certainly a remarkable achievement, but it must be acknowledged that the accompanying communication seems to drift a bit by playing on the confusion between what a language model is and what artificial intelligence could be. Without wanting to repeat what has already been written many times about ChatGPT, it should be noted that the tool recently integrated a connection with the Wolfram [https://writings.stephenwolfram.com/2023/03/chatgpt-gets-its-wolfram-superpowers/] tool (which is not artificial intelligence) to handle a number of specific, logical, and even mathematical questions, questions for which ChatGPT is inherently struggling to answer.

If there is one work to read to better understand the limits of ChatGPT, it is essential to read “On the danger of Stochastic Parrots”. It is quite accessible and helps to reframe what ChatGPT is or is not. This publication has been widely cited and shared in the academic world, where it has been welcomed by many researchers dismayed by the "commercial" presentation of ChatGPT, far from a rigorous scientific or technical analysis. It is also worth noting that OpenAI has not particularly encouraged attempts at serious criticism of the older versions of ChatGPT.

Stanford recently released an exciting model called ALPACA – https://github.com/tatsu-lab/stanford_alpaca. The goal here is to democratize language models by significantly reducing the cost of training. This approach is commendable, if only because many problems prohibit sharing sensitive data with OpenAI, or because we lack models that are less costly to train.

On the other hand, one cannot help but be skeptical of a work like the very recent em>JARVIS by Microsoft [https://github.com/microsoft/JARVIS]. While the approach is interesting, it further distances any possibility of controlling and mastering the models used. It is highly likely that these approaches will not produce relevant new industrial tools. Similarly, ChatDoctor [https://github.com/kent0n-li/chatdoctor], based on LLAMA, raises quite concerning questions. We know that a Deep Learning model will be highly susceptible to even the slightest bias in its input data, and we are unaware of the biases lurking behind ChatGPT. So the question is not whether such a model works, but how quickly it will provide completely off-topic, or even dangerous, responses.

However, not everything is to be discarded, and this democratization of language models opens the door to the use of text as an interface for defining a topic, for robotic control, or task definition. This approach raises many questions of robustness or quality, but it cannot be ignored that a segmentation model, for example, controlled by ordinary text, will be much more accessible to non-experts. This democratization can change the game for new, much freer interfaces. Until the comical moment when we want to "debug" this interface... Until then, "prompt engineering" has become a technical field, and we recommend this resource for those who want to delve into the subject :

https://github.com/dair-ai/Prompt-Engineering-Guide

It is also worth noting that certain prompts can lead a language model astray from its intended use, for example, recently with ChatGPT4 :

https://news.ycombinator.com/item?id=35190383

To go a little further, the following works may be interesting to look at for anyone curious about the latest advances in this field :

LLaMA: Open and Efficient Foundation Language Models

[https://github.com/facebookresearch/llama]

Whether you love or hate Facebook, it must be acknowledged that their Deep Learning research teams remain at the forefront. LLama is a collection of open source models of various sizes, allowing for working with these models without being confined to the role of an uninformed user behind an API.

LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention [https://github.com/zrrskywalker/llama-adapter]

This approach aims for fine-tuning the considerable LLaMA, enabling the adaptation of a language model to respond to instructions on a very low budget (the authors claim "one hour" on 8 GPUs in parallel), and for a particularly simple model (1.2M parameters). These approaches are worth keeping an eye on as they offer much more exploitable tools.

Exploring the Impact of Instruction Data Scaling on Large Language Models: An Empirical Study on Real-World Use Cases

[https://arxiv.org/pdf/2303.14742v1.pdf]

This work is part of the (too rare) "practical" studies, applied to real-world problems, of enormous language models. It is observed, among other things, that the growth in results linked to the size of the model is not an absolute truth, with the emergence of significant plateaus on logical or mathematical problems.