The discovery
My discovery of OpenAI's API marked the beginning of my immersion in the world of large language models (LLMs). Initially, my curiosity was piqued by integrating the API into my personal projects (code generation, ideation, research, etc.). As a fan of robotics and home automation, I try to automate as many things as I can, and the arrival of OpenAI fascinated me with the possibilities it opened up in this field: the next step after Google Home, Alexa, and the like.
First Experiments
My use of the LLM API began with practical applications: helping with advanced biology questions from my mini-me #1, which were beyond my own schooling; giving me descriptions and insights on certain cybersecurity topics (80% of my usage at the time); and, most importantly, helping me set up a reward system to encourage the completion of household chores. Strangely, the points earned were always spent on unlocking Wi-Fi outside of authorized hours.
Fun fact: don't leave your assistant accessible to your mini-me (#3 in my case, aged 6 at the time) if you want to keep your tokens. He quickly realized that its answers were more interesting than his Google Assistant's, and my budget exploded.
The Arrival of ChatGPT
The introduction of ChatGPT by OpenAI consolidated features that were previously scattered across the API into a unified, more intuitive interface. As the world discovered ChatGPT and LLMs, I discovered that a polished UX could make a big difference. Where I previously had to call the API from my shell, with a different invocation for each task, I could now do everything in the same interface.
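For context, those pre-ChatGPT calls boiled down to something like the following. This is a minimal sketch of a completions-era request; the model name, prompt, and parameters are illustrative, not my exact scripts from the time.

```python
import os
import requests

# Minimal sketch of a completions-era API call (model and parameters
# are illustrative; not the exact invocations I used back then).
API_KEY = os.environ["OPENAI_API_KEY"]

response = requests.post(
    "https://api.openai.com/v1/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "text-davinci-003",  # a completion model of that era
        "prompt": "Summarize the following CERT report:\n...",
        "max_tokens": 256,
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["text"])
```

One script like this per task quickly becomes unwieldy, which is exactly the friction the ChatGPT interface removed.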
Pushing the boundaries
I started using it every day to proofread my texts, check the meaning of my English conversations, and produce professional content. It became part of my daily routine, the most permanent tab in my browser, ahead of Spotify. This use highlighted the model's capabilities, but also its limitations, particularly in terms of accuracy and reliability. I began to challenge it on MITRE ATT&CK mapping and TTP extraction from CERT reports, and I discovered hallucinations: it could answer confidently even when what it was saying was false. And I began to question my use of it.
But? What are you doing with my data?!
The expert eye is a professional curse: it constantly analyzes and judges its environment. The world of cybersecurity is an integral part of my life, and my attention quickly turned to data security and the integrity of interactions with ChatGPT. The idea of entering sensitive data never appealed to me, proof of the effectiveness of long-standing awareness campaigns. I wanted to use it for topics where I wanted to boost my productivity, but I didn't want to give it information it wasn't supposed to have: the first incidents were starting to make headlines (the Samsung data leak, prompt injection, etc.). Those were the early days, and I hadn't really taken the time to read what OpenAI would do with my data, the retention period, actual deletion, and so on. I couldn't reasonably continue my experiments on the OpenAI model, not with the guarantees the service offered at the time.
Towards Technical Independence
I didn't have to think about it for long, as I couldn't do without it: it compensated for my dyslexia and reassured me in my exchanges (intended vs. perceived meaning, syntax, etc.). My wife was more than happy to be less involved in this regard. I avoided long reads by asking for summaries of technical articles, and I looked for ways to improve my productivity. The need to control my data and its processing prompted me to explore alternatives. It just so happened that Llama and GPT4All had just entered the scene, and I was able to start setting up a local LLM. But I ran into technical implementation challenges. I still didn't understand what an LLM was, how it worked, or what the prerequisites were, so, as always, after a good dose of documentation: it works, but it's sloooow, it crashes, and it's buggy. Everything I love.
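For the curious, getting a first answer out of a local model looked roughly like this. A minimal sketch using the GPT4All Python bindings; the model file name is illustrative, not necessarily the one I ran.

```python
from gpt4all import GPT4All  # pip install gpt4all

# Minimal sketch: the model file name is illustrative. GPT4All downloads
# the weights on first use and runs everything locally (CPU by default),
# which is exactly why my first attempts were so slow.
model = GPT4All("mistral-7b-instruct-v0.1.Q4_0.gguf")

with model.chat_session():
    reply = model.generate(
        "Summarize what a large language model is, in two sentences.",
        max_tokens=128,
    )
    print(reply)
```

Nothing leaves the machine: the appeal for anyone worried about data retention, at the cost of speed and stability on consumer hardware.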
Mixtral 34B and customization attempt
After working with Llama and benefiting from advances in the open-source community, I optimized my setup using llama.cpp and Hugging Face. The Mixtral 34B model, quantized to run on my Nvidia 3080 12GB, offered excellent results and complete control over my data, which is essential for my sensitive projects. However, the model struggled with contextualized text analysis, a difficulty compounded by the memory load of the reference texts and the fact that it is a general-purpose model. Despite attempts at fine-tuning, the only tangible results were a hefty electricity bill and a PC that doubled as my heating system. Trials with lighter models were unsuccessful. I even attempted to inject a touch of sarcasm into one of the generations, and I have never been trolled so much. It easily out-trolled mini-me #1 and #2, but it could no longer answer correctly.
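Loading a quantized model through llama.cpp's Python bindings looks roughly like this. A minimal sketch; the model path, context size, and layer count are illustrative assumptions, not my exact configuration.

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Minimal sketch: path and parameters are illustrative. A 4-bit GGUF
# quantization is what lets a large model squeeze into 12 GB of VRAM;
# n_gpu_layers controls how many layers are offloaded to the GPU, the
# remainder staying in CPU RAM (hence the space-heater effect).
llm = Llama(
    model_path="./models/mixtral-instruct.Q4_K_M.gguf",
    n_ctx=4096,       # context window; long reference texts fill it fast
    n_gpu_layers=20,  # offload as many layers as the VRAM allows
)

out = llm.create_chat_completion(
    messages=[{
        "role": "user",
        "content": "List the TTPs mentioned in the following report: ...",
    }],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```

The trade-off is plain in the parameters: a small context window and partial GPU offload are precisely why long reference texts and contextualized analysis hurt so much.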
What now?
I could never do without GPT/LLM technologies again. I use them far too much, and they give me peace of mind in my daily life. Today, I have a much better understanding of LLMs, although the underlying mathematics remains complex to me.
My local LLM handles critical tasks without relying on external services. I continue to benefit from this innovation while regaining control of my data. I also have control over my projects and the ability to experiment without fear. Of course, this doesn't solve the limitations of LLMs: I still have to monitor feedback, check for hallucinations, change models depending on the context, and so on. But I feel reassured and calm. There are numerous examples of incidents and risks associated with LLMs, with new cases being discovered every day. (Are you familiar with LLM agents? Stay tuned to Squad; we'll be talking about them soon.) This is an opportunity for me and Fabien, our AI expert, to talk to you soon about recommendations for using LLMs.
Further reading:
Eileen Yu. (Aug. 11, 2023). 75% of businesses are implementing or considering bans on ChatGPT. ZDNet. Retrieved from https://www.zdnet.com/article/75-of-businesses-are-implementing-or-considering-bans-on-chatgpt/
Hugging Face Community. (n.d.). Home Page. Retrieved from https://huggingface.co/
manyoso. (n.d.). GPT4All: Home Page. Retrieved from https://gpt4all.io/index.html
Gerganov, G. (n.d.). llama.cpp. Retrieved from https://github.com/ggerganov/llama.cpp
Jan.ai. (n.d.). Home Page. Retrieved from https://jan.ai & https://github.com/janhq/jan
["Okay, but I want GPT to perform 10x for my specific use case"]. (Year). YouTube. Retrieved from https://www.youtube.com/watch?v=Q9zv369Ggfk
[What is Retrieval-Augmented Generation (RAG)?]. (Year). YouTube. Retrieved from https://www.youtube.com/watch?v=T-D1OfcDW1M
A little anecdote to finish?
I told you that I mainly use my LLM to compensate for my dyslexia. Sometimes I feel like I'm being clear and coherent, but when I see my LLM's suggestions, I realize that's not the case at all. In one of my experiments, I tried to teach it to anonymize my inputs, to see whether a model could be trusted to protect sensitive data. So I gave it instructions to replace first names, last names, and places with imaginary information. It didn't work; the results weren't conclusive, so I put it aside to come back to later. Except that I forgot about it in the meantime, and innocently launched my LLM to correct an exchange with an important contact. I wasn't sure my English wording carried the intended meaning, and I had doubts about part of my text... Can you see where this is going? There was a first name at the beginning of the text, in the part I had no doubts about, so I only checked the part that interested me: it was indeed more relevant than mine, the generation made more sense, so I copied, pasted, and sent it. The contact's response: "Who is Toto?"
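For the record, the forgotten instruction looked roughly like this. Reconstructed from memory; the wording and the helper function are illustrative, not my original prompt.

```python
# Minimal sketch of the anonymization experiment (illustrative wording,
# reconstructed after the fact): a standing instruction asking the model
# to swap real identifiers for invented ones before doing anything else.
ANONYMIZE_INSTRUCTION = (
    "Before answering, replace every first name, last name, and place name "
    "in my message with an invented one (e.g. 'Toto'), keeping the mapping "
    "consistent within the conversation."
)

def build_messages(user_text: str) -> list[dict]:
    """Prepend the standing anonymization instruction to a user message."""
    return [
        {"role": "system", "content": ANONYMIZE_INSTRUCTION},
        {"role": "user", "content": user_text},
    ]

# Left active by mistake, this instruction dutifully renamed my contact.
```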