Have you ever wondered what structure lies behind what we know today as “Artificial Intelligence”? How do language models like ChatGPT, Gemini, Claude, or DeepSeek work? What happens every time they receive a prompt? Under what security protocols do they operate? What applications exist around us? How can we get the greatest benefit from them? What is AI Safety, and why is it so important?[1]
An AI is not just an application that responds to user questions, nor is it solely a collection of historically gathered information. Rather, it is an entire structure, operating under carefully designed safety measures, that converts words into a mathematical language (vectors in a vector space) that the model can compute with.
As Data Engineering students in Mérida, Yucatán, México, we decided to work with the ARENA 3.0 material as part of our university internships.
However, this post was born from some uncomfortable questions we asked ourselves as a group: How can we delve into this world while still being students or beginners? Where can we start? What prior knowledge lays the foundation for better understanding?[2]
All these questions were only the beginning of what was coming for us. Like many, we felt overwhelmed when we started the course and saw so many concepts that we did not know. To answer these questions, we focused specifically on the chapter “1.1 Transformers from scratch”, whose aim is to understand how GPT-2 works internally by building it from scratch in PyTorch, and to link that technical knowledge with topics in the security and alignment of AI models.
AI goes beyond what we thought possible, with surprisingly fast adoption and great potential for use. For this reason, this post is aimed at everyone who is interested in entering this world but does not yet know where to start.
But the million-dollar question is: How can we give an opinion on AI without understanding the basis of how it works structurally (transformers)?
AI Safety is a field focused on making Artificial Intelligence systems safe and, at the same time, aligned with human values, so that they act in a beneficial way and avoid causing accidental or unforeseen harm. AI Safety is composed of 4 key aspects, which are indispensable when one delves into the structure itself; the pillars are:
When starting out with AI, it is easy to confuse certain concepts, so it is worth making clear the difference between AI Safety and AI Ethics: the first focuses on avoiding any kind of incident that could be catastrophic, while the second aims to establish principles, values, and moral norms for the design, development, and deployment of AI.
Now, to start answering the questions: what happens when we give a prompt to the AI? To answer this first question, we must keep in mind that an AI today does not think for itself; it processes each word you give it. This is very important, because this is where the breakdown of its structure and architecture begins.
Tokenization: This is the process of breaking text into sub-units called tokens (not necessarily whole words or individual letters). Each token is assigned a number that works as its identifier (a token ID), and it is these numbers that the model uses to compute its predictions. We must remember that the AI does not process words, only numbers; this is why words are converted into tokens.
Example: “tokenizar” (Spanish for “to tokenize”) -> “token” + “izar”
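To make the tokenization step above more concrete, here is a minimal sketch in Python. It is not how GPT-2 tokenizes text (GPT-2 uses byte-pair encoding with a learned vocabulary); it only illustrates the idea of splitting text into sub-units and mapping each one to an integer ID. The vocabulary `TOY_VOCAB`, its IDs, and the functions `tokenize` and `encode` are all hypothetical names invented for this illustration.

```python
# Toy vocabulary: every sub-unit (token) has an integer ID.
# These entries and IDs are invented for illustration only;
# a real model learns a vocabulary with tens of thousands of entries.
TOY_VOCAB = {
    "token": 0, "izar": 1, "iza": 2,
    "t": 3, "o": 4, "k": 5, "e": 6, "n": 7,
    "i": 8, "z": 9, "a": 10, "r": 11,
}


def tokenize(text, vocab):
    """Split `text` into sub-units found in `vocab`, longest match first."""
    tokens = []
    pos = 0
    while pos < len(text):
        # Try the longest remaining piece first, then shrink it.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in vocab:
                tokens.append(piece)
                pos = end
                break
        else:
            # No vocabulary entry starts at this position.
            raise ValueError(f"no vocabulary entry covers {text[pos]!r}")
    return tokens


def encode(text, vocab):
    """Convert `text` into the list of integer IDs the model would see."""
    return [vocab[token] for token in tokenize(text, vocab)]


tokens = tokenize("tokenizar", TOY_VOCAB)
ids = encode("tokenizar", TOY_VOCAB)
print(tokens)  # ['token', 'izar']
print(ids)     # [0, 1]
```

The point of the sketch is the last line: what reaches the model is the list of numbers, not the word itself.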
At this point you may already feel overwhelmed by the amount of information; exactly that happened to us as students at first, even though this is only a summary of what makes up the architecture and functioning of AI and AI Safety. Nonetheless, it is vitally important to take in all these small concepts and explanations, because we consider them the foundations for understanding the field, for having a pleasant and enriching study experience, and for avoiding the confusion and stress caused by the amount of information that begins to unfold.
Having made clear what the process inside an AI involves and what its components are, we can answer the question “What applications exist around us?”.
This question was very interesting because, although the examples were probably in plain sight, we had never paid proper attention to them; we began to see them more clearly as we studied.
Some AI applications, beyond ChatGPT, Gemini, etc., are the recommendation algorithms on platforms like TikTok. This platform's model estimates which video we might like based on the videos the user has previously watched or liked, or on videos with more views. Another example is the Google search engine, which autocompletes a possible search based on the user's previous searches, or the predictive keyboard on a phone. A final example comes from our city: public transport includes an AI system that is programmed and trained to announce the next stops depending on the route you take, and it is linked to a GPS system for greater precision.
At this point, we began to notice small elements of daily life that go unnoticed but can be linked to an AI. Now, assuming we have the basic knowledge from the concepts explained above, we can answer the question “How to obtain the greatest benefit?”. One way to get good results when using AI is to improve the prompts we give it: since we already understand how Self-Attention works, we can give it more details to enrich the context, and so get more precise and correct answers with a lower probability of hallucinations. But we must not forget that AI calculates probabilities, so it can make mistakes; it is therefore important to always read and verify the answers it gives, regardless of how good your prompt is.
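To illustrate what “AI calculates probabilities” means, here is a small sketch in plain Python. A language model gives every possible next token a score, and a function called softmax turns those scores into probabilities that add up to 1. The candidate words and their scores below are made up for the example, and `softmax` is written by hand here rather than taken from any library; treat it as an assumption-laden illustration, not as what any particular model does internally.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    highest = max(scores)
    # Subtracting the highest score avoids overflow and does not change the result.
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical next-word candidates and scores, invented for illustration.
candidates = ["Mérida", "Paris", "banana"]
scores = [3.0, 1.0, -1.0]

probs = softmax(scores)
for word, prob in zip(candidates, probs):
    print(f"{word}: {prob:.3f}")

# The most likely candidate is only the most probable one, not a guaranteed truth.
best = candidates[probs.index(max(probs))]
print("most likely:", best)
```

Even the least likely candidate keeps a probability above zero, which is one way to see why answers should always be verified.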
As Data Engineering students, our biggest question was how and where to start. Our best advice to everyone interested in venturing into this new field is not to force yourself to know every mathematical formula from the beginning, since that only creates an information bottleneck. We learned that it is best to first understand the theory and logic of the data, the processes, and AI Safety, as well as the key concepts, which significantly lightens the load without taking importance away from its purpose.
At the end of this journey, we can say that AI Safety is of utmost importance because it is the bridge, and the safeguard, between ever-advancing technology and humanity. Without the key elements of AI Safety (alignment, robustness, and interpretability), which are also part of the structure of AI, we might have an Artificial Intelligence very capable of “solving” any problem but with no limit by which to measure right and wrong, making it a potential danger. As beginners writing for beginners, our best advice is: “Do not be afraid of AI; seek instead to understand its structure.” The future of data is not only creating it, but also taking care of it and understanding it.
This post was written by members of the AI Safety UPY group.
Thanks to Rous Polanco and Saralet Chan for writing it!