Generative AI is a type of AI that can create new content, such as images, music, or text, based on the patterns and structures it has learned from existing data.
Generative AI models use a type of machine learning called deep learning, which involves training neural networks on large datasets to recognize patterns and relationships.
These models can generate new content that is often indistinguishable from human-created content, making them useful for applications like art, music, and writing.
One of the most influential early generative AI models was the Generative Adversarial Network (GAN), which was introduced in 2014 and has since become a widely used technique in the field.
What is Generative AI?
Generative AI, or GenAI, is a type of AI that uses Unsupervised Learning from unstructured data to offer original insights.
This process requires much more data and time to create its models compared to traditional AI.
GenAI has reached a tipping point, making it accessible to anyone who can interact with it through simple conversation, as popular tools such as ChatGPT and Bard demonstrate.
Artificial Intelligence
Artificial Intelligence is a way to make machines think and behave intelligently, as described by experts ([4], p. 8). This field of study aims to create computer systems that can perform tasks that normally require human intelligence.
Artificial Intelligence is used to describe computer systems that demonstrate human-like intelligence and cognitive abilities, such as deduction, pattern recognition, and the interpretation of complex data ([5], p. 386). These abilities are essential for machines to learn and adapt to new situations.
The study of Artificial Intelligence focuses on making computers do things that people do better, at least for now ([3], p. 3). This includes tasks like problem-solving, decision-making, and understanding natural language.
Here are some key characteristics of Artificial Intelligence:
- Ability to perform tasks that normally require human intelligence
- Demonstration of human-like intelligence and cognitive abilities
- Use of deduction, pattern recognition, and complex data interpretation
In essence, Artificial Intelligence is about creating machines that can think and behave like humans, but with the speed and efficiency that only computers can provide ([6], p. 287).
Background
Generative AI, or GenAI, has its roots in Traditional ML/AI, where machine learning was used to wrangle data into structures using Supervised Learning.
This traditional approach introduced new roles like Data Scientist, Data Engineer, and Data Citizen, who worked together to create insights from data.
The early days of Traditional ML/AI were marked by a focus on prediction and creating insights similar to the originators of the data.
As data grew, so did the need for more complex models, which is where GenAI comes in, leveraging Unsupervised Learning from unstructured data to offer original insights.
GenAI requires much more data and time to create its models, but offers a completely different interaction experience compared to traditional AI.
This shift has opened doors to new roles, such as Prompt Engineer, Content Curator, and AI Ethics Specialist, who refine and govern models to "think" in certain, prescribed ways.
How Does it Work?
Generative AI starts with a prompt that could be in the form of a text, an image, a video, a design, musical notes, or any input that the AI system can process.
Early versions of generative AI required developers to submit data via an API or a complicated process, which needed special tools and programming languages like Python.
Generative AI algorithms return new content in response to the prompt, which can include essays, solutions to problems, or realistic fakes created from pictures or audio of a person.
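As a rough sketch of what "submitting a prompt via an API" looks like in practice, the snippet below builds an HTTP request carrying a text prompt as JSON. The endpoint URL and payload field names are placeholders for illustration, not any real provider's API:

```python
import json
import urllib.request

def build_prompt_request(url, prompt):
    """Build an HTTP POST request that sends a text prompt to a generation API.
    The URL and payload fields are illustrative placeholders; real providers
    each define their own endpoints and schemas."""
    payload = json.dumps({"prompt": prompt, "max_tokens": 128}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Build (but do not send) a request to a hypothetical endpoint.
req = build_prompt_request("https://api.example.com/v1/generate",
                           "Write a haiku about data")
print(req.method, req.full_url)
```

Modern chat interfaces hide exactly this kind of plumbing behind a text box, which is a large part of why generative AI became accessible to non-programmers.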
LLM (Narrow General)
LLM (Narrow General) models are a significant step up from traditional AI, but still have limitations. They can perform a wide range of tasks, but only within a specific domain.
Unlike traditional AI, which can only classify existing data, LLM models can generate new data and perform tasks that require creativity. For example, a traditional model might classify a picture as a cat or dog, while an LLM model can generate a new picture of a cat.
LLM models are also more accurate than traditional AI, especially when it comes to tasks like predicting the end of a sentence. With a large amount of sample text, LLM models can become quite accurate, as seen in the success of tools like ChatGPT.
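The sentence-completion idea can be illustrated with a toy next-word predictor. The sketch below simply counts which word follows each word in a sample text; it is a drastically simplified stand-in for what LLMs learn from billions of sentences:

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count which word follows each word in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def predict_next(model, word):
    """Return the word most often seen after `word` in training."""
    return model[word.lower()].most_common(1)[0][0]

sample = "the cat sat on the mat and the cat slept"
model = train_bigrams(sample)
print(predict_next(model, "the"))  # → cat ("cat" follows "the" twice, "mat" once)
```

With trillions of words instead of ten, and context windows instead of single words, this frequency-driven idea scales into the fluent completions that tools like ChatGPT produce.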
While LLM models are impressive, they still have limitations. They can only perform tasks within a specific domain, and may not be able to understand or apply intelligence across a wide range of tasks.
Text-Based Model Fundamentals
Text-based machine learning models work by being trained on vast amounts of text data, which enables them to generate predictions and classify inputs accurately.
The first machine learning models to work with text were trained using supervised learning, where humans labeled the inputs to teach the model what to do. This type of training is still used today, but it's being supplemented by self-supervised learning, which involves feeding the model a massive amount of text data to enable it to generate predictions on its own.
Generative AI models, which are a type of text-based model, combine various AI algorithms to represent and process content. They can transform raw characters into sentences, parts of speech, entities, and actions, which are then represented as vectors using multiple encoding techniques.
These techniques can also encode biases, racism, deception, and puffery contained in the training data, making it crucial to be mindful of the data used to train the models.
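As a minimal illustration of representing text as vectors, the sketch below uses a simple bag-of-words encoding over a fixed vocabulary. Real models use learned embeddings rather than raw counts, but the core idea of mapping text to numbers is the same:

```python
def bag_of_words(text, vocabulary):
    """Encode text as a vector of token counts over a fixed vocabulary.
    Words outside the vocabulary are simply ignored."""
    counts = [0] * len(vocabulary)
    index = {word: i for i, word in enumerate(vocabulary)}
    for token in text.lower().split():
        if token in index:
            counts[index[token]] += 1
    return counts

vocab = ["generative", "ai", "creates", "new", "content"]
vector = bag_of_words("Generative AI creates new content from existing content", vocab)
print(vector)  # → [1, 1, 1, 1, 2]
```

Note that whatever statistical regularities exist in the training text, including its biases, flow directly into vectors like these, which is why the quality of training data matters so much.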
Tokenization
Tokenization is the process of breaking a dataset into its smallest units, called tokens. These tokens can be words, portions of words, individual characters, or even pixels of an image.
The smaller the unit, the more tokens in the vocabulary, and the more complex the model becomes. This requires more compute power to respond to prompts.
For example, the sentence "Architecture and Governance is an online publication that is primarily written by IT architects for IT architects" might be tokenized into 18 tokens. Words like "and" and "is" are common enough to be individual tokens.
Tokenization is a non-trivial process, and punctuation is just one aspect to consider. Capitalization, contractions, and related word forms like "factor" and "refactoring" also need to be accounted for.
Breaking down words into subwords, like word stems, prefixes, and suffixes, can improve model performance and response quality. Techniques like Byte Pair Encoding (BPE) and WordPiece can help models handle rare and unknown words efficiently.
Using the same tokenization algorithm on a very large sample set, like trillions of words, can help the model identify patterns and generate new and original content.
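The pair-merging idea behind BPE can be sketched in a few lines: repeatedly find the most frequent adjacent token pair across the corpus and merge it into one token. On the toy corpus below, the shared stem "factor" collapses into a single token after a few rounds. This is a simplified illustration, not a production tokenizer:

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent token pairs across all words; return the most common."""
    pairs = Counter()
    for w in words:
        for a, b in zip(w, w[1:]):
            pairs[(a, b)] += 1
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Replace every occurrence of the pair with a single merged token."""
    merged = []
    for w in words:
        out, i = [], 0
        while i < len(w):
            if i < len(w) - 1 and (w[i], w[i + 1]) == pair:
                out.append(w[i] + w[i + 1])
                i += 2
            else:
                out.append(w[i])
                i += 1
        merged.append(out)
    return merged

# BPE starts from individual characters.
corpus = [list("factor"), list("refactoring"), list("factory")]
for _ in range(5):  # five merge rounds
    corpus = merge_pair(corpus, most_frequent_pair(corpus))
print(corpus[0])  # → ['factor']
```

Because "f-a-c-t-o-r" appears in all three words, every merge inside that span outranks the rarer pairs, so the stem becomes a single reusable token, exactly how subword vocabularies handle rare and unknown words efficiently.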
Conversational vs. Predictive
Conversational AI helps AI systems like virtual assistants, chatbots and customer service apps interact and engage with humans in a natural way. It uses techniques from NLP and machine learning to understand language and provide human-like text or speech responses.
Predictive AI, on the other hand, uses patterns in historical data to forecast outcomes, classify events and provide actionable insights. Organizations use predictive AI to sharpen decision-making and develop data-driven strategies.
Conversational AI is all about having a conversation, while predictive AI is focused on making predictions based on data. It's like the difference between talking to a friend and analyzing a spreadsheet.
Potential Risks and Limitations
Generative AI models can produce hallucinations, where they perceive patterns or objects that are nonexistent or imperceptible to human observers, creating nonsensical or inaccurate outputs.
Hallucinations can occur due to various factors, including overfitting, training data bias or inaccuracy, high model complexity, and insufficient or biased training data.
Setting the temperature of a generative AI model outside its prescribed limits can lead to hallucinations, resulting in nonsensical outputs.
Bias is another significant issue with generative AI, which can lead to unfair or discriminatory results. Data bias stems from biased training data, while algorithmic bias is embedded in the model's mechanics.
Generative AI models can also suffer from confirmation bias, where users unintentionally reinforce existing prejudices by selectively using or interpreting AI-generated content.
These biases can be mitigated by using diverse data collection, algorithmic fairness measures, continuous monitoring, human oversight, and a commitment to transparency and ethical AI practices.
Generative AI models can produce outputs that sound extremely convincing but are often wrong, biased, or manipulated for unethical purposes.
Some risks associated with generative AI include unintentionally publishing biased, offensive, or copyrighted content, which can lead to reputational and legal issues.
To mitigate these risks, organizations can carefully select initial data, use smaller specialized models, customize general models based on their own data, keep a human in the loop, and avoid using generative AI for critical decisions.
Some limitations of generative AI include its inability to always identify the source of content, assess the bias of original sources, and understand how to tune for new circumstances.
Realistic-sounding content can make it harder to identify inaccurate information, and results can gloss over bias, prejudice, and hatred.
Here are some limitations to consider when implementing or using a generative AI app:
- It does not always identify the source of content.
- It can be challenging to assess the bias of original sources.
- Realistic-sounding content makes it harder to identify inaccurate information.
- It can be difficult to understand how to tune for new circumstances.
- Results can gloss over bias, prejudice and hatred.
Benefits and Use Cases
Generative AI can be applied in various use cases to generate virtually any kind of content. It's becoming more accessible to users of all kinds thanks to cutting-edge breakthroughs like GPT that can be tuned for different applications.
Generative AI can automate the manual process of writing content, reducing the effort of responding to emails and improving the response to specific technical queries. It can also create realistic representations of people, summarize complex information into a coherent narrative, and simplify the process of creating content in a particular style.
Some potential benefits of implementing generative AI include automating content writing, reducing email response time, and improving technical query responses. Generative AI can also be used to create photorealistic art, design physical products and buildings, and write music in a specific style or tone.
What Are the Benefits of Generative AI?
Generative AI can automate the manual process of writing content, reducing the effort of creating new content. This can be a huge time-saver for businesses and individuals alike.
One of the most significant benefits of generative AI is its ability to summarize complex information into a coherent narrative. This can be especially helpful for technical or scientific content that's difficult to understand.
Generative AI can also improve the response to specific technical queries by providing accurate and relevant information. This can be a game-changer for customer service and technical support teams.
In addition to these benefits, generative AI can create realistic representations of people, such as deepfakes, which can be used for various applications. It can also simplify the process of creating content in a particular style, making it easier to produce high-quality content quickly.
Here are some of the potential benefits of implementing generative AI in different industries:
- Finance: Better fraud detection systems
- Legal: Designing and interpreting contracts, analyzing evidence, and suggesting arguments
- Manufacturing: Identifying defective parts and root causes more accurately and economically
- Film and media: Producing content more economically and translating it into other languages with the actors' own voices
- Medical: Identifying promising drug candidates more efficiently
- Architectural: Designing and adapting prototypes more quickly
- Gaming: Designing game content and levels
Best Practices
To get the most out of generative AI, it's essential to consider the accuracy of generated content. This means vetting the accuracy of generated content using primary sources where applicable.
Clear labeling is also crucial. Generative AI content should be clearly labeled for users and consumers. This transparency helps build trust and ensures that users understand the source of the content.
Bias is a significant concern with generative AI. Consider how bias might get woven into generated AI results. This requires a thoughtful approach to avoid perpetuating existing biases.
To ensure the quality of AI-generated code and content, double-check it using other tools. This helps catch any errors or inaccuracies.
Understanding the strengths and limitations of each generative AI tool is vital. Familiarize yourself with common failure modes in results and work around these.
Here are some essential best practices to keep in mind:
- Clearly label all generative AI content for users and consumers.
- Vet the accuracy of generated content using primary sources where applicable.
- Consider how bias might get woven into generated AI results.
- Double-check the quality of AI-generated code and content using other tools.
- Learn the strengths and limitations of each generative AI tool.
- Familiarize yourself with common failure modes in results and work around these.
Models and Tools
Generative AI models like Dall-E, ChatGPT, and Gemini are trained on large datasets to generate new content. Dall-E, for example, was built using OpenAI's GPT implementation in 2021 and is capable of generating imagery in multiple styles.
These models can be used in various modalities, such as text, imagery, music, code, and voices. Image generation tools like Dall-E 2, Midjourney, and Stable Diffusion are popular examples.
Some notable generative AI tools include GPT, Jasper, AI-Writer, and Lex for text generation, and Amper, Dadabots, and MuseNet for music generation.
Gemini, Google's public-facing chatbot, was built on a lightweight version of its LaMDA family of large language models and has undergone updates to improve its efficiency and visual responses.
Parameters
Parameters play a crucial role in determining the quality and delivery of results from a model.
The context window, which refers to the amount of text a language model can process at a time, influences how well the model understands and responds to prompts. Typical context windows range from 512 to 2048 tokens.
A larger context window generally leads to better understanding of long-range dependencies and more coherent responses, but it also requires more computational resources.
The prompt temperature, a numerical value between 0 and 1, controls the randomness or creativity of the model's responses. Lower temperatures produce more predictable responses, while higher temperatures introduce more randomness and creativity.
Nucleus (Top-p) Sampling selects the smallest set of tokens whose cumulative probability reaches a given threshold, and samples only from that set. A value of 0.8 means the model keeps the most likely tokens until their combined probability reaches 80% and discards the rest.
Top-k Sampling considers only the k most probable tokens, truncating the probability distribution to include only those top k choices. With k = 50, only the 50 most likely tokens would be considered for sampling.
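The three parameters above can be sketched together in a toy sampling function. This is an illustration of the ideas, not any particular library's implementation:

```python
import math
import random

def sample_token(logits, temperature=1.0, top_k=None, top_p=None):
    """Sample a token index from raw logits, applying temperature scaling,
    then optional top-k and nucleus (top-p) filtering."""
    # Temperature: values below 1.0 sharpen the distribution (more
    # predictable), values above 1.0 flatten it (more random).
    scaled = [l / temperature for l in logits]
    # Softmax to turn logits into probabilities.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]  # keep only the k most likely tokens
    if top_p is not None:
        kept, cum = [], 0.0
        for i in ranked:
            kept.append(i)
            cum += probs[i]
            if cum >= top_p:  # smallest set whose mass reaches the threshold
                break
        ranked = kept
    # Renormalize over the surviving tokens and draw one.
    mass = sum(probs[i] for i in ranked)
    r, cum = random.random() * mass, 0.0
    for i in ranked:
        cum += probs[i]
        if r <= cum:
            return i
    return ranked[-1]

# With top_k=1 the call is deterministic: it always returns the argmax.
print(sample_token([1.0, 5.0, 2.0], top_k=1))  # → 1
```

In a real model the logits vector has one entry per token in the vocabulary (tens of thousands of entries) rather than three, but the filtering logic is the same.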
AI Models Dall-E, ChatGPT, and Gemini
Dall-E is an example of a multimodal AI application that identifies connections across multiple media, such as vision, text, and audio. It was built using OpenAI's GPT implementation in 2021 and its second version, Dall-E 2, was released in 2022, enabling users to generate imagery in multiple styles driven by user prompts.
ChatGPT is an AI-powered chatbot that took the world by storm in November 2022, built on OpenAI's GPT-3.5 implementation. It incorporates the history of its conversation with a user into its results, simulating a real conversation.
Gemini, formerly known as Bard, is Google's equivalent of Microsoft Copilot, powered by Google's own AI models.
Dall-E is trained on a large data set of images and their associated text descriptions, connecting the meaning of words to visual elements. This allows it to generate imagery in multiple styles driven by user prompts.
ChatGPT was built on OpenAI's GPT-3.5 implementation and has a way to interact and fine-tune text responses via a chat interface with interactive feedback. Earlier versions of GPT were only accessible via an API.
Gemini has a fact-checking feature that searches the web for similar content to back up its answers and shows the user where it found the information. This feature is not available in the basic version of Gemini.
Examples of Tools
Generative AI models are being used in various industries to produce high-quality content quickly and efficiently. This has opened up new possibilities for businesses and organizations.
Generative AI tools exist for various modalities, such as text, imagery, music, code, and voices. Some popular AI content generators include GPT, Jasper, AI-Writer, and Lex for text generation, while Dall-E 2, Midjourney, and Stable Diffusion are popular image generation tools.
Microsoft has also introduced Microsoft Copilot for 365, a set of generative AI tools for Microsoft 365. These tools cost $30 a month per user and are designed to make generative AI a part of everyday use.
There are many other generative AI models under development, including Claude and Llama. Claude is similar to ChatGPT and is produced by Anthropic, while Llama is an open-source model developed by Meta.
Here are some examples of generative AI tools across different modalities:
- Text generation: GPT, Jasper, AI-Writer, Lex
- Image generation: Dall-E 2, Midjourney, Stable Diffusion
- Music generation: Amper, Dadabots, MuseNet
- Code generation: CodeStarter, Codex, GitHub Copilot, Tabnine
- Voice synthesis: Descript, Listnr, Podcast.ai
- AI chip design: Synopsys, Cadence, Google, Nvidia
Ethics and Future
Generative AI raises concerns about accuracy, trustworthiness, bias, hallucination, and plagiarism, issues that will take years to sort out.
The convincing realism of generative AI content makes it harder to detect AI-generated content, which can be a big problem when relying on generative AI results for critical tasks like writing code or providing medical advice.
Microsoft's Tay chatbot in 2016 had to be shut down due to its inflammatory rhetoric on Twitter, highlighting the potential risks of generative AI.
Industry and society will build better tools for tracking the provenance of information to create more trustworthy AI, and researchers are working on detecting AI-generated text, images, and video.
The future of generative AI holds promise in areas like translation, drug discovery, and anomaly detection, but its impact will be shaped by how we integrate these capabilities into our existing tools.
Ethics and Bias
The ethics of generative AI are a complex issue. Despite their promise, these tools open a can of worms regarding accuracy, trustworthiness, bias, hallucination, and plagiarism.
The new crop of generative AI apps sounds more coherent on the surface, but this combination of humanlike language and coherence is not synonymous with human intelligence. Great debate exists about whether generative AI models can be trained to have reasoning ability.
Microsoft's first foray into chatbots, Tay, had to be turned off in 2016 after it started spewing inflammatory rhetoric on Twitter. This shows that even early attempts at generative AI can have serious consequences.
The convincing realism of generative AI content introduces a new set of AI risks. It makes it harder to detect AI-generated content and, more importantly, makes it more difficult to detect when things are wrong.
A Google engineer was even fired after publicly declaring the company's generative AI app, LaMDA, was sentient. This highlights the controversy surrounding the capabilities of generative AI.
Many results of generative AI are not transparent, so it is hard to determine if, for example, they infringe on copyrights or if there is a problem with the original sources from which they draw results. If you don't know how the AI came to a conclusion, you cannot reason about why it might be wrong.
The Future of Generative AI
The Future of Generative AI holds great promise, with advancements in translation, drug discovery, anomaly detection, and the generation of new content, from text and video to fashion design and music.
Generative AI tools will seamlessly integrate into our workflows, making design tools more useful and grammar checkers better. Training tools will automatically identify best practices in one part of an organization to help train other employees more efficiently.
The popularity of generative AI tools has fueled an endless variety of training courses at all levels of expertise, helping developers create AI applications and business users apply the new technology across the enterprise.
Industry and society will build better tools for tracking the provenance of information to create more trustworthy AI, addressing the difficulties in rolling out generative AI safely and responsibly.
Sources
- https://www.mdpi.com/2227-7102/14/2/172
- https://www.architectureandgovernance.com/applications-technology/generative-ai-genai-a-primer/
- https://www.mckinsey.com/featured-insights/mckinsey-explainers/what-is-generative-ai
- https://www.techtarget.com/searchenterpriseai/definition/generative-AI
- https://nationalcentreforai.jiscinvolve.org/wp/2024/08/14/generative-ai-primer/