Create Your Own AI Image Generator App With JavaScript and DALL-E 3

In this tutorial, we’ll build an AI image-generation app with JavaScript. Users will enter text describing the image they want, then behind the scenes we’ll call the DALL-E 3 API to generate it!

DALL-E 3 is an image generation model that excels at generating images from text prompts. It can understand and interpret complex textual descriptions and translate them into visual representations. The images generated are also high-resolution and diverse in style.

By the end of this tutorial, we will have something like this:

HTML Structure

The HTML structure will consist of the following elements:

A small button at the top right, which, when clicked, will allow the user to add their API key to local storage.
A text input where users will enter their prompt.
A button that, when clicked, will take the prompt from the user and call the DALL-E API to generate the image.

Install Bootstrap

We will be using Bootstrap to build the interface. Bootstrap is a framework that allows developers to build responsive sites in a short amount of time. Either link to the relevant CSS and JS files in the head of your HTML document, or (if you’re using CodePen) you’ll find the Bootstrap dependencies under the CSS and JS settings tabs.

API Message

First, start by displaying a message that the app requires an API key and then show a link where users can get the API KEY. Here’s the markup:

<div class="container-fluid">
    <div class="row justify-content-center">
      <div class="col-12 col-md-10 col-lg-8 col-xl-6 mt-5">
        <p class="lead">This demo requires an API key. <a target="_blank" href="https://platform.openai.com/">Get yours here</a></p>
      </div>
    </div>
</div>

Next, add the ADD API KEY button:

<div class="position-absolute top-0 end-0 mt-2 me-3">
  <button
    id="api"
    type="button"
    class="btn btn-info"
    data-bs-toggle="modal"
    data-bs-target="#KeyModal"
  >
    ADD API KEY
  </button>
</div>

The button uses absolute positioning to ensure it stays at the top right, and it also has the attribute data-bs-target="#KeyModal". This attribute links the button to the element with the ID KeyModal.

KeyModal Button

When clicked, the button will trigger the modal to open. Bootstrap uses data-bs-target to reference an element by its ID, so when the button is clicked, it will look for the element with the id of KeyModal and perform the specified actions.

Let’s add the modal below the button.

<div class="container mt-5">
    <div class="modal fade" id="KeyModal" tabindex="-1" aria-hidden="true">
      <div class="modal-dialog">
        <div class="modal-content">
          <div class="modal-header">
            <h5 class="modal-title" id="exampleModalLabel">
              Your API Key remains stored locally in your browser
            </h5>
          </div>
          <div class="modal-body">
            <div class="form-group">
              <label for="apikey">API KEY</label>
              <input type="text" class="form-control" id="apikey" />
            </div>
          </div>
          <div class="modal-footer">
            <button
              type="button"
              class="btn btn-secondary"
              data-bs-dismiss="modal"
            >
              Close
            </button>
            <button type="button" class="btn btn-primary">Save</button>
          </div>
        </div>
      </div>
    </div>
</div>

The modal contains the following elements:

A modal dialog, which keeps the modal horizontally centered on the page.
The modal content, which contains a text input for entering the API key, a button for saving the key, and a close button that dismisses the modal.

Our App’s Main Section

Now let’s start building the main section of the application. The main section will consist of the following elements:

TextInput: This input field will take in the user’s prompt. The prompt will describe the image they want to generate, for example, "A cat chasing a mouse".
Button: This button will initiate the image generation process when clicked.
Gallery: A display of sample images previously generated by DALL-E to showcase its capabilities.

Create a Bootstrap container which will house the elements:

<div class="container mt-5">
  <!--input text-->
  <!--button-->
  <!--gallery-->
</div>

Let’s start by adding a header at the top of the page with the title "AI Image Generator" and a description of the application:

<header class="mt-5">
  <h1 class="text-center">AI Image Generator</h1>
  <p class="lead text-center">
    Bring your vision to life with Generative AI. Simply describe what you
    want to see!
  </p>
</header>

The Form

Next, add a form that will contain the text input and the generate button.

<section class="mt-5">
  <form id="generate-form">
    <div class="row">
      <div class="col-md-9">
        <div class="form-group">
          <input
            type="text"
            class="form-control py-2 pb-2"
            id="prompt"
            placeholder="A cartoon of a cat catching a mouse"
          />
        </div>
      </div>
      <div class="col-md-3">
        <div class="form-group">
          <button type="submit" class="btn btn-primary btn-lg" id="generate">
            Generate Image
          </button>
        </div>
      </div>
    </div>
  </form>
</section>

The layout ensures that the input text spans 3/4 of the entire space to provide enough space for the prompt, and the button is positioned at the right to occupy the remaining space.

Spinner

Next, add a spinner that will show while an image is being generated.

<!-- spinner -->
<div class="spinner-border text-danger" role="status" id="spinner">
  <span class="visually-hidden">Loading...</span>
</div>

Image Gallery

The last section will contain a few images generated by the DALL-E model.

<!-- Generated Images -->
<section class=" container mt-5">
  <div class="row justify-content-center" id="gallery">

  <!--generated images will go here-->
  
  </div>
</section>

We will use JavaScript to display the images dynamically. The gallery container will also be used to display the image generated from a prompt.

Styling With CSS

Besides the Bootstrap framework, we will also add a few custom CSS classes:

@import url("https://fonts.googleapis.com/css2?family=DM+Mono:ital,wght@0,300;0,400;0,500;1,300;1,400;1,500&display=swap");

  body {
    font-family: "DM Mono", monospace;
  }
  h1 {
    font-weight: 900;
  }
  p {
    font-weight: 500;
  }
  .message,
  #spinner {
    display: none;
  }

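Here, we are using a custom font from Google Fonts, and we have also set the message element and the spinner to be hidden by default. Note that the CSS targets the class message, while the JavaScript later looks the element up by the ID message, so the alert element in your markup needs both.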

JavaScript Functionality

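On to the behaviour! The first thing we want to do is to add the functionality for enabling users to add their API key to local storage. We will use jQuery to listen for the modal’s events and to close it. Bootstrap 5 no longer requires jQuery, so it won’t be on the page unless you include it yourself.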

We already have data-bs-target="#KeyModal" on the ADD API KEY button, which opens the modal. Now, we will listen for the shown.bs.modal event, a Bootstrap modal event that fires once the modal has been made visible to the user.

$("#KeyModal").on("shown.bs.modal", function () {
  // get api key from user and save to local storage
});

Inside the event listener function, we will get the modal components, which include a text input and a button.

$("#KeyModal").on("shown.bs.modal", function () {
  const saveButton = document.querySelector("#KeyModal .btn-primary");
  const apiKeyInput = document.querySelector("#apikey");
});

Save Button Event Listener

Next, we will add an event listener to the save button of the modal. Inside the event listener function, we will get the value of the API KEY, save it to local storage, and then close the modal.

$("#KeyModal").on("shown.bs.modal", function () {
  const saveButton = document.querySelector("#KeyModal .btn-primary");
  const apiKeyInput = document.querySelector("#apikey");

  saveButton.addEventListener("click", function () {
    const apiKeyValue = apiKeyInput.value;
    localStorage.setItem("API_KEY", apiKeyValue);
    $("#KeyModal").modal("hide");
  });
});
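As a small optional refinement, the save step could trim the input and refuse to store an empty key. The sketch below is only an illustration: saveApiKey and createMemoryStorage are hypothetical helpers that don’t appear in the tutorial’s code, and the storage argument stands in for localStorage so the logic can also be tried outside the browser.

```javascript
// Hypothetical helper: trims the key and refuses to store an empty value.
// `storage` is anything with setItem/getItem, e.g. window.localStorage.
function saveApiKey(storage, rawValue) {
  const key = String(rawValue ?? "").trim();
  if (key === "") {
    return false; // nothing was saved
  }
  storage.setItem("API_KEY", key);
  return true;
}

// Tiny in-memory stand-in for localStorage (an assumption, for illustration).
function createMemoryStorage() {
  const items = new Map();
  return {
    setItem: (name, value) => items.set(name, String(value)),
    getItem: (name) => (items.has(name) ? items.get(name) : null),
  };
}
```

Inside the click handler, we could then call saveApiKey(localStorage, apiKeyInput.value) and only hide the modal when it returns true.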

DALL-E 3

OpenAI provides two models for text-to-image generation, DALL-E 3 and DALL-E 2. We are going to use DALL-E 3, the newer of the two.

DALL-E 3 is a state-of-the-art text-to-image generator that adheres closely to the text provided when generating images.

While you don’t have to be an expert in prompt engineering to use DALL-E 3, better prompts will generate better results.

Get API KEY

To obtain an API key, you need an OpenAI account. Go to the OpenAI website and create an account. Once you log in, you will see this page.

On the top left side, click on the API keys icon, and you will be redirected to a page where you can create your API KEY.

Once you create your API key, make sure you copy it, since it won’t be shown again.

How to use DALL-E 3

The DALL·E 3 model allows developers to generate images from text using this API endpoint.

https://api.openai.com/v1/images/generations

The API endpoint allows you to create standard and HD-quality images. If the quality is not set, standard images are generated by default. The available image sizes are 1024x1024, 1024x1792, and 1792x1024 pixels.
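To make those choices explicit in code, here is a minimal sketch that checks the size and quality before building the request body. buildImageBody and the two constant names are hypothetical. The quality field and its "standard" and "hd" values reflect our reading of OpenAI’s API reference, so it’s worth double-checking them against the current documentation.

```javascript
// Sizes and quality levels as described in this tutorial.
const DALLE3_SIZES = ["1024x1024", "1024x1792", "1792x1024"];
const DALLE3_QUALITIES = ["standard", "hd"];

// Hypothetical helper that builds the JSON body for the images endpoint.
function buildImageBody(prompt, { size = "1024x1024", quality = "standard" } = {}) {
  if (!DALLE3_SIZES.includes(size)) {
    throw new RangeError(`Unsupported size: ${size}`);
  }
  if (!DALLE3_QUALITIES.includes(quality)) {
    throw new RangeError(`Unsupported quality: ${quality}`);
  }
  return JSON.stringify({ model: "dall-e-3", prompt, n: 1, size, quality });
}
```

Rejecting an unsupported size up front gives the user a clear message without spending a request on it.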

DALL-E 3 generates one image per request, so n must be 1 (DALL-E 2 accepts up to 10). If you want more than one image from DALL-E 3, you can make parallel requests.

This is how you would generate a standard image of size 1024x1024 from the prompt "a red cat":

curl https://api.openai.com/v1/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "dall-e-3",
    "prompt": "a red cat",
    "n": 1,
    "size": "1024x1024"
  }'

As you can see above, the API endpoint requires you to include the following headers in your request:

Content-Type set to application/json
Authorization set to Bearer, followed by your OpenAI API key

The data sent in the request will include:

model: the model to use for generating the image.
prompt: the text description of the image you want generated.
n: an integer that specifies the number of images to generate.
size: the size of the image in pixels.
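Since getting more than one image means making parallel requests, here is a rough sketch of how that could look with Promise.all. generateMany is a hypothetical helper, and generateOne stands for any async function that takes a prompt and resolves with an image URL (we assume it wraps a single call to the endpoint above).

```javascript
// Runs `count` independent requests at once and resolves with all results.
// `generateOne` is a hypothetical async function: (prompt) => image URL.
async function generateMany(generateOne, prompt, count) {
  const requests = Array.from({ length: count }, () => generateOne(prompt));
  // Promise.all keeps the results in request order and rejects on the first failure.
  return Promise.all(requests);
}
```

If some images should still be shown when others fail, Promise.allSettled would be the more forgiving choice. Keep in mind that every parallel request is billed and rate-limited separately.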

Image Generation

The next step is to generate an image from the prompt provided by the user. To do that we will add an event listener to the generate form. When the form is submitted, it will retrieve the prompt from the user, obtain the API key from local storage, and call another function (fetchImage()), which will in turn generate an image.

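But first, let’s get the necessary elements from the DOM:

const message = document.getElementById("message");
const generateForm = document.getElementById("generate-form");
const spinner = document.getElementById("spinner");
// the gallery container, used later by displayImage() and displayImages()
const imageGallery = document.getElementById("gallery");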

Next, let’s add an event listener that listens for the form’s submit event.

generateForm.addEventListener("submit", function (e) {
  e.preventDefault();
  // get prompt
  // get api key
  // perform validation
  // call fetchImage() function
});

Inside the event listener function, update the code as follows:

generateForm.addEventListener("submit", function (e) {
  e.preventDefault();
  const promptInput = document.getElementById("prompt");
  const prompt = promptInput.value;
  const key = localStorage.getItem("API_KEY");

  if (!prompt) {
    displayMessage("Please enter a prompt");
    return;
  }
  if (!key) {
    displayMessage(
      "Please add your API key. The key will be stored locally in your browser"
    );
    return;
  } else {
    fetchImage(prompt, key);
  }
});

In the updated code, after the submit event is fired by the form, we get the prompt from the user and the API key from local storage. If the user has not provided a prompt, we display a message asking the user to enter one.

Similarly, if the API key is missing, we prompt the user to add their API key. If both the prompt and API key are present, we call the fetchImage function and pass the prompt and the API key as arguments.
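The checks described above can also be pulled out into a small pure function, which makes them easy to try without a form or a browser. This is just one possible arrangement; validateRequest is a hypothetical name, and treating a whitespace-only prompt as empty is our own addition.

```javascript
// Returns an error message for the first problem found, or null if all is well.
function validateRequest(prompt, key) {
  if (!prompt || prompt.trim() === "") {
    return "Please enter a prompt";
  }
  if (!key) {
    return "Please add your API key. The key will be stored locally in your browser";
  }
  return null;
}
```

The submit handler would then call validateRequest(prompt, key), pass any returned text to displayMessage(), and only call fetchImage() when the result is null.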

fetchImage() is the function that will use the DALL-E 3 API endpoint to generate an image based on the user’s prompt.

The displayMessage() function looks like this:

function displayMessage(msg) {
  message.textContent = msg;
  message.style.display = "block";
  setTimeout(function () {
    message.style.display = "none";
  }, 3000);
}

Here, we set the content of the alert element to the message passed in by the submit handler and make it visible. The setTimeout function then hides the message element again after 3 seconds.
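One subtlety with this approach: if two messages are shown within 3 seconds of each other, the timer from the first one hides the second one early. A possible way around that, sketched below, is to cancel the pending timer before starting a new one. createMessenger is a hypothetical helper, and the element only needs textContent and style.display, so a plain object works as well as a DOM node.

```javascript
// `element` only needs `textContent` and `style.display`, like a DOM node.
function createMessenger(element, delayMs = 3000) {
  let timerId = null;
  return function show(msg) {
    element.textContent = msg;
    element.style.display = "block";
    clearTimeout(timerId); // cancel the hide scheduled by an earlier message
    timerId = setTimeout(() => {
      element.style.display = "none";
    }, delayMs);
  };
}
```

In the app, this could be wired up with const displayMessage = createMessenger(message), which keeps the rest of the code unchanged.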

fetchImage Function

Next, let’s create the fetchImage function, which will be an async function. It will take the prompt and API_KEY as parameters.

const fetchImage = async (prompt, API_KEY) => {

};

Inside the function, we define the API endpoint and store the headers and data required by the API in a variable called options.

The options object includes:

The HTTP method.
Headers for content type and authorization.
The body (a JSON string containing the model, prompt, n (the number of images), and image size).

const url = "https://api.openai.com/v1/images/generations";
const options = {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${API_KEY}`,
  },
  body: JSON.stringify({
    model: "dall-e-3",
    prompt: prompt,
    n: 1,
    size: "1024x1024",
  }),
};

Next, inside a try block, we show the spinner and then perform a POST request using the Fetch API, passing the url and the options object.

We then check the response. If it’s not successful (!response.ok), we display an error message to the user and exit the function to prevent further execution.

const fetchImage = async (prompt, API_KEY) => {
  const url = "https://api.openai.com/v1/images/generations";
  const options = {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${API_KEY}`,
    },
    body: JSON.stringify({
      model: "dall-e-3",
      prompt: prompt,
      n: 1,
      size: "1024x1024",
    }),
  };

  try {
    // show the spinner while the request is in flight
    spinner.style.display = "block";
    const response = await fetch(url, options);

    if (!response.ok) {
      const error = await response.json();
      // fall back to a generic message if the API doesn't provide one
      const errorMessage = error.error?.message || "Failed to fetch image";
      displayMessage(errorMessage);
      return;
    }

    // the success path is added in the next step
  } catch (error) {
    // filled in later
  } finally {
    // filled in later
  }
};
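Reading the error text is a spot where a defensive helper can pay off, because an error body without the expected error.message field would otherwise throw. Here is a minimal sketch, assuming the { error: { message } } shape used above; getErrorMessage is a hypothetical name.

```javascript
// Pulls a readable message out of an API error body, with a fallback.
function getErrorMessage(errorBody, fallback = "Failed to fetch image") {
  const message = errorBody?.error?.message;
  return typeof message === "string" && message !== "" ? message : fallback;
}
```

With this in place, the !response.ok branch could simply call displayMessage(getErrorMessage(error)).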

If the response is successful, we will asynchronously obtain the JSON data from the response object and store it in a variable called result.

const result = await response.json();

For example, the prompt "a blue cat" returns this object (the url has been truncated):

{
    "created": 1713625375,
    "data": [
        {
            "revised_prompt": "Imagine a cat with the most unique color you can 
            think of - a brilliant shade of dark cerulean. This is no ordinary
            cat. Picture this feline lounging in the midday sun, its fur 
            shimmering in the light. The color is an almost surreal hue, 
            rich and saturated, as if pulled straight from a painter's palette.
            The cat's eyes are a contrasting emerald green, watching the world 
            with a wise but relaxed gaze. Imagine the blue cat's body shape, 
            muscular and agile, made for speedy pursuits and stealthy approaches.
            Now, consider how this splendid creature would look in its natural habitat.",
            "url": "https://oaidalleapiprodscus.blob.core.windows.net/private/org-..."
        }
    ]
}

The data also includes a revised_prompt, the expanded version of our prompt that DALL-E 3 actually used to generate the image. From the object received, we can get the url of the image and pass it to another function, displayImage(), which will display it to the user on the web page.

const imageUrl = result.data[0].url;
displayImage(imageUrl);
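If the response ever arrives without a data array, result.data[0].url throws a TypeError. A slightly more careful way to read the result might look like the sketch below. getFirstImage is a hypothetical helper, and returning null for an unexpected shape is our own convention.

```javascript
// Returns { url, revisedPrompt } for the first image, or null if it's missing.
function getFirstImage(result) {
  const first = result?.data?.[0];
  if (!first || typeof first.url !== "string") {
    return null;
  }
  return { url: first.url, revisedPrompt: first.revised_prompt ?? "" };
}
```

The revisedPrompt value could also double as alt text for the generated image.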

In the catch block, we handle any exceptions that might occur during the fetch operation by displaying an appropriate error message to the user.

The finally block will be executed regardless of the outcome of the fetch request, so it’s a good place to hide the spinner whether or not the request succeeded.

} catch (error) {
  console.error(error);
  displayMessage("There was an error, try again");
} finally {
  spinner.style.display = "none";
}
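Since the function has been built up in pieces, it may help to see the whole flow in one place. The sketch below is not the tutorial’s exact code. To keep it runnable outside the browser, the fetch function and the UI actions are passed in as arguments, and requestImageUrl, generateAndShow, and the ui callbacks are all hypothetical names.

```javascript
// Sends one request and resolves with the image URL, or throws with a message.
async function requestImageUrl(prompt, apiKey, fetchFn = fetch) {
  const response = await fetchFn("https://api.openai.com/v1/images/generations", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model: "dall-e-3", prompt, n: 1, size: "1024x1024" }),
  });

  if (!response.ok) {
    // Error bodies are expected to look like { error: { message } }.
    const errorBody = await response.json().catch(() => null);
    throw new Error(errorBody?.error?.message || "Failed to fetch image");
  }

  const result = await response.json();
  return result.data[0].url;
}

// try/catch/finally wrapper; `ui` is a hypothetical object of callbacks.
async function generateAndShow(prompt, apiKey, ui, fetchFn = fetch) {
  ui.showSpinner();
  try {
    ui.showImage(await requestImageUrl(prompt, apiKey, fetchFn));
  } catch (error) {
    ui.showMessage(error.message);
  } finally {
    ui.hideSpinner(); // runs after success and after failure
  }
}
```

In the app, the ui callbacks would map onto the spinner, displayImage(), and displayMessage(). Throwing from requestImageUrl lets a single catch block handle both API errors and network failures.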

displayImage Function

The displayImage() function will look like this:

function displayImage(image) {
  const imageMarkup = `
  <div class="row justify-content-center">
      <div class="col d-flex justify-content-center">
          <img src="${image}" class="img-fluid" alt="AI-generated image">
      </div>
  </div>`;

  imageGallery.innerHTML = imageMarkup;
  spinner.style.display = "none";
}

Let’s break it down.

First, we create HTML markup to specify a responsive Bootstrap column and set the src attribute of the img tag to the generated image url. Then, we inject this markup into the imageGallery container.
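Because the markup is assembled as a string, any value placed inside an attribute is best escaped first. The sketch below separates building the markup from injecting it and accepts optional alt text, such as the user’s prompt. escapeHtml and buildImageMarkup are hypothetical helpers, not part of the tutorial’s code.

```javascript
// Escapes the characters that matter inside HTML text and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Builds the same Bootstrap markup as displayImage(), with escaped values.
function buildImageMarkup(imageUrl, altText = "Generated image") {
  return `
  <div class="row justify-content-center">
      <div class="col d-flex justify-content-center">
          <img src="${escapeHtml(imageUrl)}" class="img-fluid" alt="${escapeHtml(altText)}">
      </div>
  </div>`;
}
```

displayImage() could then set imageGallery.innerHTML to buildImageMarkup(image, prompt), assuming the prompt is passed along to it.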

The final step is to display some of the images generated by DALL-E 3 as a gallery so that when the users first open the app, the images will showcase the app’s capabilities.

First, let’s store the images in an array:

const images = [
  "https://essykings.github.io/JavaScript/image%207.png",
  "https://essykings.github.io/JavaScript/image1.png",
  "https://essykings.github.io/JavaScript/image2.png",
  "https://essykings.github.io/JavaScript/image3.png",
  "https://essykings.github.io/JavaScript/image9.png",
  "https://essykings.github.io/JavaScript/image5.png",
  "https://essykings.github.io/JavaScript/image6.png",
  "https://essykings.github.io/JavaScript/cat.png",
];

Next, we will use the map() method to iterate over the images. For each image, we will build the markup for an <img> element whose src attribute is the image URL, then join the pieces and inject them into the image gallery container.

Finally, we will invoke the displayImages() function.

function displayImages() {
  const imageMarkup = images
    .map((image) => {
      return `
        <div class="col-12 col-sm-6 col-md-3 mb-4 position-relative image-container">
          <img src="${image}" class="img-fluid" alt="Placeholder Image">
        </div>
        `;
    })
    .join("");

  imageGallery.innerHTML = imageMarkup;
}

displayImages();

Final Demo

We’ve done it! Our app is fully functional!

Conclusion

This tutorial has covered how to build an image-generation app with AI. This app can be applied in various fields, such as education to create illustrations, gaming to create visuals, etc. I hope you enjoyed it!