Uploading images and forms to a server using URLSession

Published on: October 30, 2019

One of those tasks that always throws me off balance is building a form that allows users to upload a form with a picture attached to it. I know that it involves configuring my request to be multipart, that I need to attach the picture as data and there’s something involved with setting a content disposition. This is usually about as far as I go until I decide it might be a good time to go to github.com and grab the Carthage URL for Alamofire. If you’re reading this and you’ve implemented POST requests that allow users to upload photos and forms, I’m sure this sounds familiar to you.

In this week’s quick tip I will show you how to implement a multipart form with file upload using only Apple’s built-in URLSession. Ready? On your marks. GO!

Understanding what a multipart request actually looks like

If you’ve ever inspected a multipart request using a tool like Charles or Proxyman, you may have found that the headers of your POST requests contained the following key among several others:

Content-Type: multipart/form-data; boundary=Boundary-3A42CBDB-01A2-4DDE-A9EE-425A344ABA13

This header tells us that the content that's being sent to the server is multipart/form-data. This content type is used when you upload a file alongside other fields in a single request. This is very similar to how an HTML form is uploaded to a server, for example. This header also specifies a boundary, which is a string that the server uses to detect where each part of the body starts and ends.

The value you saw for the boundary was very likely different, but it should look familiar. The body of your POST request typically looks a little bit like the following:

--Boundary-3A42CBDB-01A2-4DDE-A9EE-425A344ABA13
Content-Disposition: form-data; name="family_name"

Wals
--Boundary-3A42CBDB-01A2-4DDE-A9EE-425A344ABA13
Content-Disposition: form-data; name="name"

Donny
--Boundary-3A42CBDB-01A2-4DDE-A9EE-425A344ABA13
Content-Disposition: form-data; name="file"; filename="somefilename.jpg"
Content-Type: image/png

-a long string of image data-
--Boundary-3A42CBDB-01A2-4DDE-A9EE-425A344ABA13--

If you've inspected your request and found similar content to the content above, you might have decided that this looks complicated and you’re better off using a library that handles the creation of the header and HTTP body for you. If that’s the case, I completely understand. Especially because the part where I wrote -a long string of image data- can be really long. I used to reach for Alamofire to handle uploads for the longest time. However, once you take the time to dissect the Content-Type header and HTTP body a little bit, you’ll find that it follows a pretty logical pattern that takes a bunch of effort to implement but I wouldn't say it's very hard. It's mostly very tedious.

First, there’s the Content-Type header. It contains information about the type of data you’re sending (multipart/form-data;) and a boundary. This boundary should always have a unique, somewhat random value. In the example above I used a UUID. Since multipart forms are not always sent to the server all at once but rather in chunks, the server needs some way to know when a certain part of the form you’re sending it ends or begins. This is what the boundary value is used for. This must be communicated in the headers since that’s the first thing the receiving server will be able to read.

Next, let’s look at the HTTP body. It starts with the following block of text:

--Boundary-3A42CBDB-01A2-4DDE-A9EE-425A344ABA13
Content-Disposition: form-data; name="family_name"

Wals

We send two dashes (--) followed by the predefined boundary string (Boundary-3A42CBDB-01A2-4DDE-A9EE-425A344ABA13) to inform the server that it’s about to read a new chunk of content. In this case, that chunk is a form field. The server knows that it’s receiving a form field thanks to the first bit of the next line: Content-Disposition: form-data;. It also knows that the form field it’s about to receive is named family_name due to the second part of the Content-Disposition line: name="family_name". This is followed by a blank line and the value of the form field we want to send the server.

This pattern is repeated for the other form field in the example body:

--Boundary-3A42CBDB-01A2-4DDE-A9EE-425A344ABA13
Content-Disposition: form-data; name="name"

Donny

The third field in the example is slightly different. Its Content-Disposition looks like this:

Content-Disposition: form-data; name="file"; filename="somefilename.jpg"
Content-Type: image/png

It has an extra field called filename. This tells the server that it can refer to the uploaded file using that name once the upload has succeeded. This last chunk for the file itself also has its own Content-Type field. This tells the server about the uploaded file’s MIME type. In this example, it’s image/png because we’re uploading an imaginary PNG image.

After that, you should see another empty line and then a whole lot of cryptic data. That’s the raw image data. And after all of this data, you’ll find the last line of the HTTP body:

--Boundary-3A42CBDB-01A2-4DDE-A9EE-425A344ABA13--

It’s one last boundary, prefixed and suffixed with --. This tells the server that it has now received all of the HTTP data that we wanted to send it.

Every form field essentially has the same structure:

BOUNDARY
CONTENT DISPOSITION (PLUS CONTENT TYPE FOR FILES)
-- BLANK LINE --
VALUE

This structure is mandatory. We didn't pick it ourselves, and we shouldn't modify it.
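To make the role of the boundary a bit more tangible, here's a rough sketch of how a receiver could cut a text-only body into its parts. The splitParts helper and the Boundary-TEST value are made up for this illustration; a real server parses the raw bytes and is far more careful than this.

```swift
import Foundation

// Hypothetical helper for illustration only: splits a text-only multipart
// body into its parts by looking for "--" followed by the boundary.
func splitParts(of body: String, boundary: String) -> [String] {
  let delimiter = "--\(boundary)"
  return body
    .components(separatedBy: delimiter)
    .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
    // Drop the empty piece before the first delimiter and the "--"
    // that is left over from the closing boundary.
    .filter { !$0.isEmpty && $0 != "--" }
}

let boundary = "Boundary-TEST" // made-up value for this sketch
let body = "--\(boundary)\r\n" +
  "Content-Disposition: form-data; name=\"family_name\"\r\n\r\nWals\r\n" +
  "--\(boundary)\r\n" +
  "Content-Disposition: form-data; name=\"name\"\r\n\r\nDonny\r\n" +
  "--\(boundary)--"

let parts = splitParts(of: body, boundary: boundary)

for part in parts {
  // The blank line separates a part's headers from its value
  let pieces = part.components(separatedBy: "\r\n\r\n")
  print(pieces)
}
```

Every part that comes out of this has the same shape: a header line, a blank line, and a value. That's exactly the structure described above.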

Once you understand this structure, the HTTP body of a multipart request should look a lot less daunting, and implementing your own multipart uploader with URLSession shouldn’t sound as scary anymore. Let’s dive right in and implement a multipart URLRequest that can be executed by URLSession!

Preparing a multipart request with an image

In the previous section, we focussed on the contents of a multipart form request in an attempt to demystify its contents. Now it’s time to construct a URLRequest, configure it and build its httpBody so we can send it off to the server with URLSession instead of a third-party solution.

Since I only want to focus on building a multipart form request that contains a file, I won’t be showing you how you can obtain an image that your user can upload.

The first bit of this task is pretty straightforward. We’ll create a URLRequest, make it a POST request and set its Content-Type header:

let boundary = "Boundary-\(UUID().uuidString)"

var request = URLRequest(url: URL(string: "https://some-page-on-a-server")!)
request.httpMethod = "POST"
request.setValue("multipart/form-data; boundary=\(boundary)", forHTTPHeaderField: "Content-Type")

Next, let’s look at the HTTP body. In the previous section, you saw that every block in a multipart request is constructed similarly. Let’s create a method that will output these chunks of body data so that we're not having to bother with writing the same code over and over again.

func convertFormField(named name: String, value: String, using boundary: String) -> String {
  var fieldString = "--\(boundary)\r\n"
  fieldString += "Content-Disposition: form-data; name=\"\(name)\"\r\n"
  fieldString += "\r\n"
  fieldString += "\(value)\r\n"

  return fieldString
}

The code above should pretty much speak for itself. We construct a String that has all the previously discussed elements. Note the \r\n that is added to the string after every line. The multipart format expects each line to end with a carriage return followed by a line feed, so a plain \n is not enough to get the output that we want.
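If you want to see what a single chunk looks like, you can run something like the following quick check. The function is repeated so the snippet runs on its own, and Boundary-TEST is a made-up placeholder rather than a real UUID-based boundary.

```swift
import Foundation

// Same function as above, repeated so this snippet is self-contained
func convertFormField(named name: String, value: String, using boundary: String) -> String {
  var fieldString = "--\(boundary)\r\n"
  fieldString += "Content-Disposition: form-data; name=\"\(name)\"\r\n"
  fieldString += "\r\n"
  fieldString += "\(value)\r\n"

  return fieldString
}

// "Boundary-TEST" is a placeholder value for this sketch
let chunk = convertFormField(named: "name", value: "Donny", using: "Boundary-TEST")

// debugPrint shows the string with its escape sequences,
// which makes the otherwise invisible \r\n line endings visible
debugPrint(chunk)
```

Using debugPrint instead of print is just a convenience here. It makes it easy to confirm that every line really ends in \r\n.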

While this method is pretty neat for the form fields that contain text, we need a separate method to create the chunk for file data since it works slightly differently from the rest. This is mainly because we need to specify the content type for our file, and we have the file's Data as the value rather than a String. The following code can be used to create a body chunk for the file:

func convertFileData(fieldName: String, fileName: String, mimeType: String, fileData: Data, using boundary: String) -> Data {
  let data = NSMutableData()

  data.appendString("--\(boundary)\r\n")
  data.appendString("Content-Disposition: form-data; name=\"\(fieldName)\"; filename=\"\(fileName)\"\r\n")
  data.appendString("Content-Type: \(mimeType)\r\n\r\n")
  data.append(fileData)
  data.appendString("\r\n")

  return data as Data
}

extension NSMutableData {
  func appendString(_ string: String) {
    if let data = string.data(using: .utf8) {
      self.append(data)
    }
  }
}

Instead of a String, we create Data this time. The reason for this is twofold. One is that we already have the file data. Converting this to a String and then back to Data when we add it to the HTTP body is wasteful. The second reason is that the HTTP body itself must be created as Data rather than a String. To make appending text to the Data object easier, we add an extension on NSMutableData that safely appends the given string as Data. From the structure of the method, you should be able to derive that it matches the HTTP body that was shown earlier.
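If you'd rather not use NSMutableData, the same idea can be sketched with Swift's Data value type. This is an alternative I'm showing for comparison, not what the rest of this post uses; the appendString extension on Data is a hypothetical counterpart of the one above.

```swift
import Foundation

extension Data {
  // Hypothetical counterpart of the NSMutableData helper, for Swift's Data
  mutating func appendString(_ string: String) {
    // String.utf8 always produces valid UTF-8 bytes, so there is no optional to unwrap
    append(Data(string.utf8))
  }
}

// Variant of convertFileData that builds the chunk with a mutable Data value
func convertFileData(fieldName: String, fileName: String, mimeType: String, fileData: Data, using boundary: String) -> Data {
  var data = Data()

  data.appendString("--\(boundary)\r\n")
  data.appendString("Content-Disposition: form-data; name=\"\(fieldName)\"; filename=\"\(fileName)\"\r\n")
  data.appendString("Content-Type: \(mimeType)\r\n\r\n")
  data.append(fileData)
  data.appendString("\r\n")

  return data
}

// Placeholder bytes and names, not a real image
let fileChunk = convertFileData(fieldName: "file",
                                fileName: "a.png",
                                mimeType: "image/png",
                                fileData: Data([0x89, 0x50]),
                                using: "Boundary-TEST")
print(fileChunk.count)
```

The output is byte-for-byte the same as the NSMutableData version, so which one you pick is mostly a matter of taste.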

Let’s put all these pieces together and finish preparing the network request!

let httpBody = NSMutableData()

for (key, value) in formFields {
  httpBody.appendString(convertFormField(named: key, value: value, using: boundary))
}

httpBody.append(convertFileData(fieldName: "image_field",
                                fileName: "imagename.png",
                                mimeType: "image/png",
                                fileData: imageData,
                                using: boundary))

httpBody.appendString("--\(boundary)--")

request.httpBody = httpBody as Data

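// Real image bytes are usually not valid UTF-8, so force-unwrapping
// String(data:encoding:) would crash. This lossy decoding is safe for debugging.
print(String(decoding: httpBody as Data, as: UTF8.self))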

The preceding code shouldn’t be too surprising at this point. You use the methods you wrote earlier to construct the HTTP body. After adding the form fields and the file, you add the final boundary with the two trailing dashes and the resulting data is set as the request’s httpBody. Note the print statement at the end of this snippet. Try printing the HTTP body and you’ll see that its text parts match the format from the beginning of this post; the raw image bytes won't be readable.
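The snippet above assumes that formFields and imageData already exist. If formFields is a dictionary, keep in mind that the order in which Swift iterates over it is not guaranteed. Here's a minimal sketch of a single function that takes the fields as an ordered array instead. The makeMultipartBody name, its parameters and the placeholder values are all made up for this example.

```swift
import Foundation

// Hypothetical helper: builds a complete multipart body from ordered text
// fields plus one file. All names here are assumptions for this sketch.
func makeMultipartBody(fields: [(name: String, value: String)],
                       fileField: String,
                       fileName: String,
                       mimeType: String,
                       fileData: Data,
                       boundary: String) -> Data {
  var body = Data()

  // One chunk per text field, in the order the caller provided
  for field in fields {
    body.append(Data("--\(boundary)\r\n".utf8))
    body.append(Data("Content-Disposition: form-data; name=\"\(field.name)\"\r\n\r\n".utf8))
    body.append(Data("\(field.value)\r\n".utf8))
  }

  // The file chunk carries a filename and its own Content-Type
  body.append(Data("--\(boundary)\r\n".utf8))
  body.append(Data("Content-Disposition: form-data; name=\"\(fileField)\"; filename=\"\(fileName)\"\r\n".utf8))
  body.append(Data("Content-Type: \(mimeType)\r\n\r\n".utf8))
  body.append(fileData)
  body.append(Data("\r\n".utf8))

  // Closing boundary with the two trailing dashes
  body.append(Data("--\(boundary)--".utf8))
  return body
}

let multipartBody = makeMultipartBody(
  fields: [(name: "family_name", value: "Wals"), (name: "name", value: "Donny")],
  fileField: "image_field",
  fileName: "imagename.png",
  mimeType: "image/png",
  fileData: Data([0x89, 0x50, 0x4E, 0x47]), // placeholder bytes, not a real image
  boundary: "Boundary-TEST")

print(multipartBody.count)
```

You'd assign the result to request.httpBody just like before. The upside of the array is that the fields always end up in the body in the order you wrote them.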

Now all that’s left to do is run your request just like you would normally:

do {
  let (data, response) = try await URLSession.shared.data(for: request)
  // use your data
} catch {
  print(error)
}

// Or if you're not using async / await yet
URLSession.shared.dataTask(with: request) { data, response, error in
  // handle the response here
}.resume()

Pretty awesome right? It took a little bit of work, but once you understand what you’re doing it’s suddenly not so bad anymore.
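One thing to keep in mind is that URLSession only reports transport-level failures as errors. A server that rejects your upload with a 4xx or 5xx status still counts as a completed request. Here's a minimal sketch of a check you could run on the response. The UploadError type and the validate function are hypothetical names I made up for this example.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // URLResponse lives here on Linux
#endif

// Hypothetical error type for this sketch
enum UploadError: Error {
  case notHTTP
  case badStatus(Int)
}

// Throws unless the response is an HTTP response with a 2xx status code
func validate(_ response: URLResponse) throws {
  guard let http = response as? HTTPURLResponse else {
    throw UploadError.notHTTP
  }

  guard (200..<300).contains(http.statusCode) else {
    throw UploadError.badStatus(http.statusCode)
  }
}

// In the async / await version you could call this right after
// receiving the response, inside the same do block:
//   try validate(response)
```

In the completion handler version, the response is optional, so you'd unwrap it first and call validate inside a do / catch.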

In summary

Even though I called this post a Quick Tip, this turned out to be an information-packed, fairly long write up on making a multipart form request with URLSession. And while there is quite some text involved in explaining everything, I hope you see now that making this kind of request doesn’t require a third-party library. Sure, it takes away the boilerplate for you, but at the same time, you might want to take a step back and ask yourself whether the boilerplate is really so bad that you want to use an external dependency for a single task.

If you want to grab a copy of the finished example, head over to GitHub. I’ve uploaded a Playground for you to, well, play with. As always, thanks for reading this week’s Quick Tip. Any questions, feedback and even compliments are more than welcome. You can reach me on X or Threads.
