Google Takeout - Step by Step

How to make sense of the exported mail, photos, and other data types
video version: video_link

Google Takeout lets you download your data. In this blog post, we will make sense of that data.

Downloading Data

  1. Go to the Google Takeout page.
  2. Select the data types you want to download.
  3. Choose the export frequency, file type, and size.
  4. Click on "Create Export".
  5. Wait for the export to be prepared, then download the files. My export covered more than 10 years of Google data and came to 60 files, about 180 GB in total at around 3-4 GB each, so make sure you are on a good internet connection.
  6. Unzip the files. The code below unzips all of them in one go.
import pathlib
import zipfile

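def sanitize_path(path):
    # Replace characters that are invalid in Windows file names
    invalid_chars = '<>:"/\\|?*'
    for char in invalid_chars:
        path = path.replace(char, '_')
    # Trim surrounding whitespace, plus trailing dots and spaces,
    # which Windows does not handle reliably at the end of a name
    return path.strip().rstrip('. ')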

# Paths
source_dir = pathlib.Path(r"F:\google-takeout-orig - Copy")
unzip_root = pathlib.Path(r"F:\takeout-unzipped")
unzip_root.mkdir(parents=True, exist_ok=True)

# Process each zip file
for zip_file in source_dir.glob("*.zip"):
    try:
        dest_dir = unzip_root / zip_file.stem
        dest_dir.mkdir(parents=True, exist_ok=True)

        with zipfile.ZipFile(zip_file, 'r') as zip_ref:
            for member in zip_ref.infolist():
                try:
                    # Create sanitized path
                    original_path = pathlib.Path(member.filename)
                    sanitized_parts = [sanitize_path(part) for part in original_path.parts]
                    safe_path = dest_dir.joinpath(*sanitized_parts)

                    # Create parent folders
                    safe_path.parent.mkdir(parents=True, exist_ok=True)

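                    if not member.is_dir():
                        # Extract the file in 1 MiB chunks so large files
                        # are not loaded into memory all at once
                        with zip_ref.open(member) as source, open(safe_path, "wb") as target:
                            while chunk := source.read(1024 * 1024):
                                target.write(chunk)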
                except Exception as e:
                    print(f"Error extracting {member.filename}: {e}")

        # print(f"Unzipped: {zip_file.name} to {dest_dir}")
    except Exception as e:
        print(f"Error processing {zip_file.name}: {e}")
print("All zip files have been processed.")
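
With around 60 archives of 3-4 GB each, an interrupted download is easy to miss, so it may be worth checking the archives before extracting them. The sketch below is one possible way to do that with the standard library: zipfile.ZipFile.testzip() reads every member and returns the name of the first one that fails its CRC check, or None if all are fine. The function name find_bad_archives is an illustrative assumption and is not part of the script above.

```python
import pathlib
import zipfile


def find_bad_archives(source_dir):
    """Return {archive name: problem description} for zips that fail a check."""
    problems = {}
    for zip_file in sorted(pathlib.Path(source_dir).glob("*.zip")):
        try:
            with zipfile.ZipFile(zip_file) as zip_ref:
                # testzip() reads every member and returns the first bad name, or None
                bad_member = zip_ref.testzip()
                if bad_member is not None:
                    problems[zip_file.name] = f"corrupt member: {bad_member}"
        except zipfile.BadZipFile as e:
            # Raised when the file is truncated or is not a zip archive at all
            problems[zip_file.name] = f"not a valid zip: {e}"
    return problems
```

Calling find_bad_archives(source_dir) on the download folder returns an empty dictionary when every archive passes. Any archive it lists is a candidate for downloading again. Note that the check reads every archive in full, so on an export of this size it will take a while.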

After unzipping, you will see one folder per data type, such as Mail, Photos, and Contacts. The same photo can show up more than once, in an album folder and again in a year folder. The photo files also do not carry their metadata; it is stored in separate JSON files.
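
One way to deal with this layout is sketched below. It groups files by a SHA-256 hash of their contents to find photos that exist in more than one folder, and it reads the capture time from a JSON file. The key names photoTakenTime and timestamp are assumptions about what the JSON files contain, so open one of your own files to confirm them first. The function names are illustrative.

```python
import hashlib
import json
import pathlib
from collections import defaultdict
from datetime import datetime, timezone


def file_hash(path, chunk_size=1024 * 1024):
    """Hash a file's contents in chunks so large videos are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose contents are byte-for-byte identical."""
    by_hash = defaultdict(list)
    for path in sorted(pathlib.Path(root).rglob("*")):
        # JSON files hold metadata, not media, so skip them
        if path.is_file() and path.suffix.lower() != ".json":
            by_hash[file_hash(path)].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]


def read_taken_time(json_path):
    """Return the capture time from a metadata file as a UTC datetime, or None.

    Assumes the file stores {"photoTakenTime": {"timestamp": "<unix seconds>"}}.
    """
    with open(json_path, encoding="utf-8") as f:
        data = json.load(f)
    timestamp = data.get("photoTakenTime", {}).get("timestamp")
    if timestamp is None:
        return None
    return datetime.fromtimestamp(int(timestamp), tz=timezone.utc)
```

find_duplicates returns one list per group of identical files, so you can keep the first path in each list and review the rest. Hashing every file is slow on a large export; comparing file sizes first and hashing only the files whose sizes match would cut the work down.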