James Yeates Cells, Climbing, Computers

Making PDFs with AI

It’s a tumultuous time as we approach another paradigm shift in how we live, learn and work. I dabble with computers, but I’m by no means proficient at programming. However, the advent of AI tools gives the ability to leverage them to save hours of time.

I had a directory, with multiple sub directories filled with about 5200 files - mostly .webp with a smattering of .jpg. These all had unhelpful names and their associated file properties didn’t allow for easy PDF creation.

I basically needed to make a PDF of the contents of each sub-directory with the page order following the file creation date, with the oldest file forming the first page and so on.

Articulating and troubleshooting this problem led me to this almost entirely AI-derived solution.

I used the following in powershell:

Get-ChildItem -Path . -Directory | ForEach-Object { Get-ChildItem -Path $_.FullName | Sort-Object LastWriteTime | Select-Object Name | Out-File -FilePath (Join-Path -Path $_.FullName -ChildPath "filelist.txt") }

This created a filelist.txt in each sub-directory, ordered how I needed it.

I then used the following python code:

# Ensure you have the required libraries installed:
# pip install fpdf2 Pillow
from fpdf import FPDF
from PIL import Image
import os

def create_pdf(image_folder, output_pdf):
    """
    Creates a PDF from images in a specified folder, with the order
    determined by a 'filelist.txt' file within that folder.
    (This function is the same as your original)
    """
    pdf = FPDF()
    filelist_path = os.path.join(image_folder, 'filelist.txt')

    if not os.path.isfile(filelist_path):
        print(f"Error: 'filelist.txt' not found in the folder: {image_folder}")
        return

    try:
        # NOTE: Your PowerShell script creates UTF-16 encoded files.
        with open(filelist_path, 'r', encoding='utf-16') as f:
            lines = f.readlines()
            # Skip header lines, strip whitespace, and exclude blank lines/self-reference
            image_files = [
                line.strip() for line in lines[2:] 
                if line.strip() and line.strip().lower() != 'filelist.txt'
            ]
    except Exception as e:
        print(f"Error reading or processing {filelist_path}: {e}")
        return

    for image_file in image_files:
        image_path = os.path.join(image_folder, image_file)
        
        if not os.path.isfile(image_path):
            print(f"Warning: Skipping '{image_file}' as it was not found.")
            continue
            
        try:
            with Image.open(image_path) as img:
                img_rgb = img.convert('RGB')
                
                a4_width_mm = 210
                a4_height_mm = 297
                img_width, img_height = img_rgb.size
                aspect_ratio = img_height / img_width

                if (img_width > img_height):
                    page_orientation = 'L'
                    page_width, page_height = a4_height_mm, a4_width_mm
                else:
                    page_orientation = 'P'
                    page_width, page_height = a4_width_mm, a4_height_mm

                scaled_width = page_width
                scaled_height = scaled_width * aspect_ratio
                if scaled_height > page_height:
                    scaled_height = page_height
                    scaled_width = scaled_height / aspect_ratio

                x_pos = (page_width - scaled_width) / 2
                y_pos = (page_height - scaled_height) / 2

                pdf.add_page(orientation=page_orientation)
                pdf.image(img_rgb, x=x_pos, y=y_pos, w=scaled_width, h=scaled_height)
        except Exception as e:
            print(f"Could not process {image_path}: {e}")

    pdf.output(output_pdf)
    print(f"SUCCESS: PDF created at {output_pdf}")

# --- NEW FUNCTION TO PROCESS ALL SUB-FOLDERS ---
def process_all_subfolders(root_folder):
    """
    Finds all sub-folders in the root_folder and runs create_pdf on each one.
    The output PDF is named after the sub-folder and saved in the root_folder.
    """
    print(f"Starting to process sub-folders in: {root_folder}\n")
    # Find all items in the root folder that are directories
    subfolders = [d for d in os.listdir(root_folder) if os.path.isdir(os.path.join(root_folder, d))]

    if not subfolders:
        print("No sub-folders found to process.")
        return

    # Loop through each found sub-folder
    for folder_name in subfolders:
        image_folder_path = os.path.join(root_folder, folder_name)
        # Name the output PDF after the folder and save it in the root directory
        output_pdf_path = os.path.join(root_folder, f"{folder_name}.pdf")
        
        print(f"--- Processing folder: [{folder_name}] ---")
        create_pdf(image_folder_path, output_pdf_path)
        print("-" * (len(folder_name) + 24)) # Separator for clarity
        
    print("\nAll tasks complete.")

# --- UPDATED MAIN EXECUTION BLOCK ---
if __name__ == "__main__":
    # IMPORTANT: Set this to the main folder containing all your sub-folders.
    # For example, if you have C:\Desktop\Projects\ProjectA and C:\Desktop\Projects\ProjectB,
    # you would set the path to C:\Desktop\Projects
    top_level_folder = r"INSERT PATH HERE"
    
    process_all_subfolders(top_level_folder)

Could I have done this myself? Eh, probably, but it would have required a significant investment of time and effort. Could AI have reached this solution without my troubleshooting and correcting? For the moment, no.