Comparing Two Single-row CSV

Comparing Two Single-row CSV | Generated by AI

Home 2025.07.19

Below is a Python script that compares two single-row CSV files containing strings (e.g., “P0001”) and identifies:

Common items (present in both files)
Items unique to the first file
Items unique to the second file

The script uses argparse to accept two CSV file paths, reads the first row from each, performs a set-based comparison, and writes the results to comparison.txt.

#!/usr/bin/env python3
import argparse
import csv
import os

def main():
    # Set up argument parser
    parser = argparse.ArgumentParser(description='Compare two single-row CSV files for common and unique items')
    parser.add_argument('file1_path', help='Path to the first CSV file')
    parser.add_argument('file2_path', help='Path to the second CSV file')
    
    # Parse arguments
    args = parser.parse_args()
    
    try:
        # Read the first row of the first CSV file
        with open(args.file1_path, 'r') as file1:
            reader1 = csv.reader(file1)
            for row in reader1:
                values1 = row
                break
        
        # Read the first row of the second CSV file
        with open(args.file2_path, 'r') as file2:
            reader2 = csv.reader(file2)
            for row in reader2:
                values2 = row
                break
        
        # Convert to sets for comparison
        set1 = set(values1)
        set2 = set(values2)
        
        # Find common and unique items
        common_items = set1.intersection(set2)
        unique_to_file1 = set1.difference(set2)
        unique_to_file2 = set2.difference(set1)
        
        # Determine output directory (use directory of first file)
        output_dir = os.path.dirname(args.file1_path) or '.'
        output_file = os.path.join(output_dir, 'comparison.txt')
        
        # Write results to output file
        with open(output_file, 'w') as file:
            file.write("Comparison Results:\n\n")
            
            file.write("Common Items (in both files):\n")
            if common_items:
                for item in sorted(common_items):
                    file.write(f"{item}\n")
            else:
                file.write("None\n")
                
            file.write("\nItems only in first file:\n")
            if unique_to_file1:
                for item in sorted(unique_to_file1):
                    file.write(f"{item}\n")
            else:
                file.write("None\n")
                
            file.write("\nItems only in second file:\n")
            if unique_to_file2:
                for item in sorted(unique_to_file2):
                    file.write(f"{item}\n")
            else:
                file.write("None\n")
        
        print(f"Comparison complete. Results saved to {output_file}")
        
    except FileNotFoundError as e:
        print(f"Error: File not found - {e}")
    except Exception as e:
        print(f"Error: {e}")

if __name__ == "__main__":
    main()

How it works:

Input: Takes two CSV file paths via command-line arguments.
Reading: Reads only the first row from each CSV file.
Comparison:
- Converts the rows to sets for efficient comparison.
- Finds:
  - Common items: Items present in both rows (set intersection).
  - Unique to first file: Items only in the first row (set difference: file1 - file2).
  - Unique to second file: Items only in the second row (set difference: file2 - file1).
Output: Writes results to comparison.txt in the directory of the first file, listing:
- Common items
- Items only in the first file
- Items only in the second file
- Each section is sorted alphabetically; “None” is written if a section is empty.
Error Handling: Catches file-not-found errors and other exceptions.

Example usage:

Save the script as compare_csv.py. Create two CSV files, e.g.:

file1.csv: P0001,P0002,P0003
file2.csv: P0001,P0004,P0003

Run the script:

python compare_csv.py file1.csv file2.csv

Output (comparison.txt):

Comparison Results:

Common Items (in both files):
P0001
P0003

Items only in first file:
P0002

Items only in second file:
P0004

Notes:

The script ignores the position of items and focuses only on presence/absence.
Duplicate items in a row are automatically handled (sets remove duplicates).
The output is written to comparison.txt in the directory of the first input file.
If a file is empty or has no row, the script will treat it as an empty set.

Would you like me to modify the script (e.g., change the output format or add more details)?

Back Donate