APT Package Repository Management Guide

Introduction

Understanding where your installed packages come from is crucial for system maintenance, security auditing, and troubleshooting. This guide shows how to identify package sources, manage repositories, list packages by origin, and audit your APT installation on Debian/Ubuntu systems.

What are Package Repositories?

Package repositories are servers that host software packages for installation via APT (Advanced Package Tool). Understanding repository sources helps you:

  • Security: Identify packages from untrusted sources
  • Maintenance: Track PPA dependencies
  • Troubleshooting: Find conflicting package sources
  • System Cleanup: Remove unused repositories
  • Documentation: Maintain installation records

Repository Types

TypeDescriptionExample
OfficialUbuntu/Debian main repositoriesarchive.ubuntu.com
PPAPersonal Package Archives (Ubuntu)ppa.launchpad.net/user/repo
Third-partyExternal repositoriesdownload.docker.com
LocalLocally installed packages/var/lib/dpkg/status

Prerequisites

  • Debian or Ubuntu-based system
  • Basic terminal knowledge
  • apt, dpkg, and perl installed (usually pre-installed)

πŸš€ Quick Commands

List All Packages with Repositories

dpkg --get-selections | awk '!/deinstall$/ {print $1}' | \
xargs -I{} sh -c 'echo "=== $1 ===" && apt-cache policy "$1" | grep -E "^\s+\*\*\*" | head -1' -- {} | \
paste - - | sed 's/=== \([^ ]*\) ===/\1:/' | \
perl -pe 's/:\s+\*\*\*[^5]*500\s+/: /' | \
sed 's/\s\[.*\]$//'

List Only PPA Packages

dpkg --get-selections | awk '!/deinstall$/ {print $1}' | \
xargs -I{} apt-cache policy {} | \
perl -0777 -ne 'while(/^(\S+?):\n.*?\n\s+\*\*\*.*?500 http:\/\/ppa\.launchpad\.net\/([^\s\/]+)/gms) {print "$1: $2\n"}'

Count Packages per Repository

dpkg --get-selections | awk '!/deinstall$/ {print $1}' | \
xargs -I{} apt-cache policy {} 2>/dev/null | \
perl -0777 -ne 'while(/^(\S+?):\n.*?\n\s+\*\*\*.*?500\s+(http:\/\/[^\s]+)/gms) {
    $repo = $2; $repo =~ s/\s.*//; $repos{$repo}++
} END { for $r (sort keys %repos) { print "$r: $repos{$r} packages\n" } }'

πŸ“¦ Complete Package Audit Script

Full-featured bash script with multiple options

Save as package-audit.sh:

#!/bin/bash

# Script to show installed packages and their repository origins
# Enhanced version with options and clearer output

show_help() {
    cat << EOF
Usage: $0 [OPTIONS]

OPTIONS:
    -h, --help          Show this help
    -p, --ppa-only      Show only PPA packages
    -a, --all           Show all packages with repositories (default)
    -f, --format FORMAT Output format (default|json|csv)
    -s, --sort          Sort by repository
    -c, --count         Show count per repository
    -o, --output FILE   Save output to file

EXAMPLES:
    $0                  # List all packages with repositories
    $0 -p               # Only PPA packages
    $0 -f json          # JSON output
    $0 -c               # Count per repository
    $0 -p -o ppa.txt    # Save PPA list to file

REPOSITORY TYPES:
    - PPA: Personal Package Archives (Ubuntu)
    - Official: Ubuntu/Debian main repositories
    - Third-party: External repositories
    - Local: Locally installed packages
EOF
}

# Function to get package information
get_package_info() {
    local format="$1"
    local ppa_only="$2"
    local sort_by_repo="$3"
    
    # Get list of installed packages
    local packages=$(dpkg --get-selections | grep -v deinstall$ | awk '{print $1}')
    
    # Arrays to store results
    declare -A results
    declare -A repo_counts
    
    echo "Processing packages..." >&2
    
    local total=0
    local processed=0
    
    # Count total packages
    total=$(echo "$packages" | wc -l)
    
    # Process each package
    while IFS= read -r package; do
        if [[ -n "$package" ]]; then
            ((processed++))
            
            # Show progress
            if (( processed % 50 == 0 )); then
                echo "Processed $processed/$total packages..." >&2
            fi
            
            # Get package policy
            local policy_output=$(apt-cache policy "$package" 2>/dev/null)
            
            if [[ -n "$policy_output" ]]; then
                # Extract main repository
                local repo=$(echo "$policy_output" | grep -E '^\s+\*\*\*|^\s+[0-9]+' | head -1 | \
                           perl -ne 'if (/^\s+(?:\*\*\*\s+)?[0-9]+\s+(.*)/) { 
                               $repo = $1; 
                               $repo =~ s/\s+\[.*?\]$//;  # remove architecture
                               if ($repo =~ /http:\/\/ppa\.launchpad\.net\/([^\/\s]+\/[^\/\s]+)/) {
                                   print "PPA: $1\n";
                               } elsif ($repo =~ /http:\/\/([^\/\s]+)/) {
                                   print "Repository: $1\n";
                               } elsif ($repo =~ /\/var\/lib\/dpkg\/status/) {
                                   print "Local\n";
                               } else {
                                   print "Other: $repo\n";
                               }
                           }')
                
                if [[ -n "$repo" ]]; then
                    # Filter PPA if requested
                    if [[ "$ppa_only" == "true" && ! "$repo" =~ ^PPA: ]]; then
                        continue
                    fi
                    
                    results["$package"]="$repo"
                    repo_counts["$repo"]=$((${repo_counts["$repo"]} + 1))
                fi
            fi
        fi
    done <<< "$packages"
    
    echo "Done! Found ${#results[@]} packages." >&2
    echo "" >&2
    
    # Output based on requested format
    case "$format" in
        "json")
            echo "{"
            local first=true
            for package in $(printf '%s\n' "${!results[@]}" | sort); do
                if [[ "$first" == "true" ]]; then
                    first=false
                else
                    echo ","
                fi
                printf '  "%s": "%s"' "$package" "${results[$package]}"
            done
            echo ""
            echo "}"
            ;;
        "csv")
            echo "Package,Repository"
            for package in $(printf '%s\n' "${!results[@]}" | sort); do
                echo "$package,${results[$package]}"
            done
            ;;
        "count")
            echo "================================================"
            echo "  Packages per Repository"
            echo "================================================"
            echo ""
            printf "%-60s %s\n" "Repository" "Count"
            echo "------------------------------------------------"
            for repo in "${!repo_counts[@]}"; do
                printf "%-60s %s\n" "$repo" "${repo_counts[$repo]}"
            done | sort -t$'\t' -k2 -nr
            echo ""
            echo "Total repositories: ${#repo_counts[@]}"
            echo "Total packages: ${#results[@]}"
            ;;
        *)
            echo "================================================"
            echo "  Installed Packages by Repository"
            echo "================================================"
            echo ""
            if [[ "$sort_by_repo" == "true" ]]; then
                # Sort by repository
                local current_repo=""
                for package in "${!results[@]}"; do
                    echo "${results[$package]}|$package"
                done | sort | while IFS='|' read -r repo package; do
                    if [[ "$repo" != "$current_repo" ]]; then
                        echo ""
                        echo "[$repo]"
                        echo "---"
                        current_repo="$repo"
                    fi
                    echo "  - $package"
                done
            else
                # Sort by package name
                printf "%-40s -> %s\n" "Package" "Repository"
                echo "------------------------------------------------"
                for package in $(printf '%s\n' "${!results[@]}" | sort); do
                    printf "%-40s -> %s\n" "$package" "${results[$package]}"
                done
            fi
            echo ""
            echo "Total packages: ${#results[@]}"
            ;;
    esac
}

# Parse arguments
format="default"
ppa_only=false
sort_by_repo=false
output_file=""

while [[ $# -gt 0 ]]; do
    case $1 in
        -h|--help)
            show_help
            exit 0
            ;;
        -p|--ppa-only)
            ppa_only=true
            shift
            ;;
        -a|--all)
            # Default behavior
            shift
            ;;
        -f|--format)
            format="$2"
            shift 2
            ;;
        -s|--sort)
            sort_by_repo=true
            shift
            ;;
        -c|--count)
            format="count"
            shift
            ;;
        -o|--output)
            output_file="$2"
            shift 2
            ;;
        *)
            echo "Unknown option: $1" >&2
            show_help
            exit 1
            ;;
    esac
done

# Check dependencies
if ! command -v apt-cache &> /dev/null; then
    echo "Error: apt-cache not found. This script works only on Debian/Ubuntu systems." >&2
    exit 1
fi

if ! command -v dpkg &> /dev/null; then
    echo "Error: dpkg not found." >&2
    exit 1
fi

# Execute main function
if [[ -n "$output_file" ]]; then
    get_package_info "$format" "$ppa_only" "$sort_by_repo" > "$output_file"
    echo "Output saved to: $output_file" >&2
else
    get_package_info "$format" "$ppa_only" "$sort_by_repo"
fi

Make executable:

chmod +x package-audit.sh

πŸ’‘ Usage Examples

Basic Usage

# List all packages with their repositories
./package-audit.sh

# List only PPA packages
./package-audit.sh -p

# Sort packages by repository
./package-audit.sh -s

# Show count per repository
./package-audit.sh -c

Advanced Usage

# Export to JSON
./package-audit.sh -f json > packages.json

# Export to CSV
./package-audit.sh -f csv > packages.csv

# Save PPA list to file
./package-audit.sh -p -o ppa-packages.txt

# Get PPA count only
./package-audit.sh -p -c

# Combine options
./package-audit.sh -p -s -o ppa-sorted.txt

πŸ” Understanding Package Sources

Check Individual Package

# Show package policy
apt-cache policy package-name

# Example output:
# package-name:
#   Installed: 1.2.3-1
#   Candidate: 1.2.3-1
#   Version table:
#  *** 1.2.3-1 500
#         500 http://ppa.launchpad.net/user/repo/ubuntu focal/main amd64 Packages
#         100 /var/lib/dpkg/status

Find Package Repository

# Which repository provides a package?
apt-cache policy package-name | grep -A1 "Installed:"

# List all available versions
apt-cache madison package-name

# Show package origin
apt-cache show package-name | grep -E "^(Package|Version|Origin):"

πŸ“Š Repository Analysis

Analyze repository distribution
#!/bin/bash
# Analyze repository distribution

echo "Repository Distribution Analysis"
echo "================================="
echo ""

# Count by type
echo "By Type:"
./package-audit.sh -c | grep -E "^(PPA|Repository|Local)" | \
    awk -F: '{type=$1; count=$2; types[type]+=count} END {
        for (t in types) print t": "types[t]
    }' | sort -t: -k2 -nr

echo ""
echo "Top 10 Repositories:"
./package-audit.sh -c | head -12 | tail -10
Find packages from specific PPA
# List packages from specific PPA
PPA_NAME="ppa-owner/ppa-name"

dpkg --get-selections | awk '!/deinstall$/ {print $1}' | \
while read package; do
    apt-cache policy "$package" | grep -q "$PPA_NAME" && echo "$package"
done
Compare two systems
# Generate package list on system 1
./package-audit.sh -f csv > system1.csv

# Generate package list on system 2
./package-audit.sh -f csv > system2.csv

# Compare
diff <(sort system1.csv) <(sort system2.csv)

# Or use comm for better comparison
comm -3 <(sort system1.csv) <(sort system2.csv)

πŸ› οΈ Repository Management

List Active Repositories

# List all configured repositories
grep -r --include '*.list' '^deb ' /etc/apt/sources.list /etc/apt/sources.list.d/

# List only PPAs
grep -r --include '*.list' '^deb ' /etc/apt/sources.list.d/ | grep ppa.launchpad.net

# Show repository keys
apt-key list

Add/Remove Repositories

# Add PPA
sudo add-apt-repository ppa:user/repo
sudo apt update

# Remove PPA
sudo add-apt-repository --remove ppa:user/repo
sudo apt update

# Add third-party repository
echo "deb [signed-by=/usr/share/keyrings/repo.gpg] https://repo.url distribution component" | \
    sudo tee /etc/apt/sources.list.d/repo.list

# Remove third-party repository
sudo rm /etc/apt/sources.list.d/repo.list
sudo apt update

Clean Unused Repositories

Find and remove unused PPAs
#!/bin/bash
# Find PPAs with no installed packages

echo "Checking for unused PPAs..."

# Get list of PPAs
ppas=$(grep -r --include '*.list' '^deb ' /etc/apt/sources.list.d/ | \
       grep ppa.launchpad.net | \
       sed 's/.*ppa\.launchpad\.net\/\([^\/]*\/[^\/]*\).*/\1/' | \
       sort -u)

# Check each PPA
while IFS= read -r ppa; do
    count=$(dpkg --get-selections | awk '!/deinstall$/ {print $1}' | \
            xargs -I{} apt-cache policy {} 2>/dev/null | \
            grep -c "ppa.launchpad.net/$ppa")
    
    if [ "$count" -eq 0 ]; then
        echo "Unused PPA: $ppa (0 packages)"
        echo "  Remove with: sudo add-apt-repository --remove ppa:$ppa"
    else
        echo "Active PPA: $ppa ($count packages)"
    fi
done <<< "$ppas"

πŸ” Security Auditing

Identify Untrusted Sources

# List packages from non-official repositories
./package-audit.sh | grep -v "Repository: archive.ubuntu.com" | \
                      grep -v "Repository: security.ubuntu.com" | \
                      grep -v "Local"

# Check for packages without GPG verification
dpkg --get-selections | awk '!/deinstall$/ {print $1}' | \
while read pkg; do
    apt-cache policy "$pkg" | grep -q "\[trusted=yes\]" && echo "$pkg (unverified)"
done

Audit Third-party Packages

Generate security audit report
#!/bin/bash
# Security audit report for packages

cat > security-audit.sh << 'EOF'
#!/bin/bash

echo "================================================"
echo "  Package Security Audit Report"
echo "  Generated: $(date)"
echo "================================================"
echo ""

echo "1. Packages from PPAs:"
echo "----------------------"
./package-audit.sh -p -c
echo ""

echo "2. Packages from Third-party Repositories:"
echo "-----------------------------------------"
./package-audit.sh | grep "Repository:" | \
    grep -v "archive.ubuntu.com" | \
    grep -v "security.ubuntu.com" | \
    grep -v "ppa.launchpad.net" | \
    sort -u
echo ""

echo "3. Locally Installed Packages:"
echo "------------------------------"
./package-audit.sh | grep "^Local" | wc -l
echo ""

echo "4. Unsigned Repositories:"
echo "------------------------"
grep -r "trusted=yes" /etc/apt/sources.list /etc/apt/sources.list.d/ 2>/dev/null
echo ""

echo "5. Outdated Package Keys:"
echo "------------------------"
apt-key list | grep -E "expired:"
EOF

chmod +x security-audit.sh
./security-audit.sh

πŸ“‹ Common Tasks

Export Package List for Backup

# Save complete package list with sources
./package-audit.sh -f csv > package-backup-$(date +%Y%m%d).csv

# Save only package names
dpkg --get-selections | grep -v deinstall > packages-$(date +%Y%m%d).txt

# Save with versions
dpkg -l | grep ^ii > packages-versioned-$(date +%Y%m%d).txt

Restore Packages on New System

# From package-audit.sh CSV
# 1. Add repositories first (manually or script)
# 2. Install packages
cut -d, -f1 package-backup.csv | tail -n +2 | \
    xargs sudo apt install -y

# From dpkg list
sudo dpkg --set-selections < packages.txt
sudo apt-get dselect-upgrade

Find Package Dependencies

# Show dependencies
apt-cache depends package-name

# Show reverse dependencies
apt-cache rdepends package-name

# Show all dependency chain
apt-cache depends --recurse --no-recommends --no-suggests --no-conflicts \
    --no-breaks --no-replaces --no-enhances package-name

πŸ”§ Troubleshooting

Script runs slowly

Problem: Script takes too long on systems with many packages

Solution:

# Cache apt-cache policy output
dpkg --get-selections | awk '!/deinstall$/ {print $1}' | \
    xargs apt-cache policy > /tmp/apt-policy-cache.txt

# Then process the cache file instead of calling apt-cache repeatedly
# This speeds up processing significantly
Missing perl or dependencies

Problem: perl command not found

Solution:

sudo apt install perl

# Alternative without perl
dpkg --get-selections | awk '!/deinstall$/ {print $1}' | \
while read pkg; do
    repo=$(apt-cache policy "$pkg" | grep -m1 "500 http" | \
           awk '{print $2}')
    echo "$pkg: $repo"
done
Cannot identify some package sources

Problem: Some packages show “Other” or empty repository

Solution:

# Check package files directly
dpkg -L package-name | head -5

# Check package origin
dpkg -s package-name | grep -E "^(Package|Version|Status|Origin):"

# Check if it's a manually installed .deb
grep "Package: package-name" /var/lib/dpkg/status -A20 | grep "Origin"

πŸ’‘ Advanced Use Cases

Monitor Repository Changes

Track repository changes over time
#!/bin/bash
# Track repository changes

LOGDIR="$HOME/package-logs"
mkdir -p "$LOGDIR"

LOGFILE="$LOGDIR/packages-$(date +%Y%m%d-%H%M%S).csv"

# Generate current snapshot
./package-audit.sh -f csv > "$LOGFILE"

echo "Snapshot saved: $LOGFILE"

# Compare with previous
PREVIOUS=$(ls -t "$LOGDIR"/packages-*.csv | sed -n '2p')

if [ -n "$PREVIOUS" ]; then
    echo ""
    echo "Changes since last snapshot:"
    echo "----------------------------"
    
    # Added packages
    echo "Added:"
    comm -13 <(sort "$PREVIOUS") <(sort "$LOGFILE") | head -10
    
    # Removed packages
    echo "Removed:"
    comm -23 <(sort "$PREVIOUS") <(sort "$LOGFILE") | head -10
fi

Set up as cron job:

# Run weekly
0 0 * * 0 /path/to/track-repos.sh

Generate Installation Documentation

Create markdown documentation of installed software
#!/bin/bash
# Generate installation documentation

cat > system-documentation.md << EOF
# System Package Documentation
Generated: $(date)
Hostname: $(hostname)

## Package Summary

$(./package-audit.sh -c)

## Packages by Repository

$(./package-audit.sh -s)

## PPA Packages

$(./package-audit.sh -p)

## Repository Configuration

\`\`\`
$(grep -r --include '*.list' '^deb ' /etc/apt/sources.list /etc/apt/sources.list.d/)
\`\`\`

## System Information

- OS: $(lsb_release -d | cut -f2-)
- Kernel: $(uname -r)
- Architecture: $(dpkg --print-architecture)

EOF

echo "Documentation generated: system-documentation.md"

πŸ“š Command Reference

Essential Commands

CommandPurpose
dpkg --get-selectionsList installed packages
apt-cache policy PKGShow package sources
apt-cache madison PKGList available versions
apt-cache show PKGShow package details
apt-cache depends PKGShow dependencies
apt-cache rdepends PKGShow reverse dependencies
dpkg -L PKGList files in package
dpkg -S /path/to/fileFind package owning file

Repository Management

CommandPurpose
add-apt-repository ppa:user/repoAdd PPA
add-apt-repository --remove ppa:user/repoRemove PPA
apt-key listList repository keys
apt updateUpdate package lists
apt-cache search TERMSearch packages

🎯 Best Practices

  1. Regular Audits: Run package audits monthly
  2. Document Changes: Keep logs of repository additions
  3. Minimize PPAs: Use only trusted PPAs
  4. Security First: Audit third-party sources regularly
  5. Backup Lists: Maintain package backups before major changes
  6. Clean Regularly: Remove unused repositories
  7. Version Pin: Pin critical packages to specific versions

Version Pinning Example

# Pin package to specific version
sudo apt-mark hold package-name

# Show held packages
apt-mark showhold

# Unhold package
sudo apt-mark unhold package-name

Clean Package Cache

# Remove downloaded package files
sudo apt clean

# Remove orphaned packages
sudo apt autoremove

# Remove unnecessary dependencies
sudo apt-get autoclean

Fix Broken Packages

# Fix broken dependencies
sudo apt --fix-broken install

# Reconfigure packages
sudo dpkg --configure -a

# Force reinstall package
sudo apt install --reinstall package-name

Master your package management for a clean and secure system!