echo6-docs/runbooks/ia-cli-reference.md
Matt Johnson e9231ac24a Migration: consolidate Echo6 docs to cortex with full infrastructure cleanup sync
2026-04-13 06:02:16 +00:00


Internet Archive CLI Reference

Quick reference for the ia command-line tool on pi-nas.


Location & Setup

| Detail | Value |
|--------|-------|
| Host | pi-nas (192.168.1.245 / 100.64.0.21) |
| Binary | ia (v5.7.2, pip-installed) |
| Config | ~/.config/internetarchive/ia.ini |

1. Configure / Authenticate

Required for uploads, metadata edits, and accessing restricted items. Not required for public downloads or searches.

ia configure
# Prompts for archive.org email + password
# Stores credentials in ~/.config/internetarchive/ia.ini

Verify:

ia configure --help   # Confirms the CLI is installed and runs; it does not validate stored credentials
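
Because the help command succeeds whether or not credentials are stored, a slightly stronger local check is to confirm that the config file exists and holds an [s3] section. The following is a minimal sketch, assuming ia configure wrote its keys under an [s3] section (check this against your own ia.ini). ia_config_present is a hypothetical helper name, not part of the ia tool.

```shell
# Hypothetical helper: checks that an ia.ini file exists and has an [s3] section.
# It never contacts archive.org, so it cannot tell whether the keys are still valid.
# Usage: ia_config_present [PATH_TO_INI]
ia_config_present() {
  cfg="${1:-$HOME/.config/internetarchive/ia.ini}"
  if [ ! -f "$cfg" ]; then
    echo "missing: $cfg"
    return 1
  fi
  if ! grep -q '^\[s3\]' "$cfg"; then
    echo "no [s3] section in $cfg"
    return 1
  fi
  echo "ok: $cfg"
}
```

Unlike cat, this check does not print the stored keys, so its output is safe to paste into a ticket.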

2. Search

Search the archive.org catalog. Returns one JSON object per line by default.

Basic syntax

ia search '<query>'

Query syntax

Queries use Lucene syntax. Combine fields with AND/OR, quote phrases.

| Field | Example | Notes |
|-------|---------|-------|
| collection | collection:prelinger | Items in a specific collection |
| subject | subject:"ham radio" | Subject/tag match |
| mediatype | mediatype:texts | One of: texts, movies, audio, software, image, data, web, collection |
| creator | creator:"ARRL" | Author/creator |
| title | title:"emergency" | Item title |
| date | date:[2020-01-01 TO 2024-12-31] | Date range (YYYY-MM-DD) |
| year | year:2023 | Single year (YYYY) |
| language | language:eng | ISO language code |
| licenseurl | licenseurl:*creativecommons* | License filter (wildcard match) |

Combined queries

# Texts about ham radio dated 2020 or later (mediatype:texts covers more than PDFs)
ia search 'subject:"ham radio" mediatype:texts date:[2020-01-01 TO 2099-12-31]'

# All items in a specific collection
ia search 'collection:prelinger'

# Creator + mediatype
ia search 'creator:"ARRL" AND mediatype:texts'
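
When a query is assembled from several clauses in a script, joining them with explicit AND keeps the intent visible and avoids quoting mistakes. A minimal sketch follows. ia_query is a hypothetical helper name, and it only builds the string without running the search.

```shell
# Hypothetical helper: joins its arguments into one Lucene query with explicit ANDs.
# Each argument is one complete clause, already quoted where it contains spaces.
ia_query() {
  query=""
  for clause in "$@"; do
    if [ -z "$query" ]; then
      query="$clause"
    else
      query="$query AND $clause"
    fi
  done
  printf '%s\n' "$query"
}

# Usage (on a host with ia installed):
#   ia search "$(ia_query 'creator:"ARRL"' 'mediatype:texts' 'year:2023')" --itemlist
```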

Output options

# Default: JSON objects, one per line
ia search 'collection:prelinger'

# Itemlist mode: outputs only identifiers, one per line
# Save the output to a file and pass that file to ia download --itemlist
ia search 'collection:prelinger' --itemlist

# Save itemlist to file
ia search 'collection:prelinger' --itemlist > prelinger-items.txt

# Limit results with parameters
ia search 'subject:radio' --parameters='rows=50'

# Count results without downloading them all
ia search 'collection:prelinger' --num-found
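
The count and itemlist options combine naturally into a guard that refuses to enumerate an unexpectedly large result set. This sketch assumes --num-found prints a bare integer. ia_itemlist_capped and its argument order are hypothetical.

```shell
# Hypothetical helper: writes an itemlist only if the result count is at or below a cap.
# Usage: ia_itemlist_capped QUERY MAX_ITEMS OUTPUT_FILE
ia_itemlist_capped() {
  query="$1"; max="$2"; out="$3"
  # Ask for the count first; this is cheap compared with enumerating every item.
  count=$(ia search "$query" --num-found) || return 1
  if [ "$count" -gt "$max" ]; then
    echo "refusing: $count results exceeds cap of $max" >&2
    return 2
  fi
  ia search "$query" --itemlist > "$out"
  echo "wrote $count identifiers to $out"
}

# Usage (on a host with ia installed):
#   ia_itemlist_capped 'collection:prelinger' 10000 prelinger-items.txt
```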

Practical examples

# Find all items in a collection and count them
ia search 'collection:arrl_qst' --num-found

# Get identifiers for bulk download
ia search 'collection:arrl_qst' --itemlist > arrl-items.txt

# Search within a collection for specific subjects
ia search 'collection:prelinger subject:"san francisco"' --itemlist

# Find audio recordings by a specific creator
ia search 'creator:"Grateful Dead" mediatype:audio' --itemlist

# Search for items with specific file formats available
ia search 'collection:librivoxaudio format:"64Kbps MP3"' --itemlist

3. List Item Contents

View files within an item without downloading.

# List all files in an item
ia list <identifier>

# Example
ia list prelinger_films

By default the output lists filenames only. Pass --columns name,size,format to include sizes and formats.
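
To see only one kind of file in an item, the listing can be filtered with grep. This sketch assumes the listing has one filename per line and that the extension is plain alphanumeric. ia_list_ext is a hypothetical helper name.

```shell
# Hypothetical helper: prints only the files in an item whose names end with
# the given extension (case-insensitive).
# Usage: ia_list_ext IDENTIFIER EXTENSION
ia_list_ext() {
  ia list "$1" | grep -i -- "\.$2\$"
}

# Usage (on a host with ia installed):
#   ia_list_ext prelinger_films mp4
```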


4. Metadata

View and modify item metadata.

Read metadata

# Full metadata as JSON
ia metadata <identifier>

# Pretty-print with jq
ia metadata <identifier> | jq .

# Get specific fields
ia metadata <identifier> | jq '.metadata.title'
ia metadata <identifier> | jq '.metadata.subject'
ia metadata <identifier> | jq '.metadata.collection'

# List available formats for an item
ia metadata <identifier> --formats
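
Fields such as subject and collection come back as a plain string when an item has one value and as an array when it has several, which trips up scripts that assume one shape. The sketch below normalizes both cases with jq. ia_subjects is a hypothetical helper name.

```shell
# Hypothetical helper: reads `ia metadata` JSON on stdin and prints each subject
# on its own line, whether the field is a single string or an array.
# Prints nothing if the item has no subject field.
ia_subjects() {
  jq -r '.metadata.subject // empty | if type == "array" then .[] else . end'
}

# Usage (on a host with ia and jq installed):
#   ia metadata <identifier> | ia_subjects
```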

Modify metadata (requires authentication)

# Set a field
ia metadata <identifier> --modify="description:Updated description"

# Remove a field entirely (REMOVE_TAG is a literal keyword, not a placeholder)
ia metadata <identifier> --modify="subject:REMOVE_TAG"

# Append text to an existing string value
ia metadata <identifier> --append="subject:new-tag"

# Add to array field
ia metadata <identifier> --append-list="collection:another-collection"

# Bulk modify from CSV (must have 'identifier' column)
ia metadata --spreadsheet=metadata.csv
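
A spreadsheet for bulk edits can be generated from an itemlist rather than written by hand. The sketch below sets one field to the same value on every item. ia_make_metadata_csv is a hypothetical helper, and it assumes the value contains no double quotes.

```shell
# Hypothetical helper: turns an itemlist (one identifier per line) into a CSV
# with an 'identifier' column plus one metadata column.
# Usage: ia_make_metadata_csv ITEMLIST FIELD VALUE > metadata.csv
ia_make_metadata_csv() {
  printf 'identifier,%s\n' "$2"
  while IFS= read -r id; do
    [ -n "$id" ] || continue        # skip blank lines
    printf '%s,"%s"\n' "$id" "$3"    # value is quoted so it may contain commas
  done < "$1"
}

# Usage:
#   ia_make_metadata_csv arrl-items.txt language eng > metadata.csv
```

Review the generated file before passing it to ia metadata --spreadsheet, since the edit applies to every listed item.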

5. Upload (requires authentication)

# Upload files to a new or existing item
ia upload <identifier> file1.pdf file2.pdf \
  --metadata="mediatype:texts" \
  --metadata="title:My Upload" \
  --metadata="subject:test"

# Upload from stdin
curl -sL https://example.com/file.pdf | \
  ia upload <identifier> - --remote-name=file.pdf

# Retry on failure
ia upload <identifier> largefile.zip --retries 10

# Bulk upload from CSV (requires 'identifier' and 'file' columns)
ia upload --spreadsheet=uploads.csv
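
Before a bulk upload, it can save time to confirm that every path in the file column exists locally. This sketch assumes a simple CSV with no quoted commas and Unix line endings. ia_missing_upload_files is a hypothetical helper name.

```shell
# Hypothetical helper: prints each path from the 'file' column of an upload CSV
# that does not exist on disk. No output means every file was found.
# Usage: ia_missing_upload_files uploads.csv
ia_missing_upload_files() {
  awk -F',' '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "file") col = i; next }
    col     { print $col }
  ' "$1" | while IFS= read -r path; do
    [ -n "$path" ] || continue
    [ -e "$path" ] || printf '%s\n' "$path"
  done
}
```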

Important: set mediatype on the first upload. It cannot be changed after the item is created.


6. Delete (requires authentication)

# Delete a specific file
ia delete <identifier> filename.pdf

# Delete file and all its derivatives
ia delete <identifier> filename.pdf --cascade

# Delete all files in an item
ia delete <identifier> --all

Deleted files are backed up to history/files/ automatically.
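
Since --all removes every file in an item, a small confirmation wrapper can reduce the chance of running it against the wrong identifier. The sketch below lists the item's files and asks for the identifier to be retyped before deleting. ia_delete_all_confirm is a hypothetical helper, not an ia feature.

```shell
# Hypothetical helper: shows what `ia delete --all` would remove, then requires
# the identifier to be retyped on stdin before running it.
# Usage: ia_delete_all_confirm IDENTIFIER
ia_delete_all_confirm() {
  id="$1"
  echo "Files currently in $id:"
  ia list "$id" || return 1
  printf 'Retype the identifier to delete all files: '
  IFS= read -r answer
  if [ "$answer" != "$id" ]; then
    echo "aborted"
    return 1
  fi
  ia delete "$id" --all
}
```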


7. Copy / Move

# Copy a file between items
ia copy source-item/file.pdf dest-item/file.pdf

# Copy with metadata for new items
ia copy source/file.pdf new-item/file.pdf --metadata="title:Copied Item"

# Move (copy + delete source)
ia move source-item/file.pdf dest-item/file.pdf

8. Tasks

View catalog processing tasks (derive jobs, uploads in progress, etc.).

# Tasks for a specific item
ia tasks <identifier>

# All your queued/running tasks
ia tasks

Command Quick Reference

| Command | Alias | Purpose |
|---------|-------|---------|
| ia configure | ia co | Set up credentials |
| ia search | ia se | Search catalog |
| ia download | ia do | Download files |
| ia list | ia ls | List item files |
| ia metadata | ia md | View/edit metadata |
| ia upload | ia up | Upload files |
| ia delete | ia rm | Delete files |
| ia copy | ia cp | Copy between items |
| ia move | ia mv | Move between items |
| ia tasks | ia ta | View task queue |

Global Flags

| Flag | Short | Purpose |
|------|-------|---------|
| --help | -h | Show help |
| --version | -v | Show version |
| --config-file FILE | -c | Use alternate config |
| --log | -l | Enable logging |
| --debug | -d | Verbose debug output |
| --insecure | -i | Use HTTP instead of HTTPS |

Troubleshooting

"You need to be logged in"

Run ia configure and enter your archive.org credentials. Verify with:

cat ~/.config/internetarchive/ia.ini

Search returns no results

  • Check query syntax — field names are case-sensitive
  • Use quotes around multi-word values: subject:"ham radio" not subject:ham radio
  • Verify the collection/identifier exists: ia metadata <identifier>

Slow searches

Large collections can take minutes to enumerate. Use --parameters='rows=100' to limit during testing, or --num-found to just get the count first.

Rate limiting

Archive.org may throttle aggressive requests. Space out bulk operations and use --retries on downloads.
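
For commands without a built-in retry option, or for loops over an itemlist, a generic backoff wrapper is one way to space out requests. A minimal sketch follows. with_backoff is a hypothetical helper, and the delays in the usage line are illustrative rather than tuned values.

```shell
# Hypothetical helper: runs a command up to MAX_TRIES times, doubling the wait
# between attempts.
# Usage: with_backoff MAX_TRIES FIRST_DELAY_SECONDS command [args...]
with_backoff() {
  tries="$1"; delay="$2"; shift 2
  attempt=1
  while :; do
    "$@" && return 0
    if [ "$attempt" -ge "$tries" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Usage (on a host with ia installed): pause between items and retry each one
#   while IFS= read -r id; do with_backoff 5 10 ia list "$id"; sleep 2; done < arrl-items.txt
```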


Last updated: 2026-02-14