# Internet Archive CLI Reference
Quick reference for the `ia` command-line tool on pi-nas.
---
## Location & Setup
| Detail | Value |
|--------|-------|
| Host | pi-nas (192.168.1.245 / 100.64.0.21) |
| Binary | `ia` (v5.7.2, pip-installed) |
| Config | `~/.config/internetarchive/ia.ini` |
---
## 1. Configure / Authenticate
Required for uploads, metadata edits, and accessing restricted items. Not required for public downloads or searches.
```bash
ia configure
# Prompts for your archive.org email and password
# Stores the resulting API keys and cookies in ~/.config/internetarchive/ia.ini
```
Confirm the tool is installed and runs:
```bash
ia configure --help # Shows options if the CLI works; does not validate credentials
```
---
## 2. Search
Search the archive.org catalog. Returns JSON by default.
### Basic syntax
```bash
ia search '<query>'
```
### Query syntax
Queries use Lucene syntax. Combine fields with AND/OR, quote phrases.
| Field | Example | Notes |
|-------|---------|-------|
| `collection` | `collection:prelinger` | Items in a specific collection |
| `subject` | `subject:"ham radio"` | Subject/tag match |
| `mediatype` | `mediatype:texts` | texts, movies, audio, software, image, data, web, collection |
| `creator` | `creator:"ARRL"` | Author/creator |
| `title` | `title:"emergency"` | Item title |
| `date` | `date:[2020-01-01 TO 2024-12-31]` | Date range (YYYY-MM-DD) |
| `year` | `year:2023` | Shorthand for year |
| `language` | `language:eng` | ISO language code |
| `licenseurl` | `licenseurl:*creativecommons*` | License filter |
### Combined queries
```bash
# Texts about ham radio dated 2020-01-01 or later
ia search 'subject:"ham radio" mediatype:texts date:[2020-01-01 TO 2099-12-31]'
# All items in a specific collection
ia search 'collection:prelinger'
# Creator + mediatype
ia search 'creator:"ARRL" AND mediatype:texts'
```
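Quoting mistakes are the most common reason a combined query misbehaves, so it can help to build the query string programmatically. The following is a minimal sketch, not part of `ia` itself: `build_query` is a hypothetical helper that takes `field=value` pairs, quotes multi-word values, leaves bracketed ranges alone, and joins the terms with `AND`.

```bash
# Hypothetical helper (not an ia feature): build a Lucene query from field=value pairs.
build_query() {
  local query="" pair field value term
  for pair in "$@"; do
    field=${pair%%=*}   # text before the first '='
    value=${pair#*=}    # text after the first '='
    case $value in
      \[*)           term="$field:$value" ;;       # range such as [A TO B]: leave unquoted
      *[[:space:]]*) term="$field:\"$value\"" ;;   # multi-word value: wrap in quotes
      *)             term="$field:$value" ;;
    esac
    query="${query:+$query AND }$term"
  done
  printf '%s\n' "$query"
}

# Prints: subject:"ham radio" AND mediatype:texts
build_query "subject=ham radio" "mediatype=texts"
```

The result can be passed straight to a search, for example `ia search "$(build_query "subject=ham radio" "mediatype=texts")" --itemlist`.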
### Output options
```bash
# Default: JSON objects, one per line
ia search 'collection:prelinger'
# Itemlist mode — outputs only identifiers, one per line
# Save this to a file and pass it to ia download --itemlist
ia search 'collection:prelinger' --itemlist
# Save itemlist to file
ia search 'collection:prelinger' --itemlist > prelinger-items.txt
# Limit results with parameters
ia search 'subject:radio' --parameters='rows=50'
# Count results without downloading them all
ia search 'collection:prelinger' --num-found
```
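The `--num-found` and `--itemlist` options combine naturally into a "count first, then enumerate" guard. This is a sketch under assumptions: `itemlist_if_small` is a hypothetical wrapper name, the default limit of 1000 is arbitrary, and it assumes `--num-found` prints a bare integer.

```bash
# Hypothetical wrapper: list identifiers only when the result count is within a limit.
# Usage: itemlist_if_small '<query>' [max_results]
itemlist_if_small() {
  local query=$1 max=${2:-1000} count
  count=$(ia search "$query" --num-found) || return 1
  case $count in
    ''|*[!0-9]*) echo "unexpected count: $count" >&2; return 1 ;;   # not a plain integer
  esac
  if [ "$count" -gt "$max" ]; then
    echo "query matches $count items (limit $max); narrow it first" >&2
    return 2
  fi
  ia search "$query" --itemlist
}

# Example (commented out so that sourcing this block makes no network calls):
# itemlist_if_small 'collection:prelinger' 5000 > prelinger-items.txt
```

Exit status 2 distinguishes "too many results" from other failures, so a calling script can decide whether to prompt or abort.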
### Practical examples
```bash
# Find all items in a collection and count them
ia search 'collection:arrl_qst' --num-found
# Get identifiers for bulk download
ia search 'collection:arrl_qst' --itemlist > arrl-items.txt
# Search within a collection for specific subjects
ia search 'collection:prelinger subject:"san francisco"' --itemlist
# Find audio recordings by a specific creator
ia search 'creator:"Grateful Dead" mediatype:audio' --itemlist
# Search for items with specific file formats available
ia search 'collection:librivoxaudio format:"64Kbps MP3"' --itemlist
```
---
## 3. List Item Contents
View files within an item without downloading.
```bash
# List all files in an item
ia list <identifier>
# Example
ia list prelinger_films
```
By default the output is one filename per line. To include sizes and formats, add `--columns name,size,format`.
---
## 4. Metadata
View and modify item metadata.
### Read metadata
```bash
# Full metadata as JSON
ia metadata <identifier>
# Pretty-print with jq
ia metadata <identifier> | jq .
# Get specific fields
ia metadata <identifier> | jq '.metadata.title'
ia metadata <identifier> | jq '.metadata.subject'
ia metadata <identifier> | jq '.metadata.collection'
# List available formats for an item
ia metadata <identifier> --formats
```
### Modify metadata (requires authentication)
```bash
# Set a field
ia metadata <identifier> --modify="description:Updated description"
# Remove a field
ia metadata <identifier> --modify="subject:REMOVE_TAG"
# Append to existing value
ia metadata <identifier> --append="subject:new-tag"
# Add to array field
ia metadata <identifier> --append-list="collection:another-collection"
# Bulk modify from CSV (must have 'identifier' column)
ia metadata --spreadsheet=metadata.csv
```
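For bulk edits, the CSV can be generated from an itemlist rather than written by hand. The sketch below assumes the simplest layout (an `identifier` column plus one metadata field applied to every row). `itemlist_to_csv` is a hypothetical helper, not an `ia` subcommand.

```bash
# Hypothetical helper: turn an itemlist on stdin into a CSV for `ia metadata --spreadsheet`.
# Usage: itemlist_to_csv <field> <value> < itemlist.txt > metadata.csv
itemlist_to_csv() {
  local field=$1 value=$2 id q='"'
  value=${value//$q/$q$q}            # double embedded quotes so the CSV cell stays valid
  printf 'identifier,%s\n' "$field"  # header row; 'identifier' is the required column
  while IFS= read -r id; do
    [ -n "$id" ] || continue         # skip blank lines
    printf '%s,"%s"\n' "$id" "$value"
  done
}

# Example (commented out; review the CSV before applying it):
# itemlist_to_csv description "Scanned at pi-nas" < my-items.txt > metadata.csv
# ia metadata --spreadsheet=metadata.csv
```

Inspecting the generated file before running the bulk modify is cheap insurance, since every row triggers a metadata write.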
---
## 5. Upload (requires authentication)
```bash
# Upload files to a new or existing item
ia upload <identifier> file1.pdf file2.pdf \
--metadata="mediatype:texts" \
--metadata="title:My Upload" \
--metadata="subject:test"
# Upload from stdin
curl -sL https://example.com/file.pdf | \
ia upload <identifier> - --remote-name=file.pdf
# Retry on failure
ia upload <identifier> largefile.zip --retries 10
# Bulk upload from CSV (requires 'identifier' and 'file' columns)
ia upload --spreadsheet=uploads.csv
```
**Important:** `mediatype` cannot be changed after initial upload.
---
## 6. Delete (requires authentication)
```bash
# Delete a specific file
ia delete <identifier> filename.pdf
# Delete file and all its derivatives
ia delete <identifier> filename.pdf --cascade
# Delete all files in an item
ia delete <identifier> --all
```
Deleted files are backed up to `history/files/` automatically.
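Because `--all` removes every file in an item, a confirmation step in front of it can prevent accidents. This is a minimal sketch: `confirm_delete_all` is a hypothetical wrapper that requires the identifier to be retyped before it calls `ia delete`.

```bash
# Hypothetical guard: require the identifier to be retyped before deleting all files.
# Usage: confirm_delete_all <identifier>
confirm_delete_all() {
  local identifier=$1 answer=""
  [ -n "$identifier" ] || { echo "usage: confirm_delete_all <identifier>" >&2; return 1; }
  printf 'Type "%s" to delete ALL files in this item: ' "$identifier" >&2
  IFS= read -r answer || true        # EOF leaves answer empty, which fails the check below
  if [ "$answer" != "$identifier" ]; then
    echo "confirmation did not match; nothing deleted" >&2
    return 1
  fi
  ia delete "$identifier" --all
}
```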
---
## 7. Copy / Move
```bash
# Copy a file between items
ia copy source-item/file.pdf dest-item/file.pdf
# Copy with metadata for new items
ia copy source/file.pdf new-item/file.pdf --metadata="title:Copied Item"
# Move (copy + delete source)
ia move source-item/file.pdf dest-item/file.pdf
```
---
## 8. Tasks
View catalog processing tasks (derive jobs, uploads in progress, etc.).
```bash
# Tasks for a specific item
ia tasks <identifier>
# All your queued/running tasks
ia tasks
```
---
## Command Quick Reference
| Command | Alias | Purpose |
|---------|-------|---------|
| `ia configure` | `ia co` | Set up credentials |
| `ia search` | `ia se` | Search catalog |
| `ia download` | `ia do` | Download files |
| `ia list` | `ia ls` | List item files |
| `ia metadata` | `ia md` | View/edit metadata |
| `ia upload` | `ia up` | Upload files |
| `ia delete` | `ia rm` | Delete files |
| `ia copy` | `ia cp` | Copy between items |
| `ia move` | `ia mv` | Move between items |
| `ia tasks` | `ia ta` | View task queue |
---
## Global Flags
| Flag | Short | Purpose |
|------|-------|---------|
| `--help` | `-h` | Show help |
| `--version` | `-v` | Show version |
| `--config-file FILE` | `-c` | Use alternate config |
| `--log` | `-l` | Enable logging |
| `--debug` | `-d` | Verbose debug output |
| `--insecure` | `-i` | Use HTTP instead of HTTPS |
---
## Troubleshooting
### "You need to be logged in"
Run `ia configure` and enter your archive.org credentials. Then confirm the config file was written (it holds your API keys, so do not share the output):
```bash
cat ~/.config/internetarchive/ia.ini
```
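To check the config without printing its secrets, something like the following may be enough. It is a sketch that assumes the keys live under an `[s3]` section in `ia.ini`. `check_ia_config` is a hypothetical helper name.

```bash
# Hypothetical check: confirm the config file exists and has an [s3] section,
# without echoing the keys it contains.
# Usage: check_ia_config [path-to-ia.ini]
check_ia_config() {
  local cfg=${1:-$HOME/.config/internetarchive/ia.ini}
  if [ ! -r "$cfg" ]; then
    echo "missing or unreadable: $cfg"
    return 1
  fi
  if ! grep -q '^\[s3\]' "$cfg"; then
    echo "no [s3] section in $cfg; rerun ia configure"
    return 1
  fi
  echo "ok: $cfg"
}
```

This only verifies that the file looks populated. It does not prove that archive.org still accepts the stored keys.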
### Search returns no results
- Check query syntax — field names are case-sensitive
- Use quotes around multi-word values: `subject:"ham radio"` not `subject:ham radio`
- Verify the collection/identifier exists: `ia metadata <identifier>`
### Slow searches
Large collections can take minutes to enumerate. Use `--parameters='rows=100'` to limit during testing, or `--num-found` to just get the count first.
### Rate limiting
Archive.org may throttle aggressive requests. Space out bulk operations and use `--retries` on downloads.
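One way to space out bulk operations is a loop that pauses between items. The sketch below is illustrative only: `for_each_item` is a hypothetical helper, and the right delay depends on how archive.org responds to your workload.

```bash
# Hypothetical helper: run a command once per identifier, pausing between calls.
# Usage: for_each_item <itemlist-file> <delay-seconds> <command> [args...]
for_each_item() {
  local list=$1 delay=$2 id
  shift 2
  while IFS= read -r id; do
    [ -n "$id" ] || continue                              # skip blank lines
    # Redirect stdin so the command cannot consume the rest of the itemlist.
    "$@" "$id" < /dev/null || echo "failed: $id" >&2
    sleep "$delay"
  done < "$list"
}

# Example (commented out): fetch metadata for each item, two seconds apart.
# for_each_item arrl-items.txt 2 ia metadata
```

Failures are reported on stderr and the loop continues, so one bad identifier does not stop the batch.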
---
*Last updated: 2026-02-14*