# Internet Archive CLI Reference

Quick reference for the `ia` command-line tool on pi-nas.

---

## Location & Setup

| Detail | Value |
|--------|-------|
| Host | pi-nas (192.168.1.245 / 100.64.0.21) |
| Binary | `ia` (v5.7.2, pip-installed) |
| Config | `~/.config/internetarchive/ia.ini` |

---

## 1. Configure / Authenticate

Required for uploads, metadata edits, and accessing restricted items. Not required for public downloads or searches.

```bash
ia configure
# Prompts for archive.org email + password
# Stores credentials in ~/.config/internetarchive/ia.ini
```

Verify that the CLI runs (this checks the install, not the credentials):

```bash
ia configure --help
# Should show options without errors
```

---

## 2. Search

Search the archive.org catalog. Returns JSON by default.

### Basic syntax

```bash
ia search '<query>'
```

### Query syntax

Queries use Lucene syntax. Combine fields with AND/OR, quote phrases.

| Field | Example | Notes |
|-------|---------|-------|
| `collection` | `collection:prelinger` | Items in a specific collection |
| `subject` | `subject:"ham radio"` | Subject/tag match |
| `mediatype` | `mediatype:texts` | texts, movies, audio, software, image, data, web, collection |
| `creator` | `creator:"ARRL"` | Author/creator |
| `title` | `title:"emergency"` | Item title |
| `date` | `date:[2020-01-01 TO 2024-12-31]` | Date range (YYYY-MM-DD) |
| `year` | `year:2023` | Shorthand for year |
| `language` | `language:eng` | ISO language code |
| `licenseurl` | `licenseurl:*creativecommons*` | License filter |

### Combined queries

```bash
# Texts about ham radio dated 2020 or later
ia search 'subject:"ham radio" mediatype:texts date:[2020-01-01 TO 2099-12-31]'

# All items in a specific collection
ia search 'collection:prelinger'

# Creator + mediatype
ia search 'creator:"ARRL" AND mediatype:texts'
```

### Output options

```bash
# Default: JSON objects, one per line
ia search 'collection:prelinger'

# Itemlist mode: outputs only identifiers, one per line
# Use this output with ia download --itemlist
ia search 'collection:prelinger' --itemlist

# Save itemlist to file
ia search 'collection:prelinger' --itemlist > prelinger-items.txt

# Limit results with parameters
ia search 'subject:radio' --parameters='rows=50'

# Count results without downloading them all
ia search 'collection:prelinger' --num-found
```

### Practical examples

```bash
# Find all items in a collection and count them
ia search 'collection:arrl_qst' --num-found

# Get identifiers for bulk download
ia search 'collection:arrl_qst' --itemlist > arrl-items.txt

# Search within a collection for specific subjects
ia search 'collection:prelinger subject:"san francisco"' --itemlist

# Find audio recordings by a specific creator
ia search 'creator:"Grateful Dead" mediatype:audio' --itemlist

# Search for items with specific file formats available
ia search 'collection:librivoxaudio format:"64Kbps MP3"' --itemlist
```

---

## 3. List Item Contents

View files within an item without downloading.

```bash
# List all files in an item
ia list <identifier>

# Example
ia list prelinger_films
```

By default the output lists filenames only; `--columns` or `--all` adds fields such as size and format.

---

## 4. Metadata

View and modify item metadata.

### Read metadata

```bash
# Full metadata as JSON
ia metadata <identifier>

# Pretty-print with jq
ia metadata <identifier> | jq .

# Get specific fields
ia metadata <identifier> | jq '.metadata.title'
ia metadata <identifier> | jq '.metadata.subject'
ia metadata <identifier> | jq '.metadata.collection'

# List available formats for an item
ia metadata <identifier> --formats
```

### Modify metadata (requires authentication)

```bash
# Set a field
ia metadata <identifier> --modify="description:Updated description"

# Remove a field
ia metadata <identifier> --modify="subject:REMOVE_TAG"

# Append to existing value
ia metadata <identifier> --append="subject:new-tag"

# Add to array field
ia metadata <identifier> --append-list="collection:another-collection"

# Bulk modify from CSV (must have 'identifier' column)
ia metadata --spreadsheet=metadata.csv
```

---

## 5. Upload (requires authentication)

```bash
# Upload files to a new or existing item
ia upload <identifier> file1.pdf file2.pdf \
  --metadata="mediatype:texts" \
  --metadata="title:My Upload" \
  --metadata="subject:test"

# Upload from stdin
curl -sL https://example.com/file.pdf | \
  ia upload <identifier> - --remote-name=file.pdf

# Retry on failure
ia upload <identifier> largefile.zip --retries 10

# Bulk upload from CSV (requires 'identifier' and 'file' columns)
ia upload --spreadsheet=uploads.csv
```

**Important:** `mediatype` cannot be changed after initial upload.

---

## 6. Delete (requires authentication)

```bash
# Delete a specific file
ia delete <identifier> filename.pdf

# Delete file and all its derivatives
ia delete <identifier> filename.pdf --cascade

# Delete all files in an item
ia delete <identifier> --all
```

Deleted files are backed up to `history/files/` automatically.

---

## 7. Copy / Move

```bash
# Copy a file between items
ia copy source-item/file.pdf dest-item/file.pdf

# Copy with metadata for new items
ia copy source/file.pdf new-item/file.pdf --metadata="title:Copied Item"

# Move (copy + delete source)
ia move source-item/file.pdf dest-item/file.pdf
```

---

## 8. Tasks

View catalog processing tasks (derive jobs, uploads in progress, etc.).
```bash
# Tasks for a specific item
ia tasks <identifier>

# All your queued/running tasks
ia tasks
```

---

## Command Quick Reference

| Command | Alias | Purpose |
|---------|-------|---------|
| `ia configure` | `ia co` | Set up credentials |
| `ia search` | `ia se` | Search catalog |
| `ia download` | `ia do` | Download files |
| `ia list` | `ia ls` | List item files |
| `ia metadata` | `ia md` | View/edit metadata |
| `ia upload` | `ia up` | Upload files |
| `ia delete` | `ia rm` | Delete files |
| `ia copy` | `ia cp` | Copy between items |
| `ia move` | `ia mv` | Move between items |
| `ia tasks` | `ia ta` | View task queue |

---

## Global Flags

| Flag | Short | Purpose |
|------|-------|---------|
| `--help` | `-h` | Show help |
| `--version` | `-v` | Show version |
| `--config-file FILE` | `-c` | Use alternate config |
| `--log` | `-l` | Enable logging |
| `--debug` | `-d` | Verbose debug output |
| `--insecure` | `-i` | Use HTTP instead of HTTPS |

---

## Troubleshooting

### "You need to be logged in"

Run `ia configure` and enter your archive.org credentials. Verify that the config file was written (note that this prints your stored credentials to the terminal):

```bash
cat ~/.config/internetarchive/ia.ini
```

### Search returns no results

- Check query syntax: field names are case-sensitive
- Use quotes around multi-word values: `subject:"ham radio"` not `subject:ham radio`
- Verify the collection/identifier exists: `ia metadata <identifier>`

### Slow searches

Large collections can take minutes to enumerate. Use `--parameters='rows=100'` to limit results during testing, or `--num-found` to get just the count first.

### Rate limiting

Archive.org may throttle aggressive requests. Space out bulk operations and use `--retries` on downloads.

---

*Last updated: 2026-02-14*
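
The advice under "Rate limiting" to space out bulk operations could be sketched as a small shell wrapper. This is only an illustration: `retry_with_backoff` and the `RETRY_DELAY` variable are hypothetical names, not part of the `ia` CLI, and the 5-second starting pause is an assumed default rather than a documented archive.org limit.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the ia CLI): run a command and, if it
# fails, retry it with a pause that doubles after every failed attempt.
# Usage: retry_with_backoff <max_attempts> <command> [args...]
retry_with_backoff() {
  local max_attempts=$1
  shift
  local delay=${RETRY_DELAY:-5}   # initial pause in seconds (assumed default)
  local attempt=1

  while true; do
    # Run the wrapped command; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    # Out of attempts: report and fail.
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))          # double the pause each time
    attempt=$((attempt + 1))
  done
}
```

One possible use is looping over a saved itemlist such as `arrl-items.txt` with `while read -r id`, calling `retry_with_backoff 5 ia metadata "$id"` for each identifier and adding a short `sleep` between items so requests stay spaced out.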