TL;DR
The Wayback Machine API (from the Internet Archive) provides programmatic access to over 800 billion archived web pages. Query available snapshots for any URL, retrieve archived page content from specific dates, save new pages to the archive, and use CDX (Capture Index) for bulk queries. Timestamps down to the second. JSON or CDX format, no API key. Essential for web history research, broken link recovery, and historical web analysis.
Quick start: https://archive.org/wayback/available?url=example.com
No API key needed — just make a request!
How to Use This API
1. Check if a URL is Archived
https://archive.org/wayback/available?url=example.com
2. Get Closest Snapshot by Timestamp
https://archive.org/wayback/available?url=example.com×tamp=20200101
3. JavaScript — Find Historical Versions
fetch('https://archive.org/wayback/available?url=github.com')
.then(r => r.json())
.then(data => {
const snap = data.archived_snapshots;
if (snap.closest) {
console.log(`Available: ${snap.closest.url}`);
console.log(`Timestamp: ${snap.closest.timestamp}`);
console.log(`Status: ${snap.closest.status}`);
}
});
4. Python — Timeline of Snapshots
import requests
# Get CDX index for a URL (shows all snapshots)
params = {
'url': 'github.com',
'output': 'json',
'limit': 10,
'fl': 'timestamp,original,statuscode'
}
data = requests.get(
'https://web.archive.org/cdx/search/cdx', params=params
).json()
print("Snapshot timeline for github.com:")
for snap in data[1:]: # Skip header
print(f" {snap[0]}: HTTP {snap[2]}")
https://archive.org/wayback/available?url=example.com
Frequently Asked Questions
- What endpoints are available?
/wayback/available— Check availability./cdx/search/cdx— Index search./wayback/capture— Save a page (POST)./web/{timestamp}/{url}— Retrieve archived page.- How do I get a specific historical version?
- Use the Wayback Machine URL format:
https://web.archive.org/web/{timestamp}/{url}. Example:https://web.archive.org/web/20200101000000/example.comfor Jan 1, 2020. - What is CDX format?
- CDX is the Internet Archive's index format listing all captures of a URL. Fields include timestamp, original URL, MIME type, status code, digest, and length.
- Can I save a page to the archive via API?
- Yes, POST to
https://web.archive.org/save/{url}to request archiving of a URL. The archive happens asynchronously. - How far back does the archive go?
- The Wayback Machine has archived web pages since 1996, with over 800 billion URLs captured from the early web to the present.
- Are there API usage limits?
- The Wayback Machine API is a free public service from the Internet Archive, a non-profit. Reasonable usage is permitted. Bulk scraping should be done responsibly.
API Details
- API URL
https://archive.org/wayback- Documentation
- Wayback Machine API Docs
- Category
- Developer Tools
- Authentication
- Not Required
- Geographic Coverage
- Global — All archived web pages
What You Can Build
- Web history browser showing how websites have evolved over decades
- Broken link recovery tool that finds archived versions of dead pages
- Competitor website change tracker monitoring design and content updates
- Digital preservation dashboard for personal website snapshots
- Historical research platform comparing web content across different eras