arXiv

Academic · No API Key Required · Works Globally

TL;DR

What it does: Search 2.5+ million scholarly papers across physics, math, computer science, and more. Returns title, authors, abstract, categories — no API key needed.

Quick start: http://export.arxiv.org/api/query?search_query=all:electron&start=0&max_results=1


Overview

arXiv is a free distribution service and open-access archive for 2.5+ million scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering, and economics. The arXiv API lets you search papers programmatically, returning metadata like title, authors, abstract, publication date, and subject categories. The API returns Atom XML (not JSON) and requires no authentication.

Live Example

Here is a URL to call and an abridged version of the response it returns:

The URL to call:

http://export.arxiv.org/api/query?search_query=all:electron&start=0&max_results=1

The actual response you get (simplified):

Note: arXiv returns Atom XML, not JSON. Here's what the response looks like:

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <id>http://arxiv.org/api/...</id>
  <title>arXiv Query: search_query=all:electron</title>
  <entry>
    <id>http://arxiv.org/abs/cond-mat/0011267v1</id>
    <title>The electronic structure of cuprates from high energy spectroscopy</title>
    <published>2000-11-15T16:19:15Z</published>
    <summary>We report studies of the electronic structure...</summary>
    <author>
      <name>A. Author</name>
    </author>
    <link href="https://arxiv.org/abs/cond-mat/0011267v1" rel="alternate" type="text/html"/>
    <category term="cond-mat.supr-con"/>
  </entry>
</feed>

What does this data mean?

The arXiv API returns Atom XML. Each paper is inside an <entry> element. Here's what each field means:

<feed> > <title>
Shows the query you searched for (e.g., "arXiv Query: search_query=all:electron")
<entry> > <id>
The paper's URL on arxiv.org (e.g., http://arxiv.org/abs/cond-mat/0011267v1)
<entry> > <title>
The full paper title
<entry> > <published>
Date the first version of the paper was submitted, in ISO 8601 format (e.g., 2000-11-15T16:19:15Z)
<entry> > <summary>
The paper abstract (may contain HTML entities and LaTeX math notation)
<entry> > <author> > <name>
Author name. Each entry can have multiple <author> elements.
<entry> > <link>
Links to the paper. The link with rel="alternate" points to the abstract page. The PDF link has title="pdf"; you can also derive it by replacing /abs/ with /pdf/ in the abstract URL.
<entry> > <category>
Subject category (e.g., cond-mat.supr-con = Superconductivity). The term attribute contains the category code. An entry can carry several <category> elements when a paper is cross-listed.
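As a rough illustration of the field mapping above, the following Python sketch parses a small inline feed instead of calling the API, so it runs offline. The SAMPLE string is modeled on the simplified response shown earlier; the title="pdf" link inside it and the helper name parse_entries are assumptions added for illustration, not part of the original response.

```python
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Trimmed sample modeled on the simplified response above.
# The title="pdf" link is added here for illustration.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>arXiv Query: search_query=all:electron</title>
  <entry>
    <id>http://arxiv.org/abs/cond-mat/0011267v1</id>
    <title>The electronic structure of cuprates
      from high energy spectroscopy</title>
    <published>2000-11-15T16:19:15Z</published>
    <summary>We report studies of the electronic structure...</summary>
    <author><name>A. Author</name></author>
    <link href="http://arxiv.org/abs/cond-mat/0011267v1" rel="alternate" type="text/html"/>
    <link title="pdf" href="http://arxiv.org/pdf/cond-mat/0011267v1" rel="related" type="application/pdf"/>
    <category term="cond-mat.supr-con"/>
  </entry>
</feed>"""


def parse_entries(xml_text):
    """Return one dict per <entry> element in an Atom feed string."""
    root = ET.fromstring(xml_text)
    papers = []
    for entry in root.findall("atom:entry", ATOM):
        links = entry.findall("atom:link", ATOM)
        # Prefer the link labeled title="pdf"; None if the entry has no such link.
        pdf = next((link.get("href") for link in links
                    if link.get("title") == "pdf"), None)
        raw_title = entry.findtext("atom:title", default="", namespaces=ATOM)
        papers.append({
            "id": entry.findtext("atom:id", default="", namespaces=ATOM),
            # Titles can wrap across lines, so collapse runs of whitespace.
            "title": " ".join(raw_title.split()),
            "published": entry.findtext("atom:published", default="", namespaces=ATOM),
            "authors": [a.findtext("atom:name", default="", namespaces=ATOM)
                        for a in entry.findall("atom:author", ATOM)],
            "categories": [c.get("term")
                           for c in entry.findall("atom:category", ATOM)],
            "pdf": pdf,
        })
    return papers


papers = parse_entries(SAMPLE)
print(papers[0]["title"])
print(papers[0]["pdf"])
```

Collecting every <author> and <category> into a list, rather than reading only the first match, keeps multi-author and cross-listed papers intact.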

How to use this API

Important: The arXiv API returns Atom XML, not JSON. You need to parse the XML response. Here's how:

JavaScript Example (with DOMParser)

const url = 'https://export.arxiv.org/api/query?search_query=all:electron&start=0&max_results=3';

fetch(url)
  .then(res => res.text())
  .then(str => {
    const parser = new DOMParser();
    const xml = parser.parseFromString(str, 'text/xml');
    const entries = xml.querySelectorAll('entry');
    
    entries.forEach(entry => {
      const title = entry.querySelector('title').textContent;
      const id = entry.querySelector('id').textContent;
      const summary = entry.querySelector('summary').textContent;
      const authors = [...entry.querySelectorAll('author name')]
        .map(a => a.textContent);
      const category = entry.querySelector('category').getAttribute('term');
      // Swap only the /abs/ path segment so the rest of the ID is untouched
      const pdfLink = id.replace('/abs/', '/pdf/');
      
      console.log({ title, id, authors, category, pdfLink });
    });
  });

Python Example (with xml.etree.ElementTree)

import requests
import xml.etree.ElementTree as ET

url = "http://export.arxiv.org/api/query"
params = {
    "search_query": "all:electron",
    "start": 0,
    "max_results": 3
}

response = requests.get(url, params=params, timeout=30)
response.raise_for_status()  # stop early on HTTP errors
# Define the Atom namespace
ns = {'atom': 'http://www.w3.org/2005/Atom'}

root = ET.fromstring(response.content)
entries = root.findall('atom:entry', ns)

for entry in entries:
    title = entry.find('atom:title', ns).text.strip()
    paper_id = entry.find('atom:id', ns).text
    summary = entry.find('atom:summary', ns).text.strip()
    published = entry.find('atom:published', ns).text
    category = entry.find('atom:category', ns).get('term')
    
    authors = [a.find('atom:name', ns).text
               for a in entry.findall('atom:author', ns)]
    
    # PDF link: swap the /abs/ path segment for /pdf/ in the ID.
    # Matching the slashes avoids touching 'abs' elsewhere in the string.
    pdf_link = paper_id.replace('/abs/', '/pdf/')
    
    print(f"Title: {title}")
    print(f"Authors: {', '.join(authors)}")
    print(f"Category: {category}")
    print(f"PDF: {pdf_link}")
    print("---")

Frequently Asked Questions

Do I need an API key?
No! The arXiv API is completely free and requires no API key or authentication. Just call the URL with your search query.
Is there a rate limit?
arXiv's API documentation asks clients that make several calls in a row to wait about 3 seconds between requests. For large-scale harvesting, keep to that pace and avoid hammering the server.
How do I search with AND / OR?
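Combine terms with the Boolean operators AND, OR, and ANDNOT, and URL-encode the spaces between them as +. For example: search_query=all:electron+AND+all:quantum. Field prefixes narrow the match: au: for author, ti: for title, abs: for abstract, cat: for category (e.g., search_query=cat:cond-mat.supr-con), and all: for every field.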
Can I get PDF links?
Yes! Each paper's <id> is the abstract URL (e.g., http://arxiv.org/abs/cond-mat/0011267v1). Replace /abs/ with /pdf/ to get the PDF: http://arxiv.org/pdf/cond-mat/0011267v1.
The API returns XML. How do I parse it?
See the code examples above. In JavaScript, use DOMParser after getting the response as text. In Python, use xml.etree.ElementTree with the Atom namespace http://www.w3.org/2005/Atom.
What's the maximum number of results I can get?
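A single query can expose up to 30,000 results in total, but arXiv's documentation limits each call to a slice of at most 2,000 (max_results parameter). Use the start parameter to page through larger result sets. The default is 10 results if max_results is not specified.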
Can I use this commercially?
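The API is free to call for personal and commercial projects, but open access does not make every paper freely reusable: each paper carries the license its authors chose. Check the arXiv API terms of use and the license of individual papers before redistributing content.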
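The query syntax and paging parameters described in the answers above can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper names build_url and page_starts, the author term au:smith, and the page size of 100 are all hypothetical. urlencode takes care of turning spaces into + and escaping the colons.

```python
from urllib.parse import urlencode

BASE_URL = "http://export.arxiv.org/api/query"


def build_url(search_query, start=0, max_results=10):
    """Return an encoded query URL; spaces in search_query become '+'."""
    params = {
        "search_query": search_query,
        "start": start,
        "max_results": max_results,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def page_starts(total, page_size):
    """Return the start offsets needed to cover `total` results."""
    return list(range(0, total, page_size))


# Hypothetical query: papers by an author named "smith" in one category.
query = "au:smith AND cat:cond-mat.supr-con"

# Three pages of 100 cover the first 250 results.
urls = [build_url(query, start=s, max_results=100)
        for s in page_starts(250, 100)]

for u in urls:
    print(u)
```

When fetching these URLs in sequence, pausing between requests (for example with time.sleep) keeps the load on arXiv's servers moderate.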

API Details

Base URL
http://export.arxiv.org/api/query
Documentation
https://arxiv.org/help/api
Category
Academic
Authentication
None required - completely free
Rate Limit
About one request every 3 seconds (per arXiv's API guidance)
Geographic Coverage
Global